CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
In the Linux kernel, the following vulnerability has been resolved:
drm: Don't unref the same fb many times by mistake due to deadlock handling
If we get a deadlock after the fb lookup in drm_mode_page_flip_ioctl()
we proceed to unref the fb and then retry the whole thing from the top.
But we forget to reset the fb pointer back to NULL, and so if we then
get another error during the retry, before the fb lookup, we proceed
the unref the same fb again without having gotten another reference.
The end result is that the fb will (eventually) end up being freed
while it's still in use.
Reset fb to NULL once we've unreffed it to avoid doing it again
until we've done another fb lookup.
This turned out to be pretty easy to hit on a DG2 when doing async
flips (and CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y). The first symptom I
saw that drm_closefb() simply got stuck in a busy loop while walking
the framebuffer list. Fortunately I was able to convince it to oops
instead, and from there it was easier to track down the culprit. |
In the Linux kernel, the following vulnerability has been resolved:
serial: imx: fix tx statemachine deadlock
When using the serial port as RS485 port, the tx statemachine is used to
control the RTS pin to drive the RS485 transceiver TX_EN pin. When the
TTY port is closed in the middle of a transmission (for instance during
userland application crash), imx_uart_shutdown disables the interface
and disables the Transmission Complete interrupt. afer that,
imx_uart_stop_tx bails on an incomplete transmission, to be retriggered
by the TC interrupt. This interrupt is disabled and therefore the tx
statemachine never transitions out of SEND. The statemachine is in
deadlock now, and the TX_EN remains low, making the interface useless.
imx_uart_stop_tx now checks for incomplete transmission AND whether TC
interrupts are enabled before bailing to be retriggered. This makes sure
the state machine handling is reached, and is properly set to
WAIT_AFTER_SEND. |
In the Linux kernel, the following vulnerability has been resolved:
soc: qcom: pdr: Fix the potential deadlock
When some client process A call pdr_add_lookup() to add the look up for
the service and does schedule locator work, later a process B got a new
server packet indicating locator is up and call pdr_locator_new_server()
which eventually sets pdr->locator_init_complete to true which process A
sees and takes list lock and queries domain list but it will timeout due
to deadlock as the response will queued to the same qmi->wq and it is
ordered workqueue and process B is not able to complete new server
request work due to deadlock on list lock.
Fix it by removing the unnecessary list iteration as the list iteration
is already being done inside locator work, so avoid it here and just
call schedule_work() here.
Process A Process B
process_scheduled_works()
pdr_add_lookup() qmi_data_ready_work()
process_scheduled_works() pdr_locator_new_server()
pdr->locator_init_complete=true;
pdr_locator_work()
mutex_lock(&pdr->list_lock);
pdr_locate_service() mutex_lock(&pdr->list_lock);
pdr_get_domain_list()
pr_err("PDR: %s get domain list
txn wait failed: %d\n",
req->service_name,
ret);
Timeout error log due to deadlock:
"
PDR: tms/servreg get domain list txn wait failed: -110
PDR: service lookup for msm/adsp/sensor_pd:tms/servreg failed: -110
"
Thanks to Bjorn and Johan for letting me know that this commit also fixes
an audio regression when using the in-kernel pd-mapper as that makes it
easier to hit this race. [1] |
In the Linux kernel, the following vulnerability has been resolved:
bpf: avoid holding freeze_mutex during mmap operation
We use map->freeze_mutex to prevent races between map_freeze() and
memory mapping BPF map contents with writable permissions. The way we
naively do this means we'll hold freeze_mutex for entire duration of all
the mm and VMA manipulations, which is completely unnecessary. This can
potentially also lead to deadlocks, as reported by syzbot in [0].
So, instead, hold freeze_mutex only during writeability checks, bump
(proactively) "write active" count for the map, unlock the mutex and
proceed with mmap logic. And only if something went wrong during mmap
logic, then undo that "write active" counter increment.
[0] https://lore.kernel.org/bpf/678dcbc9.050a0220.303755.0066.GAE@google.com/ |
In the Linux kernel, the following vulnerability has been resolved:
tty: xilinx_uartps: split sysrq handling
lockdep detects the following circular locking dependency:
CPU 0 CPU 1
========================== ============================
cdns_uart_isr() printk()
uart_port_lock(port) console_lock()
cdns_uart_console_write()
if (!port->sysrq)
uart_port_lock(port)
uart_handle_break()
port->sysrq = ...
uart_handle_sysrq_char()
printk()
console_lock()
The fixed commit attempts to avoid this situation by only taking the
port lock in cdns_uart_console_write if port->sysrq unset. However, if
(as shown above) cdns_uart_console_write runs before port->sysrq is set,
then it will try to take the port lock anyway. This may result in a
deadlock.
Fix this by splitting sysrq handling into two parts. We use the prepare
helper under the port lock and defer handling until we release the lock. |
In the Linux kernel, the following vulnerability has been resolved:
scsi: ufs: Fix a deadlock in the error handler
The following deadlock has been observed on a test setup:
- All tags allocated
- The SCSI error handler calls ufshcd_eh_host_reset_handler()
- ufshcd_eh_host_reset_handler() queues work that calls
ufshcd_err_handler()
- ufshcd_err_handler() locks up as follows:
Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt
Call trace:
__switch_to+0x298/0x5d8
__schedule+0x6cc/0xa94
schedule+0x12c/0x298
blk_mq_get_tag+0x210/0x480
__blk_mq_alloc_request+0x1c8/0x284
blk_get_request+0x74/0x134
ufshcd_exec_dev_cmd+0x68/0x640
ufshcd_verify_dev_init+0x68/0x35c
ufshcd_probe_hba+0x12c/0x1cb8
ufshcd_host_reset_and_restore+0x88/0x254
ufshcd_reset_and_restore+0xd0/0x354
ufshcd_err_handler+0x408/0xc58
process_one_work+0x24c/0x66c
worker_thread+0x3e8/0xa4c
kthread+0x150/0x1b4
ret_from_fork+0x10/0x30
Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved
request. |
In the Linux kernel, the following vulnerability has been resolved:
mptcp: fix deadlock in __mptcp_push_pending()
__mptcp_push_pending() may call mptcp_flush_join_list() with subflow
socket lock held. If such call hits mptcp_sockopt_sync_all() then
subsequently __mptcp_sockopt_sync() could try to lock the subflow
socket for itself, causing a deadlock.
sysrq: Show Blocked State
task:ss-server state:D stack: 0 pid: 938 ppid: 1 flags:0x00000000
Call Trace:
<TASK>
__schedule+0x2d6/0x10c0
? __mod_memcg_state+0x4d/0x70
? csum_partial+0xd/0x20
? _raw_spin_lock_irqsave+0x26/0x50
schedule+0x4e/0xc0
__lock_sock+0x69/0x90
? do_wait_intr_irq+0xa0/0xa0
__lock_sock_fast+0x35/0x50
mptcp_sockopt_sync_all+0x38/0xc0
__mptcp_push_pending+0x105/0x200
mptcp_sendmsg+0x466/0x490
sock_sendmsg+0x57/0x60
__sys_sendto+0xf0/0x160
? do_wait_intr_irq+0xa0/0xa0
? fpregs_restore_userregs+0x12/0xd0
__x64_sys_sendto+0x20/0x30
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f9ba546c2d0
RSP: 002b:00007ffdc3b762d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f9ba56c8060 RCX: 00007f9ba546c2d0
RDX: 000000000000077a RSI: 0000000000e5e180 RDI: 0000000000000234
RBP: 0000000000cc57f0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9ba56c8060
R13: 0000000000b6ba60 R14: 0000000000cc7840 R15: 41d8685b1d7901b8
</TASK>
Fix the issue by using __mptcp_flush_join_list() instead of plain
mptcp_flush_join_list() inside __mptcp_push_pending(), as suggested by
Florian. The sockopt sync will be deferred to the workqueue. |
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Fix memory leak in mlx5_core_destroy_cq() error path
Prior to this patch in case mlx5_core_destroy_cq() failed it returns
without completing all destroy operations and that leads to memory leak.
Instead, complete the destroy flow before return error.
Also move mlx5_debug_cq_remove() to the beginning of mlx5_core_destroy_cq()
to be symmetrical with mlx5_core_create_cq().
kmemleak complains on:
unreferenced object 0xc000000038625100 (size 64):
comm "ethtool", pid 28301, jiffies 4298062946 (age 785.380s)
hex dump (first 32 bytes):
60 01 48 94 00 00 00 c0 b8 05 34 c3 00 00 00 c0 `.H.......4.....
02 00 00 00 00 00 00 00 00 db 7d c1 00 00 00 c0 ..........}.....
backtrace:
[<000000009e8643cb>] add_res_tree+0xd0/0x270 [mlx5_core]
[<00000000e7cb8e6c>] mlx5_debug_cq_add+0x5c/0xc0 [mlx5_core]
[<000000002a12918f>] mlx5_core_create_cq+0x1d0/0x2d0 [mlx5_core]
[<00000000cef0a696>] mlx5e_create_cq+0x210/0x3f0 [mlx5_core]
[<000000009c642c26>] mlx5e_open_cq+0xb4/0x130 [mlx5_core]
[<0000000058dfa578>] mlx5e_ptp_open+0x7f4/0xe10 [mlx5_core]
[<0000000081839561>] mlx5e_open_channels+0x9cc/0x13e0 [mlx5_core]
[<0000000009cf05d4>] mlx5e_switch_priv_channels+0xa4/0x230
[mlx5_core]
[<0000000042bbedd8>] mlx5e_safe_switch_params+0x14c/0x300
[mlx5_core]
[<0000000004bc9db8>] set_pflag_tx_port_ts+0x9c/0x160 [mlx5_core]
[<00000000a0553443>] mlx5e_set_priv_flags+0xd0/0x1b0 [mlx5_core]
[<00000000a8f3d84b>] ethnl_set_privflags+0x234/0x2d0
[<00000000fd27f27c>] genl_family_rcv_msg_doit+0x108/0x1d0
[<00000000f495e2bb>] genl_family_rcv_msg+0xe4/0x1f0
[<00000000646c5c2c>] genl_rcv_msg+0x78/0x120
[<00000000d53e384e>] netlink_rcv_skb+0x74/0x1a0 |
In the Linux kernel, the following vulnerability has been resolved:
iio: adis16475: fix deadlock on frequency set
With commit 39c024b51b560
("iio: adis16475: improve sync scale mode handling"), two deadlocks were
introduced:
1) The call to 'adis_write_reg_16()' was not changed to it's unlocked
version.
2) The lock was not being released on the success path of the function.
This change fixes both these issues. |
In the Linux kernel, the following vulnerability has been resolved:
mwifiex: bring down link before deleting interface
We can deadlock when rmmod'ing the driver or going through firmware
reset, because the cfg80211_unregister_wdev() has to bring down the link
for us, ... which then grab the same wiphy lock.
nl80211_del_interface() already handles a very similar case, with a nice
description:
/*
* We hold RTNL, so this is safe, without RTNL opencount cannot
* reach 0, and thus the rdev cannot be deleted.
*
* We need to do it for the dev_close(), since that will call
* the netdev notifiers, and we need to acquire the mutex there
* but don't know if we get there from here or from some other
* place (e.g. "ip link set ... down").
*/
mutex_unlock(&rdev->wiphy.mtx);
...
Do similarly for mwifiex teardown, by ensuring we bring the link down
first.
Sample deadlock trace:
[ 247.103516] INFO: task rmmod:2119 blocked for more than 123 seconds.
[ 247.110630] Not tainted 5.12.4 #5
[ 247.115796] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 247.124557] task:rmmod state:D stack: 0 pid: 2119 ppid: 2114 flags:0x00400208
[ 247.133905] Call trace:
[ 247.136644] __switch_to+0x130/0x170
[ 247.140643] __schedule+0x714/0xa0c
[ 247.144548] schedule_preempt_disabled+0x88/0xf4
[ 247.149714] __mutex_lock_common+0x43c/0x750
[ 247.154496] mutex_lock_nested+0x5c/0x68
[ 247.158884] cfg80211_netdev_notifier_call+0x280/0x4e0 [cfg80211]
[ 247.165769] raw_notifier_call_chain+0x4c/0x78
[ 247.170742] call_netdevice_notifiers_info+0x68/0xa4
[ 247.176305] __dev_close_many+0x7c/0x138
[ 247.180693] dev_close_many+0x7c/0x10c
[ 247.184893] unregister_netdevice_many+0xfc/0x654
[ 247.190158] unregister_netdevice_queue+0xb4/0xe0
[ 247.195424] _cfg80211_unregister_wdev+0xa4/0x204 [cfg80211]
[ 247.201816] cfg80211_unregister_wdev+0x20/0x2c [cfg80211]
[ 247.208016] mwifiex_del_virtual_intf+0xc8/0x188 [mwifiex]
[ 247.214174] mwifiex_uninit_sw+0x158/0x1b0 [mwifiex]
[ 247.219747] mwifiex_remove_card+0x38/0xa0 [mwifiex]
[ 247.225316] mwifiex_pcie_remove+0xd0/0xe0 [mwifiex_pcie]
[ 247.231451] pci_device_remove+0x50/0xe0
[ 247.235849] device_release_driver_internal+0x110/0x1b0
[ 247.241701] driver_detach+0x5c/0x9c
[ 247.245704] bus_remove_driver+0x84/0xb8
[ 247.250095] driver_unregister+0x3c/0x60
[ 247.254486] pci_unregister_driver+0x2c/0x90
[ 247.259267] cleanup_module+0x18/0xcdc [mwifiex_pcie] |
In the Linux kernel, the following vulnerability has been resolved:
usb: cdnsp: Fix deadlock issue in cdnsp_thread_irq_handler
Patch fixes the following critical issue caused by deadlock which has been
detected during testing NCM class:
smp: csd: Detected non-responsive CSD lock (#1) on CPU#0
smp: csd: CSD lock (#1) unresponsive.
....
RIP: 0010:native_queued_spin_lock_slowpath+0x61/0x1d0
RSP: 0018:ffffbc494011cde0 EFLAGS: 00000002
RAX: 0000000000000101 RBX: ffff9ee8116b4a68 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9ee8116b4658
RBP: ffffbc494011cde0 R08: 0000000000000001 R09: 0000000000000000
R10: ffff9ee8116b4670 R11: 0000000000000000 R12: ffff9ee8116b4658
R13: ffff9ee8116b4670 R14: 0000000000000246 R15: ffff9ee8116b4658
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7bcc41a830 CR3: 000000007a612003 CR4: 00000000001706e0
Call Trace:
<IRQ>
do_raw_spin_lock+0xc0/0xd0
_raw_spin_lock_irqsave+0x95/0xa0
cdnsp_gadget_ep_queue.cold+0x88/0x107 [cdnsp_udc_pci]
usb_ep_queue+0x35/0x110
eth_start_xmit+0x220/0x3d0 [u_ether]
ncm_tx_timeout+0x34/0x40 [usb_f_ncm]
? ncm_free_inst+0x50/0x50 [usb_f_ncm]
__hrtimer_run_queues+0xac/0x440
hrtimer_run_softirq+0x8c/0xb0
__do_softirq+0xcf/0x428
asm_call_irq_on_stack+0x12/0x20
</IRQ>
do_softirq_own_stack+0x61/0x70
irq_exit_rcu+0xc1/0xd0
sysvec_apic_timer_interrupt+0x52/0xb0
asm_sysvec_apic_timer_interrupt+0x12/0x20
RIP: 0010:do_raw_spin_trylock+0x18/0x40
RSP: 0018:ffffbc494138bda8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9ee8116b4658 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9ee8116b4658
RBP: ffffbc494138bda8 R08: 0000000000000001 R09: 0000000000000000
R10: ffff9ee8116b4670 R11: 0000000000000000 R12: ffff9ee8116b4658
R13: ffff9ee8116b4670 R14: ffff9ee7b5c73d80 R15: ffff9ee8116b4000
_raw_spin_lock+0x3d/0x70
? cdnsp_thread_irq_handler.cold+0x32/0x112c [cdnsp_udc_pci]
cdnsp_thread_irq_handler.cold+0x32/0x112c [cdnsp_udc_pci]
? cdnsp_remove_request+0x1f0/0x1f0 [cdnsp_udc_pci]
? cdnsp_thread_irq_handler+0x5/0xa0 [cdnsp_udc_pci]
? irq_thread+0xa0/0x1c0
irq_thread_fn+0x28/0x60
irq_thread+0x105/0x1c0
? __kthread_parkme+0x42/0x90
? irq_forced_thread_fn+0x90/0x90
? wake_threads_waitq+0x30/0x30
? irq_thread_check_affinity+0xe0/0xe0
kthread+0x12a/0x160
? kthread_park+0x90/0x90
ret_from_fork+0x22/0x30
The root cause of issue is spin_lock/spin_unlock instruction instead
spin_lock_irqsave/spin_lock_irqrestore in cdnsp_thread_irq_handler
function. |
In the Linux kernel, the following vulnerability has been resolved:
mac80211: fix deadlock in AP/VLAN handling
Syzbot reports that when you have AP_VLAN interfaces that are up
and close the AP interface they belong to, we get a deadlock. No
surprise - since we dev_close() them with the wiphy mutex held,
which goes back into the netdev notifier in cfg80211 and tries to
acquire the wiphy mutex there.
To fix this, we need to do two things:
1) prevent changing iftype while AP_VLANs are up, we can't
easily fix this case since cfg80211 already calls us with
the wiphy mutex held, but change_interface() is relatively
rare in drivers anyway, so changing iftype isn't used much
(and userspace has to fall back to down/change/up anyway)
2) pull the dev_close() loop over VLANs out of the wiphy mutex
section in the normal stop case |
In the Linux kernel, the following vulnerability has been resolved:
bpf, lockdown, audit: Fix buggy SELinux lockdown permission checks
Commit 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown")
added an implementation of the locked_down LSM hook to SELinux, with the aim
to restrict which domains are allowed to perform operations that would breach
lockdown. This is indirectly also getting audit subsystem involved to report
events. The latter is problematic, as reported by Ondrej and Serhei, since it
can bring down the whole system via audit:
1) The audit events that are triggered due to calls to security_locked_down()
can OOM kill a machine, see below details [0].
2) It also seems to be causing a deadlock via avc_has_perm()/slow_avc_audit()
when trying to wake up kauditd, for example, when using trace_sched_switch()
tracepoint, see details in [1]. Triggering this was not via some hypothetical
corner case, but with existing tools like runqlat & runqslower from bcc, for
example, which make use of this tracepoint. Rough call sequence goes like:
rq_lock(rq) -> -------------------------+
trace_sched_switch() -> |
bpf_prog_xyz() -> +-> deadlock
selinux_lockdown() -> |
audit_log_end() -> |
wake_up_interruptible() -> |
try_to_wake_up() -> |
rq_lock(rq) --------------+
What's worse is that the intention of 59438b46471a to further restrict lockdown
settings for specific applications in respect to the global lockdown policy is
completely broken for BPF. The SELinux policy rule for the current lockdown check
looks something like this:
allow <who> <who> : lockdown { <reason> };
However, this doesn't match with the 'current' task where the security_locked_down()
is executed, example: httpd does a syscall. There is a tracing program attached
to the syscall which triggers a BPF program to run, which ends up doing a
bpf_probe_read_kernel{,_str}() helper call. The selinux_lockdown() hook does
the permission check against 'current', that is, httpd in this example. httpd
has literally zero relation to this tracing program, and it would be nonsensical
having to write an SELinux policy rule against httpd to let the tracing helper
pass. The policy in this case needs to be against the entity that is installing
the BPF program. For example, if bpftrace would generate a histogram of syscall
counts by user space application:
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'
bpftrace would then go and generate a BPF program from this internally. One way
of doing it [for the sake of the example] could be to call bpf_get_current_task()
helper and then access current->comm via one of bpf_probe_read_kernel{,_str}()
helpers. So the program itself has nothing to do with httpd or any other random
app doing a syscall here. The BPF program _explicitly initiated_ the lockdown
check. The allow/deny policy belongs in the context of bpftrace: meaning, you
want to grant bpftrace access to use these helpers, but other tracers on the
system like my_random_tracer _not_.
Therefore fix all three issues at the same time by taking a completely different
approach for the security_locked_down() hook, that is, move the check into the
program verification phase where we actually retrieve the BPF func proto. This
also reliably gets the task (current) that is trying to install the BPF tracing
program, e.g. bpftrace/bcc/perf/systemtap/etc, and it also fixes the OOM since
we're moving this out of the BPF helper's fast-path which can be called several
millions of times per second.
The check is then also in line with other security_locked_down() hooks in the
system where the enforcement is performed at open/load time, for example,
open_kcore() for /proc/kcore access or module_sig_check() for module signatures
just to pick f
---truncated--- |
In the Linux kernel, the following vulnerability has been resolved:
nvmet-tcp: fix incorrect locking in state_change sk callback
We are not changing anything in the TCP connection state so
we should not take a write_lock but rather a read lock.
This caused a deadlock when running nvmet-tcp and nvme-tcp
on the same system, where state_change callbacks on the
host and on the controller side have causal relationship
and made lockdep report on this with blktests:
================================
WARNING: inconsistent lock state
5.12.0-rc3 #1 Tainted: G I
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-R} usage.
nvme/1324 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff888363151000 (clock-AF_INET){++-?}-{2:2}, at: nvme_tcp_state_change+0x21/0x150 [nvme_tcp]
{IN-SOFTIRQ-W} state was registered at:
__lock_acquire+0x79b/0x18d0
lock_acquire+0x1ca/0x480
_raw_write_lock_bh+0x39/0x80
nvmet_tcp_state_change+0x21/0x170 [nvmet_tcp]
tcp_fin+0x2a8/0x780
tcp_data_queue+0xf94/0x1f20
tcp_rcv_established+0x6ba/0x1f00
tcp_v4_do_rcv+0x502/0x760
tcp_v4_rcv+0x257e/0x3430
ip_protocol_deliver_rcu+0x69/0x6a0
ip_local_deliver_finish+0x1e2/0x2f0
ip_local_deliver+0x1a2/0x420
ip_rcv+0x4fb/0x6b0
__netif_receive_skb_one_core+0x162/0x1b0
process_backlog+0x1ff/0x770
__napi_poll.constprop.0+0xa9/0x5c0
net_rx_action+0x7b3/0xb30
__do_softirq+0x1f0/0x940
do_softirq+0xa1/0xd0
__local_bh_enable_ip+0xd8/0x100
ip_finish_output2+0x6b7/0x18a0
__ip_queue_xmit+0x706/0x1aa0
__tcp_transmit_skb+0x2068/0x2e20
tcp_write_xmit+0xc9e/0x2bb0
__tcp_push_pending_frames+0x92/0x310
inet_shutdown+0x158/0x300
__nvme_tcp_stop_queue+0x36/0x270 [nvme_tcp]
nvme_tcp_stop_queue+0x87/0xb0 [nvme_tcp]
nvme_tcp_teardown_admin_queue+0x69/0xe0 [nvme_tcp]
nvme_do_delete_ctrl+0x100/0x10c [nvme_core]
nvme_sysfs_delete.cold+0x8/0xd [nvme_core]
kernfs_fop_write_iter+0x2c7/0x460
new_sync_write+0x36c/0x610
vfs_write+0x5c0/0x870
ksys_write+0xf9/0x1d0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
irq event stamp: 10687
hardirqs last enabled at (10687): [<ffffffff9ec376bd>] _raw_spin_unlock_irqrestore+0x2d/0x40
hardirqs last disabled at (10686): [<ffffffff9ec374d8>] _raw_spin_lock_irqsave+0x68/0x90
softirqs last enabled at (10684): [<ffffffff9f000608>] __do_softirq+0x608/0x940
softirqs last disabled at (10649): [<ffffffff9cdedd31>] do_softirq+0xa1/0xd0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(clock-AF_INET);
<Interrupt>
lock(clock-AF_INET);
*** DEADLOCK ***
5 locks held by nvme/1324:
#0: ffff8884a01fe470 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xf9/0x1d0
#1: ffff8886e435c090 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x216/0x460
#2: ffff888104d90c38 (kn->active#255){++++}-{0:0}, at: kernfs_remove_self+0x22d/0x330
#3: ffff8884634538d0 (&queue->queue_lock){+.+.}-{3:3}, at: nvme_tcp_stop_queue+0x52/0xb0 [nvme_tcp]
#4: ffff888363150d30 (sk_lock-AF_INET){+.+.}-{0:0}, at: inet_shutdown+0x59/0x300
stack backtrace:
CPU: 26 PID: 1324 Comm: nvme Tainted: G I 5.12.0-rc3 #1
Hardware name: Dell Inc. PowerEdge R640/06NR82, BIOS 2.10.0 11/12/2020
Call Trace:
dump_stack+0x93/0xc2
mark_lock_irq.cold+0x2c/0xb3
? verify_lock_unused+0x390/0x390
? stack_trace_consume_entry+0x160/0x160
? lock_downgrade+0x100/0x100
? save_trace+0x88/0x5e0
? _raw_spin_unlock_irqrestore+0x2d/0x40
mark_lock+0x530/0x1470
? mark_lock_irq+0x1d10/0x1d10
? enqueue_timer+0x660/0x660
mark_usage+0x215/0x2a0
__lock_acquire+0x79b/0x18d0
? tcp_schedule_loss_probe.part.0+0x38c/0x520
lock_acquire+0x1ca/0x480
? nvme_tcp_state_change+0x21/0x150 [nvme_tcp]
? rcu_read_unlock+0x40/0x40
? tcp_mtu_probe+0x1ae0/0x1ae0
? kmalloc_reserve+0xa0/0xa0
? sysfs_file_ops+0x170/0x170
_raw_read_lock+0x3d/0xa0
? nvme_tcp_state_change+0x21/0x150 [nvme_tcp]
nvme_tcp_state_change+0x21/0x150 [nvme_tcp]
? sysfs_file_ops
---truncated--- |
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: avoid deadlock between hci_dev->lock and socket lock
Commit eab2404ba798 ("Bluetooth: Add BT_PHY socket option") added a
dependency between socket lock and hci_dev->lock that could lead to
deadlock.
It turns out that hci_conn_get_phy() is not in any way relying on hdev
being immutable during the runtime of this function, neither does it even
look at any of the members of hdev, and as such there is no need to hold
that lock.
This fixes the lockdep splat below:
======================================================
WARNING: possible circular locking dependency detected
5.12.0-rc1-00026-g73d464503354 #10 Not tainted
------------------------------------------------------
bluetoothd/1118 is trying to acquire lock:
ffff8f078383c078 (&hdev->lock){+.+.}-{3:3}, at: hci_conn_get_phy+0x1c/0x150 [bluetooth]
but task is already holding lock:
ffff8f07e831d920 (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}-{0:0}, at: l2cap_sock_getsockopt+0x8b/0x610
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}-{0:0}:
lock_sock_nested+0x72/0xa0
l2cap_sock_ready_cb+0x18/0x70 [bluetooth]
l2cap_config_rsp+0x27a/0x520 [bluetooth]
l2cap_sig_channel+0x658/0x1330 [bluetooth]
l2cap_recv_frame+0x1ba/0x310 [bluetooth]
hci_rx_work+0x1cc/0x640 [bluetooth]
process_one_work+0x244/0x5f0
worker_thread+0x3c/0x380
kthread+0x13e/0x160
ret_from_fork+0x22/0x30
-> #2 (&chan->lock#2/1){+.+.}-{3:3}:
__mutex_lock+0xa3/0xa10
l2cap_chan_connect+0x33a/0x940 [bluetooth]
l2cap_sock_connect+0x141/0x2a0 [bluetooth]
__sys_connect+0x9b/0xc0
__x64_sys_connect+0x16/0x20
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #1 (&conn->chan_lock){+.+.}-{3:3}:
__mutex_lock+0xa3/0xa10
l2cap_chan_connect+0x322/0x940 [bluetooth]
l2cap_sock_connect+0x141/0x2a0 [bluetooth]
__sys_connect+0x9b/0xc0
__x64_sys_connect+0x16/0x20
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #0 (&hdev->lock){+.+.}-{3:3}:
__lock_acquire+0x147a/0x1a50
lock_acquire+0x277/0x3d0
__mutex_lock+0xa3/0xa10
hci_conn_get_phy+0x1c/0x150 [bluetooth]
l2cap_sock_getsockopt+0x5a9/0x610 [bluetooth]
__sys_getsockopt+0xcc/0x200
__x64_sys_getsockopt+0x20/0x30
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of:
&hdev->lock --> &chan->lock#2/1 --> sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP);
lock(&chan->lock#2/1);
lock(sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP);
lock(&hdev->lock);
*** DEADLOCK ***
1 lock held by bluetoothd/1118:
#0: ffff8f07e831d920 (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}-{0:0}, at: l2cap_sock_getsockopt+0x8b/0x610 [bluetooth]
stack backtrace:
CPU: 3 PID: 1118 Comm: bluetoothd Not tainted 5.12.0-rc1-00026-g73d464503354 #10
Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05/31/2017
Call Trace:
dump_stack+0x7f/0xa1
check_noncircular+0x105/0x120
? __lock_acquire+0x147a/0x1a50
__lock_acquire+0x147a/0x1a50
lock_acquire+0x277/0x3d0
? hci_conn_get_phy+0x1c/0x150 [bluetooth]
? __lock_acquire+0x2e1/0x1a50
? lock_is_held_type+0xb4/0x120
? hci_conn_get_phy+0x1c/0x150 [bluetooth]
__mutex_lock+0xa3/0xa10
? hci_conn_get_phy+0x1c/0x150 [bluetooth]
? lock_acquire+0x277/0x3d0
? mark_held_locks+0x49/0x70
? mark_held_locks+0x49/0x70
? hci_conn_get_phy+0x1c/0x150 [bluetooth]
hci_conn_get_phy+0x
---truncated--- |
In the Linux kernel, the following vulnerability has been resolved:
tracing: Restructure trace_clock_global() to never block
It was reported that a fix to the ring buffer recursion detection would
cause a hung machine when performing suspend / resume testing. The
following backtrace was extracted from debugging that case:
Call Trace:
trace_clock_global+0x91/0xa0
__rb_reserve_next+0x237/0x460
ring_buffer_lock_reserve+0x12a/0x3f0
trace_buffer_lock_reserve+0x10/0x50
__trace_graph_return+0x1f/0x80
trace_graph_return+0xb7/0xf0
? trace_clock_global+0x91/0xa0
ftrace_return_to_handler+0x8b/0xf0
? pv_hash+0xa0/0xa0
return_to_handler+0x15/0x30
? ftrace_graph_caller+0xa0/0xa0
? trace_clock_global+0x91/0xa0
? __rb_reserve_next+0x237/0x460
? ring_buffer_lock_reserve+0x12a/0x3f0
? trace_event_buffer_lock_reserve+0x3c/0x120
? trace_event_buffer_reserve+0x6b/0xc0
? trace_event_raw_event_device_pm_callback_start+0x125/0x2d0
? dpm_run_callback+0x3b/0xc0
? pm_ops_is_empty+0x50/0x50
? platform_get_irq_byname_optional+0x90/0x90
? trace_device_pm_callback_start+0x82/0xd0
? dpm_run_callback+0x49/0xc0
With the following RIP:
RIP: 0010:native_queued_spin_lock_slowpath+0x69/0x200
Since the fix to the recursion detection would allow a single recursion to
happen while tracing, this lead to the trace_clock_global() taking a spin
lock and then trying to take it again:
ring_buffer_lock_reserve() {
trace_clock_global() {
arch_spin_lock() {
queued_spin_lock_slowpath() {
/* lock taken */
(something else gets traced by function graph tracer)
ring_buffer_lock_reserve() {
trace_clock_global() {
arch_spin_lock() {
queued_spin_lock_slowpath() {
/* DEAD LOCK! */
Tracing should *never* block, as it can lead to strange lockups like the
above.
Restructure the trace_clock_global() code to instead of simply taking a
lock to update the recorded "prev_time" simply use it, as two events
happening on two different CPUs that calls this at the same time, really
doesn't matter which one goes first. Use a trylock to grab the lock for
updating the prev_time, and if it fails, simply try again the next time.
If it failed to be taken, that means something else is already updating
it.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212761 |
Visual Studio Denial of Service Vulnerability |
systemd 250 and 251 allows local users to achieve a systemd-coredump deadlock by triggering a crash that has a long backtrace. This occurs in parse_elf_object in shared/elf-util.c. The exploitation methodology is to crash a binary calling the same function recursively, and put it in a deeply nested directory to make its backtrace large enough to cause the deadlock. This must be done 16 times when MaxConnections=16 is set for the systemd/units/systemd-coredump.socket file. |
Guests can trigger deadlock in Linux netback driver T[his CNA information record relates to multiple CVEs; the text explains which aspects/vulnerabilities correspond to which CVE.] The patch for XSA-392 introduced another issue which might result in a deadlock when trying to free the SKB of a packet dropped due to the XSA-392 handling (CVE-2022-42328). Additionally when dropping packages for other reasons the same deadlock could occur in case of netpoll being active for the interface the xen-netback driver is connected to (CVE-2022-42329). |
Guests can trigger deadlock in Linux netback driver T[his CNA information record relates to multiple CVEs; the text explains which aspects/vulnerabilities correspond to which CVE.] The patch for XSA-392 introduced another issue which might result in a deadlock when trying to free the SKB of a packet dropped due to the XSA-392 handling (CVE-2022-42328). Additionally when dropping packages for other reasons the same deadlock could occur in case of netpoll being active for the interface the xen-netback driver is connected to (CVE-2022-42329). |