| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
drivers: staging: rtl8192bs: Fix deadlock in rtw_joinbss_event_prehandle()
There is a deadlock in rtw_joinbss_event_prehandle(), which is shown
below:
(Thread 1) | (Thread 2)
| _set_timer()
rtw_joinbss_event_prehandle()| mod_timer()
spin_lock_bh() //(1) | (wait a time)
... | _rtw_join_timeout_handler()
del_timer_sync() | spin_lock_bh() //(2)
(wait timer to stop) | ...
We hold pmlmepriv->lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need pmlmepriv->lock in position (2) of thread 2.
As a result, rtw_joinbss_event_prehandle() will block forever.
This patch extracts del_timer_sync() from the protection of
spin_lock_bh(), which could let timer handler to obtain
the needed lock. What`s more, we change spin_lock_bh() to
spin_lock_irq() in _rtw_join_timeout_handler() in order to
prevent deadlock. |
| In the Linux kernel, the following vulnerability has been resolved:
drivers: staging: rtl8192u: Fix deadlock in ieee80211_beacons_stop()
There is a deadlock in ieee80211_beacons_stop(), which is shown below:
(Thread 1) | (Thread 2)
| ieee80211_send_beacon()
ieee80211_beacons_stop() | mod_timer()
spin_lock_irqsave() //(1) | (wait a time)
... | ieee80211_send_beacon_cb()
del_timer_sync() | spin_lock_irqsave() //(2)
(wait timer to stop) | ...
We hold ieee->beacon_lock in position (1) of thread 1 and use
del_timer_sync() to wait timer to stop, but timer handler
also need ieee->beacon_lock in position (2) of thread 2.
As a result, ieee80211_beacons_stop() will block forever.
This patch extracts del_timer_sync() from the protection of
spin_lock_irqsave(), which could let timer handler to obtain
the needed lock. |
| In the Linux kernel, the following vulnerability has been resolved:
drivers: tty: serial: Fix deadlock in sa1100_set_termios()
There is a deadlock in sa1100_set_termios(), which is shown
below:
(Thread 1) | (Thread 2)
| sa1100_enable_ms()
sa1100_set_termios() | mod_timer()
spin_lock_irqsave() //(1) | (wait a time)
... | sa1100_timeout()
del_timer_sync() | spin_lock_irqsave() //(2)
(wait timer to stop) | ...
We hold sport->port.lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need sport->port.lock in position (2) of thread 2. As a result,
sa1100_set_termios() will block forever.
This patch moves del_timer_sync() before spin_lock_irqsave()
in order to prevent the deadlock. |
| In the Linux kernel, the following vulnerability has been resolved:
drivers: staging: rtl8192eu: Fix deadlock in rtw_joinbss_event_prehandle
There is a deadlock in rtw_joinbss_event_prehandle(), which is shown below:
(Thread 1) | (Thread 2)
| _set_timer()
rtw_joinbss_event_prehandle()| mod_timer()
spin_lock_bh() //(1) | (wait a time)
... | rtw_join_timeout_handler()
| _rtw_join_timeout_handler()
del_timer_sync() | spin_lock_bh() //(2)
(wait timer to stop) | ...
We hold pmlmepriv->lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need pmlmepriv->lock in position (2) of thread 2.
As a result, rtw_joinbss_event_prehandle() will block forever.
This patch extracts del_timer_sync() from the protection of
spin_lock_bh(), which could let timer handler to obtain
the needed lock. What`s more, we change spin_lock_bh() to
spin_lock_irq() in _rtw_join_timeout_handler() in order to
prevent deadlock. |
| In the Linux kernel, the following vulnerability has been resolved:
ceph: fix possible deadlock when holding Fwb to get inline_data
1, mount with wsync.
2, create a file with O_RDWR, and the request was sent to mds.0:
ceph_atomic_open()-->
ceph_mdsc_do_request(openc)
finish_open(file, dentry, ceph_open)-->
ceph_open()-->
ceph_init_file()-->
ceph_init_file_info()-->
ceph_uninline_data()-->
{
...
if (inline_version == 1 || /* initial version, no data */
inline_version == CEPH_INLINE_NONE)
goto out_unlock;
...
}
The inline_version will be 1, which is the initial version for the
new create file. And here the ci->i_inline_version will keep with 1,
it's buggy.
3, buffer write to the file immediately:
ceph_write_iter()-->
ceph_get_caps(file, need=Fw, want=Fb, ...);
generic_perform_write()-->
a_ops->write_begin()-->
ceph_write_begin()-->
netfs_write_begin()-->
netfs_begin_read()-->
netfs_rreq_submit_slice()-->
netfs_read_from_server()-->
rreq->netfs_ops->issue_read()-->
ceph_netfs_issue_read()-->
{
...
if (ci->i_inline_version != CEPH_INLINE_NONE &&
ceph_netfs_issue_op_inline(subreq))
return;
...
}
ceph_put_cap_refs(ci, Fwb);
The ceph_netfs_issue_op_inline() will send a getattr(Fsr) request to
mds.1.
4, then the mds.1 will request the rd lock for CInode::filelock from
the auth mds.0, the mds.0 will do the CInode::filelock state transation
from excl --> sync, but it need to revoke the Fxwb caps back from the
clients.
While the kernel client has aleady held the Fwb caps and waiting for
the getattr(Fsr).
It's deadlock!
URL: https://tracker.ceph.com/issues/55377 |
| In the Linux kernel, the following vulnerability has been resolved:
ath11k: Fix frames flush failure caused by deadlock
We are seeing below warnings:
kernel: [25393.301506] ath11k_pci 0000:01:00.0: failed to flush mgmt transmit queue 0
kernel: [25398.421509] ath11k_pci 0000:01:00.0: failed to flush mgmt transmit queue 0
kernel: [25398.421831] ath11k_pci 0000:01:00.0: dropping mgmt frame for vdev 0, is_started 0
this means ath11k fails to flush mgmt. frames because wmi_mgmt_tx_work
has no chance to run in 5 seconds.
By setting /proc/sys/kernel/hung_task_timeout_secs to 20 and increasing
ATH11K_FLUSH_TIMEOUT to 50 we get below warnings:
kernel: [ 120.763160] INFO: task wpa_supplicant:924 blocked for more than 20 seconds.
kernel: [ 120.763169] Not tainted 5.10.90 #12
kernel: [ 120.763177] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: [ 120.763186] task:wpa_supplicant state:D stack: 0 pid: 924 ppid: 1 flags:0x000043a0
kernel: [ 120.763201] Call Trace:
kernel: [ 120.763214] __schedule+0x785/0x12fa
kernel: [ 120.763224] ? lockdep_hardirqs_on_prepare+0xe2/0x1bb
kernel: [ 120.763242] schedule+0x7e/0xa1
kernel: [ 120.763253] schedule_timeout+0x98/0xfe
kernel: [ 120.763266] ? run_local_timers+0x4a/0x4a
kernel: [ 120.763291] ath11k_mac_flush_tx_complete+0x197/0x2b1 [ath11k 13c3a9bf37790f4ac8103b3decf7ab4008ac314a]
kernel: [ 120.763306] ? init_wait_entry+0x2e/0x2e
kernel: [ 120.763343] __ieee80211_flush_queues+0x167/0x21f [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763378] __ieee80211_recalc_idle+0x105/0x125 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763411] ieee80211_recalc_idle+0x14/0x27 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763441] ieee80211_free_chanctx+0x77/0xa2 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763473] __ieee80211_vif_release_channel+0x100/0x131 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763540] ieee80211_vif_release_channel+0x66/0x81 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763572] ieee80211_destroy_auth_data+0xa3/0xe6 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763612] ieee80211_mgd_deauth+0x178/0x29b [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763654] cfg80211_mlme_deauth+0x1a8/0x22c [cfg80211 8945aa5bc2af5f6972336665d8ad6f9c191ad5be]
kernel: [ 120.763697] nl80211_deauthenticate+0xfa/0x123 [cfg80211 8945aa5bc2af5f6972336665d8ad6f9c191ad5be]
kernel: [ 120.763715] genl_rcv_msg+0x392/0x3c2
kernel: [ 120.763750] ? nl80211_associate+0x432/0x432 [cfg80211 8945aa5bc2af5f6972336665d8ad6f9c191ad5be]
kernel: [ 120.763782] ? nl80211_associate+0x432/0x432 [cfg80211 8945aa5bc2af5f6972336665d8ad6f9c191ad5be]
kernel: [ 120.763802] ? genl_rcv+0x36/0x36
kernel: [ 120.763814] netlink_rcv_skb+0x89/0xf7
kernel: [ 120.763829] genl_rcv+0x28/0x36
kernel: [ 120.763840] netlink_unicast+0x179/0x24b
kernel: [ 120.763854] netlink_sendmsg+0x393/0x401
kernel: [ 120.763872] sock_sendmsg+0x72/0x76
kernel: [ 120.763886] ____sys_sendmsg+0x170/0x1e6
kernel: [ 120.763897] ? copy_msghdr_from_user+0x7a/0xa2
kernel: [ 120.763914] ___sys_sendmsg+0x95/0xd1
kernel: [ 120.763940] __sys_sendmsg+0x85/0xbf
kernel: [ 120.763956] do_syscall_64+0x43/0x55
kernel: [ 120.763966] entry_SYSCALL_64_after_hwframe+0x44/0xa9
kernel: [ 120.763977] RIP: 0033:0x79089f3fcc83
kernel: [ 120.763986] RSP: 002b:00007ffe604f0508 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
kernel: [ 120.763997] RAX: ffffffffffffffda RBX: 000059b40e987690 RCX: 000079089f3fcc83
kernel: [ 120.764006] RDX: 0000000000000000 RSI: 00007ffe604f0558 RDI: 0000000000000009
kernel: [ 120.764014] RBP: 00007ffe604f0540 R08: 0000000000000004 R09: 0000000000400000
kernel: [ 120.764023] R10: 00007ffe604f0638 R11: 0000000000000246 R12: 000059b40ea04980
kernel: [ 120.764032] R13: 00007ffe604
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
ubifs: Fix deadlock in concurrent rename whiteout and inode writeback
Following hung tasks:
[ 77.028764] task:kworker/u8:4 state:D stack: 0 pid: 132
[ 77.028820] Call Trace:
[ 77.029027] schedule+0x8c/0x1b0
[ 77.029067] mutex_lock+0x50/0x60
[ 77.029074] ubifs_write_inode+0x68/0x1f0 [ubifs]
[ 77.029117] __writeback_single_inode+0x43c/0x570
[ 77.029128] writeback_sb_inodes+0x259/0x740
[ 77.029148] wb_writeback+0x107/0x4d0
[ 77.029163] wb_workfn+0x162/0x7b0
[ 92.390442] task:aa state:D stack: 0 pid: 1506
[ 92.390448] Call Trace:
[ 92.390458] schedule+0x8c/0x1b0
[ 92.390461] wb_wait_for_completion+0x82/0xd0
[ 92.390469] __writeback_inodes_sb_nr+0xb2/0x110
[ 92.390472] writeback_inodes_sb_nr+0x14/0x20
[ 92.390476] ubifs_budget_space+0x705/0xdd0 [ubifs]
[ 92.390503] do_rename.cold+0x7f/0x187 [ubifs]
[ 92.390549] ubifs_rename+0x8b/0x180 [ubifs]
[ 92.390571] vfs_rename+0xdb2/0x1170
[ 92.390580] do_renameat2+0x554/0x770
, are caused by concurrent rename whiteout and inode writeback processes:
rename_whiteout(Thread 1) wb_workfn(Thread2)
ubifs_rename
do_rename
lock_4_inodes (Hold ui_mutex)
ubifs_budget_space
make_free_space
shrink_liability
__writeback_inodes_sb_nr
bdi_split_work_to_wbs (Queue new wb work)
wb_do_writeback(wb work)
__writeback_single_inode
ubifs_write_inode
LOCK(ui_mutex)
↑
wb_wait_for_completion (Wait wb work) <-- deadlock!
Reproducer (Detail program in [Link]):
1. SYS_renameat2("/mp/dir/file", "/mp/dir/whiteout", RENAME_WHITEOUT)
2. Consume out of space before kernel(mdelay) doing budget for whiteout
Fix it by doing whiteout space budget before locking ubifs inodes.
BTW, it also fixes wrong goto tag 'out_release' in whiteout budget
error handling path(It should at least recover dir i_size and unlock
4 ubifs inodes). |
| In the Linux kernel, the following vulnerability has been resolved:
NFSv4: Fix a deadlock when recovering state on a sillyrenamed file
If the file is sillyrenamed, and slated for delete on close, it is
possible for a server reboot to triggeer an open reclaim, with can again
race with the application call to close(). When that happens, the call
to put_nfs_open_context() can trigger a synchronous delegreturn call
which deadlocks because it is not marked as privileged.
Instead, ensure that the call to nfs4_inode_return_delegation_on_close()
catches the delegreturn, and schedules it asynchronously. |
| In the Linux kernel, the following vulnerability has been resolved:
net: enetc: avoid deadlock in enetc_tx_onestep_tstamp()
This lockdep splat says it better than I could:
================================
WARNING: inconsistent lock state
6.2.0-rc2-07010-ga9b9500ffaac-dirty #967 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
kworker/1:3/179 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff3ec4036ce098 (_xmit_ETHER#2){+.?.}-{3:3}, at: netif_freeze_queues+0x5c/0xc0
{IN-SOFTIRQ-W} state was registered at:
_raw_spin_lock+0x5c/0xc0
sch_direct_xmit+0x148/0x37c
__dev_queue_xmit+0x528/0x111c
ip6_finish_output2+0x5ec/0xb7c
ip6_finish_output+0x240/0x3f0
ip6_output+0x78/0x360
ndisc_send_skb+0x33c/0x85c
ndisc_send_rs+0x54/0x12c
addrconf_rs_timer+0x154/0x260
call_timer_fn+0xb8/0x3a0
__run_timers.part.0+0x214/0x26c
run_timer_softirq+0x3c/0x74
__do_softirq+0x14c/0x5d8
____do_softirq+0x10/0x20
call_on_irq_stack+0x2c/0x5c
do_softirq_own_stack+0x1c/0x30
__irq_exit_rcu+0x168/0x1a0
irq_exit_rcu+0x10/0x40
el1_interrupt+0x38/0x64
irq event stamp: 7825
hardirqs last enabled at (7825): [<ffffdf1f7200cae4>] exit_to_kernel_mode+0x34/0x130
hardirqs last disabled at (7823): [<ffffdf1f708105f0>] __do_softirq+0x550/0x5d8
softirqs last enabled at (7824): [<ffffdf1f7081050c>] __do_softirq+0x46c/0x5d8
softirqs last disabled at (7811): [<ffffdf1f708166e0>] ____do_softirq+0x10/0x20
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(_xmit_ETHER#2);
<Interrupt>
lock(_xmit_ETHER#2);
*** DEADLOCK ***
3 locks held by kworker/1:3/179:
#0: ffff3ec400004748 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c0
#1: ffff80000a0bbdc8 ((work_completion)(&priv->tx_onestep_tstamp)){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c0
#2: ffff3ec4036cd438 (&dev->tx_global_lock){+.+.}-{3:3}, at: netif_tx_lock+0x1c/0x34
Workqueue: events enetc_tx_onestep_tstamp
Call trace:
print_usage_bug.part.0+0x208/0x22c
mark_lock+0x7f0/0x8b0
__lock_acquire+0x7c4/0x1ce0
lock_acquire.part.0+0xe0/0x220
lock_acquire+0x68/0x84
_raw_spin_lock+0x5c/0xc0
netif_freeze_queues+0x5c/0xc0
netif_tx_lock+0x24/0x34
enetc_tx_onestep_tstamp+0x20/0x100
process_one_work+0x28c/0x6c0
worker_thread+0x74/0x450
kthread+0x118/0x11c
but I'll say it anyway: the enetc_tx_onestep_tstamp() work item runs in
process context, therefore with softirqs enabled (i.o.w., it can be
interrupted by a softirq). If we hold the netif_tx_lock() when there is
an interrupt, and the NET_TX softirq then gets scheduled, this will take
the netif_tx_lock() a second time and deadlock the kernel.
To solve this, use netif_tx_lock_bh(), which blocks softirqs from
running. |
| In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
syzbot reports a possible deadlock in rfcomm_sk_state_change [1].
While rfcomm_sock_connect acquires the sk lock and waits for
the rfcomm lock, rfcomm_sock_release could have the rfcomm
lock and hit a deadlock for acquiring the sk lock.
Here's a simplified flow:
rfcomm_sock_connect:
lock_sock(sk)
rfcomm_dlc_open:
rfcomm_lock()
rfcomm_sock_release:
rfcomm_sock_shutdown:
rfcomm_lock()
__rfcomm_dlc_close:
rfcomm_k_state_change:
lock_sock(sk)
This patch drops the sk lock before calling rfcomm_dlc_open to
avoid the possible deadlock and holds sk's reference count to
prevent use-after-free after rfcomm_dlc_open completes. |
| In the Linux kernel, the following vulnerability has been resolved:
VMCI: Use threaded irqs instead of tasklets
The vmci_dispatch_dgs() tasklet function calls vmci_read_data()
which uses wait_event() resulting in invalid sleep in an atomic
context (and therefore potentially in a deadlock).
Use threaded irqs to fix this issue and completely remove usage
of tasklets.
[ 20.264639] BUG: sleeping function called from invalid context at drivers/misc/vmw_vmci/vmci_guest.c:145
[ 20.264643] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 762, name: vmtoolsd
[ 20.264645] preempt_count: 101, expected: 0
[ 20.264646] RCU nest depth: 0, expected: 0
[ 20.264647] 1 lock held by vmtoolsd/762:
[ 20.264648] #0: ffff0000874ae440 (sk_lock-AF_VSOCK){+.+.}-{0:0}, at: vsock_connect+0x60/0x330 [vsock]
[ 20.264658] Preemption disabled at:
[ 20.264659] [<ffff80000151d7d8>] vmci_send_datagram+0x44/0xa0 [vmw_vmci]
[ 20.264665] CPU: 0 PID: 762 Comm: vmtoolsd Not tainted 5.19.0-0.rc8.20220727git39c3c396f813.60.fc37.aarch64 #1
[ 20.264667] Hardware name: VMware, Inc. VBSA/VBSA, BIOS VEFI 12/31/2020
[ 20.264668] Call trace:
[ 20.264669] dump_backtrace+0xc4/0x130
[ 20.264672] show_stack+0x24/0x80
[ 20.264673] dump_stack_lvl+0x88/0xb4
[ 20.264676] dump_stack+0x18/0x34
[ 20.264677] __might_resched+0x1a0/0x280
[ 20.264679] __might_sleep+0x58/0x90
[ 20.264681] vmci_read_data+0x74/0x120 [vmw_vmci]
[ 20.264683] vmci_dispatch_dgs+0x64/0x204 [vmw_vmci]
[ 20.264686] tasklet_action_common.constprop.0+0x13c/0x150
[ 20.264688] tasklet_action+0x40/0x50
[ 20.264689] __do_softirq+0x23c/0x6b4
[ 20.264690] __irq_exit_rcu+0x104/0x214
[ 20.264691] irq_exit_rcu+0x1c/0x50
[ 20.264693] el1_interrupt+0x38/0x6c
[ 20.264695] el1h_64_irq_handler+0x18/0x24
[ 20.264696] el1h_64_irq+0x68/0x6c
[ 20.264697] preempt_count_sub+0xa4/0xe0
[ 20.264698] _raw_spin_unlock_irqrestore+0x64/0xb0
[ 20.264701] vmci_send_datagram+0x7c/0xa0 [vmw_vmci]
[ 20.264703] vmci_datagram_dispatch+0x84/0x100 [vmw_vmci]
[ 20.264706] vmci_datagram_send+0x2c/0x40 [vmw_vmci]
[ 20.264709] vmci_transport_send_control_pkt+0xb8/0x120 [vmw_vsock_vmci_transport]
[ 20.264711] vmci_transport_connect+0x40/0x7c [vmw_vsock_vmci_transport]
[ 20.264713] vsock_connect+0x278/0x330 [vsock]
[ 20.264715] __sys_connect_file+0x8c/0xc0
[ 20.264718] __sys_connect+0x84/0xb4
[ 20.264720] __arm64_sys_connect+0x2c/0x3c
[ 20.264721] invoke_syscall+0x78/0x100
[ 20.264723] el0_svc_common.constprop.0+0x68/0x124
[ 20.264724] do_el0_svc+0x38/0x4c
[ 20.264725] el0_svc+0x60/0x180
[ 20.264726] el0t_64_sync_handler+0x11c/0x150
[ 20.264728] el0t_64_sync+0x190/0x194 |
| In the Linux kernel, the following vulnerability has been resolved:
f2fs: initialize locks earlier in f2fs_fill_super()
syzbot is reporting lockdep warning at f2fs_handle_error() [1], for
spin_lock(&sbi->error_lock) is called before spin_lock_init() is called.
For safe locking in error handling, move initialization of locks (and
obvious structures) in f2fs_fill_super() to immediately after memory
allocation. |
| In the Linux kernel, the following vulnerability has been resolved:
ALSA: timer: Don't take register_mutex with copy_from/to_user()
The infamous mmap_lock taken in copy_from/to_user() can be often
problematic when it's called inside another mutex, as they might lead
to deadlocks.
In the case of ALSA timer code, the bad pattern is with
guard(mutex)(®ister_mutex) that covers copy_from/to_user() -- which
was mistakenly introduced at converting to guard(), and it had been
carefully worked around in the past.
This patch fixes those pieces simply by moving copy_from/to_user() out
of the register mutex lock again. |
| In the Linux kernel, the following vulnerability has been resolved:
Revert "arm64: dts: qcom: sdm845: Affirm IDR0.CCTW on apps_smmu"
There are reports that the pagetable walker cache coherency is not a
given across the spectrum of SDM845/850 devices, leading to lock-ups
and resets. It works fine on some devices (like the Dragonboard 845c,
but not so much on the Lenovo Yoga C630).
This unfortunately looks like a fluke in firmware development, where
likely somewhere in the vast hypervisor stack, a change to accommodate
for this was only introduced after the initial software release (which
often serves as a baseline for products).
Revert the change to avoid additional guesswork around crashes.
This reverts commit 6b31a9744b8726c69bb0af290f8475a368a4b805. |
| In the Linux kernel, the following vulnerability has been resolved:
nilfs2: fix deadlock in nilfs_count_free_blocks()
A semaphore deadlock can occur if nilfs_get_block() detects metadata
corruption while locating data blocks and a superblock writeback occurs at
the same time:
task 1 task 2
------ ------
* A file operation *
nilfs_truncate()
nilfs_get_block()
down_read(rwsem A) <--
nilfs_bmap_lookup_contig()
... generic_shutdown_super()
nilfs_put_super()
* Prepare to write superblock *
down_write(rwsem B) <--
nilfs_cleanup_super()
* Detect b-tree corruption * nilfs_set_log_cursor()
nilfs_bmap_convert_error() nilfs_count_free_blocks()
__nilfs_error() down_read(rwsem A) <--
nilfs_set_error()
down_write(rwsem B) <--
*** DEADLOCK ***
Here, nilfs_get_block() readlocks rwsem A (= NILFS_MDT(dat_inode)->mi_sem)
and then calls nilfs_bmap_lookup_contig(), but if it fails due to metadata
corruption, __nilfs_error() is called from nilfs_bmap_convert_error()
inside the lock section.
Since __nilfs_error() calls nilfs_set_error() unless the filesystem is
read-only and nilfs_set_error() attempts to writelock rwsem B (=
nilfs->ns_sem) to write back superblock exclusively, hierarchical lock
acquisition occurs in the order rwsem A -> rwsem B.
Now, if another task starts updating the superblock, it may writelock
rwsem B during the lock sequence above, and can deadlock trying to
readlock rwsem A in nilfs_count_free_blocks().
However, there is actually no need to take rwsem A in
nilfs_count_free_blocks() because it, within the lock section, only reads
a single integer data on a shared struct with
nilfs_sufile_get_ncleansegs(). This has been the case after commit
aa474a220180 ("nilfs2: add local variable to cache the number of clean
segments"), that is, even before this bug was introduced.
So, this resolves the deadlock problem by just not taking the semaphore in
nilfs_count_free_blocks(). |
| In the Linux kernel, the following vulnerability has been resolved:
net: ks8851: Queue RX packets in IRQ handler instead of disabling BHs
Currently the driver uses local_bh_disable()/local_bh_enable() in its
IRQ handler to avoid triggering net_rx_action() softirq on exit from
netif_rx(). The net_rx_action() could trigger this driver .start_xmit
callback, which is protected by the same lock as the IRQ handler, so
calling the .start_xmit from netif_rx() from the IRQ handler critical
section protected by the lock could lead to an attempt to claim the
already claimed lock, and a hang.
The local_bh_disable()/local_bh_enable() approach works only in case
the IRQ handler is protected by a spinlock, but does not work if the
IRQ handler is protected by mutex, i.e. this works for KS8851 with
Parallel bus interface, but not for KS8851 with SPI bus interface.
Remove the BH manipulation and instead of calling netif_rx() inside
the IRQ handler code protected by the lock, queue all the received
SKBs in the IRQ handler into a queue first, and once the IRQ handler
exits the critical section protected by the lock, dequeue all the
queued SKBs and push them all into netif_rx(). At this point, it is
safe to trigger the net_rx_action() softirq, since the netif_rx()
call is outside of the lock that protects the IRQ handler. |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: make cow_file_range_inline() honor locked_page on error
The btrfs buffered write path runs through __extent_writepage() which
has some tricky return value handling for writepage_delalloc().
Specifically, when that returns 1, we exit, but for other return values
we continue and end up calling btrfs_folio_end_all_writers(). If the
folio has been unlocked (note that we check the PageLocked bit at the
start of __extent_writepage()), this results in an assert panic like
this one from syzbot:
BTRFS: error (device loop0 state EAL) in free_log_tree:3267: errno=-5 IO failure
BTRFS warning (device loop0 state EAL): Skipping commit of aborted transaction.
BTRFS: error (device loop0 state EAL) in cleanup_transaction:2018: errno=-5 IO failure
assertion failed: folio_test_locked(folio), in fs/btrfs/subpage.c:871
------------[ cut here ]------------
kernel BUG at fs/btrfs/subpage.c:871!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 1 PID: 5090 Comm: syz-executor225 Not tainted
6.10.0-syzkaller-05505-gb1bc554e009e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 06/27/2024
RIP: 0010:btrfs_folio_end_all_writers+0x55b/0x610 fs/btrfs/subpage.c:871
Code: e9 d3 fb ff ff e8 25 22 c2 fd 48 c7 c7 c0 3c 0e 8c 48 c7 c6 80 3d
0e 8c 48 c7 c2 60 3c 0e 8c b9 67 03 00 00 e8 66 47 ad 07 90 <0f> 0b e8
6e 45 b0 07 4c 89 ff be 08 00 00 00 e8 21 12 25 fe 4c 89
RSP: 0018:ffffc900033d72e0 EFLAGS: 00010246
RAX: 0000000000000045 RBX: 00fff0000000402c RCX: 663b7a08c50a0a00
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffffc900033d73b0 R08: ffffffff8176b98c R09: 1ffff9200067adfc
R10: dffffc0000000000 R11: fffff5200067adfd R12: 0000000000000001
R13: dffffc0000000000 R14: 0000000000000000 R15: ffffea0001cbee80
FS: 0000000000000000(0000) GS:ffff8880b9500000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f5f076012f8 CR3: 000000000e134000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__extent_writepage fs/btrfs/extent_io.c:1597 [inline]
extent_write_cache_pages fs/btrfs/extent_io.c:2251 [inline]
btrfs_writepages+0x14d7/0x2760 fs/btrfs/extent_io.c:2373
do_writepages+0x359/0x870 mm/page-writeback.c:2656
filemap_fdatawrite_wbc+0x125/0x180 mm/filemap.c:397
__filemap_fdatawrite_range mm/filemap.c:430 [inline]
__filemap_fdatawrite mm/filemap.c:436 [inline]
filemap_flush+0xdf/0x130 mm/filemap.c:463
btrfs_release_file+0x117/0x130 fs/btrfs/file.c:1547
__fput+0x24a/0x8a0 fs/file_table.c:422
task_work_run+0x24f/0x310 kernel/task_work.c:222
exit_task_work include/linux/task_work.h:40 [inline]
do_exit+0xa2f/0x27f0 kernel/exit.c:877
do_group_exit+0x207/0x2c0 kernel/exit.c:1026
__do_sys_exit_group kernel/exit.c:1037 [inline]
__se_sys_exit_group kernel/exit.c:1035 [inline]
__x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1035
x64_sys_call+0x2634/0x2640
arch/x86/include/generated/asm/syscalls_64.h:232
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5f075b70c9
Code: Unable to access opcode bytes at
0x7f5f075b709f.
I was hitting the same issue by doing hundreds of accelerated runs of
generic/475, which also hits IO errors by design.
I instrumented that reproducer with bpftrace and found that the
undesirable folio_unlock was coming from the following callstack:
folio_unlock+5
__process_pages_contig+475
cow_file_range_inline.constprop.0+230
cow_file_range+803
btrfs_run_delalloc_range+566
writepage_delalloc+332
__extent_writepage # inlined in my stacktrace, but I added it here
extent_write_cache_pages+622
Looking at the bisected-to pa
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: Add a lock when accessing the buddy trim function
When running YouTube videos and Steam games simultaneously,
the tester found a system hang / race condition issue with
the multi-display configuration setting. Adding a lock to
the buddy allocator's trim function would be the solution.
<log snip>
[ 7197.250436] general protection fault, probably for non-canonical address 0xdead000000000108
[ 7197.250447] RIP: 0010:__alloc_range+0x8b/0x340 [amddrm_buddy]
[ 7197.250470] Call Trace:
[ 7197.250472] <TASK>
[ 7197.250475] ? show_regs+0x6d/0x80
[ 7197.250481] ? die_addr+0x37/0xa0
[ 7197.250483] ? exc_general_protection+0x1db/0x480
[ 7197.250488] ? drm_suballoc_new+0x13c/0x93d [drm_suballoc_helper]
[ 7197.250493] ? asm_exc_general_protection+0x27/0x30
[ 7197.250498] ? __alloc_range+0x8b/0x340 [amddrm_buddy]
[ 7197.250501] ? __alloc_range+0x109/0x340 [amddrm_buddy]
[ 7197.250506] amddrm_buddy_block_trim+0x1b5/0x260 [amddrm_buddy]
[ 7197.250511] amdgpu_vram_mgr_new+0x4f5/0x590 [amdgpu]
[ 7197.250682] amdttm_resource_alloc+0x46/0xb0 [amdttm]
[ 7197.250689] ttm_bo_alloc_resource+0xe4/0x370 [amdttm]
[ 7197.250696] amdttm_bo_validate+0x9d/0x180 [amdttm]
[ 7197.250701] amdgpu_bo_pin+0x15a/0x2f0 [amdgpu]
[ 7197.250831] amdgpu_dm_plane_helper_prepare_fb+0xb2/0x360 [amdgpu]
[ 7197.251025] ? try_wait_for_completion+0x59/0x70
[ 7197.251030] drm_atomic_helper_prepare_planes.part.0+0x2f/0x1e0
[ 7197.251035] drm_atomic_helper_prepare_planes+0x5d/0x70
[ 7197.251037] drm_atomic_helper_commit+0x84/0x160
[ 7197.251040] drm_atomic_nonblocking_commit+0x59/0x70
[ 7197.251043] drm_mode_atomic_ioctl+0x720/0x850
[ 7197.251047] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
[ 7197.251049] drm_ioctl_kernel+0xb9/0x120
[ 7197.251053] ? srso_alias_return_thunk+0x5/0xfbef5
[ 7197.251056] drm_ioctl+0x2d4/0x550
[ 7197.251058] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
[ 7197.251063] amdgpu_drm_ioctl+0x4e/0x90 [amdgpu]
[ 7197.251186] __x64_sys_ioctl+0xa0/0xf0
[ 7197.251190] x64_sys_call+0x143b/0x25c0
[ 7197.251193] do_syscall_64+0x7f/0x180
[ 7197.251197] ? srso_alias_return_thunk+0x5/0xfbef5
[ 7197.251199] ? amdgpu_display_user_framebuffer_create+0x215/0x320 [amdgpu]
[ 7197.251329] ? drm_internal_framebuffer_create+0xb7/0x1a0
[ 7197.251332] ? srso_alias_return_thunk+0x5/0xfbef5
(cherry picked from commit 3318ba94e56b9183d0304577c74b33b6b01ce516) |
| In the Linux kernel, the following vulnerability has been resolved:
nvme: fix reconnection fail due to reserved tag allocation
We found a issue on production environment while using NVMe over RDMA,
admin_q reconnect failed forever while remote target and network is ok.
After dig into it, we found it may caused by a ABBA deadlock due to tag
allocation. In my case, the tag was hold by a keep alive request
waiting inside admin_q, as we quiesced admin_q while reset ctrl, so the
request maked as idle and will not process before reset success. As
fabric_q shares tagset with admin_q, while reconnect remote target, we
need a tag for connect command, but the only one reserved tag was held
by keep alive command which waiting inside admin_q. As a result, we
failed to reconnect admin_q forever. In order to fix this issue, I
think we should keep two reserved tags for admin queue. |
| In the Linux kernel, the following vulnerability has been resolved:
lib: alloc_tag_module_unload must wait for pending kfree_rcu calls
Ben Greear reports following splat:
------------[ cut here ]------------
net/netfilter/nf_nat_core.c:1114 module nf_nat func:nf_nat_register_fn has 256 allocated at module unload
WARNING: CPU: 1 PID: 10421 at lib/alloc_tag.c:168 alloc_tag_module_unload+0x22b/0x3f0
Modules linked in: nf_nat(-) btrfs ufs qnx4 hfsplus hfs minix vfat msdos fat
...
Hardware name: Default string Default string/SKYBAY, BIOS 5.12 08/04/2020
RIP: 0010:alloc_tag_module_unload+0x22b/0x3f0
codetag_unload_module+0x19b/0x2a0
? codetag_load_module+0x80/0x80
nf_nat module exit calls kfree_rcu on those addresses, but the free
operation is likely still pending by the time alloc_tag checks for leaks.
Wait for outstanding kfree_rcu operations to complete before checking
resolves this warning.
Reproducer:
unshare -n iptables-nft -t nat -A PREROUTING -p tcp
grep nf_nat /proc/allocinfo # will list 4 allocations
rmmod nft_chain_nat
rmmod nf_nat # will WARN.
[akpm@linux-foundation.org: add comment] |