| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: fix recursive lock when verdict program return SK_PASS
When the stream_verdict program returns SK_PASS, it places the received skb
into its own receive queue, but a recursive lock eventually occurs, leading
to an operating system deadlock. This issue has been present since v6.9.
'''
sk_psock_strp_data_ready
write_lock_bh(&sk->sk_callback_lock)
strp_data_ready
strp_read_sock
read_sock -> tcp_read_sock
strp_recv
cb.rcv_msg -> sk_psock_strp_read
# now stream_verdict return SK_PASS without peer sock assign
__SK_PASS = sk_psock_map_verd(SK_PASS, NULL)
sk_psock_verdict_apply
sk_psock_skb_ingress_self
sk_psock_skb_ingress_enqueue
sk_psock_data_ready
read_lock_bh(&sk->sk_callback_lock) <= dead lock
'''
This topic has been discussed before, but it has not been fixed.
Previous discussion:
https://lore.kernel.org/all/6684a5864ec86_403d20898@john.notmuch |
| In the Linux kernel, the following vulnerability has been resolved:
usb: musb: Fix hardware lockup on first Rx endpoint request
There is a possibility that a request's callback could be invoked from
usb_ep_queue() (call trace below, supplemented with missing calls):
req->complete from usb_gadget_giveback_request
(drivers/usb/gadget/udc/core.c:999)
usb_gadget_giveback_request from musb_g_giveback
(drivers/usb/musb/musb_gadget.c:147)
musb_g_giveback from rxstate
(drivers/usb/musb/musb_gadget.c:784)
rxstate from musb_ep_restart
(drivers/usb/musb/musb_gadget.c:1169)
musb_ep_restart from musb_ep_restart_resume_work
(drivers/usb/musb/musb_gadget.c:1176)
musb_ep_restart_resume_work from musb_queue_resume_work
(drivers/usb/musb/musb_core.c:2279)
musb_queue_resume_work from musb_gadget_queue
(drivers/usb/musb/musb_gadget.c:1241)
musb_gadget_queue from usb_ep_queue
(drivers/usb/gadget/udc/core.c:300)
According to the docstring of usb_ep_queue(), this should not happen:
"Note that @req's ->complete() callback must never be called from within
usb_ep_queue() as that can create deadlock situations."
In fact, a hardware lockup might occur in the following sequence:
1. The gadget is initialized using musb_gadget_enable().
2. Meanwhile, a packet arrives, and the RXPKTRDY flag is set, raising an
interrupt.
3. If IRQs are enabled, the interrupt is handled, but musb_g_rx() finds an
empty queue (next_request() returns NULL). The interrupt flag has
already been cleared by the glue layer handler, but the RXPKTRDY flag
remains set.
4. The first request is enqueued using usb_ep_queue(), leading to the call
of req->complete(), as shown in the call trace above.
5. If the callback enables IRQs and another packet is waiting, step (3)
repeats. The request queue is empty because usb_g_giveback() removes the
request before invoking the callback.
6. The endpoint remains locked up, as the interrupt triggered by hardware
setting the RXPKTRDY flag has been handled, but the flag itself remains
set.
For this scenario to occur, it is only necessary for IRQs to be enabled at
some point during the complete callback. This happens with the USB Ethernet
gadget, whose rx_complete() callback calls netif_rx(). If called in the
task context, netif_rx() disables the bottom halves (BHs). When the BHs are
re-enabled, IRQs are also enabled to allow soft IRQs to be processed. The
gadget itself is initialized at module load (or at boot if built-in), but
the first request is enqueued when the network interface is brought up,
triggering rx_complete() in the task context via ioctl(). If a packet
arrives while the interface is down, it can prevent the interface from
receiving any further packets from the USB host.
The situation is quite complicated with many parties involved. This
particular issue can be resolved in several possible ways:
1. Ensure that callbacks never enable IRQs. This would be difficult to
enforce, as discovering how netif_rx() interacts with interrupts was
already quite challenging and u_ether is not the only function driver.
Similar "bugs" could be hidden in other drivers as well.
2. Disable MUSB interrupts in musb_g_giveback() before calling the callback
and re-enable them afterwars (by calling musb_{dis,en}able_interrupts(),
for example). This would ensure that MUSB interrupts are not handled
during the callback, even if IRQs are enabled. In fact, it would allow
IRQs to be enabled when releasing the lock. However, this feels like an
inelegant hack.
3. Modify the interrupt handler to clear the RXPKTRDY flag if the request
queue is empty. While this approach also feels like a hack, it wastes
CPU time by attempting to handle incoming packets when the software is
not ready to process them.
4. Flush the Rx FIFO instead of calling rxstate() in musb_ep_restart().
This ensures that the hardware can receive packets when there is at
least one request in the queue. Once I
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
LoongArch: Fix sleeping in atomic context for PREEMPT_RT
Commit bab1c299f3945ffe79 ("LoongArch: Fix sleeping in atomic context in
setup_tlb_handler()") changes the gfp flag from GFP_KERNEL to GFP_ATOMIC
for alloc_pages_node(). However, for PREEMPT_RT kernels we can still get
a "sleeping in atomic context" error:
[ 0.372259] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[ 0.372266] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
[ 0.372268] preempt_count: 1, expected: 0
[ 0.372270] RCU nest depth: 1, expected: 1
[ 0.372272] 3 locks held by swapper/1/0:
[ 0.372274] #0: 900000000c9f5e60 (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x524/0x1c60
[ 0.372294] #1: 90000000087013b8 (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x50/0x140
[ 0.372305] #2: 900000047fffd388 (&zone->lock){+.+.}-{3:3}, at: __rmqueue_pcplist+0x30c/0xea0
[ 0.372314] irq event stamp: 0
[ 0.372316] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 0.372322] hardirqs last disabled at (0): [<9000000005947320>] copy_process+0x9c0/0x26e0
[ 0.372329] softirqs last enabled at (0): [<9000000005947320>] copy_process+0x9c0/0x26e0
[ 0.372335] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 0.372341] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.12.0-rc7+ #1891
[ 0.372346] Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.0-prebeta9 10/21/2022
[ 0.372349] Stack : 0000000000000089 9000000005a0db9c 90000000071519c8 9000000100388000
[ 0.372486] 900000010038b890 0000000000000000 900000010038b898 9000000007e53788
[ 0.372492] 900000000815bcc8 900000000815bcc0 900000010038b700 0000000000000001
[ 0.372498] 0000000000000001 4b031894b9d6b725 00000000055ec000 9000000100338fc0
[ 0.372503] 00000000000000c4 0000000000000001 000000000000002d 0000000000000003
[ 0.372509] 0000000000000030 0000000000000003 00000000055ec000 0000000000000003
[ 0.372515] 900000000806d000 9000000007e53788 00000000000000b0 0000000000000004
[ 0.372521] 0000000000000000 0000000000000000 900000000c9f5f10 0000000000000000
[ 0.372526] 90000000076f12d8 9000000007e53788 9000000005924778 0000000000000000
[ 0.372532] 00000000000000b0 0000000000000004 0000000000000000 0000000000070000
[ 0.372537] ...
[ 0.372540] Call Trace:
[ 0.372542] [<9000000005924778>] show_stack+0x38/0x180
[ 0.372548] [<90000000071519c4>] dump_stack_lvl+0x94/0xe4
[ 0.372555] [<900000000599b880>] __might_resched+0x1a0/0x260
[ 0.372561] [<90000000071675cc>] rt_spin_lock+0x4c/0x140
[ 0.372565] [<9000000005cbb768>] __rmqueue_pcplist+0x308/0xea0
[ 0.372570] [<9000000005cbed84>] get_page_from_freelist+0x564/0x1c60
[ 0.372575] [<9000000005cc0d98>] __alloc_pages_noprof+0x218/0x1820
[ 0.372580] [<900000000593b36c>] tlb_init+0x1ac/0x298
[ 0.372585] [<9000000005924b74>] per_cpu_trap_init+0x114/0x140
[ 0.372589] [<9000000005921964>] cpu_probe+0x4e4/0xa60
[ 0.372592] [<9000000005934874>] start_secondary+0x34/0xc0
[ 0.372599] [<900000000715615c>] smpboot_entry+0x64/0x6c
This is because in PREEMPT_RT kernels normal spinlocks are replaced by
rt spinlocks and rt_spin_lock() will cause sleeping. Fix it by disabling
NUMA optimization completely for PREEMPT_RT kernels. |
| In the Linux kernel, the following vulnerability has been resolved:
ALSA: usx2y: Use snd_card_free_when_closed() at disconnection
The USB disconnect callback is supposed to be short and not too-long
waiting. OTOH, the current code uses snd_card_free() at
disconnection, but this waits for the close of all used fds, hence it
can take long. It eventually blocks the upper layer USB ioctls, which
may trigger a soft lockup.
An easy workaround is to replace snd_card_free() with
snd_card_free_when_closed(). This variant returns immediately while
the release of resources is done asynchronously by the card device
release at the last close. |
| In the Linux kernel, the following vulnerability has been resolved:
ALSA: us122l: Use snd_card_free_when_closed() at disconnection
The USB disconnect callback is supposed to be short and not too-long
waiting. OTOH, the current code uses snd_card_free() at
disconnection, but this waits for the close of all used fds, hence it
can take long. It eventually blocks the upper layer USB ioctls, which
may trigger a soft lockup.
An easy workaround is to replace snd_card_free() with
snd_card_free_when_closed(). This variant returns immediately while
the release of resources is done asynchronously by the card device
release at the last close.
The loop of us122l->mmap_count check is dropped as well. The check is
useless for the asynchronous operation with *_when_closed(). |
| In the Linux kernel, the following vulnerability has been resolved:
ALSA: caiaq: Use snd_card_free_when_closed() at disconnection
The USB disconnect callback is supposed to be short and not too-long
waiting. OTOH, the current code uses snd_card_free() at
disconnection, but this waits for the close of all used fds, hence it
can take long. It eventually blocks the upper layer USB ioctls, which
may trigger a soft lockup.
An easy workaround is to replace snd_card_free() with
snd_card_free_when_closed(). This variant returns immediately while
the release of resources is done asynchronously by the card device
release at the last close.
This patch also splits the code to the disconnect and the free phases;
the former is called immediately at the USB disconnect callback while
the latter is called from the card destructor. |
| In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: MGMT: Fix possible deadlocks
This fixes possible deadlocks like the following caused by
hci_cmd_sync_dequeue causing the destroy function to run:
INFO: task kworker/u19:0:143 blocked for more than 120 seconds.
Tainted: G W O 6.8.0-2024-03-19-intel-next-iLS-24ww14 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u19:0 state:D stack:0 pid:143 tgid:143 ppid:2 flags:0x00004000
Workqueue: hci0 hci_cmd_sync_work [bluetooth]
Call Trace:
<TASK>
__schedule+0x374/0xaf0
schedule+0x3c/0xf0
schedule_preempt_disabled+0x1c/0x30
__mutex_lock.constprop.0+0x3ef/0x7a0
__mutex_lock_slowpath+0x13/0x20
mutex_lock+0x3c/0x50
mgmt_set_connectable_complete+0xa4/0x150 [bluetooth]
? kfree+0x211/0x2a0
hci_cmd_sync_dequeue+0xae/0x130 [bluetooth]
? __pfx_cmd_complete_rsp+0x10/0x10 [bluetooth]
cmd_complete_rsp+0x26/0x80 [bluetooth]
mgmt_pending_foreach+0x4d/0x70 [bluetooth]
__mgmt_power_off+0x8d/0x180 [bluetooth]
? _raw_spin_unlock_irq+0x23/0x40
hci_dev_close_sync+0x445/0x5b0 [bluetooth]
hci_set_powered_sync+0x149/0x250 [bluetooth]
set_powered_sync+0x24/0x60 [bluetooth]
hci_cmd_sync_work+0x90/0x150 [bluetooth]
process_one_work+0x13e/0x300
worker_thread+0x2f7/0x420
? __pfx_worker_thread+0x10/0x10
kthread+0x107/0x140
? __pfx_kthread+0x10/0x10
ret_from_fork+0x3d/0x60
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK> |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: rtw89: avoid to add interface to list twice when SER
If SER L2 occurs during the WoWLAN resume flow, the add interface flow
is triggered by ieee80211_reconfig(). However, due to
rtw89_wow_resume() return failure, it will cause the add interface flow
to be executed again, resulting in a double add list and causing a kernel
panic. Therefore, we have added a check to prevent double adding of the
list.
list_add double add: new=ffff99d6992e2010, prev=ffff99d6992e2010, next=ffff99d695302628.
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:37!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W O 6.6.30-02659-gc18865c4dfbd #1 770df2933251a0e3c888ba69d1053a817a6376a7
Hardware name: HP Grunt/Grunt, BIOS Google_Grunt.11031.169.0 06/24/2021
Workqueue: events_freezable ieee80211_restart_work [mac80211]
RIP: 0010:__list_add_valid_or_report+0x5e/0xb0
Code: c7 74 18 48 39 ce 74 13 b0 01 59 5a 5e 5f 41 58 41 59 41 5a 5d e9 e2 d6 03 00 cc 48 c7 c7 8d 4f 17 83 48 89 c2 e8 02 c0 00 00 <0f> 0b 48 c7 c7 aa 8c 1c 83 e8 f4 bf 00 00 0f 0b 48 c7 c7 c8 bc 12
RSP: 0018:ffffa91b8007bc50 EFLAGS: 00010246
RAX: 0000000000000058 RBX: ffff99d6992e0900 RCX: a014d76c70ef3900
RDX: ffffa91b8007bae8 RSI: 00000000ffffdfff RDI: 0000000000000001
RBP: ffffa91b8007bc88 R08: 0000000000000000 R09: ffffa91b8007bae0
R10: 00000000ffffdfff R11: ffffffff83a79800 R12: ffff99d695302060
R13: ffff99d695300900 R14: ffff99d6992e1be0 R15: ffff99d6992e2010
FS: 0000000000000000(0000) GS:ffff99d6aac00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000078fbdba43480 CR3: 000000010e464000 CR4: 00000000001506f0
Call Trace:
<TASK>
? __die_body+0x1f/0x70
? die+0x3d/0x60
? do_trap+0xa4/0x110
? __list_add_valid_or_report+0x5e/0xb0
? do_error_trap+0x6d/0x90
? __list_add_valid_or_report+0x5e/0xb0
? handle_invalid_op+0x30/0x40
? __list_add_valid_or_report+0x5e/0xb0
? exc_invalid_op+0x3c/0x50
? asm_exc_invalid_op+0x16/0x20
? __list_add_valid_or_report+0x5e/0xb0
rtw89_ops_add_interface+0x309/0x310 [rtw89_core 7c32b1ee6854761c0321027c8a58c5160e41f48f]
drv_add_interface+0x5c/0x130 [mac80211 83e989e6e616bd5b4b8a2b0a9f9352a2c385a3bc]
ieee80211_reconfig+0x241/0x13d0 [mac80211 83e989e6e616bd5b4b8a2b0a9f9352a2c385a3bc]
? finish_wait+0x3e/0x90
? synchronize_rcu_expedited+0x174/0x260
? sync_rcu_exp_done_unlocked+0x50/0x50
? wake_bit_function+0x40/0x40
ieee80211_restart_work+0xf0/0x140 [mac80211 83e989e6e616bd5b4b8a2b0a9f9352a2c385a3bc]
process_scheduled_works+0x1e5/0x480
worker_thread+0xea/0x1e0
kthread+0xdb/0x110
? move_linked_works+0x90/0x90
? kthread_associate_blkcg+0xa0/0xa0
ret_from_fork+0x3b/0x50
? kthread_associate_blkcg+0xa0/0xa0
ret_from_fork_asm+0x11/0x20
</TASK>
Modules linked in: dm_integrity async_xor xor async_tx lz4 lz4_compress zstd zstd_compress zram zsmalloc rfcomm cmac uinput algif_hash algif_skcipher af_alg btusb btrtl iio_trig_hrtimer industrialio_sw_trigger btmtk industrialio_configfs btbcm btintel uvcvideo videobuf2_vmalloc iio_trig_sysfs videobuf2_memops videobuf2_v4l2 videobuf2_common uvc snd_hda_codec_hdmi veth snd_hda_intel snd_intel_dspcfg acpi_als snd_hda_codec industrialio_triggered_buffer kfifo_buf snd_hwdep industrialio i2c_piix4 snd_hda_core designware_i2s ip6table_nat snd_soc_max98357a xt_MASQUERADE xt_cgroup snd_soc_acp_rt5682_mach fuse rtw89_8922ae(O) rtw89_8922a(O) rtw89_pci(O) rtw89_core(O) 8021q mac80211(O) bluetooth ecdh_generic ecc cfg80211 r8152 mii joydev
gsmi: Log Shutdown Reason 0x03
---[ end trace 0000000000000000 ]--- |
| In the Linux kernel, the following vulnerability has been resolved:
dma-debug: fix a possible deadlock on radix_lock
radix_lock() shouldn't be held while holding dma_hash_entry[idx].lock
otherwise, there's a possible deadlock scenario when
dma debug API is called holding rq_lock():
CPU0 CPU1 CPU2
dma_free_attrs()
check_unmap() add_dma_entry() __schedule() //out
(A) rq_lock()
get_hash_bucket()
(A) dma_entry_hash
check_sync()
(A) radix_lock() (W) dma_entry_hash
dma_entry_free()
(W) radix_lock()
// CPU2's one
(W) rq_lock()
CPU1 situation can happen when it extending radix tree and
it tries to wake up kswapd via wake_all_kswapd().
CPU2 situation can happen while perf_event_task_sched_out()
(i.e. dma sync operation is called while deleting perf_event using
etm and etr tmc which are Arm Coresight hwtracing driver backends).
To remove this possible situation, call dma_entry_free() after
put_hash_bucket() in check_unmap(). |
| In the Linux kernel, the following vulnerability has been resolved:
serial: sc16is7xx: fix invalid FIFO access with special register set
When enabling access to the special register set, Receiver time-out and
RHR interrupts can happen. In this case, the IRQ handler will try to read
from the FIFO thru the RHR register at address 0x00, but address 0x00 is
mapped to DLL register, resulting in erroneous FIFO reading.
Call graph example:
sc16is7xx_startup(): entry
sc16is7xx_ms_proc(): entry
sc16is7xx_set_termios(): entry
sc16is7xx_set_baud(): DLH/DLL = $009C --> access special register set
sc16is7xx_port_irq() entry --> IIR is 0x0C
sc16is7xx_handle_rx() entry
sc16is7xx_fifo_read(): --> unable to access FIFO (RHR) because it is
mapped to DLL (LCR=LCR_CONF_MODE_A)
sc16is7xx_set_baud(): exit --> Restore access to general register set
Fix the problem by claiming the efr_lock mutex when accessing the Special
register set. |
| In the Linux kernel, the following vulnerability has been resolved:
i3c: Use i3cdev->desc->info instead of calling i3c_device_get_info() to avoid deadlock
A deadlock may happen since the i3c_master_register() acquires
&i3cbus->lock twice. See the log below.
Use i3cdev->desc->info instead of calling i3c_device_info() to
avoid acquiring the lock twice.
v2:
- Modified the title and commit message
============================================
WARNING: possible recursive locking detected
6.11.0-mainline
--------------------------------------------
init/1 is trying to acquire lock:
f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_bus_normaluse_lock
but task is already holding lock:
f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_master_register
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&i3cbus->lock);
lock(&i3cbus->lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by init/1:
#0: fcffff809b6798f8 (&dev->mutex){....}-{3:3}, at: __driver_attach
#1: f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_master_register
stack backtrace:
CPU: 6 UID: 0 PID: 1 Comm: init
Call trace:
dump_backtrace+0xfc/0x17c
show_stack+0x18/0x28
dump_stack_lvl+0x40/0xc0
dump_stack+0x18/0x24
print_deadlock_bug+0x388/0x390
__lock_acquire+0x18bc/0x32ec
lock_acquire+0x134/0x2b0
down_read+0x50/0x19c
i3c_bus_normaluse_lock+0x14/0x24
i3c_device_get_info+0x24/0x58
i3c_device_uevent+0x34/0xa4
dev_uevent+0x310/0x384
kobject_uevent_env+0x244/0x414
kobject_uevent+0x14/0x20
device_add+0x278/0x460
device_register+0x20/0x34
i3c_master_register_new_i3c_devs+0x78/0x154
i3c_master_register+0x6a0/0x6d4
mtk_i3c_master_probe+0x3b8/0x4d8
platform_probe+0xa0/0xe0
really_probe+0x114/0x454
__driver_probe_device+0xa0/0x15c
driver_probe_device+0x3c/0x1ac
__driver_attach+0xc4/0x1f0
bus_for_each_dev+0x104/0x160
driver_attach+0x24/0x34
bus_add_driver+0x14c/0x294
driver_register+0x68/0x104
__platform_driver_register+0x20/0x30
init_module+0x20/0xfe4
do_one_initcall+0x184/0x464
do_init_module+0x58/0x1ec
load_module+0xefc/0x10c8
__arm64_sys_finit_module+0x238/0x33c
invoke_syscall+0x58/0x10c
el0_svc_common+0xa8/0xdc
do_el0_svc+0x1c/0x28
el0_svc+0x50/0xac
el0t_64_sync_handler+0x70/0xbc
el0t_64_sync+0x1a8/0x1ac |
| In the Linux kernel, the following vulnerability has been resolved:
exfat: fix potential deadlock on __exfat_get_dentry_set
When accessing a file with more entries than ES_MAX_ENTRY_NUM, the bh-array
is allocated in __exfat_get_entry_set. The problem is that the bh-array is
allocated with GFP_KERNEL. It does not make sense. In the following cases,
a deadlock for sbi->s_lock between the two processes may occur.
CPU0 CPU1
---- ----
kswapd
balance_pgdat
lock(fs_reclaim)
exfat_iterate
lock(&sbi->s_lock)
exfat_readdir
exfat_get_uniname_from_ext_entry
exfat_get_dentry_set
__exfat_get_dentry_set
kmalloc_array
...
lock(fs_reclaim)
...
evict
exfat_evict_inode
lock(&sbi->s_lock)
To fix this, let's allocate bh-array with GFP_NOFS. |
| In the Linux kernel, the following vulnerability has been resolved:
RDMA/hns: Fix deadlock on SRQ async events.
xa_lock for SRQ table may be required in AEQ. Use xa_store_irq()/
xa_erase_irq() to avoid deadlock. |
| Improper initialization in UEFI firmware OutOfBandXML module in some Intel(R) Processors may allow a privileged user to potentially enable information disclosure via local access. |
| In the Linux kernel, the following vulnerability has been resolved:
soc: qcom: pdr: Fix the potential deadlock
When some client process A call pdr_add_lookup() to add the look up for
the service and does schedule locator work, later a process B got a new
server packet indicating locator is up and call pdr_locator_new_server()
which eventually sets pdr->locator_init_complete to true which process A
sees and takes list lock and queries domain list but it will timeout due
to deadlock as the response will queued to the same qmi->wq and it is
ordered workqueue and process B is not able to complete new server
request work due to deadlock on list lock.
Fix it by removing the unnecessary list iteration as the list iteration
is already being done inside locator work, so avoid it here and just
call schedule_work() here.
Process A Process B
process_scheduled_works()
pdr_add_lookup() qmi_data_ready_work()
process_scheduled_works() pdr_locator_new_server()
pdr->locator_init_complete=true;
pdr_locator_work()
mutex_lock(&pdr->list_lock);
pdr_locate_service() mutex_lock(&pdr->list_lock);
pdr_get_domain_list()
pr_err("PDR: %s get domain list
txn wait failed: %d\n",
req->service_name,
ret);
Timeout error log due to deadlock:
"
PDR: tms/servreg get domain list txn wait failed: -110
PDR: service lookup for msm/adsp/sensor_pd:tms/servreg failed: -110
"
Thanks to Bjorn and Johan for letting me know that this commit also fixes
an audio regression when using the in-kernel pd-mapper as that makes it
easier to hit this race. [1] |
| In the Linux kernel, the following vulnerability has been resolved:
RDMA/hns: Fix soft lockup during bt pages loop
Driver runs a for-loop when allocating bt pages and mapping them with
buffer pages. When a large buffer (e.g. MR over 100GB) is being allocated,
it may require a considerable loop count. This will lead to soft lockup:
watchdog: BUG: soft lockup - CPU#27 stuck for 22s!
...
Call trace:
hem_list_alloc_mid_bt+0x124/0x394 [hns_roce_hw_v2]
hns_roce_hem_list_request+0xf8/0x160 [hns_roce_hw_v2]
hns_roce_mtr_create+0x2e4/0x360 [hns_roce_hw_v2]
alloc_mr_pbl+0xd4/0x17c [hns_roce_hw_v2]
hns_roce_reg_user_mr+0xf8/0x190 [hns_roce_hw_v2]
ib_uverbs_reg_mr+0x118/0x290
watchdog: BUG: soft lockup - CPU#35 stuck for 23s!
...
Call trace:
hns_roce_hem_list_find_mtt+0x7c/0xb0 [hns_roce_hw_v2]
mtr_map_bufs+0xc4/0x204 [hns_roce_hw_v2]
hns_roce_mtr_create+0x31c/0x3c4 [hns_roce_hw_v2]
alloc_mr_pbl+0xb0/0x160 [hns_roce_hw_v2]
hns_roce_reg_user_mr+0x108/0x1c0 [hns_roce_hw_v2]
ib_uverbs_reg_mr+0x120/0x2bc
Add a cond_resched() to fix soft lockup during these loops. In order not
to affect the allocation performance of normal-size buffer, set the loop
count of a 100GB MR as the threshold to call cond_resched(). |
| In the Linux kernel, the following vulnerability has been resolved:
net: switchdev: Convert blocking notification chain to a raw one
A blocking notification chain uses a read-write semaphore to protect the
integrity of the chain. The semaphore is acquired for writing when
adding / removing notifiers to / from the chain and acquired for reading
when traversing the chain and informing notifiers about an event.
In case of the blocking switchdev notification chain, recursive
notifications are possible which leads to the semaphore being acquired
twice for reading and to lockdep warnings being generated [1].
Specifically, this can happen when the bridge driver processes a
SWITCHDEV_BRPORT_UNOFFLOADED event which causes it to emit notifications
about deferred events when calling switchdev_deferred_process().
Fix this by converting the notification chain to a raw notification
chain in a similar fashion to the netdev notification chain. Protect
the chain using the RTNL mutex by acquiring it when modifying the chain.
Events are always informed under the RTNL mutex, but add an assertion in
call_switchdev_blocking_notifiers() to make sure this is not violated in
the future.
Maintain the "blocking" prefix as events are always emitted from process
context and listeners are allowed to block.
[1]:
WARNING: possible recursive locking detected
6.14.0-rc4-custom-g079270089484 #1 Not tainted
--------------------------------------------
ip/52731 is trying to acquire lock:
ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
but task is already holding lock:
ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock((switchdev_blocking_notif_chain).rwsem);
lock((switchdev_blocking_notif_chain).rwsem);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by ip/52731:
#0: ffffffff84f795b0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x727/0x1dc0
#1: ffffffff8731f628 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x790/0x1dc0
#2: ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
stack backtrace:
...
? __pfx_down_read+0x10/0x10
? __pfx_mark_lock+0x10/0x10
? __pfx_switchdev_port_attr_set_deferred+0x10/0x10
blocking_notifier_call_chain+0x58/0xa0
switchdev_port_attr_notify.constprop.0+0xb3/0x1b0
? __pfx_switchdev_port_attr_notify.constprop.0+0x10/0x10
? mark_held_locks+0x94/0xe0
? switchdev_deferred_process+0x11a/0x340
switchdev_port_attr_set_deferred+0x27/0xd0
switchdev_deferred_process+0x164/0x340
br_switchdev_port_unoffload+0xc8/0x100 [bridge]
br_switchdev_blocking_event+0x29f/0x580 [bridge]
notifier_call_chain+0xa2/0x440
blocking_notifier_call_chain+0x6e/0xa0
switchdev_bridge_port_unoffload+0xde/0x1a0
... |
| In the Linux kernel, the following vulnerability has been resolved:
bus: mhi: host: pci_generic: Use pci_try_reset_function() to avoid deadlock
There are multiple places from where the recovery work gets scheduled
asynchronously. Also, there are multiple places where the caller waits
synchronously for the recovery to be completed. One such place is during
the PM shutdown() callback.
If the device is not alive during recovery_work, it will try to reset the
device using pci_reset_function(). This function internally will take the
device_lock() first before resetting the device. By this time, if the lock
has already been acquired, then recovery_work will get stalled while
waiting for the lock. And if the lock was already acquired by the caller
which waits for the recovery_work to be completed, it will lead to
deadlock.
This is what happened on the X1E80100 CRD device when the device died
before shutdown() callback. Driver core calls the driver's shutdown()
callback while holding the device_lock() leading to deadlock.
And this deadlock scenario can occur on other paths as well, like during
the PM suspend() callback, where the driver core would hold the
device_lock() before calling driver's suspend() callback. And if the
recovery_work was already started, it could lead to deadlock. This is also
observed on the X1E80100 CRD.
So to fix both issues, use pci_try_reset_function() in recovery_work. This
function first checks for the availability of the device_lock() before
trying to reset the device. If the lock is available, it will acquire it
and reset the device. Otherwise, it will return -EAGAIN. If that happens,
recovery_work will fail with the error message "Recovery failed" as not
much could be done. |
| In the Linux kernel, the following vulnerability has been resolved:
ksmbd: fix bug on trap in smb2_lock
If lock count is greater than 1, flags could be old value.
It should be checked with flags of smb_lock, not flags.
It will cause bug-on trap from locks_free_lock in error handling
routine. |
| In the Linux kernel, the following vulnerability has been resolved:
hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio
Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to
be offlined) add page poison checks in do_migrate_range in order to make
offline hwpoisoned page possible by introducing isolate_lru_page and
try_to_unmap for hwpoisoned page. However folio lock must be held before
calling try_to_unmap. Add it to fix this problem.
Warning will be produced if folio is not locked during unmap:
------------[ cut here ]------------
kernel BUG at ./include/linux/swapops.h:400!
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
Modules linked in:
CPU: 4 UID: 0 PID: 411 Comm: bash Tainted: G W 6.13.0-rc1-00016-g3c434c7ee82a-dirty #41
Tainted: [W]=WARN
Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : try_to_unmap_one+0xb08/0xd3c
lr : try_to_unmap_one+0x3dc/0xd3c
Call trace:
try_to_unmap_one+0xb08/0xd3c (P)
try_to_unmap_one+0x3dc/0xd3c (L)
rmap_walk_anon+0xdc/0x1f8
rmap_walk+0x3c/0x58
try_to_unmap+0x88/0x90
unmap_poisoned_folio+0x30/0xa8
do_migrate_range+0x4a0/0x568
offline_pages+0x5a4/0x670
memory_block_action+0x17c/0x374
memory_subsys_offline+0x3c/0x78
device_offline+0xa4/0xd0
state_store+0x8c/0xf0
dev_attr_store+0x18/0x2c
sysfs_kf_write+0x44/0x54
kernfs_fop_write_iter+0x118/0x1a8
vfs_write+0x3a8/0x4bc
ksys_write+0x6c/0xf8
__arm64_sys_write+0x1c/0x28
invoke_syscall+0x44/0x100
el0_svc_common.constprop.0+0x40/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x30/0xd0
el0t_64_sync_handler+0xc8/0xcc
el0t_64_sync+0x198/0x19c
Code: f9407be0 b5fff320 d4210000 17ffff97 (d4210000)
---[ end trace 0000000000000000 ]--- |