CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
In the Linux kernel, the following vulnerability has been resolved:
pinctrl: fix deadlock in create_pinctrl() when handling -EPROBE_DEFER
In create_pinctrl(), pinctrl_maps_mutex is acquired before calling
add_setting(). If add_setting() returns -EPROBE_DEFER, create_pinctrl()
calls pinctrl_free(). However, pinctrl_free() attempts to acquire
pinctrl_maps_mutex, which is already held by create_pinctrl(), leading to
a potential deadlock.
This patch resolves the issue by releasing pinctrl_maps_mutex before
calling pinctrl_free(), preventing the deadlock.
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc. |
In the Linux kernel, the following vulnerability has been resolved:
io_uring: fix possible deadlock in io_register_iowq_max_workers()
The io_register_iowq_max_workers() function calls io_put_sq_data(),
which acquires the sqd->lock without releasing the uring_lock.
Similar to the commit 009ad9f0c6ee ("io_uring: drop ctx->uring_lock
before acquiring sqd->lock"), this can lead to a potential deadlock
situation.
To resolve this issue, the uring_lock is released before calling
io_put_sq_data(), and then it is re-acquired after the function call.
This change ensures that the locks are acquired in the correct
order, preventing the possibility of a deadlock. |
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: hci_core: cancel all works upon hci_unregister_dev()
syzbot is reporting that calling hci_release_dev() from hci_error_reset()
due to hci_dev_put() from hci_error_reset() can cause deadlock at
destroy_workqueue(), for hci_error_reset() is called from
hdev->req_workqueue which destroy_workqueue() needs to flush.
We need to make sure that hdev->{rx_work,cmd_work,tx_work} which are
queued into hdev->workqueue and hdev->{power_on,error_reset} which are
queued into hdev->req_workqueue are no longer running by the moment
destroy_workqueue(hdev->workqueue);
destroy_workqueue(hdev->req_workqueue);
are called from hci_release_dev().
Call cancel_work_sync() on these work items from hci_unregister_dev()
as soon as hdev->list is removed from hci_dev_list. |
In the Linux kernel, the following vulnerability has been resolved:
net/sched: act_api: fix possible infinite loop in tcf_idr_check_alloc()
syzbot found hanging tasks waiting on rtnl_lock [1]
A reproducer is available in the syzbot bug.
When a request to add multiple actions with the same index is sent, the
second request will block forever on the first request. This holds
rtnl_lock, and causes tasks to hang.
Return -EAGAIN to prevent infinite looping, while keeping documented
behavior.
[1]
INFO: task kworker/1:0:5088 blocked for more than 143 seconds.
Not tainted 6.9.0-rc4-syzkaller-00173-g3cdb45594619 #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/1:0 state:D stack:23744 pid:5088 tgid:5088 ppid:2 flags:0x00004000
Workqueue: events_power_efficient reg_check_chans_work
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5409 [inline]
__schedule+0xf15/0x5d00 kernel/sched/core.c:6746
__schedule_loop kernel/sched/core.c:6823 [inline]
schedule+0xe7/0x350 kernel/sched/core.c:6838
schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6895
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x5b8/0x9c0 kernel/locking/mutex.c:752
wiphy_lock include/net/cfg80211.h:5953 [inline]
reg_leave_invalid_chans net/wireless/reg.c:2466 [inline]
reg_check_chans_work+0x10a/0x10e0 net/wireless/reg.c:2481 |
In the Linux kernel, the following vulnerability has been resolved:
batman-adv: bypass empty buckets in batadv_purge_orig_ref()
Many syzbot reports are pointing to soft lockups in
batadv_purge_orig_ref() [1]
Root cause is unknown, but we can avoid spending too much
time there and perhaps get more interesting reports.
[1]
watchdog: BUG: soft lockup - CPU#0 stuck for 27s! [kworker/u4:6:621]
Modules linked in:
irq event stamp: 6182794
hardirqs last enabled at (6182793): [<ffff8000801dae10>] __local_bh_enable_ip+0x224/0x44c kernel/softirq.c:386
hardirqs last disabled at (6182794): [<ffff80008ad66a78>] __el1_irq arch/arm64/kernel/entry-common.c:533 [inline]
hardirqs last disabled at (6182794): [<ffff80008ad66a78>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551
softirqs last enabled at (6182792): [<ffff80008aab71c4>] spin_unlock_bh include/linux/spinlock.h:396 [inline]
softirqs last enabled at (6182792): [<ffff80008aab71c4>] batadv_purge_orig_ref+0x114c/0x1228 net/batman-adv/originator.c:1287
softirqs last disabled at (6182790): [<ffff80008aab61dc>] spin_lock_bh include/linux/spinlock.h:356 [inline]
softirqs last disabled at (6182790): [<ffff80008aab61dc>] batadv_purge_orig_ref+0x164/0x1228 net/batman-adv/originator.c:1271
CPU: 0 PID: 621 Comm: kworker/u4:6 Not tainted 6.8.0-rc7-syzkaller-g707081b61156 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Workqueue: bat_events batadv_purge_orig
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : should_resched arch/arm64/include/asm/preempt.h:79 [inline]
pc : __local_bh_enable_ip+0x228/0x44c kernel/softirq.c:388
lr : __local_bh_enable_ip+0x224/0x44c kernel/softirq.c:386
sp : ffff800099007970
x29: ffff800099007980 x28: 1fffe00018fce1bd x27: dfff800000000000
x26: ffff0000d2620008 x25: ffff0000c7e70de8 x24: 0000000000000001
x23: 1fffe00018e57781 x22: dfff800000000000 x21: ffff80008aab71c4
x20: ffff0001b40136c0 x19: ffff0000c72bbc08 x18: 1fffe0001a817bb0
x17: ffff800125414000 x16: ffff80008032116c x15: 0000000000000001
x14: 1fffe0001ee9d610 x13: 0000000000000000 x12: 0000000000000003
x11: 0000000000000000 x10: 0000000000ff0100 x9 : 0000000000000000
x8 : 00000000005e5789 x7 : ffff80008aab61dc x6 : 0000000000000000
x5 : 0000000000000000 x4 : 0000000000000001 x3 : 0000000000000000
x2 : 0000000000000006 x1 : 0000000000000080 x0 : ffff800125414000
Call trace:
__daif_local_irq_enable arch/arm64/include/asm/irqflags.h:27 [inline]
arch_local_irq_enable arch/arm64/include/asm/irqflags.h:49 [inline]
__local_bh_enable_ip+0x228/0x44c kernel/softirq.c:386
__raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline]
_raw_spin_unlock_bh+0x3c/0x4c kernel/locking/spinlock.c:210
spin_unlock_bh include/linux/spinlock.h:396 [inline]
batadv_purge_orig_ref+0x114c/0x1228 net/batman-adv/originator.c:1287
batadv_purge_orig+0x20/0x70 net/batman-adv/originator.c:1300
process_one_work+0x694/0x1204 kernel/workqueue.c:2633
process_scheduled_works kernel/workqueue.c:2706 [inline]
worker_thread+0x938/0xef4 kernel/workqueue.c:2787
kthread+0x288/0x310 kernel/kthread.c:388
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.8.0-rc7-syzkaller-g707081b61156 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : arch_local_irq_enable+0x8/0xc arch/arm64/include/asm/irqflags.h:51
lr : default_idle_call+0xf8/0x128 kernel/sched/idle.c:103
sp : ffff800093a17d30
x29: ffff800093a17d30 x28: dfff800000000000 x27: 1ffff00012742fb4
x26: ffff80008ec9d000 x25: 0000000000000000 x24: 0000000000000002
x23: 1ffff00011d93a74 x22: ffff80008ec9d3a0 x21: 0000000000000000
x20: ffff0000c19dbc00 x19: ffff8000802d0fd8 x18: 1fffe00036804396
x17: ffff80008ec9d000 x16: ffff8000802d089c x15: 0000000000000001
---truncated--- |
In the Linux kernel, the following vulnerability has been resolved:
wifi: mt76: mt7921s: fix potential hung tasks during chip recovery
During chip recovery (e.g. chip reset), there is a possible situation that
kernel worker reset_work is holding the lock and waiting for kernel thread
stat_worker to be parked, while stat_worker is waiting for the release of
the same lock.
It causes a deadlock resulting in the dumping of hung tasks messages and
possible rebooting of the device.
This patch prevents the execution of stat_worker during the chip recovery. |
In the Linux kernel, the following vulnerability has been resolved:
ext4: do not create EA inode under buffer lock
ext4_xattr_set_entry() creates new EA inodes while holding buffer lock
on the external xattr block. This is problematic as it nests all the
allocation locking (which acquires locks on other buffers) under the
buffer lock. This can even deadlock when the filesystem is corrupted and
e.g. quota file is setup to contain xattr block as data block. Move the
allocation of EA inode out of ext4_xattr_set_entry() into the callers. |
In the Linux kernel, the following vulnerability has been resolved:
serial: imx: Introduce timeout when waiting on transmitter empty
By waiting at most 1 second for USR2_TXDC to be set, we avoid a potential
deadlock.
In case of the timeout, there is not much we can do, so we simply ignore
the transmitter state and optimistically try to continue. |
In the Linux kernel, the following vulnerability has been resolved:
i2c: lpi2c: Avoid calling clk_get_rate during transfer
Instead of repeatedly calling clk_get_rate for each transfer, lock
the clock rate and cache the value.
A deadlock has been observed while adding tlv320aic32x4 audio codec to
the system. When this clock provider adds its clock, the clk mutex is
locked already, it needs to access i2c, which in return needs the mutex
for clk_get_rate as well. |
In the Linux kernel, the following vulnerability has been resolved:
eth: sungem: remove .ndo_poll_controller to avoid deadlocks
Erhard reports netpoll warnings from sungem:
netpoll_send_skb_on_dev(): eth0 enabled interrupts in poll (gem_start_xmit+0x0/0x398)
WARNING: CPU: 1 PID: 1 at net/core/netpoll.c:370 netpoll_send_skb+0x1fc/0x20c
gem_poll_controller() disables interrupts, which may sleep.
We can't sleep in netpoll, it has interrupts disabled completely.
Strangely, gem_poll_controller() doesn't even poll the completions,
and instead acts as if an interrupt has fired so it just schedules
NAPI and exits. None of this has been necessary for years, since
netpoll invokes NAPI directly. |
In the Linux kernel, the following vulnerability has been resolved:
RDMA/hns: Fix deadlock on SRQ async events.
xa_lock for SRQ table may be required in AEQ. Use xa_store_irq()/
xa_erase_irq() to avoid deadlock. |
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Reload only IB representors upon lag disable/enable
On lag disable, the bond IB device along with all of its
representors are destroyed, and then the slaves' representors get reloaded.
In case the slave IB representor load fails, the eswitch error flow
unloads all representors, including ethernet representors, where the
netdevs get detached and removed from lag bond. Such flow is inaccurate
as the lag driver is not responsible for loading/unloading ethernet
representors. Furthermore, the flow described above begins by holding
lag lock to prevent bond changes during disable flow. However, when
reaching the ethernet representors detachment from lag, the lag lock is
required again, triggering the following deadlock:
Call trace:
__switch_to+0xf4/0x148
__schedule+0x2c8/0x7d0
schedule+0x50/0xe0
schedule_preempt_disabled+0x18/0x28
__mutex_lock.isra.13+0x2b8/0x570
__mutex_lock_slowpath+0x1c/0x28
mutex_lock+0x4c/0x68
mlx5_lag_remove_netdev+0x3c/0x1a0 [mlx5_core]
mlx5e_uplink_rep_disable+0x70/0xa0 [mlx5_core]
mlx5e_detach_netdev+0x6c/0xb0 [mlx5_core]
mlx5e_netdev_change_profile+0x44/0x138 [mlx5_core]
mlx5e_netdev_attach_nic_profile+0x28/0x38 [mlx5_core]
mlx5e_vport_rep_unload+0x184/0x1b8 [mlx5_core]
mlx5_esw_offloads_rep_load+0xd8/0xe0 [mlx5_core]
mlx5_eswitch_reload_reps+0x74/0xd0 [mlx5_core]
mlx5_disable_lag+0x130/0x138 [mlx5_core]
mlx5_lag_disable_change+0x6c/0x70 [mlx5_core] // hold ldev->lock
mlx5_devlink_eswitch_mode_set+0xc0/0x410 [mlx5_core]
devlink_nl_cmd_eswitch_set_doit+0xdc/0x180
genl_family_rcv_msg_doit.isra.17+0xe8/0x138
genl_rcv_msg+0xe4/0x220
netlink_rcv_skb+0x44/0x108
genl_rcv+0x40/0x58
netlink_unicast+0x198/0x268
netlink_sendmsg+0x1d4/0x418
sock_sendmsg+0x54/0x60
__sys_sendto+0xf4/0x120
__arm64_sys_sendto+0x30/0x40
el0_svc_common+0x8c/0x120
do_el0_svc+0x30/0xa0
el0_svc+0x20/0x30
el0_sync_handler+0x90/0xb8
el0_sync+0x160/0x180
Thus, upon lag enable/disable, load and unload only the IB representors
of the slaves preventing the deadlock mentioned above.
While at it, refactor the mlx5_esw_offloads_rep_load() function to have
a static helper method for its internal logic, in symmetry with the
representor unload design. |
In the Linux kernel, the following vulnerability has been resolved:
net: fec: remove .ndo_poll_controller to avoid deadlocks
There is a deadlock issue found in sungem driver, please refer to the
commit ac0a230f719b ("eth: sungem: remove .ndo_poll_controller to avoid
deadlocks"). The root cause of the issue is that netpoll is in atomic
context and disable_irq() is called by .ndo_poll_controller interface
of sungem driver, however, disable_irq() might sleep. After analyzing
the implementation of fec_poll_controller(), the fec driver should have
the same issue. Due to the fec driver uses NAPI for TX completions, the
.ndo_poll_controller is unnecessary to be implemented in the fec driver,
so fec_poll_controller() can be safely removed. |
In the Linux kernel, the following vulnerability has been resolved:
Revert "media: v4l2-ctrls: show all owned controls in log_status"
This reverts commit 9801b5b28c6929139d6fceeee8d739cc67bb2739.
This patch introduced a potential deadlock scenario:
[Wed May 8 10:02:06 2024] Possible unsafe locking scenario:
[Wed May 8 10:02:06 2024] CPU0 CPU1
[Wed May 8 10:02:06 2024] ---- ----
[Wed May 8 10:02:06 2024] lock(vivid_ctrls:1620:(hdl_vid_cap)->_lock);
[Wed May 8 10:02:06 2024] lock(vivid_ctrls:1608:(hdl_user_vid)->_lock);
[Wed May 8 10:02:06 2024] lock(vivid_ctrls:1620:(hdl_vid_cap)->_lock);
[Wed May 8 10:02:06 2024] lock(vivid_ctrls:1608:(hdl_user_vid)->_lock);
For now just revert. |
In the Linux kernel, the following vulnerability has been resolved:
Reapply "drm/qxl: simplify qxl_fence_wait"
This reverts commit 07ed11afb68d94eadd4ffc082b97c2331307c5ea.
Stephen Rostedt reports:
"I went to run my tests on my VMs and the tests hung on boot up.
Unfortunately, the most I ever got out was:
[ 93.607888] Testing event system initcall: OK
[ 93.667730] Running tests on all trace events:
[ 93.669757] Testing all events: OK
[ 95.631064] ------------[ cut here ]------------
Timed out after 60 seconds"
and further debugging points to a possible circular locking dependency
between the console_owner locking and the worker pool locking.
Reverting the commit allows Steve's VM to boot to completion again.
[ This may obviously result in the "[TTM] Buffer eviction failed"
messages again, which was the reason for that original revert. But at
this point this seems preferable to a non-booting system... ] |
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Release hbalock before calling lpfc_worker_wake_up()
lpfc_worker_wake_up() calls the lpfc_work_done() routine, which takes the
hbalock. Thus, lpfc_worker_wake_up() should not be called while holding the
hbalock to avoid potential deadlock. |
In the Linux kernel, the following vulnerability has been resolved:
mm: use memalloc_nofs_save() in page_cache_ra_order()
See commit f2c817bed58d ("mm: use memalloc_nofs_save in readahead path"),
ensure that page_cache_ra_order() do not attempt to reclaim file-backed
pages too, or it leads to a deadlock, found issue when test ext4 large
folio.
INFO: task DataXceiver for:7494 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:DataXceiver for state:D stack:0 pid:7494 ppid:1 flags:0x00000200
Call trace:
__switch_to+0x14c/0x240
__schedule+0x82c/0xdd0
schedule+0x58/0xf0
io_schedule+0x24/0xa0
__folio_lock+0x130/0x300
migrate_pages_batch+0x378/0x918
migrate_pages+0x350/0x700
compact_zone+0x63c/0xb38
compact_zone_order+0xc0/0x118
try_to_compact_pages+0xb0/0x280
__alloc_pages_direct_compact+0x98/0x248
__alloc_pages+0x510/0x1110
alloc_pages+0x9c/0x130
folio_alloc+0x20/0x78
filemap_alloc_folio+0x8c/0x1b0
page_cache_ra_order+0x174/0x308
ondemand_readahead+0x1c8/0x2b8
page_cache_async_ra+0x68/0xb8
filemap_readahead.isra.0+0x64/0xa8
filemap_get_pages+0x3fc/0x5b0
filemap_splice_read+0xf4/0x280
ext4_file_splice_read+0x2c/0x48 [ext4]
vfs_splice_read.part.0+0xa8/0x118
splice_direct_to_actor+0xbc/0x288
do_splice_direct+0x9c/0x108
do_sendfile+0x328/0x468
__arm64_sys_sendfile64+0x8c/0x148
invoke_syscall+0x4c/0x118
el0_svc_common.constprop.0+0xc8/0xf0
do_el0_svc+0x24/0x38
el0_svc+0x4c/0x1f8
el0t_64_sync_handler+0xc0/0xc8
el0t_64_sync+0x188/0x190 |
In the Linux kernel, the following vulnerability has been resolved:
media: usbtv: Remove useless locks in usbtv_video_free()
Remove locks calls in usbtv_video_free() because
are useless and may led to a deadlock as reported here:
https://syzkaller.appspot.com/x/bisect.txt?x=166dc872180000
Also remove usbtv_stop() call since it will be called when
unregistering the device.
Before 'c838530d230b' this issue would only be noticed if you
disconnect while streaming and now it is noticeable even when
disconnecting while not streaming.
[hverkuil: fix minor spelling mistake in log message] |
In the Linux kernel, the following vulnerability has been resolved:
block: fix deadlock between bd_link_disk_holder and partition scan
'open_mutex' of gendisk is used to protect open/close block devices. But
in bd_link_disk_holder(), it is used to protect the creation of symlink
between holding disk and slave bdev, which introduces some issues.
When bd_link_disk_holder() is called, the driver is usually in the process
of initialization/modification and may suspend submitting io. At this
time, any io hold 'open_mutex', such as scanning partitions, can cause
deadlocks. For example, in raid:
T1 T2
bdev_open_by_dev
lock open_mutex [1]
...
efi_partition
...
md_submit_bio
md_ioctl mddev_syspend
-> suspend all io
md_add_new_disk
bind_rdev_to_array
bd_link_disk_holder
try lock open_mutex [2]
md_handle_request
-> wait mddev_resume
T1 scan partition, T2 add a new device to raid. T1 waits for T2 to resume
mddev, but T2 waits for open_mutex held by T1. Deadlock occurs.
Fix it by introducing a local mutex 'blk_holder_mutex' to replace
'open_mutex'. |
In the Linux kernel, the following vulnerability has been resolved:
scsi: hisi_sas: Fix a deadlock issue related to automatic dump
If we issue a disabling PHY command, the device attached with it will go
offline, if a 2 bit ECC error occurs at the same time, a hung task may be
found:
[ 4613.652388] INFO: task kworker/u256:0:165233 blocked for more than 120 seconds.
[ 4613.666297] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4613.674809] task:kworker/u256:0 state:D stack: 0 pid:165233 ppid: 2 flags:0x00000208
[ 4613.683959] Workqueue: 0000:74:02.0_disco_q sas_revalidate_domain [libsas]
[ 4613.691518] Call trace:
[ 4613.694678] __switch_to+0xf8/0x17c
[ 4613.698872] __schedule+0x660/0xee0
[ 4613.703063] schedule+0xac/0x240
[ 4613.706994] schedule_timeout+0x500/0x610
[ 4613.711705] __down+0x128/0x36c
[ 4613.715548] down+0x240/0x2d0
[ 4613.719221] hisi_sas_internal_abort_timeout+0x1bc/0x260 [hisi_sas_main]
[ 4613.726618] sas_execute_internal_abort+0x144/0x310 [libsas]
[ 4613.732976] sas_execute_internal_abort_dev+0x44/0x60 [libsas]
[ 4613.739504] hisi_sas_internal_task_abort_dev.isra.0+0xbc/0x1b0 [hisi_sas_main]
[ 4613.747499] hisi_sas_dev_gone+0x174/0x250 [hisi_sas_main]
[ 4613.753682] sas_notify_lldd_dev_gone+0xec/0x2e0 [libsas]
[ 4613.759781] sas_unregister_common_dev+0x4c/0x7a0 [libsas]
[ 4613.765962] sas_destruct_devices+0xb8/0x120 [libsas]
[ 4613.771709] sas_do_revalidate_domain.constprop.0+0x1b8/0x31c [libsas]
[ 4613.778930] sas_revalidate_domain+0x60/0xa4 [libsas]
[ 4613.784716] process_one_work+0x248/0x950
[ 4613.789424] worker_thread+0x318/0x934
[ 4613.793878] kthread+0x190/0x200
[ 4613.797810] ret_from_fork+0x10/0x18
[ 4613.802121] INFO: task kworker/u256:4:316722 blocked for more than 120 seconds.
[ 4613.816026] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4613.824538] task:kworker/u256:4 state:D stack: 0 pid:316722 ppid: 2 flags:0x00000208
[ 4613.833670] Workqueue: 0000:74:02.0 hisi_sas_rst_work_handler [hisi_sas_main]
[ 4613.841491] Call trace:
[ 4613.844647] __switch_to+0xf8/0x17c
[ 4613.848852] __schedule+0x660/0xee0
[ 4613.853052] schedule+0xac/0x240
[ 4613.856984] schedule_timeout+0x500/0x610
[ 4613.861695] __down+0x128/0x36c
[ 4613.865542] down+0x240/0x2d0
[ 4613.869216] hisi_sas_controller_prereset+0x58/0x1fc [hisi_sas_main]
[ 4613.876324] hisi_sas_rst_work_handler+0x40/0x8c [hisi_sas_main]
[ 4613.883019] process_one_work+0x248/0x950
[ 4613.887732] worker_thread+0x318/0x934
[ 4613.892204] kthread+0x190/0x200
[ 4613.896118] ret_from_fork+0x10/0x18
[ 4613.900423] INFO: task kworker/u256:1:348985 blocked for more than 121 seconds.
[ 4613.914341] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4613.922852] task:kworker/u256:1 state:D stack: 0 pid:348985 ppid: 2 flags:0x00000208
[ 4613.931984] Workqueue: 0000:74:02.0_event_q sas_port_event_worker [libsas]
[ 4613.939549] Call trace:
[ 4613.942702] __switch_to+0xf8/0x17c
[ 4613.946892] __schedule+0x660/0xee0
[ 4613.951083] schedule+0xac/0x240
[ 4613.955015] schedule_timeout+0x500/0x610
[ 4613.959725] wait_for_common+0x200/0x610
[ 4613.964349] wait_for_completion+0x3c/0x5c
[ 4613.969146] flush_workqueue+0x198/0x790
[ 4613.973776] sas_porte_broadcast_rcvd+0x1e8/0x320 [libsas]
[ 4613.979960] sas_port_event_worker+0x54/0xa0 [libsas]
[ 4613.985708] process_one_work+0x248/0x950
[ 4613.990420] worker_thread+0x318/0x934
[ 4613.994868] kthread+0x190/0x200
[ 4613.998800] ret_from_fork+0x10/0x18
This is because when the device goes offline, we obtain the hisi_hba
semaphore and send the ABORT_DEV command to the device. However, the
internal abort timed out due to the 2 bit ECC error and triggers automatic
dump. In addition, since the hisi_hba semaphore has been obtained, the dump
cannot be executed and the controller cannot be reset.
Therefore, the deadlocks occur on the following circular dependencies
---truncated--- |