CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
In libtirpc before 1.3.3rc1, remote attackers could exhaust the file descriptors of a process that uses libtirpc because idle TCP connections are mishandled. This can, in turn, lead to an svc_run infinite loop without accepting new connections. |
Issue summary: Checking excessively long DH keys or parameters may be very slow.
Impact summary: Applications that use the functions DH_check(), DH_check_ex()
or EVP_PKEY_param_check() to check a DH key or DH parameters may experience long
delays. Where the key or parameters that are being checked have been obtained
from an untrusted source this may lead to a Denial of Service.
The function DH_check() performs various checks on DH parameters. After fixing
CVE-2023-3446 it was discovered that a large q parameter value can also trigger
an overly long computation during some of these checks. A correct q value,
if present, cannot be larger than the modulus p parameter, thus it is
unnecessary to perform these checks if q is larger than p.
An application that calls DH_check() and supplies a key or parameters obtained
from an untrusted source could be vulnerable to a Denial of Service attack.
The function DH_check() is itself called by a number of other OpenSSL functions.
An application calling any of those other functions may similarly be affected.
The other functions affected by this are DH_check_ex() and
EVP_PKEY_param_check().
Also vulnerable are the OpenSSL dhparam and pkeyparam command line applications
when using the "-check" option.
The OpenSSL SSL/TLS implementation is not affected by this issue.
The OpenSSL 3.0 and 3.1 FIPS providers are not affected by this issue. |
In the Linux kernel, the following vulnerability has been resolved:
net: switchdev: Convert blocking notification chain to a raw one
A blocking notification chain uses a read-write semaphore to protect the
integrity of the chain. The semaphore is acquired for writing when
adding / removing notifiers to / from the chain and acquired for reading
when traversing the chain and informing notifiers about an event.
In case of the blocking switchdev notification chain, recursive
notifications are possible which leads to the semaphore being acquired
twice for reading and to lockdep warnings being generated [1].
Specifically, this can happen when the bridge driver processes a
SWITCHDEV_BRPORT_UNOFFLOADED event which causes it to emit notifications
about deferred events when calling switchdev_deferred_process().
Fix this by converting the notification chain to a raw notification
chain in a similar fashion to the netdev notification chain. Protect
the chain using the RTNL mutex by acquiring it when modifying the chain.
Events are always informed under the RTNL mutex, but add an assertion in
call_switchdev_blocking_notifiers() to make sure this is not violated in
the future.
Maintain the "blocking" prefix as events are always emitted from process
context and listeners are allowed to block.
[1]:
WARNING: possible recursive locking detected
6.14.0-rc4-custom-g079270089484 #1 Not tainted
--------------------------------------------
ip/52731 is trying to acquire lock:
ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
but task is already holding lock:
ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock((switchdev_blocking_notif_chain).rwsem);
lock((switchdev_blocking_notif_chain).rwsem);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by ip/52731:
#0: ffffffff84f795b0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x727/0x1dc0
#1: ffffffff8731f628 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x790/0x1dc0
#2: ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
stack backtrace:
...
? __pfx_down_read+0x10/0x10
? __pfx_mark_lock+0x10/0x10
? __pfx_switchdev_port_attr_set_deferred+0x10/0x10
blocking_notifier_call_chain+0x58/0xa0
switchdev_port_attr_notify.constprop.0+0xb3/0x1b0
? __pfx_switchdev_port_attr_notify.constprop.0+0x10/0x10
? mark_held_locks+0x94/0xe0
? switchdev_deferred_process+0x11a/0x340
switchdev_port_attr_set_deferred+0x27/0xd0
switchdev_deferred_process+0x164/0x340
br_switchdev_port_unoffload+0xc8/0x100 [bridge]
br_switchdev_blocking_event+0x29f/0x580 [bridge]
notifier_call_chain+0xa2/0x440
blocking_notifier_call_chain+0x6e/0xa0
switchdev_bridge_port_unoffload+0xde/0x1a0
... |
In the Linux kernel, the following vulnerability has been resolved:
openvswitch: fix lockup on tx to unregistering netdev with carrier
Commit in a fixes tag attempted to fix the issue in the following
sequence of calls:
do_output
-> ovs_vport_send
-> dev_queue_xmit
-> __dev_queue_xmit
-> netdev_core_pick_tx
-> skb_tx_hash
When device is unregistering, the 'dev->real_num_tx_queues' goes to
zero and the 'while (unlikely(hash >= qcount))' loop inside the
'skb_tx_hash' becomes infinite, locking up the core forever.
But unfortunately, checking just the carrier status is not enough to
fix the issue, because some devices may still be in unregistering
state while reporting carrier status OK.
One example of such device is a net/dummy. It sets carrier ON
on start, but it doesn't implement .ndo_stop to set the carrier off.
And it makes sense, because dummy doesn't really have a carrier.
Therefore, while this device is unregistering, it's still easy to hit
the infinite loop in the skb_tx_hash() from the OVS datapath. There
might be other drivers that do the same, but dummy by itself is
important for the OVS ecosystem, because it is frequently used as a
packet sink for tcpdump while debugging OVS deployments. And when the
issue is hit, the only way to recover is to reboot.
Fix that by also checking if the device is running. The running
state is handled by the net core during unregistering, so it covers
unregistering case better, and we don't really need to send packets
to devices that are not running anyway.
While only checking the running state might be enough, the carrier
check is preserved. The running and the carrier states seem disjoined
throughout the code and different drivers. And other core functions
like __dev_direct_xmit() check both before attempting to transmit
a packet. So, it seems safer to check both flags in OVS as well. |
In the Linux kernel, the following vulnerability has been resolved:
filemap: Fix bounds checking in filemap_read()
If the caller supplies an iocb->ki_pos value that is close to the
filesystem upper limit, and an iterator with a count that causes us to
overflow that limit, then filemap_read() enters an infinite loop.
This behaviour was discovered when testing xfstests generic/525 with the
"localio" optimisation for loopback NFS mounts. |
In the Linux kernel, the following vulnerability has been resolved:
md: fix deadlock between mddev_suspend and flush bio
Deadlock occurs when mddev is being suspended while some flush bio is in
progress. It is a complex issue.
T1. the first flush is at the ending stage, it clears 'mddev->flush_bio'
and tries to submit data, but is blocked because mddev is suspended
by T4.
T2. the second flush sets 'mddev->flush_bio', and attempts to queue
md_submit_flush_data(), which is already running (T1) and won't
execute again if on the same CPU as T1.
T3. the third flush inc active_io and tries to flush, but is blocked because
'mddev->flush_bio' is not NULL (set by T2).
T4. mddev_suspend() is called and waits for active_io dec to 0 which is inc
by T3.
T1 T2 T3 T4
(flush 1) (flush 2) (third 3) (suspend)
md_submit_flush_data
mddev->flush_bio = NULL;
.
. md_flush_request
. mddev->flush_bio = bio
. queue submit_flushes
. .
. . md_handle_request
. . active_io + 1
. . md_flush_request
. . wait !mddev->flush_bio
. .
. . mddev_suspend
. . wait !active_io
. .
. submit_flushes
. queue_work md_submit_flush_data
. //md_submit_flush_data is already running (T1)
.
md_handle_request
wait resume
The root issue is non-atomic inc/dec of active_io during flush process.
active_io is dec before md_submit_flush_data is queued, and inc soon
after md_submit_flush_data() run.
md_flush_request
active_io + 1
submit_flushes
active_io - 1
md_submit_flush_data
md_handle_request
active_io + 1
make_request
active_io - 1
If active_io is dec after md_handle_request() instead of within
submit_flushes(), make_request() can be called directly intead of
md_handle_request() in md_submit_flush_data(), and active_io will
only inc and dec once in the whole flush process. Deadlock will be
fixed.
Additionally, the only difference between fixing the issue and before is
that there is no return error handling of make_request(). But after
previous patch cleaned md_write_start(), make_requst() only return error
in raid5_make_request() by dm-raid, see commit 41425f96d7aa ("dm-raid456,
md/raid456: fix a deadlock for dm-raid456 while io concurrent with
reshape)". Since dm always splits data and flush operation into two
separate io, io size of flush submitted by dm always is 0, make_request()
will not be called in md_submit_flush_data(). To prevent future
modifications from introducing issues, add WARN_ON to ensure
make_request() no error is returned in this context. |
In the Linux kernel, the following vulnerability has been resolved:
dm-raid: Fix WARN_ON_ONCE check for sync_thread in raid_resume
rm-raid devices will occasionally trigger the following warning when
being resumed after a table load because DM_RECOVERY_RUNNING is set:
WARNING: CPU: 7 PID: 5660 at drivers/md/dm-raid.c:4105 raid_resume+0xee/0x100 [dm_raid]
The failing check is:
WARN_ON_ONCE(test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));
This check is designed to make sure that the sync thread isn't
registered, but md_check_recovery can set MD_RECOVERY_RUNNING without
the sync_thread ever getting registered. Instead of checking if
MD_RECOVERY_RUNNING is set, check if sync_thread is non-NULL. |
In the Linux kernel, the following vulnerability has been resolved:
f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid
mkdir /mnt/test/comp
f2fs_io setflags compression /mnt/test/comp
dd if=/dev/zero of=/mnt/test/comp/testfile bs=16k count=1
truncate --size 13 /mnt/test/comp/testfile
In the above scenario, we can get a BUG_ON.
kernel BUG at fs/f2fs/segment.c:3589!
Call Trace:
do_write_page+0x78/0x390 [f2fs]
f2fs_outplace_write_data+0x62/0xb0 [f2fs]
f2fs_do_write_data_page+0x275/0x740 [f2fs]
f2fs_write_single_data_page+0x1dc/0x8f0 [f2fs]
f2fs_write_multi_pages+0x1e5/0xae0 [f2fs]
f2fs_write_cache_pages+0xab1/0xc60 [f2fs]
f2fs_write_data_pages+0x2d8/0x330 [f2fs]
do_writepages+0xcf/0x270
__writeback_single_inode+0x44/0x350
writeback_sb_inodes+0x242/0x530
__writeback_inodes_wb+0x54/0xf0
wb_writeback+0x192/0x310
wb_workfn+0x30d/0x400
The reason is we gave CURSEG_ALL_DATA_ATGC to COMPR_ADDR where the
page was set the gcing flag by set_cluster_dirty(). |
In the Linux kernel, the following vulnerability has been resolved:
x86/bhi: Avoid warning in #DB handler due to BHI mitigation
When BHI mitigation is enabled, if SYSENTER is invoked with the TF flag set
then entry_SYSENTER_compat() uses CLEAR_BRANCH_HISTORY and calls the
clear_bhb_loop() before the TF flag is cleared. This causes the #DB handler
(exc_debug_kernel()) to issue a warning because single-step is used outside the
entry_SYSENTER_compat() function.
To address this issue, entry_SYSENTER_compat() should use CLEAR_BRANCH_HISTORY
after making sure the TF flag is cleared.
The problem can be reproduced with the following sequence:
$ cat sysenter_step.c
int main()
{ asm("pushf; pop %ax; bts $8,%ax; push %ax; popf; sysenter"); }
$ gcc -o sysenter_step sysenter_step.c
$ ./sysenter_step
Segmentation fault (core dumped)
The program is expected to crash, and the #DB handler will issue a warning.
Kernel log:
WARNING: CPU: 27 PID: 7000 at arch/x86/kernel/traps.c:1009 exc_debug_kernel+0xd2/0x160
...
RIP: 0010:exc_debug_kernel+0xd2/0x160
...
Call Trace:
<#DB>
? show_regs+0x68/0x80
? __warn+0x8c/0x140
? exc_debug_kernel+0xd2/0x160
? report_bug+0x175/0x1a0
? handle_bug+0x44/0x90
? exc_invalid_op+0x1c/0x70
? asm_exc_invalid_op+0x1f/0x30
? exc_debug_kernel+0xd2/0x160
exc_debug+0x43/0x50
asm_exc_debug+0x1e/0x40
RIP: 0010:clear_bhb_loop+0x0/0xb0
...
</#DB>
<TASK>
? entry_SYSENTER_compat_after_hwframe+0x6e/0x8d
</TASK>
[ bp: Massage commit message. ] |
In the Linux kernel, the following vulnerability has been resolved:
net: ks8851: Fix deadlock with the SPI chip variant
When SMP is enabled and spinlocks are actually functional then there is
a deadlock with the 'statelock' spinlock between ks8851_start_xmit_spi
and ks8851_irq:
watchdog: BUG: soft lockup - CPU#0 stuck for 27s!
call trace:
queued_spin_lock_slowpath+0x100/0x284
do_raw_spin_lock+0x34/0x44
ks8851_start_xmit_spi+0x30/0xb8
ks8851_start_xmit+0x14/0x20
netdev_start_xmit+0x40/0x6c
dev_hard_start_xmit+0x6c/0xbc
sch_direct_xmit+0xa4/0x22c
__qdisc_run+0x138/0x3fc
qdisc_run+0x24/0x3c
net_tx_action+0xf8/0x130
handle_softirqs+0x1ac/0x1f0
__do_softirq+0x14/0x20
____do_softirq+0x10/0x1c
call_on_irq_stack+0x3c/0x58
do_softirq_own_stack+0x1c/0x28
__irq_exit_rcu+0x54/0x9c
irq_exit_rcu+0x10/0x1c
el1_interrupt+0x38/0x50
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x64/0x68
__netif_schedule+0x6c/0x80
netif_tx_wake_queue+0x38/0x48
ks8851_irq+0xb8/0x2c8
irq_thread_fn+0x2c/0x74
irq_thread+0x10c/0x1b0
kthread+0xc8/0xd8
ret_from_fork+0x10/0x20
This issue has not been identified earlier because tests were done on
a device with SMP disabled and so spinlocks were actually NOPs.
Now use spin_(un)lock_bh for TX queue related locking to avoid execution
of softirq work synchronously that would lead to a deadlock. |
In the Linux kernel, the following vulnerability has been resolved:
wifi: mac80211: Fix deadlock in ieee80211_sta_ps_deliver_wakeup()
The ieee80211_sta_ps_deliver_wakeup() function takes sta->ps_lock to
synchronizes with ieee80211_tx_h_unicast_ps_buf() which is called from
softirq context. However using only spin_lock() to get sta->ps_lock in
ieee80211_sta_ps_deliver_wakeup() does not prevent softirq to execute
on this same CPU, to run ieee80211_tx_h_unicast_ps_buf() and try to
take this same lock ending in deadlock. Below is an example of rcu stall
that arises in such situation.
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 2-....: (42413413 ticks this GP) idle=b154/1/0x4000000000000000 softirq=1763/1765 fqs=21206996
rcu: (t=42586894 jiffies g=2057 q=362405 ncpus=4)
CPU: 2 PID: 719 Comm: wpa_supplicant Tainted: G W 6.4.0-02158-g1b062f552873 #742
Hardware name: RPT (r1) (DT)
pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : queued_spin_lock_slowpath+0x58/0x2d0
lr : invoke_tx_handlers_early+0x5b4/0x5c0
sp : ffff00001ef64660
x29: ffff00001ef64660 x28: ffff000009bc1070 x27: ffff000009bc0ad8
x26: ffff000009bc0900 x25: ffff00001ef647a8 x24: 0000000000000000
x23: ffff000009bc0900 x22: ffff000009bc0900 x21: ffff00000ac0e000
x20: ffff00000a279e00 x19: ffff00001ef646e8 x18: 0000000000000000
x17: ffff800016468000 x16: ffff00001ef608c0 x15: 0010533c93f64f80
x14: 0010395c9faa3946 x13: 0000000000000000 x12: 00000000fa83b2da
x11: 000000012edeceea x10: ffff0000010fbe00 x9 : 0000000000895440
x8 : 000000000010533c x7 : ffff00000ad8b740 x6 : ffff00000c350880
x5 : 0000000000000007 x4 : 0000000000000001 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000001 x0 : ffff00000ac0e0e8
Call trace:
queued_spin_lock_slowpath+0x58/0x2d0
ieee80211_tx+0x80/0x12c
ieee80211_tx_pending+0x110/0x278
tasklet_action_common.constprop.0+0x10c/0x144
tasklet_action+0x20/0x28
_stext+0x11c/0x284
____do_softirq+0xc/0x14
call_on_irq_stack+0x24/0x34
do_softirq_own_stack+0x18/0x20
do_softirq+0x74/0x7c
__local_bh_enable_ip+0xa0/0xa4
_ieee80211_wake_txqs+0x3b0/0x4b8
__ieee80211_wake_queue+0x12c/0x168
ieee80211_add_pending_skbs+0xec/0x138
ieee80211_sta_ps_deliver_wakeup+0x2a4/0x480
ieee80211_mps_sta_status_update.part.0+0xd8/0x11c
ieee80211_mps_sta_status_update+0x18/0x24
sta_apply_parameters+0x3bc/0x4c0
ieee80211_change_station+0x1b8/0x2dc
nl80211_set_station+0x444/0x49c
genl_family_rcv_msg_doit.isra.0+0xa4/0xfc
genl_rcv_msg+0x1b0/0x244
netlink_rcv_skb+0x38/0x10c
genl_rcv+0x34/0x48
netlink_unicast+0x254/0x2bc
netlink_sendmsg+0x190/0x3b4
____sys_sendmsg+0x1e8/0x218
___sys_sendmsg+0x68/0x8c
__sys_sendmsg+0x44/0x84
__arm64_sys_sendmsg+0x20/0x28
do_el0_svc+0x6c/0xe8
el0_svc+0x14/0x48
el0t_64_sync_handler+0xb0/0xb4
el0t_64_sync+0x14c/0x150
Using spin_lock_bh()/spin_unlock_bh() instead prevents softirq to raise
on the same CPU that is holding the lock. |
In the Linux kernel, the following vulnerability has been resolved:
md/raid5: fix deadlock that raid5d() wait for itself to clear MD_SB_CHANGE_PENDING
Xiao reported that lvm2 test lvconvert-raid-takeover.sh can hang with
small possibility, the root cause is exactly the same as commit
bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
However, Dan reported another hang after that, and junxiao investigated
the problem and found out that this is caused by plugged bio can't issue
from raid5d().
Current implementation in raid5d() has a weird dependence:
1) md_check_recovery() from raid5d() must hold 'reconfig_mutex' to clear
MD_SB_CHANGE_PENDING;
2) raid5d() handles IO in a deadloop, until all IO are issued;
3) IO from raid5d() must wait for MD_SB_CHANGE_PENDING to be cleared;
This behaviour is introduce before v2.6, and for consequence, if other
context hold 'reconfig_mutex', and md_check_recovery() can't update
super_block, then raid5d() will waste one cpu 100% by the deadloop, until
'reconfig_mutex' is released.
Refer to the implementation from raid1 and raid10, fix this problem by
skipping issue IO if MD_SB_CHANGE_PENDING is still set after
md_check_recovery(), daemon thread will be woken up when 'reconfig_mutex'
is released. Meanwhile, the hang problem will be fixed as well. |
In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix to avoid potential panic during recovery
During recovery, if FAULT_BLOCK is on, it is possible that
f2fs_reserve_new_block() will return -ENOSPC during recovery,
then it may trigger panic.
Also, if fault injection rate is 1 and only FAULT_BLOCK fault
type is on, it may encounter deadloop in loop of block reservation.
Let's change as below to fix these issues:
- remove bug_on() to avoid panic.
- limit the loop count of block reservation to avoid potential
deadloop. |
In the Linux kernel, the following vulnerability has been resolved:
scsi: Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock"
This reverts commit 1a1975551943f681772720f639ff42fbaa746212.
This commit causes interrupts to be lost for FCoE devices, since it changed
sping locks from "bh" to "irqsave".
Instead, a work queue should be used, and will be addressed in a separate
commit. |
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: af_bluetooth: Fix deadlock
Attemting to do sock_lock on .recvmsg may cause a deadlock as shown
bellow, so instead of using sock_sock this uses sk_receive_queue.lock
on bt_sock_ioctl to avoid the UAF:
INFO: task kworker/u9:1:121 blocked for more than 30 seconds.
Not tainted 6.7.6-lemon #183
Workqueue: hci0 hci_rx_work
Call Trace:
<TASK>
__schedule+0x37d/0xa00
schedule+0x32/0xe0
__lock_sock+0x68/0xa0
? __pfx_autoremove_wake_function+0x10/0x10
lock_sock_nested+0x43/0x50
l2cap_sock_recv_cb+0x21/0xa0
l2cap_recv_frame+0x55b/0x30a0
? psi_task_switch+0xeb/0x270
? finish_task_switch.isra.0+0x93/0x2a0
hci_rx_work+0x33a/0x3f0
process_one_work+0x13a/0x2f0
worker_thread+0x2f0/0x410
? __pfx_worker_thread+0x10/0x10
kthread+0xe0/0x110
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK> |
In the Linux kernel, the following vulnerability has been resolved:
PCI/ASPM: Fix deadlock when enabling ASPM
A last minute revert in 6.7-final introduced a potential deadlock when
enabling ASPM during probe of Qualcomm PCIe controllers as reported by
lockdep:
============================================
WARNING: possible recursive locking detected
6.7.0 #40 Not tainted
--------------------------------------------
kworker/u16:5/90 is trying to acquire lock:
ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pcie_aspm_pm_state_change+0x58/0xdc
but task is already holding lock:
ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x34/0xbc
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(pci_bus_sem);
lock(pci_bus_sem);
*** DEADLOCK ***
Call trace:
print_deadlock_bug+0x25c/0x348
__lock_acquire+0x10a4/0x2064
lock_acquire+0x1e8/0x318
down_read+0x60/0x184
pcie_aspm_pm_state_change+0x58/0xdc
pci_set_full_power_state+0xa8/0x114
pci_set_power_state+0xc4/0x120
qcom_pcie_enable_aspm+0x1c/0x3c [pcie_qcom]
pci_walk_bus+0x64/0xbc
qcom_pcie_host_post_init_2_7_0+0x28/0x34 [pcie_qcom]
The deadlock can easily be reproduced on machines like the Lenovo ThinkPad
X13s by adding a delay to increase the race window during asynchronous
probe where another thread can take a write lock.
Add a new pci_set_power_state_locked() and associated helper functions that
can be called with the PCI bus semaphore held to avoid taking the read lock
twice. |
In the Linux kernel, the following vulnerability has been resolved:
crypto: qcom-rng - fix infinite loop on requests not multiple of WORD_SZ
The commit referenced in the Fixes tag removed the 'break' from the else
branch in qcom_rng_read(), causing an infinite loop whenever 'max' is
not a multiple of WORD_SZ. This can be reproduced e.g. by running:
kcapi-rng -b 67 >/dev/null
There are many ways to fix this without adding back the 'break', but
they all seem more awkward than simply adding it back, so do just that.
Tested on a machine with Qualcomm Amberwing processor. |
In the Linux kernel, the following vulnerability has been resolved:
audit: improve robustness of the audit queue handling
If the audit daemon were ever to get stuck in a stopped state the
kernel's kauditd_thread() could get blocked attempting to send audit
records to the userspace audit daemon. With the kernel thread
blocked it is possible that the audit queue could grow unbounded as
certain audit record generating events must be exempt from the queue
limits else the system enter a deadlock state.
This patch resolves this problem by lowering the kernel thread's
socket sending timeout from MAX_SCHEDULE_TIMEOUT to HZ/10 and tweaks
the kauditd_send_queue() function to better manage the various audit
queues when connection problems occur between the kernel and the
audit daemon. With this patch, the backlog may temporarily grow
beyond the defined limits when the audit daemon is stopped and the
system is under heavy audit pressure, but kauditd_thread() will
continue to make progress and drain the queues as it would for other
connection problems. For example, with the audit daemon put into a
stopped state and the system configured to audit every syscall it
was still possible to shutdown the system without a kernel panic,
deadlock, etc.; granted, the system was slow to shutdown but that is
to be expected given the extreme pressure of recording every syscall.
The timeout value of HZ/10 was chosen primarily through
experimentation and this developer's "gut feeling". There is likely
no one perfect value, but as this scenario is limited in scope (root
privileges would be needed to send SIGSTOP to the audit daemon), it
is likely not worth exposing this as a tunable at present. This can
always be done at a later date if it proves necessary. |
In the Linux kernel, the following vulnerability has been resolved:
s390/qeth: fix deadlock during failing recovery
Commit 0b9902c1fcc5 ("s390/qeth: fix deadlock during recovery") removed
taking discipline_mutex inside qeth_do_reset(), fixing potential
deadlocks. An error path was missed though, that still takes
discipline_mutex and thus has the original deadlock potential.
Intermittent deadlocks were seen when a qeth channel path is configured
offline, causing a race between qeth_do_reset and ccwgroup_remove.
Call qeth_set_offline() directly in the qeth_do_reset() error case and
then a new variant of ccwgroup_set_offline(), without taking
discipline_mutex. |
In the Linux kernel, the following vulnerability has been resolved:
scsi: core: sysfs: Fix hang when device state is set via sysfs
This fixes a regression added with:
commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
offlinining device")
The problem is that after iSCSI recovery, iscsid will call into the kernel
to set the dev's state to running, and with that patch we now call
scsi_rescan_device() with the state_mutex held. If the SCSI error handler
thread is just starting to test the device in scsi_send_eh_cmnd() then it's
going to try to grab the state_mutex.
We are then stuck, because when scsi_rescan_device() tries to send its I/O
scsi_queue_rq() calls -> scsi_host_queue_ready() -> scsi_host_in_recovery()
which will return true (the host state is still in recovery) and I/O will
just be requeued. scsi_send_eh_cmnd() will then never be able to grab the
state_mutex to finish error handling.
To prevent the deadlock move the rescan-related code to after we drop the
state_mutex.
This also adds a check for if we are already in the running state. This
prevents extra scans and helps the iscsid case where if the transport class
has already onlined the device during its recovery process then we don't
need userspace to do it again plus possibly block that daemon. |