CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
In xfig diagramming tool, a stack-overflow while running fig2dev allows memory corruption via local input manipulation at the bezier_spline function. |
A flaw was found in linux-pam. The module pam_namespace may use access user-controlled paths without proper protection, allowing local users to elevate their privileges to root via multiple symlink attacks and race conditions. |
The issue was addressed with improved bounds checks. This issue is fixed in tvOS 15.6, watchOS 8.7, iOS 15.6 and iPadOS 15.6, macOS Monterey 12.5, Safari 15.6. Processing web content may lead to arbitrary code execution. |
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: E-Switch, pair only capable devices
OFFLOADS paring using devcom is possible only on devices
that support LAG. Filter based on lag capabilities.
This fixes an issue where mlx5_get_next_phys_dev() was
called without holding the interface lock.
This issue was found when commit
bc4c2f2e0179 ("net/mlx5: Lag, filter non compatible devices")
added an assert that verifies the interface lock is held.
WARNING: CPU: 9 PID: 1706 at drivers/net/ethernet/mellanox/mlx5/core/dev.c:642 mlx5_get_next_phys_dev+0xd2/0x100 [mlx5_core]
Modules linked in: mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm ib_uverbs ib_core overlay fuse [last unloaded: mlx5_core]
CPU: 9 PID: 1706 Comm: devlink Not tainted 5.18.0-rc7+ #11
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:mlx5_get_next_phys_dev+0xd2/0x100 [mlx5_core]
Code: 02 00 75 48 48 8b 85 80 04 00 00 5d c3 31 c0 5d c3 be ff ff ff ff 48 c7 c7 08 41 5b a0 e8 36 87 28 e3 85 c0 0f 85 6f ff ff ff <0f> 0b e9 68 ff ff ff 48 c7 c7 0c 91 cc 84 e8 cb 36 6f e1 e9 4d ff
RSP: 0018:ffff88811bf47458 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88811b398000 RCX: 0000000000000001
RDX: 0000000080000000 RSI: ffffffffa05b4108 RDI: ffff88812daaaa78
RBP: ffff88812d050380 R08: 0000000000000001 R09: ffff88811d6b3437
R10: 0000000000000001 R11: 00000000fddd3581 R12: ffff88815238c000
R13: ffff88812d050380 R14: ffff8881018aa7e0 R15: ffff88811d6b3428
FS: 00007fc82e18ae80(0000) GS:ffff88842e080000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f9630d1b421 CR3: 0000000149802004 CR4: 0000000000370ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
mlx5_esw_offloads_devcom_event+0x99/0x3b0 [mlx5_core]
mlx5_devcom_send_event+0x167/0x1d0 [mlx5_core]
esw_offloads_enable+0x1153/0x1500 [mlx5_core]
? mlx5_esw_offloads_controller_valid+0x170/0x170 [mlx5_core]
? wait_for_completion_io_timeout+0x20/0x20
? mlx5_rescan_drivers_locked+0x318/0x810 [mlx5_core]
mlx5_eswitch_enable_locked+0x586/0xc50 [mlx5_core]
? mlx5_eswitch_disable_pf_vf_vports+0x1d0/0x1d0 [mlx5_core]
? mlx5_esw_try_lock+0x1b/0xb0 [mlx5_core]
? mlx5_eswitch_enable+0x270/0x270 [mlx5_core]
? __debugfs_create_file+0x260/0x3e0
mlx5_devlink_eswitch_mode_set+0x27e/0x870 [mlx5_core]
? mutex_lock_io_nested+0x12c0/0x12c0
? esw_offloads_disable+0x250/0x250 [mlx5_core]
? devlink_nl_cmd_trap_get_dumpit+0x470/0x470
? rcu_read_lock_sched_held+0x3f/0x70
devlink_nl_cmd_eswitch_set_doit+0x217/0x620 |
In the Linux kernel, the following vulnerability has been resolved:
ip_gre: test csum_start instead of transport header
GRE with TUNNEL_CSUM will apply local checksum offload on
CHECKSUM_PARTIAL packets.
ipgre_xmit must validate csum_start after an optional skb_pull,
else lco_csum may trigger an overflow. The original check was
if (csum && skb_checksum_start(skb) < skb->data)
return -EINVAL;
This had false positives when skb_checksum_start is undefined:
when ip_summed is not CHECKSUM_PARTIAL. A discussed refinement
was straightforward
if (csum && skb->ip_summed == CHECKSUM_PARTIAL &&
skb_checksum_start(skb) < skb->data)
return -EINVAL;
But was eventually revised more thoroughly:
- restrict the check to the only branch where needed, in an
uncommon GRE path that uses header_ops and calls skb_pull.
- test skb_transport_header, which is set along with csum_start
in skb_partial_csum_set in the normal header_ops datapath.
Turns out skbs can arrive in this branch without the transport
header set, e.g., through BPF redirection.
Revise the check back to check csum_start directly, and only if
CHECKSUM_PARTIAL. Do leave the check in the updated location.
Check field regardless of whether TUNNEL_CSUM is configured. |
In the Linux kernel, the following vulnerability has been resolved:
ext4: avoid cycles in directory h-tree
A maliciously corrupted filesystem can contain cycles in the h-tree
stored inside a directory. That can easily lead to the kernel corrupting
tree nodes that were already verified under its hands while doing a node
split and consequently accessing unallocated memory. Fix the problem by
verifying traversed block numbers are unique. |
In the Linux kernel, the following vulnerability has been resolved:
ext4: filter out EXT4_FC_REPLAY from on-disk superblock field s_state
The EXT4_FC_REPLAY bit in sbi->s_mount_state is used to indicate that
we are in the middle of replay the fast commit journal. This was
actually a mistake, since the sbi->s_mount_info is initialized from
es->s_state. Arguably s_mount_state is misleadingly named, but the
name is historical --- s_mount_state and s_state dates back to ext2.
What should have been used is the ext4_{set,clear,test}_mount_flag()
inline functions, which sets EXT4_MF_* bits in sbi->s_mount_flags.
The problem with using EXT4_FC_REPLAY is that a maliciously corrupted
superblock could result in EXT4_FC_REPLAY getting set in
s_mount_state. This bypasses some sanity checks, and this can trigger
a BUG() in ext4_es_cache_extent(). As a easy-to-backport-fix, filter
out the EXT4_FC_REPLAY bit for now. We should eventually transition
away from EXT4_FC_REPLAY to something like EXT4_MF_REPLAY. |
In the Linux kernel, the following vulnerability has been resolved:
SUNRPC: Trap RDMA segment overflows
Prevent svc_rdma_build_writes() from walking off the end of a Write
chunk's segment array. Caught with KASAN.
The test that this fix replaces is invalid, and might have been left
over from an earlier prototype of the PCL work. |
In the Linux kernel, the following vulnerability has been resolved:
tcp: tcp_rtx_synack() can be called from process context
Laurent reported the enclosed report [1]
This bug triggers with following coditions:
0) Kernel built with CONFIG_DEBUG_PREEMPT=y
1) A new passive FastOpen TCP socket is created.
This FO socket waits for an ACK coming from client to be a complete
ESTABLISHED one.
2) A socket operation on this socket goes through lock_sock()
release_sock() dance.
3) While the socket is owned by the user in step 2),
a retransmit of the SYN is received and stored in socket backlog.
4) At release_sock() time, the socket backlog is processed while
in process context.
5) A SYNACK packet is cooked in response of the SYN retransmit.
6) -> tcp_rtx_synack() is called in process context.
Before blamed commit, tcp_rtx_synack() was always called from BH handler,
from a timer handler.
Fix this by using TCP_INC_STATS() & NET_INC_STATS()
which do not assume caller is in non preemptible context.
[1]
BUG: using __this_cpu_add() in preemptible [00000000] code: epollpep/2180
caller is tcp_rtx_synack.part.0+0x36/0xc0
CPU: 10 PID: 2180 Comm: epollpep Tainted: G OE 5.16.0-0.bpo.4-amd64 #1 Debian 5.16.12-1~bpo11+1
Hardware name: Supermicro SYS-5039MC-H8TRF/X11SCD-F, BIOS 1.7 11/23/2021
Call Trace:
<TASK>
dump_stack_lvl+0x48/0x5e
check_preemption_disabled+0xde/0xe0
tcp_rtx_synack.part.0+0x36/0xc0
tcp_rtx_synack+0x8d/0xa0
? kmem_cache_alloc+0x2e0/0x3e0
? apparmor_file_alloc_security+0x3b/0x1f0
inet_rtx_syn_ack+0x16/0x30
tcp_check_req+0x367/0x610
tcp_rcv_state_process+0x91/0xf60
? get_nohz_timer_target+0x18/0x1a0
? lock_timer_base+0x61/0x80
? preempt_count_add+0x68/0xa0
tcp_v4_do_rcv+0xbd/0x270
__release_sock+0x6d/0xb0
release_sock+0x2b/0x90
sock_setsockopt+0x138/0x1140
? __sys_getsockname+0x7e/0xc0
? aa_sk_perm+0x3e/0x1a0
__sys_setsockopt+0x198/0x1e0
__x64_sys_setsockopt+0x21/0x30
do_syscall_64+0x38/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae |
In the Linux kernel, the following vulnerability has been resolved:
sfc: fix considering that all channels have TX queues
Normally, all channels have RX and TX queues, but this is not true if
modparam efx_separate_tx_channels=1 is used. In that cases, some
channels only have RX queues and others only TX queues (or more
preciselly, they have them allocated, but not initialized).
Fix efx_channel_has_tx_queues to return the correct value for this case
too.
Messages shown at probe time before the fix:
sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0
------------[ cut here ]------------
netdevice: ens6f0np0: failed to initialise TXQ -1
WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc]
[...] stripped
RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc]
[...] stripped
Call Trace:
efx_init_tx_queue+0xaa/0xf0 [sfc]
efx_start_channels+0x49/0x120 [sfc]
efx_start_all+0x1f8/0x430 [sfc]
efx_net_open+0x5a/0xe0 [sfc]
__dev_open+0xd0/0x190
__dev_change_flags+0x1b3/0x220
dev_change_flags+0x21/0x60
[...] stripped
Messages shown at remove time before the fix:
sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues
sfc 0000:03:00.0 ens6f0np0: failed to flush queues |
In the Linux kernel, the following vulnerability has been resolved:
blk-iolatency: Fix inflight count imbalances and IO hangs on offline
iolatency needs to track the number of inflight IOs per cgroup. As this
tracking can be expensive, it is disabled when no cgroup has iolatency
configured for the device. To ensure that the inflight counters stay
balanced, iolatency_set_limit() freezes the request_queue while manipulating
the enabled counter, which ensures that no IO is in flight and thus all
counters are zero.
Unfortunately, iolatency_set_limit() isn't the only place where the enabled
counter is manipulated. iolatency_pd_offline() can also dec the counter and
trigger disabling. As this disabling happens without freezing the q, this
can easily happen while some IOs are in flight and thus leak the counts.
This can be easily demonstrated by turning on iolatency on an one empty
cgroup while IOs are in flight in other cgroups and then removing the
cgroup. Note that iolatency shouldn't have been enabled elsewhere in the
system to ensure that removing the cgroup disables iolatency for the whole
device.
The following keeps flipping on and off iolatency on sda:
echo +io > /sys/fs/cgroup/cgroup.subtree_control
while true; do
mkdir -p /sys/fs/cgroup/test
echo '8:0 target=100000' > /sys/fs/cgroup/test/io.latency
sleep 1
rmdir /sys/fs/cgroup/test
sleep 1
done
and there's concurrent fio generating direct rand reads:
fio --name test --filename=/dev/sda --direct=1 --rw=randread \
--runtime=600 --time_based --iodepth=256 --numjobs=4 --bs=4k
while monitoring with the following drgn script:
while True:
for css in css_for_each_descendant_pre(prog['blkcg_root'].css.address_of_()):
for pos in hlist_for_each(container_of(css, 'struct blkcg', 'css').blkg_list):
blkg = container_of(pos, 'struct blkcg_gq', 'blkcg_node')
pd = blkg.pd[prog['blkcg_policy_iolatency'].plid]
if pd.value_() == 0:
continue
iolat = container_of(pd, 'struct iolatency_grp', 'pd')
inflight = iolat.rq_wait.inflight.counter.value_()
if inflight:
print(f'inflight={inflight} {disk_name(blkg.q.disk).decode("utf-8")} '
f'{cgroup_path(css.cgroup).decode("utf-8")}')
time.sleep(1)
The monitoring output looks like the following:
inflight=1 sda /user.slice
inflight=1 sda /user.slice
...
inflight=14 sda /user.slice
inflight=13 sda /user.slice
inflight=17 sda /user.slice
inflight=15 sda /user.slice
inflight=18 sda /user.slice
inflight=17 sda /user.slice
inflight=20 sda /user.slice
inflight=19 sda /user.slice <- fio stopped, inflight stuck at 19
inflight=19 sda /user.slice
inflight=19 sda /user.slice
If a cgroup with stuck inflight ends up getting throttled, the throttled IOs
will never get issued as there's no completion event to wake it up leading
to an indefinite hang.
This patch fixes the bug by unifying enable handling into a work item which
is automatically kicked off from iolatency_set_min_lat_nsec() which is
called from both iolatency_set_limit() and iolatency_pd_offline() paths.
Punting to a work item is necessary as iolatency_pd_offline() is called
under spinlocks while freezing a request_queue requires a sleepable context.
This also simplifies the code reducing LOC sans the comments and avoids the
unnecessary freezes which were happening whenever a cgroup's latency target
is newly set or cleared. |
In the Linux kernel, the following vulnerability has been resolved:
usb: dwc3: gadget: Replace list_for_each_entry_safe() if using giveback
The list_for_each_entry_safe() macro saves the current item (n) and
the item after (n+1), so that n can be safely removed without
corrupting the list. However, when traversing the list and removing
items using gadget giveback, the DWC3 lock is briefly released,
allowing other routines to execute. There is a situation where, while
items are being removed from the cancelled_list using
dwc3_gadget_ep_cleanup_cancelled_requests(), the pullup disable
routine is running in parallel (due to UDC unbind). As the cleanup
routine removes n, and the pullup disable removes n+1, once the
cleanup retakes the DWC3 lock, it references a request who was already
removed/handled. With list debug enabled, this leads to a panic.
Ensure all instances of the macro are replaced where gadget giveback
is used.
Example call stack:
Thread#1:
__dwc3_gadget_ep_set_halt() - CLEAR HALT
-> dwc3_gadget_ep_cleanup_cancelled_requests()
->list_for_each_entry_safe()
->dwc3_gadget_giveback(n)
->dwc3_gadget_del_and_unmap_request()- n deleted[cancelled_list]
->spin_unlock
->Thread#2 executes
...
->dwc3_gadget_giveback(n+1)
->Already removed!
Thread#2:
dwc3_gadget_pullup()
->waiting for dwc3 spin_lock
...
->Thread#1 released lock
->dwc3_stop_active_transfers()
->dwc3_remove_requests()
->fetches n+1 item from cancelled_list (n removed by Thread#1)
->dwc3_gadget_giveback()
->dwc3_gadget_del_and_unmap_request()- n+1 deleted[cancelled_list]
->spin_unlock |
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Inhibit aborts if external loopback plug is inserted
After running a short external loopback test, when the external loopback is
removed and a normal cable inserted that is directly connected to a target
device, the system oops in the llpfc_set_rrq_active() routine.
When the loopback was inserted an FLOGI was transmit. As we're looped back,
we receive the FLOGI request. The FLOGI is ABTS'd as we recognize the same
wppn thus understand it's a loopback. However, as the ABTS sends address
information the port is not set to (fffffe), the ABTS is dropped on the
wire. A short 1 frame loopback test is run and completes before the ABTS
times out. The looback is unplugged and the new cable plugged in, and the
an FLOGI to the new device occurs and completes. Due to a mixup in ref
counting the completion of the new FLOGI releases the fabric ndlp. Then the
original ABTS completes and references the released ndlp generating the
oops.
Correct by no-op'ing the ABTS when in loopback mode (it will be dropped
anyway). Added a flag to track the mode to recognize when it should be
no-op'd. |
In the Linux kernel, the following vulnerability has been resolved:
fbdev: defio: fix the pagelist corruption
Easily hit the below list corruption:
==
list_add corruption. prev->next should be next (ffffffffc0ceb090), but
was ffffec604507edc8. (prev=ffffec604507edc8).
WARNING: CPU: 65 PID: 3959 at lib/list_debug.c:26
__list_add_valid+0x53/0x80
CPU: 65 PID: 3959 Comm: fbdev Tainted: G U
RIP: 0010:__list_add_valid+0x53/0x80
Call Trace:
<TASK>
fb_deferred_io_mkwrite+0xea/0x150
do_page_mkwrite+0x57/0xc0
do_wp_page+0x278/0x2f0
__handle_mm_fault+0xdc2/0x1590
handle_mm_fault+0xdd/0x2c0
do_user_addr_fault+0x1d3/0x650
exc_page_fault+0x77/0x180
? asm_exc_page_fault+0x8/0x30
asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x7fd98fc8fad1
==
Figure out the race happens when one process is adding &page->lru into
the pagelist tail in fb_deferred_io_mkwrite(), another process is
re-initializing the same &page->lru in fb_deferred_io_fault(), which is
not protected by the lock.
This fix is to init all the page lists one time during initialization,
it not only fixes the list corruption, but also avoids INIT_LIST_HEAD()
redundantly.
V2: change "int i" to "unsigned int i" (Geert Uytterhoeven) |
In the Linux kernel, the following vulnerability has been resolved:
cpufreq: governor: Use kobject release() method to free dbs_data
The struct dbs_data embeds a struct gov_attr_set and
the struct gov_attr_set embeds a kobject. Since every kobject must have
a release() method and we can't use kfree() to free it directly,
so introduce cpufreq_dbs_data_release() to release the dbs_data via
the kobject::release() method. This fixes the calltrace like below:
ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x34
WARNING: CPU: 12 PID: 810 at lib/debugobjects.c:505 debug_print_object+0xb8/0x100
Modules linked in:
CPU: 12 PID: 810 Comm: sh Not tainted 5.16.0-next-20220120-yocto-standard+ #536
Hardware name: Marvell OcteonTX CN96XX board (DT)
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : debug_print_object+0xb8/0x100
lr : debug_print_object+0xb8/0x100
sp : ffff80001dfcf9a0
x29: ffff80001dfcf9a0 x28: 0000000000000001 x27: ffff0001464f0000
x26: 0000000000000000 x25: ffff8000090e3f00 x24: ffff80000af60210
x23: ffff8000094dfb78 x22: ffff8000090e3f00 x21: ffff0001080b7118
x20: ffff80000aeb2430 x19: ffff800009e8f5e0 x18: 0000000000000000
x17: 0000000000000002 x16: 00004d62e58be040 x15: 013590470523aff8
x14: ffff8000090e1828 x13: 0000000001359047 x12: 00000000f5257d14
x11: 0000000000040591 x10: 0000000066c1ffea x9 : ffff8000080d15e0
x8 : ffff80000a1765a8 x7 : 0000000000000000 x6 : 0000000000000001
x5 : ffff800009e8c000 x4 : ffff800009e8c760 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0001474ed040
Call trace:
debug_print_object+0xb8/0x100
__debug_check_no_obj_freed+0x1d0/0x25c
debug_check_no_obj_freed+0x24/0xa0
kfree+0x11c/0x440
cpufreq_dbs_governor_exit+0xa8/0xac
cpufreq_exit_governor+0x44/0x90
cpufreq_set_policy+0x29c/0x570
store_scaling_governor+0x110/0x154
store+0xb0/0xe0
sysfs_kf_write+0x58/0x84
kernfs_fop_write_iter+0x12c/0x1c0
new_sync_write+0xf0/0x18c
vfs_write+0x1cc/0x220
ksys_write+0x74/0x100
__arm64_sys_write+0x28/0x3c
invoke_syscall.constprop.0+0x58/0xf0
do_el0_svc+0x70/0x170
el0_svc+0x54/0x190
el0t_64_sync_handler+0xa4/0x130
el0t_64_sync+0x1a0/0x1a4
irq event stamp: 189006
hardirqs last enabled at (189005): [<ffff8000080849d0>] finish_task_switch.isra.0+0xe0/0x2c0
hardirqs last disabled at (189006): [<ffff8000090667a4>] el1_dbg+0x24/0xa0
softirqs last enabled at (188966): [<ffff8000080106d0>] __do_softirq+0x4b0/0x6a0
softirqs last disabled at (188957): [<ffff80000804a618>] __irq_exit_rcu+0x108/0x1a4
[ rjw: Because can be freed by the gov_attr_set_put() in
cpufreq_dbs_governor_exit() now, it is also necessary to put the
invocation of the governor ->exit() callback into the new
cpufreq_dbs_data_release() function. ] |
In the Linux kernel, the following vulnerability has been resolved:
ASoC: cs35l41: Fix an out-of-bounds access in otp_packed_element_t
The CS35L41_NUM_OTP_ELEM is 100, but only 99 entries are defined in
the array otp_map_1/2[CS35L41_NUM_OTP_ELEM], this will trigger UBSAN
to report a shift-out-of-bounds warning in the cs35l41_otp_unpack()
since the last entry in the array will result in GENMASK(-1, 0).
UBSAN reports this problem:
UBSAN: shift-out-of-bounds in /home/hwang4/build/jammy/jammy/sound/soc/codecs/cs35l41-lib.c:836:8
shift exponent 64 is too large for 64-bit type 'long unsigned int'
CPU: 10 PID: 595 Comm: systemd-udevd Not tainted 5.15.0-23-generic #23
Hardware name: LENOVO \x02MFG_IN_GO/\x02MFG_IN_GO, BIOS N3GET19W (1.00 ) 03/11/2022
Call Trace:
<TASK>
show_stack+0x52/0x58
dump_stack_lvl+0x4a/0x5f
dump_stack+0x10/0x12
ubsan_epilogue+0x9/0x45
__ubsan_handle_shift_out_of_bounds.cold+0x61/0xef
? regmap_unlock_mutex+0xe/0x10
cs35l41_otp_unpack.cold+0x1c6/0x2b2 [snd_soc_cs35l41_lib]
cs35l41_hda_probe+0x24f/0x33a [snd_hda_scodec_cs35l41]
cs35l41_hda_i2c_probe+0x65/0x90 [snd_hda_scodec_cs35l41_i2c]
? cs35l41_hda_i2c_remove+0x20/0x20 [snd_hda_scodec_cs35l41_i2c]
i2c_device_probe+0x252/0x2b0 |
In the Linux kernel, the following vulnerability has been resolved:
ath10k: skip ath10k_halt during suspend for driver state RESTARTING
Double free crash is observed when FW recovery(caused by wmi
timeout/crash) is followed by immediate suspend event. The FW recovery
is triggered by ath10k_core_restart() which calls driver clean up via
ath10k_halt(). When the suspend event occurs between the FW recovery,
the restart worker thread is put into frozen state until suspend completes.
The suspend event triggers ath10k_stop() which again triggers ath10k_halt()
The double invocation of ath10k_halt() causes ath10k_htt_rx_free() to be
called twice(Note: ath10k_htt_rx_alloc was not called by restart worker
thread because of its frozen state), causing the crash.
To fix this, during the suspend flow, skip call to ath10k_halt() in
ath10k_stop() when the current driver state is ATH10K_STATE_RESTARTING.
Also, for driver state ATH10K_STATE_RESTARTING, call
ath10k_wait_for_suspend() in ath10k_stop(). This is because call to
ath10k_wait_for_suspend() is skipped later in
[ath10k_halt() > ath10k_core_stop()] for the driver state
ATH10K_STATE_RESTARTING.
The frozen restart worker thread will be cancelled during resume when the
device comes out of suspend.
Below is the crash stack for reference:
[ 428.469167] ------------[ cut here ]------------
[ 428.469180] kernel BUG at mm/slub.c:4150!
[ 428.469193] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 428.469219] Workqueue: events_unbound async_run_entry_fn
[ 428.469230] RIP: 0010:kfree+0x319/0x31b
[ 428.469241] RSP: 0018:ffffa1fac015fc30 EFLAGS: 00010246
[ 428.469247] RAX: ffffedb10419d108 RBX: ffff8c05262b0000
[ 428.469252] RDX: ffff8c04a8c07000 RSI: 0000000000000000
[ 428.469256] RBP: ffffa1fac015fc78 R08: 0000000000000000
[ 428.469276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 428.469285] Call Trace:
[ 428.469295] ? dma_free_attrs+0x5f/0x7d
[ 428.469320] ath10k_core_stop+0x5b/0x6f
[ 428.469336] ath10k_halt+0x126/0x177
[ 428.469352] ath10k_stop+0x41/0x7e
[ 428.469387] drv_stop+0x88/0x10e
[ 428.469410] __ieee80211_suspend+0x297/0x411
[ 428.469441] rdev_suspend+0x6e/0xd0
[ 428.469462] wiphy_suspend+0xb1/0x105
[ 428.469483] ? name_show+0x2d/0x2d
[ 428.469490] dpm_run_callback+0x8c/0x126
[ 428.469511] ? name_show+0x2d/0x2d
[ 428.469517] __device_suspend+0x2e7/0x41b
[ 428.469523] async_suspend+0x1f/0x93
[ 428.469529] async_run_entry_fn+0x3d/0xd1
[ 428.469535] process_one_work+0x1b1/0x329
[ 428.469541] worker_thread+0x213/0x372
[ 428.469547] kthread+0x150/0x15f
[ 428.469552] ? pr_cont_work+0x58/0x58
[ 428.469558] ? kthread_blkcg+0x31/0x31
Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1 |
In the Linux kernel, the following vulnerability has been resolved:
arm64: compat: Do not treat syscall number as ESR_ELx for a bad syscall
If a compat process tries to execute an unknown system call above the
__ARM_NR_COMPAT_END number, the kernel sends a SIGILL signal to the
offending process. Information about the error is printed to dmesg in
compat_arm_syscall() -> arm64_notify_die() -> arm64_force_sig_fault() ->
arm64_show_signal().
arm64_show_signal() interprets a non-zero value for
current->thread.fault_code as an exception syndrome and displays the
message associated with the ESR_ELx.EC field (bits 31:26).
current->thread.fault_code is set in compat_arm_syscall() ->
arm64_notify_die() with the bad syscall number instead of a valid ESR_ELx
value. This means that the ESR_ELx.EC field has the value that the user set
for the syscall number and the kernel can end up printing bogus exception
messages*. For example, for the syscall number 0x68000000, which evaluates
to ESR_ELx.EC value of 0x1A (ESR_ELx_EC_FPAC) the kernel prints this error:
[ 18.349161] syscall[300]: unhandled exception: ERET/ERETAA/ERETAB, ESR 0x68000000, Oops - bad compat syscall(2) in syscall[10000+50000]
[ 18.350639] CPU: 2 PID: 300 Comm: syscall Not tainted 5.18.0-rc1 #79
[ 18.351249] Hardware name: Pine64 RockPro64 v2.0 (DT)
[..]
which is misleading, as the bad compat syscall has nothing to do with
pointer authentication.
Stop arm64_show_signal() from printing exception syndrome information by
having compat_arm_syscall() set the ESR_ELx value to 0, as it has no
meaning for an invalid system call number. The example above now becomes:
[ 19.935275] syscall[301]: unhandled exception: Oops - bad compat syscall(2) in syscall[10000+50000]
[ 19.936124] CPU: 1 PID: 301 Comm: syscall Not tainted 5.18.0-rc1-00005-g7e08006d4102 #80
[ 19.936894] Hardware name: Pine64 RockPro64 v2.0 (DT)
[..]
which although shows less information because the syscall number,
wrongfully advertised as the ESR value, is missing, it is better than
showing plainly wrong information. The syscall number can be easily
obtained with strace.
*A 32-bit value above or equal to 0x8000_0000 is interpreted as a negative
integer in compat_arm_syscal() and the condition scno < __ARM_NR_COMPAT_END
evaluates to true; the syscall will exit to userspace in this case with the
ENOSYS error code instead of arm64_notify_die() being called. |
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Fix call trace observed during I/O with CMF enabled
The following was seen with CMF enabled:
BUG: using smp_processor_id() in preemptible
code: systemd-udevd/31711
kernel: caller is lpfc_update_cmf_cmd+0x214/0x420 [lpfc]
kernel: CPU: 12 PID: 31711 Comm: systemd-udevd
kernel: Call Trace:
kernel: <TASK>
kernel: dump_stack_lvl+0x44/0x57
kernel: check_preemption_disabled+0xbf/0xe0
kernel: lpfc_update_cmf_cmd+0x214/0x420 [lpfc]
kernel: lpfc_nvme_fcp_io_submit+0x23b4/0x4df0 [lpfc]
this_cpu_ptr() calls smp_processor_id() in a preemptible context.
Fix by using per_cpu_ptr() with raw_smp_processor_id() instead. |
In the Linux kernel, the following vulnerability has been resolved:
rtw89: ser: fix CAM leaks occurring in L2 reset
The CAM, meaning address CAM and bssid CAM here, will get leaks during
SER (system error recover) L2 reset process and ieee80211_restart_hw()
which is called by L2 reset process eventually.
The normal flow would be like
-> add interface (acquire 1)
-> enter ips (release 1)
-> leave ips (acquire 1)
-> connection (occupy 1) <(A) 1 leak after L2 reset if non-sec connection>
The ieee80211_restart_hw() flow (under connection)
-> ieee80211 reconfig
-> add interface (acquire 1)
-> leave ips (acquire 1)
-> connection (occupy (A) + 2) <(B) 1 more leak>
Originally, CAM is released before HW restart only if connection is under
security. Now, release CAM whatever connection it is to fix leak in (A).
OTOH, check if CAM is already valid to avoid acquiring multiple times to
fix (B).
Besides, if AP mode, release address CAM of all stations before HW restart. |