CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
In the Linux kernel, the following vulnerability has been resolved:
riscv: save the SR_SUM status over switches
When threads/tasks are switched we need to ensure the old execution's
SR_SUM state is saved and the new thread has the old SR_SUM state
restored.
The issue was seen under heavy load especially with the syz-stress tool
running, with crashes as follows in schedule_tail:
Unable to handle kernel access to user memory without uaccess routines
at virtual address 000000002749f0d0
Oops [#1]
Modules linked in:
CPU: 1 PID: 4875 Comm: syz-executor.0 Not tainted
5.12.0-rc2-syzkaller-00467-g0d7588ab9ef9 #0
Hardware name: riscv-virtio,qemu (DT)
epc : schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
ra : task_pid_vnr include/linux/sched.h:1421 [inline]
ra : schedule_tail+0x70/0xb2 kernel/sched/core.c:4264
epc : ffffffe00008c8b0 ra : ffffffe00008c8ae sp : ffffffe025d17ec0
gp : ffffffe005d25378 tp : ffffffe00f0d0000 t0 : 0000000000000000
t1 : 0000000000000001 t2 : 00000000000f4240 s0 : ffffffe025d17ee0
s1 : 000000002749f0d0 a0 : 000000000000002a a1 : 0000000000000003
a2 : 1ffffffc0cfac500 a3 : ffffffe0000c80cc a4 : 5ae9db91c19bbe00
a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffe000082eba
s2 : 0000000000040000 s3 : ffffffe00eef96c0 s4 : ffffffe022c77fe0
s5 : 0000000000004000 s6 : ffffffe067d74e00 s7 : ffffffe067d74850
s8 : ffffffe067d73e18 s9 : ffffffe067d74e00 s10: ffffffe00eef96e8
s11: 000000ae6cdf8368 t3 : 5ae9db91c19bbe00 t4 : ffffffc4043cafb2
t5 : ffffffc4043cafba t6 : 0000000000040000
status: 0000000000000120 badaddr: 000000002749f0d0 cause:
000000000000000f
Call Trace:
[<ffffffe00008c8b0>] schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
[<ffffffe000005570>] ret_from_exception+0x0/0x14
Dumping ftrace buffer:
(ftrace buffer empty)
---[ end trace b5f8f9231dc87dda ]---
The issue comes from the put_user() in schedule_tail
(kernel/sched/core.c) doing the following:
asmlinkage __visible void schedule_tail(struct task_struct *prev)
{
...
if (current->set_child_tid)
put_user(task_pid_vnr(current), current->set_child_tid);
...
}
the put_user() macro causes the code sequence to come out as follows:
1: __enable_user_access()
2: reg = task_pid_vnr(current);
3: *current->set_child_tid = reg;
4: __disable_user_access()
The problem is that we may have a sleeping function as argument which
could clear SR_SUM causing the panic above. This was fixed by
evaluating the argument of the put_user() macro outside the user-enabled
section in commit 285a76bb2cf5 ("riscv: evaluate put_user() arg before
enabling user access")"
In order for riscv to take advantage of unsafe_get/put_XXX() macros and
to avoid the same issue we had with put_user() and sleeping functions we
must ensure code flow can go through switch_to() from within a region of
code with SR_SUM enabled and come back with SR_SUM still enabled. This
patch addresses the problem allowing future work to enable full use of
unsafe_get/put_XXX() macros without needing to take a CSR bit flip cost
on every access. Make switch_to() save and restore SR_SUM. |
In the Linux kernel, the following vulnerability has been resolved:
btrfs: handle csum tree error with rescue=ibadroots correctly
[BUG]
There is syzbot based reproducer that can crash the kernel, with the
following call trace: (With some debug output added)
DEBUG: rescue=ibadroots parsed
BTRFS: device fsid 14d642db-7b15-43e4-81e6-4b8fac6a25f8 devid 1 transid 8 /dev/loop0 (7:0) scanned by repro (1010)
BTRFS info (device loop0): first mount of filesystem 14d642db-7b15-43e4-81e6-4b8fac6a25f8
BTRFS info (device loop0): using blake2b (blake2b-256-generic) checksum algorithm
BTRFS info (device loop0): using free-space-tree
BTRFS warning (device loop0): checksum verify failed on logical 5312512 mirror 1 wanted 0xb043382657aede36608fd3386d6b001692ff406164733d94e2d9a180412c6003 found 0x810ceb2bacb7f0f9eb2bf3b2b15c02af867cb35ad450898169f3b1f0bd818651 level 0
DEBUG: read tree root path failed for tree csum, ret=-5
BTRFS warning (device loop0): checksum verify failed on logical 5328896 mirror 1 wanted 0x51be4e8b303da58e6340226815b70e3a93592dac3f30dd510c7517454de8567a found 0x51be4e8b303da58e634022a315b70e3a93592dac3f30dd510c7517454de8567a level 0
BTRFS warning (device loop0): checksum verify failed on logical 5292032 mirror 1 wanted 0x1924ccd683be9efc2fa98582ef58760e3848e9043db8649ee382681e220cdee4 found 0x0cb6184f6e8799d9f8cb335dccd1d1832da1071d12290dab3b85b587ecacca6e level 0
process 'repro' launched './file2' with NULL argv: empty string added
DEBUG: no csum root, idatacsums=0 ibadroots=134217728
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000041: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000208-0x000000000000020f]
CPU: 5 UID: 0 PID: 1010 Comm: repro Tainted: G OE 6.15.0-custom+ #249 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
RIP: 0010:btrfs_lookup_csum+0x93/0x3d0 [btrfs]
Call Trace:
<TASK>
btrfs_lookup_bio_sums+0x47a/0xdf0 [btrfs]
btrfs_submit_bbio+0x43e/0x1a80 [btrfs]
submit_one_bio+0xde/0x160 [btrfs]
btrfs_readahead+0x498/0x6a0 [btrfs]
read_pages+0x1c3/0xb20
page_cache_ra_order+0x4b5/0xc20
filemap_get_pages+0x2d3/0x19e0
filemap_read+0x314/0xde0
__kernel_read+0x35b/0x900
bprm_execve+0x62e/0x1140
do_execveat_common.isra.0+0x3fc/0x520
__x64_sys_execveat+0xdc/0x130
do_syscall_64+0x54/0x1d0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
---[ end trace 0000000000000000 ]---
[CAUSE]
Firstly the fs has a corrupted csum tree root, thus to mount the fs we
have to go "ro,rescue=ibadroots" mount option.
Normally with that mount option, a bad csum tree root should set
BTRFS_FS_STATE_NO_DATA_CSUMS flag, so that any future data read will
ignore csum search.
But in this particular case, we have the following call trace that
caused NULL csum root, but not setting BTRFS_FS_STATE_NO_DATA_CSUMS:
load_global_roots_objectid():
ret = btrfs_search_slot();
/* Succeeded */
btrfs_item_key_to_cpu()
found = true;
/* We found the root item for csum tree. */
root = read_tree_root_path();
if (IS_ERR(root)) {
if (!btrfs_test_opt(fs_info, IGNOREBADROOTS))
/*
* Since we have rescue=ibadroots mount option,
* @ret is still 0.
*/
break;
if (!found || ret) {
/* @found is true, @ret is 0, error handling for csum
* tree is skipped.
*/
}
This means we completely skipped to set BTRFS_FS_STATE_NO_DATA_CSUMS if
the csum tree is corrupted, which results unexpected later csum lookup.
[FIX]
If read_tree_root_path() failed, always populate @ret to the error
number.
As at the end of the function, we need @ret to determine if we need to
do the extra error handling for csum tree. |
In the Linux kernel, the following vulnerability has been resolved:
ASoC: codecs: wcd9335: Fix missing free of regulator supplies
Driver gets and enables all regulator supplies in probe path
(wcd9335_parse_dt() and wcd9335_power_on_reset()), but does not cleanup
in final error paths and in unbind (missing remove() callback). This
leads to leaked memory and unbalanced regulator enable count during
probe errors or unbind.
Fix this by converting entire code into devm_regulator_bulk_get_enable()
which also greatly simplifies the code. |
In the Linux kernel, the following vulnerability has been resolved:
mm/damon/sysfs-schemes: free old damon_sysfs_scheme_filter->memcg_path on write
memcg_path_store() assigns a newly allocated memory buffer to
filter->memcg_path, without deallocating the previously allocated and
assigned memory buffer. As a result, users can leak kernel memory by
continuously writing a data to memcg_path DAMOS sysfs file. Fix the leak
by deallocating the previously set memory buffer. |
In the Linux kernel, the following vulnerability has been resolved:
s390/pkey: Prevent overflow in size calculation for memdup_user()
Number of apqn target list entries contained in 'nr_apqns' variable is
determined by userspace via an ioctl call so the result of the product in
calculation of size passed to memdup_user() may overflow.
In this case the actual size of the allocated area and the value
describing it won't be in sync leading to various types of unpredictable
behaviour later.
Use a proper memdup_array_user() helper which returns an error if an
overflow is detected. Note that it is different from when nr_apqns is
initially zero - that case is considered valid and should be handled in
subsequent pkey_handler implementations.
Found by Linux Verification Center (linuxtesting.org). |
In the Linux kernel, the following vulnerability has been resolved:
io_uring/rsrc: fix folio unpinning
syzbot complains about an unmapping failure:
[ 108.070381][ T14] kernel BUG at mm/gup.c:71!
[ 108.070502][ T14] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
[ 108.123672][ T14] Hardware name: QEMU KVM Virtual Machine, BIOS edk2-20250221-8.fc42 02/21/2025
[ 108.127458][ T14] Workqueue: iou_exit io_ring_exit_work
[ 108.174205][ T14] Call trace:
[ 108.175649][ T14] sanity_check_pinned_pages+0x7cc/0x7d0 (P)
[ 108.178138][ T14] unpin_user_page+0x80/0x10c
[ 108.180189][ T14] io_release_ubuf+0x84/0xf8
[ 108.182196][ T14] io_free_rsrc_node+0x250/0x57c
[ 108.184345][ T14] io_rsrc_data_free+0x148/0x298
[ 108.186493][ T14] io_sqe_buffers_unregister+0x84/0xa0
[ 108.188991][ T14] io_ring_ctx_free+0x48/0x480
[ 108.191057][ T14] io_ring_exit_work+0x764/0x7d8
[ 108.193207][ T14] process_one_work+0x7e8/0x155c
[ 108.195431][ T14] worker_thread+0x958/0xed8
[ 108.197561][ T14] kthread+0x5fc/0x75c
[ 108.199362][ T14] ret_from_fork+0x10/0x20
We can pin a tail page of a folio, but then io_uring will try to unpin
the head page of the folio. While it should be fine in terms of keeping
the page actually alive, mm folks say it's wrong and triggers a debug
warning. Use unpin_user_folio() instead of unpin_user_page*.
[axboe: adapt to current tree, massage commit message] |
In the Linux kernel, the following vulnerability has been resolved:
lib/group_cpus: fix NULL pointer dereference from group_cpus_evenly()
While testing null_blk with configfs, echo 0 > poll_queues will trigger
following panic:
BUG: kernel NULL pointer dereference, address: 0000000000000010
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 27 UID: 0 PID: 920 Comm: bash Not tainted 6.15.0-02023-gadbdb95c8696-dirty #1238 PREEMPT(undef)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
RIP: 0010:__bitmap_or+0x48/0x70
Call Trace:
<TASK>
__group_cpus_evenly+0x822/0x8c0
group_cpus_evenly+0x2d9/0x490
blk_mq_map_queues+0x1e/0x110
null_map_queues+0xc9/0x170 [null_blk]
blk_mq_update_queue_map+0xdb/0x160
blk_mq_update_nr_hw_queues+0x22b/0x560
nullb_update_nr_hw_queues+0x71/0xf0 [null_blk]
nullb_device_poll_queues_store+0xa4/0x130 [null_blk]
configfs_write_iter+0x109/0x1d0
vfs_write+0x26e/0x6f0
ksys_write+0x79/0x180
__x64_sys_write+0x1d/0x30
x64_sys_call+0x45c4/0x45f0
do_syscall_64+0xa5/0x240
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Root cause is that numgrps is set to 0, and ZERO_SIZE_PTR is returned from
kcalloc(), and later ZERO_SIZE_PTR will be deferenced.
Fix the problem by checking numgrps first in group_cpus_evenly(), and
return NULL directly if numgrps is zero.
[yukuai3@huawei.com: also fix the non-SMP version] |
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: Add sanity checks for drm_edid_raw()
When EDID is retrieved via drm_edid_raw(), it doesn't guarantee to
return proper EDID bytes the caller wants: it may be either NULL (that
leads to an Oops) or with too long bytes over the fixed size raw_edid
array (that may lead to memory corruption). The latter was reported
actually when connected with a bad adapter.
Add sanity checks for drm_edid_raw() to address the above corner
cases, and return EDID_BAD_INPUT accordingly.
(cherry picked from commit 648d3f4d209725d51900d6a3ed46b7b600140cdf) |
In the Linux kernel, the following vulnerability has been resolved:
HID: wacom: fix crash in wacom_aes_battery_handler()
Commit fd2a9b29dc9c ("HID: wacom: Remove AES power_supply after extended
inactivity") introduced wacom_aes_battery_handler() which is scheduled
as a delayed work (aes_battery_work).
In wacom_remove(), aes_battery_work is not canceled. Consequently, if
the device is removed while aes_battery_work is still pending, then hard
crashes or "Oops: general protection fault..." are experienced when
wacom_aes_battery_handler() is finally called. E.g., this happens with
built-in USB devices after resume from hibernate when aes_battery_work
was still pending at the time of hibernation.
So, take care to cancel aes_battery_work in wacom_remove(). |
In the Linux kernel, the following vulnerability has been resolved:
cxl/ras: Fix CPER handler device confusion
By inspection, cxl_cper_handle_prot_err() is making a series of fragile
assumptions that can lead to crashes:
1/ It assumes that endpoints identified in the record are a CXL-type-3
device, nothing guarantees that.
2/ It assumes that the device is bound to the cxl_pci driver, nothing
guarantees that.
3/ Minor, it holds the device lock over the switch-port tracing for no
reason as the trace is 100% generated from data in the record.
Correct those by checking that the PCIe endpoint parents a cxl_memdev
before assuming the format of the driver data, and move the lock to where
it is required. Consequently this also makes the implementation ready for
CXL accelerators that are not bound to cxl_pci. |
In the Linux kernel, the following vulnerability has been resolved:
atm: clip: prevent NULL deref in clip_push()
Blamed commit missed that vcc_destroy_socket() calls
clip_push() with a NULL skb.
If clip_devs is NULL, clip_push() then crashes when reading
skb->truesize. |
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: hci_core: Fix use-after-free in vhci_flush()
syzbot reported use-after-free in vhci_flush() without repro. [0]
From the splat, a thread close()d a vhci file descriptor while
its device was being used by iotcl() on another thread.
Once the last fd refcnt is released, vhci_release() calls
hci_unregister_dev(), hci_free_dev(), and kfree() for struct
vhci_data, which is set to hci_dev->dev->driver_data.
The problem is that there is no synchronisation after unlinking
hdev from hci_dev_list in hci_unregister_dev(). There might be
another thread still accessing the hdev which was fetched before
the unlink operation.
We can use SRCU for such synchronisation.
Let's run hci_dev_reset() under SRCU and wait for its completion
in hci_unregister_dev().
Another option would be to restore hci_dev->destruct(), which was
removed in commit 587ae086f6e4 ("Bluetooth: Remove unused
hci-destruct cb"). However, this would not be a good solution, as
we should not run hci_unregister_dev() while there are in-flight
ioctl() requests, which could lead to another data-race KCSAN splat.
Note that other drivers seem to have the same problem, for exmaple,
virtbt_remove().
[0]:
BUG: KASAN: slab-use-after-free in skb_queue_empty_lockless include/linux/skbuff.h:1891 [inline]
BUG: KASAN: slab-use-after-free in skb_queue_purge_reason+0x99/0x360 net/core/skbuff.c:3937
Read of size 8 at addr ffff88807cb8d858 by task syz.1.219/6718
CPU: 1 UID: 0 PID: 6718 Comm: syz.1.219 Not tainted 6.16.0-rc1-syzkaller-00196-g08207f42d3ff #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
Call Trace:
<TASK>
dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:408 [inline]
print_report+0xd2/0x2b0 mm/kasan/report.c:521
kasan_report+0x118/0x150 mm/kasan/report.c:634
skb_queue_empty_lockless include/linux/skbuff.h:1891 [inline]
skb_queue_purge_reason+0x99/0x360 net/core/skbuff.c:3937
skb_queue_purge include/linux/skbuff.h:3368 [inline]
vhci_flush+0x44/0x50 drivers/bluetooth/hci_vhci.c:69
hci_dev_do_reset net/bluetooth/hci_core.c:552 [inline]
hci_dev_reset+0x420/0x5c0 net/bluetooth/hci_core.c:592
sock_do_ioctl+0xd9/0x300 net/socket.c:1190
sock_ioctl+0x576/0x790 net/socket.c:1311
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:907 [inline]
__se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fcf5b98e929
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fcf5c7b9038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fcf5bbb6160 RCX: 00007fcf5b98e929
RDX: 0000000000000000 RSI: 00000000400448cb RDI: 0000000000000009
RBP: 00007fcf5ba10b39 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007fcf5bbb6160 R15: 00007ffd6353d528
</TASK>
Allocated by task 6535:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
__kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:394
kasan_kmalloc include/linux/kasan.h:260 [inline]
__kmalloc_cache_noprof+0x230/0x3d0 mm/slub.c:4359
kmalloc_noprof include/linux/slab.h:905 [inline]
kzalloc_noprof include/linux/slab.h:1039 [inline]
vhci_open+0x57/0x360 drivers/bluetooth/hci_vhci.c:635
misc_open+0x2bc/0x330 drivers/char/misc.c:161
chrdev_open+0x4c9/0x5e0 fs/char_dev.c:414
do_dentry_open+0xdf0/0x1970 fs/open.c:964
vfs_open+0x3b/0x340 fs/open.c:1094
do_open fs/namei.c:3887 [inline]
path_openat+0x2ee5/0x3830 fs/name
---truncated--- |
In the Linux kernel, the following vulnerability has been resolved:
ALSA: usb-audio: Fix out-of-bounds read in snd_usb_get_audioformat_uac3()
In snd_usb_get_audioformat_uac3(), the length value returned from
snd_usb_ctl_msg() is used directly for memory allocation without
validation. This length is controlled by the USB device.
The allocated buffer is cast to a uac3_cluster_header_descriptor
and its fields are accessed without verifying that the buffer
is large enough. If the device returns a smaller than expected
length, this leads to an out-of-bounds read.
Add a length check to ensure the buffer is large enough for
uac3_cluster_header_descriptor. |
In the Linux kernel, the following vulnerability has been resolved:
bridge: mcast: Fix use-after-free during router port configuration
The bridge maintains a global list of ports behind which a multicast
router resides. The list is consulted during forwarding to ensure
multicast packets are forwarded to these ports even if the ports are not
member in the matching MDB entry.
When per-VLAN multicast snooping is enabled, the per-port multicast
context is disabled on each port and the port is removed from the global
router port list:
# ip link add name br1 up type bridge vlan_filtering 1 mcast_snooping 1
# ip link add name dummy1 up master br1 type dummy
# ip link set dev dummy1 type bridge_slave mcast_router 2
$ bridge -d mdb show | grep router
router ports on br1: dummy1
# ip link set dev br1 type bridge mcast_vlan_snooping 1
$ bridge -d mdb show | grep router
However, the port can be re-added to the global list even when per-VLAN
multicast snooping is enabled:
# ip link set dev dummy1 type bridge_slave mcast_router 0
# ip link set dev dummy1 type bridge_slave mcast_router 2
$ bridge -d mdb show | grep router
router ports on br1: dummy1
Since commit 4b30ae9adb04 ("net: bridge: mcast: re-implement
br_multicast_{enable, disable}_port functions"), when per-VLAN multicast
snooping is enabled, multicast disablement on a port will disable the
per-{port, VLAN} multicast contexts and not the per-port one. As a
result, a port will remain in the global router port list even after it
is deleted. This will lead to a use-after-free [1] when the list is
traversed (when adding a new port to the list, for example):
# ip link del dev dummy1
# ip link add name dummy2 up master br1 type dummy
# ip link set dev dummy2 type bridge_slave mcast_router 2
Similarly, stale entries can also be found in the per-VLAN router port
list. When per-VLAN multicast snooping is disabled, the per-{port, VLAN}
contexts are disabled on each port and the port is removed from the
per-VLAN router port list:
# ip link add name br1 up type bridge vlan_filtering 1 mcast_snooping 1 mcast_vlan_snooping 1
# ip link add name dummy1 up master br1 type dummy
# bridge vlan add vid 2 dev dummy1
# bridge vlan global set vid 2 dev br1 mcast_snooping 1
# bridge vlan set vid 2 dev dummy1 mcast_router 2
$ bridge vlan global show dev br1 vid 2 | grep router
router ports: dummy1
# ip link set dev br1 type bridge mcast_vlan_snooping 0
$ bridge vlan global show dev br1 vid 2 | grep router
However, the port can be re-added to the per-VLAN list even when
per-VLAN multicast snooping is disabled:
# bridge vlan set vid 2 dev dummy1 mcast_router 0
# bridge vlan set vid 2 dev dummy1 mcast_router 2
$ bridge vlan global show dev br1 vid 2 | grep router
router ports: dummy1
When the VLAN is deleted from the port, the per-{port, VLAN} multicast
context will not be disabled since multicast snooping is not enabled
on the VLAN. As a result, the port will remain in the per-VLAN router
port list even after it is no longer member in the VLAN. This will lead
to a use-after-free [2] when the list is traversed (when adding a new
port to the list, for example):
# ip link add name dummy2 up master br1 type dummy
# bridge vlan add vid 2 dev dummy2
# bridge vlan del vid 2 dev dummy1
# bridge vlan set vid 2 dev dummy2 mcast_router 2
Fix these issues by removing the port from the relevant (global or
per-VLAN) router port list in br_multicast_port_ctx_deinit(). The
function is invoked during port deletion with the per-port multicast
context and during VLAN deletion with the per-{port, VLAN} multicast
context.
Note that deleting the multicast router timer is not enough as it only
takes care of the temporary multicast router states (1 or 3) and not the
permanent one (2).
[1]
BUG: KASAN: slab-out-of-bounds in br_multicast_add_router.part.0+0x3f1/0x560
Write of size 8 at addr ffff888004a67328 by task ip/384
[...]
Call Trace:
<TASK>
dump_stack
---truncated--- |
In the Linux kernel, the following vulnerability has been resolved:
userns and mnt_idmap leak in open_tree_attr(2)
Once want_mount_setattr() has returned a positive, it does require
finish_mount_kattr() to release ->mnt_userns. Failing do_mount_setattr()
does not change that.
As the result, we can end up leaking userns and possibly mnt_idmap as
well. |
In the Linux kernel, the following vulnerability has been resolved:
atm: Release atm_dev_mutex after removing procfs in atm_dev_deregister().
syzbot reported a warning below during atm_dev_register(). [0]
Before creating a new device and procfs/sysfs for it, atm_dev_register()
looks up a duplicated device by __atm_dev_lookup(). These operations are
done under atm_dev_mutex.
However, when removing a device in atm_dev_deregister(), it releases the
mutex just after removing the device from the list that __atm_dev_lookup()
iterates over.
So, there will be a small race window where the device does not exist on
the device list but procfs/sysfs are still not removed, triggering the
splat.
Let's hold the mutex until procfs/sysfs are removed in
atm_dev_deregister().
[0]:
proc_dir_entry 'atm/atmtcp:0' already registered
WARNING: CPU: 0 PID: 5919 at fs/proc/generic.c:377 proc_register+0x455/0x5f0 fs/proc/generic.c:377
Modules linked in:
CPU: 0 UID: 0 PID: 5919 Comm: syz-executor284 Not tainted 6.16.0-rc2-syzkaller-00047-g52da431bf03b #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
RIP: 0010:proc_register+0x455/0x5f0 fs/proc/generic.c:377
Code: 48 89 f9 48 c1 e9 03 80 3c 01 00 0f 85 a2 01 00 00 48 8b 44 24 10 48 c7 c7 20 c0 c2 8b 48 8b b0 d8 00 00 00 e8 0c 02 1c ff 90 <0f> 0b 90 90 48 c7 c7 80 f2 82 8e e8 0b de 23 09 48 8b 4c 24 28 48
RSP: 0018:ffffc9000466fa30 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff817ae248
RDX: ffff888026280000 RSI: ffffffff817ae255 RDI: 0000000000000001
RBP: ffff8880232bed48 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff888076ed2140
R13: dffffc0000000000 R14: ffff888078a61340 R15: ffffed100edda444
FS: 00007f38b3b0c6c0(0000) GS:ffff888124753000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f38b3bdf953 CR3: 0000000076d58000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
proc_create_data+0xbe/0x110 fs/proc/generic.c:585
atm_proc_dev_register+0x112/0x1e0 net/atm/proc.c:361
atm_dev_register+0x46d/0x890 net/atm/resources.c:113
atmtcp_create+0x77/0x210 drivers/atm/atmtcp.c:369
atmtcp_attach drivers/atm/atmtcp.c:403 [inline]
atmtcp_ioctl+0x2f9/0xd60 drivers/atm/atmtcp.c:464
do_vcc_ioctl+0x12c/0x930 net/atm/ioctl.c:159
sock_do_ioctl+0x115/0x280 net/socket.c:1190
sock_ioctl+0x227/0x6b0 net/socket.c:1311
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:907 [inline]
__se_sys_ioctl fs/ioctl.c:893 [inline]
__x64_sys_ioctl+0x18b/0x210 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0x4c0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f38b3b74459
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f38b3b0c198 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f38b3bfe318 RCX: 00007f38b3b74459
RDX: 0000000000000000 RSI: 0000000000006180 RDI: 0000000000000005
RBP: 00007f38b3bfe310 R08: 65732f636f72702f R09: 65732f636f72702f
R10: 65732f636f72702f R11: 0000000000000246 R12: 00007f38b3bcb0ac
R13: 00007f38b3b0c1a0 R14: 0000200000000200 R15: 00007f38b3bcb03b
</TASK> |
In the Linux kernel, the following vulnerability has been resolved:
smb: client: fix potential deadlock when reconnecting channels
Fix cifs_signal_cifsd_for_reconnect() to take the correct lock order
and prevent the following deadlock from happening
======================================================
WARNING: possible circular locking dependency detected
6.16.0-rc3-build2+ #1301 Tainted: G S W
------------------------------------------------------
cifsd/6055 is trying to acquire lock:
ffff88810ad56038 (&tcp_ses->srv_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x134/0x200
but task is already holding lock:
ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&ret_buf->chan_lock){+.+.}-{3:3}:
validate_chain+0x1cf/0x270
__lock_acquire+0x60e/0x780
lock_acquire.part.0+0xb4/0x1f0
_raw_spin_lock+0x2f/0x40
cifs_setup_session+0x81/0x4b0
cifs_get_smb_ses+0x771/0x900
cifs_mount_get_session+0x7e/0x170
cifs_mount+0x92/0x2d0
cifs_smb3_do_mount+0x161/0x460
smb3_get_tree+0x55/0x90
vfs_get_tree+0x46/0x180
do_new_mount+0x1b0/0x2e0
path_mount+0x6ee/0x740
do_mount+0x98/0xe0
__do_sys_mount+0x148/0x180
do_syscall_64+0xa4/0x260
entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #1 (&ret_buf->ses_lock){+.+.}-{3:3}:
validate_chain+0x1cf/0x270
__lock_acquire+0x60e/0x780
lock_acquire.part.0+0xb4/0x1f0
_raw_spin_lock+0x2f/0x40
cifs_match_super+0x101/0x320
sget+0xab/0x270
cifs_smb3_do_mount+0x1e0/0x460
smb3_get_tree+0x55/0x90
vfs_get_tree+0x46/0x180
do_new_mount+0x1b0/0x2e0
path_mount+0x6ee/0x740
do_mount+0x98/0xe0
__do_sys_mount+0x148/0x180
do_syscall_64+0xa4/0x260
entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #0 (&tcp_ses->srv_lock){+.+.}-{3:3}:
check_noncircular+0x95/0xc0
check_prev_add+0x115/0x2f0
validate_chain+0x1cf/0x270
__lock_acquire+0x60e/0x780
lock_acquire.part.0+0xb4/0x1f0
_raw_spin_lock+0x2f/0x40
cifs_signal_cifsd_for_reconnect+0x134/0x200
__cifs_reconnect+0x8f/0x500
cifs_handle_standard+0x112/0x280
cifs_demultiplex_thread+0x64d/0xbc0
kthread+0x2f7/0x310
ret_from_fork+0x2a/0x230
ret_from_fork_asm+0x1a/0x30
other info that might help us debug this:
Chain exists of:
&tcp_ses->srv_lock --> &ret_buf->ses_lock --> &ret_buf->chan_lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&ret_buf->chan_lock);
lock(&ret_buf->ses_lock);
lock(&ret_buf->chan_lock);
lock(&tcp_ses->srv_lock);
*** DEADLOCK ***
3 locks held by cifsd/6055:
#0: ffffffff857de398 (&cifs_tcp_ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x7b/0x200
#1: ffff888119c64060 (&ret_buf->ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x9c/0x200
#2: ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200 |
In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix invalid inode pointer dereferences during log replay
In a few places where we call read_one_inode(), if we get a NULL pointer
we end up jumping into an error path, or fallthrough in case of
__add_inode_ref(), where we then do something like this:
iput(&inode->vfs_inode);
which results in an invalid inode pointer that triggers an invalid memory
access, resulting in a crash.
Fix this by making sure we don't do such dereferences. |
In the Linux kernel, the following vulnerability has been resolved:
mm: userfaultfd: fix race of userfaultfd_move and swap cache
This commit fixes two kinds of races, they may have different results:
Barry reported a BUG_ON in commit c50f8e6053b0, we may see the same
BUG_ON if the filemap lookup returned NULL and folio is added to swap
cache after that.
If another kind of race is triggered (folio changed after lookup) we
may see RSS counter is corrupted:
[ 406.893936] BUG: Bad rss-counter state mm:ffff0000c5a9ddc0
type:MM_ANONPAGES val:-1
[ 406.894071] BUG: Bad rss-counter state mm:ffff0000c5a9ddc0
type:MM_SHMEMPAGES val:1
Because the folio is being accounted to the wrong VMA.
I'm not sure if there will be any data corruption though, seems no.
The issues above are critical already.
On seeing a swap entry PTE, userfaultfd_move does a lockless swap cache
lookup, and tries to move the found folio to the faulting vma. Currently,
it relies on checking the PTE value to ensure that the moved folio still
belongs to the src swap entry and that no new folio has been added to the
swap cache, which turns out to be unreliable.
While working and reviewing the swap table series with Barry, following
existing races are observed and reproduced [1]:
In the example below, move_pages_pte is moving src_pte to dst_pte, where
src_pte is a swap entry PTE holding swap entry S1, and S1 is not in the
swap cache:
CPU1 CPU2
userfaultfd_move
move_pages_pte()
entry = pte_to_swp_entry(orig_src_pte);
// Here it got entry = S1
... < interrupted> ...
<swapin src_pte, alloc and use folio A>
// folio A is a new allocated folio
// and get installed into src_pte
<frees swap entry S1>
// src_pte now points to folio A, S1
// has swap count == 0, it can be freed
// by folio_swap_swap or swap
// allocator's reclaim.
<try to swap out another folio B>
// folio B is a folio in another VMA.
<put folio B to swap cache using S1 >
// S1 is freed, folio B can use it
// for swap out with no problem.
...
folio = filemap_get_folio(S1)
// Got folio B here !!!
... < interrupted again> ...
<swapin folio B and free S1>
// Now S1 is free to be used again.
<swapout src_pte & folio A using S1>
// Now src_pte is a swap entry PTE
// holding S1 again.
folio_trylock(folio)
move_swap_pte
double_pt_lock
is_pte_pages_stable
// Check passed because src_pte == S1
folio_move_anon_rmap(...)
// Moved invalid folio B here !!!
The race window is very short and requires multiple collisions of multiple
rare events, so it's very unlikely to happen, but with a deliberately
constructed reproducer and increased time window, it can be reproduced
easily.
This can be fixed by checking if the folio returned by filemap is the
valid swap cache folio after acquiring the folio lock.
Another similar race is possible: filemap_get_folio may return NULL, but
folio (A) could be swapped in and then swapped out again using the same
swap entry after the lookup. In such a case, folio (A) may remain in the
swap cache, so it must be moved too:
CPU1 CPU2
userfaultfd_move
move_pages_pte()
entry = pte_to_swp_entry(orig_src_pte);
// Here it got entry = S1, and S1 is not in swap cache
folio = filemap_get
---truncated--- |
In the Linux kernel, the following vulnerability has been resolved:
mm/shmem, swap: fix softlockup with mTHP swapin
Following softlockup can be easily reproduced on my test machine with:
echo always > /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled
swapon /dev/zram0 # zram0 is a 48G swap device
mkdir -p /sys/fs/cgroup/memory/test
echo 1G > /sys/fs/cgroup/test/memory.max
echo $BASHPID > /sys/fs/cgroup/test/cgroup.procs
while true; do
dd if=/dev/zero of=/tmp/test.img bs=1M count=5120
cat /tmp/test.img > /dev/null
rm /tmp/test.img
done
Then after a while:
watchdog: BUG: soft lockup - CPU#0 stuck for 763s! [cat:5787]
Modules linked in: zram virtiofs
CPU: 0 UID: 0 PID: 5787 Comm: cat Kdump: loaded Tainted: G L 6.15.0.orig-gf3021d9246bc-dirty #118 PREEMPT(voluntary)ยท
Tainted: [L]=SOFTLOCKUP
Hardware name: Red Hat KVM/RHEL-AV, BIOS 0.0.0 02/06/2015
RIP: 0010:mpol_shared_policy_lookup+0xd/0x70
Code: e9 b8 b4 ff ff 31 c0 c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 41 54 55 53 <48> 8b 1f 48 85 db 74 41 4c 8d 67 08 48 89 fb 48 89 f5 4c 89 e7 e8
RSP: 0018:ffffc90002b1fc28 EFLAGS: 00000202
RAX: 00000000001c20ca RBX: 0000000000724e1e RCX: 0000000000000001
RDX: ffff888118e214c8 RSI: 0000000000057d42 RDI: ffff888118e21518
RBP: 000000000002bec8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000bf4 R11: 0000000000000000 R12: 0000000000000001
R13: 00000000001c20ca R14: 00000000001c20ca R15: 0000000000000000
FS: 00007f03f995c740(0000) GS:ffff88a07ad9a000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f03f98f1000 CR3: 0000000144626004 CR4: 0000000000770eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
shmem_alloc_folio+0x31/0xc0
shmem_swapin_folio+0x309/0xcf0
? filemap_get_entry+0x117/0x1e0
? xas_load+0xd/0xb0
? filemap_get_entry+0x101/0x1e0
shmem_get_folio_gfp+0x2ed/0x5b0
shmem_file_read_iter+0x7f/0x2e0
vfs_read+0x252/0x330
ksys_read+0x68/0xf0
do_syscall_64+0x4c/0x1c0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f03f9a46991
Code: 00 48 8b 15 81 14 10 00 f7 d8 64 89 02 b8 ff ff ff ff eb bd e8 20 ad 01 00 f3 0f 1e fa 80 3d 35 97 10 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 4f c3 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec
RSP: 002b:00007fff3c52bd28 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000040000 RCX: 00007f03f9a46991
RDX: 0000000000040000 RSI: 00007f03f98ba000 RDI: 0000000000000003
RBP: 00007fff3c52bd50 R08: 0000000000000000 R09: 00007f03f9b9a380
R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000040000
R13: 00007f03f98ba000 R14: 0000000000000003 R15: 0000000000000000
</TASK>
The reason is simple, readahead brought some order 0 folio in swap cache,
and the swapin mTHP folio being allocated is in conflict with it, so
swapcache_prepare fails and causes shmem_swap_alloc_folio to return
-EEXIST, and shmem simply retries again and again causing this loop.
Fix it by applying a similar fix for anon mTHP swapin.
The performance change is very slight, time of swapin 10g zero folios
with shmem (test for 12 times):
Before: 2.47s
After: 2.48s
[kasong@tencent.com: add comment] |