In the Linux kernel, the following vulnerability has been resolved:
media: ti: j721e-csi2rx: Fix races while restarting DMA
After the frame is submitted to DMA, it may happen that the submitted
list is not updated soon enough, and the DMA callback is triggered
before that.
This can lead to kernel crashes, so move everything into a single
lock/unlock section to prevent such races.
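A minimal sketch of the locking pattern described above; the names csi2rx_dma, csi2rx_buf and csi2rx_start_dma are illustrative, not the driver's actual identifiers:

    /* Publish the buffer and start DMA inside one critical section, so
     * the DMA completion callback (which takes the same lock) cannot
     * run before the submitted list is updated. */
    static void csi2rx_submit_buffer(struct csi2rx_dma *dma,
                                     struct csi2rx_buf *buf)
    {
            unsigned long flags;

            spin_lock_irqsave(&dma->lock, flags);
            list_add_tail(&buf->list, &dma->submitted);
            csi2rx_start_dma(dma, buf);   /* may complete at any point after this */
            spin_unlock_irqrestore(&dma->lock, flags);
    }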
In the Linux kernel, the following vulnerability has been resolved:
dma-mapping: benchmark: fix node id validation
While validating node ids in map_benchmark_ioctl(), node_possible() may
be given an invalid argument outside the [0, MAX_NUMNODES-1] range,
leading to:
BUG: KASAN: wild-memory-access in map_benchmark_ioctl (kernel/dma/map_benchmark.c:214)
Read of size 8 at addr 1fffffff8ccb6398 by task dma_map_benchma/971
CPU: 7 PID: 971 Comm: dma_map_benchma Not tainted 6.9.0-rc6 #37
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:117)
kasan_report (mm/kasan/report.c:603)
kasan_check_range (mm/kasan/generic.c:189)
variable_test_bit (arch/x86/include/asm/bitops.h:227) [inline]
arch_test_bit (arch/x86/include/asm/bitops.h:239) [inline]
_test_bit (include/asm-generic/bitops/instrumented-non-atomic.h:142) [inline]
node_state (include/linux/nodemask.h:423) [inline]
map_benchmark_ioctl (kernel/dma/map_benchmark.c:214)
full_proxy_unlocked_ioctl (fs/debugfs/file.c:333)
__x64_sys_ioctl (fs/ioctl.c:890)
do_syscall_64 (arch/x86/entry/common.c:83)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
Compare node ids with sane bounds first. NUMA_NO_NODE is considered a
special valid case meaning that benchmarking kthreads won't be bound to a
cpuset of a given node.
Found by Linux Verification Center (linuxtesting.org).
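A sketch of the bounds-first validation described above, assuming a signed node field in the ioctl parameters as in map_benchmark_ioctl(); exact field names may differ:

    /* Validate the node id before calling node_possible(), which must
     * only be given values in [0, MAX_NUMNODES). NUMA_NO_NODE stays a
     * valid special case. */
    if (map->bparam.node != NUMA_NO_NODE &&
        (map->bparam.node < 0 || map->bparam.node >= MAX_NUMNODES ||
         !node_possible(map->bparam.node))) {
            pr_err("invalid numa node\n");
            return -EINVAL;
    }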
In the Linux kernel, the following vulnerability has been resolved:
greybus: lights: check return of get_channel_from_mode
If a channel for the given mode is not found, get_channel_from_mode()
returns NULL. Make sure we validate the returned pointer before using
it in the two places where the check was missing.
This was originally reported in [0]:
Found by Linux Verification Center (linuxtesting.org) with SVACE.
[0] https://lore.kernel.org/all/20240301190425.120605-1-m.lobanov@rosalinux.ru
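The shape of the missing check, as a hedged sketch (the error handling at each call site may differ):

    channel = get_channel_from_mode(light, mode);
    if (!channel)
            return -EINVAL;   /* was dereferenced unconditionally before */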
In the Linux kernel, the following vulnerability has been resolved:
soundwire: cadence: fix invalid PDI offset
For some reason, we add an offset to the PDI, presumably to skip the
PDI0 and PDI1 which are reserved for BPT.
This code is however completely wrong and leads to an out-of-bounds
access. We were just lucky so far since we used only a couple of PDIs
and remained within the PDI array bounds.
A Fixes: tag is not provided since there are no known platforms where
the out-of-bounds would be accessed, and the initial code had problems
as well.
A follow-up patch completely removes this useless offset.
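Illustrative only, not the driver's actual code: an offset folded into the loop bounds walks past the end of the PDI array, while a corrected loop stays within it (use_pdi is a hypothetical helper):

    /* buggy pattern: indexes [offset, offset + num_pdi) overrun pdi[] */
    for (i = offset; i < offset + stream->num_pdi; i++)
            use_pdi(&stream->pdi[i]);

    /* safe pattern: indexes stay within [0, num_pdi) */
    for (i = 0; i < stream->num_pdi; i++)
            use_pdi(&stream->pdi[i]);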
In the Linux kernel, the following vulnerability has been resolved:
drm/msm/dpu: Add callback function pointer check before its call
In dpu_core_irq_callback_handler() the callback function pointer is
compared to NULL, but the callback is then called unconditionally
through that pointer. Fix this bug by returning early when no callback
is registered.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Patchwork: https://patchwork.freedesktop.org/patch/588237/
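A sketch of the conditional return; the identifiers follow the dpu IRQ code but are not guaranteed exact:

    if (!irq_entry->cb) {
            DRM_ERROR("no registered cb, IRQ=%d\n", irq_idx);
            return;   /* previously fell through to the call below */
    }
    irq_entry->cb(irq_entry->arg);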
In the Linux kernel, the following vulnerability has been resolved:
media: stk1160: fix bounds checking in stk1160_copy_video()
The subtract in this condition is reversed. The ->length is the length
of the buffer. The ->bytesused is how many bytes we have copied thus
far. Because the condition is reversed, the result of the subtraction
is always negative, but since it's unsigned it becomes a very large
positive value. That means the overflow check is never true.
Additionally, the ->bytesused doesn't actually work for this purpose
because we're not writing to "buf->mem + buf->bytesused". Instead, the
math to calculate the destination where we are writing is a bit
involved. You calculate the number of full lines already written,
multiply by two, skip a line if necessary so that we start on an odd
numbered line, and add the offset into the line.
To fix this buffer overflow, take the actual destination where we are
writing; if the offset is already out of bounds, print an error and
return. Otherwise, write up to buf->length bytes.
In the Linux kernel, the following vulnerability has been resolved:
tcp: Fix shift-out-of-bounds in dctcp_update_alpha().
In dctcp_update_alpha(), we use a module parameter dctcp_shift_g
as follows:
alpha -= min_not_zero(alpha, alpha >> dctcp_shift_g);
...
delivered_ce <<= (10 - dctcp_shift_g);
It seems syzkaller started fuzzing module parameters and triggered a
shift-out-of-bounds [0] by setting dctcp_shift_g to 100:
memcpy((void*)0x20000080,
"/sys/module/tcp_dctcp/parameters/dctcp_shift_g\000", 47);
res = syscall(__NR_openat, /*fd=*/0xffffffffffffff9cul, /*file=*/0x20000080ul,
/*flags=*/2ul, /*mode=*/0ul);
memcpy((void*)0x20000000, "100\000", 4);
syscall(__NR_write, /*fd=*/r[0], /*val=*/0x20000000ul, /*len=*/4ul);
Let's limit the max value of dctcp_shift_g by param_set_uint_minmax().
With this patch:
# echo 10 > /sys/module/tcp_dctcp/parameters/dctcp_shift_g
# cat /sys/module/tcp_dctcp/parameters/dctcp_shift_g
10
# echo 11 > /sys/module/tcp_dctcp/parameters/dctcp_shift_g
-bash: echo: write error: Invalid argument
[0]:
UBSAN: shift-out-of-bounds in net/ipv4/tcp_dctcp.c:143:12
shift exponent 100 is too large for 32-bit type 'u32' (aka 'unsigned int')
CPU: 0 PID: 8083 Comm: syz-executor345 Not tainted 6.9.0-05151-g1b294a1f3561 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x201/0x300 lib/dump_stack.c:114
ubsan_epilogue lib/ubsan.c:231 [inline]
__ubsan_handle_shift_out_of_bounds+0x346/0x3a0 lib/ubsan.c:468
dctcp_update_alpha+0x540/0x570 net/ipv4/tcp_dctcp.c:143
tcp_in_ack_event net/ipv4/tcp_input.c:3802 [inline]
tcp_ack+0x17b1/0x3bc0 net/ipv4/tcp_input.c:3948
tcp_rcv_state_process+0x57a/0x2290 net/ipv4/tcp_input.c:6711
tcp_v4_do_rcv+0x764/0xc40 net/ipv4/tcp_ipv4.c:1937
sk_backlog_rcv include/net/sock.h:1106 [inline]
__release_sock+0x20f/0x350 net/core/sock.c:2983
release_sock+0x61/0x1f0 net/core/sock.c:3549
mptcp_subflow_shutdown+0x3d0/0x620 net/mptcp/protocol.c:2907
mptcp_check_send_data_fin+0x225/0x410 net/mptcp/protocol.c:2976
__mptcp_close+0x238/0xad0 net/mptcp/protocol.c:3072
mptcp_close+0x2a/0x1a0 net/mptcp/protocol.c:3127
inet_release+0x190/0x1f0 net/ipv4/af_inet.c:437
__sock_release net/socket.c:659 [inline]
sock_close+0xc0/0x240 net/socket.c:1421
__fput+0x41b/0x890 fs/file_table.c:422
task_work_run+0x23b/0x300 kernel/task_work.c:180
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x9c8/0x2540 kernel/exit.c:878
do_group_exit+0x201/0x2b0 kernel/exit.c:1027
__do_sys_exit_group kernel/exit.c:1038 [inline]
__se_sys_exit_group kernel/exit.c:1036 [inline]
__x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1036
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xe4/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x67/0x6f
RIP: 0033:0x7f6c2b5005b6
Code: Unable to access opcode bytes at 0x7f6c2b50058c.
RSP: 002b:00007ffe883eb948 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f6c2b5862f0 RCX: 00007f6c2b5005b6
RDX: 0000000000000001 RSI: 000000000000003c RDI: 0000000000000001
RBP: 0000000000000001 R08: 00000000000000e7 R09: ffffffffffffffc0
R10: 0000000000000006 R11: 0000000000000246 R12: 00007f6c2b5862f0
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
</TASK>
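A sketch of how such a clamp is typically wired up with param_set_uint_minmax(); the upper bound of 10 follows from the "10 - dctcp_shift_g" shift above:

    static int dctcp_shift_g_set(const char *val, const struct kernel_param *kp)
    {
            /* reject anything outside [0, 10] at write time */
            return param_set_uint_minmax(val, kp, 0, 10);
    }

    static const struct kernel_param_ops dctcp_shift_g_ops = {
            .set = dctcp_shift_g_set,
            .get = param_get_uint,
    };

    module_param_cb(dctcp_shift_g, &dctcp_shift_g_ops, &dctcp_shift_g, 0644);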
In the Linux kernel, the following vulnerability has been resolved:
KVM: x86: Forcibly leave nested virt when SMM state is toggled
Forcibly leave nested virtualization operation if userspace toggles SMM
state via KVM_SET_VCPU_EVENTS or KVM_SYNC_X86_EVENTS. If userspace
forces the vCPU out of SMM while it's post-VMXON and then injects an SMI,
vmx_enter_smm() will overwrite vmx->nested.smm.vmxon and end up with both
vmxon=false and smm.vmxon=false, but all other nVMX state allocated.
Don't attempt to gracefully handle the transition as (a) most transitions
are nonsensical, e.g. forcing SMM while L2 is running, (b) there isn't
sufficient information to handle all transitions, e.g. SVM wants access
to the SMRAM save state, and (c) KVM_SET_VCPU_EVENTS must precede
KVM_SET_NESTED_STATE during state restore as the latter disallows putting
the vCPU into L2 if SMM is active, and disallows tagging the vCPU as
being post-VMXON in SMM if SMM is not active.
Abuse of KVM_SET_VCPU_EVENTS manifests as a WARN and memory leak in nVMX
due to failure to free vmcs01's shadow VMCS, but the bug goes far beyond
just a memory leak, e.g. toggling SMM on while L2 is active puts the vCPU
in an architecturally impossible state.
WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline]
WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656
Modules linked in:
CPU: 1 PID: 3606 Comm: syz-executor725 Not tainted 5.17.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline]
RIP: 0010:free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656
Code: <0f> 0b eb b3 e8 8f 4d 9f 00 e9 f7 fe ff ff 48 89 df e8 92 4d 9f 00
Call Trace:
<TASK>
kvm_arch_vcpu_destroy+0x72/0x2f0 arch/x86/kvm/x86.c:11123
kvm_vcpu_destroy arch/x86/kvm/../../../virt/kvm/kvm_main.c:441 [inline]
kvm_destroy_vcpus+0x11f/0x290 arch/x86/kvm/../../../virt/kvm/kvm_main.c:460
kvm_free_vcpus arch/x86/kvm/x86.c:11564 [inline]
kvm_arch_destroy_vm+0x2e8/0x470 arch/x86/kvm/x86.c:11676
kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1217 [inline]
kvm_put_kvm+0x4fa/0xb00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1250
kvm_vm_release+0x3f/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1273
__fput+0x286/0x9f0 fs/file_table.c:311
task_work_run+0xdd/0x1a0 kernel/task_work.c:164
exit_task_work include/linux/task_work.h:32 [inline]
do_exit+0xb29/0x2a30 kernel/exit.c:806
do_group_exit+0xd2/0x2f0 kernel/exit.c:935
get_signal+0x4b0/0x28c0 kernel/signal.c:2862
arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868
handle_signal_work kernel/entry/common.c:148 [inline]
exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
__syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x44/0xae
</TASK>
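A hedged sketch of the forced exit in kvm_vcpu_ioctl_x86_set_vcpu_events(); the leave_nested hook name follows the commit title, and the surrounding SMM bookkeeping is elided:

    if (events->flags & KVM_VCPUEVENT_VALID_SMM) {
            /* a userspace SMM toggle invalidates nested state; drop it
             * rather than attempt a nonsensical transition */
            if (!!(events->smi.smm) != !!(vcpu->arch.hflags & HF_SMM_MASK))
                    kvm_x86_ops.nested_ops->leave_nested(vcpu);
            /* ... then update the SMM flags as before ... */
    }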
In the Linux kernel, the following vulnerability has been resolved:
USB: core: Fix hang in usb_kill_urb by adding memory barriers
The syzbot fuzzer has identified a bug in which processes hang waiting
for usb_kill_urb() to return. It turns out the issue is not unlinking
the URB; that works just fine. Rather, the problem arises when the
wakeup notification that the URB has completed is not received.
The reason is memory-access ordering on SMP systems. In outline form,
usb_kill_urb() and __usb_hcd_giveback_urb() operating concurrently on
different CPUs perform the following actions:
CPU 0                                     CPU 1
----------------------------              ---------------------------------
usb_kill_urb():                           __usb_hcd_giveback_urb():
  ...                                       ...
  atomic_inc(&urb->reject);                 atomic_dec(&urb->use_count);
  ...                                       ...
  wait_event(usb_kill_urb_queue,
      atomic_read(&urb->use_count) == 0);
                                            if (atomic_read(&urb->reject))
                                                wake_up(&usb_kill_urb_queue);
Confining your attention to urb->reject and urb->use_count, you can
see that the overall pattern of accesses on CPU 0 is:
write urb->reject, then read urb->use_count;
whereas the overall pattern of accesses on CPU 1 is:
write urb->use_count, then read urb->reject.
This pattern is referred to in memory-model circles as SB (for "Store
Buffering"), and it is well known that without suitable enforcement of
the desired order of accesses -- in the form of memory barriers -- it
is entirely possible for one or both CPUs to execute their reads ahead
of their writes. The end result will be that sometimes CPU 0 sees the
old un-decremented value of urb->use_count while CPU 1 sees the old
un-incremented value of urb->reject. Consequently CPU 0 ends up on
the wait queue and never gets woken up, leading to the observed hang
in usb_kill_urb().
The same pattern of accesses occurs in usb_poison_urb() and the
failure pathway of usb_hcd_submit_urb().
The problem is fixed by adding suitable memory barriers. To provide
proper memory-access ordering in the SB pattern, a full barrier is
required on both CPUs. The atomic_inc() and atomic_dec() accesses
themselves don't provide any memory ordering, but since they are
present, we can use the optimized smp_mb__after_atomic() memory
barrier in the various routines to obtain the desired effect.
This patch adds the necessary memory barriers.
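Applied to the SB pattern above, the barriers land as follows (a sketch with abbreviated comments):

    /* usb_kill_urb() on CPU 0 */
    atomic_inc(&urb->reject);
    smp_mb__after_atomic();   /* order the write above before the read below */
    wait_event(usb_kill_urb_queue, atomic_read(&urb->use_count) == 0);

    /* __usb_hcd_giveback_urb() on CPU 1 */
    atomic_dec(&urb->use_count);
    smp_mb__after_atomic();   /* order the write above before the read below */
    if (unlikely(atomic_read(&urb->reject)))
            wake_up(&usb_kill_urb_queue);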
In the Linux kernel, the following vulnerability has been resolved:
sock_map: avoid race between sock_map_close and sk_psock_put
sk_psock_get will return NULL if the refcount of psock has gone to 0, which
will happen when the last call of sk_psock_put is done. However,
sk_psock_drop may not have finished yet, so the close callback will still
point to sock_map_close despite psock being NULL.
This can be reproduced with a thread deleting an element from the sock map,
while the second one creates a socket, adds it to the map and closes it.
That will trigger the WARN_ON_ONCE:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 7220 at net/core/sock_map.c:1701 sock_map_close+0x2a2/0x2d0 net/core/sock_map.c:1701
Modules linked in:
CPU: 1 PID: 7220 Comm: syz-executor380 Not tainted 6.9.0-syzkaller-07726-g3c999d1ae3c7 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
RIP: 0010:sock_map_close+0x2a2/0x2d0 net/core/sock_map.c:1701
Code: df e8 92 29 88 f8 48 8b 1b 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 79 29 88 f8 4c 8b 23 eb 89 e8 4f 15 23 f8 90 <0f> 0b 90 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d e9 13 26 3d 02
RSP: 0018:ffffc9000441fda8 EFLAGS: 00010293
RAX: ffffffff89731ae1 RBX: ffffffff94b87540 RCX: ffff888029470000
RDX: 0000000000000000 RSI: ffffffff8bcab5c0 RDI: ffffffff8c1faba0
RBP: 0000000000000000 R08: ffffffff92f9b61f R09: 1ffffffff25f36c3
R10: dffffc0000000000 R11: fffffbfff25f36c4 R12: ffffffff89731840
R13: ffff88804b587000 R14: ffff88804b587000 R15: ffffffff89731870
FS: 000055555e080380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000207d4000 CR4: 0000000000350ef0
Call Trace:
<TASK>
unix_release+0x87/0xc0 net/unix/af_unix.c:1048
__sock_release net/socket.c:659 [inline]
sock_close+0xbe/0x240 net/socket.c:1421
__fput+0x42b/0x8a0 fs/file_table.c:422
__do_sys_close fs/open.c:1556 [inline]
__se_sys_close fs/open.c:1541 [inline]
__x64_sys_close+0x7f/0x110 fs/open.c:1541
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb37d618070
Code: 00 00 48 c7 c2 b8 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb d4 e8 10 2c 00 00 80 3d 31 f0 07 00 00 74 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c
RSP: 002b:00007ffcd4a525d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fb37d618070
RDX: 0000000000000010 RSI: 00000000200001c0 RDI: 0000000000000004
RBP: 0000000000000000 R08: 0000000100000000 R09: 0000000100000000
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Use sk_psock(), which only checks that the pointer has not yet been set
to NULL; that should only happen after the callbacks are restored. If a
reference can still be taken at that point, we may call sk_psock_stop()
and cancel psock->work.
As suggested by Paolo Abeni, reorder the condition so the control flow is
less convoluted.
After that change, the reproducer does not trigger the WARN_ON_ONCE
anymore.
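A hedged sketch of the reworked entry to sock_map_close(), per the description above (the real function's flow differs in detail):

    rcu_read_lock();
    psock = sk_psock(sk);           /* NULL only once callbacks are restored */
    if (likely(psock)) {
            saved_close = psock->saved_close;
            sock_map_remove_links(sk, psock);
            psock = sk_psock_get(sk);   /* may fail: refcount already zero */
            if (unlikely(!psock))
                    goto no_psock;      /* just restore close and release */
            rcu_read_unlock();
            sk_psock_stop(psock);
            /* ... release the socket and cancel psock->work ... */
    }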
In the Linux kernel, the following vulnerability has been resolved:
vmci: prevent speculation leaks by sanitizing event in event_deliver()
Coverity spotted that event_msg is controlled by user-space and that
event_msg->event_data.event is passed to event_deliver() and used as an
index without sanitization.
This change ensures that the event index is sanitized to mitigate any
possibility of speculative information leaks.
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
Only compile tested, no access to HW.
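The standard primitive for this is array_index_nospec() from <linux/nospec.h>; a sketch of what the sanitization in event_deliver() likely looks like:

    u32 event = array_index_nospec(event_msg->event_data.event,
                                   VMCI_EVENT_MAX);
    /* 'event' can now be used as an array index without the CPU
     * speculating past the bounds check */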
In the Linux kernel, the following vulnerability has been resolved:
drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE)
A missing check for copy-on-write (COW) mappings in drm_gem_shmem_mmap
allows users to call mmap() with PROT_WRITE and MAP_PRIVATE, causing a
kernel panic due to the BUG_ON in vmf_insert_pfn_prot:
BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
Return -EINVAL early if COW mapping is detected.
This bug affects all drm drivers using default shmem helpers.
It can be reproduced by this simple example:
char *ptr = mmap(0, size, PROT_WRITE, MAP_PRIVATE, fd, mmap_offset);
ptr[0] = 0;
In the Linux kernel, the following vulnerability has been resolved:
ALSA: hda: cs35l56: Fix lifetime of cs_dsp instance
The cs_dsp instance is initialized in the driver probe() so it
should be freed in the driver remove(). Also fix a missing call
to cs_dsp_remove() in the error path of cs35l56_hda_common_probe().
The call to cs_dsp_remove() was being done in the component unbind
callback cs35l56_hda_unbind(). This meant that if the driver was
unbound and then re-bound it would be using an uninitialized cs_dsp
instance.
It is best to initialize the cs_dsp instance in probe() so that it
can return an error if it fails. The component binding API doesn't
have any error handling so there's no way to handle a failure if
cs_dsp was initialized in the bind.
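A hedged outline of the lifetime change; cs_dsp_halo_init() is the usual init entry point for this firmware core, and the surrounding code is elided:

    /* in cs35l56_hda_common_probe(): init the DSP here so a failure
     * can be returned, and unwind it on later errors */
    ret = cs_dsp_halo_init(&cs35l56->cs_dsp);
    if (ret)
            goto err;
    /* ... */
    err:
            cs_dsp_remove(&cs35l56->cs_dsp);   /* was missing in the error path */

    /* and in the driver remove(), not the component unbind callback: */
    cs_dsp_remove(&cs35l56->cs_dsp);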
In the Linux kernel, the following vulnerability has been resolved:
arm64: asm-bug: Add .align 2 to the end of __BUG_ENTRY
When CONFIG_DEBUG_BUGVERBOSE=n, we fail to add necessary padding bytes
to bug_table entries, and as a result the last entry in a bug table will
be ignored, potentially leading to an unexpected panic(). All prior
entries in the table will be handled correctly.
The arm64 ABI requires that struct fields of up to 8 bytes are
naturally aligned, with padding added within a struct such that structs
are suitably aligned within arrays.
When CONFIG_DEBUG_BUGVERBOSE=y, the layout of a bug_entry is:
struct bug_entry {
    signed int      bug_addr_disp;  // 4 bytes
    signed int      file_disp;      // 4 bytes
    unsigned short  line;           // 2 bytes
    unsigned short  flags;          // 2 bytes
}
... with 12 bytes total, requiring 4-byte alignment.
When CONFIG_DEBUG_BUGVERBOSE=n, the layout of a bug_entry is:
struct bug_entry {
    signed int      bug_addr_disp;  // 4 bytes
    unsigned short  flags;          // 2 bytes
    < implicit padding >            // 2 bytes
}
... with 8 bytes total, with 6 bytes of data and 2 bytes of trailing
padding, requiring 4-byte alignment.
When we create a bug_entry in assembly, we align the start of the entry
to 4 bytes, which implicitly handles padding for any prior entries.
However, we do not align the end of the entry, and so when
CONFIG_DEBUG_BUGVERBOSE=n, the final entry lacks the trailing padding
bytes.
For the main kernel image this is not a problem as find_bug() doesn't
depend on the trailing padding bytes when searching for entries:
for (bug = __start___bug_table; bug < __stop___bug_table; ++bug)
if (bugaddr == bug_addr(bug))
return bug;
However for modules, module_bug_finalize() depends on the trailing
bytes when calculating the number of entries:
mod->num_bugs = sechdrs[i].sh_size / sizeof(struct bug_entry);
... and as the last bug_entry lacks the necessary padding bytes, this entry
will not be counted, e.g. in the case of a single entry:
sechdrs[i].sh_size == 6
sizeof(struct bug_entry) == 8;
sechdrs[i].sh_size / sizeof(struct bug_entry) == 0;
Consequently module_find_bug() will miss the last bug_entry when it does:
for (i = 0; i < mod->num_bugs; ++i, ++bug)
if (bugaddr == bug_addr(bug))
goto out;
... which can lead to a kernel panic due to an unhandled bug.
This can be demonstrated with the following module:
static int __init buginit(void)
{
    WARN(1, "hello\n");
    return 0;
}

static void __exit bugexit(void)
{
}

module_init(buginit);
module_exit(bugexit);

MODULE_LICENSE("GPL");
... which will trigger a kernel panic when loaded:
------------[ cut here ]------------
hello
Unexpected kernel BRK exception at EL1
Internal error: BRK handler: 00000000f2000800 [#1] PREEMPT SMP
Modules linked in: hello(O+)
CPU: 0 PID: 50 Comm: insmod Tainted: G O 6.9.1 #8
Hardware name: linux,dummy-virt (DT)
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : buginit+0x18/0x1000 [hello]
lr : buginit+0x18/0x1000 [hello]
sp : ffff800080533ae0
x29: ffff800080533ae0 x28: 0000000000000000 x27: 0000000000000000
x26: ffffaba8c4e70510 x25: ffff800080533c30 x24: ffffaba8c4a28a58
x23: 0000000000000000 x22: 0000000000000000 x21: ffff3947c0eab3c0
x20: ffffaba8c4e3f000 x19: ffffaba846464000 x18: 0000000000000006
x17: 0000000000000000 x16: ffffaba8c2492834 x15: 0720072007200720
x14: 0720072007200720 x13: ffffaba8c49b27c8 x12: 0000000000000312
x11: 0000000000000106 x10: ffffaba8c4a0a7c8 x9 : ffffaba8c49b27c8
x8 : 00000000ffffefff x7 : ffffaba8c4a0a7c8 x6 : 80000000fffff000
x5 : 0000000000000107 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff3947c0eab3c0
Call trace:
buginit+0x18/0x1000 [hello]
do_one_initcall+0x80/0x1c8
do_init_module+0x60/0x218
load_module+0x1ba4/0x1d70
__do_sys_init_module+0x198/0x1d0
__arm64_sys_init_module+0x1c/0x28
invoke_syscall+0x48/0x114
el0_svc
---truncated---
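The fix named in the title is a one-line change to the assembly macro: align the end of each entry so that, even without the verbose fields, the table size is a whole multiple of sizeof(struct bug_entry). A sketch of the macro with whitespace simplified:

    #define __BUG_ENTRY(flags)                      \
            .pushsection __bug_table,"aw";          \
            .align 2;                               \
    14470:  .long 14471f - .;                       \
    _BUGVERBOSE_LOCATION(__FILE__, __LINE__)        \
            .short flags;                           \
            .align 2;   /* pad the final entry too */ \
            .popsection;                            \
    14471: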
In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix to do sanity check on i_xattr_nid in sanity_check_inode()
syzbot reports a kernel bug as below:
F2FS-fs (loop0): Mounted with checkpoint version = 48b305e4
==================================================================
BUG: KASAN: slab-out-of-bounds in f2fs_test_bit fs/f2fs/f2fs.h:2933 [inline]
BUG: KASAN: slab-out-of-bounds in current_nat_addr fs/f2fs/node.h:213 [inline]
BUG: KASAN: slab-out-of-bounds in f2fs_get_node_info+0xece/0x1200 fs/f2fs/node.c:600
Read of size 1 at addr ffff88807a58c76c by task syz-executor280/5076
CPU: 1 PID: 5076 Comm: syz-executor280 Not tainted 6.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
f2fs_test_bit fs/f2fs/f2fs.h:2933 [inline]
current_nat_addr fs/f2fs/node.h:213 [inline]
f2fs_get_node_info+0xece/0x1200 fs/f2fs/node.c:600
f2fs_xattr_fiemap fs/f2fs/data.c:1848 [inline]
f2fs_fiemap+0x55d/0x1ee0 fs/f2fs/data.c:1925
ioctl_fiemap fs/ioctl.c:220 [inline]
do_vfs_ioctl+0x1c07/0x2e50 fs/ioctl.c:838
__do_sys_ioctl fs/ioctl.c:902 [inline]
__se_sys_ioctl+0x81/0x170 fs/ioctl.c:890
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The root cause is that we missed doing a sanity check on i_xattr_nid
during f2fs_iget(), so in the fiemap() path current_nat_addr() accesses
nat_bitmap with an offset derived from the invalid i_xattr_nid,
triggering the KASAN report. Fix it by checking i_xattr_nid in
sanity_check_inode().
In the Linux kernel, the following vulnerability has been resolved:
bonding: fix oops during rmmod
"rmmod bonding" causes an oops ever since commit cc317ea3d927 ("bonding:
remove redundant NULL check in debugfs function"). Here are the relevant
functions being called:
bonding_exit()
  bond_destroy_debugfs()
    debugfs_remove_recursive(bonding_debug_root);
    bonding_debug_root = NULL; <--------- SET TO NULL HERE

  bond_netlink_fini()
    rtnl_link_unregister()
      __rtnl_link_unregister()
        unregister_netdevice_many_notify()
          bond_uninit()
            bond_debug_unregister()
              (commit removed check for bonding_debug_root == NULL)
              debugfs_remove()
                simple_recursive_removal()
                  down_write() -> OOPS
However, reverting the bad commit does not solve the problem completely
because the original code contains a race that could cause the same
oops, although it was much less likely to be triggered unintentionally:
CPU1                                              CPU2
rmmod bonding
  bonding_exit()
    bond_destroy_debugfs()
      debugfs_remove_recursive(bonding_debug_root);

                                                  echo -bond0 > /sys/class/net/bonding_masters
                                                    bond_uninit()
                                                      bond_debug_unregister()
                                                        if (!bonding_debug_root)

      bonding_debug_root = NULL;
So do NOT revert the bad commit (since the removed checks were racy
anyway), and instead change the order of actions taken during module
removal. The same oops can also happen if there is an error during
module init, so apply the same fix there.
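A sketch of the reordered teardown in bonding_exit(): destroy all remaining bonds (and with them their per-bond debugfs entries) before removing the shared debugfs root. Surrounding calls are elided:

    static void __exit bonding_exit(void)
    {
            /* ... */
            bond_netlink_fini();        /* destroys remaining bonds; each
                                         * calls bond_debug_unregister() */
            bond_destroy_debugfs();     /* only now remove the shared root */
            /* ... */
    }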
In the Linux kernel, the following vulnerability has been resolved:
Revert "xsk: Support redirect to any socket bound to the same umem"
This reverts commit 2863d665ea41282379f108e4da6c8a2366ba66db.
The reverted commit introduced a potential kernel crash when multiple napi
instances redirect to the same AF_XDP socket. By removing the queue_index
check, it became possible for multiple napi instances to access the Rx ring
at the same time, resulting in a corrupted ring state which can lead to a
crash when flushing the rings in __xsk_flush(). This can happen when the
linked list of sockets to flush gets corrupted by concurrent accesses. A
quick and small fix is not possible, so let us revert this for now.
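The revert restores the per-queue binding check on the receive path, along these lines (a hedged sketch following the upstream xsk_rcv_check() shape):

    /* only the socket bound to this exact dev/queue may receive here,
     * so a single napi instance produces into the Rx ring */
    if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index)
            return -EINVAL;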
In the Linux kernel, the following vulnerability has been resolved:
btrfs: protect folio::private when attaching extent buffer folios
[BUG]
Since v6.8 there are rare kernel crashes reported by various people,
the common factor is bad page status error messages like this:
BUG: Bad page state in process kswapd0 pfn:d6e840
page: refcount:0 mapcount:0 mapping:000000007512f4f2 index:0x2796c2c7c
pfn:0xd6e840
aops:btree_aops ino:1
flags: 0x17ffffe0000008(uptodate|node=0|zone=2|lastcpupid=0x3fffff)
page_type: 0xffffffff()
raw: 0017ffffe0000008 dead000000000100 dead000000000122 ffff88826d0be4c0
raw: 00000002796c2c7c 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: non-NULL mapping
[CAUSE]
Commit 09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to
allocate-then-attach method") changes the sequence when allocating a new
extent buffer.
Previously we always called grab_extent_buffer() under
mapping->i_private_lock, to ensure safe modification of folio::private
(which is a pointer to the extent buffer for regular sectorsize).
This can lead to the following race:
Thread A is trying to allocate an extent buffer at bytenr X, with 4
4K pages, meanwhile thread B is trying to release the page at X + 4K
(the second page of the extent buffer at X).
Thread A | Thread B
-----------------------------------+-------------------------------------
| btree_release_folio()
| | This is for the page at X + 4K,
| | Not page X.
| |
alloc_extent_buffer() | |- release_extent_buffer()
|- filemap_add_folio() for the | | |- atomic_dec_and_test(eb->refs)
| page at bytenr X (the first | | |
| page). | | |
| Which returned -EEXIST. | | |
| | | |
|- filemap_lock_folio() | | |
| Returned the first page locked. | | |
| | | |
|- grab_extent_buffer() | | |
| |- atomic_inc_not_zero() | | |
| | Returned false | | |
| |- folio_detach_private() | | |- folio_detach_private() for X
| |- folio_test_private() | | |- folio_test_private()
| Returned true | | | Returned true
|- folio_put() | |- folio_put()
Now there are two puts on the same folio at folio X, leading to refcount
underflow of the folio X, and eventually causing the BUG_ON() on the
page->mapping.
The condition is not that easy to hit:

- The release must be triggered for a middle page of an eb.
  If the release is on the first page of an eb, the page lock would
  kick in and prevent the race.

- folio_detach_private() has a very small race window.
  It's only between folio_test_private() and folio_clear_private().

That's exactly the race that mapping->i_private_lock is used to prevent,
and commit 09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to
allocate-then-attach method") screwed that up.
At that time, I thought the page lock would kick in as
filemap_release_folio() also requires the page to be locked, but forgot
that filemap_release_folio() only locks one page, not all pages of an
extent buffer.
[FIX]
Move all the code requiring i_private_lock into
attach_eb_folio_to_filemap(), so that everything is done with proper
lock protection.
Furthermore to prevent future problems, add an extra
lockdep_assert_locked() to ensure we're holding the proper lock.
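A hedged sketch of the shape of the fix, using the generic lockdep_assert_held() helper and eliding the surrounding code:

    /* inside attach_eb_folio_to_filemap(): */
    lockdep_assert_held(&mapping->i_private_lock);   /* catch unlocked callers */
    /* the filemap insertion, the grab_extent_buffer() fallback and the
     * folio::private attach/detach now all run under this lock */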
A reproducer that is able to hit the race (it takes a few minutes with
instrumented code inserting delays into alloc_extent_buffer()):
#!/bin/sh
drop_caches () {
    while(true); do
        echo 3 > /proc/sys/vm/drop_caches
        echo 1 > /proc/sys/vm/compact_memory
    done
}
run_tar () {
    while(true); do
        for x in `seq 1 80` ; do
            tar cf /dev/zero /mnt > /dev/null &
        done
        wait
    done
}
mkfs.btrfs -f -d single -m single
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix crash on racing fsync and size-extending write into prealloc
We have been seeing crashes on duplicate keys in
btrfs_set_item_key_safe():
BTRFS critical (device vdb): slot 4 key (450 108 8192) new key (450 108 8192)
------------[ cut here ]------------
kernel BUG at fs/btrfs/ctree.c:2620!
invalid opcode: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 3139 Comm: xfs_io Kdump: loaded Not tainted 6.9.0 #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
RIP: 0010:btrfs_set_item_key_safe+0x11f/0x290 [btrfs]
With the following stack trace:
#0 btrfs_set_item_key_safe (fs/btrfs/ctree.c:2620:4)
#1 btrfs_drop_extents (fs/btrfs/file.c:411:4)
#2 log_one_extent (fs/btrfs/tree-log.c:4732:9)
#3 btrfs_log_changed_extents (fs/btrfs/tree-log.c:4955:9)
#4 btrfs_log_inode (fs/btrfs/tree-log.c:6626:9)
#5 btrfs_log_inode_parent (fs/btrfs/tree-log.c:7070:8)
#6 btrfs_log_dentry_safe (fs/btrfs/tree-log.c:7171:8)
#7 btrfs_sync_file (fs/btrfs/file.c:1933:8)
#8 vfs_fsync_range (fs/sync.c:188:9)
#9 vfs_fsync (fs/sync.c:202:9)
#10 do_fsync (fs/sync.c:212:9)
#11 __do_sys_fdatasync (fs/sync.c:225:9)
#12 __se_sys_fdatasync (fs/sync.c:223:1)
#13 __x64_sys_fdatasync (fs/sync.c:223:1)
#14 do_syscall_x64 (arch/x86/entry/common.c:52:14)
#15 do_syscall_64 (arch/x86/entry/common.c:83:7)
#16 entry_SYSCALL_64+0xaf/0x14c (arch/x86/entry/entry_64.S:121)
So we're logging a changed extent from fsync, which is splitting an
extent in the log tree. But this split part already exists in the tree,
triggering the BUG().
This is the state of the log tree at the time of the crash, dumped with
drgn (https://github.com/osandov/drgn/blob/main/contrib/btrfs_tree.py)
to get more details than btrfs_print_leaf() gives us:
>>> print_extent_buffer(prog.crashed_thread().stack_trace()[0]["eb"])
leaf 33439744 level 0 items 72 generation 9 owner 18446744073709551610
leaf 33439744 flags 0x100000000000000
fs uuid e5bd3946-400c-4223-8923-190ef1f18677
chunk uuid d58cb17e-6d02-494a-829a-18b7d8a399da
item 0 key (450 INODE_ITEM 0) itemoff 16123 itemsize 160
generation 7 transid 9 size 8192 nbytes 8473563889606862198
block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0
sequence 204 flags 0x10(PREALLOC)
atime 1716417703.220000000 (2024-05-22 15:41:43)
ctime 1716417704.983333333 (2024-05-22 15:41:44)
mtime 1716417704.983333333 (2024-05-22 15:41:44)
otime 17592186044416.000000000 (559444-03-08 01:40:16)
item 1 key (450 INODE_REF 256) itemoff 16110 itemsize 13
index 195 namelen 3 name: 193
item 2 key (450 XATTR_ITEM 1640047104) itemoff 16073 itemsize 37
location key (0 UNKNOWN.0 0) type XATTR
transid 7 data_len 1 name_len 6
name: user.a
data a
item 3 key (450 EXTENT_DATA 0) itemoff 16020 itemsize 53
generation 9 type 1 (regular)
extent data disk byte 303144960 nr 12288
extent data offset 0 nr 4096 ram 12288
extent compression 0 (none)
item 4 key (450 EXTENT_DATA 4096) itemoff 15967 itemsize 53
generation 9 type 2 (prealloc)
prealloc data disk byte 303144960 nr 12288
prealloc data offset 4096 nr 8192
item 5 key (450 EXTENT_DATA 8192) itemoff 15914 itemsize 53
generation 9 type 2 (prealloc)
prealloc data disk byte 303144960 nr 12288
prealloc data offset 8192 nr 4096
...
So the real problem happened earlier: notice that items 4 (4k-12k) and 5
(8k-12k) overlap. Both are prealloc extents. Item 4 straddles i_size and
item 5 starts at i_size.
Here is the state of
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state
Pass the already obtained vlan group pointer to br_mst_vlan_set_state()
instead of dereferencing it again. Each caller has already correctly
dereferenced it for their context. This change is required for the
following suspicious RCU dereference fix. No functional changes
intended.
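A sketch of the signature change, per the description (the old body re-derived the group from the port):

    /* before: the helper re-derives the vlan group itself */
    static int br_mst_vlan_set_state(struct net_bridge_port *p,
                                     struct net_bridge_vlan *v, u8 state)
    {
            struct net_bridge_vlan_group *vg = nbp_vlan_group(p);
            /* ... */
    }

    /* after: callers pass the vlan group they already looked up */
    static int br_mst_vlan_set_state(struct net_bridge_vlan_group *vg,
                                     struct net_bridge_vlan *v, u8 state)
    {
            /* ... */
    }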