CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
In the Linux kernel, the following vulnerability has been resolved:
netfilter: ctnetlink: remove refcounting in expectation dumpers
Same pattern as previous patch: do not keep the expectation object
alive via refcount, only store a cookie value and then use that
as the skip hint for dump resumption.
AFAICS this has the same issue as the one resolved in the conntrack
dumper, when we do
if (!refcount_inc_not_zero(&exp->use))
to increment the refcount, there is a chance that exp == last, which
causes a double-increment of the refcount and subsequent memory leak. |
In the Linux kernel, the following vulnerability has been resolved:
ACPI: APEI: send SIGBUS to current task if synchronous memory error not recovered
If a synchronous error is detected as a result of user-space process
triggering a 2-bit uncorrected error, the CPU will take a synchronous
error exception such as Synchronous External Abort (SEA) on Arm64. The
kernel will queue a memory_failure() work which poisons the related
page, unmaps the page, and then sends a SIGBUS to the process, so that
a system wide panic can be avoided.
However, no memory_failure() work will be queued when abnormal
synchronous errors occur. These errors can include situations like
invalid PA, unexpected severity, no memory failure config support,
invalid GUID section, etc. In such a case, the user-space process will
trigger SEA again. This loop can potentially exceed the platform
firmware threshold or even trigger a kernel hard lockup, leading to a
system reboot.
Fix it by performing a force kill if no memory_failure() work is queued
for synchronous errors.
[ rjw: Changelog edits ] |
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: add null check
[WHY]
Prevents null pointer dereferences to enhance function robustness
[HOW]
Adds early null check and return false if invalid. |
In the Linux kernel, the following vulnerability has been resolved:
wifi: ath12k: Decrement TID on RX peer frag setup error handling
Currently, TID is not decremented before peer cleanup, during error
handling path of ath12k_dp_rx_peer_frag_setup(). This could lead to
out-of-bounds access in peer->rx_tid[].
Hence, add a decrement operation for TID, before peer cleanup to
ensures proper cleanup and prevents out-of-bounds access issues when
the RX peer frag setup fails.
Found during code review. Compile tested only. |
In the Linux kernel, the following vulnerability has been resolved:
usb: core: config: Prevent OOB read in SS endpoint companion parsing
usb_parse_ss_endpoint_companion() checks descriptor type before length,
enabling a potentially odd read outside of the buffer size.
Fix this up by checking the size first before looking at any of the
fields in the descriptor. |
In the Linux kernel, the following vulnerability has been resolved:
btrfs: qgroup: fix race between quota disable and quota rescan ioctl
There's a race between a task disabling quotas and another running the
rescan ioctl that can result in a use-after-free of qgroup records from
the fs_info->qgroup_tree rbtree.
This happens as follows:
1) Task A enters btrfs_ioctl_quota_rescan() -> btrfs_qgroup_rescan();
2) Task B enters btrfs_quota_disable() and calls
btrfs_qgroup_wait_for_completion(), which does nothing because at that
point fs_info->qgroup_rescan_running is false (it wasn't set yet by
task A);
3) Task B calls btrfs_free_qgroup_config() which starts freeing qgroups
from fs_info->qgroup_tree without taking the lock fs_info->qgroup_lock;
4) Task A enters qgroup_rescan_zero_tracking() which starts iterating
the fs_info->qgroup_tree tree while holding fs_info->qgroup_lock,
but task B is freeing qgroup records from that tree without holding
the lock, resulting in a use-after-free.
Fix this by taking fs_info->qgroup_lock at btrfs_free_qgroup_config().
Also at btrfs_qgroup_rescan() don't start the rescan worker if quotas
were already disabled. |
In the Linux kernel, the following vulnerability has been resolved:
RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
Ever since commit c2ff29e99a76 ("siw: Inline do_tcp_sendpages()"),
we have been doing this:
static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
size_t size)
[...]
/* Calculate the number of bytes we need to push, for this page
* specifically */
size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
/* If we can't splice it, then copy it in, as normal */
if (!sendpage_ok(page[i]))
msg.msg_flags &= ~MSG_SPLICE_PAGES;
/* Set the bvec pointing to the page, with len $bytes */
bvec_set_page(&bvec, page[i], bytes, offset);
/* Set the iter to $size, aka the size of the whole sendpages (!!!) */
iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
try_page_again:
lock_sock(sk);
/* Sendmsg with $size size (!!!) */
rv = tcp_sendmsg_locked(sk, &msg, size);
This means we've been sending oversized iov_iters and tcp_sendmsg calls
for a while. This has a been a benign bug because sendpage_ok() always
returned true. With the recent slab allocator changes being slowly
introduced into next (that disallow sendpage on large kmalloc
allocations), we have recently hit out-of-bounds crashes, due to slight
differences in iov_iter behavior between the MSG_SPLICE_PAGES and
"regular" copy paths:
(MSG_SPLICE_PAGES)
skb_splice_from_iter
iov_iter_extract_pages
iov_iter_extract_bvec_pages
uses i->nr_segs to correctly stop in its tracks before OoB'ing everywhere
skb_splice_from_iter gets a "short" read
(!MSG_SPLICE_PAGES)
skb_copy_to_page_nocache copy=iov_iter_count
[...]
copy_from_iter
/* this doesn't help */
if (unlikely(iter->count < len))
len = iter->count;
iterate_bvec
... and we run off the bvecs
Fix this by properly setting the iov_iter's byte count, plus sending the
correct byte count to tcp_sendmsg_locked. |
In the Linux kernel, the following vulnerability has been resolved:
ALSA: usb-audio: Validate UAC3 cluster segment descriptors
UAC3 class segment descriptors need to be verified whether their sizes
match with the declared lengths and whether they fit with the
allocated buffer sizes, too. Otherwise malicious firmware may lead to
the unexpected OOB accesses. |
In the Linux kernel, the following vulnerability has been resolved:
fs: Prevent file descriptor table allocations exceeding INT_MAX
When sysctl_nr_open is set to a very high value (for example, 1073741816
as set by systemd), processes attempting to use file descriptors near
the limit can trigger massive memory allocation attempts that exceed
INT_MAX, resulting in a WARNING in mm/slub.c:
WARNING: CPU: 0 PID: 44 at mm/slub.c:5027 __kvmalloc_node_noprof+0x21a/0x288
This happens because kvmalloc_array() and kvmalloc() check if the
requested size exceeds INT_MAX and emit a warning when the allocation is
not flagged with __GFP_NOWARN.
Specifically, when nr_open is set to 1073741816 (0x3ffffff8) and a
process calls dup2(oldfd, 1073741880), the kernel attempts to allocate:
- File descriptor array: 1073741880 * 8 bytes = 8,589,935,040 bytes
- Multiple bitmaps: ~400MB
- Total allocation size: > 8GB (exceeding INT_MAX = 2,147,483,647)
Reproducer:
1. Set /proc/sys/fs/nr_open to 1073741816:
# echo 1073741816 > /proc/sys/fs/nr_open
2. Run a program that uses a high file descriptor:
#include <unistd.h>
#include <sys/resource.h>
int main() {
struct rlimit rlim = {1073741824, 1073741824};
setrlimit(RLIMIT_NOFILE, &rlim);
dup2(2, 1073741880); // Triggers the warning
return 0;
}
3. Observe WARNING in dmesg at mm/slub.c:5027
systemd commit a8b627a introduced automatic bumping of fs.nr_open to the
maximum possible value. The rationale was that systems with memory
control groups (memcg) no longer need separate file descriptor limits
since memory is properly accounted. However, this change overlooked
that:
1. The kernel's allocation functions still enforce INT_MAX as a maximum
size regardless of memcg accounting
2. Programs and tests that legitimately test file descriptor limits can
inadvertently trigger massive allocations
3. The resulting allocations (>8GB) are impractical and will always fail
systemd's algorithm starts with INT_MAX and keeps halving the value
until the kernel accepts it. On most systems, this results in nr_open
being set to 1073741816 (0x3ffffff8), which is just under 1GB of file
descriptors.
While processes rarely use file descriptors near this limit in normal
operation, certain selftests (like
tools/testing/selftests/core/unshare_test.c) and programs that test file
descriptor limits can trigger this issue.
Fix this by adding a check in alloc_fdtable() to ensure the requested
allocation size does not exceed INT_MAX. This causes the operation to
fail with -EMFILE instead of triggering a kernel warning and avoids the
impractical >8GB memory allocation request. |
In the Linux kernel, the following vulnerability has been resolved:
mm/smaps: fix race between smaps_hugetlb_range and migration
smaps_hugetlb_range() handles the pte without holdling ptl, and may be
concurrenct with migration, leaing to BUG_ON in pfn_swap_entry_to_page().
The race is as follows.
smaps_hugetlb_range migrate_pages
huge_ptep_get
remove_migration_ptes
folio_unlock
pfn_swap_entry_folio
BUG_ON
To fix it, hold ptl lock in smaps_hugetlb_range(). |
In the Linux kernel, the following vulnerability has been resolved:
gfs2: Set .migrate_folio in gfs2_{rgrp,meta}_aops
Clears up the warning added in 7ee3647243e5 ("migrate: Remove call to
->writepage") that occurs in various xfstests, causing "something found
in dmesg" failures.
[ 341.136573] gfs2_meta_aops does not implement migrate_folio
[ 341.136953] WARNING: CPU: 1 PID: 36 at mm/migrate.c:944 move_to_new_folio+0x2f8/0x300 |
In the Linux kernel, the following vulnerability has been resolved:
ARM: rockchip: fix kernel hang during smp initialization
In order to bring up secondary CPUs main CPU write trampoline
code to SRAM. The trampoline code is written while secondary
CPUs are powered on (at least that true for RK3188 CPU).
Sometimes that leads to kernel hang. Probably because secondary
CPU execute trampoline code while kernel doesn't expect.
The patch moves SRAM initialization step to the point where all
secondary CPUs are powered down.
That fixes rarely hangs on RK3188:
[ 0.091568] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[ 0.091996] rockchip_smp_prepare_cpus: ncores 4 |
In the Linux kernel, the following vulnerability has been resolved:
ALSA: hda/ca0132: Fix buffer overflow in add_tuning_control
The 'sprintf' call in 'add_tuning_control' may exceed the 44-byte
buffer if either string argument is too long. This triggers a compiler
warning.
Replaced 'sprintf' with 'snprintf' to limit string lengths to prevent
overflow. |
In the Linux kernel, the following vulnerability has been resolved:
wifi: ath12k: Correct tid cleanup when tid setup fails
Currently, if any error occurs during ath12k_dp_rx_peer_tid_setup(),
the tid value is already incremented, even though the corresponding
TID is not actually allocated. Proceed to
ath12k_dp_rx_peer_tid_delete() starting from unallocated tid,
which might leads to freeing unallocated TID and cause potential
crash or out-of-bounds access.
Hence, fix by correctly decrementing tid before cleanup to match only
the successfully allocated TIDs.
Also, remove tid-- from failure case of ath12k_dp_rx_peer_frag_setup(),
as decrementing the tid before cleanup in loop will take care of this.
Compile tested only. |
In the Linux kernel, the following vulnerability has been resolved:
rcu: Protect ->defer_qs_iw_pending from data race
On kernels built with CONFIG_IRQ_WORK=y, when rcu_read_unlock() is
invoked within an interrupts-disabled region of code [1], it will invoke
rcu_read_unlock_special(), which uses an irq-work handler to force the
system to notice when the RCU read-side critical section actually ends.
That end won't happen until interrupts are enabled at the soonest.
In some kernels, such as those booted with rcutree.use_softirq=y, the
irq-work handler is used unconditionally.
The per-CPU rcu_data structure's ->defer_qs_iw_pending field is
updated by the irq-work handler and is both read and updated by
rcu_read_unlock_special(). This resulted in the following KCSAN splat:
------------------------------------------------------------------------
BUG: KCSAN: data-race in rcu_preempt_deferred_qs_handler / rcu_read_unlock_special
read to 0xffff96b95f42d8d8 of 1 bytes by task 90 on cpu 8:
rcu_read_unlock_special+0x175/0x260
__rcu_read_unlock+0x92/0xa0
rt_spin_unlock+0x9b/0xc0
__local_bh_enable+0x10d/0x170
__local_bh_enable_ip+0xfb/0x150
rcu_do_batch+0x595/0xc40
rcu_cpu_kthread+0x4e9/0x830
smpboot_thread_fn+0x24d/0x3b0
kthread+0x3bd/0x410
ret_from_fork+0x35/0x40
ret_from_fork_asm+0x1a/0x30
write to 0xffff96b95f42d8d8 of 1 bytes by task 88 on cpu 8:
rcu_preempt_deferred_qs_handler+0x1e/0x30
irq_work_single+0xaf/0x160
run_irq_workd+0x91/0xc0
smpboot_thread_fn+0x24d/0x3b0
kthread+0x3bd/0x410
ret_from_fork+0x35/0x40
ret_from_fork_asm+0x1a/0x30
no locks held by irq_work/8/88.
irq event stamp: 200272
hardirqs last enabled at (200272): [<ffffffffb0f56121>] finish_task_switch+0x131/0x320
hardirqs last disabled at (200271): [<ffffffffb25c7859>] __schedule+0x129/0xd70
softirqs last enabled at (0): [<ffffffffb0ee093f>] copy_process+0x4df/0x1cc0
softirqs last disabled at (0): [<0000000000000000>] 0x0
------------------------------------------------------------------------
The problem is that irq-work handlers run with interrupts enabled, which
means that rcu_preempt_deferred_qs_handler() could be interrupted,
and that interrupt handler might contain an RCU read-side critical
section, which might invoke rcu_read_unlock_special(). In the strict
KCSAN mode of operation used by RCU, this constitutes a data race on
the ->defer_qs_iw_pending field.
This commit therefore disables interrupts across the portion of the
rcu_preempt_deferred_qs_handler() that updates the ->defer_qs_iw_pending
field. This suffices because this handler is not a fast path. |
In the Linux kernel, the following vulnerability has been resolved:
bpf: Forget ranges when refining tnum after JSET
Syzbot reported a kernel warning due to a range invariant violation on
the following BPF program.
0: call bpf_get_netns_cookie
1: if r0 == 0 goto <exit>
2: if r0 & Oxffffffff goto <exit>
The issue is on the path where we fall through both jumps.
That path is unreachable at runtime: after insn 1, we know r0 != 0, but
with the sign extension on the jset, we would only fallthrough insn 2
if r0 == 0. Unfortunately, is_branch_taken() isn't currently able to
figure this out, so the verifier walks all branches. The verifier then
refines the register bounds using the second condition and we end
up with inconsistent bounds on this unreachable path:
1: if r0 == 0 goto <exit>
r0: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0xffffffffffffffff)
2: if r0 & 0xffffffff goto <exit>
r0 before reg_bounds_sync: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0)
r0 after reg_bounds_sync: u64=[0x1, 0] var_off=(0, 0)
Improving the range refinement for JSET to cover all cases is tricky. We
also don't expect many users to rely on JSET given LLVM doesn't generate
those instructions. So instead of improving the range refinement for
JSETs, Eduard suggested we forget the ranges whenever we're narrowing
tnums after a JSET. This patch implements that approach. |
In the Linux kernel, the following vulnerability has been resolved:
drm/msm: Add error handling for krealloc in metadata setup
Function msm_ioctl_gem_info_set_metadata() now checks for krealloc
failure and returns -ENOMEM, avoiding potential NULL pointer dereference.
Explicitly avoids __GFP_NOFAIL due to deadlock risks and allocation constraints.
Patchwork: https://patchwork.freedesktop.org/patch/661235/ |
In the Linux kernel, the following vulnerability has been resolved:
wifi: ath10k: shutdown driver when hardware is unreliable
In rare cases, ath10k may lose connection with the PCIe bus due to
some unknown reasons, which could further lead to system crashes during
resuming due to watchdog timeout:
ath10k_pci 0000:01:00.0: wmi command 20486 timeout, restarting hardware
ath10k_pci 0000:01:00.0: already restarting
ath10k_pci 0000:01:00.0: failed to stop WMI vdev 0: -11
ath10k_pci 0000:01:00.0: failed to stop vdev 0: -11
ieee80211 phy0: PM: **** DPM device timeout ****
Call Trace:
panic+0x125/0x315
dpm_watchdog_set+0x54/0x54
dpm_watchdog_handler+0x57/0x57
call_timer_fn+0x31/0x13c
At this point, all WMI commands will timeout and attempt to restart
device. So set a threshold for consecutive restart failures. If the
threshold is exceeded, consider the hardware is unreliable and all
ath10k operations should be skipped to avoid system crash.
fail_cont_count and pending_recovery are atomic variables, and
do not involve complex conditional logic. Therefore, even if recovery
check and reconfig complete are executed concurrently, the recovery
mechanism will not be broken.
Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1 |
In the Linux kernel, the following vulnerability has been resolved:
rcutorture: Fix rcutorture_one_extend_check() splat in RT kernels
For built with CONFIG_PREEMPT_RT=y kernels, running rcutorture
tests resulted in the following splat:
[ 68.797425] rcutorture_one_extend_check during change: Current 0x1 To add 0x1 To remove 0x0 preempt_count() 0x0
[ 68.797533] WARNING: CPU: 2 PID: 512 at kernel/rcu/rcutorture.c:1993 rcutorture_one_extend_check+0x419/0x560 [rcutorture]
[ 68.797601] Call Trace:
[ 68.797602] <TASK>
[ 68.797619] ? lockdep_softirqs_off+0xa5/0x160
[ 68.797631] rcutorture_one_extend+0x18e/0xcc0 [rcutorture 2466dbd2ff34dbaa36049cb323a80c3306ac997c]
[ 68.797646] ? local_clock+0x19/0x40
[ 68.797659] rcu_torture_one_read+0xf0/0x280 [rcutorture 2466dbd2ff34dbaa36049cb323a80c3306ac997c]
[ 68.797678] ? __pfx_rcu_torture_one_read+0x10/0x10 [rcutorture 2466dbd2ff34dbaa36049cb323a80c3306ac997c]
[ 68.797804] ? __pfx_rcu_torture_timer+0x10/0x10 [rcutorture 2466dbd2ff34dbaa36049cb323a80c3306ac997c]
[ 68.797815] rcu-torture: rcu_torture_reader task started
[ 68.797824] rcu-torture: Creating rcu_torture_reader task
[ 68.797824] rcu_torture_reader+0x238/0x580 [rcutorture 2466dbd2ff34dbaa36049cb323a80c3306ac997c]
[ 68.797836] ? kvm_sched_clock_read+0x15/0x30
Disable BH does not change the SOFTIRQ corresponding bits in
preempt_count() for RT kernels, this commit therefore use
softirq_count() to check the if BH is disabled. |
In the Linux kernel, the following vulnerability has been resolved:
rcu: Fix rcu_read_unlock() deadloop due to IRQ work
During rcu_read_unlock_special(), if this happens during irq_exit(), we
can lockup if an IPI is issued. This is because the IPI itself triggers
the irq_exit() path causing a recursive lock up.
This is precisely what Xiongfeng found when invoking a BPF program on
the trace_tick_stop() tracepoint As shown in the trace below. Fix by
managing the irq_work state correctly.
irq_exit()
__irq_exit_rcu()
/* in_hardirq() returns false after this */
preempt_count_sub(HARDIRQ_OFFSET)
tick_irq_exit()
tick_nohz_irq_exit()
tick_nohz_stop_sched_tick()
trace_tick_stop() /* a bpf prog is hooked on this trace point */
__bpf_trace_tick_stop()
bpf_trace_run2()
rcu_read_unlock_special()
/* will send a IPI to itself */
irq_work_queue_on(&rdp->defer_qs_iw, rdp->cpu);
A simple reproducer can also be obtained by doing the following in
tick_irq_exit(). It will hang on boot without the patch:
static inline void tick_irq_exit(void)
{
+ rcu_read_lock();
+ WRITE_ONCE(current->rcu_read_unlock_special.b.need_qs, true);
+ rcu_read_unlock();
+
[neeraj: Apply Frederic's suggested fix for PREEMPT_RT] |