| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
hfsplus: return error when node already exists in hfs_bnode_create
When hfs_bnode_create() finds that a node is already hashed (which should
not happen in normal operation), it currently returns the existing node
without incrementing its reference count. This causes a reference count
inconsistency that leads to a kernel panic when the node is later freed
in hfs_bnode_put():
kernel BUG at fs/hfsplus/bnode.c:676!
BUG_ON(!atomic_read(&node->refcnt))
This scenario can occur when hfs_bmap_alloc() attempts to allocate a node
that is already in use (e.g., when node 0's bitmap bit is incorrectly
unset), or due to filesystem corruption.
Returning an existing node from a create path is not normal operation.
Fix this by returning ERR_PTR(-EEXIST) instead of the node when it's
already hashed. This properly signals the error condition to callers,
which already check for IS_ERR() return values. |
| In the Linux kernel, the following vulnerability has been resolved:
spi: imx: fix use-after-free on unbind
The SPI subsystem frees the controller and any subsystem allocated
driver data as part of deregistration (unless the allocation is device
managed).
Take another reference before deregistering the controller so that the
driver data is not freed until the driver is done with it. |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: sd: fix missing put_disk() when device_add(&disk_dev) fails
If device_add(&sdkp->disk_dev) fails, put_device() runs
scsi_disk_release(), which frees the scsi_disk but leaves the gendisk
referenced. The device_add_disk() error path in sd_probe() calls
put_disk(gd); call put_disk(gd) here to mirror that cleanup. |
| In the Linux kernel, the following vulnerability has been resolved:
hwmon: (pt5161l) Fix bugs in pt5161l_read_block_data()
Fix two bugs in pt5161l_read_block_data():
1. Buffer overrun: The local buffer rbuf is declared as u8 rbuf[24],
but i2c_smbus_read_block_data() can return up to
I2C_SMBUS_BLOCK_MAX (32) bytes. The i2c-core copies the data into
the caller's buffer before the return value can be checked, so
the post-read length validation does not prevent a stack overrun
if a device returns more than 24 bytes. Resize the buffer to
I2C_SMBUS_BLOCK_MAX.
2. Unexpected positive return on length mismatch: When all three
retries are exhausted because the device returns data with an
unexpected length, i2c_smbus_read_block_data() returns a positive
byte count. The function returns this directly, and callers treat
any non-negative return as success, processing stale or incomplete
buffer contents. Return -EIO when retries are exhausted with a
positive return value, preserving the negative error code on I2C
failure. |
| In the Linux kernel, the following vulnerability has been resolved:
zram: do not forget to endio for partial discard requests
As reported by Qu Wenruo and Avinesh Kumar, the following
getconf PAGESIZE
65536
blkdiscard -p 4k /dev/zram0
takes literally forever to complete. zram doesn't support partial
discards and just returns immediately w/o doing any discard work in such
cases. The problem is that we forget to endio on our way out, so
blkdiscard sleeps forever in submit_bio_wait(). Fix this by jumping to
end_bio label, which does bio_endio(). |
| In the Linux kernel, the following vulnerability has been resolved:
ext4: fix bounds check in check_xattrs() to prevent out-of-bounds access
The bounds check for the next xattr entry in check_xattrs() uses
(void *)next >= end, which allows next to point within sizeof(u32)
bytes of end. On the next loop iteration, IS_LAST_ENTRY() reads 4
bytes via *(__u32 *)(entry), which can overrun the valid xattr region.
For example, if next lands at end - 1, the check passes since
next < end, but IS_LAST_ENTRY() reads 4 bytes starting at end - 1,
accessing 3 bytes beyond the valid region.
Fix this by changing the check to (void *)next + sizeof(u32) > end,
ensuring there is always enough space for the IS_LAST_ENTRY() read
on the subsequent iteration. |
| In the Linux kernel, the following vulnerability has been resolved:
SUNRPC: auth_gss: fix memory leaks in XDR decoding error paths
The gssx_dec_ctx(), gssx_dec_status(), and gssx_dec_name()
functions allocate memory via gssx_dec_buffer(), which calls
kmemdup(). When a subsequent decode operation fails, these
functions return immediately without freeing previously
allocated buffers, causing memory leaks.
The leak in gssx_dec_ctx() is particularly relevant because
the caller (gssp_accept_sec_context_upcall) initializes several
buffer length fields to non-zero values, resulting in memory
allocation:
struct gssx_ctx rctxh = {
.exported_context_token.len = GSSX_max_output_handle_sz,
.mech.len = GSS_OID_MAX_LEN,
.src_name.display_name.len = GSSX_max_princ_sz,
.targ_name.display_name.len = GSSX_max_princ_sz
};
If, for example, gssx_dec_name() succeeds for src_name but
fails for targ_name, the memory allocated for
exported_context_token, mech, and src_name.display_name
remains unreferenced and cannot be reclaimed.
Add error handling with goto-based cleanup to free any
previously allocated buffers before returning an error. |
| In the Linux kernel, the following vulnerability has been resolved:
crypto: ccree - fix a memory leak in cc_mac_digest()
Add cc_unmap_result() if cc_map_hash_request_final()
fails to prevent potential memory leak. |
| In the Linux kernel, the following vulnerability has been resolved:
LoongArch: Add spectre boundry for syscall dispatch table
The LoongArch syscall number is directly controlled by userspace, but
does not have a array_index_nospec() boundry to prevent access past the
syscall function pointer tables. |
| In the Linux kernel, the following vulnerability has been resolved:
RDMA/rxe: Fix race condition in QP timer handlers
I encontered the following warning:
WARNING: drivers/infiniband/sw/rxe/rxe_task.c:249 at rxe_sched_task+0x1c8/0x238 [rdma_rxe], CPU#0: swapper/0/0
...
libsha1 [last unloaded: ip6_udp_tunnel]
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G C 6.19.0-rc5-64k-v8+ #37 PREEMPT
Tainted: [C]=CRAP
Hardware name: Raspberry Pi 4 Model B Rev 1.2
Call trace:
rxe_sched_task+0x1c8/0x238 [rdma_rxe] (P)
retransmit_timer+0x130/0x188 [rdma_rxe]
call_timer_fn+0x68/0x4d0
__run_timers+0x630/0x888
...
WARNING: drivers/infiniband/sw/rxe/rxe_task.c:38 at rxe_sched_task+0x1c0/0x238 [rdma_rxe], CPU#0: swapper/0/0
...
WARNING: drivers/infiniband/sw/rxe/rxe_task.c:111 at do_work+0x488/0x5c8 [rdma_rxe], CPU#3: kworker/u17:4/93400
...
refcount_t: underflow; use-after-free.
WARNING: lib/refcount.c:28 at refcount_warn_saturate+0x138/0x1a0, CPU#3: kworker/u17:4/93400
The issue is caused by a race condition between retransmit_timer() and
rxe_destroy_qp, leading to the Queue Pair's (QP) reference count dropping
to zero during timer handler execution.
It seems this warning is harmless because rxe_qp_do_cleanup() will flush
all pending timers and requests.
Example of flow causing the issue:
CPU0 CPU1
retransmit_timer() {
spin_lock_irqsave
rxe_destroy_qp()
__rxe_cleanup()
__rxe_put() // qp->ref_count decrease to 0
rxe_qp_do_cleanup() {
if (qp->valid) {
rxe_sched_task() {
WARN_ON(rxe_read(task->qp) <= 0);
}
}
spin_unlock_irqrestore
}
spin_lock_irqsave
qp->valid = 0
spin_unlock_irqrestore
}
Ensure the QP's reference count is maintained and its validity is checked
within the timer callbacks by adding calls to rxe_get(qp) and corresponding
rxe_put(qp) after use. |
| In the Linux kernel, the following vulnerability has been resolved:
ext4: don't cache extent during splitting extent
Caching extents during the splitting process is risky, as it may result
in stale extents remaining in the status tree. Moreover, in most cases,
the corresponding extent block entries are likely already cached before
the split happens, making caching here not particularly useful.
Assume we have an unwritten extent, and then DIO writes the first half.
[UUUUUUUUUUUUUUUU] on-disk extent U: unwritten extent
[UUUUUUUUUUUUUUUU] extent status tree
|<- ->| ----> dio write this range
First, when ext4_split_extent_at() splits this extent, it truncates the
existing extent and then inserts a new one. During this process, this
extent status entry may be shrunk, and calls to ext4_find_extent() and
ext4_cache_extents() may occur, which could potentially insert the
truncated range as a hole into the extent status tree. After the split
is completed, this hole is not replaced with the correct status.
[UUUUUUU|UUUUUUUU] on-disk extent U: unwritten extent
[UUUUUUU|HHHHHHHH] extent status tree H: hole
Then, the outer calling functions will not correct this remaining hole
extent either. Finally, if we perform a delayed buffer write on this
latter part, it will re-insert the delayed extent and cause an error in
space accounting.
In adition, if the unwritten extent cache is not shrunk during the
splitting, ext4_cache_extents() also conflicts with existing extents
when caching extents. In the future, we will add checks when caching
extents, which will trigger a warning. Therefore, Do not cache extents
that are being split. |
| In the Linux kernel, the following vulnerability has been resolved:
net: usb: catc: enable basic endpoint checking
catc_probe() fills three URBs with hardcoded endpoint pipes without
verifying the endpoint descriptors:
- usb_sndbulkpipe(usbdev, 1) and usb_rcvbulkpipe(usbdev, 1) for TX/RX
- usb_rcvintpipe(usbdev, 2) for interrupt status
A malformed USB device can present these endpoints with transfer types
that differ from what the driver assumes.
Add a catc_usb_ep enum for endpoint numbers, replacing magic constants
throughout. Add usb_check_bulk_endpoints() and usb_check_int_endpoints()
calls after usb_set_interface() to verify endpoint types before use,
rejecting devices with mismatched descriptors at probe time.
Similar to
- commit 90b7f2961798 ("net: usb: rtl8150: enable basic endpoint checking")
which fixed the issue in rtl8150. |
| In the Linux kernel, the following vulnerability has been resolved:
crypto: starfive - Fix memory leak in starfive_aes_aead_do_one_req()
The starfive_aes_aead_do_one_req() function allocates rctx->adata with
kzalloc() but fails to free it if sg_copy_to_buffer() or
starfive_aes_hw_init() fails, which lead to memory leaks.
Since rctx->adata is unconditionally freed after the write_adata
operations, ensure consistent cleanup by freeing the allocation in these
earlier error paths as well.
Compile tested only. Issue found using a prototype static analysis tool
and code review. |
| In the Linux kernel, the following vulnerability has been resolved:
md/md-llbitmap: fix percpu_ref not resurrected on suspend timeout
When llbitmap_suspend_timeout() times out waiting for percpu_ref to
become zero, it returns -ETIMEDOUT without resurrecting the percpu_ref.
The caller (md_llbitmap_daemon_fn) then continues to the next page
without calling llbitmap_resume(), leaving the percpu_ref in a killed
state permanently.
Fix this by resurrecting the percpu_ref before returning the error,
ensuring the page control structure remains usable for subsequent
operations. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: clean up the amdgpu_cs_parser_bos
In low memory conditions, kmalloc can fail. In such conditions
unlock the mutex for a clean exit.
We do not need to amdgpu_bo_list_put as it's been handled in the
amdgpu_cs_parser_fini. |
| In the Linux kernel, the following vulnerability has been resolved:
smack: /smack/doi: accept previously used values
Writing to /smack/doi a value that has ever been
written there in the past disables networking for
non-ambient labels.
E.g.
# cat /smack/doi
3
# netlabelctl -p cipso list
Configured CIPSO mappings (1)
DOI value : 3
mapping type : PASS_THROUGH
# netlabelctl -p map list
Configured NetLabel domain mappings (3)
domain: "_" (IPv4)
protocol: UNLABELED
domain: DEFAULT (IPv4)
protocol: CIPSO, DOI = 3
domain: DEFAULT (IPv6)
protocol: UNLABELED
# cat /smack/ambient
_
# cat /proc/$$/attr/smack/current
_
# ping -c1 10.1.95.12
64 bytes from 10.1.95.12: icmp_seq=1 ttl=64 time=0.964 ms
# echo foo >/proc/$$/attr/smack/current
# ping -c1 10.1.95.12
64 bytes from 10.1.95.12: icmp_seq=1 ttl=64 time=0.956 ms
unknown option 86
# echo 4 >/smack/doi
# echo 3 >/smack/doi
!> [ 214.050395] smk_cipso_doi:691 cipso add rc = -17
# echo 3 >/smack/doi
!> [ 249.402261] smk_cipso_doi:678 remove rc = -2
!> [ 249.402261] smk_cipso_doi:691 cipso add rc = -17
# ping -c1 10.1.95.12
!!> ping: 10.1.95.12: Address family for hostname not supported
# echo _ >/proc/$$/attr/smack/current
# ping -c1 10.1.95.12
64 bytes from 10.1.95.12: icmp_seq=1 ttl=64 time=0.617 ms
This happens because Smack keeps decommissioned DOIs,
fails to re-add them, and consequently refuses to add
the “default” domain map:
# netlabelctl -p cipso list
Configured CIPSO mappings (2)
DOI value : 3
mapping type : PASS_THROUGH
DOI value : 4
mapping type : PASS_THROUGH
# netlabelctl -p map list
Configured NetLabel domain mappings (2)
domain: "_" (IPv4)
protocol: UNLABELED
!> (no ipv4 map for default domain here)
domain: DEFAULT (IPv6)
protocol: UNLABELED
Fix by clearing decommissioned DOI definitions and
serializing concurrent DOI updates with a new lock.
Also:
- allow /smack/doi to live unconfigured, since
adding a map (netlbl_cfg_cipsov4_map_add) may fail.
CIPSO_V4_DOI_UNKNOWN(0) indicates the unconfigured DOI
- add new DOI before removing the old default map,
so the old map remains if the add fails
(2008-02-04, Casey Schaufler) |
| In the Linux kernel, the following vulnerability has been resolved:
tpm: st33zp24: Fix missing cleanup on get_burstcount() error
get_burstcount() can return -EBUSY on timeout. When this happens,
st33zp24_send() returns directly without releasing the locality
acquired earlier.
Use goto out_err to ensure proper cleanup when get_burstcount() fails. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/amdkfd: Fix watch_id bounds checking in debug address watch v2
The address watch clear code receives watch_id as an unsigned value
(u32), but some helper functions were using a signed int and checked
bits by shifting with watch_id.
If a very large watch_id is passed from userspace, it can be converted
to a negative value. This can cause invalid shifts and may access
memory outside the watch_points array.
drm/amdkfd: Fix watch_id bounds checking in debug address watch v2
Fix this by checking that watch_id is within MAX_WATCH_ADDRESSES before
using it. Also use BIT(watch_id) to test and clear bits safely.
This keeps the behavior unchanged for valid watch IDs and avoids
undefined behavior for invalid ones.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_debug.c:448
kfd_dbg_trap_clear_dev_address_watch() error: buffer overflow
'pdd->watch_points' 4 <= u32max user_rl='0-3,2147483648-u32max' uncapped
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_debug.c
433 int kfd_dbg_trap_clear_dev_address_watch(struct kfd_process_device *pdd,
434 uint32_t watch_id)
435 {
436 int r;
437
438 if (!kfd_dbg_owns_dev_watch_id(pdd, watch_id))
kfd_dbg_owns_dev_watch_id() doesn't check for negative values so if
watch_id is larger than INT_MAX it leads to a buffer overflow.
(Negative shifts are undefined).
439 return -EINVAL;
440
441 if (!pdd->dev->kfd->shared_resources.enable_mes) {
442 r = debug_lock_and_unmap(pdd->dev->dqm);
443 if (r)
444 return r;
445 }
446
447 amdgpu_gfx_off_ctrl(pdd->dev->adev, false);
--> 448 pdd->watch_points[watch_id] = pdd->dev->kfd2kgd->clear_address_watch(
449 pdd->dev->adev,
450 watch_id);
v2: (as per, Jonathan Kim)
- Add early watch_id >= MAX_WATCH_ADDRESSES validation in the set path to
match the clear path.
- Drop the redundant bounds check in kfd_dbg_owns_dev_watch_id(). |
| In the Linux kernel, the following vulnerability has been resolved:
net: bridge: mcast: always update mdb_n_entries for vlan contexts
syzbot triggered a warning[1] about the number of mdb entries in a context.
It turned out that there are multiple ways to trigger that warning today
(some got added during the years), the root cause of the problem is that
the increase is done conditionally, and over the years these different
conditions increased so there were new ways to trigger the warning, that is
to do a decrease which wasn't paired with a previous increase.
For example one way to trigger it is with flush:
$ ip l add br0 up type bridge vlan_filtering 1 mcast_snooping 1
$ ip l add dumdum up master br0 type dummy
$ bridge mdb add dev br0 port dumdum grp 239.0.0.1 permanent vid 1
$ ip link set dev br0 down
$ ip link set dev br0 type bridge mcast_vlan_snooping 1
^^^^ this will enable snooping, but will not update mdb_n_entries
because in __br_multicast_enable_port_ctx() we check !netif_running
$ bridge mdb flush dev br0
^^^ this will trigger the warning because it will delete the pg which
we added above, which will try to decrease mdb_n_entries
Fix the problem by removing the conditional increase and always keep the
count up-to-date while the vlan exists. In order to do that we have to
first initialize it on port-vlan context creation, and then always increase
or decrease the value regardless of mcast options. To keep the current
behaviour we have to enforce the mdb limit only if the context is port's or
if the port-vlan's mcast snooping is enabled.
[1]
------------[ cut here ]------------
n == 0
WARNING: net/bridge/br_multicast.c:718 at br_multicast_port_ngroups_dec_one net/bridge/br_multicast.c:718 [inline], CPU#0: syz.4.4607/22043
WARNING: net/bridge/br_multicast.c:718 at br_multicast_port_ngroups_dec net/bridge/br_multicast.c:771 [inline], CPU#0: syz.4.4607/22043
WARNING: net/bridge/br_multicast.c:718 at br_multicast_del_pg+0x1bbe/0x1e20 net/bridge/br_multicast.c:825, CPU#0: syz.4.4607/22043
Modules linked in:
CPU: 0 UID: 0 PID: 22043 Comm: syz.4.4607 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/24/2026
RIP: 0010:br_multicast_port_ngroups_dec_one net/bridge/br_multicast.c:718 [inline]
RIP: 0010:br_multicast_port_ngroups_dec net/bridge/br_multicast.c:771 [inline]
RIP: 0010:br_multicast_del_pg+0x1bbe/0x1e20 net/bridge/br_multicast.c:825
Code: 41 5f 5d e9 04 7a 48 f7 e8 3f 73 5c f7 90 0f 0b 90 e9 cf fd ff ff e8 31 73 5c f7 90 0f 0b 90 e9 16 fd ff ff e8 23 73 5c f7 90 <0f> 0b 90 e9 60 fd ff ff e8 15 73 5c f7 eb 05 e8 0e 73 5c f7 48 8b
RSP: 0018:ffffc9000c207220 EFLAGS: 00010293
RAX: ffffffff8a68042d RBX: ffff88807c6f1800 RCX: ffff888066e90000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffff888066e90000 R09: 000000000000000c
R10: 000000000000000c R11: 0000000000000000 R12: ffff8880303ef800
R13: dffffc0000000000 R14: ffff888050eb11c4 R15: 1ffff1100a1d6238
FS: 00007fa45921b6c0(0000) GS:ffff8881256f5000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa4591f9ff8 CR3: 0000000081df2000 CR4: 00000000003526f0
Call Trace:
<TASK>
br_mdb_flush_pgs net/bridge/br_mdb.c:1525 [inline]
br_mdb_flush net/bridge/br_mdb.c:1544 [inline]
br_mdb_del_bulk+0x5e2/0xb20 net/bridge/br_mdb.c:1561
rtnl_mdb_del+0x48a/0x640 net/core/rtnetlink.c:-1
rtnetlink_rcv_msg+0x77e/0xbe0 net/core/rtnetlink.c:6967
netlink_rcv_skb+0x232/0x4b0 net/netlink/af_netlink.c:2550
netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
netlink_unicast+0x80f/0x9b0 net/netlink/af_netlink.c:1344
netlink_sendmsg+0x813/0xb40 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
____sys_sendmsg+0xa68/0xad0 net/socket.c:2592
___sys_sendmsg+0x2a5/0x360 net/socke
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
bonding: alb: fix UAF in rlb_arp_recv during bond up/down
The ALB RX path may access rx_hashtbl concurrently with bond
teardown. During rapid bond up/down cycles, rlb_deinitialize()
frees rx_hashtbl while RX handlers are still running, leading
to a null pointer dereference detected by KASAN.
However, the root cause is that rlb_arp_recv() can still be accessed
after setting recv_probe to NULL, which is actually a use-after-free
(UAF) issue. That is the reason for using the referenced commit in the
Fixes tag.
[ 214.174138] Oops: general protection fault, probably for non-canonical address 0xdffffc000000001d: 0000 [#1] SMP KASAN PTI
[ 214.186478] KASAN: null-ptr-deref in range [0x00000000000000e8-0x00000000000000ef]
[ 214.194933] CPU: 30 UID: 0 PID: 2375 Comm: ping Kdump: loaded Not tainted 6.19.0-rc8+ #2 PREEMPT(voluntary)
[ 214.205907] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.14.0 01/14/2022
[ 214.214357] RIP: 0010:rlb_arp_recv+0x505/0xab0 [bonding]
[ 214.220320] Code: 0f 85 2b 05 00 00 48 b8 00 00 00 00 00 fc ff df 40 0f b6 ed 48 c1 e5 06 49 03 ad 78 01 00 00 48 8d 7d 28 48 89 fa 48 c1 ea 03 <0f> b6
04 02 84 c0 74 06 0f 8e 12 05 00 00 80 7d 28 00 0f 84 8c 00
[ 214.241280] RSP: 0018:ffffc900073d8870 EFLAGS: 00010206
[ 214.247116] RAX: dffffc0000000000 RBX: ffff888168556822 RCX: ffff88816855681e
[ 214.255082] RDX: 000000000000001d RSI: dffffc0000000000 RDI: 00000000000000e8
[ 214.263048] RBP: 00000000000000c0 R08: 0000000000000002 R09: ffffed11192021c8
[ 214.271013] R10: ffff8888c9010e43 R11: 0000000000000001 R12: 1ffff92000e7b119
[ 214.278978] R13: ffff8888c9010e00 R14: ffff888168556822 R15: ffff888168556810
[ 214.286943] FS: 00007f85d2d9cb80(0000) GS:ffff88886ccb3000(0000) knlGS:0000000000000000
[ 214.295966] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 214.302380] CR2: 00007f0d047b5e34 CR3: 00000008a1c2e002 CR4: 00000000001726f0
[ 214.310347] Call Trace:
[ 214.313070] <IRQ>
[ 214.315318] ? __pfx_rlb_arp_recv+0x10/0x10 [bonding]
[ 214.320975] bond_handle_frame+0x166/0xb60 [bonding]
[ 214.326537] ? __pfx_bond_handle_frame+0x10/0x10 [bonding]
[ 214.332680] __netif_receive_skb_core.constprop.0+0x576/0x2710
[ 214.339199] ? __pfx_arp_process+0x10/0x10
[ 214.343775] ? sched_balance_find_src_group+0x98/0x630
[ 214.349513] ? __pfx___netif_receive_skb_core.constprop.0+0x10/0x10
[ 214.356513] ? arp_rcv+0x307/0x690
[ 214.360311] ? __pfx_arp_rcv+0x10/0x10
[ 214.364499] ? __lock_acquire+0x58c/0xbd0
[ 214.368975] __netif_receive_skb_one_core+0xae/0x1b0
[ 214.374518] ? __pfx___netif_receive_skb_one_core+0x10/0x10
[ 214.380743] ? lock_acquire+0x10b/0x140
[ 214.385026] process_backlog+0x3f1/0x13a0
[ 214.389502] ? process_backlog+0x3aa/0x13a0
[ 214.394174] __napi_poll.constprop.0+0x9f/0x370
[ 214.399233] net_rx_action+0x8c1/0xe60
[ 214.403423] ? __pfx_net_rx_action+0x10/0x10
[ 214.408193] ? lock_acquire.part.0+0xbd/0x260
[ 214.413058] ? sched_clock_cpu+0x6c/0x540
[ 214.417540] ? mark_held_locks+0x40/0x70
[ 214.421920] handle_softirqs+0x1fd/0x860
[ 214.426302] ? __pfx_handle_softirqs+0x10/0x10
[ 214.431264] ? __neigh_event_send+0x2d6/0xf50
[ 214.436131] do_softirq+0xb1/0xf0
[ 214.439830] </IRQ>
The issue is reproducible by repeatedly running
ip link set bond0 up/down while receiving ARP messages, where
rlb_arp_recv() can race with rlb_deinitialize() and dereference
a freed rx_hashtbl entry.
Fix this by setting recv_probe to NULL and then calling
synchronize_net() to wait for any concurrent RX processing to finish.
This ensures that no RX handler can access rx_hashtbl after it is freed
in bond_alb_deinitialize(). |