CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
In the Linux kernel, the following vulnerability has been resolved:
hwmon: (mlxreg-fan) Return non-zero value when fan current state is enforced from sysfs
Fan speed minimum can be enforced from sysfs. For example, setting
current fan speed to 20 is used to enforce fan speed to be at 100%
speed, 19 - to be not below 90% speed, etcetera. This feature provides
ability to limit fan speed according to some system wise
considerations, like absence of some replaceable units or high system
ambient temperature.
Request for changing fan minimum speed is configuration request and can
be set only through 'sysfs' write procedure. In this situation value of
argument 'state' is above nominal fan speed maximum.
Return non-zero code in this case to avoid
thermal_cooling_device_stats_update() call, because in this case
statistics update violates thermal statistics table range.
The issues is observed in case kernel is configured with option
CONFIG_THERMAL_STATISTICS.
Here is the trace from KASAN:
[ 159.506659] BUG: KASAN: slab-out-of-bounds in thermal_cooling_device_stats_update+0x7d/0xb0
[ 159.516016] Read of size 4 at addr ffff888116163840 by task hw-management.s/7444
[ 159.545625] Call Trace:
[ 159.548366] dump_stack+0x92/0xc1
[ 159.552084] ? thermal_cooling_device_stats_update+0x7d/0xb0
[ 159.635869] thermal_zone_device_update+0x345/0x780
[ 159.688711] thermal_zone_device_set_mode+0x7d/0xc0
[ 159.694174] mlxsw_thermal_modules_init+0x48f/0x590 [mlxsw_core]
[ 159.700972] ? mlxsw_thermal_set_cur_state+0x5a0/0x5a0 [mlxsw_core]
[ 159.731827] mlxsw_thermal_init+0x763/0x880 [mlxsw_core]
[ 160.070233] RIP: 0033:0x7fd995909970
[ 160.074239] Code: 73 01 c3 48 8b 0d 28 d5 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 99 2d 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ..
[ 160.095242] RSP: 002b:00007fff54f5d938 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 160.103722] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007fd995909970
[ 160.111710] RDX: 0000000000000013 RSI: 0000000001906008 RDI: 0000000000000001
[ 160.119699] RBP: 0000000001906008 R08: 00007fd995bc9760 R09: 00007fd996210700
[ 160.127687] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000013
[ 160.135673] R13: 0000000000000001 R14: 00007fd995bc8600 R15: 0000000000000013
[ 160.143671]
[ 160.145338] Allocated by task 2924:
[ 160.149242] kasan_save_stack+0x19/0x40
[ 160.153541] __kasan_kmalloc+0x7f/0xa0
[ 160.157743] __kmalloc+0x1a2/0x2b0
[ 160.161552] thermal_cooling_device_setup_sysfs+0xf9/0x1a0
[ 160.167687] __thermal_cooling_device_register+0x1b5/0x500
[ 160.173833] devm_thermal_of_cooling_device_register+0x60/0xa0
[ 160.180356] mlxreg_fan_probe+0x474/0x5e0 [mlxreg_fan]
[ 160.248140]
[ 160.249807] The buggy address belongs to the object at ffff888116163400
[ 160.249807] which belongs to the cache kmalloc-1k of size 1024
[ 160.263814] The buggy address is located 64 bytes to the right of
[ 160.263814] 1024-byte region [ffff888116163400, ffff888116163800)
[ 160.277536] The buggy address belongs to the page:
[ 160.282898] page:0000000012275840 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888116167000 pfn:0x116160
[ 160.294872] head:0000000012275840 order:3 compound_mapcount:0 compound_pincount:0
[ 160.303251] flags: 0x200000000010200(slab|head|node=0|zone=2)
[ 160.309694] raw: 0200000000010200 ffffea00046f7208 ffffea0004928208 ffff88810004dbc0
[ 160.318367] raw: ffff888116167000 00000000000a0006 00000001ffffffff 0000000000000000
[ 160.327033] page dumped because: kasan: bad access detected
[ 160.333270]
[ 160.334937] Memory state around the buggy address:
[ 160.356469] >ffff888116163800: fc .. |
In the Linux kernel, the following vulnerability has been resolved:
RDMA/cma: Fix listener leak in rdma_cma_listen_on_all() failure
If cma_listen_on_all() fails it leaves the per-device ID still on the
listen_list but the state is not set to RDMA_CM_ADDR_BOUND.
When the cmid is eventually destroyed cma_cancel_listens() is not called
due to the wrong state, however the per-device IDs are still holding the
refcount preventing the ID from being destroyed, thus deadlocking:
task:rping state:D stack: 0 pid:19605 ppid: 47036 flags:0x00000084
Call Trace:
__schedule+0x29a/0x780
? free_unref_page_commit+0x9b/0x110
schedule+0x3c/0xa0
schedule_timeout+0x215/0x2b0
? __flush_work+0x19e/0x1e0
wait_for_completion+0x8d/0xf0
_destroy_id+0x144/0x210 [rdma_cm]
ucma_close_id+0x2b/0x40 [rdma_ucm]
__destroy_id+0x93/0x2c0 [rdma_ucm]
? __xa_erase+0x4a/0xa0
ucma_destroy_id+0x9a/0x120 [rdma_ucm]
ucma_write+0xb8/0x130 [rdma_ucm]
vfs_write+0xb4/0x250
ksys_write+0xb5/0xd0
? syscall_trace_enter.isra.19+0x123/0x190
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Ensure that cma_listen_on_all() atomically unwinds its action under the
lock during error. |
In the Linux kernel, the following vulnerability has been resolved:
RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
The FSM can run in a circle allowing rdma_resolve_ip() to be called twice
on the same id_priv. While this cannot happen without going through the
work, it violates the invariant that the same address resolution
background request cannot be active twice.
CPU 1 CPU 2
rdma_resolve_addr():
RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY
rdma_resolve_ip(addr_handler) #1
process_one_req(): for #1
addr_handler():
RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND
mutex_unlock(&id_priv->handler_mutex);
[.. handler still running ..]
rdma_resolve_addr():
RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY
rdma_resolve_ip(addr_handler)
!! two requests are now on the req_list
rdma_destroy_id():
destroy_id_handler_unlock():
_destroy_id():
cma_cancel_operation():
rdma_addr_cancel()
// process_one_req() self removes it
spin_lock_bh(&lock);
cancel_delayed_work(&req->work);
if (!list_empty(&req->list)) == true
! rdma_addr_cancel() returns after process_on_req #1 is done
kfree(id_priv)
process_one_req(): for #2
addr_handler():
mutex_lock(&id_priv->handler_mutex);
!! Use after free on id_priv
rdma_addr_cancel() expects there to be one req on the list and only
cancels the first one. The self-removal behavior of the work only happens
after the handler has returned. This yields a situations where the
req_list can have two reqs for the same "handle" but rdma_addr_cancel()
only cancels the first one.
The second req remains active beyond rdma_destroy_id() and will
use-after-free id_priv once it inevitably triggers.
Fix this by remembering if the id_priv has called rdma_resolve_ip() and
always cancel before calling it again. This ensures the req_list never
gets more than one item in it and doesn't cost anything in the normal flow
that never uses this strange error path. |
A flaw was found in linux-pam. The pam_namespace module may improperly handle user-controlled paths, allowing local users to exploit symlink attacks and race conditions to elevate their privileges to root. This CVE provides a "complete" fix for CVE-2025-6020. |
The SureForms WordPress plugin before 1.9.1 does not sanitise and escape some parameters when outputing them in the page, which could allow admin and above users to perform Cross-Site Scripting attacks. |
A flaw was found in linux-pam. The module pam_namespace may use access user-controlled paths without proper protection, allowing local users to elevate their privileges to root via multiple symlink attacks and race conditions. |
A vulnerability has been identified in the libarchive library, specifically within the archive_read_format_rar_seek_data() function. This flaw involves an integer overflow that can ultimately lead to a double-free condition. Exploiting a double-free vulnerability can result in memory corruption, enabling an attacker to execute arbitrary code or cause a denial-of-service condition. |
astral-tokio-tar is a tar archive reading/writing library for async Rust. In versions 0.5.3 and earlier of astral-tokio-tar, tar archives may extract outside of their intended destination directory when using the Entry::unpack_in_raw API. Additionally, the Entry::allow_external_symlinks control (which defaults to true) could be bypassed via a pair of symlinks that individually point within the destination but combine to point outside of it. These behaviors could be used individually or combined to bypass the intended security control of limiting extraction to the given directory. This in turn would allow an attacker with a malicious tar archive to perform an arbitrary file write and potentially pivot into code execution. This issue has been patched in version 0.5.4. There is no workaround other than upgrading. |
Http4s is a Scala interface for HTTP services. In versions from 1.0.0-M1 to before 1.0.0-M45 and before 0.23.31, http4s is vulnerable to HTTP Request Smuggling due to improper handling of HTTP trailer section. This vulnerability could enable attackers to bypass front-end servers security controls, launch targeted attacks against active users, and poison web caches. A pre-requisite for exploitation involves the web application being deployed behind a reverse-proxy that forwards trailer headers. This issue has been patched in versions 1.0.0-M45 and 0.23.31. |
OS Command injection vulnerability in D-Link C1 2020-02-21. The sub_47F028 function in jhttpd contains a command injection vulnerability via the HTTP parameter "time". |
Sunshine is a self-hosted game stream host for Moonlight. Prior to version 2025.923.33222, the Windows service SunshineService is installed with an unquoted executable path. If Sunshine is installed in a directory whose name includes a space, the Service Control Manager (SCM) interprets the path incrementally and may execute a malicious binary placed earlier in the search string. This issue has been patched in version 2025.923.33222. |
The CleverControl employee monitoring software (v11.5.1041.6) fails to validate TLS server certificates during the installation process. The installer downloads and executes external components using curl.exe --insecure, enabling a man-in-the-middle attacker to deliver malicious files that are executed with SYSTEM privileges. This can lead to full remote code execution with administrative rights. No patch is available as the vendor has been unresponsive. It is assumed that previous versions are also affected, but this is not confirmed. |
An information disclosure vulnerability exists in multiple WSO2 products due to improper implementation of the enrich mediator. Authenticated users may be able to view unintended business data from other mediation contexts because the internal state is not properly isolated or cleared between executions.
This vulnerability does not impact user credentials or access tokens but may lead to leakage of sensitive business information handled during message flows. |
In the Linux kernel, the following vulnerability has been resolved:
tty: Fix out-of-bound vmalloc access in imageblit
This issue happens when a userspace program does an ioctl
FBIOPUT_VSCREENINFO passing the fb_var_screeninfo struct
containing only the fields xres, yres, and bits_per_pixel
with values.
If this struct is the same as the previous ioctl, the
vc_resize() detects it and doesn't call the resize_screen(),
leaving the fb_var_screeninfo incomplete. And this leads to
the updatescrollmode() calculates a wrong value to
fbcon_display->vrows, which makes the real_y() return a
wrong value of y, and that value, eventually, causes
the imageblit to access an out-of-bound address value.
To solve this issue I made the resize_screen() be called
even if the screen does not need any resizing, so it will
"fix and fill" the fb_var_screeninfo independently. |
In the Linux kernel, the following vulnerability has been resolved:
SUNRPC: Fix RPC client cleaned up the freed pipefs dentries
RPC client pipefs dentries cleanup is in separated rpc_remove_pipedir()
workqueue,which takes care about pipefs superblock locking.
In some special scenarios, when kernel frees the pipefs sb of the
current client and immediately alloctes a new pipefs sb,
rpc_remove_pipedir function would misjudge the existence of pipefs
sb which is not the one it used to hold. As a result,
the rpc_remove_pipedir would clean the released freed pipefs dentries.
To fix this issue, rpc_remove_pipedir should check whether the
current pipefs sb is consistent with the original pipefs sb.
This error can be catched by KASAN:
=========================================================
[ 250.497700] BUG: KASAN: slab-use-after-free in dget_parent+0x195/0x200
[ 250.498315] Read of size 4 at addr ffff88800a2ab804 by task kworker/0:18/106503
[ 250.500549] Workqueue: events rpc_free_client_work
[ 250.501001] Call Trace:
[ 250.502880] kasan_report+0xb6/0xf0
[ 250.503209] ? dget_parent+0x195/0x200
[ 250.503561] dget_parent+0x195/0x200
[ 250.503897] ? __pfx_rpc_clntdir_depopulate+0x10/0x10
[ 250.504384] rpc_rmdir_depopulate+0x1b/0x90
[ 250.504781] rpc_remove_client_dir+0xf5/0x150
[ 250.505195] rpc_free_client_work+0xe4/0x230
[ 250.505598] process_one_work+0x8ee/0x13b0
...
[ 22.039056] Allocated by task 244:
[ 22.039390] kasan_save_stack+0x22/0x50
[ 22.039758] kasan_set_track+0x25/0x30
[ 22.040109] __kasan_slab_alloc+0x59/0x70
[ 22.040487] kmem_cache_alloc_lru+0xf0/0x240
[ 22.040889] __d_alloc+0x31/0x8e0
[ 22.041207] d_alloc+0x44/0x1f0
[ 22.041514] __rpc_lookup_create_exclusive+0x11c/0x140
[ 22.041987] rpc_mkdir_populate.constprop.0+0x5f/0x110
[ 22.042459] rpc_create_client_dir+0x34/0x150
[ 22.042874] rpc_setup_pipedir_sb+0x102/0x1c0
[ 22.043284] rpc_client_register+0x136/0x4e0
[ 22.043689] rpc_new_client+0x911/0x1020
[ 22.044057] rpc_create_xprt+0xcb/0x370
[ 22.044417] rpc_create+0x36b/0x6c0
...
[ 22.049524] Freed by task 0:
[ 22.049803] kasan_save_stack+0x22/0x50
[ 22.050165] kasan_set_track+0x25/0x30
[ 22.050520] kasan_save_free_info+0x2b/0x50
[ 22.050921] __kasan_slab_free+0x10e/0x1a0
[ 22.051306] kmem_cache_free+0xa5/0x390
[ 22.051667] rcu_core+0x62c/0x1930
[ 22.051995] __do_softirq+0x165/0x52a
[ 22.052347]
[ 22.052503] Last potentially related work creation:
[ 22.052952] kasan_save_stack+0x22/0x50
[ 22.053313] __kasan_record_aux_stack+0x8e/0xa0
[ 22.053739] __call_rcu_common.constprop.0+0x6b/0x8b0
[ 22.054209] dentry_free+0xb2/0x140
[ 22.054540] __dentry_kill+0x3be/0x540
[ 22.054900] shrink_dentry_list+0x199/0x510
[ 22.055293] shrink_dcache_parent+0x190/0x240
[ 22.055703] do_one_tree+0x11/0x40
[ 22.056028] shrink_dcache_for_umount+0x61/0x140
[ 22.056461] generic_shutdown_super+0x70/0x590
[ 22.056879] kill_anon_super+0x3a/0x60
[ 22.057234] rpc_kill_sb+0x121/0x200 |
Missing Authorization vulnerability in printcart Printcart Web to Print Product Designer for WooCommerce allows Exploiting Incorrectly Configured Access Control Security Levels. This issue affects Printcart Web to Print Product Designer for WooCommerce: from n/a through 2.4.3. |
In the Linux kernel, the following vulnerability has been resolved:
ipvlan: add ipvlan_route_v6_outbound() helper
Inspired by syzbot reports using a stack of multiple ipvlan devices.
Reduce stack size needed in ipvlan_process_v6_outbound() by moving
the flowi6 struct used for the route lookup in an non inlined
helper. ipvlan_route_v6_outbound() needs 120 bytes on the stack,
immediately reclaimed.
Also make sure ipvlan_process_v4_outbound() is not inlined.
We might also have to lower MAX_NEST_DEV, because only syzbot uses
setups with more than four stacked devices.
BUG: TASK stack guard page was hit at ffffc9000e803ff8 (stack is ffffc9000e804000..ffffc9000e808000)
stack guard page: 0000 [#1] SMP KASAN
CPU: 0 PID: 13442 Comm: syz-executor.4 Not tainted 6.1.52-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
RIP: 0010:kasan_check_range+0x4/0x2a0 mm/kasan/generic.c:188
Code: 48 01 c6 48 89 c7 e8 db 4e c1 03 31 c0 5d c3 cc 0f 0b eb 02 0f 0b b8 ea ff ff ff 5d c3 cc 00 00 cc cc 00 00 cc cc 55 48 89 e5 <41> 57 41 56 41 55 41 54 53 b0 01 48 85 f6 0f 84 a4 01 00 00 48 89
RSP: 0018:ffffc9000e804000 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff817e5bf2
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffffff887c6568
RBP: ffffc9000e804000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: dffffc0000000001 R12: 1ffff92001d0080c
R13: dffffc0000000000 R14: ffffffff87e6b100 R15: 0000000000000000
FS: 00007fd0c55826c0(0000) GS:ffff8881f6800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc9000e803ff8 CR3: 0000000170ef7000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<#DF>
</#DF>
<TASK>
[<ffffffff81f281d1>] __kasan_check_read+0x11/0x20 mm/kasan/shadow.c:31
[<ffffffff817e5bf2>] instrument_atomic_read include/linux/instrumented.h:72 [inline]
[<ffffffff817e5bf2>] _test_bit include/asm-generic/bitops/instrumented-non-atomic.h:141 [inline]
[<ffffffff817e5bf2>] cpumask_test_cpu include/linux/cpumask.h:506 [inline]
[<ffffffff817e5bf2>] cpu_online include/linux/cpumask.h:1092 [inline]
[<ffffffff817e5bf2>] trace_lock_acquire include/trace/events/lock.h:24 [inline]
[<ffffffff817e5bf2>] lock_acquire+0xe2/0x590 kernel/locking/lockdep.c:5632
[<ffffffff8563221e>] rcu_lock_acquire+0x2e/0x40 include/linux/rcupdate.h:306
[<ffffffff8561464d>] rcu_read_lock include/linux/rcupdate.h:747 [inline]
[<ffffffff8561464d>] ip6_pol_route+0x15d/0x1440 net/ipv6/route.c:2221
[<ffffffff85618120>] ip6_pol_route_output+0x50/0x80 net/ipv6/route.c:2606
[<ffffffff856f65b5>] pol_lookup_func include/net/ip6_fib.h:584 [inline]
[<ffffffff856f65b5>] fib6_rule_lookup+0x265/0x620 net/ipv6/fib6_rules.c:116
[<ffffffff85618009>] ip6_route_output_flags_noref+0x2d9/0x3a0 net/ipv6/route.c:2638
[<ffffffff8561821a>] ip6_route_output_flags+0xca/0x340 net/ipv6/route.c:2651
[<ffffffff838bd5a3>] ip6_route_output include/net/ip6_route.h:100 [inline]
[<ffffffff838bd5a3>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:473 [inline]
[<ffffffff838bd5a3>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline]
[<ffffffff838bd5a3>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline]
[<ffffffff838bd5a3>] ipvlan_queue_xmit+0xc33/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677
[<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229
[<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline]
[<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline]
[<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660
[<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324
[<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline]
[<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline]
[<f
---truncated--- |
DNN (formerly DotNetNuke) is an open-source web content management platform (CMS) in the Microsoft ecosystem. Prior to version 10.1.0, arbitrary themes can be loaded through query parameters. If an installed theme had a vulnerability, even if it was not used on any page, this could be loaded on unsuspecting clients without knowledge of the site owner. This issue has been patched in version 10.1.0. |
In the Linux kernel, the following vulnerability has been resolved:
cxl/region: Do not try to cleanup after cxl_region_setup_targets() fails
Commit 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in
cxl_region_attach()") tried to avoid 'eiw' initialization errors when
->nr_targets exceeded 16, by just decrementing ->nr_targets when
cxl_region_setup_targets() failed.
Commit 86987c766276 ("cxl/region: Cleanup target list on attach error")
extended that cleanup to also clear cxled->pos and p->targets[pos]. The
initialization error was incidentally fixed separately by:
Commit 8d4285425714 ("cxl/region: Fix port setup uninitialized variable
warnings") which was merged a few days after 5e42bcbc3fef.
But now the original cleanup when cxl_region_setup_targets() fails
prevents endpoint and switch decoder resources from being reused:
1) the cleanup does not set the decoder's region to NULL, which results
in future dpa_size_store() calls returning -EBUSY
2) the decoder is not properly freed, which results in future commit
errors associated with the upstream switch
Now that the initialization errors were fixed separately, the proper
cleanup for this case is to just return immediately. Then the resources
associated with this target get cleanup up as normal when the failed
region is deleted.
The ->nr_targets decrement in the error case also helped prevent
a p->targets[] array overflow, so add a new check to prevent against
that overflow.
Tested by trying to create an invalid region for a 2 switch * 2 endpoint
topology, and then following up with creating a valid region. |
In the Linux kernel, the following vulnerability has been resolved:
mm/vmalloc: combine all TLB flush operations of KASAN shadow virtual address into one operation
When compiling kernel source 'make -j $(nproc)' with the up-and-running
KASAN-enabled kernel on a 256-core machine, the following soft lockup is
shown:
watchdog: BUG: soft lockup - CPU#28 stuck for 22s! [kworker/28:1:1760]
CPU: 28 PID: 1760 Comm: kworker/28:1 Kdump: loaded Not tainted 6.10.0-rc5 #95
Workqueue: events drain_vmap_area_work
RIP: 0010:smp_call_function_many_cond+0x1d8/0xbb0
Code: 38 c8 7c 08 84 c9 0f 85 49 08 00 00 8b 45 08 a8 01 74 2e 48 89 f1 49 89 f7 48 c1 e9 03 41 83 e7 07 4c 01 e9 41 83 c7 03 f3 90 <0f> b6 01 41 38 c7 7c 08 84 c0 0f 85 d4 06 00 00 8b 45 08 a8 01 75
RSP: 0018:ffffc9000cb3fb60 EFLAGS: 00000202
RAX: 0000000000000011 RBX: ffff8883bc4469c0 RCX: ffffed10776e9949
RDX: 0000000000000002 RSI: ffff8883bb74ca48 RDI: ffffffff8434dc50
RBP: ffff8883bb74ca40 R08: ffff888103585dc0 R09: ffff8884533a1800
R10: 0000000000000004 R11: ffffffffffffffff R12: ffffed1077888d39
R13: dffffc0000000000 R14: ffffed1077888d38 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff8883bc400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005577b5c8d158 CR3: 0000000004850000 CR4: 0000000000350ef0
Call Trace:
<IRQ>
? watchdog_timer_fn+0x2cd/0x390
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x300/0x6d0
? sched_clock_cpu+0x69/0x4e0
? __pfx___hrtimer_run_queues+0x10/0x10
? srso_return_thunk+0x5/0x5f
? ktime_get_update_offsets_now+0x7f/0x2a0
? srso_return_thunk+0x5/0x5f
? srso_return_thunk+0x5/0x5f
? hrtimer_interrupt+0x2ca/0x760
? __sysvec_apic_timer_interrupt+0x8c/0x2b0
? sysvec_apic_timer_interrupt+0x6a/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? smp_call_function_many_cond+0x1d8/0xbb0
? __pfx_do_kernel_range_flush+0x10/0x10
on_each_cpu_cond_mask+0x20/0x40
flush_tlb_kernel_range+0x19b/0x250
? srso_return_thunk+0x5/0x5f
? kasan_release_vmalloc+0xa7/0xc0
purge_vmap_node+0x357/0x820
? __pfx_purge_vmap_node+0x10/0x10
__purge_vmap_area_lazy+0x5b8/0xa10
drain_vmap_area_work+0x21/0x30
process_one_work+0x661/0x10b0
worker_thread+0x844/0x10e0
? srso_return_thunk+0x5/0x5f
? __kthread_parkme+0x82/0x140
? __pfx_worker_thread+0x10/0x10
kthread+0x2a5/0x370
? __pfx_kthread+0x10/0x10
ret_from_fork+0x30/0x70
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
Debugging Analysis:
1. The following ftrace log shows that the lockup CPU spends too much
time iterating vmap_nodes and flushing TLB when purging vm_area
structures. (Some info is trimmed).
kworker: funcgraph_entry: | drain_vmap_area_work() {
kworker: funcgraph_entry: | mutex_lock() {
kworker: funcgraph_entry: 1.092 us | __cond_resched();
kworker: funcgraph_exit: 3.306 us | }
... ...
kworker: funcgraph_entry: | flush_tlb_kernel_range() {
... ...
kworker: funcgraph_exit: # 7533.649 us | }
... ...
kworker: funcgraph_entry: 2.344 us | mutex_unlock();
kworker: funcgraph_exit: $ 23871554 us | }
The drain_vmap_area_work() spends over 23 seconds.
There are 2805 flush_tlb_kernel_range() calls in the ftrace log.
* One is called in __purge_vmap_area_lazy().
* Others are called by purge_vmap_node->kasan_release_vmalloc.
purge_vmap_node() iteratively releases kasan vmalloc
allocations and flushes TLB for each vmap_area.
- [Rough calculation] Each flush_tlb_kernel_range() runs
about 7.5ms.
-- 2804 * 7.5ms = 21.03 seconds.
-- That's why a soft lock is triggered.
2. Extending the soft lockup time can work around the issue (For example,
# echo
---truncated--- |