CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
In the Linux kernel, the following vulnerability has been resolved:
ASoC: mediatek: mt8173: Enable IRQ when pdata is ready
If the device does not come straight from reset, we might receive an IRQ
before we are ready to handle it.
[ 2.334737] Unable to handle kernel read from unreadable memory at virtual address 00000000000001e4
[ 2.522601] Call trace:
[ 2.525040] regmap_read+0x1c/0x80
[ 2.528434] mt8173_afe_irq_handler+0x40/0xf0
...
[ 2.598921] start_kernel+0x338/0x42c |
In the Linux kernel, the following vulnerability has been resolved:
mmc: vub300: fix warning - do not call blocking ops when !TASK_RUNNING
vub300_enable_sdio_irq() works with mutex and need TASK_RUNNING here.
Ensure that we mark current as TASK_RUNNING for sleepable context.
[ 77.554641] do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff92a72c1d>] sdio_irq_thread+0x17d/0x5b0
[ 77.554652] WARNING: CPU: 2 PID: 1983 at kernel/sched/core.c:9813 __might_sleep+0x116/0x160
[ 77.554905] CPU: 2 PID: 1983 Comm: ksdioirqd/mmc1 Tainted: G OE 6.1.0-rc5 #1
[ 77.554910] Hardware name: Intel(R) Client Systems NUC8i7BEH/NUC8BEB, BIOS BECFL357.86A.0081.2020.0504.1834 05/04/2020
[ 77.554912] RIP: 0010:__might_sleep+0x116/0x160
[ 77.554920] RSP: 0018:ffff888107b7fdb8 EFLAGS: 00010282
[ 77.554923] RAX: 0000000000000000 RBX: ffff888118c1b740 RCX: 0000000000000000
[ 77.554926] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffed1020f6ffa9
[ 77.554928] RBP: ffff888107b7fde0 R08: 0000000000000001 R09: ffffed1043ea60ba
[ 77.554930] R10: ffff88821f5305cb R11: ffffed1043ea60b9 R12: ffffffff93aa3a60
[ 77.554932] R13: 000000000000011b R14: 7fffffffffffffff R15: ffffffffc0558660
[ 77.554934] FS: 0000000000000000(0000) GS:ffff88821f500000(0000) knlGS:0000000000000000
[ 77.554937] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 77.554939] CR2: 00007f8a44010d68 CR3: 000000024421a003 CR4: 00000000003706e0
[ 77.554942] Call Trace:
[ 77.554944] <TASK>
[ 77.554952] mutex_lock+0x78/0xf0
[ 77.554973] vub300_enable_sdio_irq+0x103/0x3c0 [vub300]
[ 77.554981] sdio_irq_thread+0x25c/0x5b0
[ 77.555006] kthread+0x2b8/0x370
[ 77.555017] ret_from_fork+0x1f/0x30
[ 77.555023] </TASK>
[ 77.555025] ---[ end trace 0000000000000000 ]--- |
In the Linux kernel, the following vulnerability has been resolved:
blk-mq: fix possible memleak when register 'hctx' failed
There's issue as follows when do fault injection test:
unreferenced object 0xffff888132a9f400 (size 512):
comm "insmod", pid 308021, jiffies 4324277909 (age 509.733s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 08 f4 a9 32 81 88 ff ff ...........2....
08 f4 a9 32 81 88 ff ff 00 00 00 00 00 00 00 00 ...2............
backtrace:
[<00000000e8952bb4>] kmalloc_node_trace+0x22/0xa0
[<00000000f9980e0f>] blk_mq_alloc_and_init_hctx+0x3f1/0x7e0
[<000000002e719efa>] blk_mq_realloc_hw_ctxs+0x1e6/0x230
[<000000004f1fda40>] blk_mq_init_allocated_queue+0x27e/0x910
[<00000000287123ec>] __blk_mq_alloc_disk+0x67/0xf0
[<00000000a2a34657>] 0xffffffffa2ad310f
[<00000000b173f718>] 0xffffffffa2af824a
[<0000000095a1dabb>] do_one_initcall+0x87/0x2a0
[<00000000f32fdf93>] do_init_module+0xdf/0x320
[<00000000cbe8541e>] load_module+0x3006/0x3390
[<0000000069ed1bdb>] __do_sys_finit_module+0x113/0x1b0
[<00000000a1a29ae8>] do_syscall_64+0x35/0x80
[<000000009cd878b0>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
Fault injection context as follows:
kobject_add
blk_mq_register_hctx
blk_mq_sysfs_register
blk_register_queue
device_add_disk
null_add_dev.part.0 [null_blk]
As 'blk_mq_register_hctx' may already add some objects when failed halfway,
but there isn't do fallback, caller don't know which objects add failed.
To solve above issue just do fallback when add objects failed halfway in
'blk_mq_register_hctx'. |
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: hci_conn: Fix crash on hci_create_cis_sync
When attempting to connect multiple ISO sockets without using
DEFER_SETUP may result in the following crash:
BUG: KASAN: null-ptr-deref in hci_create_cis_sync+0x18b/0x2b0
Read of size 2 at addr 0000000000000036 by task kworker/u3:1/50
CPU: 0 PID: 50 Comm: kworker/u3:1 Not tainted
6.0.0-rc7-02243-gb84a13ff4eda #4373
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS 1.16.0-1.fc36 04/01/2014
Workqueue: hci0 hci_cmd_sync_work
Call Trace:
<TASK>
dump_stack_lvl+0x19/0x27
kasan_report+0xbc/0xf0
? hci_create_cis_sync+0x18b/0x2b0
hci_create_cis_sync+0x18b/0x2b0
? get_link_mode+0xd0/0xd0
? __ww_mutex_lock_slowpath+0x10/0x10
? mutex_lock+0xe0/0xe0
? get_link_mode+0xd0/0xd0
hci_cmd_sync_work+0x111/0x190
process_one_work+0x427/0x650
worker_thread+0x87/0x750
? process_one_work+0x650/0x650
kthread+0x14e/0x180
? kthread_exit+0x50/0x50
ret_from_fork+0x22/0x30
</TASK> |
In the Linux kernel, the following vulnerability has been resolved:
clk: samsung: Fix memory leak in _samsung_clk_register_pll()
If clk_register() fails, @pll->rate_table may have allocated memory by
kmemdup(), so it needs to be freed, otherwise will cause memory leak
issue, this patch fixes it. |
In the Linux kernel, the following vulnerability has been resolved:
libbpf: Use elf_getshdrnum() instead of e_shnum
This commit replace e_shnum with the elf_getshdrnum() helper to fix two
oss-fuzz-reported heap-buffer overflow in __bpf_object__open. Both
reports are incorrectly marked as fixed and while still being
reproducible in the latest libbpf.
# clusterfuzz-testcase-minimized-bpf-object-fuzzer-5747922482888704
libbpf: loading object 'fuzz-object' from buffer
libbpf: sec_cnt is 0
libbpf: elf: section(1) .data, size 0, link 538976288, flags 2020202020202020, type=2
libbpf: elf: section(2) .data, size 32, link 538976288, flags 202020202020ff20, type=1
=================================================================
==13==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000000c0 at pc 0x0000005a7b46 bp 0x7ffd12214af0 sp 0x7ffd12214ae8
WRITE of size 4 at 0x6020000000c0 thread T0
SCARINESS: 46 (4-byte-write-heap-buffer-overflow-far-from-bounds)
#0 0x5a7b45 in bpf_object__elf_collect /src/libbpf/src/libbpf.c:3414:24
#1 0x5733c0 in bpf_object_open /src/libbpf/src/libbpf.c:7223:16
#2 0x5739fd in bpf_object__open_mem /src/libbpf/src/libbpf.c:7263:20
...
The issue lie in libbpf's direct use of e_shnum field in ELF header as
the section header count. Where as libelf implemented an extra logic
that, when e_shnum == 0 && e_shoff != 0, will use sh_size member of the
initial section header as the real section header count (part of ELF
spec to accommodate situation where section header counter is larger
than SHN_LORESERVE).
The above inconsistency lead to libbpf writing into a zero-entry calloc
area. So intead of using e_shnum directly, use the elf_getshdrnum()
helper provided by libelf to retrieve the section header counter into
sec_cnt. |
In the Linux kernel, the following vulnerability has been resolved:
fs/ntfs3: Fix memory leak on ntfs_fill_super() error path
syzbot reported kmemleak as below:
BUG: memory leak
unreferenced object 0xffff8880122f1540 (size 32):
comm "a.out", pid 6664, jiffies 4294939771 (age 25.500s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 ed ff ed ff 00 00 00 00 ................
backtrace:
[<ffffffff81b16052>] ntfs_init_fs_context+0x22/0x1c0
[<ffffffff8164aaa7>] alloc_fs_context+0x217/0x430
[<ffffffff81626dd4>] path_mount+0x704/0x1080
[<ffffffff81627e7c>] __x64_sys_mount+0x18c/0x1d0
[<ffffffff84593e14>] do_syscall_64+0x34/0xb0
[<ffffffff84600087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
This patch fixes this issue by freeing mount options on error path of
ntfs_fill_super(). |
In the Linux kernel, the following vulnerability has been resolved:
nfs: fix possible null-ptr-deref when parsing param
According to commit "vfs: parse: deal with zero length string value",
kernel will set the param->string to null pointer in vfs_parse_fs_string()
if fs string has zero length.
Yet the problem is that, nfs_fs_context_parse_param() will dereferences the
param->string, without checking whether it is a null pointer, which may
trigger a null-ptr-deref bug.
This patch solves it by adding sanity check on param->string
in nfs_fs_context_parse_param(). |
In the Linux kernel, the following vulnerability has been resolved:
mtd: core: Fix refcount error in del_mtd_device()
del_mtd_device() will call of_node_put() to mtd_get_of_node(mtd), which
is mtd->dev.of_node. However, memset(&mtd->dev, 0) is called before
of_node_put(). As the result, of_node_put() won't do anything in
del_mtd_device(), and causes the refcount leak.
del_mtd_device()
memset(&mtd->dev, 0, sizeof(mtd->dev) # clear mtd->dev
of_node_put()
mtd_get_of_node(mtd) # mtd->dev is cleared, can't locate of_node
# of_node_put(NULL) won't do anything
Fix the error by caching the pointer of the device_node.
OF: ERROR: memory leak, expected refcount 1 instead of 2,
of_node_get()/of_node_put() unbalanced - destroy cset entry: attach
overlay node /spi/spi-sram@0
CPU: 3 PID: 275 Comm: python3 Tainted: G N 6.1.0-rc3+ #54
0d8a1edddf51f172ff5226989a7565c6313b08e2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x67/0x83
kobject_get+0x155/0x160
of_node_get+0x1f/0x30
of_fwnode_get+0x43/0x70
fwnode_handle_get+0x54/0x80
fwnode_get_nth_parent+0xc9/0xe0
fwnode_full_name_string+0x3f/0xa0
device_node_string+0x30f/0x750
pointer+0x598/0x7a0
vsnprintf+0x62d/0x9b0
...
cfs_overlay_release+0x30/0x90
config_item_release+0xbe/0x1a0
config_item_put+0x5e/0x80
configfs_rmdir+0x3bd/0x540
vfs_rmdir+0x18c/0x320
do_rmdir+0x198/0x330
__x64_sys_rmdir+0x2c/0x40
do_syscall_64+0x37/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
[<miquel.raynal@bootlin.com>: Light reword of the commit log] |
In the Linux kernel, the following vulnerability has been resolved:
mt76: mt7915: Fix PCI device refcount leak in mt7915_pci_init_hif2()
As comment of pci_get_device() says, it returns a pci_device with its
refcount increased. We need to call pci_dev_put() to decrease the
refcount. Save the return value of pci_get_device() and call
pci_dev_put() to decrease the refcount. |
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Fix null ndlp ptr dereference in abnormal exit path for GFT_ID
An error case exit from lpfc_cmpl_ct_cmd_gft_id() results in a call to
lpfc_nlp_put() with a null pointer to a nodelist structure.
Changed lpfc_cmpl_ct_cmd_gft_id() to initialize nodelist pointer upon
entry. |
In the Linux kernel, the following vulnerability has been resolved:
platform/chrome: cros_usbpd_notify: Fix error handling in cros_usbpd_notify_init()
The following WARNING message was given when rmmod cros_usbpd_notify:
Unexpected driver unregister!
WARNING: CPU: 0 PID: 253 at drivers/base/driver.c:270 driver_unregister+0x8a/0xb0
Modules linked in: cros_usbpd_notify(-)
CPU: 0 PID: 253 Comm: rmmod Not tainted 6.1.0-rc3 #24
...
Call Trace:
<TASK>
cros_usbpd_notify_exit+0x11/0x1e [cros_usbpd_notify]
__x64_sys_delete_module+0x3c7/0x570
? __ia32_sys_delete_module+0x570/0x570
? lock_is_held_type+0xe3/0x140
? syscall_enter_from_user_mode+0x17/0x50
? rcu_read_lock_sched_held+0xa0/0xd0
? syscall_enter_from_user_mode+0x1c/0x50
do_syscall_64+0x37/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f333fe9b1b7
The reason is that the cros_usbpd_notify_init() does not check the return
value of platform_driver_register(), and the cros_usbpd_notify can
install successfully even if platform_driver_register() failed.
Fix by checking the return value of platform_driver_register() and
unregister cros_usbpd_notify_plat_driver when it failed. |
In the Linux kernel, the following vulnerability has been resolved:
ext4: remove a BUG_ON in ext4_mb_release_group_pa()
If a malicious fuzzer overwrites the ext4 superblock while it is
mounted such that the s_first_data_block is set to a very large
number, the calculation of the block group can underflow, and trigger
a BUG_ON check. Change this to be an ext4_warning so that we don't
crash the kernel. |
In the Linux kernel, the following vulnerability has been resolved:
drm/radeon: free iio for atombios when driver shutdown
Fix below kmemleak when unload radeon driver:
unreferenced object 0xffff9f8608ede200 (size 512):
comm "systemd-udevd", pid 326, jiffies 4294682822 (age 716.338s)
hex dump (first 32 bytes):
00 00 00 00 c4 aa ec aa 14 ab 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<0000000062fadebe>] kmem_cache_alloc_trace+0x2f1/0x500
[<00000000b6883cea>] atom_parse+0x117/0x230 [radeon]
[<00000000158c23fd>] radeon_atombios_init+0xab/0x170 [radeon]
[<00000000683f672e>] si_init+0x57/0x750 [radeon]
[<00000000566cc31f>] radeon_device_init+0x559/0x9c0 [radeon]
[<0000000046efabb3>] radeon_driver_load_kms+0xc1/0x1a0 [radeon]
[<00000000b5155064>] drm_dev_register+0xdd/0x1d0
[<0000000045fec835>] radeon_pci_probe+0xbd/0x100 [radeon]
[<00000000e69ecca3>] pci_device_probe+0xe1/0x160
[<0000000019484b76>] really_probe.part.0+0xc1/0x2c0
[<000000003f2649da>] __driver_probe_device+0x96/0x130
[<00000000231c5bb1>] driver_probe_device+0x24/0xf0
[<0000000000a42377>] __driver_attach+0x77/0x190
[<00000000d7574da6>] bus_for_each_dev+0x7f/0xd0
[<00000000633166d2>] driver_attach+0x1e/0x30
[<00000000313b05b8>] bus_add_driver+0x12c/0x1e0
iio was allocated in atom_index_iio() called by atom_parse(),
but it doesn't got released when the dirver is shutdown.
Fix this kmemleak by free it in radeon_atombios_fini(). |
In the Linux kernel, the following vulnerability has been resolved:
HID: mcp-2221: prevent UAF in delayed work
If the device is plugged/unplugged without giving time for mcp_init_work()
to complete, we might kick in the devm free code path and thus have
unavailable struct mcp_2221 while in delayed work.
Canceling the delayed_work item is enough to solve the issue, because
cancel_delayed_work_sync will prevent the work item to requeue itself. |
In the Linux kernel, the following vulnerability has been resolved:
io_uring: wait interruptibly for request completions on exit
WHen the ring exits, cleanup is done and the final cancelation and
waiting on completions is done by io_ring_exit_work. That function is
invoked by kworker, which doesn't take any signals. Because of that, it
doesn't really matter if we wait for completions in TASK_INTERRUPTIBLE
or TASK_UNINTERRUPTIBLE state. However, it does matter to the hung task
detection checker!
Normally we expect cancelations and completions to happen rather
quickly. Some test cases, however, will exit the ring and park the
owning task stopped (eg via SIGSTOP). If the owning task needs to run
task_work to complete requests, then io_ring_exit_work won't make any
progress until the task is runnable again. Hence io_ring_exit_work can
trigger the hung task detection, which is particularly problematic if
panic-on-hung-task is enabled.
As the ring exit doesn't take signals to begin with, have it wait
interruptibly rather than uninterruptibly. io_uring has a separate
stuck-exit warning that triggers independently anyway, so we're not
really missing anything by making this switch. |
In the Linux kernel, the following vulnerability has been resolved:
hsr: Fix uninit-value access in fill_frame_info()
Syzbot reports the following uninit-value access problem.
=====================================================
BUG: KMSAN: uninit-value in fill_frame_info net/hsr/hsr_forward.c:601 [inline]
BUG: KMSAN: uninit-value in hsr_forward_skb+0x9bd/0x30f0 net/hsr/hsr_forward.c:616
fill_frame_info net/hsr/hsr_forward.c:601 [inline]
hsr_forward_skb+0x9bd/0x30f0 net/hsr/hsr_forward.c:616
hsr_dev_xmit+0x192/0x330 net/hsr/hsr_device.c:223
__netdev_start_xmit include/linux/netdevice.h:4889 [inline]
netdev_start_xmit include/linux/netdevice.h:4903 [inline]
xmit_one net/core/dev.c:3544 [inline]
dev_hard_start_xmit+0x247/0xa10 net/core/dev.c:3560
__dev_queue_xmit+0x34d0/0x52a0 net/core/dev.c:4340
dev_queue_xmit include/linux/netdevice.h:3082 [inline]
packet_xmit+0x9c/0x6b0 net/packet/af_packet.c:276
packet_snd net/packet/af_packet.c:3087 [inline]
packet_sendmsg+0x8b1d/0x9f30 net/packet/af_packet.c:3119
sock_sendmsg_nosec net/socket.c:730 [inline]
sock_sendmsg net/socket.c:753 [inline]
__sys_sendto+0x781/0xa30 net/socket.c:2176
__do_sys_sendto net/socket.c:2188 [inline]
__se_sys_sendto net/socket.c:2184 [inline]
__ia32_sys_sendto+0x11f/0x1c0 net/socket.c:2184
do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
__do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
entry_SYSENTER_compat_after_hwframe+0x70/0x82
Uninit was created at:
slab_post_alloc_hook+0x12f/0xb70 mm/slab.h:767
slab_alloc_node mm/slub.c:3478 [inline]
kmem_cache_alloc_node+0x577/0xa80 mm/slub.c:3523
kmalloc_reserve+0x148/0x470 net/core/skbuff.c:559
__alloc_skb+0x318/0x740 net/core/skbuff.c:644
alloc_skb include/linux/skbuff.h:1286 [inline]
alloc_skb_with_frags+0xc8/0xbd0 net/core/skbuff.c:6299
sock_alloc_send_pskb+0xa80/0xbf0 net/core/sock.c:2794
packet_alloc_skb net/packet/af_packet.c:2936 [inline]
packet_snd net/packet/af_packet.c:3030 [inline]
packet_sendmsg+0x70e8/0x9f30 net/packet/af_packet.c:3119
sock_sendmsg_nosec net/socket.c:730 [inline]
sock_sendmsg net/socket.c:753 [inline]
__sys_sendto+0x781/0xa30 net/socket.c:2176
__do_sys_sendto net/socket.c:2188 [inline]
__se_sys_sendto net/socket.c:2184 [inline]
__ia32_sys_sendto+0x11f/0x1c0 net/socket.c:2184
do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
__do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
entry_SYSENTER_compat_after_hwframe+0x70/0x82
It is because VLAN not yet supported in hsr driver. Return error
when protocol is ETH_P_8021Q in fill_frame_info() now to fix it. |
In the Linux kernel, the following vulnerability has been resolved:
ibmvnic: Do not reset dql stats on NON_FATAL err
All ibmvnic resets, make a call to netdev_tx_reset_queue() when
re-opening the device. netdev_tx_reset_queue() resets the num_queued
and num_completed byte counters. These stats are used in Byte Queue
Limit (BQL) algorithms. The difference between these two stats tracks
the number of bytes currently sitting on the physical NIC. ibmvnic
increases the number of queued bytes though calls to
netdev_tx_sent_queue() in the drivers xmit function. When, VIOS reports
that it is done transmitting bytes, the ibmvnic device increases the
number of completed bytes through calls to netdev_tx_completed_queue().
It is important to note that the driver batches its transmit calls and
num_queued is increased every time that an skb is added to the next
batch, not necessarily when the batch is sent to VIOS for transmission.
Unlike other reset types, a NON FATAL reset will not flush the sub crq
tx buffers. Therefore, it is possible for the batched skb array to be
partially full. So if there is call to netdev_tx_reset_queue() when
re-opening the device, the value of num_queued (0) would not account
for the skb's that are currently batched. Eventually, when the batch
is sent to VIOS, the call to netdev_tx_completed_queue() would increase
num_completed to a value greater than the num_queued. This causes a
BUG_ON crash:
ibmvnic 30000002: Firmware reports error, cause: adapter problem.
Starting recovery...
ibmvnic 30000002: tx error 600
ibmvnic 30000002: tx error 600
ibmvnic 30000002: tx error 600
ibmvnic 30000002: tx error 600
------------[ cut here ]------------
kernel BUG at lib/dynamic_queue_limits.c:27!
Oops: Exception in kernel mode, sig: 5
[....]
NIP dql_completed+0x28/0x1c0
LR ibmvnic_complete_tx.isra.0+0x23c/0x420 [ibmvnic]
Call Trace:
ibmvnic_complete_tx.isra.0+0x3f8/0x420 [ibmvnic] (unreliable)
ibmvnic_interrupt_tx+0x40/0x70 [ibmvnic]
__handle_irq_event_percpu+0x98/0x270
---[ end trace ]---
Therefore, do not reset the dql stats when performing a NON_FATAL reset. |
In the Linux kernel, the following vulnerability has been resolved:
scsi: iscsi_tcp: Check that sock is valid before iscsi_set_param()
The validity of sock should be checked before assignment to avoid incorrect
values. Commit 57569c37f0ad ("scsi: iscsi: iscsi_tcp: Fix null-ptr-deref
while calling getpeername()") introduced this change which may lead to
inconsistent values of tcp_sw_conn->sendpage and conn->datadgst_en.
Fix the issue by moving the position of the assignment. |
In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu/gfx: disable gfx9 cp_ecc_error_irq only when enabling legacy gfx ras
gfx9 cp_ecc_error_irq is only enabled when legacy gfx ras is assert.
So in gfx_v9_0_hw_fini, interrupt disablement for cp_ecc_error_irq
should be executed under such condition, otherwise, an amdgpu_irq_put
calltrace will occur.
[ 7283.170322] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu]
[ 7283.170964] RSP: 0018:ffff9a5fc3967d00 EFLAGS: 00010246
[ 7283.170967] RAX: ffff98d88afd3040 RBX: ffff98d89da20000 RCX: 0000000000000000
[ 7283.170969] RDX: 0000000000000000 RSI: ffff98d89da2bef8 RDI: ffff98d89da20000
[ 7283.170971] RBP: ffff98d89da20000 R08: ffff98d89da2ca18 R09: 0000000000000006
[ 7283.170973] R10: ffffd5764243c008 R11: 0000000000000000 R12: 0000000000001050
[ 7283.170975] R13: ffff98d89da38978 R14: ffffffff999ae15a R15: ffff98d880130105
[ 7283.170978] FS: 0000000000000000(0000) GS:ffff98d996f00000(0000) knlGS:0000000000000000
[ 7283.170981] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7283.170983] CR2: 00000000f7a9d178 CR3: 00000001c42ea000 CR4: 00000000003506e0
[ 7283.170986] Call Trace:
[ 7283.170988] <TASK>
[ 7283.170989] gfx_v9_0_hw_fini+0x1c/0x6d0 [amdgpu]
[ 7283.171655] amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu]
[ 7283.172245] amdgpu_device_suspend+0x103/0x180 [amdgpu]
[ 7283.172823] amdgpu_pmops_freeze+0x21/0x60 [amdgpu]
[ 7283.173412] pci_pm_freeze+0x54/0xc0
[ 7283.173419] ? __pfx_pci_pm_freeze+0x10/0x10
[ 7283.173425] dpm_run_callback+0x98/0x200
[ 7283.173430] __device_suspend+0x164/0x5f0
v2: drop gfx11 as it's fixed in a different solution by retiring cp_ecc_irq funcs(Hawking) |