CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
In the Linux kernel, the following vulnerability has been resolved:
nexthop: Fix memory leaks in nexthop notification chain listeners
syzkaller discovered memory leaks [1] that can be reduced to the
following commands:
# ip nexthop add id 1 blackhole
# devlink dev reload pci/0000:06:00.0
As part of the reload flow, mlxsw will unregister its netdevs and then
unregister from the nexthop notification chain. Before unregistering
from the notification chain, mlxsw will receive delete notifications for
nexthop objects using netdevs registered by mlxsw or their uppers. mlxsw
will not receive notifications for nexthops using netdevs that are not
dismantled as part of the reload flow. For example, the blackhole
nexthop above that internally uses the loopback netdev as its nexthop
device.
One way to fix this problem is to have listeners flush their nexthop
tables after unregistering from the notification chain. This is
error-prone as evident by this patch and also not symmetric with the
registration path where a listener receives a dump of all the existing
nexthops.
Therefore, fix this problem by replaying delete notifications for the
listener being unregistered. This is symmetric to the registration path
and also consistent with the netdev notification chain.
The above means that unregister_nexthop_notifier(), like
register_nexthop_notifier(), will have to take RTNL in order to iterate
over the existing nexthops and that any callers of the function cannot
hold RTNL. This is true for mlxsw and netdevsim, but not for the VXLAN
driver. To avoid a deadlock, change the latter to unregister its nexthop
listener without holding RTNL, making it symmetric to the registration
path.
[1]
unreferenced object 0xffff88806173d600 (size 512):
comm "syz-executor.0", pid 1290, jiffies 4295583142 (age 143.507s)
hex dump (first 32 bytes):
41 9d 1e 60 80 88 ff ff 08 d6 73 61 80 88 ff ff A..`......sa....
08 d6 73 61 80 88 ff ff 01 00 00 00 00 00 00 00 ..sa............
backtrace:
[<ffffffff81a6b576>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[<ffffffff81a6b576>] slab_post_alloc_hook+0x96/0x490 mm/slab.h:522
[<ffffffff81a716d3>] slab_alloc_node mm/slub.c:3206 [inline]
[<ffffffff81a716d3>] slab_alloc mm/slub.c:3214 [inline]
[<ffffffff81a716d3>] kmem_cache_alloc_trace+0x163/0x370 mm/slub.c:3231
[<ffffffff82e8681a>] kmalloc include/linux/slab.h:591 [inline]
[<ffffffff82e8681a>] kzalloc include/linux/slab.h:721 [inline]
[<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_group_create drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:4918 [inline]
[<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_new drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:5054 [inline]
[<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_event+0x59a/0x2910 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:5239
[<ffffffff813ef67d>] notifier_call_chain+0xbd/0x210 kernel/notifier.c:83
[<ffffffff813f0662>] blocking_notifier_call_chain kernel/notifier.c:318 [inline]
[<ffffffff813f0662>] blocking_notifier_call_chain+0x72/0xa0 kernel/notifier.c:306
[<ffffffff8384b9c6>] call_nexthop_notifiers+0x156/0x310 net/ipv4/nexthop.c:244
[<ffffffff83852bd8>] insert_nexthop net/ipv4/nexthop.c:2336 [inline]
[<ffffffff83852bd8>] nexthop_add net/ipv4/nexthop.c:2644 [inline]
[<ffffffff83852bd8>] rtm_new_nexthop+0x14e8/0x4d10 net/ipv4/nexthop.c:2913
[<ffffffff833e9a78>] rtnetlink_rcv_msg+0x448/0xbf0 net/core/rtnetlink.c:5572
[<ffffffff83608703>] netlink_rcv_skb+0x173/0x480 net/netlink/af_netlink.c:2504
[<ffffffff833de032>] rtnetlink_rcv+0x22/0x30 net/core/rtnetlink.c:5590
[<ffffffff836069de>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
[<ffffffff836069de>] netlink_unicast+0x5ae/0x7f0 net/netlink/af_netlink.c:1340
[<ffffffff83607501>] netlink_sendmsg+0x8e1/0xe30 net/netlink/af_netlink.c:1929
[<ffffffff832fde84>] sock_sendmsg_nosec net/socket.c:704 [inline
---truncated--- |
In the Linux kernel, the following vulnerability has been resolved:
enetc: Fix illegal access when reading affinity_hint
irq_set_affinity_hit() stores a reference to the cpumask_t
parameter in the irq descriptor, and that reference can be
accessed later from irq_affinity_hint_proc_show(). Since
the cpu_mask parameter passed to irq_set_affinity_hit() has
only temporary storage (it's on the stack memory), later
accesses to it are illegal. Thus reads from the corresponding
procfs affinity_hint file can result in paging request oops.
The issue is fixed by the get_cpu_mask() helper, which provides
a permanent storage for the cpumask_t parameter. |
In the Linux kernel, the following vulnerability has been resolved:
scsi: megaraid_sas: Fix resource leak in case of probe failure
The driver doesn't clean up all the allocated resources properly when
scsi_add_host(), megasas_start_aen() function fails during the PCI device
probe.
Clean up all those resources. |
In the Linux kernel, the following vulnerability has been resolved:
cpufreq: CPPC: Fix potential memleak in cppc_cpufreq_cpu_init
It's a classic example of memleak, we allocate something, we fail and
never free the resources.
Make sure we free all resources on policy ->init() failures. |
In the Linux kernel, the following vulnerability has been resolved:
net: sched: fix memory leak in tcindex_partial_destroy_work
Syzbot reported memory leak in tcindex_set_parms(). The problem was in
non-freed perfect hash in tcindex_partial_destroy_work().
In tcindex_set_parms() new tcindex_data is allocated and some fields from
old one are copied to new one, but not the perfect hash. Since
tcindex_partial_destroy_work() is the destroy function for old
tcindex_data, we need to free perfect hash to avoid memory leak. |
In the Linux kernel, the following vulnerability has been resolved:
isdn: mISDN: netjet: Fix crash in nj_probe:
'nj_setup' in netjet.c might fail with -EIO and in this case
'card->irq' is initialized and is bigger than zero. A subsequent call to
'nj_release' will free the irq that has not been requested.
Fix this bug by deleting the previous assignment to 'card->irq' and just
keep the assignment before 'request_irq'.
The KASAN's log reveals it:
[ 3.354615 ] WARNING: CPU: 0 PID: 1 at kernel/irq/manage.c:1826
free_irq+0x100/0x480
[ 3.355112 ] Modules linked in:
[ 3.355310 ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
5.13.0-rc1-00144-g25a1298726e #13
[ 3.355816 ] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[ 3.356552 ] RIP: 0010:free_irq+0x100/0x480
[ 3.356820 ] Code: 6e 08 74 6f 4d 89 f4 e8 5e ac 09 00 4d 8b 74 24 18
4d 85 f6 75 e3 e8 4f ac 09 00 8b 75 c8 48 c7 c7 78 c1 2e 85 e8 e0 cf f5
ff <0f> 0b 48 8b 75 c0 4c 89 ff e8 72 33 0b 03 48 8b 43 40 4c 8b a0 80
[ 3.358012 ] RSP: 0000:ffffc90000017b48 EFLAGS: 00010082
[ 3.358357 ] RAX: 0000000000000000 RBX: ffff888104dc8000 RCX:
0000000000000000
[ 3.358814 ] RDX: ffff8881003c8000 RSI: ffffffff8124a9e6 RDI:
00000000ffffffff
[ 3.359272 ] RBP: ffffc90000017b88 R08: 0000000000000000 R09:
0000000000000000
[ 3.359732 ] R10: ffffc900000179f0 R11: 0000000000001d04 R12:
0000000000000000
[ 3.360195 ] R13: ffff888107dc6000 R14: ffff888107dc6928 R15:
ffff888104dc80a8
[ 3.360652 ] FS: 0000000000000000(0000) GS:ffff88817bc00000(0000)
knlGS:0000000000000000
[ 3.361170 ] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.361538 ] CR2: 0000000000000000 CR3: 000000000582e000 CR4:
00000000000006f0
[ 3.362003 ] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 3.362175 ] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[ 3.362175 ] Call Trace:
[ 3.362175 ] nj_release+0x51/0x1e0
[ 3.362175 ] nj_probe+0x450/0x950
[ 3.362175 ] ? pci_device_remove+0x110/0x110
[ 3.362175 ] local_pci_probe+0x45/0xa0
[ 3.362175 ] pci_device_probe+0x12b/0x1d0
[ 3.362175 ] really_probe+0x2a9/0x610
[ 3.362175 ] driver_probe_device+0x90/0x1d0
[ 3.362175 ] ? mutex_lock_nested+0x1b/0x20
[ 3.362175 ] device_driver_attach+0x68/0x70
[ 3.362175 ] __driver_attach+0x124/0x1b0
[ 3.362175 ] ? device_driver_attach+0x70/0x70
[ 3.362175 ] bus_for_each_dev+0xbb/0x110
[ 3.362175 ] ? rdinit_setup+0x45/0x45
[ 3.362175 ] driver_attach+0x27/0x30
[ 3.362175 ] bus_add_driver+0x1eb/0x2a0
[ 3.362175 ] driver_register+0xa9/0x180
[ 3.362175 ] __pci_register_driver+0x82/0x90
[ 3.362175 ] ? w6692_init+0x38/0x38
[ 3.362175 ] nj_init+0x36/0x38
[ 3.362175 ] do_one_initcall+0x7f/0x3d0
[ 3.362175 ] ? rdinit_setup+0x45/0x45
[ 3.362175 ] ? rcu_read_lock_sched_held+0x4f/0x80
[ 3.362175 ] kernel_init_freeable+0x2aa/0x301
[ 3.362175 ] ? rest_init+0x2c0/0x2c0
[ 3.362175 ] kernel_init+0x18/0x190
[ 3.362175 ] ? rest_init+0x2c0/0x2c0
[ 3.362175 ] ? rest_init+0x2c0/0x2c0
[ 3.362175 ] ret_from_fork+0x1f/0x30
[ 3.362175 ] Kernel panic - not syncing: panic_on_warn set ...
[ 3.362175 ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
5.13.0-rc1-00144-g25a1298726e #13
[ 3.362175 ] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[ 3.362175 ] Call Trace:
[ 3.362175 ] dump_stack+0xba/0xf5
[ 3.362175 ] ? free_irq+0x100/0x480
[ 3.362175 ] panic+0x15a/0x3f2
[ 3.362175 ] ? __warn+0xf2/0x150
[ 3.362175 ] ? free_irq+0x100/0x480
[ 3.362175 ] __warn+0x108/0x150
[ 3.362175 ] ? free_irq+0x100/0x480
[ 3.362175 ] report_bug+0x119/0x1c0
[ 3.362175 ] handle_bug+0x3b/0x80
[ 3.362175 ] exc_invalid_op+0x18/0x70
[ 3.362175 ] asm_exc_invalid_op+0x12/0x20
[ 3.362175 ] RIP: 0010:free_irq+0x100
---truncated--- |
In the Linux kernel, the following vulnerability has been resolved:
powerpc/64s: Fix pte update for kernel memory on radix
When adding a PTE a ptesync is needed to order the update of the PTE
with subsequent accesses otherwise a spurious fault may be raised.
radix__set_pte_at() does not do this for performance gains. For
non-kernel memory this is not an issue as any faults of this kind are
corrected by the page fault handler. For kernel memory these faults
are not handled. The current solution is that there is a ptesync in
flush_cache_vmap() which should be called when mapping from the
vmalloc region.
However, map_kernel_page() does not call flush_cache_vmap(). This is
troublesome in particular for code patching with Strict RWX on radix.
In do_patch_instruction() the page frame that contains the instruction
to be patched is mapped and then immediately patched. With no ordering
or synchronization between setting up the PTE and writing to the page
it is possible for faults.
As the code patching is done using __put_user_asm_goto() the resulting
fault is obscured - but using a normal store instead it can be seen:
BUG: Unable to handle kernel data access on write at 0xc008000008f24a3c
Faulting instruction address: 0xc00000000008bd74
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: nop_module(PO+) [last unloaded: nop_module]
CPU: 4 PID: 757 Comm: sh Tainted: P O 5.10.0-rc5-01361-ge3c1b78c8440-dirty #43
NIP: c00000000008bd74 LR: c00000000008bd50 CTR: c000000000025810
REGS: c000000016f634a0 TRAP: 0300 Tainted: P O (5.10.0-rc5-01361-ge3c1b78c8440-dirty)
MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 44002884 XER: 00000000
CFAR: c00000000007c68c DAR: c008000008f24a3c DSISR: 42000000 IRQMASK: 1
This results in the kind of issue reported here:
https://lore.kernel.org/linuxppc-dev/15AC5B0E-A221-4B8C-9039-FA96B8EF7C88@lca.pw/
Chris Riedl suggested a reliable way to reproduce the issue:
$ mount -t debugfs none /sys/kernel/debug
$ (while true; do echo function > /sys/kernel/debug/tracing/current_tracer ; echo nop > /sys/kernel/debug/tracing/current_tracer ; done) &
Turning ftrace on and off does a large amount of code patching which
in usually less then 5min will crash giving a trace like:
ftrace-powerpc: (____ptrval____): replaced (4b473b11) != old (60000000)
------------[ ftrace bug ]------------
ftrace failed to modify
[<c000000000bf8e5c>] napi_busy_loop+0xc/0x390
actual: 11:3b:47:4b
Setting ftrace call site to call ftrace function
ftrace record flags: 80000001
(1)
expected tramp: c00000000006c96c
------------[ cut here ]------------
WARNING: CPU: 4 PID: 809 at kernel/trace/ftrace.c:2065 ftrace_bug+0x28c/0x2e8
Modules linked in: nop_module(PO-) [last unloaded: nop_module]
CPU: 4 PID: 809 Comm: sh Tainted: P O 5.10.0-rc5-01360-gf878ccaf250a #1
NIP: c00000000024f334 LR: c00000000024f330 CTR: c0000000001a5af0
REGS: c000000004c8b760 TRAP: 0700 Tainted: P O (5.10.0-rc5-01360-gf878ccaf250a)
MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28008848 XER: 20040000
CFAR: c0000000001a9c98 IRQMASK: 0
GPR00: c00000000024f330 c000000004c8b9f0 c000000002770600 0000000000000022
GPR04: 00000000ffff7fff c000000004c8b6d0 0000000000000027 c0000007fe9bcdd8
GPR08: 0000000000000023 ffffffffffffffd8 0000000000000027 c000000002613118
GPR12: 0000000000008000 c0000007fffdca00 0000000000000000 0000000000000000
GPR16: 0000000023ec37c5 0000000000000000 0000000000000000 0000000000000008
GPR20: c000000004c8bc90 c0000000027a2d20 c000000004c8bcd0 c000000002612fe8
GPR24: 0000000000000038 0000000000000030 0000000000000028 0000000000000020
GPR28: c000000000ff1b68 c000000000bf8e5c c00000000312f700 c000000000fbb9b0
NIP ftrace_bug+0x28c/0x2e8
LR ftrace_bug+0x288/0x2e8
Call T
---truncated--- |
In the Linux kernel, the following vulnerability has been resolved:
net: marvell: prestera: fix port event handling on init
For some reason there might be a crash during ports creation if port
events are handling at the same time because fw may send initial
port event with down state.
The crash points to cancel_delayed_work() which is called when port went
is down. Currently I did not find out the real cause of the issue, so
fixed it by cancel port stats work only if previous port's state was up
& runnig.
The following is the crash which can be triggered:
[ 28.311104] Unable to handle kernel paging request at virtual address
000071775f776600
[ 28.319097] Mem abort info:
[ 28.321914] ESR = 0x96000004
[ 28.324996] EC = 0x25: DABT (current EL), IL = 32 bits
[ 28.330350] SET = 0, FnV = 0
[ 28.333430] EA = 0, S1PTW = 0
[ 28.336597] Data abort info:
[ 28.339499] ISV = 0, ISS = 0x00000004
[ 28.343362] CM = 0, WnR = 0
[ 28.346354] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000100bf7000
[ 28.352842] [000071775f776600] pgd=0000000000000000,
p4d=0000000000000000
[ 28.359695] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 28.365310] Modules linked in: prestera_pci(+) prestera
uio_pdrv_genirq
[ 28.372005] CPU: 0 PID: 1291 Comm: kworker/0:1H Not tainted
5.11.0-rc4 #1
[ 28.378846] Hardware name: DNI AmazonGo1 A7040 board (DT)
[ 28.384283] Workqueue: prestera_fw_wq prestera_fw_evt_work_fn
[prestera_pci]
[ 28.391413] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)
[ 28.397468] pc : get_work_pool+0x48/0x60
[ 28.401442] lr : try_to_grab_pending+0x6c/0x1b0
[ 28.406018] sp : ffff80001391bc60
[ 28.409358] x29: ffff80001391bc60 x28: 0000000000000000
[ 28.414725] x27: ffff000104fc8b40 x26: ffff80001127de88
[ 28.420089] x25: 0000000000000000 x24: ffff000106119760
[ 28.425452] x23: ffff00010775dd60 x22: ffff00010567e000
[ 28.430814] x21: 0000000000000000 x20: ffff80001391bcb0
[ 28.436175] x19: ffff00010775deb8 x18: 00000000000000c0
[ 28.441537] x17: 0000000000000000 x16: 000000008d9b0e88
[ 28.446898] x15: 0000000000000001 x14: 00000000000002ba
[ 28.452261] x13: 80a3002c00000002 x12: 00000000000005f4
[ 28.457622] x11: 0000000000000030 x10: 000000000000000c
[ 28.462985] x9 : 000000000000000c x8 : 0000000000000030
[ 28.468346] x7 : ffff800014400000 x6 : ffff000106119758
[ 28.473708] x5 : 0000000000000003 x4 : ffff00010775dc60
[ 28.479068] x3 : 0000000000000000 x2 : 0000000000000060
[ 28.484429] x1 : 000071775f776600 x0 : ffff00010775deb8
[ 28.489791] Call trace:
[ 28.492259] get_work_pool+0x48/0x60
[ 28.495874] cancel_delayed_work+0x38/0xb0
[ 28.500011] prestera_port_handle_event+0x90/0xa0 [prestera]
[ 28.505743] prestera_evt_recv+0x98/0xe0 [prestera]
[ 28.510683] prestera_fw_evt_work_fn+0x180/0x228 [prestera_pci]
[ 28.516660] process_one_work+0x1e8/0x360
[ 28.520710] worker_thread+0x44/0x480
[ 28.524412] kthread+0x154/0x160
[ 28.527670] ret_from_fork+0x10/0x38
[ 28.531290] Code: a8c17bfd d50323bf d65f03c0 9278dc21 (f9400020)
[ 28.537429] ---[ end trace 5eced933df3a080b ]--- |
In the Linux kernel, the following vulnerability has been resolved:
net: Only allow init netns to set default tcp cong to a restricted algo
tcp_set_default_congestion_control() is netns-safe in that it writes
to &net->ipv4.tcp_congestion_control, but it also sets
ca->flags |= TCP_CONG_NON_RESTRICTED which is not namespaced.
This has the unintended side-effect of changing the global
net.ipv4.tcp_allowed_congestion_control sysctl, despite the fact that it
is read-only: 97684f0970f6 ("net: Make tcp_allowed_congestion_control
readonly in non-init netns")
Resolve this netns "leak" by only allowing the init netns to set the
default algorithm to one that is restricted. This restriction could be
removed if tcp_allowed_congestion_control were namespace-ified in the
future.
This bug was uncovered with
https://github.com/JonathonReinhart/linux-netns-sysctl-verify |
In the Linux kernel, the following vulnerability has been resolved:
KVM: VMX: Disable preemption when probing user return MSRs
Disable preemption when probing a user return MSR via RDSMR/WRMSR. If
the MSR holds a different value per logical CPU, the WRMSR could corrupt
the host's value if KVM is preempted between the RDMSR and WRMSR, and
then rescheduled on a different CPU.
Opportunistically land the helper in common x86, SVM will use the helper
in a future commit. |
In the Linux kernel, the following vulnerability has been resolved:
tracing: Restructure trace_clock_global() to never block
It was reported that a fix to the ring buffer recursion detection would
cause a hung machine when performing suspend / resume testing. The
following backtrace was extracted from debugging that case:
Call Trace:
trace_clock_global+0x91/0xa0
__rb_reserve_next+0x237/0x460
ring_buffer_lock_reserve+0x12a/0x3f0
trace_buffer_lock_reserve+0x10/0x50
__trace_graph_return+0x1f/0x80
trace_graph_return+0xb7/0xf0
? trace_clock_global+0x91/0xa0
ftrace_return_to_handler+0x8b/0xf0
? pv_hash+0xa0/0xa0
return_to_handler+0x15/0x30
? ftrace_graph_caller+0xa0/0xa0
? trace_clock_global+0x91/0xa0
? __rb_reserve_next+0x237/0x460
? ring_buffer_lock_reserve+0x12a/0x3f0
? trace_event_buffer_lock_reserve+0x3c/0x120
? trace_event_buffer_reserve+0x6b/0xc0
? trace_event_raw_event_device_pm_callback_start+0x125/0x2d0
? dpm_run_callback+0x3b/0xc0
? pm_ops_is_empty+0x50/0x50
? platform_get_irq_byname_optional+0x90/0x90
? trace_device_pm_callback_start+0x82/0xd0
? dpm_run_callback+0x49/0xc0
With the following RIP:
RIP: 0010:native_queued_spin_lock_slowpath+0x69/0x200
Since the fix to the recursion detection would allow a single recursion to
happen while tracing, this lead to the trace_clock_global() taking a spin
lock and then trying to take it again:
ring_buffer_lock_reserve() {
trace_clock_global() {
arch_spin_lock() {
queued_spin_lock_slowpath() {
/* lock taken */
(something else gets traced by function graph tracer)
ring_buffer_lock_reserve() {
trace_clock_global() {
arch_spin_lock() {
queued_spin_lock_slowpath() {
/* DEAD LOCK! */
Tracing should *never* block, as it can lead to strange lockups like the
above.
Restructure the trace_clock_global() code to instead of simply taking a
lock to update the recorded "prev_time" simply use it, as two events
happening on two different CPUs that calls this at the same time, really
doesn't matter which one goes first. Use a trylock to grab the lock for
updating the prev_time, and if it fails, simply try again the next time.
If it failed to be taken, that means something else is already updating
it.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212761 |
In the Linux kernel, the following vulnerability has been resolved:
x86/xen: Drop USERGS_SYSRET64 paravirt call
commit afd30525a659ac0ae0904f0cb4a2ca75522c3123 upstream.
USERGS_SYSRET64 is used to return from a syscall via SYSRET, but
a Xen PV guest will nevertheless use the IRET hypercall, as there
is no sysret PV hypercall defined.
So instead of testing all the prerequisites for doing a sysret and
then mangling the stack for Xen PV again for doing an iret just use
the iret exit from the beginning.
This can easily be done via an ALTERNATIVE like it is done for the
sysenter compat case already.
It should be noted that this drops the optimization in Xen for not
restoring a few registers when returning to user mode, but it seems
as if the saved instructions in the kernel more than compensate for
this drop (a kernel build in a Xen PV guest was slightly faster with
this patch applied).
While at it remove the stale sysret32 remnants.
[ pawan: Brad Spengler and Salvatore Bonaccorso <carnil@debian.org>
reported a problem with the 5.10 backport commit edc702b4a820
("x86/entry_64: Add VERW just before userspace transition").
When CONFIG_PARAVIRT_XXL=y, CLEAR_CPU_BUFFERS is not executed in
syscall_return_via_sysret path as USERGS_SYSRET64 is runtime
patched to:
.cpu_usergs_sysret64 = { 0x0f, 0x01, 0xf8,
0x48, 0x0f, 0x07 }, // swapgs; sysretq
which is missing CLEAR_CPU_BUFFERS. It turns out dropping
USERGS_SYSRET64 simplifies the code, allowing CLEAR_CPU_BUFFERS
to be explicitly added to syscall_return_via_sysret path. Below
is with CONFIG_PARAVIRT_XXL=y and this patch applied:
syscall_return_via_sysret:
...
<+342>: swapgs
<+345>: xchg %ax,%ax
<+347>: verw -0x1a2(%rip) <------
<+354>: sysretq
] |
.NET Denial of Service Vulnerability |
.NET Denial of Service Vulnerability |
Windows DNS Client Denial of Service Vulnerability |
Microsoft QUIC Denial of Service Vulnerability |
.NET and Visual Studio Denial of Service Vulnerability |
DHCP Server Service Denial of Service Vulnerability |
DHCP Server Service Denial of Service Vulnerability |
DHCP Server Service Denial of Service Vulnerability |