Description
In the Linux kernel, the following vulnerability has been resolved:

RDMA/mlx5: Fix UMR hang in LAG error state unload

During firmware reset in LAG mode, a race condition causes the driver
to hang indefinitely while waiting for UMR completion during device
unload. See [1].

In LAG mode the bond device is only registered on the master, so it
never sees sys_error events from the slave.
During firmware reset this causes UMR waits to hang forever on unload
as the slave is dead but the master hasn't entered error state yet, so
UMR posts succeed but completions never arrive.

Fix this by adding a sys_error notifier that gets registered before
MLX5_IB_STAGE_IB_REG and stays alive until after ib_unregister_device().
This ensures error events reach the bond device throughout teardown.

[1]
Call Trace:
__schedule+0x2bd/0x760
schedule+0x37/0xa0
schedule_preempt_disabled+0xa/0x10
__mutex_lock.isra.6+0x2b5/0x4a0
__mlx5_ib_dereg_mr+0x606/0x870 [mlx5_ib]
? __xa_erase+0x4a/0xa0
? _cond_resched+0x15/0x30
? wait_for_completion+0x31/0x100
ib_dereg_mr_user+0x48/0xc0 [ib_core]
? rdmacg_uncharge_hierarchy+0xa0/0x100
destroy_hw_idr_uobject+0x20/0x50 [ib_uverbs]
uverbs_destroy_uobject+0x37/0x150 [ib_uverbs]
__uverbs_cleanup_ufile+0xda/0x140 [ib_uverbs]
uverbs_destroy_ufile_hw+0x3a/0xf0 [ib_uverbs]
ib_uverbs_remove_one+0xc3/0x140 [ib_uverbs]
remove_client_context+0x8b/0xd0 [ib_core]
disable_device+0x8c/0x130 [ib_core]
__ib_unregister_device+0x10d/0x180 [ib_core]
ib_unregister_device+0x21/0x30 [ib_core]
__mlx5_ib_remove+0x1e4/0x1f0 [mlx5_ib]
auxiliary_bus_remove+0x1e/0x30
device_release_driver_internal+0x103/0x1f0
bus_remove_device+0xf7/0x170
device_del+0x181/0x410
mlx5_rescan_drivers_locked.part.10+0xa9/0x1d0 [mlx5_core]
mlx5_disable_lag+0x253/0x260 [mlx5_core]
mlx5_lag_disable_change+0x89/0xc0 [mlx5_core]
mlx5_eswitch_disable+0x67/0xa0 [mlx5_core]
mlx5_unload+0x15/0xd0 [mlx5_core]
mlx5_unload_one+0x71/0xc0 [mlx5_core]
mlx5_sync_reset_reload_work+0x83/0x100 [mlx5_core]
process_one_work+0x1a7/0x360
worker_thread+0x30/0x390
? create_worker+0x1a0/0x1a0
kthread+0x116/0x130
? kthread_flush_work_fn+0x10/0x10
ret_from_fork+0x22/0x40
Published: 2026-05-27
Score: 5.5 Medium
EPSS: < 1% Very Low
KEV: No
Impact: n/a
Action: n/a
AI Analysis

Impact

A race condition in the Linux mlx5 RDMA driver causes the kernel to wait indefinitely for a User‑Mode Registration (UMR) completion while unloading a device in Link Aggregation Group (LAG) mode. Because the bond device only sees the master and not the slave, the completion never arrives, resulting in a driver hang that can freeze the entire system. This vulnerability illustrates an improper synchronization flaw (CWE‑833).

Affected Systems

All Linux kernels containing the mlx5 driver when LAG bonding is enabled are potentially affected, regardless of distribution. No specific kernel release is listed, but any kernel prior to the commit that adds a sys_error notifier to the mlx5 driver is vulnerable. The vulnerability exists in the kernel’s RDMA subsystem and is unrelated to vendor-specific patches.

Risk and Exploitability

The CVSS score of 5.5 indicates moderate severity, and the EPSS score is < 1% (≈0.00155). The vulnerability has not been listed in the CISA KEV catalog. Based on the description, the attack vector appears to be local, requiring the attacker to trigger a firmware reset or otherwise cause the driver to unload a device in LAG mode. Successful exploitation would lead to a system-wide denial of service by freezing the kernel. While remote exploitation is not documented, local privilege escalation or a compromised RDMA device could enable the conditions for exploitation. The risk is therefore primarily availability risk with a moderate severity posture.

Generated by OpenCVE AI on June 17, 2026 at 00:28 UTC.

Remediation

No vendor fix or workaround currently provided.

OpenCVE Recommended Actions

  • Upgrade the Linux kernel to a version that includes the sys_error notifier fix for the mlx5 driver.
  • Before patching, disable LAG mode or avoid performing firmware resets on RDMA devices until the update is applied to eliminate the race condition.
  • Monitor kernel logs for messages indicating stalled UMR completions or device unload hang events and investigate any occurrences promptly.

Generated by OpenCVE AI on June 17, 2026 at 00:28 UTC.

Tracking

Sign in to view the affected projects.

Advisories

No advisories yet.

History

Tue, 16 Jun 2026 06:30:00 +0000

Type Values Removed Values Added
Weaknesses NVD-CWE-noinfo

Thu, 28 May 2026 04:00:00 +0000

Type Values Removed Values Added
Weaknesses CWE-362
CWE-754

Thu, 28 May 2026 00:15:00 +0000

Type Values Removed Values Added
Weaknesses CWE-833
References
Metrics threat_severity

None

cvssV3_1

{'score': 5.5, 'vector': 'CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H'}

threat_severity

Moderate


Wed, 27 May 2026 17:30:00 +0000

Type Values Removed Values Added
Weaknesses CWE-362
CWE-754

Wed, 27 May 2026 14:15:00 +0000

Type Values Removed Values Added
Description In the Linux kernel, the following vulnerability has been resolved: RDMA/mlx5: Fix UMR hang in LAG error state unload During firmware reset in LAG mode, a race condition causes the driver to hang indefinitely while waiting for UMR completion during device unload. See [1]. In LAG mode the bond device is only registered on the master, so it never sees sys_error events from the slave. During firmware reset this causes UMR waits to hang forever on unload as the slave is dead but the master hasn't entered error state yet, so UMR posts succeed but completions never arrive. Fix this by adding a sys_error notifier that gets registered before MLX5_IB_STAGE_IB_REG and stays alive until after ib_unregister_device(). This ensures error events reach the bond device throughout teardown. [1] Call Trace: __schedule+0x2bd/0x760 schedule+0x37/0xa0 schedule_preempt_disabled+0xa/0x10 __mutex_lock.isra.6+0x2b5/0x4a0 __mlx5_ib_dereg_mr+0x606/0x870 [mlx5_ib] ? __xa_erase+0x4a/0xa0 ? _cond_resched+0x15/0x30 ? wait_for_completion+0x31/0x100 ib_dereg_mr_user+0x48/0xc0 [ib_core] ? rdmacg_uncharge_hierarchy+0xa0/0x100 destroy_hw_idr_uobject+0x20/0x50 [ib_uverbs] uverbs_destroy_uobject+0x37/0x150 [ib_uverbs] __uverbs_cleanup_ufile+0xda/0x140 [ib_uverbs] uverbs_destroy_ufile_hw+0x3a/0xf0 [ib_uverbs] ib_uverbs_remove_one+0xc3/0x140 [ib_uverbs] remove_client_context+0x8b/0xd0 [ib_core] disable_device+0x8c/0x130 [ib_core] __ib_unregister_device+0x10d/0x180 [ib_core] ib_unregister_device+0x21/0x30 [ib_core] __mlx5_ib_remove+0x1e4/0x1f0 [mlx5_ib] auxiliary_bus_remove+0x1e/0x30 device_release_driver_internal+0x103/0x1f0 bus_remove_device+0xf7/0x170 device_del+0x181/0x410 mlx5_rescan_drivers_locked.part.10+0xa9/0x1d0 [mlx5_core] mlx5_disable_lag+0x253/0x260 [mlx5_core] mlx5_lag_disable_change+0x89/0xc0 [mlx5_core] mlx5_eswitch_disable+0x67/0xa0 [mlx5_core] mlx5_unload+0x15/0xd0 [mlx5_core] mlx5_unload_one+0x71/0xc0 [mlx5_core] mlx5_sync_reset_reload_work+0x83/0x100 [mlx5_core] process_one_work+0x1a7/0x360 worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x116/0x130 ? kthread_flush_work_fn+0x10/0x10 ret_from_fork+0x22/0x40
Title RDMA/mlx5: Fix UMR hang in LAG error state unload
First Time appeared Linux
Linux linux Kernel
CPEs cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
Vendors & Products Linux
Linux linux Kernel
References

Subscriptions

Linux Linux Kernel
cve-icon MITRE

Status: PUBLISHED

Assigner: Linux

Published:

Updated: 2026-05-27T12:18:32.270Z

Reserved: 2026-05-13T15:03:33.090Z

Link: CVE-2026-45973

cve-icon Vulnrichment

No data.

cve-icon NVD

Status : Analyzed

Published: 2026-05-27T14:17:14.293

Modified: 2026-06-16T02:42:00.687

Link: CVE-2026-45973

cve-icon Redhat

Severity : Moderate

Publid Date: 2026-05-27T00:00:00Z

Links: CVE-2026-45973 - Bugzilla

cve-icon OpenCVE Enrichment

Updated: 2026-06-17T00:30:15Z

Weaknesses