Description
In the Linux kernel, the following vulnerability has been resolved:

iommu/vt-d: Skip dev-iotlb flush for inaccessible PCIe device without scalable mode

PCIe endpoints with ATS enabled and passed through to userspace
(e.g., QEMU, DPDK) can hard-lock the host when their link drops,
either by surprise removal or by a link fault.

Commit 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation
request when device is disconnected") adds pci_dev_is_disconnected()
to devtlb_invalidation_with_pasid() so ATS invalidation is skipped
only when the device is being safely removed, but it applies only
when Intel IOMMU scalable mode is enabled.

With scalable mode disabled or unsupported, a system hard-lock
occurs when a PCIe endpoint's link drops because the Intel IOMMU
waits indefinitely for an ATS invalidation that cannot complete.

Call Trace:
qi_submit_sync
qi_flush_dev_iotlb
__context_flush_dev_iotlb.part.0
domain_context_clear_one_cb
pci_for_each_dma_alias
device_block_translation
blocking_domain_attach_dev
iommu_deinit_device
__iommu_group_remove_device
iommu_release_device
iommu_bus_notifier
blocking_notifier_call_chain
bus_notify
device_del
pci_remove_bus_device
pci_stop_and_remove_bus_device
pciehp_unconfigure_device
pciehp_disable_slot
pciehp_handle_presence_or_link_change
pciehp_ist

Commit 81e921fd3216 ("iommu/vt-d: Fix NULL domain on device release")
adds intel_pasid_teardown_sm_context() to intel_iommu_release_device(),
which calls qi_flush_dev_iotlb() and can also hard-lock the system
when a PCIe endpoint's link drops.

Call Trace:
qi_submit_sync
qi_flush_dev_iotlb
__context_flush_dev_iotlb.part.0
intel_context_flush_no_pasid
device_pasid_table_teardown
pci_pasid_table_teardown
pci_for_each_dma_alias
intel_pasid_teardown_sm_context
intel_iommu_release_device
iommu_deinit_device
__iommu_group_remove_device
iommu_release_device
iommu_bus_notifier
blocking_notifier_call_chain
bus_notify
device_del
pci_remove_bus_device
pci_stop_and_remove_bus_device
pciehp_unconfigure_device
pciehp_disable_slot
pciehp_handle_presence_or_link_change
pciehp_ist

Sometimes the endpoint loses connection without a link-down event
(e.g., due to a link fault); killing the process (virsh destroy)
then hard-locks the host.

Call Trace:
qi_submit_sync
qi_flush_dev_iotlb
__context_flush_dev_iotlb.part.0
domain_context_clear_one_cb
pci_for_each_dma_alias
device_block_translation
blocking_domain_attach_dev
__iommu_attach_device
__iommu_device_set_domain
__iommu_group_set_domain_internal
iommu_detach_group
vfio_iommu_type1_detach_group
vfio_group_detach_container
vfio_group_fops_release
__fput

pci_dev_is_disconnected() only covers safe-removal paths;
pci_device_is_present() tests accessibility by reading
vendor/device IDs and internally calls pci_dev_is_disconnected().
On a ConnectX-5 (8 GT/s, x2) this costs ~70 µs.

Since __context_flush_dev_iotlb() is only called on
{attach,release}_dev paths (not hot), add pci_device_is_present()
there to skip inaccessible devices and avoid the hard-lock.
Published: 2026-05-06
Score: n/a
EPSS: n/a
KEV: No
Impact: n/a
Action: n/a
AI Analysis

Impact

The vulnerability resides in the Linux kernel’s IOMMU subsystem, where a missing flush of the device IOTLB can cause the kernel to wait indefinitely for an ATS invalidation on a PCIe endpoint that has lost its link. This results in a hard lock of the host, disabling all further operations and effectively denying service to all users. The failure originates from an improper check of device connectivity and leads to a system state where the kernel is effectively hung.

Affected Systems

The flaw is present in all Linux kernel releases that lack the recent commit patches. It affects all Linux distributions that compile the generic kernel and employ IOMMU pass‑through devices such as those used by QEMU or DPDK. The issue occurs when the IOMMU is operating without scalable mode, whether due to a hardware limitation or an explicit kernel setting, and especially on Intel platforms that support ATS. Users of any kernel that has not integrated the commit adding pci_device_is_present checks are vulnerable.

Risk and Exploitability

Because the flaw is triggered by an external condition—a PCIe endpoint link loss—the attacker or even a faulted device can easily induce the hang. The lack of a public CVSS score or EPSS score in the CVE data means the quantitative risk is unclear, but a system lock is a high‑impact outcome. The vulnerability is listed as not included in CISA KEV, yet the ease of exploitation by forcing a link drop suggests a high likelihood of real‑world impact on systems that use the IOMMU for device passthrough. The mitigation requires applying the kernel patch or enabling scalable mode to prevent the wait condition.

Generated by OpenCVE AI on May 6, 2026 at 14:17 UTC.

Remediation

No vendor fix or workaround currently provided.

OpenCVE Recommended Actions

  • Update the kernel to a version that includes the commit adding pci_device_is_present checks and proper ATS invalidation handling
  • If a patched kernel is unavailable, enable IOMMU scalable mode on supported Intel hardware to bypass the problematic flush logic
  • As a short‑term workaround, avoid using PCIe devices with ATS in passthrough mode on kernels not yet patched, or ensure that devices are cleanly detached before removal

Generated by OpenCVE AI on May 6, 2026 at 14:17 UTC.

Tracking

Sign in to view the affected projects.

Advisories

No advisories yet.

History

Wed, 06 May 2026 14:45:00 +0000

Type Values Removed Values Added
Weaknesses CWE-20

Wed, 06 May 2026 12:15:00 +0000

Type Values Removed Values Added
Description In the Linux kernel, the following vulnerability has been resolved: iommu/vt-d: Skip dev-iotlb flush for inaccessible PCIe device without scalable mode PCIe endpoints with ATS enabled and passed through to userspace (e.g., QEMU, DPDK) can hard-lock the host when their link drops, either by surprise removal or by a link fault. Commit 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected") adds pci_dev_is_disconnected() to devtlb_invalidation_with_pasid() so ATS invalidation is skipped only when the device is being safely removed, but it applies only when Intel IOMMU scalable mode is enabled. With scalable mode disabled or unsupported, a system hard-lock occurs when a PCIe endpoint's link drops because the Intel IOMMU waits indefinitely for an ATS invalidation that cannot complete. Call Trace: qi_submit_sync qi_flush_dev_iotlb __context_flush_dev_iotlb.part.0 domain_context_clear_one_cb pci_for_each_dma_alias device_block_translation blocking_domain_attach_dev iommu_deinit_device __iommu_group_remove_device iommu_release_device iommu_bus_notifier blocking_notifier_call_chain bus_notify device_del pci_remove_bus_device pci_stop_and_remove_bus_device pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist Commit 81e921fd3216 ("iommu/vt-d: Fix NULL domain on device release") adds intel_pasid_teardown_sm_context() to intel_iommu_release_device(), which calls qi_flush_dev_iotlb() and can also hard-lock the system when a PCIe endpoint's link drops. Call Trace: qi_submit_sync qi_flush_dev_iotlb __context_flush_dev_iotlb.part.0 intel_context_flush_no_pasid device_pasid_table_teardown pci_pasid_table_teardown pci_for_each_dma_alias intel_pasid_teardown_sm_context intel_iommu_release_device iommu_deinit_device __iommu_group_remove_device iommu_release_device iommu_bus_notifier blocking_notifier_call_chain bus_notify device_del pci_remove_bus_device pci_stop_and_remove_bus_device pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist Sometimes the endpoint loses connection without a link-down event (e.g., due to a link fault); killing the process (virsh destroy) then hard-locks the host. Call Trace: qi_submit_sync qi_flush_dev_iotlb __context_flush_dev_iotlb.part.0 domain_context_clear_one_cb pci_for_each_dma_alias device_block_translation blocking_domain_attach_dev __iommu_attach_device __iommu_device_set_domain __iommu_group_set_domain_internal iommu_detach_group vfio_iommu_type1_detach_group vfio_group_detach_container vfio_group_fops_release __fput pci_dev_is_disconnected() only covers safe-removal paths; pci_device_is_present() tests accessibility by reading vendor/device IDs and internally calls pci_dev_is_disconnected(). On a ConnectX-5 (8 GT/s, x2) this costs ~70 µs. Since __context_flush_dev_iotlb() is only called on {attach,release}_dev paths (not hot), add pci_device_is_present() there to skip inaccessible devices and avoid the hard-lock.
Title iommu/vt-d: Skip dev-iotlb flush for inaccessible PCIe device without scalable mode
First Time appeared Linux
Linux linux Kernel
CPEs cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
Vendors & Products Linux
Linux linux Kernel
References

Subscriptions

Linux Linux Kernel
cve-icon MITRE

Status: PUBLISHED

Assigner: Linux

Published:

Updated: 2026-05-06T11:27:39.881Z

Reserved: 2026-05-01T14:12:55.990Z

Link: CVE-2026-43161

cve-icon Vulnrichment

No data.

cve-icon NVD

Status : Awaiting Analysis

Published: 2026-05-06T12:16:34.137

Modified: 2026-05-06T13:07:51.607

Link: CVE-2026-43161

cve-icon Redhat

No data.

cve-icon OpenCVE Enrichment

Updated: 2026-05-06T14:30:05Z

Weaknesses