Description
vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, all temperature validation gates use comparison operators (<, >), which silently evaluate to False for NaN and for positive Infinity under Python's IEEE 754 float semantics. Both values pass every guard and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker. This vulnerability is fixed in 0.23.1rc0.
Published: 2026-06-22
Score: 6.9 Medium
EPSS: n/a
KEV: No
Impact: n/a
Action: n/a
AI Analysis

Impact

vLLM's temperature validation relies on comparison operators (<, >) that evaluate to False for NaN and for positive Infinity under Python's IEEE 754 float semantics, so both values pass every guard. When those invalid temperature values reach the GPU sampling kernels, they trigger undefined behavior or CUDA errors that can crash the inference worker. The weakness is classified as CWE-1287. It does not compromise confidentiality or integrity, but it disrupts the availability of the inference service.
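The comparison behavior described above can be reproduced in a few lines of Python. The sketch below is only an illustration: `naive_validate` is a hypothetical guard written for this example and is not vLLM's actual validation code.

```python
import math


def naive_validate(temperature: float) -> float:
    # Hypothetical guard (not vLLM's code): it rejects only values that
    # compare as less than zero.
    if temperature < 0.0:
        raise ValueError(f"temperature must be non-negative, got {temperature}")
    return temperature


# Every ordered comparison involving NaN is False, so the guard never fires.
passed_nan = naive_validate(float("nan"))

# Positive infinity is not less than zero, so it also passes.
passed_inf = naive_validate(float("inf"))

assert math.isnan(passed_nan)
assert math.isinf(passed_inf)
```

A guard written this way only rejects values that compare as out of range, which is why a value that cannot be ordered at all, such as NaN, slips through.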

Affected Systems

The vLLM inference engine, released by the vllm‑project, is affected in all versions before 0.23.1rc0. Upgrading to 0.23.1rc0 or more recent releases removes the validation bug.

Risk and Exploitability

With a CVSS score of 6.9, the vulnerability is rated medium severity. No EPSS score is available, and it is not listed in the CISA KEV catalog. Based on the description, the likely attack vector is remote: an attacker submits a request to an exposed inference API with a temperature of NaN or positive Infinity, and if the backend processes it, the worker can crash, resulting in a denial of service. The CVSS vector records no confidentiality or integrity impact.
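How a non-finite value could arrive in a request body depends on the parser in front of the engine, which the advisory does not specify. As one illustration, assuming a front end that decodes JSON with Python's standard `json` module, the default decoder accepts the non-standard tokens `NaN`, `Infinity`, and `-Infinity`. The `parse_constant` hook can reject them; `reject_constant` and `parse_strict` are hypothetical helper names used only in this sketch.

```python
import json


def reject_constant(name: str):
    # json calls this hook with 'NaN', 'Infinity', or '-Infinity'.
    raise ValueError(f"non-finite JSON constant not allowed: {name}")


def parse_strict(body: str) -> dict:
    # Hypothetical helper: decode a request body, refusing non-finite tokens.
    return json.loads(body, parse_constant=reject_constant)


# The default decoder accepts the token and yields a float NaN.
lenient = json.loads('{"temperature": NaN}')

# The strict variant still accepts ordinary finite numbers.
strict_ok = parse_strict('{"temperature": 0.7}')
```

This hook alone is not a complete filter: a literal such as `1e999` is parsed as an ordinary number and overflows to infinity without going through `parse_constant`, so an explicit finiteness check on the decoded value is still needed.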

Generated by OpenCVE AI on June 23, 2026 at 00:20 UTC.

Remediation

The fix is included in vLLM 0.23.1rc0. No separate vendor workaround is listed.

OpenCVE Recommended Actions

  • Upgrade to vLLM 0.23.1rc0 or newer to apply the validated fix.
  • If an immediate upgrade is infeasible, add a server-side filter that explicitly rejects NaN and infinite temperature values, for example by checking math.isfinite() before invoking the inference engine.
  • Establish monitoring that automatically restarts crashed inference workers to reduce downtime.
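The server-side filter recommended above could look roughly like the following. This is a minimal sketch under stated assumptions: `validate_temperature` is a hypothetical helper meant to run before a request reaches the engine, it is not part of vLLM, and the non-negative lower bound is an assumption about what the deployment wants to accept.

```python
import math


def validate_temperature(value) -> float:
    """Hypothetical pre-filter; the name and bounds are illustrative only."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"temperature must be a number, got {type(value).__name__}")
    try:
        temperature = float(value)
    except OverflowError as exc:
        # Very large Python ints cannot be represented as a float.
        raise ValueError("temperature is out of range") from exc
    # Check finiteness first: NaN and infinities fail this test, whereas
    # ordered comparisons alone would let them through.
    if not math.isfinite(temperature):
        raise ValueError(f"temperature must be finite, got {temperature}")
    if temperature < 0.0:
        raise ValueError(f"temperature must be non-negative, got {temperature}")
    return temperature
```

Calling `validate_temperature(float("nan"))` or `validate_temperature(float("inf"))` raises `ValueError`, while `validate_temperature(0.7)` returns `0.7`. The range comparison comes after the finiteness check, so it only ever sees values that can be ordered.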

Advisories
Source ID Title
GitHub GHSA GHSA-7h4p-rffg-7823 vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kernels
History

Mon, 22 Jun 2026 22:45:00 +0000

Type Values Removed Values Added
Description vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, all temperature validation gates use comparison operators (<, >), which silently evaluate to False for NaN and for positive Infinity under Python's IEEE 754 float semantics. Both values pass every guard and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker. This vulnerability is fixed in 0.23.1rc0.
Title vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kernels
Weaknesses CWE-1287
References
Metrics cvssV4_0: score 6.9, vector CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N


MITRE

Status: PUBLISHED

Assigner: GitHub_M

Published:

Updated: 2026-06-22T21:59:02.710Z

Reserved: 2026-06-12T16:25:43.084Z

Link: CVE-2026-54235

Vulnrichment

No data.

NVD

No data.

Red Hat

No data.

OpenCVE Enrichment

Updated: 2026-06-23T00:30:06Z

Weaknesses
  • CWE-1287

    Improper Validation of Specified Type of Input