Impact
The flaw lies in vLLM’s temperature validation logic, which relies on ordinary comparison operators. Under IEEE‑754, every ordered comparison involving NaN evaluates to false, and positive Infinity is never below a lower bound, so both values pass every guard. When those invalid temperature values reach the GPU sampling kernels, they trigger undefined behavior or cause CUDA errors that crash the inference worker. This vulnerability, classified as CWE‑1287, does not compromise confidentiality or integrity but disrupts the availability of the inference service.
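The comparison behavior described above can be reproduced in a few lines. The following is a minimal sketch, not vLLM's actual code: `naive_check`, `strict_check`, and `is_rejected` are hypothetical names, and the lower-bound guard is an assumed stand-in for the flawed pattern. It shows why a plain `<` guard lets NaN and positive Infinity through, and how an explicit finiteness test closes the gap.

```python
import math


def naive_check(temperature: float) -> None:
    # Hypothetical guard resembling the flawed pattern: a single ordered comparison.
    # NaN < 0.0 is False and inf < 0.0 is False, so neither value raises.
    if temperature < 0.0:
        raise ValueError("temperature must be non-negative")


def strict_check(temperature: float) -> None:
    # Assumed remediation: reject NaN and +/-Infinity before any range comparison.
    if not math.isfinite(temperature) or temperature < 0.0:
        raise ValueError("temperature must be a finite, non-negative number")


def is_rejected(check, value: float) -> bool:
    # Helper for the demonstration: True if the guard raised ValueError.
    try:
        check(value)
    except ValueError:
        return True
    return False


if __name__ == "__main__":
    for value in (float("nan"), float("inf"), -1.0, 0.7):
        print(value, is_rejected(naive_check, value), is_rejected(strict_check, value))
```

The design point is that `math.isfinite` must run before the range comparison, because no ordered comparison can distinguish NaN from a valid value.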
Affected Systems
The vLLM inference engine, released by the vllm‑project, is affected in all versions before 0.23.1rc0. Upgrading to 0.23.1rc0 or more recent releases removes the validation bug.
Risk and Exploitability
With a CVSS score of 6.9, the vulnerability is rated medium severity. No EPSS score is available, and it is not listed in the CISA KEV catalog. Based on the description, the likely attack vector is remote: an attacker submits a request to an exposed inference API with a temperature value of NaN or positive Infinity, and if the backend processes it, the worker crashes, causing a denial of service. No code execution or data exfiltration is possible.
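For deployments that cannot upgrade immediately, one possible stopgap is to reject non-finite temperatures before the request reaches the backend. The sketch below is an illustration under stated assumptions, not part of vLLM: `parse_request_body` and `_reject_constant` are hypothetical names, and the request body is assumed to be a JSON object with an optional `temperature` field. It relies on the fact that Python's `json.loads` accepts the non-standard literals `NaN`, `Infinity`, and `-Infinity` by default and passes them to the `parse_constant` hook. A number such as `1e999` overflows to Infinity without touching that hook, so a separate finiteness check is still needed.

```python
import json
import math


def _reject_constant(name: str):
    # Called by json.loads for the literals NaN, Infinity and -Infinity.
    raise ValueError(f"non-finite JSON constant not allowed: {name}")


def parse_request_body(raw: str) -> dict:
    # Hypothetical pre-validation step placed in front of the inference backend.
    body = json.loads(raw, parse_constant=_reject_constant)
    if not isinstance(body, dict):
        raise ValueError("request body must be a JSON object")
    temperature = body.get("temperature")
    # Overflowing literals such as 1e999 parse to inf without using parse_constant.
    if isinstance(temperature, float) and not math.isfinite(temperature):
        raise ValueError("temperature must be finite")
    return body
```

A filter like this only narrows the exposure at one entry point; upgrading to a fixed release remains the actual remedy.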