Impact
vLLM is an inference engine for large language models. In versions 0.10.1 through 0.13.x, it automatically loads the dynamic modules referenced by a Hugging Face auto_map entry when resolving a model. If the model repository or local directory is under attacker control, this dynamic import executes arbitrary Python code at server startup, before any request handling or authentication takes place. The flaw is classified as CWE-94 (code injection) and grants the attacker full control over the host running vLLM, potentially compromising the confidentiality, integrity, and availability of the underlying system.
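The underlying danger is that Python executes a module's top-level statements the moment the module is imported. The following self-contained sketch illustrates that mechanism in isolation; it does not reproduce vLLM's actual loader, which resolves modules through Hugging Face's auto_map machinery, and the file and variable names here are purely illustrative.

```python
import importlib.util
import os
import tempfile

# Simulate an attacker-controlled model directory containing a Python
# module of the kind an auto_map entry in config.json would point at.
model_dir = tempfile.mkdtemp()
module_path = os.path.join(model_dir, "modeling_custom.py")
with open(module_path, "w") as f:
    # Top-level statements run as soon as the module is imported; a real
    # payload could spawn a shell or exfiltrate credentials here.
    f.write("PAYLOAD_RAN = True\n")

# This import step is what a vulnerable loader performs automatically,
# before any request handling or authentication occurs.
spec = importlib.util.spec_from_file_location("modeling_custom", module_path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)

print(module.PAYLOAD_RAN)  # the "payload" has already executed
```

The point of the sketch is that no function in the loaded module ever needs to be called: the import alone is sufficient for code execution.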
Affected Systems
The vulnerability affects the vLLM engine itself: any deployment running a version from 0.10.1 through 0.13.x inclusive is susceptible. Version 0.14.0 and later contain the fix, which gates dynamic module loading behind the trust_remote_code flag. Any deployment running a version prior to 0.14.0 should be considered exposed.
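The fix gates dynamic loading behind an explicit opt-in. A minimal sketch of that guard pattern follows; the function and exception names are hypothetical and the config is modeled as a plain dict, whereas vLLM's real resolver operates on transformers configuration objects.

```python
class ModelConfigError(Exception):
    """Raised when a model requires remote code but trust was not granted."""


def resolve_model_class(hf_config: dict, trust_remote_code: bool = False):
    """Return a model class reference, refusing dynamic (auto_map) modules
    unless the operator explicitly opted in via trust_remote_code.

    Hypothetical helper illustrating the guard pattern described above.
    """
    auto_map = hf_config.get("auto_map")
    if auto_map:
        if not trust_remote_code:
            raise ModelConfigError(
                "Model requires executing code from its repository "
                "(auto_map present); enable trust_remote_code only for "
                "repositories you fully trust."
            )
        # Operator opted in: the dynamic module would be imported here.
        return auto_map.get("AutoModelForCausalLM")
    # Safe path: resolve a built-in architecture by name only.
    return hf_config.get("architectures", ["UnknownModel"])[0]
```

With this shape of check in place, a repository carrying an auto_map entry fails fast at configuration time instead of silently importing attacker-supplied code.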
Risk and Exploitability
The CVSS score is 8.8 (high severity). The EPSS score is below 1%, suggesting that exploitation in the wild is currently unlikely, and the vulnerability is not listed in the CISA Known Exploited Vulnerabilities catalog. The likely attack vector requires an adversary to influence the model path used at initialization, either by planting a malicious local model directory or by steering the server toward an attacker-controlled Hugging Face repository; no network authentication or API access is required. Once the payload executes, the attacker has arbitrary code execution on the host, with no barrier to entry beyond control of the model path.
OpenCVE Enrichment
GitHub GHSA