Description
vLLM is an inference and serving engine for large language models (LLMs). From 0.1.0 to before 0.19.0, a Denial of Service vulnerability exists in the vLLM OpenAI-compatible API server. Due to the lack of an upper bound validation on the n parameter in the ChatCompletionRequest and CompletionRequest Pydantic models, an unauthenticated attacker can send a single HTTP request with an astronomically large n value. This completely blocks the Python asyncio event loop and causes immediate Out-Of-Memory crashes by allocating millions of request object copies in the heap before the request even reaches the scheduling queue. This vulnerability is fixed in 0.19.0.
Published: 2026-04-06
Score: 6.5 Medium
EPSS: < 1% Very Low
KEV: No
Impact: Denial of Service via Out‑of‑Memory crash
Action: Immediate Patch
AI Analysis

Impact

The vulnerability stems from the lack of an upper bound on the 'n' parameter in vLLM's OpenAI-compatible API server. An unauthenticated attacker can send a request with an astronomically large 'n' value, causing the server to allocate millions of request objects and exhaust memory before the request reaches the scheduling queue. This leads to an immediate Out‑of‑Memory crash and blocks the Python asyncio event loop, effectively denying service to all clients.
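The fix pattern implied by the advisory is an upper bound on the request model's `n` field. The sketch below, using Pydantic (the library named in the description), contrasts an unbounded `n` with a bounded one. The class names and the cap of 128 are illustrative assumptions, not vLLM's actual code.

```python
# Hypothetical sketch of the flaw and the fix (Pydantic v2).
# Field names mirror the OpenAI-style request models; real vLLM
# classes and the actual upper bound chosen in 0.19.0 may differ.
from pydantic import BaseModel, Field, ValidationError

class UnboundedCompletionRequest(BaseModel):
    prompt: str
    n: int = 1  # no upper bound: n=10**9 is accepted, then expanded in memory

class BoundedCompletionRequest(BaseModel):
    prompt: str
    n: int = Field(default=1, ge=1, le=128)  # assumed cap for illustration

# An astronomically large n passes the unbounded model...
UnboundedCompletionRequest(prompt="hi", n=10**9)

# ...but is rejected by validation, before any per-sample allocation occurs.
try:
    BoundedCompletionRequest(prompt="hi", n=10**9)
except ValidationError as exc:
    print("rejected:", type(exc).__name__)
```

Because the bounded model fails validation inside Pydantic, the oversized request never reaches the code path that copies the request object `n` times.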

Affected Systems

Affected instances are vLLM deployments using versions from 0.1.0 up through 0.18.x. The issue is resolved in vLLM 0.19.0 and later. The affected product is the vllm-project vllm inference and serving engine for large language models.
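When auditing a fleet, the affected range can be checked mechanically. A minimal sketch, assuming plain `X.Y.Z` version strings; anything with pre-release suffixes would need a real parser such as `packaging.version`:

```python
# Affected-range check for this advisory: >= 0.1.0 and < 0.19.0.
# Assumes plain dotted-integer version strings.

def vllm_version_is_affected(ver: str) -> bool:
    """Return True if the given vLLM version falls in the vulnerable range."""
    parts = tuple(int(p) for p in ver.split(".")[:3])
    return (0, 1, 0) <= parts < (0, 19, 0)
```

This could be paired with `importlib.metadata.version("vllm")` or `pip show vllm` to read the installed version on each host.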

Risk and Exploitability

The CVSS score of 6.5 classifies the vulnerability as medium severity, with a high impact on availability and no impact on confidentiality or integrity. The EPSS score is below 1%, indicating a very low predicted likelihood of exploitation, and the vulnerability is not listed in the CISA KEV catalog. The attack vector is remote: a single HTTP request to the API server's completion endpoints is sufficient to trigger the crash.

Generated by OpenCVE AI on April 7, 2026 at 01:43 UTC.

Remediation

The vendor has released a fix: vLLM 0.19.0 adds upper-bound validation on the 'n' parameter. No workaround is provided for earlier versions beyond restricting access to the API server.

OpenCVE Recommended Actions

  • Upgrade vLLM to version 0.19.0 or newer, which adds upper-bound validation for the 'n' parameter.
  • If an immediate upgrade is not possible, restrict external access to the OpenAI-compatible API endpoint to trusted internal networks or apply firewall rules to block requests from untrusted sources.
  • As a temporary safeguard, implement application‑level rate limiting or a reverse‑proxy that enforces a maximum 'n' value to prevent oversized requests from reaching the server.
  • Continuously monitor request logs for unusually large 'n' values and be prepared to restart the service quickly if an OOM event occurs.
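The interim mitigation above (a reverse proxy that enforces a maximum 'n') can be sketched as a small pre-forwarding check applied to the JSON body of requests to the completion endpoints. `MAX_N` is an assumed operational cap chosen for illustration, not a value from the advisory:

```python
# Sketch of a guard a reverse proxy or ASGI middleware could apply before
# forwarding a request to the vLLM OpenAI-compatible server.
import json

MAX_N = 16  # assumed cap; tune to your workload

def reject_oversized_n(raw_body: bytes) -> tuple[bool, str]:
    """Return (allowed, reason); reject bodies whose 'n' exceeds MAX_N."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return False, "malformed JSON"
    n = payload.get("n", 1)  # OpenAI-style default when 'n' is omitted
    if not isinstance(n, int) or n < 1 or n > MAX_N:
        return False, f"'n' must be an integer in [1, {MAX_N}]"
    return True, "ok"
```

Rejecting at the proxy keeps oversized requests from ever reaching the event loop of the vulnerable server, which is the point at which the unbounded allocation occurs.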



Advisories
Source       ID                    Title
GitHub GHSA  GHSA-3mwp-wvh9-7528   vLLM: Unauthenticated OOM Denial of Service via Unbounded `n` Parameter in OpenAI API Server
History

Mon, 20 Apr 2026 18:45:00 +0000

Type: First Time appeared
  Values Added: Vllm, Vllm vllm
Type: CPEs
  Values Added: cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*
Type: Vendors & Products
  Values Added: Vllm, Vllm vllm

Tue, 07 Apr 2026 15:15:00 +0000

Type: Metrics
  Values Added: ssvc {'options': {'Automatable': 'no', 'Exploitation': 'none', 'Technical Impact': 'partial'}, 'version': '2.0.3'}


Tue, 07 Apr 2026 00:00:00 +0000

Type: First Time appeared
  Values Added: Vllm-project, Vllm-project vllm
Type: Weaknesses
  Values Added: CWE-1284
Type: Vendors & Products
  Values Added: Vllm-project, Vllm-project vllm
Type: References
Type: Metrics
  Values Removed: threat_severity: None
  Values Added: threat_severity: Important


Mon, 06 Apr 2026 16:45:00 +0000

Type: Description
  Values Added: vLLM is an inference and serving engine for large language models (LLMs). From 0.1.0 to before 0.19.0, a Denial of Service vulnerability exists in the vLLM OpenAI-compatible API server. Due to the lack of an upper bound validation on the n parameter in the ChatCompletionRequest and CompletionRequest Pydantic models, an unauthenticated attacker can send a single HTTP request with an astronomically large n value. This completely blocks the Python asyncio event loop and causes immediate Out-Of-Memory crashes by allocating millions of request object copies in the heap before the request even reaches the scheduling queue. This vulnerability is fixed in 0.19.0.
Type: Title
  Values Added: vLLM Affected by Unauthenticated OOM Denial of Service via Unbounded `n` Parameter in OpenAI API Server
Type: Weaknesses
  Values Added: CWE-770
Type: References
Type: Metrics
  Values Added: cvssV3_1 {'score': 6.5, 'vector': 'CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H'}


MITRE

Status: PUBLISHED

Assigner: GitHub_M

Published:

Updated: 2026-04-07T14:17:12.597Z

Reserved: 2026-03-30T19:17:10.225Z

Link: CVE-2026-34756

Vulnrichment

Updated: 2026-04-07T14:16:38.139Z

NVD

Status : Analyzed

Published: 2026-04-06T16:16:36.610

Modified: 2026-04-20T18:30:39.493

Link: CVE-2026-34756

Redhat

Severity : Important

Published: 2026-04-06T15:40:03Z

Links: CVE-2026-34756 - Bugzilla

cve-icon OpenCVE Enrichment

Updated: 2026-04-07T06:54:58Z

Weaknesses