vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, the /v1/chat/completions and /tokenize endpoints allow a chat_template_kwargs request parameter that is used in the code before it is properly validated against the chat template. With the right chat_template_kwargs parameters, it is possible to block processing of the API server for long periods of time, delaying all other requests. This issue has been patched in version 0.11.1.
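The advisory does not spell out which kwargs values trigger the slowdown, so the sketch below only illustrates the shape of an affected request against a pre-0.11.1 server; the endpoint URL, model name, and kwargs payload are placeholders, not an exploit.

```python
# Sketch of the request shape only (assumptions: a pre-0.11.1 vLLM server on
# localhost:8000; "some-model" and the kwargs payload are placeholders).
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "some-model",
        "messages": [{"role": "user", "content": "hello"}],
        # Before 0.11.1, this dict is fed into chat-template rendering
        # before being validated against the template itself.
        "chat_template_kwargs": {"example_key": "example_value"},
    },
    timeout=30,
)
print(resp.status_code)
```

The same parameter is accepted by the /tokenize endpoint, so both routes share the exposure.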
Metrics
Affected Vendors & Products
Advisories
| Source | ID | Title |
|---|---|---|
| Github GHSA | GHSA-69j4-grxj-j64p | vLLM vulnerable to DoS via large Chat Completion or Tokenization requests with specially crafted `chat_template_kwargs` |
Fixes
Solution
No solution text given by the vendor in this record; per the description, the issue is patched in version 0.11.1, so upgrading to 0.11.1 or later resolves it.
Workaround
No workaround given by the vendor.
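Since no vendor workaround is listed, upgrading to 0.11.1 is the only confirmed remedy. Purely as a stop-gap sketch (assuming you operate a proxy or middleware layer in front of vLLM, and that your clients do not legitimately need the field; nothing here is vendor-endorsed), one generic hardening pattern is to strip `chat_template_kwargs` from untrusted request bodies before they reach the server:

```python
# Hypothetical stop-gap sketch, NOT a vendor workaround: strip the
# chat_template_kwargs field from untrusted JSON bodies at a proxy layer
# in front of a vLLM server that cannot yet be upgraded to 0.11.1.
import json

def sanitize_body(raw_body: bytes) -> bytes:
    """Remove chat_template_kwargs before forwarding a request to vLLM."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return raw_body  # not JSON; let the upstream server reject it
    if isinstance(payload, dict):
        payload.pop("chat_template_kwargs", None)  # drop the risky field
    return json.dumps(payload).encode()
```

Dropping the field changes behavior for any client that legitimately relies on it, so this trades functionality for availability; it is not a substitute for the upgrade.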
References
History
Fri, 21 Nov 2025 01:45:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| Description | | vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, the /v1/chat/completions and /tokenize endpoints allow a chat_template_kwargs request parameter that is used in the code before it is properly validated against the chat template. With the right chat_template_kwargs parameters, it is possible to block processing of the API server for long periods of time, delaying all other requests. This issue has been patched in version 0.11.1. |
| Title | | vLLM vulnerable to DoS via large Chat Completion or Tokenization requests with specially crafted `chat_template_kwargs` |
| Weaknesses | | CWE-770 |
| References | | |
| Metrics | | cvssV3_1 |
cve.org
Status: PUBLISHED
Assigner: GitHub_M
Published:
Updated: 2025-11-21T01:21:29.546Z
Reserved: 2025-10-13T16:26:12.180Z
Link: CVE-2025-62426
nvd.nist.gov
Status: Received
Published: 2025-11-21T02:15:43.570
Modified: 2025-11-21T02:15:43.570
Link: CVE-2025-62426
OpenCVE Enrichment
No data.