
vLLM deserialization vulnerability leading to DoS and potential RCE

High severity GitHub Reviewed Published Nov 20, 2025 in vllm-project/vllm • Updated Nov 21, 2025

Package

vllm (pip)

Affected versions

>= 0.10.2, < 0.11.1

Patched versions

0.11.1
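As a rough aid for triage, the affected range above can be expressed as a small version check. This is a minimal sketch: the helper names `release_tuple` and `is_affected` are hypothetical, and only the leading numeric `X.Y.Z` part of the version string is compared, so pre-release or local suffixes (for example `rc1`) are ignored and should be reviewed by hand.

```python
import re

AFFECTED_MIN = (0, 10, 2)  # first affected release (inclusive)
PATCHED = (0, 11, 1)       # first patched release (exclusive upper bound)


def release_tuple(version: str) -> tuple[int, ...]:
    """Extract the leading X.Y.Z components of a version string as integers."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version: str) -> bool:
    """Return True if the version falls in the range >= 0.10.2, < 0.11.1."""
    return AFFECTED_MIN <= release_tuple(version) < PATCHED


if __name__ == "__main__":
    for candidate in ("0.10.1", "0.10.2", "0.11.0", "0.11.1"):
        print(candidate, is_affected(candidate))
```

In a real environment the installed version string could be obtained with `importlib.metadata.version("vllm")` and passed to `is_affected`.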

Description

Summary

A memory corruption vulnerability that can lead to a crash (denial of service) and potentially remote code execution (RCE) exists in the Completions API endpoint of vLLM versions 0.10.2 up to, but not including, 0.11.1. When processing user-supplied prompt embeddings, the endpoint loads serialized tensors using torch.load() without sufficient validation.

Due to a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default. As a result, maliciously crafted tensors can bypass internal bounds checks and trigger an out-of-bounds memory write during the call to to_dense(). This memory corruption can crash vLLM and potentially lead to code execution on the server hosting vLLM.
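The mechanism can be illustrated with a toy model in plain Python. This is not PyTorch's implementation: the function `coo_to_dense_1d` is hypothetical and only models a one-dimensional sparse tensor in coordinate form, where each stored value is scattered into a dense buffer at its index. With the invariant check skipped, an index outside the buffer reaches the scatter step. Python can only record that here, whereas native code writing to raw memory would write outside the allocation.

```python
def coo_to_dense_1d(indices, values, size, check_invariants):
    """Toy model of densifying a 1-D coordinate-format sparse tensor."""
    if check_invariants:
        # The integrity check: every index must address a slot in the buffer.
        for i in indices:
            if not 0 <= i < size:
                raise ValueError(f"index {i} out of bounds for size {size}")

    buffer = [0.0] * size
    out_of_bounds_writes = []
    for i, v in zip(indices, values):
        if 0 <= i < size:
            buffer[i] += v
        else:
            # Native code without a bounds check would write past the buffer here.
            out_of_bounds_writes.append((i, v))
    return buffer, out_of_bounds_writes


if __name__ == "__main__":
    print(coo_to_dense_1d([0, 2], [1.0, 2.0], 3, check_invariants=True))
    print(coo_to_dense_1d([0, 7], [1.0, 9.0], 3, check_invariants=False))
```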

Details

The Completions API endpoint in vLLM does not sufficiently check user-provided tensors when loading them, so a crafted tensor can trigger an out-of-bounds write, which may lead to RCE. The root cause is that, since PyTorch 2.8.0, torch.load(tensor, weights_only=True) no longer performs validity checks on sparse tensors by default. These checks must be enabled explicitly using the torch.sparse.check_sparse_tensor_invariants context manager.
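To show what the context manager does, the following minimal sketch builds a sparse tensor with an index that lies outside the declared size while the invariant checks are enabled. The helper name `build_checked` is hypothetical. Nothing is densified unless the tensor has passed the checks, so the sketch does not exercise the unsafe path.

```python
import torch


def build_checked(indices, values, size):
    """Construct a sparse COO tensor with invariant checks enabled."""
    with torch.sparse.check_sparse_tensor_invariants():
        return torch.sparse_coo_tensor(indices, values, size)


values = torch.tensor([1.0, 2.0])

# Index 5 does not fit a tensor of size (3,), so the checked constructor refuses it.
try:
    build_checked(torch.tensor([[0, 5]]), values, (3,))
    rejected = False
except RuntimeError as exc:
    rejected = True
    print("rejected invalid sparse tensor:", exc)

# A well-formed tensor passes the checks and can be densified safely.
valid = build_checked(torch.tensor([[0, 2]]), values, (3,))
print(valid.to_dense())
```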

The vulnerable code is in vllm/entrypoints/renderer.py, at line 148:

    def _load_and_validate_embed(embed: bytes) -> EngineEmbedsPrompt:
        tensor = torch.load(
            io.BytesIO(pybase64.b64decode(embed, validate=True)),
            weights_only=True,
            map_location=torch.device("cpu"),
        )
        assert isinstance(tensor, torch.Tensor) and tensor.dtype in (
            torch.float32,
            torch.bfloat16,
            torch.float16,
        )
        tensor = tensor.to_dense()

Because of the missing checks, loading an invalid prompt embedding tensor supplied by the user can cause an out-of-bounds write in the call to to_dense().
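One way the loading step could be hardened is sketched below. It is an illustration under stated assumptions, not necessarily what the upstream fix does. The function name `load_embed_checked` is hypothetical, the standard-library `base64` module stands in for `pybase64`, and the plain tensor is returned instead of an `EngineEmbedsPrompt`. The sketch wraps `torch.load()` in `torch.sparse.check_sparse_tensor_invariants` and replaces the `assert` with explicit exceptions, since assertions are stripped when Python runs with `-O`.

```python
import base64
import io

import torch

ALLOWED_DTYPES = (torch.float32, torch.bfloat16, torch.float16)


def load_embed_checked(embed: bytes) -> torch.Tensor:
    """Decode and load a base64-encoded tensor with sparse invariant checks on."""
    raw = base64.b64decode(embed, validate=True)

    # Enable sparse tensor invariant checks for the duration of the load,
    # so malformed sparse tensors are rejected before they are used.
    with torch.sparse.check_sparse_tensor_invariants():
        tensor = torch.load(
            io.BytesIO(raw),
            weights_only=True,
            map_location=torch.device("cpu"),
        )

    if not isinstance(tensor, torch.Tensor):
        raise ValueError("payload did not deserialize to a tensor")
    if tensor.dtype not in ALLOWED_DTYPES:
        raise ValueError(f"unsupported dtype: {tensor.dtype}")

    # to_dense() returns dense tensors unchanged and densifies validated sparse ones.
    return tensor.to_dense()


if __name__ == "__main__":
    buffer = io.BytesIO()
    torch.save(torch.ones(2, 3), buffer)
    print(load_embed_checked(base64.b64encode(buffer.getvalue())).shape)
```

A stricter variant, if sparse prompt embeddings are not needed at all, would be to reject any tensor whose `layout` is not `torch.strided`.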

Impact

All users with access to this API are able to exploit this vulnerability. Unsafe deserialization of untrusted input can be abused to achieve DoS and potentially remote code execution (RCE) in the vLLM server process. This impacts deployments running vLLM as a server or any instance that deserializes untrusted/model-provided payloads.

Fix

vllm-project/vllm#27204

Acknowledgements

Finder: AXION Security Research Team (Omri Fainaro, Bary Levy): discovery and coordinated disclosure.

References

@russellb published to vllm-project/vllm Nov 20, 2025
Published to the GitHub Advisory Database Nov 20, 2025
Reviewed Nov 20, 2025
Published by the National Vulnerability Database Nov 21, 2025
Last updated Nov 21, 2025

Severity

High

CVSS overall score

This score calculates overall vulnerability severity from 0 to 10 and is based on the Common Vulnerability Scoring System (CVSS).
/ 10

CVSS v3 base metrics

Attack vector: Network
Attack complexity: Low
Privileges required: Low
User interaction: None
Scope: Unchanged
Confidentiality: High
Integrity: High
Availability: High

CVSS v3 base metrics

Attack vector: More severe the more remote (logically and physically) an attacker can be in order to exploit the vulnerability.
Attack complexity: More severe for the least complex attacks.
Privileges required: More severe if no privileges are required.
User interaction: More severe when no user interaction is required.
Scope: More severe when a scope change occurs, e.g. one vulnerable component impacts resources in components beyond its security scope.
Confidentiality: More severe when loss of data confidentiality is highest, measuring the level of data access available to an unauthorized user.
Integrity: More severe when loss of data integrity is the highest, measuring the consequence of data modification possible by an unauthorized user.
Availability: More severe when the loss of impacted component availability is highest.
CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
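For readers who want to see how a vector string maps to a number, the sketch below applies the CVSS v3.1 base score equations to it. It is a minimal illustration that handles only the Scope: Unchanged case, which is what this vector uses. The names `WEIGHTS`, `roundup`, and `base_score` are hypothetical. The metric weights and the rounding rule are taken from the CVSS v3.1 specification.

```python
import math

# Numeric weights from the CVSS v3.1 specification (PR values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS v3.1 vector with Scope: Unchanged."""
    # Skip the leading "CVSS:3.1" element, then split each "metric:value" pair.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")

    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # prints 8.8
```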

EPSS score

Exploit Prediction Scoring System (EPSS)

This score estimates the probability of this vulnerability being exploited within the next 30 days. Data provided by FIRST.
(43rd percentile)

Weaknesses

Improper Input Validation

The product receives input or data, but it does not validate or incorrectly validates that the input has the properties that are required to process the data safely and correctly.

Write-what-where Condition

Any condition where the attacker has the ability to write an arbitrary value to an arbitrary location, often as the result of a buffer overflow.

Deserialization of Untrusted Data

The product deserializes untrusted data without sufficiently verifying that the resulting data will be valid.

Out-of-bounds Write

The product writes data past the end, or before the beginning, of the intended buffer.

CVE ID

CVE-2025-62164

GHSA ID

GHSA-mrw7-hf4f-83pf
