Skip to content

DoS with incorrect shape of multimodal embedding inputs

Moderate
russellb published GHSA-pmqf-x6x8-p7qw Nov 20, 2025

Package

pip vllm (pip)

Affected versions

>=0.5.5

Patched versions

>=0.11.1

Description

Summary

Users can crash the vLLM engine serving multimodal models by passing multimodal embedding inputs with correct ndim but incorrect shape (e.g. hidden dimension is wrong), regardless of whether the model is intended to support such inputs (as defined in the Supported Models page).

The issue has existed ever since we added support for image embedding inputs, i.e. #6613 (released in v0.5.5)

Details

Using image embeddings as an example:

  • For models that support image embedding inputs, the engine crashes when scattering the embeddings to inputs_embeds (mismatched shape)
  • For models that don't support image embedding inputs, the engine crashes when validating the inputs inside get_input_embeddings (validation fails).

This happens because we only validate ndim of the tensor, but not the full shape, in input processor (via MultiModalDataParser).

Impact

  • Denial of service by crashing the engine

Mitigation

  • Use API key to limit access to trusted users.
  • Set --limit-mm-per-prompt to 0 for all non-text modalities to ban multimodal inputs, which includes multimodal embedding inputs. However, the model would then only accept text, defeating the purpose of using a multi-modal model.

Resolution

Severity

Moderate

CVSS overall score

This score calculates overall vulnerability severity from 0 to 10 and is based on the Common Vulnerability Scoring System (CVSS).
/ 10

CVSS v3 base metrics

Attack vector
Network
Attack complexity
Low
Privileges required
Low
User interaction
None
Scope
Unchanged
Confidentiality
None
Integrity
None
Availability
High

CVSS v3 base metrics

Attack vector: More severe the more the remote (logically and physically) an attacker can be in order to exploit the vulnerability.
Attack complexity: More severe for the least complex attacks.
Privileges required: More severe if no privileges are required.
User interaction: More severe when no user interaction is required.
Scope: More severe when a scope change occurs, e.g. one vulnerable component impacts resources in components beyond its security scope.
Confidentiality: More severe when loss of data confidentiality is highest, measuring the level of data access available to an unauthorized user.
Integrity: More severe when loss of data integrity is the highest, measuring the consequence of data modification possible by an unauthorized user.
Availability: More severe when the loss of impacted component availability is highest.
CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

CVE ID

CVE-2025-62372

Weaknesses

Improper Validation of Array Index

The product uses untrusted input when calculating or using an array index, but the product does not validate or incorrectly validates the index to ensure the index references a valid position within the array. Learn more on MITRE.

Credits