Conversation

@mgoin (Member) commented Dec 9, 2025

Purpose

The transformers backend's WeightsMapper uses a broad prefix mapping ("" -> "model.") to transform weight paths. This mapper is also applied to quantization config targets via apply_vllm_mapper. However, compressed-tensors targets can be module class names (e.g., "Linear") or regex patterns (e.g., "re:.*proj"), not just layer paths.

When "Linear" is transformed to "model.Linear", the target matching fails because the module class name check looks for "Linear" in "model.Linear" (substring match), which fails.

The fix filters which targets are transformed: only layer paths (those containing "." and not starting with "re:") are mapped, while class names and regex patterns are preserved as-is.
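
A minimal sketch of that filter (the helper name _should_map_target is illustrative, not the exact code in this PR):

def _should_map_target(target: str) -> bool:
    # Only dotted layer paths (e.g. "model.layers.0.mlp.down_proj") should be
    # rewritten by the WeightsMapper. Bare class names ("Linear") contain no
    # dot, and regex targets ("re:.*proj") are excluded by the prefix check.
    return "." in target and not target.startswith("re:")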

Test Plan

# Previously failing, now passing - compressed-tensors with transformers backend
vllm serve RedHatAI/Qwen3-0.6B-FP8-BLOCK --model-impl transformers

# Verify native backend still works
vllm serve RedHatAI/Qwen3-0.6B-FP8-BLOCK

# Verify fp8 (non-compressed-tensors) still works with transformers
vllm serve Qwen/Qwen3-0.6B-FP8 --model-impl transformers

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mgoin added the bug, quantization, and ready (ONLY add when PR is ready to merge/full CI is needed) labels Dec 9, 2025
@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request correctly identifies and addresses an issue where the WeightsMapper was incorrectly transforming non-path targets in compressed-tensors quantization configurations. The approach of conditionally applying the mapping is sound. However, I've identified a subtle bug in the implementation of the helper functions _apply_dict and _apply_list. They use a truthiness check that would incorrectly filter out targets mapping to an empty string, potentially leading to silent misconfigurations. My review includes a suggested fix to ensure correct behavior by using an explicit is not None check, aligning with the original WeightsMapper logic.
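
Concretely, the distinction the review points at looks like this (a minimal sketch; the signature of _apply_dict in the PR may differ):

def _apply_dict(map_name, d: dict) -> dict:
    out = {}
    for target, value in d.items():
        mapped = map_name(target)
        # A truthiness check (`if mapped:`) would also drop a target that the
        # mapper rewrites to "" (empty string), silently losing its config.
        # Only an explicit None result should mean "drop this entry", matching
        # the original WeightsMapper behavior.
        if mapped is not None:
            out[mapped] = value
    return out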

…pressed_tensors.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Michael Goin <[email protected]>
@Isotr0py (Member) left a comment


LGTM

@mgoin (Member, Author) commented Dec 9, 2025

cc @eldarkurtic to fix your issue reported offline

@vllm-bot merged commit 03b91f7 into vllm-project:main Dec 9, 2025
52 of 54 checks passed
@github-project-automation bot moved this from Todo to Done in Transformers backend Dec 9, 2025
mayoohee pushed a commit to mayoohee/vllm that referenced this pull request Dec 9, 2025
…ers backend (vllm-project#30287)

Signed-off-by: mgoin <[email protected]>
Signed-off-by: Michael Goin <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: mayoohee <[email protected]>