Commit 03416ea

[bugfix][quantization] Fix fp8 per_tensor scale shape (#30257)

Signed-off-by: Haoyang Li <[email protected]>
Parent: c72ea10

1 file changed: 1 addition, 1 deletion

vllm/_custom_ops.py (1 addition, 1 deletion)

```diff
@@ -1726,7 +1726,7 @@ def scaled_fp8_quant(
                 output, input, scale, scale_ub
             )
         else:
-            scale = torch.empty((1, 1), device=input.device, dtype=torch.float32)
+            scale = torch.empty(1, device=input.device, dtype=torch.float32)
             torch.ops._C.dynamic_scaled_fp8_quant(output, input, scale)
     else:
         assert scale.numel() == 1, f"{scale.shape}"
```
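
For context, dynamic per-tensor fp8 quantization computes a single scale for the whole input tensor, so the scale buffer only needs one element; the fix allocates it as a 1-D tensor of shape (1,) instead of (1, 1). The snippet below is a minimal pure-PyTorch sketch of that idea, not vLLM's `torch.ops._C.dynamic_scaled_fp8_quant` CUDA kernel; the helper name `dynamic_scaled_fp8_quant_ref` and the exact clamping/rounding details are illustrative assumptions.

```python
import torch

# Largest finite value representable in float8 e4m3 (448.0).
FP8_MAX = torch.finfo(torch.float8_e4m3fn).max

def dynamic_scaled_fp8_quant_ref(x: torch.Tensor):
    """Illustrative (non-vLLM) dynamic per-tensor fp8 quantization."""
    # One scale for the whole tensor, stored as a 1-element float32 tensor,
    # matching the torch.empty(1, ...) allocation in the fixed code path.
    scale = (x.abs().amax().float() / FP8_MAX).clamp(min=1e-12).reshape(1)
    q = (x.float() / scale).clamp(-FP8_MAX, FP8_MAX).to(torch.float8_e4m3fn)
    return q, scale

x = torch.randn(16, 32)
q, scale = dynamic_scaled_fp8_quant_ref(x)
# Same invariant the pre-computed-scale branch of scaled_fp8_quant asserts.
assert scale.shape == (1,) and scale.numel() == 1, f"{scale.shape}"
```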
