Merge pull request #666 from codinglover222/deepseek-doc-fix

GeeeekExplorer · web-flow · commit 4cc6253d5c22 · 2025-04-09T09:50:40.000+08:00
Fix the shape given for the `s` argument in the `weight_dequant` docstring.
diff --git a/inference/kernel.py b/inference/kernel.py
@@ -87,7 +87,7 @@ def weight_dequant(x: torch.Tensor, s: torch.Tensor, block_size: int = 128) -> t
 
     Args:
         x (torch.Tensor): The quantized weight tensor of shape (M, N).
-        s (torch.Tensor): The scale tensor of shape (M, N).
+        s (torch.Tensor): The scale tensor of shape (M//block_size, N//block_size).
         block_size (int, optional): The block size to use for dequantization. Defaults to 128.
 
     Returns:
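
For reviewers, the corrected shape can be checked against a small reference sketch of block-wise dequantization, in which each `block_size` x `block_size` tile of `x` shares one entry of `s`. This is plain Python for illustration only, not the Triton kernel in `inference/kernel.py`. The names `scale_shape` and `weight_dequant_reference` are hypothetical, and the use of ceiling division for dimensions that are not multiples of `block_size` is an assumption of this sketch. The docstring in the diff states floor division, which gives the same result when `M` and `N` divide evenly.

```python
def scale_shape(m: int, n: int, block_size: int = 128) -> tuple:
    """Hypothetical helper: one scale entry per block_size x block_size tile."""
    # Ceiling division (assumption of this sketch); it equals
    # (m // block_size, n // block_size) when both dimensions divide evenly.
    return (-(-m // block_size), -(-n // block_size))


def weight_dequant_reference(x, s, block_size=128):
    """Hypothetical reference: y[i][j] = x[i][j] * s[i // block_size][j // block_size]."""
    m, n = len(x), len(x[0])
    # The scale tensor is indexed per tile, so its shape is not (M, N).
    assert (len(s), len(s[0])) == scale_shape(m, n, block_size), "unexpected scale shape"
    return [
        [x[i][j] * s[i // block_size][j // block_size] for j in range(n)]
        for i in range(m)
    ]


# Tiny example with block_size=2 so the tiles are easy to see.
x = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]
s = [[0.5, 2.0]]  # shape (2 // 2, 4 // 2) = (1, 2)
y = weight_dequant_reference(x, s, block_size=2)
# y == [[0.5, 1.0, 6.0, 8.0], [2.5, 3.0, 14.0, 16.0]]
```

With the default `block_size` of 128, a weight of shape (256, 512) would pair with a scale tensor of shape (2, 4), which is what the updated docstring line describes.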