Commit b7a869d

fix
Signed-off-by: Barbara Suslova <[email protected]>
1 parent f475f81 commit b7a869d

File tree: 2 files changed (+3, −2 lines)

csrc/moe/moe_fused_gate.cu

Lines changed: 2 additions & 0 deletions

@@ -1,3 +1,5 @@
+// copied from
+// https://github.com/sgl-project/sglang/blob/v0.5.5/sgl-kernel/csrc/moe/moe_fused_gate.cu
 #include <ATen/cuda/CUDAContext.h>
 #include <cuda_runtime.h>
 #include <cutlass/array.h>

vllm/model_executor/layers/fused_moe/layer.py

Lines changed: 1 addition & 2 deletions

@@ -5,7 +5,6 @@
 from collections.abc import Callable, Iterable
 from contextlib import nullcontext
 from enum import Enum
-from functools import partial
 from typing import Literal, cast, get_args, overload

 import torch

@@ -2053,7 +2052,7 @@ def combine_output(states: torch.Tensor) -> torch.Tensor:

             return states

-        if self.shared_experts is not None and self.num_fused_shared_experts == 0:
+        if self.shared_experts is not None:
             return (
                 final_hidden_states[0],
                 combine_output(final_hidden_states[1]),
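The behavioral effect of the layer.py change can be read directly off the diff: before the commit, the tuple return path additionally required `num_fused_shared_experts == 0`; after it, the path is taken whenever `shared_experts` is set. A minimal sketch (hypothetical standalone functions, not vLLM's actual `FusedMoE` class) illustrating only that condition change:

```python
# Hypothetical predicates modeling the pre- and post-commit condition.
# They are not vLLM code; they only mirror the boolean logic in the diff.

def returns_tuple_before(shared_experts, num_fused_shared_experts):
    # Pre-commit: tuple path also required zero fused shared experts.
    return shared_experts is not None and num_fused_shared_experts == 0

def returns_tuple_after(shared_experts, num_fused_shared_experts):
    # Post-commit: num_fused_shared_experts is no longer consulted.
    return shared_experts is not None

# With shared experts present but fused (count > 0), only the
# post-commit condition takes the tuple return path.
assert not returns_tuple_before(object(), 1)
assert returns_tuple_after(object(), 1)

# With no shared experts, both conditions skip the tuple path.
assert not returns_tuple_before(None, 0)
assert not returns_tuple_after(None, 0)
```

In other words, the fix makes the tuple `(final_hidden_states[0], combine_output(final_hidden_states[1]))` return unconditional on the fused-shared-experts count, so shared-expert outputs are returned even when shared experts have been fused into the routed experts.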
