@@ -2648,6 +2648,7 @@ At this point, multi-node distributed inference with **PP + TP** is running, coo
 | DeepSeek-R1-0528-Qwen3-8B | language model | |
 | DeepSeek-R1-Distill-1.5B/7B/8B/14B/32B/70B | language model | |
 | Qwen3-8B/14B/32B | language model | |
+| DeepSeek-V2-Lite | language model | export VLLM_MLA_DISABLE=1, see the sketch below the table |
 | QwQ-32B | language model | |
 | Ministral-8B | language model | |
 | Mixtral-8x7B | language model | |
@@ -2656,6 +2657,8 @@ At this point, multi-node distributed inference with **PP + TP** is running, coo
 | codegeex4-all-9b | language model | with chat_template |
 | DeepSeek-Coder-33B | language model | |
 | GLM-4-0414-9B/32B | language model | |
+| Seed-OSS-36B-Instruct | language model | |
+| Hunyuan-0.5B/7B-Instruct | language model | follow the guide [here](#31-how-to-use-hunyuan-7b-instruct) |
 | Qwen3 30B-A3B/Coder-30B-A3B-Instruct | language MOE model | |
 | GLM-4.5-Air | language MOE model | |
 | Qwen2-VL-7B-Instruct | multimodal model | |
@@ -2665,6 +2668,7 @@ At this point, multi-node distributed inference with **PP + TP** is running, coo
 | InternVL2-8B | multimodal model | |
 | InternVL3-8B | multimodal model | |
 | InternVL3_5-8B | multimodal model | |
+| InternVL3_5-30B-A3B | multimodal MOE model | |
 | GLM-4.1V-Thinking | multimodal model | |
 | dots.ocr | multimodal model | |
 | Qwen2.5-VL 7B/32B/72B | multimodal model | pip install transformers==4.52.4 |
@@ -2674,11 +2678,35 @@ At this point, multi-node distributed inference with **PP + TP** is running, coo
 | Qwen2.5-Omni-7B | omni model | pip install librosa soundfile |
 | whisper-medium/large-v3-turbo | audio model | pip install transformers==4.52.4 librosa |
 | Qwen3-Embedding | Embedding | |
-| bge-large, bge-m3 | Embedding | |
+| bge-large, bge-m3, bce-base-v1 | Embedding | |
 | Qwen3-Reranker | Rerank | |
 | bge-reranker-large, bge-reranker-v2-m3 | Rerank | |
 ---
 
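+As noted in the table, DeepSeek-V2-Lite currently needs MLA disabled. A minimal launch sketch, assuming vLLM's OpenAI-compatible server; the model path and port here are placeholders, not values from this guide:
+
+```bash
+# Hypothetical launch: disable MLA before starting the server
+export VLLM_MLA_DISABLE=1
+python -m vllm.entrypoints.openai.api_server \
+    --model /path/to/DeepSeek-V2-Lite \
+    --trust-remote-code \
+    --port 8001
+```
+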
+### 3.1 How to use Hunyuan-7B-Instruct
+Install a newer transformers version:
+```bash
+pip install transformers==4.56.1
+```
+
+Use the request format shown [here](https://huggingface.co/tencent/Hunyuan-7B-Instruct#use-with-transformers); you can decide whether to enable thinking (the example below disables it with the `/no_think` prefix).
+```bash
+curl http://localhost:8001/v1/chat/completions -H 'Content-Type: application/json' -d '{
+  "model": "Hunyuan-7B-Instruct",
+  "messages": [
+    {
+      "role": "system",
+      "content": [{"type": "text", "text": "You are a helpful assistant."}]
+    },
+    {
+      "role": "user",
+      "content": [{"type": "text", "text": "/no_thinkWhat is AI?"}]
+    }
+  ],
+  "max_tokens": 128
+}'
+```
+
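+The request above assumes a server is already listening on port 8001 under the served model name Hunyuan-7B-Instruct. A minimal launch sketch under those assumptions (the model path is a placeholder):
+
+```bash
+# Hypothetical launch of the OpenAI-compatible server for Hunyuan-7B-Instruct
+vllm serve /path/to/Hunyuan-7B-Instruct \
+    --served-model-name Hunyuan-7B-Instruct \
+    --trust-remote-code \
+    --port 8001
+```
+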
 ## 4. Troubleshooting
 
 ### 4.1 ModuleNotFoundError: No module named 'vllm._C'