
Commit 9e7f445

Update readme for hunyuan and other new models (#97)

* update readme
* update

1 parent 927de0e commit 9e7f445

File tree

1 file changed: +29 −1 lines changed

vllm/README.md

Lines changed: 29 additions & 1 deletion
@@ -2648,6 +2648,7 @@ At this point, multi-node distributed inference with **PP + TP** is running, coo
 | DeepSeek-R1-0528-Qwen3-8B | language model | |
 | DeepSeek-R1-Distill-1.5B/7B/8B/14B/32B/70B | language model | |
 | Qwen3-8B/14B/32B | language model | |
+| DeepSeek-V2-Lite | language model | export VLLM_MLA_DISABLE=1 |
 | QwQ-32B | language model | |
 | Ministral-8B | language model | |
 | Mixtral-8x7B | language model | |
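The DeepSeek-V2-Lite row above requires MLA to be disabled via an environment variable. A minimal launch sketch, assuming a typical model ID and port (both illustrative, not taken from the diff):

```shell
# Assumption: model ID and port are illustrative examples.
# VLLM_MLA_DISABLE=1 makes vLLM fall back from MLA attention,
# as noted in the DeepSeek-V2-Lite table row.
export VLLM_MLA_DISABLE=1
vllm serve deepseek-ai/DeepSeek-V2-Lite --port 8001
```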
@@ -2656,6 +2657,8 @@ At this point, multi-node distributed inference with **PP + TP** is running, coo
 | codegeex4-all-9b | language model | with chat_template |
 | DeepSeek-Coder-33B | language model | |
 | GLM-4-0414-9B/32B | language model | |
+| Seed-OSS-36B-Instruct | language model | |
+| Hunyuan-0.5B/7B-Instruct | language model | follow the guide [here](#31-how-to-use-hunyuan-7b-instruct) |
 | Qwen3 30B-A3B/Coder-30B-A3B-Instruct | language MOE model | |
 | GLM-4.5-Air | language MOE model | |
 | Qwen2-VL-7B-Instruct | multimodal model | |
@@ -2665,6 +2668,7 @@ At this point, multi-node distributed inference with **PP + TP** is running, coo
 | InternVL2-8B | multimodal model | |
 | InternVL3-8B | multimodal model | |
 | InternVL3_5-8B | multimodal model | |
+| InternVL3_5-30B-A3B | multimodal MOE model | |
 | GLM-4.1V-Thinking | multimodal model | |
 | dots.ocr | multimodal model | |
 | Qwen2.5-VL 7B/32B/72B | multimodal model | pip install transformers==4.52.4 |
@@ -2674,11 +2678,35 @@ At this point, multi-node distributed inference with **PP + TP** is running, coo
 | Qwen2.5-Omni-7B | omni model | pip install librosa soundfile |
 | whisper-medium/large-v3-turbo | audio model | pip install transformers==4.52.4 librosa |
 | Qwen3-Embedding | Embedding | |
-| bge-large, bge-m3 | Embedding | |
+| bge-large, bge-m3, bce-base-v1 | Embedding | |
 | Qwen3-Reranker | Rerank | |
 | bge-reranker-large, bge-reranker-v2-m3 | Rerank | |
 ---
 
+### 3.1 How to use Hunyuan-7B-Instruct
+
+Install the newer transformers version:
+
+```bash
+pip install transformers==4.56.1
+```
+
+Use the message format shown [here](https://huggingface.co/tencent/Hunyuan-7B-Instruct#use-with-transformers); you can decide whether to use `think` mode or not (the `/no_think` prefix in the example below disables it):
+
+```bash
+curl http://localhost:8001/v1/chat/completions -H 'Content-Type: application/json' -d '{
+  "model": "Hunyuan-7B-Instruct",
+  "messages": [
+    {
+      "role": "system",
+      "content": [{"type": "text", "text": "You are a helpful assistant."}]
+    },
+    {
+      "role": "user",
+      "content": [{"type": "text", "text": "/no_thinkWhat is AI?"}]
+    }
+  ],
+  "max_tokens": 128
+}'
+```
+
 ## 4. Troubleshooting
 
 ### 4.1 ModuleNotFoundError: No module named 'vllm.\_C'
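The Hunyuan request shown in the diff can also be assembled programmatically. A minimal sketch of a payload builder; the helper name `build_chat_payload` is hypothetical, and the `/no_think` prefix convention is taken from the example request in the diff above:

```python
import json

def build_chat_payload(user_text, system_text="You are a helpful assistant.",
                       think=True, model="Hunyuan-7B-Instruct", max_tokens=128):
    """Build an OpenAI-style chat/completions payload for Hunyuan models.

    When `think` is False, the user text is prefixed with "/no_think",
    matching the example curl request in the README diff.
    """
    if not think:
        user_text = "/no_think" + user_text
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": [{"type": "text", "text": system_text}]},
            {"role": "user",
             "content": [{"type": "text", "text": user_text}]},
        ],
        "max_tokens": max_tokens,
    }

payload = build_chat_payload("What is AI?", think=False)
print(json.dumps(payload, indent=2))
```

The resulting dict can be POSTed to the server's `/v1/chat/completions` endpoint (e.g. with `requests.post(url, json=payload)`) instead of hand-writing the curl body.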
