Prerequisites
- I have searched existing issues and reviewed documentation.
Problem Description
May I ask which parallelism techniques are currently implemented? When I set `CUDA_VISIBLE_DEVICES=0,1,2,3`, all four cards carry some load and are doing computation, yet the TODO list says expert parallelism is still being implemented, so I don't quite understand the program's current behavior. Here is the status of my compute cards (a sketch of how I would cross-check the per-device allocation follows the log):
```
(test) moeserve@test~/MoE-Infinity$ CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/interface_example.py --model_name_or_path "/home/moeserve/.cache/modelscope/hub/models/AI-ModelScope/Mixtral-8x7B-Instruct-v0.1" --offload_dir ~/offload/
Every 1.0s: nvidia-smi test-NF5468-A7-A0-R0-00: Wed Mar 19 10:45:27 2025
Wed Mar 19 10:45:27 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.15 Driver Version: 570.86.15 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4090 Off | 00000000:01:00.0 Off | Off |
| 31% 44C P0 106W / 450W | 23974MiB / 49140MiB | 12% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA GeForce RTX 4090 Off | 00000000:41:00.0 Off | Off |
| 30% 43C P0 100W / 450W | 23138MiB / 49140MiB | 10% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA GeForce RTX 4090 Off | 00000000:81:00.0 Off | Off |
| 31% 42C P0 88W / 450W | 23158MiB / 49140MiB | 11% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA GeForce RTX 4090 Off | 00000000:C1:00.0 Off | Off |
| 30% 43C P0 82W / 450W | 23408MiB / 49140MiB | 11% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 3328026 C python 23790MiB |
| 1 N/A N/A 3328026 C python 22954MiB |
| 2 N/A N/A 3328026 C python 22974MiB |
| 3 N/A N/A 3328026 C python 23224MiB |
+-----------------------------------------------------------------------------------------+
```
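For reference, here is a minimal sketch (plain PyTorch only, not using any MoE-Infinity internals; device indices assumed to map to the four RTX 4090s above via `CUDA_VISIBLE_DEVICES=0,1,2,3`) of how the per-device allocation could be cross-checked from inside the process:

```python
import torch

# Query allocated/reserved CUDA memory on each device visible to this process.
# With CUDA_VISIBLE_DEVICES=0,1,2,3, indices 0-3 here correspond to the
# four GPUs shown in the nvidia-smi output above.
for i in range(torch.cuda.device_count()):
    allocated_mib = torch.cuda.memory_allocated(i) / 1024**2
    reserved_mib = torch.cuda.memory_reserved(i) / 1024**2
    print(f"cuda:{i} ({torch.cuda.get_device_name(i)}): "
          f"{allocated_mib:.0f} MiB allocated, {reserved_mib:.0f} MiB reserved")
```

What prompted the question is that a single PID (3328026 in the log) holds roughly 23 GiB on every one of the four devices, so some form of multi-GPU placement is clearly happening.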
Proposed Solution
I
Alternatives Considered
No response
Additional Context
No response
Importance
Nice to have
Usage Statistics (Optional)
No response