docs/version3.x/pipeline_usage/PaddleOCR-VL.en.md (14 additions, 18 deletions)
```diff
@@ -89,20 +89,16 @@ Currently, PaddleOCR-VL offers four inference methods, with varying levels of su
 </tbody>
 </table>

-TIP:
-1. When using an NVIDIA GPU for inference, ensure that the Compute Capability (CC) and CUDA version meet the requirements:
-
-   - PaddlePaddle: CC ≥ 7.0, CUDA ≥ 11.8
-   - vLLM: CC ≥ 8.0, CUDA ≥ 12.6
-   - SGLang: 8.0 ≤ CC < 12.0, CUDA ≥ 12.6
-   - FastDeploy: 8.0 ≤ CC < 12.0, CUDA ≥ 12.6
-   - Common GPUs with CC ≥ 8 include the RTX 30/40/50 series and the A10/A100. For more models, refer to [CUDA GPU Compute Capability](https://developer.nvidia.com/cuda-gpus).
-
-2. vLLM compatibility note: although vLLM can be launched on NVIDIA GPUs with CC 7.x such as the T4/V100, timeout or OOM issues may occur, so its use is not recommended.
-
-3. Currently, PaddleOCR-VL does not support ARM CPUs. Support for more hardware will be added based on actual needs, so stay tuned!
-
-4. vLLM, SGLang, and FastDeploy cannot run natively on Windows or macOS. Please use the Docker images we provide.
+> TIP:
+> - When using an NVIDIA GPU for inference, ensure that the Compute Capability (CC) and CUDA version meet the requirements:
+> > - PaddlePaddle: CC ≥ 7.0, CUDA ≥ 11.8
+> > - vLLM: CC ≥ 8.0, CUDA ≥ 12.6
+> > - SGLang: 8.0 ≤ CC < 12.0, CUDA ≥ 12.6
+> > - FastDeploy: 8.0 ≤ CC < 12.0, CUDA ≥ 12.6
+> > - Common GPUs with CC ≥ 8 include the RTX 30/40/50 series and the A10/A100. For more models, refer to [CUDA GPU Compute Capability](https://developer.nvidia.com/cuda-gpus).
+> - vLLM compatibility note: although vLLM can be launched on NVIDIA GPUs with CC 7.x such as the T4/V100, timeout or OOM issues may occur, so its use is not recommended.
+> - Currently, PaddleOCR-VL does not support ARM CPUs. Support for more hardware will be added based on actual needs, so stay tuned!
+> - vLLM, SGLang, and FastDeploy cannot run natively on Windows or macOS. Please use the Docker images we provide.

 Since different hardware requires different dependencies, if your hardware meets the requirements in the table above, please refer to the following table for the corresponding tutorial to configure your environment:
```
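The CC and CUDA thresholds in this tip can be checked from Python before picking an inference method. The following is a minimal sketch, assuming a GPU build of PaddlePaddle is already installed; `paddle.device.cuda.get_device_capability` and `paddle.version.cuda` are the assumed entry points:

```python
# Sketch: report the GPU Compute Capability (CC) and the CUDA version this
# PaddlePaddle build was compiled against, for comparison with the
# thresholds in the tip above. Assumes a GPU build of PaddlePaddle.
import paddle

if paddle.device.is_compiled_with_cuda():
    major, minor = paddle.device.cuda.get_device_capability()
    print(f"Compute Capability: {major}.{minor}")    # e.g. 8.6 for an RTX 3090
    print(f"CUDA version: {paddle.version.cuda()}")  # e.g. '12.6'
else:
    print("CPU-only PaddlePaddle build: no CC/CUDA to check.")
```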
```diff
@@ -136,7 +132,7 @@ Since different hardware requires different dependencies, if your hardware meets
 </tbody>
 </table>

-> [!TIP]
+> TIP:
 > For example, if you are using an RTX 50 series GPU that meets the device requirements for both PaddlePaddle and vLLM inference methods, please refer to the [PaddleOCR-VL NVIDIA Blackwell Architecture GPU Environment Configuration Tutorial](./PaddleOCR-VL-NVIDIA-Blackwell.en.md) to complete the environment configuration before using PaddleOCR-VL.
 > **Please ensure that you install PaddlePaddle framework version 3.2.1 or above, along with the special version of safetensors.** For macOS users, please use Docker to set up the environment.
```
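Once the environment is configured, the framework-version requirement in this tip can be confirmed in two lines (a sketch; `run_check` is PaddlePaddle's built-in installation self-test):

```python
# Sketch: confirm the installed PaddlePaddle satisfies the 3.2.1+ requirement
# stated in the tip above, then run the built-in installation self-test.
import paddle

print(paddle.__version__)  # expect 3.2.1 or above
paddle.utils.run_check()   # reports whether PaddlePaddle runs on this device
```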
```diff
@@ -207,7 +203,7 @@

 ## 2. Quick Start

 PaddleOCR-VL supports two usage methods: the CLI and the Python API. The CLI is simpler and suitable for quickly verifying functionality, while the Python API is more flexible and suitable for integration into existing projects.

-> [!TIP]
+> TIP:
 > The methods introduced in this section are primarily for rapid validation. Their inference speed, memory usage, and stability may not meet the requirements of a production environment. **If deployment to a production environment is needed, we strongly recommend using a dedicated inference acceleration framework.** For specific methods, please refer to the next section.
```
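For orientation, the Python API route that this paragraph contrasts with the CLI looks roughly as follows. This is a sketch based on the `paddleocr` 3.x package; the class name `PaddleOCRVL`, the input path, and the result-saving methods are assumptions to be verified against the document being edited here:

```python
# Sketch of the Python API usage path for PaddleOCR-VL (names assumed
# from the paddleocr 3.x package; verify against the official docs).
from paddleocr import PaddleOCRVL

pipeline = PaddleOCRVL()
output = pipeline.predict("document_page.png")  # hypothetical input image
for res in output:
    res.print()                               # show the parsed result
    res.save_to_json(save_path="output")      # structured output
    res.save_to_markdown(save_path="output")  # Markdown rendering
```

The CLI route is a single `paddleocr` invocation with the pipeline's subcommand and an input argument; the exact command is listed in the documentation itself.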