
Commit 108efb4

Add qianfan support for MCP servers (#17222)
Parent commit: 830b64e

File tree

6 files changed (+151 / -44 lines)

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ log/
 
 build/
 dist/
-paddleocr.egg-info/
+*.egg-info/
 /deploy/android_demo/app/OpenCV/
 /deploy/android_demo/app/PaddleLite/
 /deploy/android_demo/app/.cxx/

docs/version3.x/deployment/mcp_server.en.md

Lines changed: 54 additions & 20 deletions
@@ -18,12 +18,15 @@ This project provides a lightweight [Model Context Protocol (MCP)](https://model
 - **Supported Working Modes**
   - **Local Python Library**: Runs PaddleOCR pipelines directly on the local machine. This mode requires a suitable local environment and hardware, and is ideal for offline use or privacy-sensitive scenarios.
   - **PaddleOCR Official Website Service**: Invokes services provided by the [PaddleOCR Official Website](https://aistudio.baidu.com/paddleocr?lang=en). This is suitable for quick testing, prototyping, or no-code scenarios.
+  - **Qianfan Platform Service**: Calls the cloud services provided by Baidu AI Cloud's Qianfan large model platform.
   - **Self-hosted Service**: Invokes the user's self-hosted PaddleOCR services. This mode combines the advantages of serving with high flexibility. It is suitable for scenarios requiring customized service configurations, as well as those with strict data privacy requirements. **Currently, only the basic serving solution is supported.**

 ## Examples:
+
 The following showcases creative use cases built with the PaddleOCR MCP server in combination with other tools:

 ### Demo 1
+
 In Claude for Desktop, extract handwritten content from images and save it to the note-taking software Notion. The PaddleOCR MCP server extracts text, formulas, and other information from images while preserving the document structure.
 <div align="center">
 <img width="65%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/mcp_demo/note_to_notion.gif" alt="note_to_notion">
@@ -35,6 +38,7 @@ In Claude for Desktop, extract handwritten content from images and save to note-
 ---

 ### Demo 2
+
 In VSCode, convert handwritten ideas or pseudocode into runnable Python scripts that comply with project coding standards with one click, and upload them to GitHub repositories. The PaddleOCR MCP server extracts the handwritten code from the images for subsequent processing.

 <div align="center">
@@ -46,15 +50,18 @@ In VSCode, convert handwritten ideas or pseudocode into runnable Python scripts
 ---

 ### Demo 3
+
 In Claude for Desktop, convert PDF documents or images containing complex tables, formulas, handwritten text, and other content into locally editable files.

 #### Demo 3.1
+
 Convert complex PDF documents with tables and watermarks to editable doc/Word format:
 <div align="center">
 <img width="70%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/mcp_demo/pdf_to_file.gif" alt="pdf_to_file">
 </div>

 #### Demo 3.2
+
 Convert images containing formulas and tables to editable csv/Excel format:
 <div align="center">
 <img width="70%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/00136903a4d0b5f11bd978cb0ef5d3c44f3aa5e9/images/paddleocr/mcp_demo/table_to_excel1.png" alt="table_to_excel1">
@@ -83,30 +90,25 @@ Convert images containing formulas and tables to editable csv/Excel format:
 
 This section explains how to install the `paddleocr-mcp` library via pip.
 
-- For the local Python library mode, you need to install both `paddleocr-mcp` and the PaddlePaddle framework along with PaddleOCR, as per the [PaddleOCR installation documentation](../installation.en.md).
-- For the PaddleOCR official website service or the self-hosted service modes, if used within MCP hosts like Claude for Desktop, the server can also be run without installation via tools like `uvx`. See [2. Using with Claude for Desktop](#2-using-with-claude-for-desktop) for details.
-
-For the local Python library mode you may optionally choose convenience extras (helpful to reduce manual dependency steps):
-
-- `paddleocr-mcp[local]`: Includes PaddleOCR (but NOT the PaddlePaddle framework itself).
-- `paddleocr-mcp[local-cpu]`: Includes PaddleOCR AND the CPU version of the PaddlePaddle framework.
-
-It is still recommended to use an isolated virtual environment to avoid conflicts.
+- For the local Python library mode, in addition to installing `paddleocr-mcp`, you also need to install the PaddlePaddle framework and PaddleOCR by referring to the [PaddleOCR installation guide](../installation.en.md).
+- For the local Python library mode, you may also consider installing the corresponding optional dependencies:
+  - `paddleocr-mcp[local]`: includes PaddleOCR (without the PaddlePaddle framework).
+  - `paddleocr-mcp[local-cpu]`: based on `local`, additionally includes the CPU version of the PaddlePaddle framework.
+- PaddleOCR also supports running the server without installation through methods like `uvx`. For details, please refer to the instructions in [2. Using with Claude for Desktop](#2-using-with-claude-for-desktop).
 
 To install `paddleocr-mcp` using pip:
 
 ```bash
 # Install from PyPI
 pip install -U paddleocr-mcp
 
-# Or install from source
-# git clone https://github.com/PaddlePaddle/PaddleOCR.git
-# pip install -e mcp_server
+# Install from source
+git clone https://github.com/PaddlePaddle/PaddleOCR.git
+pip install -e mcp_server
 
 # Install with optional extras (choose ONE of the following if you prefer convenience installs)
 # Install PaddleOCR together with the MCP server (framework not included):
 pip install "paddleocr-mcp[local]"
-
 # Install PaddleOCR and CPU PaddlePaddle framework together:
 pip install "paddleocr-mcp[local-cpu]"
 ```
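
As a minimal sketch (assuming a POSIX shell and Python 3; the environment name is arbitrary), an isolated install using one of the commands above could look like this:

```bash
# Illustrative only: create an isolated environment, then pick ONE of the
# install styles shown above (the CPU convenience extra is used here).
python -m venv paddleocr-mcp-env
source paddleocr-mcp-env/bin/activate
pip install -U "paddleocr-mcp[local-cpu]"

# Check that the console script referenced by the MCP host configs is on PATH.
command -v paddleocr_mcp
```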
@@ -262,7 +264,37 @@ Configuration example:
 
 - Do not expose your access token.
 
-#### Mode 3: Self-hosted Service
+#### Mode 3: Qianfan Platform Service
+
+1. Install `paddleocr-mcp`.
+2. Obtain an API key by referring to the [Qianfan Platform Official Documentation](https://cloud.baidu.com/doc/qianfan-api/s/ym9chdsy5).
+3. Modify the `claude_desktop_config.json` file according to the configuration example below. Set `PADDLEOCR_MCP_QIANFAN_API_KEY` to your Qianfan platform API key.
+4. Restart the MCP host.
+
+Configuration example:
+
+```json
+{
+  "mcpServers": {
+    "paddleocr-ocr": {
+      "command": "paddleocr_mcp",
+      "args": [],
+      "env": {
+        "PADDLEOCR_MCP_PIPELINE": "PaddleOCR-VL",
+        "PADDLEOCR_MCP_PPOCR_SOURCE": "qianfan",
+        "PADDLEOCR_MCP_SERVER_URL": "https://qianfan.baidubce.com/v2/ocr",
+        "PADDLEOCR_MCP_QIANFAN_API_KEY": "<your-api-key>"
+      }
+    }
+  }
+}
+```
+
+**Note**:
+
+- The Qianfan platform service currently only supports PaddleOCR-VL.
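
For a quick standalone test outside an MCP host, a rough command-line equivalent of the configuration above can be sketched with the CLI arguments documented in the options table further down; the API key stays in its environment variable, since this diff shows no dedicated CLI flag for it:

```bash
# Hypothetical direct launch mirroring the Qianfan configuration example above.
export PADDLEOCR_MCP_QIANFAN_API_KEY="<your-api-key>"

paddleocr_mcp \
  --pipeline PaddleOCR-VL \
  --ppocr_source qianfan \
  --server_url "https://qianfan.baidubce.com/v2/ocr"
```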
+
+#### Mode 4: Self-hosted Service
 
 1. In the environment where you need to run the PaddleOCR inference server, run the inference server as per the [PaddleOCR serving documentation](./serving.en.md).
 2. Install `paddleocr-mcp` where the MCP server will run.
@@ -294,7 +326,7 @@ Configuration example:
 
 ### 2.4 Using `uvx`
 
-Currently, for the PaddleOCR official website and self-hosted modes, and (for CPU inference) the local mode, starting the MCP server via `uvx` is also supported. With this approach, manual installation of `paddleocr-mcp` is not required. The main steps are as follows:
+PaddleOCR also supports starting the MCP server via `uvx`. With this approach, manual installation of `paddleocr-mcp` is not required. The main steps are as follows:
 
 1. Install [uv](https://docs.astral.sh/uv/#installation).
 2. Modify `claude_desktop_config.json`. Examples:
@@ -321,7 +353,7 @@ Currently, for the PaddleOCR official website and self-hosted modes, and (for CP
 }
 ```
 
-Local mode (CPU, using the `local-cpu` extra):
+Local mode (inference on CPUs, using the `local-cpu` extra):
 
 ```json
 {
@@ -342,7 +374,9 @@ Currently, for the PaddleOCR official website and self-hosted modes, and (for CP
 }
 ```
 
-Because a different startup method is used (`uvx` pulls and runs the package on-demand), only `command` and `args` differ from earlier examples; available environment variables and CLI arguments remain identical.
+For information on local mode dependencies, performance tuning, and production configuration, please refer to the [3.1 Quick Start](#21-quick-start) section.
+
+Due to the use of a different startup method, the `command` and `args` settings in the configuration file differ from the previously described approach. However, the command-line arguments and environment variables supported by the MCP service (such as `PADDLEOCR_MCP_SERVER_URL`) can still be set in the same way.
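
The full `uvx` configuration examples are truncated in this diff view; a hypothetical sketch of such an entry is shown below. The `args` are an assumption (`uvx --from` is used because the console script `paddleocr_mcp` differs from the package name `paddleocr-mcp`), and the `env` block simply reuses the Qianfan values from earlier:

```json
{
  "mcpServers": {
    "paddleocr-ocr": {
      "command": "uvx",
      "args": ["--from", "paddleocr-mcp", "paddleocr_mcp"],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "PaddleOCR-VL",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "qianfan",
        "PADDLEOCR_MCP_SERVER_URL": "https://qianfan.baidubce.com/v2/ocr",
        "PADDLEOCR_MCP_QIANFAN_API_KEY": "<your-api-key>"
      }
    }
  }
}
```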
 
 ## 3. Running the Server
 
@@ -376,9 +410,9 @@ You can control the MCP server via environment variables or CLI arguments.
 | Environment Variable | CLI Argument | Type | Description | Options | Default |
 | ------------------------------------- | ------------------------- | ------ | --------------------------------------------------------------------- | ---------------------------------------- | ------------- |
 | `PADDLEOCR_MCP_PIPELINE` | `--pipeline` | `str` | Pipeline to run. | `"OCR"`, `"PP-StructureV3"`, `"PaddleOCR-VL"` | `"OCR"` |
-| `PADDLEOCR_MCP_PPOCR_SOURCE` | `--ppocr_source` | `str` | Source of PaddleOCR capabilities. | `"local"` (local Python library), `"aistudio"` (PaddleOCR official website service), `"self_hosted"` (self-hosted service) | `"local"` |
-| `PADDLEOCR_MCP_SERVER_URL` | `--server_url` | `str` | Base URL for the underlying service (`aistudio` or `self_hosted` mode only). | - | `None` |
-| `PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN` | `--aistudio_access_token` | `str` | AI Studio access token (`aistudio` mode only). | - | `None` |
+| `PADDLEOCR_MCP_PPOCR_SOURCE` | `--ppocr_source` | `str` | Source of PaddleOCR capabilities. | `"local"` (local Python library), `"aistudio"` (PaddleOCR official website service), `"qianfan"` (Qianfan platform service), `"self_hosted"` (self-hosted service) | `"local"` |
+| `PADDLEOCR_MCP_SERVER_URL` | `--server_url` | `str` | Base URL for the underlying service (required for `aistudio`, `qianfan`, or `self_hosted` modes). | - | `None` |
+| `PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN` | `--aistudio_access_token` | `str` | AI Studio access token (required for `aistudio` mode). | - | `None` |
 | `PADDLEOCR_MCP_TIMEOUT` | `--timeout` | `int` | Read timeout for the underlying requests (seconds). | - | `60` |
 | `PADDLEOCR_MCP_DEVICE` | `--device` | `str` | Device for inference (`local` mode only). | - | `None` |
 | `PADDLEOCR_MCP_PIPELINE_CONFIG` | `--pipeline_config` | `str` | Path to pipeline config file (`local` mode only). | - | `None` |
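
As a small illustration of the table (a sketch with arbitrary values, not part of the original page), the same option can be supplied either as an environment variable or as a CLI argument; the request timeout is used as the example:

```bash
# Environment-variable form:
export PADDLEOCR_MCP_TIMEOUT=120
paddleocr_mcp --pipeline OCR --ppocr_source local

# Equivalent CLI-argument form:
paddleocr_mcp --pipeline OCR --ppocr_source local --timeout 120
```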
