docs/version3.x/deployment/mcp_server.en.md (28 additions, 26 deletions)
@@ -14,9 +14,10 @@ This project provides a lightweight [Model Context Protocol (MCP)](https://model
**Currently Supported Tools**

- **OCR**: Performs text detection and recognition on images and PDF files.
- **PP-StructureV3**: Identifies and extracts text blocks, titles, paragraphs, images, tables, and other layout elements from images or PDF files, converting the input into Markdown documents.
- **PaddleOCR-VL**: Identifies and extracts text blocks, titles, paragraphs, images, tables, and other layout elements from images or PDF files, converting the input into Markdown documents. It uses a VLM-based approach.

**Supported Working Modes**

- **Local Python Library**: Runs PaddleOCR pipelines directly on the local machine. This mode requires a suitable local environment and hardware, and is ideal for offline use or privacy-sensitive scenarios.
- **PaddleOCR Official Website Service**: Invokes services provided by the [PaddleOCR Official Website](https://aistudio.baidu.com/paddleocr?lang=en). This is suitable for quick testing, prototyping, or no-code scenarios.
- **Self-hosted Service**: Invokes the user's self-hosted PaddleOCR services. This mode offers the advantages of service-oriented deployment and high flexibility. It is suitable for scenarios requiring customized service configurations, as well as those with strict data privacy requirements. **Currently, only the basic serving solution is supported.**
## Examples:
@@ -83,7 +84,7 @@ Convert images containing formulas and tables to editable csv/Excel format:
This section explains how to install the `paddleocr-mcp` library via pip.

- For the local Python library mode, you need to install both `paddleocr-mcp` and the PaddlePaddle framework along with PaddleOCR, as per the [PaddleOCR installation documentation](../installation.en.md).
- For the PaddleOCR official website service or the self-hosted service modes, if used within MCP hosts like Claude for Desktop, the server can also be run without installation via tools like `uvx`. See [2. Using with Claude for Desktop](#2-using-with-claude-for-desktop) for details.
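For the pip route, the base installation is a single command (shown here without any optional extras; installing inside a virtual environment is assumed to be the reader's choice):

```shell
# Install the MCP server package from PyPI
pip install paddleocr-mcp
```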
For the local Python library mode, you may optionally choose convenience extras (helpful to reduce manual dependency steps):
@@ -95,19 +96,19 @@ It is still recommended to use an isolated virtual environment to avoid conflict
@@ -159,6 +160,7 @@ This section explains how to use the PaddleOCR MCP server within Claude for Desk
**Notes**:

- `PADDLEOCR_MCP_PIPELINE` should be set to the pipeline name. See Section 4 for more details.
- `PADDLEOCR_MCP_PIPELINE_CONFIG` is optional; if not set, the default pipeline configuration will be used. If you need to adjust the configuration, such as changing the model, please refer to the [PaddleOCR documentation](../paddleocr_and_paddlex.md) to export the pipeline configuration file, and set `PADDLEOCR_MCP_PIPELINE_CONFIG` to the absolute path of this configuration file.
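    As an illustration, the `env` fragment of the host configuration might look like this once a configuration file has been exported (the YAML path is a placeholder, not a real file shipped with the project):

    ```json
    {
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PIPELINE_CONFIG": "/abs/path/to/OCR.yaml"
      }
    }
    ```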
- **Inference Performance Tips**:
@@ -195,6 +197,8 @@ This section explains how to use the PaddleOCR MCP server within Claude for Desk
**For PaddleOCR-VL, it is not recommended to use CPUs for inference.**

**Important**:

- If `paddleocr_mcp` is not in your system's `PATH`, set `command` to the absolute path of the executable.
@@ -219,16 +223,15 @@ You can configure the MCP server according to your requirements to run in differ
See [2.1 Quick Start](#21-quick-start).

#### Mode 2: PaddleOCR Official Website Service

1. Install `paddleocr-mcp`.
2. Obtain the service base URL and your AI Studio Community access token.

    On the [PaddleOCR official website](https://aistudio.baidu.com/paddleocr?lang=en), click "API" in the upper-left corner. Copy the `API_URL` corresponding to "Text Recognition (PP-OCRv5)" and remove the trailing endpoint (`/ocr`) to get the base URL of the service (e.g., `https://xxxxxx.aistudio-app.com`). Also copy the `TOKEN`, which is your access token. You may need to register and log in to your PaddlePaddle AI Studio Community account.

3. Refer to the configuration example below to modify the contents of the `claude_desktop_config.json` file.
4. Restart the MCP host.

Configuration example:
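A minimal sketch of the relevant `claude_desktop_config.json` entries for this mode (the server name `paddleocr` and the empty `args` list are illustrative; the environment variable names match the parameter table in Section 4):

```json
{
  "mcpServers": {
    "paddleocr": {
      "command": "paddleocr_mcp",
      "args": [],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "aistudio",
        "PADDLEOCR_MCP_SERVER_URL": "<your-server-url>",
        "PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN": "<your-access-token>"
      }
    }
  }
}
```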
@@ -251,22 +254,20 @@ Configuration example:
**Notes**:

- `PADDLEOCR_MCP_PIPELINE` should be set to the pipeline name. See Section 4 for more details.
- Replace `<your-server-url>` with your service base URL.
- Replace `<your-access-token>` with your access token.

**Important**:

- Do not expose your access token.

#### Mode 3: Self-hosted Service

1. In the environment where you need to run the PaddleOCR inference server, run the inference server as per the [PaddleOCR serving documentation](./serving.en.md).
2. Install `paddleocr-mcp` where the MCP server will run.
3. Refer to the configuration example below to modify the contents of the `claude_desktop_config.json` file. Set `PADDLEOCR_MCP_SERVER_URL` (e.g., `"http://127.0.0.1:8000"`).
4. Restart the MCP host.

Configuration example:
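A minimal sketch of the relevant `claude_desktop_config.json` entries for the self-hosted mode (the server name `paddleocr` and the empty `args` list are illustrative; no access token is needed here):

```json
{
  "mcpServers": {
    "paddleocr": {
      "command": "paddleocr_mcp",
      "args": [],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "self_hosted",
        "PADDLEOCR_MCP_SERVER_URL": "http://127.0.0.1:8000"
      }
    }
  }
}
```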
@@ -288,11 +289,12 @@ Configuration example:
**Note**:

- `PADDLEOCR_MCP_PIPELINE` should be set to the pipeline name. See Section 4 for more details.
- Replace `<your-server-url>` with your service’s base URL (e.g., `http://127.0.0.1:8000`).
### 2.4 Using `uvx`

Currently, for the PaddleOCR official website and self-hosted modes, and (for CPU inference) the local mode, starting the MCP server via `uvx` is also supported. With this approach, manual installation of `paddleocr-mcp` is not required. The main steps are as follows:
|`PADDLEOCR_MCP_PIPELINE`|`--pipeline`|`str`| Pipeline to run. |`"OCR"`, `"PP-StructureV3"`, `"PaddleOCR-VL"`|`"OCR"`|
|`PADDLEOCR_MCP_PPOCR_SOURCE`|`--ppocr_source`|`str`| Source of PaddleOCR capabilities. |`"local"` (local Python library), `"aistudio"` (PaddleOCR official website service), `"self_hosted"` (self-hosted service) |`"local"`|
|`PADDLEOCR_MCP_SERVER_URL`|`--server_url`|`str`| Base URL for the underlying service (`aistudio` or `self_hosted` mode only). | - |`None`|
|`PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN`|`--aistudio_access_token`|`str`| AI Studio access token (`aistudio` mode only). | - |`None`|
|`PADDLEOCR_MCP_TIMEOUT`|`--timeout`|`int`| Read timeout for the underlying requests (seconds). | - |`60`|
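As a concrete illustration of the CLI-flag form of the parameters above, launching the server against a self-hosted service might look like this (the URL and timeout values are placeholders):

```shell
# Start the MCP server in self-hosted mode with explicit flags
paddleocr_mcp \
  --pipeline OCR \
  --ppocr_source self_hosted \
  --server_url http://127.0.0.1:8000 \
  --timeout 60
```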
@@ -389,4 +391,4 @@ You can control the MCP server via environment variables or CLI arguments.
- In the local Python library mode, the current tools cannot process PDF document inputs that are Base64 encoded.
- In the local Python library mode, the current tools do not infer the file type based on the model's `file_type` prompt, and may fail to process some complex URLs.
- For the PP-StructureV3 and PaddleOCR-VL pipelines, if the input file contains images, the returned results may significantly increase token usage. If image content is not needed, you can explicitly exclude it through prompts to reduce resource consumption.