Requirements

- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Parse documents using PaddleOCR's API.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Use Document Parsing for:

- Documents with tables (invoices, financial reports, spreadsheets)
- Documents with mathematical formulas (academic papers, scientific documents)
- Documents with charts and diagrams
- Multi-column layouts (newspapers, magazines, brochures)
- Complex document structures requiring layout analysis
- Any document requiring structured understanding

Use Text Recognition instead for:

- Simple text-only extraction
- Quick OCR tasks where speed is critical
- Screenshots or simple images with clear text
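For an agent routing requests automatically, the criteria above can be sketched as a simple predicate. This is illustrative only; the trait names below are invented for the example and are not part of the skill's scripts:

```python
# Document traits that warrant full Document Parsing (invented labels).
COMPLEX_TRAITS = {"tables", "formulas", "charts", "multi_column", "layout_analysis"}

def pick_tool(traits):
    """Return which skill the criteria above suggest for a document."""
    if COMPLEX_TRAITS & set(traits):
        return "doc-parsing"
    # Simple text-only extraction or quick OCR: use Text Recognition.
    return "text-recognition"

print(pick_tool(["tables", "multi_column"]))  # doc-parsing
print(pick_tool(["plain_text"]))              # text-recognition
```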
Install Python dependencies before using this skill. From the skill directory (skills/paddleocr-doc-parsing):

```
pip install -r scripts/requirements.txt
```

Optional, for document optimization and split_pdf.py (page extraction):

```
pip install -r scripts/requirements-optimize.txt
```
⚠️ MANDATORY RESTRICTIONS - DO NOT VIOLATE ⚠️

- ONLY use the PaddleOCR Document Parsing API: execute the script `python scripts/vl_caller.py`
- NEVER parse documents directly: do NOT parse documents yourself
- NEVER offer alternatives: do NOT suggest "I can try to analyze it" or similar
- IF the API fails: display the error message and STOP immediately
- NO fallback methods: do NOT attempt document parsing any other way

If the script execution fails (API not configured, network error, etc.):

- Show the error message to the user
- Do NOT offer to help using your vision capabilities
- Do NOT ask "Would you like me to try parsing it?"
- Simply stop and wait for the user to fix the configuration
Execute document parsing:

```
python scripts/vl_caller.py --file-url "URL provided by user" --pretty
```

Or for local files:

```
python scripts/vl_caller.py --file-path "file path" --pretty
```

Optional: explicitly set the file type:

```
python scripts/vl_caller.py --file-url "URL provided by user" --file-type 0 --pretty
```

- `--file-type 0`: PDF
- `--file-type 1`: image
- If omitted, the service can infer the file type from the input.

Default behavior: save raw JSON to a temp file.

- If `--output` is omitted, the script saves automatically under the system temp directory.
- Default path pattern: `<system-temp>/paddleocr/doc-parsing/results/result_<timestamp>_<id>.json`
- If `--output` is provided, it overrides the default temp-file destination.
- If `--stdout` is provided, JSON is printed to stdout and no file is saved.
- In save mode, the script prints the absolute saved path on stderr: `Result saved to: /absolute/path/...`
- In default/custom save mode, read and parse the saved JSON file before responding.
- In save mode, always tell the user the saved file path and that the full raw JSON is available there.
- Use `--stdout` only when you explicitly want to skip file persistence.

The output JSON contains COMPLETE content with all document data:

- Headers, footers, page numbers
- Main text content
- Tables with structure
- Formulas (with LaTeX)
- Figures and charts
- Footnotes and references
- Seals and stamps
- Layout and reading order

Input type note: supported file types depend on the model and endpoint configuration. Always follow the file type constraints documented by your endpoint API.

Extract what the user needs from the output JSON using these fields:

- Top-level `text`
- `result[n].markdown`
- `result[n].prunedResult`
The output JSON uses an envelope wrapping the raw API result:

```
{
  "ok": true,
  "text": "Full markdown/HTML text extracted from all pages",
  "result": { ... },  // raw provider response
  "error": null
}
```

Key fields:

- `text`: extracted markdown text from all pages (use this for quick text display)
- `result`: raw provider response object
- `result[n].prunedResult`: structured parsing output for each page (layout/content/confidence and related metadata)
- `result[n].markdown`: full rendered page output in markdown/HTML

Raw result location (default): the temp-file path printed by the script on stderr.
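The envelope can be consumed in a few lines of Python. This is a minimal sketch assuming `result` is a list of per-page objects, as the `result[n]` notation above suggests; the sample payload below is invented for illustration:

```python
import json  # used if you uncomment the file-loading lines below

def extract_from_envelope(envelope):
    """Pull quick-display text and per-page entries out of the envelope."""
    if not envelope.get("ok"):
        # Per the restrictions above: surface the error and stop, no fallbacks.
        raise RuntimeError(f"Document parsing failed: {envelope.get('error')}")
    full_text = envelope["text"]          # quick full-text display
    pages = envelope.get("result") or []  # per-page provider entries
    return full_text, pages

# In save mode, first load the file whose path the script printed on stderr:
# with open(saved_path, encoding="utf-8") as f:
#     envelope = json.load(f)

# Invented sample payload mirroring the schema above:
sample = {
    "ok": True,
    "text": "# Invoice\n\n| Item | Total |\n| --- | --- |\n| Widget | $42 |",
    "result": [
        {"markdown": "# Invoice ...", "prunedResult": {"layout": ["table"]}}
    ],
    "error": None,
}

text, pages = extract_from_envelope(sample)
print(pages[0]["prunedResult"]["layout"])  # structured data for page 1
```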
Example 1: Extract Full Document Text

```
python scripts/vl_caller.py \
  --file-url "https://example.com/paper.pdf" \
  --pretty
```

Then use:

- Top-level `text` for quick full-text output
- `result[n].markdown` when page-level output is needed

Example 2: Extract Structured Page Data

```
python scripts/vl_caller.py \
  --file-path "./financial_report.pdf" \
  --pretty
```

Then use:

- `result[n].prunedResult` for structured parsing data (layout/content/confidence)
- `result[n].markdown` for rendered page content

Example 3: Print JSON Without Saving

```
python scripts/vl_caller.py \
  --file-url "URL" \
  --stdout \
  --pretty
```

Then return:

- Full `text` when the user asks for full document content
- `result[n].prunedResult` and `result[n].markdown` when the user needs complete structured page data
There is no file size limit for the API. For PDFs, the maximum is 100 pages per request.

Tips for large files:

Use URL for Large Local Files (Recommended)

For very large local files, prefer `--file-url` over `--file-path` to avoid base64 encoding overhead:

```
python scripts/vl_caller.py --file-url "https://your-server.com/large_file.pdf"
```

Process Specific Pages (PDF Only)

If you only need certain pages from a large PDF, extract them first:

```
# Extract pages 1-5
python scripts/split_pdf.py large.pdf pages_1_5.pdf --pages "1-5"

# Mixed ranges are supported
python scripts/split_pdf.py large.pdf selected_pages.pdf --pages "1-5,8,10-12"

# Then process the smaller file
python scripts/vl_caller.py --file-path "pages_1_5.pdf"
```
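When an agent drives this split-then-parse flow programmatically, the two commands can be assembled as argv lists and handed to `subprocess.run`. A sketch; the helper name and the default output filename are invented, while the flags are the ones documented above:

```python
import subprocess  # used if you uncomment the run() calls below

def build_subset_commands(src_pdf, pages, out_pdf="subset.pdf"):
    """Return (split, parse) command lines for processing selected pages."""
    split_cmd = ["python", "scripts/split_pdf.py", src_pdf, out_pdf,
                 "--pages", pages]
    parse_cmd = ["python", "scripts/vl_caller.py",
                 "--file-path", out_pdf, "--pretty"]
    return split_cmd, parse_cmd

split_cmd, parse_cmd = build_subset_commands("large.pdf", "1-5,8,10-12")
# subprocess.run(split_cmd, check=True)
# subprocess.run(parse_cmd, check=True)
print(" ".join(split_cmd))
```

Argv lists avoid shell-quoting issues with filenames that contain spaces.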
- Authentication failed (403): `error: Authentication failed` → the token is invalid; reconfigure with correct credentials.
- API quota exceeded (429): `error: API quota exceeded` → the daily API quota is exhausted; inform the user to wait or upgrade.
- Unsupported format: `error: Unsupported file format` → the file format is not supported; convert to PDF/PNG/JPG.
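The three documented cases can be mapped to user-facing guidance with a small lookup. This helper is illustrative, not part of the skill's scripts; any message it does not recognize is shown raw, in keeping with the stop-on-error restriction:

```python
# Guidance keyed on the documented error strings.
GUIDANCE = {
    "Authentication failed": "Token is invalid; reconfigure with correct credentials.",
    "API quota exceeded": "Daily API quota exhausted; wait or upgrade.",
    "Unsupported file format": "Convert the file to PDF/PNG/JPG.",
}

def explain_error(message):
    """Match a raw error message against the documented cases."""
    for needle, advice in GUIDANCE.items():
        if needle in message:
            return advice
    # Unknown error: show the raw message and stop; no fallback parsing.
    return f"Unrecognized error: {message}"

print(explain_error("error: API quota exceeded"))
```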
- The script NEVER filters content: it always returns complete data.
- The AI agent decides what to present, based on the user's specific request.
- All data is always available and can be re-interpreted for different needs.
- No information is lost: the complete document structure is preserved.
- references/output_schema.md - Output format specification

Note: model version and capabilities are determined by your API endpoint (`PADDLEOCR_DOC_PARSING_API_URL`).

Load these reference documents into context when:

- Debugging complex parsing issues
- You need to understand the output format
- Working with provider API details
To verify the skill is working properly:

```
python scripts/smoke_test.py
```

This tests configuration and optionally API connectivity.