Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Extract text, tables, and images from PDFs or images using Mistral OCR API and output in Markdown, JSON, or HTML formats.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
IMPORTANT - READ BEFORE INSTALLING: This skill uploads your files to Mistral's cloud servers for OCR processing. Do NOT use with sensitive or confidential documents unless:
- You trust Mistral's data handling policies
- You have reviewed Mistral's privacy policy
- You accept that file contents will be transmitted and processed remotely
For sensitive documents, use offline/local OCR tools instead.
An OCR tool that converts PDF files and images into Markdown, JSON, or HTML using Mistral's OCR API.
    # Clone or download this repository
    git clone https://github.com/YZDame/Mistral-OCR-SKILL.git
    cd Mistral-OCR-SKILL

    # Install dependencies
    pip install -r requirements.txt
Get your API key: https://console.mistral.ai/home

Set the environment variable:

    export MISTRAL_API_KEY=your_api_key
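A wrapper script can check for the key before any file is sent. The following is a minimal sketch, not code from the repository: the helper name `require_api_key` is hypothetical, and it assumes only that the key is read from the `MISTRAL_API_KEY` environment variable as described above.

```python
import os


def require_api_key(env=None):
    """Return the Mistral API key from the environment, or raise a clear error.

    `require_api_key` is a hypothetical helper for illustration; the actual
    script may read the variable differently.
    """
    env = os.environ if env is None else env
    key = env.get("MISTRAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "MISTRAL_API_KEY is not set; run: export MISTRAL_API_KEY=your_api_key"
        )
    return key
```

Calling `require_api_key()` at startup fails fast with a readable message instead of an authentication error partway through a run.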
    cd scripts

    # Process PDF to Markdown
    python3 mistral_ocr.py -i input.pdf

    # Process PDF to JSON
    python3 mistral_ocr.py -i input.pdf -f json

    # Specify output directory
    python3 mistral_ocr.py -i input.pdf -o ~/my_ocr_results
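To process several PDFs in one go, the documented invocation can be generated per file. This is a sketch under assumptions: `build_commands` is a hypothetical helper, it uses only the `-i`, `-f`, and `-o` flags shown in the commands above, and it assumes the script is run as `python3 mistral_ocr.py` from the `scripts` directory.

```python
from pathlib import Path


def build_commands(input_dir, fmt="markdown", output_dir=None, script="mistral_ocr.py"):
    """Build one mistral_ocr.py command line per PDF found in input_dir.

    Hypothetical helper: it only assembles argument lists and does not
    run anything or contact the API itself.
    """
    commands = []
    for pdf in sorted(Path(input_dir).glob("*.pdf")):
        cmd = ["python3", script, "-i", str(pdf), "-f", fmt]
        if output_dir is not None:
            cmd += ["-o", str(output_dir)]
        commands.append(cmd)
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the `scripts` directory. Every file processed this way is uploaded to Mistral, so the privacy caveats apply to the whole batch.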
| Flag | Description |
| --- | --- |
| -i, --input | Input file path (required) |
| -f, --format | Output format: markdown/json/html (default: markdown) |
| -o, --output | Output directory |
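The flags above map naturally onto an `argparse` definition. The sketch below is a reconstruction for illustration, not the script's actual source: `build_parser` is a hypothetical name, and the default of `None` for `--output` is an assumption, since the default output location is not stated.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the documented flags of mistral_ocr.py."""
    parser = argparse.ArgumentParser(
        prog="mistral_ocr.py",
        description="Convert a PDF or image to Markdown, JSON, or HTML.",
    )
    parser.add_argument("-i", "--input", required=True, help="Input file path")
    parser.add_argument(
        "-f",
        "--format",
        choices=["markdown", "json", "html"],
        default="markdown",
        help="Output format (default: markdown)",
    )
    # Assumption: no default directory is documented, so None stands in here.
    parser.add_argument("-o", "--output", default=None, help="Output directory")
    return parser
```

With this definition, `build_parser().parse_args(["-i", "input.pdf"])` yields `format == "markdown"`, and an unsupported value for `-f` is rejected by `argparse` before any file is read.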
What happens to your files:
- Files are uploaded to Mistral's OCR API
- Files are processed on Mistral servers
- Processing results are returned to you
- Files are not stored on Mistral servers (per Mistral policy)
For more details, see: https://mistral.ai/privacy-policy
License: MIT