Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Convert PDFs, DOCX, PPTX, and images to Markdown using zerox with GPT-4o vision, including OCR for scanned documents.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
Convert various document formats to Markdown using the zerox library and GPT-4o vision.
- PDF (scanned and text-based)
- Microsoft Word (DOCX)
- Microsoft PowerPoint (PPTX)
- Images (PNG, JPG, etc.)
- And more via OCR
For small files (conversions that finish in under 30 seconds):
node {baseDir}/scripts/convert.mjs <filePath> [outputPath]
# Convert PDF - saves to {baseDir}/output/document.md by default
node {baseDir}/scripts/convert.mjs "/path/to/document.pdf"

# Convert PDF with custom output path
node {baseDir}/scripts/convert.mjs "/path/to/document.pdf" "/path/to/output.md"

# Convert Word document - saves to {baseDir}/output/document.md
node {baseDir}/scripts/convert.mjs "/path/to/document.docx"
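The examples above suggest that, when no output path is given, the result lands in {baseDir}/output/ under the input's base name with a .md extension. A minimal sketch of that rule follows. The helper names `defaultOutputPath` and `resolveOutputPath` are hypothetical and are not taken from convert.mjs, which may resolve paths differently. The block is written as CommonJS; in an .mjs file, swap `require` for `import`.

```javascript
const path = require("node:path");

// Hypothetical helper: mirrors the documented default of
// {baseDir}/output/<input base name>.md. The real convert.mjs may differ.
function defaultOutputPath(baseDir, filePath) {
  // Strip the directory and the extension: "/path/to/document.pdf" -> "document"
  const name = path.basename(filePath, path.extname(filePath));
  return path.join(baseDir, "output", `${name}.md`);
}

// An explicit outputPath argument takes precedence over the default.
function resolveOutputPath(baseDir, filePath, outputPath) {
  return outputPath ?? defaultOutputPath(baseDir, filePath);
}
```

Note that under this rule document.pdf and document.docx map to the same output file, so passing an explicit output path avoids overwriting an earlier conversion.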
For large files or scanned PDFs that take minutes:
node {baseDir}/scripts/convert-bg.mjs <filePath> [outputPath]
- Runs conversion in background (no timeout issues)
- Logs progress to {baseDir}/output/convert-bg.log
- Sends macOS notification when complete
- Detached from terminal (safe to close)
# Convert large scanned PDF in background
node {baseDir}/scripts/convert-bg.mjs "/path/to/scanned-document.pdf"

# Monitor progress
tail -f {baseDir}/output/convert-bg.log
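For readers curious how a script can keep running after the terminal closes, here is a minimal sketch of the usual Node.js pattern: spawn a detached child, point its output at a log file, and call `unref()` so the parent can exit. This is an illustration, not the source of convert-bg.mjs. The names `backgroundPlan` and `launchInBackground` are hypothetical, and the idea that the background job simply re-runs convert.mjs is an assumption.

```javascript
const { spawn } = require("node:child_process");
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical: describe the background job without starting it.
// Assumes the job re-runs convert.mjs; the real script may work differently.
function backgroundPlan(baseDir, filePath, outputPath) {
  const args = [path.join(baseDir, "scripts", "convert.mjs"), filePath];
  if (outputPath) args.push(outputPath);
  return {
    command: process.execPath, // the Node binary currently running
    args,
    logPath: path.join(baseDir, "output", "convert-bg.log"), // documented log location
  };
}

// Start the planned job detached from the current terminal.
function launchInBackground(plan) {
  fs.mkdirSync(path.dirname(plan.logPath), { recursive: true });
  const log = fs.openSync(plan.logPath, "a"); // append, so earlier runs are kept
  const child = spawn(plan.command, plan.args, {
    detached: true,              // child gets its own process group/session
    stdio: ["ignore", log, log], // no stdin; stdout and stderr go to the log
  });
  child.unref(); // let the parent exit without waiting for the child
  return child.pid;
}
```

Separating the plan from the launch keeps the path logic easy to inspect before anything is actually started.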
APIYI_API_KEY: Your OpenAI-compatible API key (environment variable)
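Because the conversion depends on this variable, it can help to fail early with a clear message when it is missing. The sketch below shows one way to do that. `requireApiKey` is a hypothetical helper, not part of the packaged scripts, which may validate the key differently or not at all.

```javascript
// Hypothetical preflight check: return the key, or throw a readable error
// before any document is sent for conversion.
function requireApiKey(env = process.env) {
  const key = (env.APIYI_API_KEY ?? "").trim();
  if (!key) {
    throw new Error(
      "APIYI_API_KEY is not set; export it before running the converter."
    );
  }
  return key;
}
```

Calling `requireApiKey()` at the top of a wrapper script surfaces a configuration problem immediately rather than partway through a long conversion.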
- The conversion uses GPT-4o vision to extract text, so it works even with scanned documents.
- Large documents may take some time to process.
- Output is plain Markdown text.