Tencent SkillHub · Developer Tools

MinerU PDF Parser

Parse PDF, Word, PPT, and image files into Markdown with the MinerU API, with support for formulas, tables, and OCR. Suited to paper parsing and document extraction.

skill · openclaw · clawhub · Free
0 Downloads
0 Stars
0 Installs
0 Score
High Signal

⬇ 0 downloads ★ 0 stars Unverified but indexed

Install for OpenClaw

Quick setup
  1. Download the package from Yavira.
  2. Extract the archive and review SKILL.md first.
  3. Import or place the package into your OpenClaw setup.

Requirements

Target platform: OpenClaw
Install method: Manual import
Extraction: Extract archive
Prerequisites: OpenClaw
Primary doc: SKILL.md

Package facts

Download mode: Yavira redirect
Package format: ZIP package
Source platform: Tencent SkillHub
What's included: SKILL.md

Validation

  • Use the Yavira download entry.
  • Review SKILL.md after the package is downloaded.
  • Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

  1. Download the package from Yavira.
  2. Extract it into a folder your agent can access.
  3. Paste one of the prompts below and point your agent at the extracted folder.
New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

Source: Tencent SkillHub
Verification: Indexed source record
Version: 1.0.1

Documentation

Primary doc: SKILL.md (15 sections)

📄 MinerU: Document Parsing Tool

By OpenDataLab. Converts PDF, Word, PPT, and image files into structured Markdown, preserving formulas and tables.

🔗 Resource links

| Resource | Link |
| --- | --- |
| Website | https://mineru.net/ |
| API docs | https://mineru.net/apiManage/docs |
| GitHub | https://github.com/opendatalab/MinerU |

Supported file types

| Type | Format |
| --- | --- |
| 📕 PDF | Papers, books, scanned documents |
| 📝 Word | .docx |
| 📊 PPT | .pptx |
| 🖼️ Images | .jpg, .png (OCR) |

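Key strengths

  • Formula preservation: formulas are output as LaTeX.
  • Table structure recognition: handles complex tables.
  • Multilingual OCR: handles mixed Chinese and English text.
  • Layout analysis: multi-column and mixed text-and-image layouts are processed automatically.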

Authentication

    # Header authentication
    Authorization: Bearer {YOUR_API_KEY}

Single-file parsing

    # 1. Submit the task
    curl -X POST "https://mineru.net/api/v4/extract/task" \
      -H "Authorization: Bearer $MINERU_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "url": "https://arxiv.org/pdf/2410.17247",
        "enable_formula": true,
        "enable_table": true,
        "layout_model": "doclayout_yolo",
        "language": "en"
      }'
    # Returns: {"task_id": "xxx", "status": "pending"}

    # 2. Poll for the result
    curl "https://mineru.net/api/v4/extract/task/{task_id}" \
      -H "Authorization: Bearer $MINERU_TOKEN"
    # Returns: {"status": "done", "result": {...}}
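
The polling step can be wrapped in a small loop. The following is a minimal sketch, not the skill's own code: `get_status` and `wait_for_task` are hypothetical helper names, the top-level `status` field is taken from the example responses above, and the `failed` value is an assumption that should be checked against the API docs.

```shell
# Hypothetical helper: print the current status string of a task.
# Assumes a top-level "status" field as in the example responses;
# adjust the jq path if the real payload nests it differently.
get_status() {
  curl -s "https://mineru.net/api/v4/extract/task/$1" \
    -H "Authorization: Bearer $MINERU_TOKEN" | jq -r '.status'
}

# Poll until the task reports "done" or the attempts run out.
# Usage: wait_for_task TASK_ID [MAX_TRIES] [SLEEP_SECONDS]
# Returns 0 on done, 1 on failed (assumed status value), 2 on timeout.
wait_for_task() {
  task_id=$1
  max_tries=${2:-60}
  delay=${3:-5}
  try=1
  while [ "$try" -le "$max_tries" ]; do
    state=$(get_status "$task_id")
    case "$state" in
      done)   return 0 ;;
      failed) echo "task $task_id failed" >&2; return 1 ;;
    esac
    sleep "$delay"
    try=$((try + 1))
  done
  echo "task $task_id not done after $max_tries tries" >&2
  return 2
}
```

Keeping the HTTP call in its own function means the status source can be swapped out without touching the loop, for example when the response shape turns out to differ.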

Batch parsing

    # 1. Request upload URLs
    curl -X POST "https://mineru.net/api/v4/file-urls/batch" \
      -H "Authorization: Bearer $MINERU_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"file_names": ["paper1.pdf", "paper2.pdf"]}'

    # 2. Upload the files to the returned presigned URLs

    # 3. Submit the batch task
    curl -X POST "https://mineru.net/api/v4/extract/task/batch" \
      -H "Authorization: Bearer $MINERU_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"files": [{"url": "...", "name": "paper1.pdf"}, ...]}'
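
Steps 1 and 2 of the batch flow might be scripted along these lines. This is a sketch under assumptions: `file_names_json` and `upload_file` are hypothetical helpers, the request body follows the `file_names` shape shown above, and the presigned URLs are assumed to accept a plain HTTP PUT of the file bytes.

```shell
# Build the {"file_names": [...]} request body from local paths.
# No JSON escaping is done, so this assumes file names contain
# no double quotes or backslashes.
file_names_json() {
  out=''
  for f in "$@"; do
    out="$out${out:+, }\"$(basename "$f")\""
  done
  printf '{"file_names": [%s]}\n' "$out"
}

# Upload one local file to one presigned URL.
# curl -T sends the file with PUT; -f makes HTTP errors return non-zero.
# Usage: upload_file LOCAL_PATH PRESIGNED_URL
upload_file() {
  curl -s -f -T "$1" "$2"
}

# Example (not executed here):
#   curl -X POST "https://mineru.net/api/v4/file-urls/batch" \
#     -H "Authorization: Bearer $MINERU_TOKEN" \
#     -H "Content-Type: application/json" \
#     -d "$(file_names_json paper1.pdf paper2.pdf)"
```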

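⚙️ Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| url | string | File URL (http/https supported) |
| enable_formula | bool | Enable formula recognition (default: true) |
| enable_table | bool | Enable table recognition (default: true) |
| layout_model | string | doclayout_yolo (faster) / layoutlmv3 (more accurate) |
| language | string | en / ch / auto |
| model_version | string | pipeline / vlm / MinerU-HTML |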
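
To avoid typos in the enumerated parameters, a request body can be assembled by a small function that rejects values not listed in the table. This is only a sketch: `task_body` is a hypothetical helper, and its defaults (`auto` and `pipeline`) are choices made for the example, not documented API defaults.

```shell
# Print a JSON body for the extract/task endpoint.
# Usage: task_body URL [LANGUAGE] [MODEL_VERSION]
# The fallback values below are this sketch's own choices.
task_body() {
  url=$1
  language=${2:-auto}
  model=${3:-pipeline}
  # Accept only the language values listed in the parameter table.
  case "$language" in
    en|ch|auto) ;;
    *) echo "unsupported language: $language" >&2; return 1 ;;
  esac
  # Accept only the model_version values listed in the parameter table.
  case "$model" in
    pipeline|vlm|MinerU-HTML) ;;
    *) echo "unsupported model_version: $model" >&2; return 1 ;;
  esac
  printf '{"url": "%s", "language": "%s", "model_version": "%s"}\n' \
    "$url" "$language" "$model"
}

# Example (not executed here):
#   curl -X POST "https://mineru.net/api/v4/extract/task" \
#     -H "Authorization: Bearer $MINERU_TOKEN" \
#     -H "Content-Type: application/json" \
#     -d "$(task_body "https://arxiv.org/pdf/2410.17247" en vlm)"
```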

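Model version comparison

| Version | Speed | Accuracy | Use case |
| --- | --- | --- | --- |
| pipeline | ⚡ Fast | High | Regular documents |
| vlm | 🐢 Slow | Highest | Complex layouts |
| MinerU-HTML | ⚡ Fast | High | Web-style output |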

📂 Output structure

The ZIP downloaded after parsing completes contains:

    output/
    ├── full.md              # Full Markdown
    ├── content_list.json    # Structured content
    ├── images/              # Extracted images
    └── layout.json          # Layout analysis results
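
After extraction, a quick check that the expected files are present can catch a truncated download early. A minimal sketch, assuming the file names listed above; `check_output` is a hypothetical helper, and treating `images/` as optional is an assumption (a document without figures may not produce it).

```shell
# Verify that DIR holds the files listed in the output structure.
# Returns 0 if all three files exist, 1 otherwise.
# Usage: check_output DIR
check_output() {
  dir=$1
  missing=0
  for f in full.md content_list.json layout.json; do
    [ -f "$dir/$f" ] || { echo "missing: $f" >&2; missing=1; }
  done
  # images/ is only reported, not required (assumption of this sketch).
  [ -d "$dir/images" ] || echo "note: no images/ directory in $dir" >&2
  return "$missing"
}

# Example (not executed here):
#   unzip result.zip -d . && check_output ./output
```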

Paper parsing workflow

    # 1. Create a directory for the paper
    mkdir -p "./paper-reading/[CVPR 2025] NewPaper"
    cd "./paper-reading/[CVPR 2025] NewPaper"

    # 2. Submit the parsing task
    TASK_ID=$(curl -s -X POST "https://mineru.net/api/v4/extract/task" \
      -H "Authorization: Bearer $MINERU_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"url": "https://arxiv.org/pdf/XXXX.XXXXX"}' | jq -r '.task_id')

    # 3. Wait for completion and download
    # (poll status until it is done, then download result.zip)

    # 4. Extract
    unzip result.zip -d .

Environment variables

Set in ~/.bashrc or in the OpenClaw config:

    export MINERU_TOKEN="your_api_key_here"
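
Scripts that rely on `MINERU_TOKEN` can fail fast when it is missing rather than sending an empty Bearer token. A minimal sketch; `auth_header` is a hypothetical helper name.

```shell
# Print the Authorization header, or fail if MINERU_TOKEN is unset or empty.
auth_header() {
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set" >&2
    return 1
  fi
  printf 'Authorization: Bearer %s\n' "$MINERU_TOKEN"
}

# Example (not executed here):
#   header=$(auth_header) || exit 1
#   curl "https://mineru.net/api/v4/extract/task/{task_id}" -H "$header"
```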

⚠️ Limits

| Limit | Value |
| --- | --- |
| Size per file | 200 MB |
| Pages per file | 600 pages |
| Concurrent tasks | Depends on plan |
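
The size limit can be checked locally before uploading. The sketch below is an illustration, not part of the skill: `check_size` is a hypothetical helper, and reading 200 MB as 200 × 1024 × 1024 bytes is an assumption, since the source does not say whether decimal or binary megabytes are meant. The page limit is not checked because that would need a PDF tool.

```shell
# 200 MB limit from the limits table; assumes 1 MB = 1024 * 1024 bytes.
MAX_BYTES=${MAX_BYTES:-$((200 * 1024 * 1024))}

# Return 0 if FILE fits the size limit, 1 if it is too large,
# 2 if it is not a regular file.
# Usage: check_size FILE
check_size() {
  [ -f "$1" ] || { echo "not a file: $1" >&2; return 2; }
  # tr strips the padding some wc implementations add.
  size=$(wc -c < "$1" | tr -d ' ')
  if [ "$size" -gt "$MAX_BYTES" ]; then
    echo "$1 is $size bytes, over the $MAX_BYTES byte limit" >&2
    return 1
  fi
}

# Example (not executed here):
#   check_size paper1.pdf && echo "ok to upload"
```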

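💡 Tips

  • For arXiv papers, pass the URL directly, for example https://arxiv.org/pdf/2410.17247.
  • For Chinese papers, set language: ch.
  • For complex tables, use the vlm model.
  • Batch processing saves quota: submitting several files in one request is more efficient than submitting them one at a time.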

📚 Related resources

  • Paper Banana Skill: generates figures for papers.

No more manual copy-pasting when parsing papers! 📖

Category context

Code helpers, APIs, CLIs, browser automation, testing, and developer operations.

Source: Tencent SkillHub

Largest current source with strong distribution and engagement signals.

Package contents

Included in package
1 Docs
  • SKILL.md Primary doc