# Send Links to PDFs to your agent
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
## Fast path
- Download the package from Yavira.
- Extract it into a folder your agent can access.
- Paste one of the prompts below and point your agent at the extracted folder (a command-line sketch of these steps follows this list).
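
If you prefer to stage the package from a terminal first, the minimal sketch below downloads and extracts it. It assumes `curl` and `unzip` are available and uses the download URL from the Links section at the end of this page.

```bash
# Download the ZIP package from Yavira (URL from the Links section below)
curl -L -o links-to-pdfs.zip https://openagent3.xyz/downloads/links-to-pdfs

# Extract into a folder your agent can access
mkdir -p links-to-pdfs && unzip -o links-to-pdfs.zip -d links-to-pdfs

# SKILL.md is the primary doc the agent should read first
ls links-to-pdfs/SKILL.md
```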
## Suggested prompts
### New install

```text
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
```
### Upgrade existing

```text
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
```
## Machine-readable fields
```json
{
  "schemaVersion": "1.0",
  "item": {
    "slug": "links-to-pdfs",
    "name": "Links to PDFs",
    "source": "tencent",
    "type": "skill",
    "category": "效率提升",
    "sourceUrl": "https://clawhub.ai/chrisling-dev/links-to-pdfs",
    "canonicalUrl": "https://clawhub.ai/chrisling-dev/links-to-pdfs",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadUrl": "/downloads/links-to-pdfs",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=links-to-pdfs",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "packageFormat": "ZIP package",
    "primaryDoc": "SKILL.md",
    "includedAssets": [
      "SKILL.md"
    ],
    "downloadMode": "redirect",
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-23T16:43:11.935Z",
      "expiresAt": "2026-04-30T16:43:11.935Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=4claw-imageboard",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=4claw-imageboard",
        "contentDisposition": "attachment; filename=\"4claw-imageboard-1.0.1.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/links-to-pdfs"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    }
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/links-to-pdfs",
    "downloadUrl": "https://openagent3.xyz/downloads/links-to-pdfs",
    "agentUrl": "https://openagent3.xyz/skills/links-to-pdfs/agent",
    "manifestUrl": "https://openagent3.xyz/skills/links-to-pdfs/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/links-to-pdfs/agent.md"
  }
}
```
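
An agent can read the same fields programmatically before downloading. The sketch below assumes `curl` and `jq` are installed and that the manifest is served at the `manifestUrl` listed above; the health check mirrors the `install.sourceHealth.status` field.

```bash
# Fetch the machine-readable manifest and check upstream source health
MANIFEST_URL="https://openagent3.xyz/skills/links-to-pdfs/agent.json"
STATUS=$(curl -sL "$MANIFEST_URL" | jq -r '.install.sourceHealth.status')

if [ "$STATUS" = "healthy" ]; then
  # Print the download URL to fetch next
  curl -sL "$MANIFEST_URL" | jq -r '.links.downloadUrl'
else
  echo "Source health is '$STATUS'; review the detail page before installing."
fi
```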
## Documentation

### docs-scraper

CLI tool that scrapes documents from various sources into local PDF files using browser automation.

### Installation

```bash
npm install -g docs-scraper
```

### Quick start

Scrape any document URL to PDF:

```bash
docs-scraper scrape https://example.com/document
```

Returns local path: `~/.docs-scraper/output/1706123456-abc123.pdf`

### Basic scraping

Scrape with the daemon (recommended, keeps the browser warm):

```bash
docs-scraper scrape <url>
```

Scrape with a named profile (for authenticated sites):

```bash
docs-scraper scrape <url> -p <profile-name>
```

Scrape with pre-filled data (e.g., an email for DocSend):

```bash
docs-scraper scrape <url> -D email=user@example.com
```

Direct mode (single-shot, no daemon):

```bash
docs-scraper scrape <url> --no-daemon
```

### Authentication workflow

When a document requires authentication (login, email verification, passcode):

The initial scrape returns a job ID:

```bash
docs-scraper scrape https://docsend.com/view/xxx
# Output: Scrape blocked
#         Job ID: abc123
```

Retry with the required data:

```bash
docs-scraper update abc123 -D email=user@example.com
# or with a password
docs-scraper update abc123 -D email=user@example.com -D password=1234
```

### Profile management

Profiles store session cookies for authenticated sites.

```bash
docs-scraper profiles list              # List saved profiles
docs-scraper profiles clear             # Clear all profiles
docs-scraper scrape <url> -p myprofile  # Use a profile
```

### Daemon management

The daemon keeps browser instances warm for faster scraping.

```bash
docs-scraper daemon status     # Check status
docs-scraper daemon start      # Start manually
docs-scraper daemon stop      # Stop daemon
```

Note: Daemon auto-starts when running scrape commands.

### Cleanup

PDFs are stored in ~/.docs-scraper/output/. The daemon automatically cleans up files older than 1 hour.

Manual cleanup:

```bash
docs-scraper cleanup                    # Delete all PDFs
docs-scraper cleanup --older-than 1h    # Delete PDFs older than 1 hour
```
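
If you want cleanup to run even when the daemon is not active, a scheduled job is one option. The crontab entry below is only a sketch and assumes `docs-scraper` is on the PATH that cron uses.

```bash
# Run cleanup at the top of every hour, removing PDFs older than 1 hour
0 * * * * docs-scraper cleanup --older-than 1h
```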

### Job management

```bash
docs-scraper jobs list         # List blocked jobs awaiting auth
```

### Supported sources

- Direct PDF links - Downloads the PDF directly
- Notion pages - Exports the Notion page to PDF
- DocSend documents - Handles the DocSend viewer
- LLM fallback - Uses the Claude API for any other webpage

### Scraper Reference

Each scraper accepts specific -D data fields. Use the appropriate fields based on the URL type.

### DirectPdfScraper

Handles: URLs ending in .pdf

Data fields: None (downloads directly)

Example:

```bash
docs-scraper scrape https://example.com/document.pdf
```

### DocsendScraper

Handles: docsend.com/view/*, docsend.com/v/*, and subdomains (e.g., org-a.docsend.com)

URL patterns:

- Documents: `https://docsend.com/view/{id}` or `https://docsend.com/v/{id}`
- Folders: `https://docsend.com/view/s/{id}`
- Subdomains: `https://{subdomain}.docsend.com/view/{id}`

Data fields:

| Field | Type | Description |
| --- | --- | --- |
| email | email | Email address for document access |
| password | password | Passcode/password for protected documents |
| name | text | Your name (required for NDA-gated documents) |

Examples:

```bash
# Pre-fill email for DocSend
docs-scraper scrape https://docsend.com/view/abc123 -D email=user@example.com

# With password protection
docs-scraper scrape https://docsend.com/view/abc123 -D email=user@example.com -D password=secret123

# With NDA name requirement
docs-scraper scrape https://docsend.com/view/abc123 -D email=user@example.com -D name="John Doe"

# Retry blocked job
docs-scraper update abc123 -D email=user@example.com -D password=secret123
```

Notes:

- DocSend may require any combination of email, password, and name
- Folders are scraped as a table-of-contents PDF with document links
- The scraper auto-checks NDA checkboxes when a name is provided

### NotionScraper

Handles: notion.so/*, *.notion.site/*

Data fields:

| Field | Type | Description |
| --- | --- | --- |
| email | email | Notion account email |
| password | password | Notion account password |

Examples:

```bash
# Public page (no auth needed)
docs-scraper scrape https://notion.so/Public-Page-abc123

# Private page with login
docs-scraper scrape https://notion.so/Private-Page-abc123 \
  -D email=user@example.com -D password=mypassword

# Custom domain
docs-scraper scrape https://docs.company.notion.site/Page-abc123
```

Notes:

- Public Notion pages don't require authentication
- Toggle blocks are automatically expanded before PDF generation
- Uses session profiles to persist login across scrapes (a profile-reuse sketch follows these notes)
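
Combining the documented `-p` profile flag with the login fields above, one plausible pattern is to log in once under a named profile and reuse the saved session for later scrapes. The profile name `notion` and the page URLs here are just placeholders.

```bash
# First scrape: log in and save the session under a named profile
docs-scraper scrape https://notion.so/Private-Page-abc123 -p notion \
  -D email=user@example.com -D password=mypassword

# Later scrapes: reuse the saved cookies, no credentials needed
docs-scraper scrape https://notion.so/Another-Private-Page-def456 -p notion
```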

### LlmFallbackScraper

Handles: Any URL not matched by other scrapers (automatic fallback)

Data fields: Dynamic - determined by Claude analyzing the page

The LLM scraper uses Claude to analyze the page HTML and detect:

- Login forms (extracts field names dynamically)
- Cookie banners (auto-dismisses)
- Expandable content (auto-expands)
- CAPTCHAs (reports as blocked)
- Paywalls (reports as blocked)

Common dynamic fields:

| Field | Type | Description |
| --- | --- | --- |
| email | email | Login email (if detected) |
| password | password | Login password (if detected) |
| username | text | Username (if login uses username) |

Examples:

```bash
# Generic webpage (no auth)
docs-scraper scrape https://example.com/article

# Webpage requiring login
docs-scraper scrape https://members.example.com/article \
  -D email=user@example.com -D password=secret

# When blocked, check the job for required fields
docs-scraper jobs list
# Then retry with the fields the scraper detected
docs-scraper update abc123 -D username=myuser -D password=secret
```

Notes:

- Requires the `ANTHROPIC_API_KEY` environment variable
- Field names are extracted from the page's actual form fields
- Limited to 2 login attempts before failing
- CAPTCHAs require manual intervention

### Data field summary

| Scraper | email | password | name | Other |
| --- | --- | --- | --- | --- |
| DirectPdf | - | - | - | - |
| DocSend | ✓ | ✓ | ✓ | - |
| Notion | ✓ | ✓ | - | - |
| LLM Fallback | ✓* | ✓* | - | Dynamic* |

*Fields detected dynamically from page analysis

### Environment setup (optional)

Only needed for LLM fallback scraper:

```bash
export ANTHROPIC_API_KEY=your_key
```

Optional browser settings:

```bash
export BROWSER_HEADLESS=true   # Set to false for debugging
```

### Common patterns

Archive a Notion page:

```bash
docs-scraper scrape https://notion.so/My-Page-abc123
```

Download a protected DocSend:

```bash
docs-scraper scrape https://docsend.com/view/xxx
# If blocked:
docs-scraper update <job-id> -D email=user@example.com -D password=1234
```

Batch scraping with profiles (a loop sketch for larger batches follows):

```bash
docs-scraper scrape https://site.com/doc1 -p mysite
docs-scraper scrape https://site.com/doc2 -p mysite
```
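
For larger batches, the same pattern can be driven from a file of URLs. The loop below is only a sketch: `urls.txt` (one URL per line) and the `mysite` profile name are assumptions, and each scrape prints its own output path on success.

```bash
# Scrape every URL listed in urls.txt, reusing one saved profile
while IFS= read -r url; do
  echo "Scraping: $url"
  docs-scraper scrape "$url" -p mysite
done < urls.txt
```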

### Output

- Success: local file path (e.g., `~/.docs-scraper/output/1706123456-abc123.pdf`)
- Blocked: job ID + required credential types

### Troubleshooting

- Timeout: `docs-scraper daemon stop && docs-scraper daemon start`
- Auth fails: `docs-scraper jobs list` to check pending jobs
- Disk full: `docs-scraper cleanup` to remove old PDFs
## Trust
- Source: tencent
- Verification: Indexed source record
- Publisher: chrisling-dev
- Version: 0.0.1
## Source health
- Status: healthy
- Summary: Source download looks usable.
- Detail: Yavira can redirect you to the upstream package for this source.
- Health scope: source
- Reason: direct_download_ok
- Checked at: 2026-04-23T16:43:11.935Z
- Expires at: 2026-04-30T16:43:11.935Z
- Recommended action: Download for OpenClaw
## Links
- [Detail page](https://openagent3.xyz/skills/links-to-pdfs)
- [Send to Agent page](https://openagent3.xyz/skills/links-to-pdfs/agent)
- [JSON manifest](https://openagent3.xyz/skills/links-to-pdfs/agent.json)
- [Markdown brief](https://openagent3.xyz/skills/links-to-pdfs/agent.md)
- [Download page](https://openagent3.xyz/downloads/links-to-pdfs)