# Send AnyCrawl-API to your agent
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
## Fast path
- Download the package from Yavira.
- Extract it into a folder your agent can access.
- Paste one of the prompts below and point your agent at the extracted folder.
## Suggested prompts
### New install

```text
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
```
### Upgrade existing

```text
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
```
## Machine-readable fields
```json
{
  "schemaVersion": "1.0",
  "item": {
    "slug": "anycrawl",
    "name": "AnyCrawl-API",
    "source": "tencent",
    "type": "skill",
    "category": "开发工具",
    "sourceUrl": "https://clawhub.ai/techlaai/anycrawl",
    "canonicalUrl": "https://clawhub.ai/techlaai/anycrawl",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadUrl": "/downloads/anycrawl",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=anycrawl",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "packageFormat": "ZIP package",
    "primaryDoc": "SKILL.md",
    "includedAssets": [
      "index.js",
      "package.json",
      "README.md",
      "SKILL.md"
    ],
    "downloadMode": "redirect",
    "sourceHealth": {
      "source": "tencent",
      "slug": "anycrawl",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T00:34:36.569Z",
      "expiresAt": "2026-05-07T00:34:36.569Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=anycrawl",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=anycrawl",
        "contentDisposition": "attachment; filename=\"anycrawl-1.0.1.zip\"",
        "redirectLocation": null,
        "bodySnippet": null,
        "slug": "anycrawl"
      },
      "scope": "item",
      "summary": "Item download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this item.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/anycrawl"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    }
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/anycrawl",
    "downloadUrl": "https://openagent3.xyz/downloads/anycrawl",
    "agentUrl": "https://openagent3.xyz/skills/anycrawl/agent",
    "manifestUrl": "https://openagent3.xyz/skills/anycrawl/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/anycrawl/agent.md"
  }
}
```
## Documentation

### AnyCrawl Skill

AnyCrawl API integration for OpenClaw: scrape, crawl, and search web content with high-performance, multi-threaded crawling.

### Method 1: Environment variable (Recommended)

```bash
export ANYCRAWL_API_KEY="your-api-key"
```

Make it permanent by adding it to `~/.bashrc` or `~/.zshrc`:

```bash
echo 'export ANYCRAWL_API_KEY="your-api-key"' >> ~/.bashrc
source ~/.bashrc
```

Get your API key at: https://anycrawl.dev

### Method 2: OpenClaw gateway config

```bash
openclaw config.patch --set ANYCRAWL_API_KEY="your-api-key"
```
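
Whichever method you use, it is worth confirming the key is actually visible to the shell before invoking the skill. A minimal sanity check (the value shown is a placeholder, not a real key):

```shell
# Confirm the key is set in the current environment before running the skill.
# "your-api-key" is a placeholder; substitute your real key.
export ANYCRAWL_API_KEY="your-api-key"
if [ -n "$ANYCRAWL_API_KEY" ]; then
  echo "ANYCRAWL_API_KEY is set"
else
  echo "ANYCRAWL_API_KEY is missing" >&2
  exit 1
fi
```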

### 1. anycrawl_scrape

Scrape a single URL and convert to LLM-ready structured data.

Parameters:

- `url` (string, required): URL to scrape
- `engine` (string, optional): Scraping engine: `"cheerio"` (default), `"playwright"`, `"puppeteer"`
- `formats` (array, optional): Output formats: `["markdown"]`, `["html"]`, `["text"]`, `["json"]`, `["screenshot"]`
- `timeout` (number, optional): Timeout in milliseconds (default: 30000)
- `wait_for` (number, optional): Delay before extraction in ms (browser engines only)
- `wait_for_selector` (string/object/array, optional): Wait for CSS selectors
- `include_tags` (array, optional): Include only these HTML tags (e.g., `["h1", "p", "article"]`)
- `exclude_tags` (array, optional): Exclude these HTML tags
- `proxy` (string, optional): Proxy URL (e.g., `"http://proxy:port"`)
- `json_options` (object, optional): JSON extraction with schema/prompt
- `extract_source` (string, optional): `"markdown"` (default) or `"html"`

Examples:

```javascript
// Basic scrape with the default cheerio engine
anycrawl_scrape({ url: "https://example.com" })

// Scrape an SPA with Playwright
anycrawl_scrape({
  url: "https://spa-example.com",
  engine: "playwright",
  formats: ["markdown", "screenshot"]
})

// Extract structured JSON
anycrawl_scrape({
  url: "https://product-page.com",
  engine: "cheerio",
  json_options: {
    schema: {
      type: "object",
      properties: {
        product_name: { type: "string" },
        price: { type: "number" },
        description: { type: "string" }
      },
      required: ["product_name", "price"]
    },
    user_prompt: "Extract product details from this page"
  }
})
```

### 2. anycrawl_search

Search Google and return structured results.

Parameters:

- `query` (string, required): Search query
- `engine` (string, optional): Search engine: `"google"` (default)
- `limit` (number, optional): Max results per page (default: 10)
- `offset` (number, optional): Number of results to skip (default: 0)
- `pages` (number, optional): Number of pages to retrieve (default: 1, max: 20)
- `lang` (string, optional): Language locale (e.g., `"en"`, `"zh"`, `"vi"`)
- `safe_search` (number, optional): `0` (off), `1` (medium), `2` (high)
- `scrape_options` (object, optional): Scrape each result URL with these options

Examples:

```javascript
// Basic search
anycrawl_search({ query: "OpenAI ChatGPT" })

// Multi-page search in Vietnamese
anycrawl_search({
  query: "hướng dẫn Node.js",
  pages: 3,
  lang: "vi"
})

// Search and auto-scrape results
anycrawl_search({
  query: "best AI tools 2026",
  limit: 5,
  scrape_options: {
    engine: "cheerio",
    formats: ["markdown"]
  }
})
```
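
Since `pages`, `limit`, and `offset` interact, it can help to make the paging arithmetic explicit: page `i` of a multi-page search begins at offset `i * limit`. A hypothetical helper (not part of the skill itself) illustrating that relationship:

```javascript
// Hypothetical helper: derive the starting offset for each page of a
// multi-page search from the per-page limit. Illustrative only.
function pageOffsets(pages, limit = 10) {
  return Array.from({ length: pages }, (_, i) => i * limit);
}

console.log(pageOffsets(3, 10)); // [ 0, 10, 20 ]
```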

### 3. anycrawl_crawl_start

Start crawling an entire website (async job).

Parameters:

- `url` (string, required): Seed URL to start crawling
- `engine` (string, optional): `"cheerio"` (default), `"playwright"`, `"puppeteer"`
- `strategy` (string, optional): `"all"`, `"same-domain"` (default), `"same-hostname"`, `"same-origin"`
- `max_depth` (number, optional): Max depth from the seed URL (default: 10)
- `limit` (number, optional): Max pages to crawl (default: 100)
- `include_paths` (array, optional): Path patterns to include (e.g., `["/blog/*"]`)
- `exclude_paths` (array, optional): Path patterns to exclude (e.g., `["/admin/*"]`)
- `scrape_paths` (array, optional): Only scrape URLs matching these patterns
- `scrape_options` (object, optional): Per-page scrape options

Examples:

```javascript
// Crawl an entire website
anycrawl_crawl_start({
  url: "https://docs.example.com",
  engine: "cheerio",
  max_depth: 5,
  limit: 50
})

// Crawl only blog posts
anycrawl_crawl_start({
  url: "https://example.com",
  strategy: "same-domain",
  include_paths: ["/blog/*"],
  exclude_paths: ["/blog/tags/*"],
  scrape_options: {
    formats: ["markdown"]
  }
})

// Crawl product pages only
anycrawl_crawl_start({
  url: "https://shop.example.com",
  strategy: "same-domain",
  scrape_paths: ["/products/*"],
  limit: 200
})
```

### 4. anycrawl_crawl_status

Check crawl job status.

Parameters:

- `job_id` (string, required): Crawl job ID

Example:

```javascript
anycrawl_crawl_status({ job_id: "7a2e165d-8f81-4be6-9ef7-23222330a396" })
```
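
Because crawls run as async jobs, agents typically poll the status until the job settles. A sketch of that loop, where `getStatus` is a stand-in for the `anycrawl_crawl_status` call and the terminal status strings (`"completed"`, `"failed"`) are assumptions, not documented values:

```javascript
// Poll a crawl job until it reaches a terminal state. `getStatus` stands in
// for anycrawl_crawl_status; the terminal status names are assumptions.
async function waitForCrawl(getStatus, jobId, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { status } = await getStatus({ job_id: jobId });
    if (status === "completed" || status === "failed") return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Crawl ${jobId} did not finish within ${maxAttempts} checks`);
}
```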

### 5. anycrawl_crawl_results

Get crawl results (paginated).

Parameters:

- `job_id` (string, required): Crawl job ID
- `skip` (number, optional): Number of results to skip (default: 0)

Example:

```javascript
// Get the first 100 results
anycrawl_crawl_results({ job_id: "xxx", skip: 0 })

// Get the next 100 results
anycrawl_crawl_results({ job_id: "xxx", skip: 100 })
```
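
To collect everything, keep advancing `skip` until a page comes back short or empty. A sketch, with `getResults` standing in for `anycrawl_crawl_results`; the 100-item page size is inferred from the example above, not a documented constant:

```javascript
// Drain paginated crawl results by advancing `skip` one page at a time.
// `getResults` stands in for anycrawl_crawl_results; 100 is an assumed page size.
async function allCrawlResults(getResults, jobId, pageSize = 100) {
  const results = [];
  for (let skip = 0; ; skip += pageSize) {
    const { data } = await getResults({ job_id: jobId, skip });
    if (!data || data.length === 0) break;
    results.push(...data);
    if (data.length < pageSize) break; // short page: nothing left to fetch
  }
  return results;
}
```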

### 6. anycrawl_crawl_cancel

Cancel a running crawl job.

Parameters:

- `job_id` (string, required): Crawl job ID

### 7. anycrawl_search_and_scrape

Quick helper: Search Google then scrape top results.

Parameters:

- `query` (string, required): Search query
- `max_results` (number, optional): Max results to scrape (default: 3)
- `scrape_engine` (string, optional): Engine for scraping (default: `"cheerio"`)
- `formats` (array, optional): Output formats (default: `["markdown"]`)
- `lang` (string, optional): Search language

Example:

```javascript
anycrawl_search_and_scrape({
  query: "latest AI news",
  max_results: 5,
  formats: ["markdown"]
})
```

### Engine Selection Guide

| Engine | Best For | Speed | JS Rendering |
| --- | --- | --- | --- |
| `cheerio` | Static HTML, news, blogs | ⚡ Fastest | ❌ No |
| `playwright` | SPAs, complex web apps | 🐢 Slower | ✅ Yes |
| `puppeteer` | Chrome-specific, metrics | 🐢 Slower | ✅ Yes |

### Response Format

All responses follow this structure:

```json
{
  "success": true,
  "data": { ... },
  "message": "Optional message"
}
```

Error response:

```json
{
  "success": false,
  "error": "Error type",
  "message": "Human-readable message"
}
```
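
An illustrative helper (not part of the skill) can enforce that envelope at call sites, returning `data` on success and raising on failure:

```javascript
// Unwrap the success/error envelope described above: return `data` when
// success is true, otherwise throw a descriptive error. Illustrative only.
function unwrap(response) {
  if (response.success) return response.data;
  throw new Error(`${response.error}: ${response.message}`);
}

unwrap({ success: true, data: { title: "Example" } }); // → { title: "Example" }
```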

### Common Error Codes

- `400` - Bad Request (validation errors)
- `401` - Unauthorized (invalid API key)
- `402` - Payment Required (insufficient credits)
- `404` - Not Found
- `429` - Rate limit exceeded
- `500` - Internal server error
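
For `429` in particular, backing off and retrying is usually enough. A hypothetical doubling-with-cap delay schedule (the base and cap values are arbitrary examples, not taken from the API docs):

```javascript
// Hypothetical exponential-backoff schedule for 429 retries: the delay
// doubles each attempt up to a cap. Base/cap values are arbitrary examples.
function backoffDelays(attempts, baseMs = 500, capMs = 8000) {
  return Array.from({ length: attempts }, (_, i) => Math.min(baseMs * 2 ** i, capMs));
}

console.log(backoffDelays(5)); // [ 500, 1000, 2000, 4000, 8000 ]
```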

### API Limits

- Rate limits apply based on your plan.
- Crawl jobs expire after 24 hours.
- The maximum crawl limit depends on your available credits.

### Links

- API Docs: https://docs.anycrawl.dev
- Website: https://anycrawl.dev
- Playground: https://anycrawl.dev/playground
## Trust
- Source: tencent
- Verification: Indexed source record
- Publisher: techlaai
- Version: 1.0.1
## Source health
- Status: healthy
- Item download looks usable.
- Yavira can redirect you to the upstream package for this item.
- Health scope: item
- Reason: direct_download_ok
- Checked at: 2026-04-30T00:34:36.569Z
- Expires at: 2026-05-07T00:34:36.569Z
- Recommended action: Download for OpenClaw
## Links
- [Detail page](https://openagent3.xyz/skills/anycrawl)
- [Send to Agent page](https://openagent3.xyz/skills/anycrawl/agent)
- [JSON manifest](https://openagent3.xyz/skills/anycrawl/agent.json)
- [Markdown brief](https://openagent3.xyz/skills/anycrawl/agent.md)
- [Download page](https://openagent3.xyz/downloads/anycrawl)