{
  "schemaVersion": "1.0",
  "item": {
    "slug": "webscraper-pulpminer",
    "name": "PulpMiner Web Scraper - Convert Any Webpage to Realtime JSON API",
    "source": "tencent",
    "type": "skill",
    "category": "Developer Tools",
    "sourceUrl": "https://clawhub.ai/melvin2016/webscraper-pulpminer",
    "canonicalUrl": "https://clawhub.ai/melvin2016/webscraper-pulpminer",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/webscraper-pulpminer",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=webscraper-pulpminer",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "SKILL.md"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T16:55:25.780Z",
      "expiresAt": "2026-05-07T16:55:25.780Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
        "contentDisposition": "attachment; filename=\"network-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/webscraper-pulpminer"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/webscraper-pulpminer",
    "agentPageUrl": "https://openagent3.xyz/skills/webscraper-pulpminer/agent",
    "manifestUrl": "https://openagent3.xyz/skills/webscraper-pulpminer/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/webscraper-pulpminer/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "PulpMiner — AI Web Scraping & JSON API",
        "body": "PulpMiner converts any webpage into structured JSON using AI. You provide a URL and optionally a JSON template, and PulpMiner scrapes the page, runs it through an LLM, and returns clean structured data."
      },
      {
        "title": "Authentication",
        "body": "All API calls require the apikey header:\n\napikey: <PULPMINER_API_KEY>\n\nGet your API key from https://pulpminer.com/api — click \"Regenerate Key\" if you don't have one."
      },
      {
        "title": "Core Workflow",
        "body": "PulpMiner works in two phases:\n\n1. Create a saved API: configure a URL, scraper, LLM, and optional JSON template in the PulpMiner dashboard at https://pulpminer.com/api\n2. Call the saved API: use the external endpoint with your API key to fetch structured JSON"
      },
      {
        "title": "Static API (fixed URL)",
        "body": "curl -X GET \"https://api.pulpminer.com/external/<apiId>\" \\\n  -H \"apikey: <PULPMINER_API_KEY>\"\n\nReturns JSON extracted from the configured webpage."
      },
      {
        "title": "Dynamic API (URL with variables)",
        "body": "For APIs saved with template URLs like https://example.com/search?q={{query}}&page={{page}}:\n\ncurl -X POST \"https://api.pulpminer.com/external/<apiId>\" \\\n  -H \"apikey: <PULPMINER_API_KEY>\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"query\": \"javascript frameworks\", \"page\": \"1\"}'\n\nThe {{variable}} placeholders in the saved URL get replaced with the values you provide."
      },
      {
        "title": "Response Format",
        "body": "Successful responses return:\n\n{\n  \"data\": { ... },\n  \"errors\": null\n}\n\nError responses return:\n\n{\n  \"data\": null,\n  \"errors\": \"Error message describing what went wrong\"\n}"
      },
      {
        "title": "Caching",
        "body": "API responses are cached for 24 hours by default\nIf a cached response is older than 15 minutes, PulpMiner serves it while refreshing the cache in the background\nCaching can be disabled per API in the dashboard settings"
      },
      {
        "title": "Configuration Options (Set in Dashboard)",
        "body": "When creating a saved API at https://pulpminer.com/api, you can configure:\n\nOption\tDescription\nURL\tThe webpage to scrape\nJSON Template\tOptional JSON structure for the LLM to follow (e.g., {\"name\": \"\", \"price\": \"\"})\nRender JS\tEnable for SPAs and JS-heavy pages (uses headless browser)\nCSS Selector\tExtract only a specific part of the page (e.g., .product-list, #main-content)\nExtra Instructions\tAdditional guidance for the AI (e.g., \"Only extract items with prices above $50\")\nDynamic URL\tEnable template variables in the URL with {{variable}} syntax\nCache\tToggle response caching on/off"
      },
      {
        "title": "Integration with Zapier",
        "body": "For async scraping in Zapier workflows:\n\n# Static API\ncurl -X POST \"https://api.pulpminer.com/external/zapier/get/<apiId>\" \\\n  -H \"apikey: <PULPMINER_API_KEY>\" \\\n  -d '{\"callbackURL\": \"https://hooks.zapier.com/...\"}'\n\n# Dynamic API\ncurl -X POST \"https://api.pulpminer.com/external/zapier/post/<apiId>\" \\\n  -H \"apikey: <PULPMINER_API_KEY>\" \\\n  -d '{\"callbackURL\": \"https://hooks.zapier.com/...\", \"query\": \"value\"}'\n\nReturns 201 immediately. Sends scraped data to the callback URL when complete."
      },
      {
        "title": "Integration with n8n",
        "body": "Verify authentication:\n\ncurl -X GET \"https://api.pulpminer.com/external/n8n/auth\" \\\n  -H \"apikey: <PULPMINER_API_KEY>\"\n\nThen use the standard /external/<apiId> endpoints for data fetching."
      },
      {
        "title": "Credits",
        "body": "Each API call costs 0.25–0.4 credits depending on the endpoint\nJavaScript rendering costs an extra 0.1 credits\nNew users get 5 free credits\nPurchase more at https://pulpminer.com/credits"
      },
      {
        "title": "Tips",
        "body": "Use CSS selectors to narrow down the scraped content and improve accuracy\nProvide a JSON template for consistent, predictable output structures\nEnable JS rendering only when needed — static pages scrape faster and cost fewer credits\nUse extra instructions to guide the AI (e.g., \"Return dates in ISO 8601 format\")\nFor monitoring use cases, keep caching enabled to reduce credit usage\nUse the playground first to verify a URL is scrapable before saving an API config\nDynamic APIs are ideal for search pages, paginated content, and parameterized URLs"
      },
      {
        "title": "Links",
        "body": "Website: https://pulpminer.com\nAPI Dashboard: https://pulpminer.com/api"
      }
    ],
    "body": "PulpMiner — AI Web Scraping & JSON API\n\nPulpMiner converts any webpage into structured JSON using AI. You provide a URL and optionally a JSON template, and PulpMiner scrapes the page, runs it through an LLM, and returns clean structured data.\n\nAuthentication\n\nAll API calls require the apikey header:\n\napikey: <PULPMINER_API_KEY>\n\n\nGet your API key from https://pulpminer.com/api — click \"Regenerate Key\" if you don't have one.\n\nCore Workflow\n\nPulpMiner works in two phases:\n\nCreate a saved API — Configure a URL, scraper, LLM, and optional JSON template via the PulpMiner dashboard at https://pulpminer.com/api\nCall the saved API — Use the external endpoint with your API key to fetch structured JSON\nCalling a Saved API\nStatic API (fixed URL)\ncurl -X GET \"https://api.pulpminer.com/external/<apiId>\" \\\n  -H \"apikey: <PULPMINER_API_KEY>\"\n\n\nReturns JSON extracted from the configured webpage.\n\nDynamic API (URL with variables)\n\nFor APIs saved with template URLs like https://example.com/search?q={{query}}&page={{page}}:\n\ncurl -X POST \"https://api.pulpminer.com/external/<apiId>\" \\\n  -H \"apikey: <PULPMINER_API_KEY>\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"query\": \"javascript frameworks\", \"page\": \"1\"}'\n\n\nThe {{variable}} placeholders in the saved URL get replaced with the values you provide.\n\nResponse Format\n\nSuccessful responses return:\n\n{\n  \"data\": { ... },\n  \"errors\": null\n}\n\n\nError responses return:\n\n{\n  \"data\": null,\n  \"errors\": \"Error message describing what went wrong\"\n}\n\nCaching\nAPI responses are cached for 24 hours by default\nIf cache is older than 15 minutes, PulpMiner serves the cached version while refreshing in the background\nCache can be disabled per-API in the dashboard settings\nConfiguration Options (Set in Dashboard)\n\nWhen creating a saved API at https://pulpminer.com/api, you can configure:\n\nOption\tDescription\nURL\tThe webpage to scrape\nJSON Template\tOptional JSON structure for the LLM to follow (e.g., {\"name\": \"\", \"price\": \"\"})\nRender JS\tEnable for SPAs and JS-heavy pages (uses headless browser)\nCSS Selector\tExtract only a specific part of the page (e.g., .product-list, #main-content)\nExtra Instructions\tAdditional guidance for the AI (e.g., \"Only extract items with prices above $50\")\nDynamic URL\tEnable template variables in the URL with {{variable}} syntax\nCache\tToggle response caching on/off\nIntegration with Zapier\n\nFor async scraping in Zapier workflows:\n\n# Static API\ncurl -X POST \"https://api.pulpminer.com/external/zapier/get/<apiId>\" \\\n  -H \"apikey: <PULPMINER_API_KEY>\" \\\n  -d '{\"callbackURL\": \"https://hooks.zapier.com/...\"}'\n\n# Dynamic API\ncurl -X POST \"https://api.pulpminer.com/external/zapier/post/<apiId>\" \\\n  -H \"apikey: <PULPMINER_API_KEY>\" \\\n  -d '{\"callbackURL\": \"https://hooks.zapier.com/...\", \"query\": \"value\"}'\n\n\nReturns 201 immediately. Sends scraped data to the callback URL when complete.\n\nIntegration with n8n\n\nVerify authentication:\n\ncurl -X GET \"https://api.pulpminer.com/external/n8n/auth\" \\\n  -H \"apikey: <PULPMINER_API_KEY>\"\n\n\nThen use the standard /external/<apiId> endpoints for data fetching.\n\nCredits\nEach API call costs 0.25–0.4 credits depending on the endpoint\nJavaScript rendering adds 0.1 credits extra\nNew users get 5 free credits\nPurchase more at https://pulpminer.com/credits\nTips\nUse CSS selectors to narrow down the scraped content and improve accuracy\nProvide a JSON template for consistent, predictable output structures\nEnable JS rendering only when needed — static pages scrape faster and cost fewer credits\nUse extra instructions to guide the AI (e.g., \"Return dates in ISO 8601 format\")\nFor monitoring use cases, keep caching enabled to reduce credit usage\nUse the playground first to verify a URL is scrapable before saving an API config\nDynamic APIs are ideal for search pages, paginated content, and parameterized URLs\nLinks\nWebsite: https://pulpminer.com\nAPI Dashboard: https://pulpminer.com/api"
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/melvin2016/webscraper-pulpminer",
    "publisherUrl": "https://clawhub.ai/melvin2016/webscraper-pulpminer",
    "owner": "melvin2016",
    "version": "1.0.1",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/webscraper-pulpminer",
    "downloadUrl": "https://openagent3.xyz/downloads/webscraper-pulpminer",
    "agentUrl": "https://openagent3.xyz/skills/webscraper-pulpminer/agent",
    "manifestUrl": "https://openagent3.xyz/skills/webscraper-pulpminer/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/webscraper-pulpminer/agent.md"
  }
}