{
  "schemaVersion": "1.0",
  "item": {
    "slug": "wrynai-skill",
    "name": "WrynAI Skill",
    "source": "tencent",
    "type": "skill",
    "category": "开发工具",
    "sourceUrl": "https://clawhub.ai/wrynai/wrynai-skill",
    "canonicalUrl": "https://clawhub.ai/wrynai/wrynai-skill",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/wrynai-skill",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=wrynai-skill",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "SKILL.MD"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-05-07T17:22:31.273Z",
      "expiresAt": "2026-05-14T17:22:31.273Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=afrexai-annual-report",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=afrexai-annual-report",
        "contentDisposition": "attachment; filename=\"afrexai-annual-report-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/wrynai-skill"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/wrynai-skill",
    "agentPageUrl": "https://openagent3.xyz/skills/wrynai-skill/agent",
    "manifestUrl": "https://openagent3.xyz/skills/wrynai-skill/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/wrynai-skill/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "Overview",
        "body": "This skill enables OpenClaw to perform advanced web crawling and content extraction using the WrynAI SDK. It provides capabilities for multi-page crawling, content extraction, search engine results parsing, and intelligent data gathering from websites."
      },
      {
        "title": "Core Capabilities",
        "body": "Multi-page crawling with depth and breadth control\nContent extraction (text, markdown, structured data, links)\nSearch engine results parsing (SERP data)\nScreenshot capture (viewport and full-page)\nSmart listing extraction (e-commerce, directory pages)\nPattern-based URL filtering for targeted crawling"
      },
      {
        "title": "Environment Setup",
        "body": "# Install the WrynAI SDK\npip install wrynai\n\n# Set your API key as environment variable\nexport WRYNAI_API_KEY=\"your-api-key-here\""
      },
      {
        "title": "API Key",
        "body": "Sign up at https://wryn.ai to obtain an API key. The key must be set in the WRYNAI_API_KEY environment variable."
      },
      {
        "title": "1. Basic Website Crawling",
        "body": "Use this when the user wants to crawl an entire website or section of a website.\n\nimport os\nfrom wrynai import WrynAI, WrynAIError\n\ndef crawl_website(url: str, max_pages: int = 10) -> dict:\n    \"\"\"\n    Crawl a website starting from the given URL.\n    \n    Args:\n        url: Starting URL for the crawl\n        max_pages: Maximum number of pages to crawl (hard limit: 10)\n    \n    Returns:\n        Dictionary containing crawl results with pages and their content\n    \"\"\"\n    api_key = os.environ.get(\"WRYNAI_API_KEY\")\n    if not api_key:\n        raise ValueError(\"WRYNAI_API_KEY environment variable required\")\n    \n    try:\n        with WrynAI(api_key=api_key) as client:\n            result = client.crawl(\n                url=url,\n                max_pages=min(max_pages, 10),  # Hard limit enforced\n                max_depth=3,\n                return_urls=True,\n            )\n            \n            return {\n                \"success\": result.success,\n                \"total_pages\": result.total_pages,\n                \"total_visited\": result.total_visited,\n                \"pages\": [\n                    {\n                        \"url\": page.page_url,\n                        \"content\": page.content,\n                        \"urls_found\": len(page.urls),\n                        \"discovered_urls\": page.urls[:10],  # First 10 URLs\n                    }\n                    for page in result.pages\n                ],\n            }\n    except WrynAIError as e:\n        return {\n            \"success\": False,\n            \"error\": str(e),\n            \"status_code\": getattr(e, 'status_code', None),\n        }\n\nWhen to use:\n\nUser asks to \"crawl a website\"\nUser wants to gather content from multiple pages\nUser needs to discover site structure"
      },
      {
        "title": "2. Documentation Crawling",
        "body": "Specialized crawling for documentation sites with pattern filtering.\n\nfrom wrynai import WrynAI, Engine\n\ndef crawl_documentation(base_url: str, doc_patterns: list = None) -> list:\n    \"\"\"\n    Crawl documentation sites with targeted URL patterns.\n    \n    Args:\n        base_url: Base URL of the documentation site\n        doc_patterns: List of URL patterns to include (e.g., [\"/docs/\", \"/api/\"])\n    \n    Returns:\n        List of crawled documentation pages with content\n    \"\"\"\n    api_key = os.environ.get(\"WRYNAI_API_KEY\")\n    doc_patterns = doc_patterns or [\"/docs/\", \"/guide/\", \"/api/\", \"/reference/\"]\n    \n    with WrynAI(api_key=api_key) as client:\n        result = client.crawl(\n            url=base_url,\n            max_pages=10,\n            max_depth=3,\n            include_patterns=doc_patterns,\n            exclude_patterns=[\"/internal/\", \"/draft/\", \"/changelog/\", \"/admin/\"],\n            return_urls=True,\n            timeout_ms=60000,  # 60 seconds for documentation crawling\n        )\n        \n        return [\n            {\n                \"url\": page.page_url,\n                \"content\": page.content,\n                \"word_count\": len(page.content.split()),\n            }\n            for page in result.pages\n        ]\n\nWhen to use:\n\nUser needs to extract documentation content\nUser wants to crawl specific sections of a site\nUser needs to build a knowledge base from docs"
      },
      {
        "title": "3. Search + Crawl Pipeline",
        "body": "Search for topics and crawl the top results.\n\nfrom wrynai import WrynAI, CountryCode, WrynAIError\nimport time\n\ndef search_and_crawl(query: str, num_sites: int = 3, country: str = \"US\") -> list:\n    \"\"\"\n    Search for a query and crawl the top results.\n    \n    Args:\n        query: Search query\n        num_sites: Number of top results to crawl\n        country: Country code for search localization\n    \n    Returns:\n        List of search results with crawled content\n    \"\"\"\n    api_key = os.environ.get(\"WRYNAI_API_KEY\")\n    \n    with WrynAI(api_key=api_key) as client:\n        # Step 1: Perform search\n        try:\n            search_result = client.search(\n                query=query,\n                num_results=num_sites,\n                country_code=getattr(CountryCode, country, CountryCode.US),\n                timeout_ms=120000,\n            )\n        except WrynAIError as e:\n            return [{\"error\": f\"Search failed: {str(e)}\"}]\n        \n        # Step 2: Crawl each result\n        results = []\n        for result in search_result.organic_results[:num_sites]:\n            try:\n                crawl_result = client.crawl(\n                    url=result.url,\n                    max_pages=3,\n                    max_depth=1,\n                    timeout_ms=60000,\n                )\n                \n                results.append({\n                    \"search_position\": result.position,\n                    \"title\": result.title,\n                    \"url\": result.url,\n                    \"snippet\": result.snippet,\n                    \"crawled_pages\": [\n                        {\n                            \"url\": page.page_url,\n                            \"content_preview\": page.content[:500],\n                            \"full_content\": page.content,\n                        }\n                        for page in crawl_result.pages\n                    ],\n                })\n                \n                # Rate limiting courtesy\n                time.sleep(1)\n                \n            except WrynAIError as e:\n                results.append({\n                    \"title\": result.title,\n                    \"url\": result.url,\n                    \"error\": str(e),\n                })\n        \n        return results\n\nWhen to use:\n\nUser wants to research a topic comprehensively\nUser needs content from top search results\nUser wants to compare information across multiple sources"
      },
      {
        "title": "4. Content Extraction Only",
        "body": "Extract specific content types without crawling.\n\nfrom wrynai import WrynAI, Engine\n\ndef extract_page_content(url: str, content_type: str = \"text\") -> dict:\n    \"\"\"\n    Extract specific content from a single page.\n    \n    Args:\n        url: Target URL\n        content_type: Type of content to extract \n                     (\"text\", \"markdown\", \"structured\", \"links\", \"title\")\n    \n    Returns:\n        Dictionary with extracted content\n    \"\"\"\n    api_key = os.environ.get(\"WRYNAI_API_KEY\")\n    \n    with WrynAI(api_key=api_key) as client:\n        try:\n            if content_type == \"text\":\n                result = client.extract_text(url, extract_main_content=True)\n                return {\"url\": url, \"text\": result.text}\n            \n            elif content_type == \"markdown\":\n                result = client.extract_markdown(url, extract_main_content=True)\n                return {\"url\": url, \"markdown\": result.markdown}\n            \n            elif content_type == \"structured\":\n                result = client.extract_structured_text(url)\n                return {\n                    \"url\": url,\n                    \"main_text\": result.main_text,\n                    \"headings\": [\n                        {\"level\": h.level, \"tag\": h.tag, \"text\": h.text}\n                        for h in result.headings\n                    ],\n                    \"links\": [\n                        {\"text\": l.text, \"url\": l.url, \"internal\": l.internal}\n                        for l in result.links\n                    ],\n                }\n            \n            elif content_type == \"links\":\n                result = client.extract_links(url)\n                return {\n                    \"url\": url,\n                    \"links\": [\n                        {\"text\": l.text, \"url\": l.url, \"internal\": l.internal}\n                        for l in result.links\n                    ],\n                }\n            \n            elif content_type == \"title\":\n                result = client.extract_title(url)\n                return {\"url\": url, \"title\": result.title}\n            \n            else:\n                return {\"error\": f\"Unknown content_type: {content_type}\"}\n                \n        except WrynAIError as e:\n            return {\"url\": url, \"error\": str(e)}\n\nWhen to use:\n\nUser needs specific content from a single page\nUser wants structured data extraction\nUser needs to extract links or headings"
      },
      {
        "title": "5. Robust Crawling with Error Handling",
        "body": "Production-ready crawling with retry logic and rate limit handling.\n\nfrom wrynai import WrynAI, RateLimitError, TimeoutError, ServerError, WrynAIError\nimport time\n\ndef robust_crawl(url: str, max_attempts: int = 3, max_pages: int = 10) -> dict:\n    \"\"\"\n    Crawl with automatic retry and error recovery.\n    \n    Args:\n        url: Starting URL\n        max_attempts: Maximum retry attempts\n        max_pages: Maximum pages to crawl\n    \n    Returns:\n        Crawl results with success status\n    \"\"\"\n    api_key = os.environ.get(\"WRYNAI_API_KEY\")\n    \n    with WrynAI(api_key=api_key, max_retries=3) as client:\n        for attempt in range(max_attempts):\n            try:\n                result = client.crawl(\n                    url=url,\n                    max_pages=max_pages,\n                    max_depth=3,\n                    timeout_ms=60000,\n                    retries=2,\n                )\n                \n                return {\n                    \"success\": True,\n                    \"attempt\": attempt + 1,\n                    \"total_visited\": result.total_visited,\n                    \"pages\": [\n                        {\n                            \"url\": page.page_url,\n                            \"content_length\": len(page.content),\n                            \"urls_found\": len(page.urls),\n                        }\n                        for page in result.pages\n                    ],\n                }\n            \n            except RateLimitError as e:\n                wait_time = e.retry_after or (2 ** attempt * 5)\n                print(f\"Rate limited. Waiting {wait_time}s before retry...\")\n                time.sleep(wait_time)\n                continue\n            \n            except TimeoutError:\n                print(f\"Timeout on attempt {attempt + 1}. Retrying...\")\n                continue\n            \n            except ServerError as e:\n                wait_time = 2 ** attempt\n                print(f\"Server error: {e}. Waiting {wait_time}s...\")\n                time.sleep(wait_time)\n                continue\n            \n            except WrynAIError as e:\n                return {\n                    \"success\": False,\n                    \"error\": str(e),\n                    \"error_type\": type(e).__name__,\n                    \"attempt\": attempt + 1,\n                }\n        \n        return {\n            \"success\": False,\n            \"error\": \"Maximum retry attempts exceeded\",\n            \"attempts\": max_attempts,\n        }\n\nWhen to use:\n\nProduction environments requiring reliability\nCrawling sites with rate limits\nWhen dealing with potentially unstable targets"
      },
      {
        "title": "6. JavaScript-Heavy Sites",
        "body": "For single-page applications and JavaScript-rendered content.\n\nfrom wrynai import WrynAI, Engine\n\ndef crawl_spa(url: str, max_pages: int = 5) -> dict:\n    \"\"\"\n    Crawl single-page applications or JavaScript-heavy sites.\n    \n    Args:\n        url: Starting URL\n        max_pages: Maximum pages to crawl\n    \n    Returns:\n        Crawl results with rendered content\n    \"\"\"\n    api_key = os.environ.get(\"WRYNAI_API_KEY\")\n    \n    with WrynAI(api_key=api_key) as client:\n        result = client.crawl(\n            url=url,\n            max_pages=max_pages,\n            max_depth=2,\n            engine=Engine.STEALTH_MODE,  # Use browser rendering\n            timeout_ms=90000,  # Longer timeout for JS rendering\n            return_urls=True,\n        )\n        \n        return {\n            \"success\": result.success,\n            \"total_visited\": result.total_visited,\n            \"pages\": [\n                {\n                    \"url\": page.page_url,\n                    \"content\": page.content,\n                    \"urls_found\": len(page.urls),\n                }\n                for page in result.pages\n            ],\n        }\n\nWhen to use:\n\nUser needs to crawl React/Vue/Angular applications\nContent is dynamically loaded via JavaScript\nAnti-bot protection is present"
      },
      {
        "title": "Crawl Limits",
        "body": "# Hard limits enforced by the API\nMAX_PAGES = 10      # Maximum pages per crawl\nMAX_DEPTH = 3       # Maximum link depth"
      },
      {
        "title": "Engine Selection",
        "body": "Engine.SIMPLE         # Fast, for static HTML (default)\nEngine.STEALTH_MODE   # Slower, for JavaScript-rendered content"
      },
      {
        "title": "Timeout Recommendations",
        "body": "# Simple scraping: 30,000 ms (30 seconds)\n# Crawling: 60,000 ms (60 seconds) \n# Search operations: 120,000 ms (2 minutes)\n# Smart extraction: 45,000 ms (45 seconds)"
      },
      {
        "title": "URL Pattern Filtering",
        "body": "# Common patterns for include_patterns\nDOCS_PATTERNS = [\"/docs/\", \"/guide/\", \"/api/\", \"/reference/\"]\nBLOG_PATTERNS = [\"/blog/\", \"/posts/\", \"/articles/\"]\n\n# Common patterns for exclude_patterns\nEXCLUDE_PATTERNS = [\"/admin/\", \"/login/\", \"/draft/\", \"/internal/\"]\nMEDIA_EXCLUDE = [\".pdf\", \".jpg\", \".png\", \".mp4\", \".zip\"]"
      },
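      {
        "title": "Sketch: Applying URL Patterns",
        "body": "A minimal sketch of wiring the pattern constants above into a crawl call; it reuses the crawl parameters documented in pattern 2, and only the target URL is illustrative.\n\nimport os\nfrom wrynai import WrynAI\n\nDOCS_PATTERNS = [\"/docs/\", \"/guide/\", \"/api/\", \"/reference/\"]\nEXCLUDE_PATTERNS = [\"/admin/\", \"/login/\", \"/draft/\", \"/internal/\"]\nMEDIA_EXCLUDE = [\".pdf\", \".jpg\", \".png\", \".mp4\", \".zip\"]\n\nwith WrynAI(api_key=os.environ[\"WRYNAI_API_KEY\"]) as client:\n    result = client.crawl(\n        url=\"https://example.com/docs/\",  # illustrative target\n        max_pages=10,\n        max_depth=3,\n        include_patterns=DOCS_PATTERNS,\n        exclude_patterns=EXCLUDE_PATTERNS + MEDIA_EXCLUDE,  # combine path and media filters\n    )"
      },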
      {
        "title": "Exception Types",
        "body": "from wrynai import (\n    WrynAIError,           # Base exception\n    AuthenticationError,    # Invalid API key (401)\n    BadRequestError,        # Invalid parameters (400)\n    RateLimitError,         # Rate limit exceeded (429)\n    TimeoutError,           # Request timeout\n    ServerError,            # Server error (5xx)\n    ConnectionError,        # Network issue\n    ValidationError,        # Local validation error\n)"
      },
      {
        "title": "Error Handling Pattern",
        "body": "try:\n    result = client.crawl(url)\nexcept AuthenticationError:\n    # Check WRYNAI_API_KEY environment variable\n    pass\nexcept RateLimitError as e:\n    # Wait for e.retry_after seconds\n    time.sleep(e.retry_after or 60)\nexcept TimeoutError:\n    # Increase timeout_ms parameter\n    pass\nexcept WrynAIError as e:\n    # General API error\n    print(f\"Error: {e} (status: {e.status_code})\")"
      },
      {
        "title": "1. Always Use Environment Variables",
        "body": "import os\napi_key = os.environ.get(\"WRYNAI_API_KEY\")\nif not api_key:\n    raise ValueError(\"WRYNAI_API_KEY environment variable required\")"
      },
      {
        "title": "2. Use Context Managers",
        "body": "# Recommended - automatic resource cleanup\nwith WrynAI(api_key=api_key) as client:\n    result = client.crawl(url)\n\n# Not recommended - manual cleanup required\nclient = WrynAI(api_key=api_key)\ntry:\n    result = client.crawl(url)\nfinally:\n    client.close()"
      },
      {
        "title": "3. Set Appropriate Timeouts",
        "body": "# For simple pages\ntimeout_ms=30000\n\n# For crawling multiple pages\ntimeout_ms=60000\n\n# For JavaScript-heavy sites\ntimeout_ms=90000"
      },
      {
        "title": "4. Graceful Degradation",
        "body": "try:\n    # Try structured extraction first\n    result = client.extract_structured_text(url)\n    content = result.main_text\nexcept Exception:\n    try:\n        # Fall back to simple text\n        result = client.extract_text(url)\n        content = result.text\n    except Exception:\n        content = None"
      },
      {
        "title": "5. Respect Rate Limits",
        "body": "import time\n\nfor url in urls:\n    result = client.crawl(url)\n    time.sleep(1)  # Be nice to the API"
      },
      {
        "title": "Smart Listing Extraction (PRO)",
        "body": "Extract structured data from listing pages (e-commerce, directories).\n\ndef extract_product_listings(url: str) -> list:\n    \"\"\"Extract product information from listing pages.\"\"\"\n    api_key = os.environ.get(\"WRYNAI_API_KEY\")\n    \n    with WrynAI(api_key=api_key) as client:\n        result = client.auto_listing(\n            url=url,\n            engine=Engine.STEALTH_MODE,\n            timeout_ms=60000,\n        )\n        \n        return [\n            {\n                \"title\": item.get(\"title\"),\n                \"price\": item.get(\"price\"),\n                \"rating\": item.get(\"rating\"),\n                \"url\": item.get(\"url\"),\n            }\n            for item in result.items\n        ]"
      },
      {
        "title": "Screenshot Capture",
        "body": "import base64\nfrom wrynai import ScreenshotType\n\ndef capture_page_screenshot(url: str, fullpage: bool = False) -> str:\n    \"\"\"Capture page screenshot and save to file.\"\"\"\n    api_key = os.environ.get(\"WRYNAI_API_KEY\")\n    \n    with WrynAI(api_key=api_key) as client:\n        result = client.take_screenshot(\n            url=url,\n            screenshot_type=ScreenshotType.FULLPAGE if fullpage else ScreenshotType.VIEWPORT,\n            timeout_ms=30000,\n        )\n        \n        # Decode and save\n        image_data = result.screenshot\n        if \",\" in image_data:\n            image_data = image_data.split(\",\")[1]\n        \n        filename = \"screenshot.png\"\n        with open(filename, \"wb\") as f:\n            f.write(base64.b64decode(image_data))\n        \n        return filename"
      },
      {
        "title": "1. Competitive Research",
        "body": "\"Search for [topic] and crawl the top 5 results\""
      },
      {
        "title": "2. Documentation Aggregation",
        "body": "\"Crawl the Python documentation and extract all API references\""
      },
      {
        "title": "3. Content Migration",
        "body": "\"Crawl our old website and extract all blog posts in markdown\""
      },
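      {
        "title": "Sketch: Content Migration",
        "body": "A hedged sketch for the migration prompt above: crawl the old site for blog URLs, then convert each page with extract_markdown. It combines the crawl and extraction calls documented earlier; the include patterns and the returned dict shape are illustrative.\n\nimport os\nfrom wrynai import WrynAI\n\ndef migrate_blog_to_markdown(base_url: str) -> dict:\n    \"\"\"Crawl blog pages and return a {url: markdown} mapping.\"\"\"\n    api_key = os.environ.get(\"WRYNAI_API_KEY\")\n    with WrynAI(api_key=api_key) as client:\n        crawl_result = client.crawl(\n            url=base_url,\n            max_pages=10,\n            max_depth=3,\n            include_patterns=[\"/blog/\", \"/posts/\", \"/articles/\"],\n        )\n        # Re-fetch each discovered page as markdown for the new site\n        return {\n            page.page_url: client.extract_markdown(\n                page.page_url, extract_main_content=True\n            ).markdown\n            for page in crawl_result.pages\n        }"
      },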
      {
        "title": "4. Link Analysis",
        "body": "\"Find all external links on [website]\""
      },
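      {
        "title": "Sketch: Link Analysis",
        "body": "A minimal sketch for the link-analysis prompt above, assuming extract_links behaves as documented in pattern 4; filtering on the internal flag is the only logic added.\n\nimport os\nfrom wrynai import WrynAI\n\ndef external_links(url: str) -> list:\n    \"\"\"Return the external links found on a single page.\"\"\"\n    api_key = os.environ.get(\"WRYNAI_API_KEY\")\n    with WrynAI(api_key=api_key) as client:\n        result = client.extract_links(url)\n        # Keep only links flagged as pointing off-site\n        return [\n            {\"text\": l.text, \"url\": l.url}\n            for l in result.links\n            if not l.internal\n        ]"
      },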
      {
        "title": "5. Site Monitoring",
        "body": "\"Crawl [site] and check if [content] is present\""
      },
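      {
        "title": "Sketch: Site Monitoring",
        "body": "A minimal sketch for the monitoring prompt above; it reuses the documented crawl call and adds a case-insensitive substring check, which is illustrative.\n\nimport os\nfrom wrynai import WrynAI\n\ndef content_present(site_url: str, needle: str) -> bool:\n    \"\"\"Crawl a site and report whether the text appears on any page.\"\"\"\n    api_key = os.environ.get(\"WRYNAI_API_KEY\")\n    with WrynAI(api_key=api_key) as client:\n        result = client.crawl(url=site_url, max_pages=10, max_depth=2)\n        return any(\n            needle.lower() in page.content.lower()  # case-insensitive match\n            for page in result.pages\n        )"
      },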
      {
        "title": "6. Knowledge Base Creation",
        "body": "\"Crawl [documentation site] and create a searchable knowledge base\""
      },
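      {
        "title": "Sketch: Knowledge Base Creation",
        "body": "A hedged sketch for the knowledge-base prompt above; it reuses crawl_documentation from pattern 2, and the naive substring search is a stand-in for real indexing downstream.\n\ndef build_knowledge_base(doc_url: str) -> list:\n    \"\"\"Crawl docs into a list of {url, content, word_count} records.\"\"\"\n    return crawl_documentation(doc_url)  # defined in pattern 2\n\ndef search_knowledge_base(pages: list, keyword: str) -> list:\n    # Naive substring search; swap in a real index for production use\n    return [p[\"url\"] for p in pages if keyword.lower() in p[\"content\"].lower()]"
      },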
      {
        "title": "Limitations & Considerations",
        "body": "Hard Limits: Maximum 10 pages per crawl, depth of 3\nRate Limits: API has rate limits; handle RateLimitError appropriately\nTimeout Management: Adjust timeouts based on site complexity\nJavaScript Rendering: Use Engine.STEALTH_MODE for SPAs (slower but necessary)\nRobots.txt: SDK respects robots.txt; some pages may be blocked\nDynamic Content: Some dynamically loaded content may require stealth mode"
      },
      {
        "title": "Common Issues",
        "body": "Issue: AuthenticationError\n\nSolution: Verify WRYNAI_API_KEY environment variable is set correctly\n\nIssue: RateLimitError\n\nSolution: Implement retry with e.retry_after wait time\n\nIssue: TimeoutError\n\nSolution: Increase timeout_ms parameter\n\nIssue: Empty content returned\n\nSolution: Try Engine.STEALTH_MODE for JavaScript-rendered pages\n\nIssue: Missing links/content\n\nSolution: Check exclude_patterns and include_patterns configuration"
      },
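      {
        "title": "Sketch: Empty-Content Fallback",
        "body": "A hedged sketch of the empty-content fix above: try the documented text extraction first, then fall back to a one-page stealth-mode crawl for JavaScript-rendered pages. The fallback shape is illustrative.\n\nimport os\nfrom wrynai import WrynAI, Engine\n\ndef extract_text_with_fallback(url: str) -> str:\n    api_key = os.environ.get(\"WRYNAI_API_KEY\")\n    with WrynAI(api_key=api_key) as client:\n        result = client.extract_text(url, extract_main_content=True)\n        if result.text.strip():\n            return result.text\n        # Fall back to browser rendering via a one-page stealth crawl\n        crawl_result = client.crawl(\n            url=url,\n            max_pages=1,\n            max_depth=1,\n            engine=Engine.STEALTH_MODE,\n            timeout_ms=90000,\n        )\n        return crawl_result.pages[0].content if crawl_result.pages else \"\""
      },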
      {
        "title": "Integration with OpenClaw",
        "body": "When using this skill with OpenClaw:\n\nSet environment variable before running:\nexport WRYNAI_API_KEY=\"your-api-key\"\n\n\n\nInstall dependencies:\npip install wrynai\n\n\n\nUse in your OpenClaw workflows:\n\nCall the crawling functions directly from your automation scripts\nIntegrate with other OpenClaw skills for comprehensive data pipelines\nUse the returned data structures in downstream processing"
      },
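      {
        "title": "Sketch: OpenClaw Workflow Hook",
        "body": "A minimal sketch of calling the skill's functions from an automation script; handle_page is a hypothetical placeholder for whatever consumes the crawled data downstream.\n\ndef gather_docs_step(base_url: str) -> None:\n    pages = crawl_documentation(base_url)  # defined in pattern 2\n    for page in pages:\n        handle_page(page[\"url\"], page[\"content\"])  # hypothetical downstream consumer"
      },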
      {
        "title": "API Reference Quick Links",
        "body": "Documentation: https://docs.wryn.ai\nAPI Signup: https://wryn.ai\nGitHub: https://github.com/wrynai/wrynai-python"
      },
      {
        "title": "Version Information",
        "body": "Skill Version: 1.0.0\nSDK Version: wrynai v1.0.0\nPython Version: 3.8+\nLast Updated: 2025-02-07"
      }
    ]
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/wrynai/wrynai-skill",
    "publisherUrl": "https://clawhub.ai/wrynai/wrynai-skill",
    "owner": "wrynai",
    "version": "1.0.0",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/wrynai-skill",
    "downloadUrl": "https://openagent3.xyz/downloads/wrynai-skill",
    "agentUrl": "https://openagent3.xyz/skills/wrynai-skill/agent",
    "manifestUrl": "https://openagent3.xyz/skills/wrynai-skill/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/wrynai-skill/agent.md"
  }
}