Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Generic Documentation Indexing & Search. Index any documentation site (SPA/static) and search it instantly.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
A powerful, reusable skill for indexing and searching ANY documentation site.
anydocs solves a real problem: accessing documentation from code or CLI. Instead of opening a browser every time, you can:
- Index any documentation site (Discord, OpenClaw, internal docs, etc.)
- Search instantly from the command line or Python API
- Cache pages locally to avoid repeated network calls
- Configure multiple profiles for different doc sites

Use anydocs when you need to:
- Quickly look up API documentation without leaving the terminal
- Build agents that need to reference docs
- Extract specific information from documentation
- Search across multiple documentation sites
- Integrate docs into your workflow
- Keyword search: Fast, term-based matching with BM25-style scoring
- Hybrid search: Keyword + phrase proximity for better relevance
- Regex search: Advanced pattern matching for power users

- Sitemap-based discovery (standard XML sitemap)
- Fallback crawling from base URL
- HTML content extraction with smart selector detection
- Automatic rate limiting to be respectful
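
The two discovery paths listed above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code anydocs ships: the names `urls_from_sitemap`, `LinkCollector` and `discover_links`, and both sample documents, are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin

# Namespace defined by the sitemaps.org protocol for standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text, base_url):
    """Return the <loc> URLs of a sitemap that fall under base_url."""
    root = ET.fromstring(xml_text)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if url.startswith(base_url):
            urls.append(url)
    return urls


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def discover_links(html, page_url, base_url):
    """Fallback path: resolve links on a page and keep those under base_url."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if absolute.startswith(base_url) and absolute not in found:
            found.append(absolute)
    return found


SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://vuejs.org/guide/introduction</loc></url>
  <url><loc>https://vuejs.org/api/</loc></url>
  <url><loc>https://example.com/elsewhere</loc></url>
</urlset>"""

SAMPLE_HTML = (
    '<a href="/guide/introduction">Intro</a> '
    '<a href="essentials#top">Essentials</a> '
    '<a href="https://github.com/vuejs">GitHub</a> '
    '<a href="/guide/introduction#why">Why</a>'
)

sitemap_urls = urls_from_sitemap(SAMPLE_SITEMAP, "https://vuejs.org")
crawled_urls = discover_links(SAMPLE_HTML, "https://vuejs.org/guide/", "https://vuejs.org")
print(sitemap_urls)
print(crawled_urls)
```

A real crawler would repeat `discover_links` on each newly found page until no unseen URLs remain; that loop is omitted here because it needs network access.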
- Pages cached locally with 7-day TTL (configurable)
- Search indexes cached for instant second searches
- Cache statistics and cleanup commands
- Respects cache invalidation

- Support multiple doc sites simultaneously
- Per-profile search methods and cache TTLs
- Configuration stored in ~/.anydocs/config.json
- Examples for Discord, OpenClaw, and custom sites
- Uses Playwright to render client-side SPAs (Single Page Apps)
- Automatically discovers links on JS-heavy sites like the Discord docs
- Gracefully falls back to standard HTTP if Playwright is unavailable
- Can be configured per discovery session or globally per profile
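
One way the "render with Playwright, otherwise use plain HTTP" behaviour could be structured is sketched below. It uses Playwright's documented sync API and `urllib` from the standard library; the function names `playwright_available` and `fetch_html` are hypothetical and are not taken from the anydocs source.

```python
import importlib.util
import urllib.request


def playwright_available():
    """Return True if the optional playwright package can be imported."""
    return importlib.util.find_spec("playwright") is not None


def fetch_html(url, timeout=30.0):
    """Fetch a page, rendering JavaScript when Playwright is usable."""
    if playwright_available():
        try:
            from playwright.sync_api import sync_playwright

            with sync_playwright() as p:
                browser = p.chromium.launch()
                try:
                    page = browser.new_page()
                    # Playwright expects the timeout in milliseconds.
                    page.goto(url, timeout=timeout * 1000)
                    return page.content()
                finally:
                    browser.close()
        except Exception:
            # Browser binaries missing or rendering failed: use plain HTTP.
            pass
    # Plain HTTP fallback: no JavaScript is executed, so SPA content
    # that is rendered client-side may be absent from the result.
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Checking for the package with `importlib.util.find_spec` keeps Playwright an optional dependency: nothing is imported until a page actually needs rendering.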
    cd /path/to/skills/anydocs
    pip install -r requirements.txt
    chmod +x anydocs.py
For sites like Discord that use client-side rendering, install Playwright:
    pip install playwright==1.40.0
    playwright install  # Downloads browser binaries, including Chromium
If Playwright is unavailable, anydocs gracefully falls back to standard HTTP fetching.
    python anydocs.py config vuejs \
      https://vuejs.org \
      https://vuejs.org/sitemap.xml

    python anydocs.py index vuejs
This discovers all pages via sitemap, scrapes content, and builds a searchable index.

    python anydocs.py search "composition api" --profile vuejs
    python anydocs.py search "reactivity" --profile vuejs --limit 5

    python anydocs.py fetch "guide/introduction" --profile vuejs
    # Add or update a profile
    anydocs config <profile> <base_url> <sitemap_url> [--search-method hybrid] [--ttl-days 7]
    # List configured profiles
    anydocs list-profiles

    # Build index for a profile
    anydocs index <profile>
    # Force re-index (skip cache)
    anydocs index <profile> --force

    # Basic keyword search
    anydocs search "query" --profile discord
    # Limit results
    anydocs search "query" --profile discord --limit 5
    # Regex search
    anydocs search "^API" --profile discord --regex

    # Fetch a specific page (URL or path)
    anydocs fetch "https://discord.com/developers/docs/resources/webhook"
    anydocs fetch "resources/webhook" --profile discord

    # Show cache statistics
    anydocs cache status
    # Clear all cache
    anydocs cache clear
    # Clear specific profile's cache
    anydocs cache clear --profile discord
For use in agents and scripts:

    from lib.config import ConfigManager
    from lib.scraper import DiscoveryEngine
    from lib.indexer import SearchIndex

    # Load configuration
    config_mgr = ConfigManager()
    config = config_mgr.get_profile("discord")

    # Scrape documentation
    scraper = DiscoveryEngine(config["base_url"], config["sitemap_url"])
    pages = scraper.fetch_all()

    # Build search index
    index = SearchIndex()
    index.build(pages)

    # Search
    results = index.search("webhooks", limit=10)
    for result in results:
        print(f"{result['title']} ({result['relevance_score']})")
        print(f"  {result['url']}")
Configuration is stored in ~/.anydocs/config.json:

    {
      "discord": {
        "name": "discord",
        "base_url": "https://discord.com/developers/docs",
        "sitemap_url": "https://discord.com/developers/docs/sitemap.xml",
        "search_method": "hybrid",
        "cache_ttl_days": 7
      },
      "openclaw": {
        "name": "openclaw",
        "base_url": "https://docs.openclaw.ai",
        "sitemap_url": "https://docs.openclaw.ai/sitemap.xml",
        "search_method": "hybrid",
        "cache_ttl_days": 7
      }
    }
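
A script that only needs one profile could read this file directly, as in the rough sketch below. The helper `load_profile` is hypothetical, and the fallback values for `search_method` and `cache_ttl_days` are assumptions inferred from the optional CLI flags and the example values, not confirmed defaults.

```python
import json
import tempfile
from pathlib import Path

DEFAULT_CONFIG = Path.home() / ".anydocs" / "config.json"

# Assumed defaults, inferred from the optional CLI flags and example values.
DEFAULTS = {"search_method": "hybrid", "cache_ttl_days": 7}


def load_profile(name, config_path=DEFAULT_CONFIG):
    """Return one profile from the config file with assumed defaults applied."""
    profiles = json.loads(Path(config_path).read_text(encoding="utf-8"))
    if name not in profiles:
        known = ", ".join(sorted(profiles)) or "none"
        raise KeyError(f"unknown profile {name!r}; configured: {known}")
    # Values stored in the file override the assumed defaults.
    return {**DEFAULTS, **profiles[name]}


# Demonstration against a throwaway file rather than the real ~/.anydocs.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    path.write_text(json.dumps({
        "vuejs": {
            "name": "vuejs",
            "base_url": "https://vuejs.org",
            "sitemap_url": "https://vuejs.org/sitemap.xml",
            "cache_ttl_days": 3,
        }
    }), encoding="utf-8")
    profile = load_profile("vuejs", path)

print(profile["search_method"], profile["cache_ttl_days"])
```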
Keyword search
- Speed: Fast
- Best for: Common terms, exact matches
- How it works: Term matching with position weighting (title > tags > content)
- Example: anydocs search "webhooks"

Hybrid search
- Speed: Fast
- Best for: Natural language queries
- How it works: Keyword search + phrase proximity scoring
- Example: anydocs search "how to set up webhooks"

Regex search
- Speed: Medium
- Best for: Complex patterns
- How it works: Compiled regex pattern matching across all content
- Example: anydocs search "^(GET|POST)" --regex
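
To make the three strategies concrete, here is a toy sketch of how they might differ. It is not the code in lib/indexer.py. The field weights, the phrase bonus, and the page dictionaries are all hypothetical. Plain weighted term counts stand in for BM25-style scoring, and a verbatim-phrase bonus stands in for proximity scoring.

```python
import re

# Hypothetical weights: a title match counts most, then tags, then body text.
FIELD_WEIGHTS = {"title": 3.0, "tags": 2.0, "content": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(page, query):
    """Sum field-weighted counts of every query term."""
    terms = tokenize(query)
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = tokenize(page.get(field, ""))
        score += weight * sum(tokens.count(term) for term in terms)
    return score


def hybrid_score(page, query, phrase_bonus=5.0):
    """Keyword score plus a bonus when the whole query appears as a phrase."""
    score = keyword_score(page, query)
    phrase = " ".join(tokenize(query))
    text = " ".join(tokenize(page.get("title", "") + " " + page.get("content", "")))
    # Pad with spaces so the phrase only matches on whole-token boundaries.
    if score > 0 and phrase and f" {phrase} " in f" {text} ":
        score += phrase_bonus
    return score


def search(pages, query, method="keyword", limit=10):
    """Return up to `limit` pages, best match first (regex: document order)."""
    if method == "regex":
        pattern = re.compile(query, re.MULTILINE)
        return [p for p in pages if pattern.search(p.get("content", ""))][:limit]
    scorer = hybrid_score if method == "hybrid" else keyword_score
    scored = [(scorer(p, query), p) for p in pages]
    scored = [(s, p) for s, p in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:limit]]


PAGES = [
    {"url": "/resources/webhook", "title": "Webhooks", "tags": "webhook api",
     "content": "POST /webhooks creates a webhook."},
    {"url": "/resources/channel", "title": "Channels", "tags": "channel",
     "content": "GET /channels lists channels. Webhooks can post to a channel."},
]

keyword_hits = [p["url"] for p in search(PAGES, "webhooks")]
hybrid_hits = [p["url"] for p in search(PAGES, "post to a channel", method="hybrid")]
regex_hits = [p["url"] for p in search(PAGES, "^GET", method="regex")]
print(keyword_hits, hybrid_hits, regex_hits)
```

In this toy data the keyword query ranks the Webhooks page first because of its title match, while the hybrid query promotes the Channels page because it contains the full phrase.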
- Pages: Cached as JSON with 7-day TTL (configurable)
- Indexes: Cached after indexing, invalidated on TTL expiry
- Cache location: ~/.anydocs/cache/
- Manual refresh: Use --force flag or clear cache
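
A TTL-based JSON file cache along these lines might look like the following minimal sketch. The class `TTLFileCache`, its on-disk layout, and the hashed file names are assumptions made for illustration; the actual format used under ~/.anydocs/cache/ is not documented here.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path


class TTLFileCache:
    """Store JSON-serialisable values as files that expire after ttl_days."""

    def __init__(self, directory, ttl_days=7):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.ttl_seconds = ttl_days * 24 * 60 * 60

    def _path(self, key):
        # Hash the key so URLs become safe, fixed-length file names.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.json"

    def set(self, key, value):
        payload = {"stored_at": time.time(), "value": value}
        self._path(key).write_text(json.dumps(payload), encoding="utf-8")

    def get(self, key, now=None):
        """Return the cached value, or None if it is missing or expired."""
        path = self._path(key)
        if not path.exists():
            return None
        payload = json.loads(path.read_text(encoding="utf-8"))
        current = time.time() if now is None else now
        if current - payload["stored_at"] > self.ttl_seconds:
            path.unlink()  # expired entries are removed on access
            return None
        return payload["value"]

    def clear(self):
        """Delete every cache file and return how many were removed."""
        removed = 0
        for path in self.directory.glob("*.json"):
            path.unlink()
            removed += 1
        return removed


# Demonstration in a temporary directory; `now` simulates the passage of time.
with tempfile.TemporaryDirectory() as tmp:
    cache = TTLFileCache(tmp, ttl_days=7)
    url = "https://vuejs.org/guide/introduction"
    cache.set(url, {"title": "Introduction"})
    fresh = cache.get(url)
    eight_days_later = time.time() + 8 * 24 * 60 * 60
    stale = cache.get(url, now=eight_days_later)

print(fresh, stale)
```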
- First index build takes 2-10 minutes depending on site size
- Subsequent searches are instant (cached indexes)
- Rate limit: 0.5s per page to be respectful
- Typical search returns ~100 results in <100ms
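
The 0.5s-per-page rate limit could be enforced with a small helper like the one below. This is a sketch, not the scraper's actual mechanism: the `RateLimiter` class is hypothetical, and the clock and sleep functions are injectable only so the demonstration runs instantly.

```python
import time


class RateLimiter:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval=0.5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Demonstration with a fake clock, so no real time passes.
fake_now = [0.0]
slept = []


def fake_sleep(seconds):
    slept.append(seconds)
    fake_now[0] += seconds


limiter = RateLimiter(0.5, clock=lambda: fake_now[0], sleep=fake_sleep)
for _ in range(3):
    limiter.wait()  # a page fetch would follow each wait()

print(slept)
```

At 0.5 s per page, a 600-page site spends about 5 minutes on delays alone, which is consistent with the 2-10 minute range quoted for a first index build.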
Run anydocs index <profile> first to build the index.
Check the sitemap URL. If the sitemap is unavailable, anydocs falls back to crawling from base_url.
This is normal for large sites. Rate limiting prevents overwhelming servers.
Run anydocs cache clear or set --ttl-days to a smaller value.
    anydocs config vuejs \
      https://vuejs.org \
      https://vuejs.org/sitemap.xml
    anydocs index vuejs
    anydocs search "composition api"

    anydocs config nextjs \
      https://nextjs.org \
      https://nextjs.org/sitemap.xml
    anydocs index nextjs
    anydocs search "app router" --profile nextjs

    anydocs config internal \
      https://docs.company.local \
      https://docs.company.local/sitemap.xml
    anydocs index internal --force
    anydocs search "deployment" --profile internal
- scraper.py: Discovers URLs via sitemap, fetches and parses HTML
- indexer.py: Builds searchable indexes, implements multiple search strategies
- config.py: Manages configuration profiles
- cache.py: TTL-based file caching for pages and indexes
- cli.py: Click-based command-line interface
To add new documentation sites, run:
    anydocs config <profile> <base_url> <sitemap_url>
To extend search functionality, modify lib/indexer.py.
Part of the OpenClaw system.