{
  "schemaVersion": "1.0",
  "item": {
    "slug": "llms-txt-generator",
    "name": "LLMs.txt Generator",
    "source": "tencent",
    "type": "skill",
    "category": "AI",
    "sourceUrl": "https://clawhub.ai/ngm9/llms-txt-generator",
    "canonicalUrl": "https://clawhub.ai/ngm9/llms-txt-generator",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/llms-txt-generator",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=llms-txt-generator",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "SKILL.md",
      "package.json",
      "references/llms_txt_spec.md",
      "scripts/crawl.py"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of working through the install manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T16:55:25.780Z",
      "expiresAt": "2026-05-07T16:55:25.780Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=llms-txt-generator",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=llms-txt-generator",
        "contentDisposition": "attachment; filename=\"llms-txt-generator-0.1.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/llms-txt-generator"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/llms-txt-generator",
    "agentPageUrl": "https://openagent3.xyz/skills/llms-txt-generator/agent",
    "manifestUrl": "https://openagent3.xyz/skills/llms-txt-generator/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/llms-txt-generator/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of working through the install manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "Overview",
        "body": "This skill crawls a business website, extracts structured information, and generates a properly formatted llms.txt file — the standard that makes any business readable and transactable by AI agents.\n\nIt follows the llmstxt.org specification with business-specific extensions:\n\n## Team — builds agent trust in the people behind the business\n## Clients & Testimonials — social proof for agent decision-making\n## For Agents — how agents can interact (or a clear \"coming soon\" notice)\n\nRead references/llms_txt_spec.md before generating any output."
      },
      {
        "title": "Step 1 — Get the URL",
        "body": "If the user didn't provide a URL, ask:\n\n\"What's the website URL?\"\n\nNormalize it (add https:// if missing)."
      },
      {
        "title": "Step 2 — Crawl",
        "body": "Run the crawler:\n\n~/.virtualenvs/llms-txt-generator/bin/python3 \\\n  ~/.openclaw/workspace/llms-txt-generator/scripts/crawl.py \\\n  {url} > /tmp/llms_business_info.json\n\nRead /tmp/llms_business_info.json. Note:\n\nWhat pages were crawled\nWhat was found vs missing (team, pricing, testimonials, API)\nWhether an existing llms.txt was found\n\nTell the user briefly:\n\n\"Crawled {domain} ({N} pages). Found: {what was found}. I'll ask about a few things I couldn't determine.\"\n\nIf the crawl found an existing llms.txt, note it:\n\n\"I noticed you already have an llms.txt at {domain}/llms.txt. I'll generate a fresh one — you can compare and decide which to keep.\""
      },
      {
        "title": "Step 3 — Ask for additional sources (always ask this first)",
        "body": "\"Are there any other pages I should read? (docs, API reference, existing llms.txt, press page — anything useful)\"\n\nIf they provide URLs, re-run the crawl with those extras:\n\n~/.virtualenvs/llms-txt-generator/bin/python3 \\\n  ~/.openclaw/workspace/llms-txt-generator/scripts/crawl.py \\\n  {url} {extra_url1} {extra_url2} > /tmp/llms_business_info.json\n\nIf they say no/skip, continue."
      },
      {
        "title": "Step 4 — Generate Pass 1 draft + gap report",
        "body": "Generate a draft llms.txt now using what you have from the crawl. Use all heuristic signals (team_found, testimonials_found, pricing_found, etc.) and the raw_text_summary.\n\nWrite the draft. For any section you couldn't populate confidently, use a clear [NOT FOUND] placeholder.\n\nThen show it to the user with a gap report:\n\n\"Here's a first draft of your llms.txt:\n{draft}\n\nFound automatically: {brief list — e.g. emails, pricing page, testimonials from Wybrid + Cital}\nCouldn't determine: {brief list — e.g. team, pricing figures, API}\nTwo questions to start:\n\n{Most important gap — e.g. \"Who's on the founding team? Names, roles, and an email if you're comfortable.\"}\n{Second most important — e.g. \"What's your pricing model? Even a rough description — per-candidate, subscription, etc.\"}\n\n_(I have a few more after these. Also — say 'dig deeper' if you'd rather I try to find it myself.)\""
      },
      {
        "title": "Step 4b — Handle \"dig deeper\" (Pass 2)",
        "body": "If the user says \"dig deeper\" (or similar — \"try again\", \"re-crawl\", \"look harder\"):\n\nRe-run the crawl in deep mode:\n\n~/.virtualenvs/llms-txt-generator/bin/python3 \\\n  ~/.openclaw/workspace/llms-txt-generator/scripts/crawl.py \\\n  {url} {extra_urls} --deep > /tmp/llms_business_info.json\n\nThis returns pages_raw — the full raw text of every crawled page. Use it to extract structure with the LLM. In your generation prompt (Step 5), add:\n\nIn addition to the heuristic signals, here is the full raw text from each crawled page.\nExtract team members, testimonials, pricing details, and any API information directly from this text.\n\nHomepage raw text:\n{pages_raw[homepage_url]}\n\nTeam page raw text (if available):\n{pages_raw[team_url]}\n\nPricing page raw text (if available):\n{pages_raw[pricing_url]}\n\nTell the user:\n\n\"Doing a deeper crawl — this takes a bit longer but I'll extract everything I can from the raw page content.\"\n\nAfter Pass 2, show the updated draft with the same gap report format. Whatever still can't be found, ask the user directly."
      },
      {
        "title": "Step 5 — Conversational gap-filling (for anything still missing)",
        "body": "Ask questions one at a time — only for things still [NOT FOUND] after Pass 1/2. Wait for each answer. Stop as soon as you have enough to finalize.\n\nUse your judgment — if the user has already filled most gaps conversationally, skip remaining questions and generate.\n\nQ1 — Core value for agents (always ask):\n\n\"In one or two sentences: what should an AI agent understand about what it can do or get by working with {domain}?\"\n\nQ2 — Team (ask if team not found in crawl):\n\n\"I didn't find team info publicly. Want to add a Team section? It helps agents trust who's behind the business. Just names, roles, and emails if you're comfortable.\"\n\nQ3 — Clients / testimonials (ask if not found):\n\n\"Any existing clients or testimonials I can include? Even a couple of company names or a one-line quote builds agent trust. Totally optional.\"\n\nQ4 — API / integration (ask if api_found=false):\n\n\"Is there a public API or docs page agents can reference? (skip if not applicable)\"\n\nQ5 — Pricing (ask if pricing_found=false):\n\n\"What's the pricing model? Even a rough description helps — like 'per assessment' or 'monthly subscription'.\"\n\nQ6 — ICP / agent-buyers (ask if not obvious from context):\n\n\"Who are the kinds of agents or automated systems most likely to want to work with you? (e.g. HR bots, recruiting pipelines)\"\n\nQ7 — Anything else (optional, ask last):\n\n\"Anything else agents should know before working with you? (geographic limits, onboarding steps, etc.)\""
      },
      {
        "title": "Step 6 — Generate final llms.txt",
        "body": "Read references/llms_txt_spec.md now if you haven't already.\n\nGenerate the complete llms.txt using ALL information gathered:\n\nThe crawled business_info JSON (and pages_raw if deep mode ran)\nThe user's answers from the conversation\nThe spec from references/llms_txt_spec.md\n\nGeneration rules:\n\nFollow the spec format exactly: H1 title → blockquote summary → H2 sections → named links\nEvery bullet = - [Title](url): description — no plain text bullets\nSection order: Services → Team → Clients & Testimonials → For Agents → Pricing → API → Links → Optional\n## Team: Always include. Use crawled/user-provided data. If none available, omit silently.\n## Clients & Testimonials: Always try to include. Structure:\n\nICP bullets first (who the business serves)\nThen a ### subsection per named client where you have a real quote or case study detail\nEach subsection: blockquote with verbatim/lightly-cleaned quote, optional Problem: and Outcome: lines\nIf you only have a name + one-liner with no detail, a single bullet is fine\nNever invent quotes or outcomes\n\n\n## For Agents: ALWAYS include. If no API info: add the \"coming soon\" notice + contact email. Never skip.\n## Pricing: If unknown, link to pricing page with no summary. If no pricing page, omit.\n## API: Document URL only — no auth details, no secrets.\n## Optional: FAQs, blog, case studies, anything supplementary.\nDo NOT invent facts. If something is unknown and user didn't provide it, either omit it or note it clearly.\nKeep it tight — this is for agents, not humans. No marketing fluff.\n\nWrite the final llms.txt to /tmp/llms_final.txt."
      },
      {
        "title": "Step 7 — Show and confirm",
        "body": "Show the full llms.txt to the user in a code block, then ask:\n\n\"Here's your llms.txt 👆\nDoes this look right? You can:\n\nTell me what to change\nSay 'save' to download it\nSay 'deploy' when you're ready to push it live (Phase 2)\""
      },
      {
        "title": "Step 8 — Handle revisions",
        "body": "If the user asks for changes, make them and show the updated version. Repeat until satisfied.\n\nIf they say 'save': tell them the file is at /tmp/llms_final.txt and they can copy it to their project.\n\nIf they say 'deploy': acknowledge and note that deployment via Cloudflare Workers is coming in Phase 2."
      },
      {
        "title": "Notes",
        "body": "Existing llms.txt: If the crawl found one, mention it early: \"I noticed you already have an llms.txt. I'll generate a fresh one — you can compare and decide which to keep.\"\nAnchor-only links (e.g. /#section): Skip for Level 2 crawling — they don't load new content.\nThe For Agents section is mandatory — even if empty of details, it signals intent to support agents and provides a contact path.\nNever ask all questions at once — it's a conversation, not a form."
      }
    ]
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/ngm9/llms-txt-generator",
    "publisherUrl": "https://clawhub.ai/ngm9/llms-txt-generator",
    "owner": "ngm9",
    "version": "0.1.0",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/llms-txt-generator",
    "downloadUrl": "https://openagent3.xyz/downloads/llms-txt-generator",
    "agentUrl": "https://openagent3.xyz/skills/llms-txt-generator/agent",
    "manifestUrl": "https://openagent3.xyz/skills/llms-txt-generator/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/llms-txt-generator/agent.md"
  }
}