Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Create, evaluate, improve, benchmark, and publish OpenClaw skills. Use when building a new skill from scratch, iterating on an existing skill, running evals...
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Build, refine, and publish OpenClaw skills. Supports six modes.
| Mode | When to Use | Output |
| --- | --- | --- |
| Create | New skill from scratch | `<name>/SKILL.md` + resources |
| Eval | Measure skill quality | Run report + pass/fail |
| Improve | Iterate an existing skill | New version with changelog |
| Benchmark | Compare two skill versions | Winner + delta analysis |
| Analyze | Extract reusable patterns | patterns.md report |
| Synthesize | Build skill from patterns | Scaffolded SKILL.md |
Build a skill from scratch in 6 steps.
Clarify before writing a single line:
- What does this skill do that no existing skill does?
- Who triggers it and when? (the description field drives triggering)
- What CLI tools, APIs, or files does it need?
- What's the output format?

Run `scripts/analyze_patterns.py --query "<skill concept>"` to see if relevant patterns already exist.
Write a one-paragraph spec covering: trigger conditions, happy path, error cases, output format. Confirm with the user if uncertain.
Scripts are bundled in `scripts/`, so no external path is needed:

```bash
# From your workspace skills directory:
python3 $(openclaw skills info skill-creator --json 2>/dev/null | python3 -c "import json,sys; print(json.load(sys.stdin).get('path',''))")/scripts/init_skill.py \
  <skill-name> \
  --path ~/.openclaw/workspace/skills/ \
  --resources scripts,references \
  --examples
```

Or locate the skill dir and use a relative path:

```bash
SKILL_DIR=$(dirname $(find ~/.openclaw/workspace/skills ~/.nvm -name "init_skill.py" 2>/dev/null | head -1))
python3 "$SKILL_DIR/init_skill.py" <skill-name> --path ~/.openclaw/workspace/skills/ --resources scripts,references
```

This creates:

```
<skill-name>/
  SKILL.md       # Edit this
  scripts/       # Helper scripts
  references/    # Reference docs, cheat sheets
  _meta.json     # Auto-populated on publish
```
Frontmatter rules:

```yaml
---
name: my-skill-name  # lowercase-hyphen, max 64 chars
description: "One sentence: what it does AND when to use it. Include trigger phrases."
---
```

Body structure:

```markdown
# Skill Title
Brief one-liner.

## Quick Start
[Most common usage, 3-5 lines max]

## Commands / Recipes
[Concrete examples with real output]

## Reference
[Full option tables, edge cases, advanced usage]
```

Progressive disclosure rules:
- Frontmatter: always loaded (~100 words), so make it count
- Body: loaded on trigger (<500 lines), so stay under the limit
- Bundled resources: loaded on demand, so put verbosity here
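The naming rules above can be checked mechanically before packaging; a minimal sketch (hypothetical helper, not one of the bundled scripts):

```python
import re

def validate_frontmatter(name: str, description: str) -> list[str]:
    """Return a list of problems; an empty list means the frontmatter passes."""
    problems = []
    # name must be lowercase-hyphen: letters/digits separated by single hyphens
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name):
        problems.append("name must be lowercase-hyphen (e.g. my-skill-name)")
    if len(name) > 64:
        problems.append("name exceeds 64 characters")
    if not description.strip():
        problems.append("description is required")
    return problems

# Flags both the name format and the empty description:
print(validate_frontmatter("My_Skill", ""))
```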
`package_skill.py` is bundled in this skill's `scripts/` directory:

```bash
SKILL_SCRIPTS="$(dirname "$(find ~/.openclaw/workspace/skills/skill-creator ~/.nvm -name "package_skill.py" 2>/dev/null | head -1)")"
python3 "$SKILL_SCRIPTS/package_skill.py" ~/.openclaw/workspace/skills/<skill-name>
```

Validates structure and outputs a `<skill-name>.skill` zip.
Run evals (Mode 2) → identify failures → update SKILL.md → re-package → repeat.
Measure skill quality against defined expectations.
Create `evals/evals.json`:

```json
[
  {
    "id": "basic-create",
    "prompt": "Create a skill that sends a Slack message",
    "expected_output": "SKILL.md with slack-notifier name and working command",
    "assertions": [
      "contains SKILL.md frontmatter with name and description",
      "contains at least one bash command example",
      "description includes trigger phrases"
    ]
  }
]
```
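Before a run, the case schema above can be sanity-checked so malformed cases fail fast; a minimal sketch (hypothetical helper, not one of the bundled scripts):

```python
def check_evals(cases: list[dict]) -> list[str]:
    """Sanity-check eval cases before a run: required fields, non-empty assertions."""
    problems = []
    for i, case in enumerate(cases):
        for field in ("id", "prompt", "assertions"):
            if field not in case:
                problems.append(f"case {i}: missing {field!r}")
        if not case.get("assertions"):
            problems.append(f"case {i}: no assertions")
    return problems

print(check_evals([{"id": "basic-create"}]))  # reports the missing fields
```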
For each eval case:
1. Execute the prompt using the current skill
2. Grade against assertions (pass/fail per assertion)
3. Log the result to `evals/runs/<timestamp>.json`
```json
{
  "skill": "skill-creator",
  "version": "1.0.0",
  "timestamp": "2026-02-22T03:00:00Z",
  "pass_rate": 0.85,
  "cases": [
    { "id": "basic-create", "passed": true, "assertions_passed": 3, "assertions_total": 3 }
  ]
}
```
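The pass_rate field in the run report is derived from the per-case results; a minimal sketch assuming the report schema above (hypothetical helper):

```python
def summarize_run(cases: list[dict]) -> dict:
    """Aggregate per-case results into the run-report summary fields."""
    passed = sum(1 for c in cases if c["passed"])
    return {
        "pass_rate": round(passed / len(cases), 2) if cases else 0.0,
        "cases": cases,
    }

cases = [
    {"id": "basic-create", "passed": True, "assertions_passed": 3, "assertions_total": 3},
    {"id": "error-case", "passed": False, "assertions_passed": 1, "assertions_total": 2},
]
print(summarize_run(cases)["pass_rate"])  # 0.5
```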
Iterate on an existing skill using eval feedback.
1. Run evals → identify failing assertions
2. Read current SKILL.md
3. Draft changes targeting failures
4. Write a new version (increment semver in `_meta.json`)
5. Re-run evals → confirm the pass rate improved
6. Update `history.json`
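The semver increment in step 4 can be sketched as a small helper (hypothetical; the `_meta.json` read/write itself is left out):

```python
def bump_version(version: str, part: str = "minor") -> str:
    """Increment a semver string; part is 'major', 'minor', or 'patch'."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"

print(bump_version("1.0.0"))  # 1.1.0
```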
Track all versions in `evals/history.json`:

```json
[
  {
    "version": "1.0.0",
    "parent": null,
    "expectation_pass_rate": 0.70,
    "is_current_best": false,
    "notes": "Initial version"
  },
  {
    "version": "1.1.0",
    "parent": "1.0.0",
    "expectation_pass_rate": 0.85,
    "is_current_best": true,
    "notes": "Improved trigger description, added Synthesize mode"
  }
]
```
Blind A/B comparison of two skill versions.
1. Run the identical eval suite against version A and version B
2. Collect raw outputs without labels
3. Compare blind (no version labels) → pick a winner per case
4. Reveal versions, compute the delta
5. Recommend: keep A, adopt B, or cherry-pick specific cases
```
Version A: 1.0.0  pass_rate=0.70
Version B: 1.1.0  pass_rate=0.85
Delta: +0.15 (B wins)
Regressions: 0
Recommendation: Adopt B
```
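The delta and regression count can be computed from two run reports; a minimal sketch assuming the Mode 2 run-report schema (hypothetical helper):

```python
def compare_runs(run_a: dict, run_b: dict) -> dict:
    """Compute the pass-rate delta and per-case regressions between two runs."""
    a_passed = {c["id"]: c["passed"] for c in run_a["cases"]}
    # A regression is a case that passed in A but fails in B.
    regressions = [
        c["id"] for c in run_b["cases"]
        if a_passed.get(c["id"], False) and not c["passed"]
    ]
    delta = round(run_b["pass_rate"] - run_a["pass_rate"], 2)
    winner = "B" if delta > 0 else "A" if delta < 0 else "tie"
    return {"delta": delta, "winner": winner, "regressions": regressions}
```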
Scan installed skills to extract reusable building blocks.

```bash
python3 ~/.openclaw/workspace/skills/skill-creator/scripts/analyze_patterns.py \
  --scan-dirs ~/.openclaw/workspace/skills/,~/.nvm/versions/node/v22.22.0/lib/node_modules/openclaw/skills/ \
  --output ~/.openclaw/workspace/skills/skill-creator/references/patterns.md
```

What it extracts:
- Trigger phrases: common description keywords that activate skills
- Tool patterns: CLI tools, APIs, Docker patterns used across skills
- Output formats: JSON schemas, markdown templates, log formats
- Structural patterns: how skills organize commands/recipes
- Error handling patterns: retry logic, circuit breakers, fallbacks

See `references/patterns.md` for the current extracted pattern library.
Build a new skill scaffold by combining patterns from the library.
When asked to create a skill in a domain that resembles existing skills:
1. Run Analyze Patterns first
2. Query `references/patterns.md` for relevant patterns
3. Compose a SKILL.md that combines:
   - Best trigger phrases from similar skills
   - Relevant tool/API patterns
   - Appropriate output format
   - Error handling from the most robust similar skill
"Create a skill for Twitter scraping":
- Pull trigger phrases from reddit-scraper
- Pull CDP/browser patterns from fast-browser-use
- Pull the output format (JSON array) from crypto-market-data
- Synthesize into `twitter-scraper/SKILL.md`
```
<skill-name>/
  SKILL.md          # Required: frontmatter + body
  scripts/          # Helper Python/bash scripts
  references/       # Cheat sheets, API docs, schemas
  assets/           # Images, templates
  evals/
    evals.json      # Test cases
    runs/           # Eval run results
    history.json    # Version history
  _meta.json        # Publishing metadata
```

`_meta.json` template:

```json
{
  "ownerId": "",
  "slug": "skill-name",
  "version": "1.0.0",
  "publishedAt": null
}
```
Registry: clawhub.com. Use the clawhub CLI (already installed).

```bash
# 1. Login (opens browser once)
clawhub login

# 2. Publish directly from the skill folder; no .skill zip needed
clawhub publish ~/.openclaw/workspace/skills/<skill-name> \
  --version 1.0.0 \
  --changelog "Initial release"

# 3. Or sync all workspace skills at once:
clawhub sync --workdir ~/.openclaw/workspace --dir skills
```

Publishing steps:
1. Ensure `_meta.json` has the correct slug and version
2. Run the full eval suite; pass rate must be ≥ 0.80
3. `clawhub login` (one-time browser auth)
4. `clawhub publish <skill-folder>`
5. Verify at clawhub.com/skills/<slug>

Quality bar for publishing:
- Description triggers correctly (test with 3+ natural phrasings)
- At least 3 concrete command examples with real output
- Error cases documented
- Eval pass rate ≥ 0.80
- `_meta.json` complete
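Parts of the quality bar can be gated mechanically before running clawhub publish; a minimal sketch (hypothetical pre-flight check, not part of the clawhub CLI):

```python
import json
from pathlib import Path

def prepublish_checks(skill_dir: str, pass_rate: float) -> list[str]:
    """Return blocking problems; an empty list means the skill may publish."""
    skill_dir = Path(skill_dir)
    problems = []
    meta_path = skill_dir / "_meta.json"
    if not meta_path.exists():
        problems.append("_meta.json missing")
    else:
        meta = json.loads(meta_path.read_text())
        if not meta.get("slug") or not meta.get("version"):
            problems.append("_meta.json needs slug and version")
    if not (skill_dir / "SKILL.md").exists():
        problems.append("SKILL.md missing")
    if pass_rate < 0.80:
        problems.append(f"eval pass rate {pass_rate} below 0.80")
    return problems
```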
Category: code helpers, APIs, CLIs, browser automation, testing, and developer operations.