Requirements
- Target platform
- OpenClaw
- Install method
- Manual import
- Extraction
- Extract archive
- Prerequisites
- OpenClaw
- Primary doc
- SKILL.md
Track skill versions, benchmark performance, compare improvements, and get self-improvement signals. Integrates with tasktime and ClawVault.
Track skill versions, benchmark performance, compare improvements, and get self-improvement signals. Integrates with tasktime and ClawVault.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Self-improving skill ecosystem for AI agents. Track skill versions, benchmark performance, compare improvements, and get signals on what to fix next. Part of the ClawVault ecosystem | tasktime | ClawHub
npm install -g @versatly/skillbench
1. Use a skill β skillbench use github@1.0.0 2. Do the task β tt start "Create PR" && ... && tt stop 3. Record result β skillbench record "Create PR" --success 4. Check scores β skillbench score github 5. Improve skill β Update skill, bump version 6. Repeat β Compare v1.0.0 vs v1.1.0
skillbench use github@1.2.0 # Set active skill version skillbench skills # List tracked skills + signals
# Auto-pulls duration from tasktime skillbench record "Create PR" --success # Manual duration skillbench record "Create PR" --duration 45s --success # Record failures skillbench record "Create PR" --fail --error-type "auth-error"
skillbench score # All skills with grades skillbench score github # Single skill skillbench compare github@1.0.0 github@1.1.0
skillbench export --format markdown skillbench export --format json skillbench dashboard # Generate HTML dashboard skillbench dashboard --open # Generate and open in browser
skillbench test tasktime@1.1.0 # Run smoke test skillbench test tasktime@1.1.0 --suite full # Run named suite skillbench test tasktime@1.1.0 --dry-run # Test without recording
skillbench sync --clawhub # Import installed skills skillbench sync --vault # Sync to ClawVault skillbench sync --all # Everything
skillbench health # Overall health report with alerts skillbench watch --once # Run all test suites once skillbench watch --interval 300 # Continuous monitoring every 5 min
skillbench improve # Get suggestions for weakest skill skillbench improve github # Improvement plan for specific skill skillbench trend tasktime --days 30 # Performance trend over time skillbench leaderboard # Compare agents (multi-agent setups) skillbench schedule --interval 60 # Generate cron config for auto-testing
skillbench baseline tasktime --set # Set baseline from current performance skillbench baseline --list # List all baselines skillbench baseline --check # Check all baselines (CI-friendly, exits 1 if failing) skillbench baseline tasktime --remove # Remove a baseline
skillbench ci # Run all tests + baseline checks skillbench ci --json # JSON output for automation skillbench badge # Generate shields.io badges for README Copy examples/github-action.yml for ready-to-use GitHub Actions workflow.
GradeScoreMeaningπ A+95-100Elite performanceβ A85-94Excellentπ B70-84Goodβ οΈ C50-69Needs workβ D<50Broken Based on: Success Rate (40%), Avg Duration (30%), Consistency (20%), Trend (10%)
When you omit --duration, skillbench pulls from tasktime: tt start "Create PR" -c git # ... do work ... tt stop skillbench record --success # Duration auto-pulled
Benchmarks sync to ClawVault automatically.
skillbench skills shows: β οΈ needs work β Success rate below 70% π stale β No benchmarks in 7+ days βοΈ declining β Getting worse over time
ClawVault β Memory system for AI agents tasktime β Task timer CLI ClawHub β Skill marketplace
Workflow acceleration for inboxes, docs, calendars, planning, and execution loops.
Largest current source with strong distribution and engagement signals.