Requirements

- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Cut your LLM costs by 200x. Offload parallel, batch, and research work to Gemini Flash workers instead of burning your expensive primary model.
Turn your expensive model into an affordable daily driver. Offload the boring stuff (parallel, batch, research) to Gemini Flash workers at a fraction of the cost.
| 30 tasks via | Time | Cost |
|---|---|---|
| Opus (sequential) | ~30s | ~$0.50 |
| Swarm (parallel) | ~1s | ~$0.003 |
Swarm is ideal for:

- 3+ independent tasks (research, summaries, comparisons)
- Comparing or researching multiple subjects
- Multiple URLs to fetch/analyze
- Batch processing (documents, entities, facts)
- Complex analysis needing multiple perspectives → use chain
```bash
# Check daemon (do this every session)
swarm status

# Start if not running
swarm start

# Parallel prompts
swarm parallel "What is X?" "What is Y?" "What is Z?"

# Research multiple subjects
swarm research "OpenAI" "Anthropic" "Mistral" --topic "AI safety"

# Discover capabilities
swarm capabilities
```
N prompts → N workers simultaneously. Best for independent tasks.

```bash
swarm parallel "prompt1" "prompt2" "prompt3"
```
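The same call also works over the daemon's HTTP API. A minimal sketch, assuming Node 18+ `fetch`; the endpoint and `prompts` body field come from the curl examples in this doc, but the response shape is not specified here, so the sketch just prints whatever comes back:

```js
// Minimal sketch: POST /parallel with Node 18+ fetch. Endpoint and the
// "prompts" field follow this doc's curl examples; the response shape is
// not documented here, so we just print it.
(async () => {
  const res = await fetch('http://localhost:9999/parallel', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompts: ['What is X?', 'What is Y?', 'What is Z?'] }),
  });
  console.log(await res.json());
})();
```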
Multi-phase: search → fetch → analyze. Uses Google Search grounding.

```bash
swarm research "Buildertrend" "Jobber" --topic "pricing 2026"
```
Data flows through multiple stages, each with a different perspective/filter. Stages run in sequence; tasks within a stage run in parallel.

Stage modes:

- parallel → N inputs → N workers (same perspective)
- single → merged input → 1 worker
- fan-out → 1 input → N workers with DIFFERENT perspectives
- reduce → N inputs → 1 synthesized output

Auto-chain: describe what you want, get an optimal pipeline:

```bash
curl -X POST http://localhost:9999/chain/auto \
  -d '{"task":"Find business opportunities","data":"...market data...","depth":"standard"}'
```

Manual chain:

```bash
swarm chain pipeline.json
# or
echo '{"stages":[...]}' | swarm chain --stdin
```

Depth presets: quick (2 stages), standard (4), deep (6), exhaustive (8)

Built-in perspectives: extractor, filter, enricher, analyst, synthesizer, challenger, optimizer, strategist, researcher, critic

Preview without executing:

```bash
curl -X POST http://localhost:9999/chain/preview \
  -d '{"task":"...","depth":"standard"}'
```
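To make the stage modes concrete, here is a hypothetical manual pipeline. Only the top-level `stages` key is confirmed by the `--stdin` example above; the per-stage field names (`mode`, `perspective`, `instruction`) are assumptions, so check the skill's actual schema before relying on them:

```js
// Hypothetical manual pipeline using the stage modes and built-in
// perspectives listed above. Only the top-level "stages" key is confirmed
// by this doc; per-stage field names are assumptions.
const pipeline = {
  stages: [
    // parallel: N inputs -> N workers, same perspective
    { mode: 'parallel', perspective: 'extractor', instruction: 'Pull key facts' },
    // fan-out: 1 input -> N workers with different perspectives
    { mode: 'fan-out', perspectives: ['analyst', 'challenger', 'strategist'] },
    // reduce: N inputs -> 1 synthesized output
    { mode: 'reduce', perspective: 'synthesizer', instruction: 'Merge into one brief' },
  ],
};

// POST to the manual chain endpoint (path per the API table below).
(async () => {
  const res = await fetch('http://localhost:9999/chain', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ ...pipeline, data: '...input data...' }),
  });
  console.log(await res.json());
})();
```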
Compare single vs parallel vs chain on the same task with LLM-as-judge scoring.

```bash
curl -X POST http://localhost:9999/benchmark \
  -d '{"task":"Analyze X","data":"...","depth":"standard"}'
```

Scores on 6 FLASK dimensions: accuracy (2x weight), depth (1.5x), completeness, coherence, actionability (1.5x), nuance.
Lets the orchestrator discover what execution modes are available:

```bash
swarm capabilities
# or
curl http://localhost:9999/capabilities
```
LRU cache for LLM responses. 212x speedup on cache hits (parallel), 514x on chains.

- Keyed by hash of instruction + input + perspective
- 500 entries max, 1 hour TTL
- Skips web search tasks (need fresh data)
- Persists to disk across daemon restarts
- Per-task bypass: set `task.cache = false` (see the sketch below)

```bash
# View cache stats
curl http://localhost:9999/cache

# Clear cache
curl -X DELETE http://localhost:9999/cache
```

Cache stats show in `swarm status`.
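A sketch of the per-task bypass mentioned above; the `cache` flag comes from this doc, while the surrounding task shape is an assumption:

```js
// Per-task cache bypass, per the "task.cache = false" note above.
// The surrounding task/stage shape is an assumption.
const stage = {
  mode: 'parallel',
  tasks: [
    { instruction: 'Current BTC price in USD?', cache: false }, // needs fresh data
    { instruction: 'Define "LRU cache" in one sentence' },      // safe to cache
  ],
};
```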
If tasks fail within a chain stage, only the failed tasks get retried (not the whole stage). Default: 1 retry. Configurable per-phase via `phase.retries` or globally via `options.stageRetries`, as sketched below.
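A placement sketch for those two knobs; only the names (`phase.retries`, `options.stageRetries`) come from the doc, and the request shape around them is an assumption:

```js
// Where the retry knobs might sit in a chain request. Only the knob names
// come from the doc; the rest of the shape is an assumption.
const request = {
  stages: [
    { mode: 'parallel', retries: 2 }, // per-phase override (phase.retries)
  ],
  options: { stageRetries: 1 },       // global default (options.stageRetries)
};
```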
All endpoints return cost data in their `complete` event:

- session → current daemon session totals
- daily → persisted across restarts, accumulates all day

```bash
swarm status   # Shows session + daily cost
swarm savings  # Monthly savings report
```
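A sketch of reading cost data off the final event of a streaming call. The client import matches the programmatic API section below, but the event's `type` and `cost` field names are assumptions:

```js
// Sketch: read cost off the final "complete" event of a streaming call.
// The client comes from the programmatic API below; the event's "type"
// and "cost" field names are assumptions.
const { SwarmClient } = require('~/clawd/skills/node-scaling/lib/client');

(async () => {
  const client = new SwarmClient();
  for await (const event of client.parallel(['What is X?'])) {
    if (event.type === 'complete') {
      console.log('cost:', event.cost);
    }
  }
})();
```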
Workers search the live web via Google Search grounding (Gemini only, no extra cost).

```bash
# Research uses web search by default
swarm research "Subject" --topic "angle"

# Parallel with web search
curl -X POST http://localhost:9999/parallel \
  -d '{"prompts":["Current price of X?"],"options":{"webSearch":true}}'
```
```js
const { parallel, research } = require('~/clawd/skills/node-scaling/lib');
const { SwarmClient } = require('~/clawd/skills/node-scaling/lib/client');

// Simple parallel
const results = await parallel(['prompt1', 'prompt2', 'prompt3']);

// Client with streaming
const client = new SwarmClient();
for await (const event of client.parallel(prompts)) { /* ... */ }
for await (const event of client.research(subjects, topic)) { /* ... */ }

// Chain
const chainResult = await client.chainSync({ task, data, depth });
```
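A fuller, self-contained usage sketch of the client above; the streaming events are printed as-is since their shape isn't specified in this doc:

```js
// Usage sketch for the client above. Streaming events are printed as-is
// because their shape isn't specified in this doc.
const { SwarmClient } = require('~/clawd/skills/node-scaling/lib/client');

async function main() {
  const client = new SwarmClient();

  // Stream a research run
  for await (const event of client.research(['OpenAI', 'Anthropic'], 'AI safety')) {
    console.log(event);
  }

  // Run a chain synchronously
  const result = await client.chainSync({
    task: 'Summarize the findings',
    data: '...research output...',
    depth: 'quick',
  });
  console.log(result);
}

main().catch(console.error);
```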
```bash
swarm start       # Start daemon (background)
swarm stop        # Stop daemon
swarm status      # Status, cost, cache stats
swarm restart     # Restart daemon
swarm savings     # Monthly savings report
swarm logs [N]    # Last N lines of daemon log
```
| Mode | Tasks | Time | Notes |
|---|---|---|---|
| Parallel (simple) | 5 | ~700ms | 142ms/task effective |
| Parallel (stress) | 10 | ~1.2s | 123ms/task effective |
| Chain (standard) | 5 | ~14s | 3-stage multi-perspective |
| Chain (quick) | 2 | ~3s | 2-stage extract+synthesize |
| Cache hit | any | ~3-5ms | 200-500x speedup |
| Research (web) | 2 | ~15s | Google grounding latency |
Location: `~/.config/clawdbot/node-scaling.yaml`

```yaml
node_scaling:
  enabled: true
  limits:
    max_nodes: 16
    max_concurrent_api: 16
  provider:
    name: gemini
    model: gemini-2.0-flash
  web_search:
    enabled: true
    parallel_default: false
  cost:
    max_daily_spend: 10.00
```
| Issue | Fix |
|---|---|
| Daemon not running | `swarm start` |
| No API key | Set `GEMINI_API_KEY` or run `npm run setup` |
| Rate limited | Lower `max_concurrent_api` in config |
| Web search not working | Ensure provider is gemini + `web_search.enabled` |
| Cache returns stale results | `curl -X DELETE http://localhost:9999/cache` |
| Chain too slow | Use `depth: "quick"` or check context size |
Force JSON output with schema validation → zero parse failures on structured tasks.

```bash
# With built-in schema
curl -X POST http://localhost:9999/structured \
  -d '{"prompt":"Extract entities from: Tim Cook announced iPhone 17","schema":"entities"}'

# With custom schema
curl -X POST http://localhost:9999/structured \
  -d '{"prompt":"Classify this text","data":"...","schema":{"type":"object","properties":{"category":{"type":"string"}}}}'

# JSON mode (no schema, just force JSON)
curl -X POST http://localhost:9999/structured \
  -d '{"prompt":"Return a JSON object with name, age, city for a fictional person"}'

# List available schemas
curl http://localhost:9999/structured/schemas
```

Built-in schemas: entities, summary, comparison, actions, classification, qa

Uses Gemini's native `response_mime_type: application/json` + `responseSchema` for guaranteed JSON output. Includes schema validation on the response.
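The custom-schema call again as a Node sketch; the response field carrying the validated JSON isn't documented here, so the sketch prints the whole body:

```js
// Node version of the custom-schema curl example above. The response
// field carrying the validated JSON isn't documented here, so print all.
(async () => {
  const res = await fetch('http://localhost:9999/structured', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      prompt: 'Classify this text',
      data: '...',
      schema: { type: 'object', properties: { category: { type: 'string' } } },
    }),
  });
  console.log(await res.json());
})();
```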
Same prompt → N parallel executions → pick the best answer. Higher accuracy on factual/analytical tasks.

```bash
# Judge strategy (LLM picks best; most reliable)
curl -X POST http://localhost:9999/vote \
  -d '{"prompt":"What are the key factors in SaaS pricing?","n":3,"strategy":"judge"}'

# Similarity strategy (consensus; zero extra cost)
curl -X POST http://localhost:9999/vote \
  -d '{"prompt":"What year was Python released?","n":3,"strategy":"similarity"}'

# Longest strategy (heuristic; zero extra cost)
curl -X POST http://localhost:9999/vote \
  -d '{"prompt":"Explain recursion","n":3,"strategy":"longest"}'
```

Strategies:

- judge → LLM scores all candidates on accuracy/completeness/clarity/actionability, picks winner (N+1 calls)
- similarity → Jaccard word-set similarity, picks consensus answer (N calls, zero extra cost)
- longest → Picks longest response as heuristic for thoroughness (N calls, zero extra cost)

When to use: Factual questions, critical decisions, or any task where accuracy > speed.

| Strategy | Calls | Extra Cost | Quality |
|---|---|---|---|
| similarity | N | $0 | Good (consensus) |
| longest | N | $0 | Decent (heuristic) |
| judge | N+1 | ~$0.0001 | Best (LLM-scored) |
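To make the similarity strategy concrete, an illustrative reimplementation (not the skill's code): each candidate is scored by its average Jaccard word-set overlap with every other candidate, and the highest average wins.

```js
// Illustrative Jaccard word-set consensus (not the skill's code).
function jaccard(a, b) {
  const A = new Set(a.toLowerCase().split(/\s+/));
  const B = new Set(b.toLowerCase().split(/\s+/));
  const inter = [...A].filter((w) => B.has(w)).length;
  const union = new Set([...A, ...B]).size;
  return union === 0 ? 0 : inter / union;
}

// Pick the candidate most similar, on average, to all the others.
function pickConsensus(candidates) {
  let best = candidates[0];
  let bestScore = -Infinity;
  candidates.forEach((c, i) => {
    const others = candidates.filter((_, j) => j !== i);
    const avg =
      others.reduce((sum, o) => sum + jaccard(c, o), 0) / Math.max(others.length, 1);
    if (avg > bestScore) { bestScore = avg; best = c; }
  });
  return best;
}
```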
Optional critic pass after chain/skeleton output. Scores 5 dimensions, auto-refines if below threshold.

```bash
# Add reflect:true to any chain or skeleton request
curl -X POST http://localhost:9999/chain/auto \
  -d '{"task":"Analyze the AI chip market","data":"...","reflect":true}'

curl -X POST http://localhost:9999/skeleton \
  -d '{"task":"Write a market analysis","reflect":true}'
```

In testing, reflection improved weak output from a 5.0 to a 7.6 average score; skeleton + reflect scored 9.4/10.
Generate outline → expand each section in parallel → merge into coherent document. Best for long-form content.

```bash
curl -X POST http://localhost:9999/skeleton \
  -d '{"task":"Write a comprehensive guide to SaaS pricing","maxSections":6,"reflect":true}'
```

Performance: 14,478 chars in 21s (675 chars/sec), 5.1x more content than chain at 2.9x higher throughput.

| Metric | Chain | Skeleton-of-Thought | Winner |
|---|---|---|---|
| Output size | 2,856 chars | 14,478 chars | SoT (5.1x) |
| Throughput | 234 chars/sec | 675 chars/sec | SoT (2.9x) |
| Duration | 12s | 21s | Chain (faster) |
| Quality (w/ reflect) | ~7-8/10 | 9.4/10 | SoT |

When to use what:

- SoT → long-form content, reports, guides, docs (anything with natural sections)
- Chain → analysis, research, adversarial review (anything needing multiple perspectives)
- Parallel → independent tasks, batch processing
- Structured → entity extraction, classification, any task needing reliable JSON
- Voting → factual accuracy, critical decisions, consensus-building
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check |
| GET | /status | Detailed status + cost + cache |
| GET | /capabilities | Discover execution modes |
| POST | /parallel | Execute N prompts in parallel |
| POST | /research | Multi-phase web research |
| POST | /skeleton | Skeleton-of-Thought (outline → expand → merge) |
| POST | /chain | Manual chain pipeline |
| POST | /chain/auto | Auto-build + execute chain |
| POST | /chain/preview | Preview chain without executing |
| POST | /chain/template | Execute pre-built template |
| POST | /structured | Forced JSON with schema validation |
| GET | /structured/schemas | List built-in schemas |
| POST | /vote | Majority voting (best-of-N) |
| POST | /benchmark | Quality comparison test |
| GET | /templates | List chain templates |
| GET | /cache | Cache statistics |
| DELETE | /cache | Clear cache |
| Model | Cost per 1M tokens | Relative |
|---|---|---|
| Claude Opus 4 | ~$15 input / $75 output | 1x |
| GPT-4o | ~$2.50 input / $10 output | ~7x cheaper |
| Gemini Flash | ~$0.075 input / $0.30 output | 200x cheaper |

Cache hits are essentially free (~3-5ms, no API call).
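The arithmetic behind the headline: the 200x comes from the input-price ratio ($15 / $0.075 = 200); the output ratio is even steeper ($75 / $0.30 = 250). A quick check:

```js
// Sanity-check the headline ratio using the per-1M-token prices above.
const opus  = { input: 15.0,  output: 75.0 };
const flash = { input: 0.075, output: 0.30 };
console.log(opus.input / flash.input);   // 200 -> the "200x" headline
console.log(opus.output / flash.output); // 250
```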