Requirements

- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Intelligent cost-aware model routing that classifies task complexity and selects the optimal AI model. Automatically routes simple tasks to cheap models and reserves premium models for tasks that genuinely need them.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Intelligent cost-aware model routing for OpenClaw agents. Before executing any task via sessions_spawn or delegating to a sub-agent, classify the task complexity using the rules below and route to the optimal model. This saves 60-90% on LLM costs by using cheap models for simple work and reserving premium models for tasks that genuinely need them.
Route every request to the cheapest model that can handle it well.
Score the task on these dimensions. Count how many COMPLEX/REASONING indicators are present:
**Tier 1 (simple)**
- Greetings, small talk, status checks, heartbeats
- Single factual questions ("What is X?", "Define Y")
- Simple translations, format conversions
- File lookups, directory listings, basic shell commands
- Calendar checks, weather queries
- Tasks under 50 tokens with no technical depth
- Keywords: "what is", "define", "translate", "list", "check", "hello", "status"

**Tier 2 (moderate)**
- Summarization of documents or conversations
- Single-file code edits, bug fixes, simple refactors
- Writing emails, messages, short-form content
- Data extraction, parsing, formatting
- Explaining concepts, answering "how to" questions
- Research requiring synthesis of a few sources
- Keywords: "summarize", "explain", "write", "fix this", "how to", "extract"

**Tier 3 (complex)**
- Multi-file code generation or refactoring
- Architecture design, system design
- Creative writing (stories, long-form, nuanced tone)
- Debugging complex issues across multiple systems
- Analysis requiring multiple perspectives
- Tasks with constraints ("optimize for X while maintaining Y")
- Keywords: "build", "design", "architect", "refactor", "create", "implement", "analyze"

**Tier 4 (reasoning)**
- Mathematical proofs, formal logic
- Multi-step reasoning chains ("first X, then Y, therefore Z")
- Security vulnerability analysis
- Performance optimization with tradeoffs
- Scientific analysis, hypothesis testing
- Any task with 2+ of: "prove", "derive", "why does", "compare and contrast", "evaluate tradeoffs", "step by step"
- Keywords: "prove", "derive", "reason", "why does", "evaluate", "theorem"
Quick rules:

- 2+ reasoning keywords → always Tier 4 (high confidence)
- Code blocks or multi-file references → minimum Tier 2
- "Debug" + stack traces → Tier 3
- Heartbeats and /status → always Tier 1
- When uncertain, default to Tier 2 (fast, cheap, good enough)
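The classification rules above can be sketched as a small keyword heuristic. This is illustrative only: the keyword lists are copied from the tier definitions, while the `classify_tier` function and its check order are an assumption about how an agent might apply them.

```python
# Keyword lists copied from the tier definitions above; the function itself
# is a hypothetical sketch, not part of the skill package.
REASONING_KEYWORDS = ("prove", "derive", "reason", "why does", "evaluate", "theorem")
TIER_KEYWORDS = {
    3: ("build", "design", "architect", "refactor", "create", "implement", "analyze"),
    2: ("summarize", "explain", "write", "fix this", "how to", "extract"),
    1: ("what is", "define", "translate", "list", "check", "hello", "status"),
}

def classify_tier(task: str) -> int:
    text = task.lower()
    if sum(kw in text for kw in REASONING_KEYWORDS) >= 2:
        return 4  # 2+ reasoning keywords -> always Tier 4
    if "debug" in text and "traceback" in text:
        return 3  # "debug" plus a stack trace -> Tier 3
    for tier in (3, 2, 1):
        if any(kw in text for kw in TIER_KEYWORDS[tier]):
            return tier
    if "```" in task:
        return 2  # code blocks -> minimum Tier 2
    return 2      # when uncertain, default to Tier 2
```

Checking the complex tiers first keeps a task like "build a status page" at Tier 3 even though it also contains a Tier 1 keyword.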
**Tier 0 (free)**

| Model | Cost | Best For |
|---|---|---|
| Gemini 2.5 Flash (free) | $0.00 | High-volume simple tasks, translation |
| Gemini 2.5 Flash-Lite (free) | $0.00 | Translation, marketing |
| Gemini 3 Flash Preview (free) | $0.00 | Technology, health, science |
| DeepSeek V3.2 (free) | $0.00 | Roleplay, creative writing |
| Moonshot Kimi K2.5 (free) | $0.00 | Technology, programming |
| Arcee Trinity Large Preview (free) | $0.00 | Creative writing, storytelling, agents |

Default Tier 0 model: openrouter/free (auto-selects from available free models). Access via OpenRouter with model IDs like google/gemini-2.5-flash, deepseek/deepseek-v3.2-20251201, moonshotai/kimi-k2.5-0127, or use openrouter/free to auto-route across all free models. Note: free models have rate limits and variable availability; use them for non-critical tasks only.

**Tier 1 (simple)**

| Model | Input $/MTok | Output $/MTok | Best For |
|---|---|---|---|
| Gemini 2.0 Flash | $0.10 | $0.40 | Default simple tier: fast, multimodal, 1M context |
| GPT-4o-mini | $0.15 | $0.60 | Simple tasks, multimodal |
| GPT-5 Nano | $0.05 | $0.40 | Cheapest OpenAI option |
| DeepSeek V3 | $0.27 | $1.10 | Budget general-purpose |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | Most economical Google model |

Default Tier 1 model: gemini-2.0-flash (best cost/reliability balance)

**Tier 2 (moderate)**

| Model | Input $/MTok | Output $/MTok | Best For |
|---|---|---|---|
| Claude Haiku 4.5 | $1.00 | $5.00 | Near-frontier, fast, great coding |
| GPT-4o | $2.50 | $10.00 | Multimodal, tool use, solid all-rounder |
| Gemini 2.5 Flash | $0.15 | $0.60 | Thinking-enabled, fast reasoning |
| GPT-5 Mini | $0.25 | $2.00 | Balanced performance, 400K context |
| Mistral Medium 3 | $0.40 | $2.00 | European languages, balanced |

Default Tier 2 model: claude-haiku-4-5 (best quality-to-price at this tier)

**Tier 3 (complex)**

| Model | Input $/MTok | Output $/MTok | Best For |
|---|---|---|---|
| Claude Sonnet 4.5 | $3.00 | $15.00 | Best coding-to-cost ratio, most popular |
| GPT-5 | $1.25 | $10.00 | Flagship coding and agentic tasks |
| GPT-5.3 Codex | $1.75* | $14.00* | Most capable agentic coding model |
| Gemini 2.5 Pro | $1.25 | $10.00 | Coding, reasoning, up to 2M context |
| Claude Opus 4.5 | $5.00 | $25.00 | Maximum intelligence, agentic tasks |
| Grok 4 | $3.00 | $15.00 | Frontier reasoning, real-time data |

*GPT-5.3 Codex API pricing not yet officially released; estimated from GPT-5.2 Codex rates.

Default Tier 3 model: claude-sonnet-4-5 (best balance of quality, coding, and cost)

**Tier 4 (reasoning)**

| Model | Input $/MTok | Output $/MTok | Best For |
|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | Latest frontier reasoning, extended thinking, 1M context (beta) |
| Claude Opus 4.5 | $5.00 | $25.00 | Extended thinking, frontier reasoning |
| o3 | $2.00 | $8.00 | Deep STEM reasoning |
| DeepSeek R1 | $0.55 | $2.19 | Budget reasoning (20-50x cheaper than o1) |
| o4-mini | $1.10 | $4.40 | Efficient reasoning |

Default Tier 4 model: claude-opus-4-6 with extended thinking enabled
Use the default model for each tier as listed above. Escalate to the next tier if a model produces low-quality output or fails.
Override tier defaults to the cheapest option:

- Tier 0-1: openrouter/free ($0.00) for simple tasks; fall back to gemini-2.0-flash ($0.10/$0.40)
- Tier 2: gemini-2.5-flash ($0.15/$0.60)
- Tier 3: gemini-2.5-pro ($1.25/$10.00)
- Tier 4: deepseek-r1 ($0.55/$2.19)

Savings: 70-99% vs. always using Opus

Override tier defaults to best-in-class:

- Tier 1: claude-haiku-4-5 ($1.00/$5.00)
- Tier 2: claude-sonnet-4-5 ($3.00/$15.00)
- Tier 3: claude-opus-4-6 ($5.00/$25.00) or gpt-5.3-codex for coding
- Tier 4: claude-opus-4-6 ($5.00/$25.00) with extended thinking
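The balanced defaults and the two override presets can be expressed as a lookup table. A minimal sketch, assuming a tier number has already been assigned; the dict layout and the `select_model` helper are illustrative, and the model IDs are taken from the tier lists.

```python
# Preset name -> tier -> model ID. The mapping is copied from the preset
# lists; the structure itself is a hypothetical sketch.
ROUTING_PRESETS = {
    "balanced": {
        0: "openrouter/free", 1: "gemini-2.0-flash", 2: "claude-haiku-4-5",
        3: "claude-sonnet-4-5", 4: "claude-opus-4-6",
    },
    "aggressive": {
        0: "openrouter/free", 1: "gemini-2.0-flash", 2: "gemini-2.5-flash",
        3: "gemini-2.5-pro", 4: "deepseek-r1",
    },
    "quality": {
        1: "claude-haiku-4-5", 2: "claude-sonnet-4-5",
        3: "claude-opus-4-6", 4: "claude-opus-4-6",
    },
}

def select_model(tier: int, mode: str = "balanced") -> str:
    # Fall back to the balanced default when a preset has no entry for a tier.
    return ROUTING_PRESETS[mode].get(tier, ROUTING_PRESETS["balanced"][tier])
```

Keeping the presets in one table makes the mid-conversation mode switches described later a one-variable change rather than a rewrite of the routing logic.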
```shell
# Simple task → Tier 1
sessions_spawn --task "What's on my calendar today?" --model gemini-2.0-flash

# Moderate task → Tier 2
sessions_spawn --task "Summarize this document" --model claude-haiku-4-5

# Complex task → Tier 3
sessions_spawn --task "Build a React auth component with tests" --model claude-sonnet-4-5

# Reasoning task → Tier 4
sessions_spawn --task "Prove this algorithm is O(n log n)" --model claude-opus-4-6
```
When uncertain about complexity, start cheap and escalate:

```shell
# 1. Try Tier 1 with a timeout
sessions_spawn --task "Fix this bug" --model gemini-2.0-flash --runTimeoutSeconds 60

# 2. If output is poor or it times out, escalate to Tier 2
sessions_spawn --task "Fix this bug" --model claude-haiku-4-5

# 3. If still failing, escalate to Tier 3
sessions_spawn --task "Fix this complex bug" --model claude-sonnet-4-5
```

Maximum escalation chain: 3 attempts. If Tier 3 fails, surface the error to the user rather than burning tokens.
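The escalation ladder above can be sketched with the model runner injected, so the retry logic stays separate from the sessions_spawn invocation. Everything here is an illustrative assumption: `run(task, model)` stands in for whatever mechanism actually executes the task, returning the output string or `None` on a timeout or a low-quality result.

```python
# Hypothetical sketch of the escalation ladder; `run` is supplied by the
# caller and is assumed to return None when an attempt fails.
ESCALATION_LADDER = ("gemini-2.0-flash", "claude-haiku-4-5", "claude-sonnet-4-5")

def run_with_escalation(task, run, ladder=ESCALATION_LADDER):
    for model in ladder:
        result = run(task, model)
        if result is not None:
            return model, result
    # Maximum of three attempts: surface the failure instead of burning tokens.
    raise RuntimeError(f"All {len(ladder)} tiers failed for task: {task!r}")
```

Returning the model alongside the output lets the caller log which tier finally handled the task, which is useful when tuning the classifier.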
Route batch/parallel tasks to Tier 1 models for massive savings:

```shell
# Batch summaries in parallel with a cheap model
sessions_spawn --task "Summarize doc A" --model gemini-2.0-flash &
sessions_spawn --task "Summarize doc B" --model gemini-2.0-flash &
sessions_spawn --task "Summarize doc C" --model gemini-2.0-flash &
wait

# Then analyze results with a premium model
sessions_spawn --task "Synthesize findings from all summaries" --model claude-sonnet-4-5
```
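The same fan-out/synthesize pattern, sketched in Python under the same assumption as before: `run(task, model)` is a caller-supplied stand-in for executing a task via sessions_spawn, not a real API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sketch: fan out cheap summaries in parallel, then hand the
# combined result to a premium model for synthesis.
def summarize_then_synthesize(docs, run):
    with ThreadPoolExecutor(max_workers=8) as pool:
        summaries = list(pool.map(
            lambda doc: run(f"Summarize {doc}", "gemini-2.0-flash"), docs))
    combined = "\n".join(summaries)
    return run(f"Synthesize findings from all summaries:\n{combined}",
               "claude-sonnet-4-5")
```

Only the final synthesis step pays premium-tier prices; the per-document work runs at Tier 1 rates no matter how many documents are in the batch.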
| Scenario | Route To | Why |
|---|---|---|
| Heartbeat / status check | Tier 0 (openrouter/free) or Tier 1 | Zero intelligence needed, save every cent |
| Vision / image analysis | gemini-2.5-pro | Best multimodal + huge context |
| Long context (>100K tokens) | gemini-2.5-pro or gpt-5 | 1M-2M context windows |
| Chinese language tasks | deepseek-v3 or glm-4.7 | Optimized for Chinese |
| Real-time web data needed | grok-4.1-fast | Built-in X/web search, 2M context |
| Agentic coding tasks | gpt-5.3-codex or claude-sonnet-4-5 | Purpose-built for agentic code workflows |
| Code generation | claude-sonnet-4-5 minimum | Best code quality per dollar |
| Math / formal proofs | o3 or claude-opus-4-6 with thinking | Specialized reasoning |
For a typical OpenClaw day (24 heartbeats + 20 sub-agent tasks + 10 user queries):

| Strategy | Monthly Cost | Savings |
|---|---|---|
| All Opus 4.6 | ~$200 | baseline |
| Smart routing (balanced) | ~$45 | 78% |
| Smart routing (aggressive) | ~$15 | 92% |
| Smart routing (aggressive + free tier) | ~$5 | 97% |
| All free models (OpenRouter) | ~$0 | 100% (but rate-limited and unreliable) |
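The savings come from simple arithmetic on the per-million-token prices in the tier tables. A minimal cost helper, with made-up token counts for illustration; only the prices are from this document.

```python
# Prices are $ per million tokens, as in the model tables above.
def task_cost(input_tokens, output_tokens, input_price, output_price):
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# A hypothetical 2,000-token-in / 500-token-out task:
opus_cost = task_cost(2000, 500, 5.00, 25.00)   # claude-opus-4-6 -> $0.0225
flash_cost = task_cost(2000, 500, 0.10, 0.40)   # gemini-2.0-flash -> $0.0004
```

At these rates the Tier 1 model is over 50x cheaper per task, which is why routing heartbeats and simple queries away from Opus dominates the monthly totals above.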
Always use Tier 3+ for:

- Security-sensitive code review
- Financial calculations where errors are costly
- Architecture decisions that affect the whole codebase
- Anything for which the user explicitly asks for premium quality
- Tasks where the user says "be thorough" or "take your time"
Users can switch modes mid-conversation:

- "Use aggressive routing" → switch to the cheapest models per tier
- "Use quality mode" → switch to the best models per tier
- "Use balanced routing" → return to defaults
- "Use [specific model] for this" → override routing for one task
All prices are per million tokens. Models are listed from cheapest to most expensive output:

| Model | Input | Output | Context | Provider |
|---|---|---|---|---|
| OpenRouter Free Models | $0.00 | $0.00 | Varies | OpenRouter |
| GPT-5 Nano | $0.05 | $0.40 | 400K | OpenAI |
| Gemini 2.0 Flash | $0.10 | $0.40 | 1M | Google |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | 1M | Google |
| GPT-4o-mini | $0.15 | $0.60 | 128K | OpenAI |
| Gemini 2.5 Flash | $0.15 | $0.60 | 1M | Google |
| Grok 4.1 Fast | $0.20 | $0.50 | 2M | xAI |
| GPT-5 Mini | $0.25 | $2.00 | 400K | OpenAI |
| DeepSeek V3 | $0.27 | $1.10 | 64K | DeepSeek |
| DeepSeek R1 | $0.55 | $2.19 | 64K | DeepSeek |
| Claude Haiku 4.5 | $1.00 | $5.00 | 200K | Anthropic |
| o4-mini | $1.10 | $4.40 | 200K | OpenAI |
| Gemini 2.5 Pro | $1.25 | $10.00 | 1M | Google |
| GPT-5 | $1.25 | $10.00 | 400K | OpenAI |
| GPT-5.3 Codex | $1.75* | $14.00* | 400K | OpenAI |
| o3 | $2.00 | $8.00 | 200K | OpenAI |
| GPT-4o | $2.50 | $10.00 | 128K | OpenAI |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 200K | Anthropic |
| Grok 4 | $3.00 | $15.00 | 256K | xAI |
| Claude Opus 4.5 | $5.00 | $25.00 | 200K | Anthropic |
| Claude Opus 4.6 | $5.00 | $25.00 | 200K (1M beta) | Anthropic |

*GPT-5.3 Codex pricing estimated from GPT-5.2 Codex; official API pricing pending.

Note: Prices change. Check provider pricing pages for current rates. Batch API discounts (50% off) and prompt caching (50-90% off) can reduce costs further. OpenRouter free models have rate limits; see openrouter.ai/collections/free-models for current availability.