Requirements

- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Route tasks to the best AI model across paid subscriptions (Claude, ChatGPT, Codex, Gemini, Kimi) via OpenClaw gateway. Use when user mentions model routing,...
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
> I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.

> I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Route incoming tasks to the optimal AI model across available providers. OpenClaw handles all API connections — this skill defines the classification and delegation logic. Classify each task by type and delegate to the appropriate agent/model.
When this skill is first loaded, determine the user's available providers:

1. Ask: "Which AI subscriptions do you have?" (Claude Max 5x/20x, ChatGPT Plus/Pro, Gemini Advanced, Kimi)
2. Map subscriptions to available tiers (see table below)
3. Disable tiers for missing providers; those decision steps get skipped
4. Confirm the active configuration with the user

If only Claude is available, all tasks stay on Opus. No routing is needed, but the conflict resolution and collaboration patterns still apply for judging task complexity.

To verify providers are actually working after setup, ask the user to run:

```
openclaw models status
```

Any model showing `missing` or `auth_expired` is not usable. Remove it from your active tiers until the user fixes it.

For full provider configuration details, consult references/provider-config.md (in the same directory as this SKILL.md).
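The subscription-to-tier mapping in the setup steps can be sketched as a small lookup derived from the tier table (the dictionary and function names here are illustrative, not OpenClaw APIs):

```python
# Illustrative mapping of provider subscriptions to the routing tiers
# they unlock, derived from the tier table in this document. Not an
# OpenClaw API; names are assumptions for the sketch.
PROVIDER_TIERS = {
    "claude": ["DEEP"],
    "gemini": ["SIMPLE", "FAST", "RESEARCH"],
    "chatgpt": ["CODE"],
    "kimi": ["ORCHESTRATE"],
}

def active_tiers(subscriptions):
    """Return the set of tiers enabled by the user's subscriptions.

    Tiers for missing providers are simply absent, so the decision
    steps that reference them get skipped.
    """
    tiers = set()
    for provider in subscriptions:
        tiers.update(PROVIDER_TIERS.get(provider, []))
    return tiers
```

With only a Claude subscription this yields `{"DEEP"}`, matching the "all tasks stay on Opus" behavior described above.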
| Tier | Model | OpenClaw ID | Speed | TTFT | Intelligence | Context | Best At |
|---|---|---|---|---|---|---|---|
| SIMPLE | Gemini 2.5 Flash-Lite | google-gemini-cli/gemini-2.5-flash-lite | 495 tok/s | 0.23s | 21.6 | 1M | Low-latency pings, trivial format tasks |
| FAST | Gemini 3 Flash | google-gemini-cli/gemini-3-flash-preview | 206 tok/s | 12.75s | 46.4 | 1M | Instruction following, structured output, heartbeats |
| RESEARCH | Gemini 3 Pro | google-gemini-cli/gemini-3-pro-preview | 131 tok/s | 29.59s | 48.4 | 1M | Scientific research, long context analysis |
| CODE | GPT-5.3 Codex | openai-codex/gpt-5.3-codex | 113 tok/s | 20.00s | 51.5 | 200K | Code generation, math (99.0) |
| DEEP | Claude Opus 4.6 | anthropic/claude-opus-4-6 | 67 tok/s | 1.76s | 53.0 | 200K | Reasoning, planning, judgment |
| ORCHESTRATE | Kimi K2.5 | kimi-coding/k2p5 | 39 tok/s | 1.65s | 46.7 | 128K | Multi-agent orchestration (TAU-2: 0.959) |

Key benchmark scores (higher is better):

- GPQA (science): Gemini Pro 0.908, Opus 0.769, Codex 0.738*
- Coding (SWE-bench): Codex 49.3*, Opus 43.3, Gemini Pro 35.1
- Math (AIME '25): Codex 99.0*, Gemini Flash 97.0, Opus 54.0
- IFBench (instruction following): Gemini Flash 0.780, Opus 0.639, Codex 0.590*
- TAU-2 (agentic tool use): Kimi K2.5 0.959, Codex 0.811*, Opus 0.780

Scores marked with * are estimated from vendor reports, not independently verified. Source: Artificial Analysis API v4, February 2026. Structured data in benchmarks.json.
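For quick lookups, the tier table can be carried as structured data. The shape below is a guess at how references/benchmarks.json might be laid out; the field names are assumptions, and the numbers are copied from the table above:

```python
# Tier metrics copied from the table above. Field names are assumptions
# about the layout of references/benchmarks.json, not its actual schema.
TIERS = {
    "SIMPLE":      {"model": "Gemini 2.5 Flash-Lite", "tok_s": 495, "ttft_s": 0.23,  "intel": 21.6, "ctx": 1_000_000},
    "FAST":        {"model": "Gemini 3 Flash",        "tok_s": 206, "ttft_s": 12.75, "intel": 46.4, "ctx": 1_000_000},
    "RESEARCH":    {"model": "Gemini 3 Pro",          "tok_s": 131, "ttft_s": 29.59, "intel": 48.4, "ctx": 1_000_000},
    "CODE":        {"model": "GPT-5.3 Codex",         "tok_s": 113, "ttft_s": 20.00, "intel": 51.5, "ctx": 200_000},
    "DEEP":        {"model": "Claude Opus 4.6",       "tok_s": 67,  "ttft_s": 1.76,  "intel": 53.0, "ctx": 200_000},
    "ORCHESTRATE": {"model": "Kimi K2.5",             "tok_s": 39,  "ttft_s": 1.65,  "intel": 46.7, "ctx": 128_000},
}

def best_tier(metric, tiers=TIERS, lowest=False):
    """Pick the tier name that maximizes (or minimizes) one metric."""
    key = lambda t: tiers[t][metric]
    return min(tiers, key=key) if lowest else max(tiers, key=key)
```

For example, `best_tier("ttft_s", lowest=True)` picks SIMPLE (0.23s TTFT) and `best_tier("intel")` picks DEEP, consistent with the routing advice later in this document.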
Walk through these 9 steps IN ORDER for every incoming task. The FIRST match wins. If a required model is unavailable, skip that step and continue to the next.

Estimating token count for Step 1: count characters in the input and divide by 4 (100k tokens ≈ 400,000 characters). If the user pastes a large file, codebase, or says "analyze this entire repo," assume it exceeds 100k.

| Step | Signals | Route to | Fallbacks |
|---|---|---|---|
| 1. Context >100k tokens | large file, long document, bulk, CSV, log dump, entire codebase, "analyze this PDF" | RESEARCH (Pro, 1M ctx) | Opus (200K) |
| 2. Math / proof | calculate, solve, equation, proof, integral, probability, optimize, formula | CODE (Codex, Math 99.0) | Flash (97.0), Opus |
| 3. Code writing | write code, implement, function, class, refactor, script, migration, test, PR, diff | CODE (Codex, Coding 49.3) | Opus |
| 4. Code review / architecture | review, audit, architecture, design, trade-off, security review, best practice | DEEP (Opus, Intel 53.0) | stays on main |
| 5. Speed critical / trivial | quick, fast, simple, format, convert, summarize, list, extract, translate, one-liner | FAST (Flash, 206 tok/s) | Flash-Lite, Opus |
| 6. Research / scientific | research, find out, explain, compare, analyze, paper, evidence, fact-check, deep dive | RESEARCH (Pro, GPQA 0.908) | Opus |
| 7. Multi-step tool pipeline | orchestrate, coordinate, pipeline, workflow, chain, parallel, fan-out | ORCHESTRATE (Kimi, TAU-2 0.959) | Codex, Opus |
| 8. Structured output | follow rules exactly, JSON schema, strict template, structured, checklist, table | FAST (Flash, IFBench 0.780) | Opus |
| 9. Default | no clear match | DEEP (Opus, Intel 53.0) | safest all-rounder |

Step 5 note: for sub-second TTFT needs (pings, health checks), use SIMPLE (Flash-Lite, 0.23s TTFT). For heartbeats and cron jobs, use FAST (Flash), which has better instruction following (IFBench 0.780).
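The decision tree can be sketched as a first-match-wins classifier. The keyword lists below are abbreviated from the Signals column, and the whole function is an illustrative sketch of the routing logic, not OpenClaw's actual classifier:

```python
# First-match-wins sketch of the 9-step decision tree above.
# Keyword lists are abbreviated from the Signals column; this is an
# illustrative classifier, not OpenClaw's actual routing code.
STEPS = [
    ("RESEARCH",    ["entire codebase", "log dump", "analyze this pdf"]),      # 1. context
    ("CODE",        ["calculate", "solve", "proof", "integral", "equation"]),  # 2. math
    ("CODE",        ["write code", "implement", "refactor", "migration"]),     # 3. code writing
    ("DEEP",        ["review", "audit", "architecture", "trade-off"]),         # 4. review
    ("FAST",        ["quick", "summarize", "convert", "translate"]),           # 5. speed
    ("RESEARCH",    ["research", "compare", "fact-check", "deep dive"]),       # 6. research
    ("ORCHESTRATE", ["orchestrate", "pipeline", "fan-out", "workflow"]),       # 7. pipeline
    ("FAST",        ["json schema", "strict template", "checklist"]),          # 8. structured
]

def estimate_tokens(text):
    """Step 1 rule of thumb: roughly 4 characters per token."""
    return len(text) // 4

def route(task, available=None):
    """Return the tier for a task; `available` limits active tiers."""
    lowered = task.lower()
    if estimate_tokens(task) > 100_000:
        return "RESEARCH"                # Step 1: context size wins outright
    for tier, keywords in STEPS:
        if available is not None and tier not in available:
            continue                     # tier disabled: skip this step
        if any(k in lowered for k in keywords):
            return tier                  # first match wins
    return "DEEP"                        # Step 9: default to Opus
```

Note how `route("Quickly solve this integral")` lands on CODE rather than FAST, because the math step is checked before the speed step, matching the conflict-resolution example below.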
When a task matches multiple steps:

- "Analyze this 200-page PDF and write a Python parser for it" → Step 1 wins (context size); route to RESEARCH, then delegate code writing to CODE as a follow-up.
- "Quickly solve this integral" → Step 2 wins over Step 5 (math trumps speed).
- "Generate a JSON schema for this API" → Step 8 wins (structured output, not code writing).
- "Review this code and refactor the authentication module" → Step 4 wins for the review, then Step 3 for the refactor (delegate to CODE).
Do NOT route away from the current model when:

- User explicitly requests a model. "Use Opus for this" or "don't delegate this": always respect direct instructions.
- Security-sensitive tasks. If the task involves credentials, private keys, secrets, or personally identifiable data, keep it on the main agent. Do not send sensitive content to sub-agents.
- Debugging a specific model. If the user is testing or comparing model behavior, route to the model they specify.
- Mid-conversation continuity. In a multi-turn conversation where the user asks a quick follow-up, do not switch models just because the follow-up is "simple." Stay on the current model for context continuity unless the user explicitly asks to delegate.
When multiple steps seem to match, resolve with these priority rules:

- Judgment trumps speed. If the task has ambiguity, nuance, or risk, stay on Opus.
- Specialist trumps generalist. If a model has a standout benchmark for the exact task type, prefer it. Code writing → Codex; code review → Opus. Use different models for writing vs judging.
- Context overflow → Gemini. Only Gemini models handle 1M context.
- TTFT matters for interactive tasks. Flash-Lite (0.23s), Kimi (1.65s), and Opus (1.76s) respond fast. Codex (20s) and Pro (29.59s) are slow to start; don't use them for quick back-and-forth.
- When truly tied → Opus. Highest general intelligence, lowest risk of subtle errors.
Use OpenClaw's agent system to delegate:

```
/agent <agent-id> <instruction>
```

You send `/agent codex <instruction>` and OpenClaw spawns the sub-agent with that instruction. The sub-agent runs in its own workspace and returns a text response. Sub-agents do NOT share your conversation context or workspace files, so pass ALL necessary context in the instruction: the specific task, relevant code snippets, output format expectations, and constraints.
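Since sub-agents share no context, the delegation string has to carry everything. A sketch of assembling one (the helper name and argument names are hypothetical; the `/agent` call is just text here):

```python
def build_delegation(agent_id, task, context="", output_format="", constraints=""):
    """Assemble a self-contained /agent instruction string.

    Hypothetical helper: sub-agents see only this string, so every
    needed snippet, format expectation, and constraint is inlined.
    """
    parts = [task]
    if context:
        parts.append(f"Context:\n{context}")
    if output_format:
        parts.append(f"Output format: {output_format}")
    if constraints:
        parts.append(f"Constraints: {constraints}")
    return f"/agent {agent_id} " + "\n\n".join(parts)
```

The point of the sketch is the discipline, not the API: anything the sub-agent needs that is not in this one string is lost.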
```
/agent codex Write a Python function that parses RFC 3339 timestamps with timezone support. Return only the code.
/agent gemini-researcher Analyze the differences between SQLite WAL mode and journal mode. Include benchmarks and a recommendation.
/agent gemini-fast Convert the following list into a markdown table with columns: Name, Role, Status.
/agent kimi-orchestrator Coordinate: (1) gemini-researcher gathers data on X, (2) codex writes a parser, (3) report results.
```
- Timeout (no response within 60s): retry once on the same model. If it fails again, fall to the next fallback.
- Auth error (401/403): do NOT retry; fall to the next fallback immediately and tell the user to re-authenticate. See references/oauth-setup.md.
- Rate limit (429): wait 30 seconds, retry once. If still limited, fall to the next fallback.
- Partial/garbage response: retry once. If still broken, fall to the next fallback.
- Model unavailable: skip that tier entirely and continue.

Maximum retries: 1 retry on the same model, then the next fallback. If ALL fallbacks fail, stay on Opus. Never retry more than 3 times total across all fallbacks. When a fallback is triggered, briefly inform the user: "Codex is unavailable, routing to Opus instead."
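The retry rules above can be sketched as a loop over a fallback chain. The exception classes and the `call_model` hook are placeholders standing in for whatever the gateway actually raises, not OpenClaw APIs:

```python
import time

# Placeholder error types standing in for whatever the gateway raises.
class Timeout(Exception): pass
class AuthError(Exception): pass      # 401/403: never retried
class RateLimited(Exception): pass    # 429: one wait-and-retry

def call_with_fallbacks(call_model, chain, instruction, sleep=time.sleep):
    """Apply the retry policy above: at most 1 retry per model, auth
    errors skip retries entirely, and at most 3 retries total across
    the whole chain. Returns (model, response) or (None, None)."""
    retries = 0
    for model in chain:
        attempts = 0
        while attempts < 2:               # first try + at most 1 retry
            try:
                return model, call_model(model, instruction)
            except AuthError:
                break                     # no retry; next fallback
            except RateLimited:
                if attempts == 0 and retries < 3:
                    sleep(30)             # back off, then retry once
                    attempts += 1
                    retries += 1
                else:
                    break
            except Timeout:
                if attempts == 0 and retries < 3:
                    attempts += 1
                    retries += 1
                else:
                    break
    return None, None                     # all fallbacks failed; stay on Opus
```

The "partial/garbage response" case would be handled the same way as `Timeout`: one retry, then fall through.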
Stay on the same model for follow-up messages in the same topic; context continuity matters more than optimal model selection. Re-route only when the task type clearly changes. Example: the user discusses architecture (Opus), then says "now write the implementation" → delegate code writing to Codex.

When switching models mid-conversation:

1. Summarize the relevant context from the current conversation.
2. Pass that summary as part of the delegation instruction.
3. Continue on the original model (Opus) with awareness of what the sub-agent produced.
- Sub-agents cannot read your files; paste content into the instruction.
- Sub-agents cannot write to your workspace; output comes back as text.
- Sub-agents share nothing with each other; complete isolation is by design.
| Pattern | Flow | Use when |
|---|---|---|
| Pipeline | Research Agent → Main Agent → Code Agent | Task requires gathering facts before implementing |
| Parallel + Merge | Main spawns Code (approach A) + Research (approach B), then merges | Exploring multiple solutions or under time pressure |
| Adversarial Review | Code Agent writes → Main critiques → Code revises | Security-sensitive or production-critical code |
| Orchestrated (Kimi) | `/agent kimi-orchestrator Plan and execute: <task>` | 3+ agents in complex dependency graphs |

Caution: Kimi is the slowest model (39 tok/s) but the best at tool orchestration (TAU-2: 0.959).
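The Pipeline row can be sketched with stub agents. The two callables stand in for `/agent` delegations; the main agent simply stitches their text outputs together:

```python
def pipeline(research, code, task):
    """Pipeline pattern sketch: gather facts first, then implement.

    `research` and `code` are stand-ins for /agent delegations that
    take an instruction string and return a text response.
    """
    facts = research(f"Gather the facts needed for: {task}")
    # The code agent sees none of the research agent's context, so the
    # facts must be inlined into its instruction.
    return code(f"Implement: {task}\n\nBackground:\n{facts}")
```

The same stitching idea underlies the other patterns: Parallel + Merge runs both callables on the original task and merges, and Adversarial Review loops critique text back into a revision instruction.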
When a model is unavailable or rate-limited, fall through in reliability order.
| Task Type | Primary | Fallback 1 | Fallback 2 | Fallback 3 |
|---|---|---|---|---|
| Reasoning | Opus | Gemini Pro | Codex | Kimi K2.5 |
| Code | Codex | Opus | Gemini Pro | Kimi K2.5 |
| Research | Gemini Pro | Opus | Codex | Kimi K2.5 |
| Fast tasks | Flash-Lite | Flash | Opus | Codex |
| Agentic | Kimi K2.5 | Codex | Gemini Pro | Opus |

Important: always use cross-provider fallbacks. Same-provider fallbacks (e.g., Gemini Pro → Flash) help with model-specific issues but not provider outages. Every fallback chain should span at least 2 different providers.
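The "span at least 2 providers" rule can be checked mechanically. The model-to-provider map below mirrors the tier table; the names are illustrative:

```python
# Model-to-provider map, mirroring the tier table in this document.
PROVIDER_OF = {
    "Opus": "anthropic",
    "Codex": "openai",
    "Gemini Pro": "google",
    "Flash": "google",
    "Flash-Lite": "google",
    "Kimi K2.5": "kimi",
}

# Fallback chains copied from the full-stack table above.
CHAINS = {
    "Reasoning":  ["Opus", "Gemini Pro", "Codex", "Kimi K2.5"],
    "Code":       ["Codex", "Opus", "Gemini Pro", "Kimi K2.5"],
    "Research":   ["Gemini Pro", "Opus", "Codex", "Kimi K2.5"],
    "Fast tasks": ["Flash-Lite", "Flash", "Opus", "Codex"],
    "Agentic":    ["Kimi K2.5", "Codex", "Gemini Pro", "Opus"],
}

def spans_providers(chain, minimum=2):
    """True if the chain covers at least `minimum` distinct providers."""
    return len({PROVIDER_OF[m] for m in chain}) >= minimum
```

A chain like Flash-Lite → Flash fails this check: both models are Google, so a provider outage takes out the whole chain.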
With only Claude and Gemini available:

| Task Type | Primary | Fallback 1 | Fallback 2 |
|---|---|---|---|
| Reasoning | Opus | Gemini Pro | — |
| Code | Opus | Gemini Pro | — |
| Research | Gemini Pro | Opus | — |
| Fast tasks | Flash-Lite | Flash | Opus |
With only Claude and ChatGPT (Codex) available:

| Task Type | Primary | Fallback 1 |
|---|---|---|
| Reasoning | Opus | Codex |
| Code | Codex | Opus |
| Everything else | Opus | Codex |
All tasks route to Opus. No fallback needed.
For auth setup, OAuth flows (including headless VPS), and multi-device safety details, consult references/oauth-setup.md (in the same directory as this SKILL.md). For provider configuration (openclaw.json, per-agent models.json, Google Gemini workarounds), consult references/provider-config.md.

Quick reference:

| Provider | Auth Method | Maintenance |
|---|---|---|
| Anthropic | Setup-token (OAuth) | Low (auto-refresh) |
| Google Gemini | OAuth (CLI plugin) | Very low (long-lived tokens) |
| OpenAI Codex | OAuth (ChatGPT PKCE) | Low (auto-refresh) |
| Kimi | Static API key | None (never expires) |
For detailed troubleshooting, consult references/troubleshooting.md (in the same directory as this SKILL.md). Common issues:

- "No API provider registered for api: undefined" → missing `api` field in the provider config
- "API key not valid" with a Gemini subscription → wrong API type; use `google-gemini-cli`, not `google-generative-ai`
- Model shows `missing` → model ID mismatch; use `gemini-2.5-flash-lite` (no `-preview` suffix)
- Codex 401 Unauthorized → token expired; re-run the OAuth flow via references/oauth-setup.md
- Sub-agent "Unknown model" → provider missing from the sub-agent's auth-profile
| Setup | Monthly | Notes |
|---|---|---|
| Claude only (Max 5x) | $100 | No routing, Opus handles everything |
| Claude only (Max 20x) | $200 | No routing, 20x rate limits |
| Balanced (Max 20x + Gemini) | $220 | Adds Flash speed + Pro research |
| Code-focused (+ ChatGPT Plus) | $240 | Adds Codex for code + math |
| Full stack (all 4, ChatGPT Plus) | $250 | Full specialization |
| Full stack Pro (all 4, ChatGPT Pro) | $430 | Maximum rate limits |

Source: Artificial Analysis API v4, February 2026. Codex scores estimated (*) from OpenAI blog data. Structured benchmark data available in references/benchmarks.json.
| File | Content |
|---|---|
| references/oauth-setup.md | Auth setup, OAuth flows, multi-device safety |
| references/provider-config.md | openclaw.json, per-agent models.json, Gemini workarounds |
| references/troubleshooting.md | Common errors and fixes |
| references/benchmarks.json | Raw benchmark data for all models |