Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Smart LLM Router — routes every query to the cheapest capable model. Supports 17 models across Anthropic, OpenAI, Google, DeepSeek & xAI (Grok). Uses a pre-trained ML classifier and semantic embeddings.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Smart LLM router that saves up to 99% on inference costs by routing each request to the cheapest model that can handle it. Powered by a pre-trained ML classifier and semantic embeddings — no external calls, no API keys needed.
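The savings figure in the router's report is simple arithmetic over per-million-token rates. A minimal sketch, using the ELITE numbers from the example below ($3.0/M routed vs $10.0/M baseline); the real router computes this internally from its pricing table:

```python
def percent_saved(routed_cost_per_m: float, baseline_cost_per_m: float) -> float:
    """Percentage saved by routing to a cheaper model instead of the baseline rate."""
    return (1 - routed_cost_per_m / baseline_cost_per_m) * 100

print(percent_saved(3.0, 10.0))   # ELITE tier vs the $10/M baseline -> 70% saved
print(percent_saved(0.14, 10.0))  # BASIC tier vs the same baseline -> ~98.6% saved
```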
```shell
openclaw plugins install @rayray1218/semantic-model-router
```
```python
from scripts.model_router import ModelRouter

router = ModelRouter()
res = router.route("Design a distributed caching layer for a fintech platform.")
print(res["report"])
# [ClawRouter] anthropic/claude-sonnet-4-6 (ELITE, ml, conf=0.97)
# Cost: $3.0/M | Baseline: $10.0/M | Saved: 70.0%
```
Queries are classified into three tiers through a 3-stage pipeline:

1. **ML Classifier (primary):** A logistic regression model trained on 6,000+ labeled queries. Runs in <1ms from embedded weights in `model_weights.py`.
2. **Semantic Embeddings (fallback):** Cosine similarity to tier intent vectors via sentence-transformers.
3. **Keyword Rules (last resort):** Pattern matching with no dependencies.

| Tier | Default Model | Typical Workload | Cost/1M | vs Baseline |
| --- | --- | --- | --- | --- |
| BASIC | deepseek/deepseek-chat | Greetings, simple Q&A, chit-chat | $0.14 | 99% saved |
| BALANCED | openai/gpt-4o-mini | Summaries, translations, explanations | $0.15 | 99% saved |
| ELITE | anthropic/claude-sonnet-4-6 | Complex coding, architecture, security | $3.00 | 70% saved |
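The fallback order above can be sketched in a few lines. This is not the shipped implementation: the stage functions are hypothetical stand-ins, and the first two stages are stubbed out here so only the fallback chain and a miniature stage 3 are shown.

```python
def ml_classifier(query: str):
    return None  # stand-in: the real stage runs the logistic regression model

def embedding_match(query: str):
    return None  # stand-in: the real stage compares sentence embeddings

def keyword_rules(query: str) -> str:
    # Stage 3 in miniature: dependency-free pattern matching (keywords invented
    # for illustration).
    q = query.lower()
    if any(k in q for k in ("implement", "architecture", "security")):
        return "ELITE"
    if any(k in q for k in ("summarize", "translate", "explain")):
        return "BALANCED"
    return "BASIC"

def classify(query: str) -> str:
    """Return the tier from the first stage that produces an answer."""
    return ml_classifier(query) or embedding_match(query) or keyword_rules(query)
```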
| Model | Input /1M | Output /1M | Notes |
| --- | --- | --- | --- |
| anthropic/claude-sonnet-4-6 | $3.00 | $15.00 | ★ ELITE default |
| anthropic/claude-opus-4-5 | $5.00 | $25.00 | |
| anthropic/claude-haiku-4-5 | $0.80 | $4.00 | |
| Model | Input /1M | Output /1M | Notes |
| --- | --- | --- | --- |
| openai/gpt-5 | $1.25 | $10.00 | |
| openai/gpt-4o | $2.50 | $10.00 | |
| openai/gpt-4o-mini | $0.15 | $0.60 | ★ BALANCED default |
| openai/o3 | $2.00 | $8.00 | |
| openai/o4-mini | $1.10 | $4.40 | |
| Model | Input /1M | Output /1M |
| --- | --- | --- |
| google/gemini-3.0-pro | $1.25 | $10.00 |
| google/gemini-2.5-pro | $1.25 | $10.00 |
| google/gemini-2.5-flash | $0.30 | $2.50 |
| google/gemini-2.5-flash-lite | $0.10 | $0.40 |
| Model | Input /1M | Output /1M | Notes |
| --- | --- | --- | --- |
| deepseek/deepseek-chat (V3.2) | $0.28 | $0.42 | ★ BASIC default |
| deepseek/deepseek-reasoner (V3.2) | $0.28 | $0.42 | |
| Model | Input /1M | Output /1M |
| --- | --- | --- |
| xai/grok-3 | $3.00 | $15.00 |
| xai/grok-3-mini | $0.30 | $0.50 |

Pricing source: Official API docs of each provider, verified Feb 2026.
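To estimate what a single request costs at these rates, multiply token counts by the per-million figures. A minimal sketch assuming a small hand-built subset of the tables above (`PRICES` and `request_cost` are illustrative names, not part of the router's API):

```python
# model: (input $/1M tokens, output $/1M tokens), copied from the tables above
PRICES = {
    "anthropic/claude-sonnet-4-6": (3.00, 15.00),
    "openai/gpt-4o-mini": (0.15, 0.60),
    "deepseek/deepseek-chat": (0.28, 0.42),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-million-token rates."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000
```

For example, a 2,000-token prompt with a 1,000-token reply on openai/gpt-4o-mini costs $0.0009 at these rates.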
```python
# Use GPT-5.2 for ELITE, Gemini Flash Lite for BASIC
router = ModelRouter(
    elite_model="openai/gpt-5.2",
    balanced_model="google/gemini-2.5-flash",
    basic_model="google/gemini-2.5-flash-lite",
)

# Swap a tier's model without recreating the router
router.set_model("ELITE", "anthropic/claude-opus-4-5")
```
```shell
python3 scripts/model_router.py --list-models
```
```shell
# Route a single query
python3 scripts/model_router.py "Implement AES encryption from scratch"

# Override ELITE model
python3 scripts/model_router.py --elite openai/gpt-5.2 "Write a compiler"

# Run full smoke-test
python3 scripts/model_router.py
```
```python
router.add_keywords("ELITE", ["cryptographic proof", "zero-knowledge"])
```
```
Query                                           Predicted  Expected  ✓  Cost Info
─────────────────────────────────────────────────────────────────────────────────
How are you doing today?                        BASIC      BASIC     ✓  $0.14/M  saved 98.6%
Summarize this article in three bullet points.  BALANCED   BALANCED  ✓  $0.15/M  saved 98.5%
Implement a thread-safe LRU cache in Python.    ELITE      ELITE     ✓  $3.0/M   saved 70.0%
```
- **Zero external calls:** All classification runs locally.
- **No API keys:** The router itself needs none.
- **Transparent weights:** All model parameters live in `scripts/model_weights.py` — fully auditable.

Save costs, route smarter. Built for the OpenClaw community.