Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Choose AI models for coding, reasoning, and agents with cost-aware, task-matched recommendations.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
- No single model is best for everything: match model to task, not brand loyalty
- A $0.75/M model often performs identically to a $40/M model for simple tasks
- Test cheaper alternatives before committing to expensive defaults
- Output tokens cost 3-10x more than input tokens, so advertised input prices are misleading
- Calculate real cost with your actual input/output ratio, not theoretical pricing
- Batch/async APIs offer 50% discounts; use them for non-real-time workloads
- Prompt caching reduces repeated context costs significantly
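The cost points above can be checked with a little arithmetic. The sketch below is a minimal illustration, not any provider's pricing API: the function names and the example prices ($3/M input, $15/M output) are hypothetical, chosen only so that output costs 5x input.

```python
def request_cost(input_tokens, output_tokens,
                 input_price_per_m, output_price_per_m,
                 batch_discount=0.0):
    """Dollar cost of one request.

    Prices are in dollars per million tokens. batch_discount is a
    fraction (0.5 means a 50% batch/async discount).
    """
    raw = (input_tokens * input_price_per_m
           + output_tokens * output_price_per_m) / 1_000_000
    return raw * (1.0 - batch_discount)


def effective_price_per_m(input_tokens, output_tokens,
                          input_price_per_m, output_price_per_m):
    """Blended dollars per million tokens for a given input/output mix."""
    total_tokens = input_tokens + output_tokens
    cost = request_cost(input_tokens, output_tokens,
                        input_price_per_m, output_price_per_m)
    return cost / total_tokens * 1_000_000


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    IN_PRICE, OUT_PRICE = 3.0, 15.0
    print(request_cost(2000, 1000, IN_PRICE, OUT_PRICE))
    print(request_cost(2000, 1000, IN_PRICE, OUT_PRICE, batch_discount=0.5))
    print(effective_price_per_m(2000, 1000, IN_PRICE, OUT_PRICE))
```

With this mix (2,000 input tokens, 1,000 output tokens), the blended price works out to about $7 per million tokens, more than double the advertised $3 input price.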
- Architecture and design decisions: Use frontier models (Opus-class); they catch subtle issues cheaper models miss
- Day-to-day implementation: Mid-tier models (Sonnet-class) offer 90% of capability at 20% of cost
- Parallel subtasks and scaffolding: Fast/cheap models (Haiku-class); speed matters more than depth
- Code review: Thorough models catch async bugs and edge cases that fast models miss
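One way to encode this task-to-tier mapping is a plain lookup table. The sketch below is an assumption-laden illustration: the task labels, the `Tier` enum, and `pick_tier` are hypothetical names, and mapping code review to the frontier tier is one reading of "thorough models".

```python
from enum import Enum


class Tier(Enum):
    FRONTIER = "frontier"  # Opus-class
    MID = "mid"            # Sonnet-class
    FAST = "fast"          # Haiku-class


# Hypothetical task labels; adjust to your own workflow vocabulary.
TASK_TIERS = {
    "architecture": Tier.FRONTIER,
    "implementation": Tier.MID,
    "scaffolding": Tier.FAST,
    "parallel_subtask": Tier.FAST,
    "code_review": Tier.FRONTIER,  # assumption: "thorough" means frontier
}


def pick_tier(task_kind: str) -> Tier:
    """Return the tier for a task kind, falling back to mid-tier."""
    return TASK_TIERS.get(task_kind, Tier.MID)


if __name__ == "__main__":
    for kind in ("architecture", "scaffolding", "something_else"):
        print(kind, "->", pick_tier(kind).value)
```

Unknown task kinds fall back to the mid tier here, a design choice made for the sketch so that unlabeled work gets neither the cheapest nor the most expensive model.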
- Complex reasoning and math: Extended thinking modes justify their cost for hard problems
- General assistance: User preference studies favor models different from benchmark leaders
- High-volume simple queries: Cheapest models perform identically, so don't overpay
- Long documents: Context window size determines viability; some offer 1M+ tokens
- Claude Code: Fast iteration, UI/frontend, interactive debugging; developer stays in the loop
- Codex CLI: Long-running background tasks, large refactors, set-and-forget; accuracy over speed
- Both tools have value: use Claude Code for implementation, Codex for final review
- File size limits differ: Claude Code struggles with files over 25K tokens
- Planning phase: Use expensive/smart models to break down problems correctly
- Execution phase: Use balanced models, parallelize where possible
- Review phase: Use accurate models for final verification; this catches bugs others miss
- This pattern beats using one model for everything at similar total cost
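The plan, execute, review pattern can be sketched as a small pipeline with pluggable model callables. Everything here is hypothetical scaffolding: `run_pipeline` and the stub functions stand in for real model calls, which this sketch does not make.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_pipeline(task: str,
                 planner: Callable[[str], List[str]],
                 executor: Callable[[str], str],
                 reviewer: Callable[[str, List[str]], str],
                 max_workers: int = 4) -> str:
    """Plan with one model, execute subtasks in parallel, then review."""
    # Planning phase: a smart (expensive) model splits the task.
    subtasks = planner(task)
    # Execution phase: a balanced model handles subtasks in parallel.
    # pool.map returns results in the same order as the subtasks.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(executor, subtasks))
    # Review phase: an accurate model verifies the combined output.
    return reviewer(task, results)


if __name__ == "__main__":
    # Stubs standing in for real model calls (assumptions for the demo).
    def stub_planner(task: str) -> List[str]:
        return [f"{task}: step {i}" for i in (1, 2, 3)]

    def stub_executor(subtask: str) -> str:
        return f"done[{subtask}]"

    def stub_reviewer(task: str, results: List[str]) -> str:
        return f"reviewed {len(results)} results for '{task}'"

    print(run_pipeline("add login form", stub_planner,
                       stub_executor, stub_reviewer))
```

Passing each phase in as a callable keeps the pipeline independent of any one vendor, so a model can be swapped per phase without touching the control flow.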
- Benchmark scores vary 2-3x based on scaffolding and evaluation method
- User preference rankings differ significantly from benchmark rankings
- SWE-bench scores don't predict real-world coding quality reliably
- Models drift week-to-week: last month's best may underperform today
- DeepSeek and similar models approach frontier performance at 1/50th API cost
- Self-hosting eliminates API rate limits and price variability
- MIT/Apache licensed models allow commercial use without restrictions
- Consider for: data privacy, cost predictability, custom fine-tuning
- Using premium models for chatbot responses that cheap models handle identically
- Ignoring context window limits: chunking long documents costs more than using large-context models
- Expecting consistency: same prompt gives different results over time as models update
- Trusting speed over accuracy for complex tasks: fast models trade thoroughness for latency
- Default to mid-tier for most tasks, escalate to frontier only when quality suffers
- Track actual costs per workflow, not just per-token rates
- Build verification into pipelines; don't trust any model blindly
- Reassess model choices quarterly: pricing and capabilities shift constantly
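A minimal sketch of "default to mid-tier, escalate on failure, track cost per workflow" follows. The names (`CostLedger`, `answer_with_escalation`) and the flat per-call costs are hypothetical; a real setup would compute cost from token counts and use a task-specific quality check.

```python
from collections import defaultdict
from typing import Callable, Dict, Tuple


class CostLedger:
    """Accumulates dollars spent per workflow name."""

    def __init__(self) -> None:
        self._totals: Dict[str, float] = defaultdict(float)

    def record(self, workflow: str, dollars: float) -> None:
        self._totals[workflow] += dollars

    def total(self, workflow: str) -> float:
        return self._totals.get(workflow, 0.0)


def answer_with_escalation(prompt: str,
                           mid_model: Callable[[str], str],
                           frontier_model: Callable[[str], str],
                           passes_check: Callable[[str], bool],
                           ledger: CostLedger,
                           workflow: str,
                           mid_cost: float,
                           frontier_cost: float) -> Tuple[str, str, bool]:
    """Try the mid-tier model first; escalate only if verification fails.

    Returns (answer, tier_used, verified). The costs are flat per-call
    placeholders (an assumption made for this sketch).
    """
    draft = mid_model(prompt)
    ledger.record(workflow, mid_cost)
    if passes_check(draft):
        return draft, "mid", True
    answer = frontier_model(prompt)
    ledger.record(workflow, frontier_cost)
    # Verify the frontier answer too: no model is trusted blindly.
    return answer, "frontier", passes_check(answer)


if __name__ == "__main__":
    ledger = CostLedger()
    result = answer_with_escalation(
        "summarize the ticket",
        mid_model=lambda p: "",               # stub: mid-tier returns nothing
        frontier_model=lambda p: "summary",   # stub: frontier succeeds
        passes_check=bool,                    # stub check: non-empty answer
        ledger=ledger, workflow="support",
        mid_cost=0.01, frontier_cost=0.10,
    )
    print(result, ledger.total("support"))
```

Recording the cost of the failed mid-tier attempt alongside the escalation keeps the per-workflow total honest, which matters when deciding whether escalation happens often enough to justify defaulting to the frontier tier.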