Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Evaluate and monitor AI agent fleets across six key dimensions to score health, identify issues, and optimize performance for ops teams managing 1-100+ agents.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
Score, monitor, and troubleshoot AI agent fleets in production. Built for ops teams running 1-100+ agents.
Evaluates your agent deployment across six dimensions and returns a 0-100 health score with specific fixes.
1. Execution Visibility: Can you see what every agent is doing right now?
   - Signals: task queue depth, active/idle ratio, error rates
   - Benchmark: top-quartile teams track 95%+ of agent actions in real time
2. Cost Attribution: Do you know exactly what each agent costs per task?
   - Signals: token spend, API calls, compute time, tool invocations
   - Benchmark: unmonitored agents waste 30-55% on retries and hallucination loops
3. Output Quality: Are agent outputs validated before reaching users or systems?
   - Signals: accuracy sampling, hallucination detection, regression tracking
   - Benchmark: 1 in 12 agent outputs contains a material error without monitoring
4. Failure Recovery: What happens when an agent fails mid-task?
   - Signals: retry logic, graceful degradation, human escalation paths
   - Benchmark: mean time to detect an agent failure without monitoring is 4.2 hours
5. Security & Boundaries: Are agents staying within authorized scope?
   - Signals: tool access auditing, data exfiltration checks, permission drift
   - Benchmark: 23% of production agents access tools outside their intended scope
6. Fleet Coordination: Do multi-agent workflows hand off cleanly?
   - Signals: message passing reliability, deadlock detection, duplicate work
   - Benchmark: uncoordinated fleets duplicate 18-25% of work
| Score  | Rating           | Action                         |
|--------|------------------|--------------------------------|
| 80-100 | Production-grade | Optimize and scale             |
| 60-79  | Operational      | Fix gaps before scaling        |
| 40-59  | Risky            | Immediate remediation needed   |
| 0-39   | Blind            | Stop scaling, instrument first |
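How the dimension scores might roll up into the overall rating can be sketched as follows. This is an illustrative assumption, not the skill package's actual scoring code: the equal weighting, function names, and example values are all hypothetical; only the six dimensions and the rating bands come from the tables above.

```python
# Hypothetical sketch: combine six dimension scores (each 0-100) into an
# overall fleet health score, then map it to the rating bands above.
# Equal weighting is an assumption; the package may weight dimensions.

DIMENSIONS = [
    "execution_visibility", "cost_attribution", "output_quality",
    "failure_recovery", "security_boundaries", "fleet_coordination",
]

def health_score(scores: dict) -> float:
    """Average the six dimension scores into a 0-100 health score."""
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)

def rating(score: float) -> str:
    """Map a 0-100 score to the rating bands in the table above."""
    if score >= 80:
        return "Production-grade"
    if score >= 60:
        return "Operational"
    if score >= 40:
        return "Risky"
    return "Blind"

# Illustrative fleet: strong visibility and security, weak failure recovery.
example = dict(zip(DIMENSIONS, [85, 70, 60, 55, 90, 65]))
overall = health_score(example)
print(round(overall, 1), rating(overall))  # averages to ~70.8, "Operational"
```

A weighted average (e.g. weighting Security & Boundaries higher for regulated industries, per the industry table below) would be a natural refinement.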
| Company Size  | Unmonitored Waste | Monitoring Investment | Net Savings     |
|---------------|-------------------|-----------------------|-----------------|
| 1-5 agents    | $2K-$8K/mo        | $500-$1K/mo           | $1.5K-$7K/mo    |
| 5-20 agents   | $8K-$45K/mo       | $2K-$5K/mo            | $6K-$40K/mo     |
| 20-100 agents | $45K-$200K/mo     | $8K-$20K/mo           | $37K-$180K/mo   |
- Weeks 1-2: Inventory all agents, document intended scope, tag cost centers
- Weeks 3-4: Deploy execution logging (every tool call, every output)
- Month 2: Build dashboards: cost per task, error rate, latency P95
- Month 3: Automated alerting: failure detection in under 5 minutes, cost anomaly flags, scope violations
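The Month-3 alerting rules above can be sketched as a single per-agent check. Everything here is a hypothetical illustration: the record fields, the 3x cost-spike threshold, and the `check_agent` helper are assumptions, not part of the skill package; only the three alert categories (failure detection under 5 minutes, cost anomalies, scope violations) come from the rollout plan.

```python
# Hypothetical sketch of the Month-3 alerting rules: flag agents with a
# stale heartbeat (failure detection target: under 5 minutes), an hourly
# spend spike above a baseline multiple, or tool calls outside their
# declared scope. Field names and thresholds are illustrative assumptions.
from datetime import datetime, timedelta, timezone

FAILURE_WINDOW = timedelta(minutes=5)
COST_SPIKE_MULTIPLIER = 3.0  # assumed threshold for "cost anomaly"

def check_agent(agent: dict, now: datetime) -> list:
    alerts = []
    if now - agent["last_heartbeat"] > FAILURE_WINDOW:
        alerts.append("possible failure: no heartbeat in 5 min")
    if agent["hourly_cost_usd"] > COST_SPIKE_MULTIPLIER * agent["baseline_hourly_cost_usd"]:
        alerts.append("cost anomaly: spend above 3x baseline")
    out_of_scope = set(agent["tools_used"]) - set(agent["allowed_tools"])
    if out_of_scope:
        alerts.append(f"scope violation: {sorted(out_of_scope)}")
    return alerts

# Illustrative agent record that trips all three rules.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
agent = {
    "last_heartbeat": now - timedelta(minutes=7),
    "hourly_cost_usd": 12.0,
    "baseline_hourly_cost_usd": 3.0,
    "tools_used": ["search", "shell"],
    "allowed_tools": ["search"],
}
for alert in check_agent(agent, now):
    print(alert)
```

In practice these checks would run on a schedule against the execution logs deployed in Weeks 3-4, so the failure-detection window is bounded by the check interval.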
- Logging only errors (you miss the slow degradation)
- No cost attribution (agents burn budget invisibly)
- Monitoring agents like servers (they need task-level observability)
- Manual review of agent outputs (doesn't scale past 3 agents)
- No baseline metrics (you can't detect regression without a baseline)
- Alerting on everything (alert fatigue kills response time)
- Skipping agent-to-agent handoff monitoring (where most fleet failures happen)
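The "no baseline metrics" pitfall has a simple remedy that can be sketched as a comparison of recent metrics against a recorded baseline. The 2x drift threshold and the `regressed` helper are illustrative assumptions, not part of the skill package.

```python
# Hypothetical sketch of baseline-based regression detection: record a
# baseline error rate per agent, then flag when a recent window drifts
# well above it. The 2x threshold is an illustrative assumption.

def regressed(baseline_error_rate: float,
              recent_error_rates: list,
              threshold: float = 2.0) -> bool:
    """True when the recent average error rate exceeds threshold x baseline."""
    recent = sum(recent_error_rates) / len(recent_error_rates)
    return recent > threshold * baseline_error_rate

# Baseline 2% error rate; recent window averages ~4.7%, above the 4% trigger.
print(regressed(0.02, [0.03, 0.05, 0.06]))
```

The same pattern applies to latency and cost-per-task: without the Weeks 1-2 baseline, none of these comparisons are possible.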
| Industry              | Critical Dimension    | Why                                   |
|-----------------------|-----------------------|---------------------------------------|
| Financial Services    | Security & Boundaries | Regulatory audit trails mandatory     |
| Healthcare            | Output Quality        | Clinical accuracy non-negotiable      |
| Legal                 | Execution Visibility  | Billing requires task-level tracking  |
| Ecommerce             | Cost Attribution      | Margin-sensitive, waste kills profit  |
| SaaS                  | Fleet Coordination    | Multi-tenant agent isolation          |
| Manufacturing         | Failure Recovery      | Downtime = production line stops      |
| Construction          | Security & Boundaries | Safety-critical document handling     |
| Real Estate           | Output Quality        | Valuation errors = liability          |
| Recruitment           | Fleet Coordination    | Candidate pipeline handoffs           |
| Professional Services | Cost Attribution      | Client billing accuracy               |
- AI Agent Context Packs (industry-specific decision frameworks): https://afrexai-cto.github.io/context-packs/
- AI Revenue Leak Calculator (find where your business loses money to manual processes): https://afrexai-cto.github.io/ai-revenue-calculator/
- Agent Setup Wizard (configure your agent stack in 5 minutes): https://afrexai-cto.github.io/agent-setup/

Built by AfrexAI. We help businesses run AI agents that actually make money.