
AI Agent Observability

Evaluate and monitor AI agent fleets across six key dimensions to score health, identify issues, and optimize performance for ops teams managing 1-100+ agents.



Install for OpenClaw

Quick setup
  1. Download the package from Yavira.
  2. Extract the archive and review SKILL.md first.
  3. Import or place the package into your OpenClaw setup.

Requirements

Target platform
OpenClaw
Install method
Manual import
Extraction
Extract archive
Prerequisites
OpenClaw
Primary doc
SKILL.md

Package facts

Download mode
Yavira redirect
Package format
ZIP package
Source platform
Tencent SkillHub
What's included
README.md, SKILL.md

Validation

  • Use the Yavira download entry.
  • Review SKILL.md after the package is downloaded.
  • Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief, rather than working through the install steps by hand.

  1. Download the package from Yavira.
  2. Extract it into a folder your agent can access.
  3. Paste one of the prompts below and point your agent at the extracted folder.
New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

Source
Tencent SkillHub
Verification
Indexed source record
Version
1.1.0

Documentation

Primary doc: SKILL.md (15 sections)

Agent Observability & Monitoring

Score, monitor, and troubleshoot AI agent fleets in production. Built for ops teams running 1-100+ agents.

What This Does

Evaluates your agent deployment across 6 dimensions and returns a 0-100 health score with specific fixes.

1. Execution Visibility (0-20 pts)

Can you see what every agent is doing right now?
Signals: task queue depth, active/idle ratio, error rates.
Benchmark: top quartile tracks 95%+ of agent actions in real time.
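As a minimal sketch, fleet-level visibility can start from per-agent status records. `AgentStatus` and its fields are illustrative names, not part of this skill's actual API:

```python
from dataclasses import dataclass

@dataclass
class AgentStatus:
    agent_id: str
    state: str        # "active" | "idle" | "error"
    queue_depth: int  # tasks waiting on this agent

def fleet_snapshot(statuses):
    """Summarize what the fleet is doing right now."""
    total = len(statuses)
    active = sum(1 for s in statuses if s.state == "active")
    errors = sum(1 for s in statuses if s.state == "error")
    return {
        "agents": total,
        "active_ratio": active / total if total else 0.0,
        "error_rate": errors / total if total else 0.0,
        "max_queue_depth": max((s.queue_depth for s in statuses), default=0),
    }
```

A real deployment would feed this from heartbeat events rather than a static list, but the ratios are the same ones the benchmark above refers to.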

2. Cost Attribution (0-20 pts)

Do you know exactly what each agent costs per task?
Signals: token spend, API calls, compute time, tool invocations.
Benchmark: unmonitored agents waste 30-55% of spend on retries and hallucination loops.
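Per-task cost attribution can be sketched like this. The rates below are placeholder values, not any provider's real pricing; substitute your own:

```python
# Placeholder per-unit rates (USD) -- substitute your provider's pricing.
RATES = {"input_token": 0.000003, "output_token": 0.000015, "tool_call": 0.001}

def task_cost(input_tokens, output_tokens, tool_calls):
    """Cost of a single task from its token and tool usage."""
    return (input_tokens * RATES["input_token"]
            + output_tokens * RATES["output_token"]
            + tool_calls * RATES["tool_call"])

def attribute_costs(task_log):
    """Roll task-level costs up to each agent."""
    totals = {}
    for rec in task_log:
        cost = task_cost(rec["in"], rec["out"], rec["tools"])
        totals[rec["agent"]] = totals.get(rec["agent"], 0.0) + cost
    return totals
```

The point of the roll-up is that waste (retries, loops) shows up as an agent whose cost per completed task drifts upward even when its task count does not.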

3. Output Quality (0-15 pts)

Are agent outputs validated before reaching users or systems?
Signals: accuracy sampling, hallucination detection, regression tracking.
Benchmark: without monitoring, 1 in 12 agent outputs contains a material error.

4. Failure Recovery (0-15 pts)

What happens when an agent fails mid-task?
Signals: retry logic, graceful degradation, human escalation paths.
Benchmark: mean time to detect an agent failure without monitoring is 4.2 hours.
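The retry-then-escalate pattern named above might look like this minimal sketch; `escalate` is a hypothetical hook you would wire to your paging or ticketing system:

```python
import time

def run_with_recovery(task, attempts=3, base_delay=0.01, escalate=print):
    """Retry a failing task with exponential backoff, then escalate
    to a human channel if every attempt fails."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == attempts:
                escalate(f"agent task failed after {attempts} attempts: {exc}")
                raise
            # Back off: base_delay, 2x, 4x, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
```

Graceful degradation would slot in before the final `raise` (e.g. return a cached or reduced-quality result), but the escalation path is the part that cuts detection time from hours to minutes.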

5. Security & Boundaries (0-15 pts)

Are agents staying within authorized scope?
Signals: tool access auditing, data exfiltration checks, permission drift.
Benchmark: 23% of production agents access tools outside their intended scope.

6. Fleet Coordination (0-15 pts)

Do multi-agent workflows hand off cleanly?
Signals: message-passing reliability, deadlock detection, duplicate work.
Benchmark: uncoordinated fleets duplicate 18-25% of work.
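Duplicate-work detection, one of the coordination signals above, can be sketched as an ownership check over handoff records; the `(task_id, agent)` record shape is an assumption:

```python
def detect_duplicates(handoffs):
    """Flag task ids claimed by more than one agent (duplicate work).
    `handoffs` is an iterable of (task_id, agent) claim events."""
    owners = {}
    dupes = []
    for task_id, agent in handoffs:
        if task_id in owners and owners[task_id] != agent:
            dupes.append((task_id, owners[task_id], agent))
        else:
            owners[task_id] = agent
    return dupes
```

Running this over a day's handoff log gives a direct estimate of the 18-25% duplication figure for your own fleet.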

Scoring

| Score  | Rating           | Action                        |
|--------|------------------|-------------------------------|
| 80-100 | Production-grade | Optimize and scale            |
| 60-79  | Operational      | Fix gaps before scaling       |
| 40-59  | Risky            | Immediate remediation needed  |
| 0-39   | Blind            | Stop scaling, instrument first |
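The per-dimension caps and rating bands above can be sketched in code; the dimension keys are illustrative names, not part of the skill's actual interface:

```python
# Caps from the six dimension sections above (sum to 100).
MAX_POINTS = {
    "execution_visibility": 20, "cost_attribution": 20,
    "output_quality": 15, "failure_recovery": 15,
    "security_boundaries": 15, "fleet_coordination": 15,
}

def health_score(points):
    """Sum per-dimension points, enforcing each dimension's cap."""
    total = 0
    for dim, cap in MAX_POINTS.items():
        p = points.get(dim, 0)
        if not 0 <= p <= cap:
            raise ValueError(f"{dim} must be between 0 and {cap}")
        total += p
    return total

def rating(score):
    """Map a 0-100 health score to the rating bands in the table above."""
    if score >= 80:
        return "Production-grade"
    if score >= 60:
        return "Operational"
    if score >= 40:
        return "Risky"
    return "Blind"
```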

Quick Assessment Prompt

Ask the agent to evaluate your setup with a prompt like:

Run the agent observability assessment against our current deployment:
  • How many agents are running?
  • What monitoring exists today?
  • What broke in the last 30 days?
  • What's our monthly agent spend?
  • Who gets alerted when an agent fails?

Cost Framework

| Company Size  | Unmonitored Waste | Monitoring Investment | Net Savings      |
|---------------|-------------------|-----------------------|------------------|
| 1-5 agents    | $2K-$8K/mo        | $500-$1K/mo           | $1.5K-$7K/mo     |
| 5-20 agents   | $8K-$45K/mo       | $2K-$5K/mo            | $6K-$40K/mo      |
| 20-100 agents | $45K-$200K/mo     | $8K-$20K/mo           | $37K-$180K/mo    |
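The net-savings column is each end of the waste range minus the matching end of the investment range, which is easy to verify:

```python
def net_savings_range(waste, investment):
    """Net monthly savings band: low end of waste minus low end of
    investment, high end minus high end (USD/month)."""
    return (waste[0] - investment[0], waste[1] - investment[1])
```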

90-Day Monitoring Roadmap

  • Weeks 1-2: inventory all agents, document intended scope, tag cost centers
  • Weeks 3-4: deploy execution logging (every tool call, every output)
  • Month 2: build dashboards for cost per task, error rate, and P95 latency
  • Month 3: automated alerting, with failure detection under 5 minutes, cost anomaly flags, and scope violations
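The Weeks 3-4 step (log every tool call) can be sketched as a decorator that records timing and outcome per call. The JSON record shape is an assumption, not a prescribed format:

```python
import functools
import json
import time

def logged_tool(agent_id, log=print):
    """Wrap a tool function so every invocation is recorded
    with agent id, tool name, outcome, and duration."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.time()
            record = {"agent": agent_id, "tool": fn.__name__}
            try:
                result = fn(*args, **kwargs)
                record["status"] = "ok"
                return result
            except Exception as exc:
                record["status"] = "error"
                record["error"] = str(exc)
                raise
            finally:
                record["duration_ms"] = round((time.time() - start) * 1000, 2)
                log(json.dumps(record))
        return wrapper
    return decorate
```

In production, `log` would point at your structured-logging pipeline instead of `print`; the key property is that every call is recorded, not just failures.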

7 Monitoring Mistakes

  1. Logging only errors (you miss slow degradation)
  2. No cost attribution (agents burn budget invisibly)
  3. Monitoring agents like servers (they need task-level observability)
  4. Manual review of agent outputs (doesn't scale past 3 agents)
  5. No baseline metrics (you can't detect regression without a baseline)
  6. Alerting on everything (alert fatigue kills response time)
  7. Skipping agent-to-agent handoff monitoring (where most fleet failures happen)
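Mistakes 5 and 6 (no baseline, alert-on-everything) both argue for baseline-relative alerting. A minimal sketch using a rolling mean and standard deviation over recent daily spend:

```python
from statistics import mean, stdev

def cost_anomalies(daily_costs, window=7, threshold=3.0):
    """Return indices of days whose spend exceeds the mean of the
    preceding `window` days by more than `threshold` standard deviations."""
    flags = []
    for i in range(window, len(daily_costs)):
        base = daily_costs[i - window:i]
        mu, sigma = mean(base), stdev(base)
        if sigma and daily_costs[i] > mu + threshold * sigma:
            flags.append(i)
    return flags
```

The same baseline-plus-threshold shape works for error rate and latency; the threshold is what keeps alert volume low enough to act on.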

Industry Adjustments

| Industry              | Critical Dimension    | Why                                |
|-----------------------|-----------------------|------------------------------------|
| Financial Services    | Security & Boundaries | Regulatory audit trails mandatory  |
| Healthcare            | Output Quality        | Clinical accuracy non-negotiable   |
| Legal                 | Execution Visibility  | Billing requires task-level tracking |
| Ecommerce             | Cost Attribution      | Margin-sensitive, waste kills profit |
| SaaS                  | Fleet Coordination    | Multi-tenant agent isolation       |
| Manufacturing         | Failure Recovery      | Downtime = production line stops   |
| Construction          | Security & Boundaries | Safety-critical document handling  |
| Real Estate           | Output Quality        | Valuation errors = liability       |
| Recruitment           | Fleet Coordination    | Candidate pipeline handoffs        |
| Professional Services | Cost Attribution      | Client billing accuracy            |

Go Deeper

  • AI Agent Context Packs (industry-specific decision frameworks): https://afrexai-cto.github.io/context-packs/
  • AI Revenue Leak Calculator (find where your business loses money to manual processes): https://afrexai-cto.github.io/ai-revenue-calculator/
  • Agent Setup Wizard (configure your agent stack in 5 minutes): https://afrexai-cto.github.io/agent-setup/

Built by AfrexAI: we help businesses run AI agents that actually make money.

Category context

Long-tail utilities that do not fit the current primary taxonomy cleanly.

Source: Tencent SkillHub

Largest current source with strong distribution and engagement signals.

Package contents

Included in package
2 Docs
  • SKILL.md Primary doc
  • README.md Docs