
Production Readiness

Meta-skill that orchestrates logging, monitoring, error handling, performance, security, deployment, and testing skills to ensure a service is fully production-ready before launch. Use before first deploy, major releases, quarterly reviews, or after incidents.


Install for OpenClaw

Quick setup
  1. Download the package from Yavira.
  2. Extract the archive and review SKILL.md first.
  3. Import or place the package into your OpenClaw setup.

Requirements

Target platform
OpenClaw
Install method
Manual import
Extraction
Extract archive
Prerequisites
OpenClaw
Primary doc
SKILL.md

Package facts

Download mode
Yavira redirect
Package format
ZIP package
Source platform
Tencent SkillHub
What's included
README.md, SKILL.md

Validation

  • Use the Yavira download entry.
  • Review SKILL.md after the package is downloaded.
  • Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

  1. Download the package from Yavira.
  2. Extract it into a folder your agent can access.
  3. Paste one of the prompts below and point your agent at the extracted folder.
New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

Source
Tencent SkillHub
Verification
Indexed source record
Version
1.0.0

Documentation


Production Readiness (Meta-Skill)

Coordinates all operational concerns into a single readiness review. Instead of duplicating domain expertise, this skill routes to specialized skills and agents for each area, then synthesizes results into a unified go/no-go assessment.

OpenClaw / Moltbot / Clawbot

npx clawhub@latest install production-readiness

Purpose

Ensure a service is production-ready by systematically checking every operational concern (logging, error handling, performance, security, deployment, testing, and documentation) before traffic hits it. A production-ready service:

  • Fails gracefully under load and partial outages
  • Observes itself with structured logs, metrics, and traces
  • Recovers automatically from transient failures
  • Communicates health to orchestrators and operators
  • Documents operations so on-call engineers can respond without tribal knowledge

When to Use

| Trigger | Context |
|---------|---------|
| Before first deploy | New service going to production for the first time |
| Before major release | Significant feature or architectural change shipping |
| Quarterly production review | Scheduled audit of existing services |
| After incident | Post-incident hardening to prevent recurrence |
| Dependency upgrade | Major framework, runtime, or infrastructure change |
| Team handoff | Transferring ownership of a service to another team |

Orchestration Flow

Run each area sequentially or in parallel. Each step delegates to a specialized skill or agent; this skill does not re-implement their logic.

Production Readiness Review

  1. Logging & Observability → logging-observability skill
  2. Error Handling → error-handling-patterns skill
  3. Performance → performance-agent
  4. Security → security-review meta-skill
  5. Deployment → deployment-agent + docker-expert skill
  6. Testing → testing-workflow meta-skill
  7. Documentation → /generate-docs command

Then synthesize the results into a go/no-go report.

Step Details

  1. Logging & Observability: structured logging, log levels, correlation IDs, metrics endpoints, distributed tracing, alerting rules
  2. Error Handling: global error boundaries, retry policies, dead-letter queues, error classification, user-facing error messages
  3. Performance: load testing results, P95/P99 latency baselines, memory/CPU profiling, database query analysis, caching strategy
  4. Security: auth/authz verification, input validation, dependency audit, secrets management, OWASP Top 10 review
  5. Deployment: container hardening, rollback strategy, blue-green/canary configuration, infrastructure-as-code review
  6. Testing: unit/integration/e2e coverage, contract tests, chaos/failure injection, smoke tests in staging
  7. Documentation: API docs, runbooks, architecture diagrams, on-call playbooks, ADRs for key decisions

Skill Routing Table

| Concern | Skill / Agent | Path |
|---------|---------------|------|
| Logging & Observability | logging-observability | ai/skills/tools/logging-observability/SKILL.md |
| Error Handling | error-handling-patterns | ai/skills/backend/error-handling-patterns/SKILL.md |
| Performance | performance-agent | ai/agents/performance/ |
| Security | security-review | ai/skills/meta/security-review/SKILL.md |
| Deployment (containers) | docker-expert | ai/skills/devops/docker/SKILL.md |
| Deployment (pipelines) | deployment-agent | ai/agents/deployment/ |
| Testing | testing-workflow | ai/skills/testing/testing-workflow/SKILL.md |
| Rate Limiting | rate-limiting-patterns | ai/skills/backend/rate-limiting-patterns/SKILL.md |
| Documentation | /generate-docs | ai/commands/documentation/ |

Routing rule: Read the target skill first, follow its instructions, then return results here for synthesis.
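
The synthesis step could be sketched roughly as follows. This is an illustration only, not the skill's own logic: the `AreaResult` type, the `synthesize` function, and the rule that any blocking finding or unreviewed concern yields a no-go are all assumptions. Only the concern names come from the orchestration flow.

```python
from dataclasses import dataclass, field

# Concern names follow the orchestration flow; everything else here is hypothetical.
CONCERNS = [
    "Logging & Observability",
    "Error Handling",
    "Performance",
    "Security",
    "Deployment",
    "Testing",
    "Documentation",
]


@dataclass
class AreaResult:
    concern: str
    blocking: list[str] = field(default_factory=list)  # must be fixed before launch
    advisory: list[str] = field(default_factory=list)  # can be scheduled for later


def synthesize(results: list[AreaResult]) -> dict:
    """Combine per-area results into one go/no-go summary.

    Assumed rule: any blocking finding, or any concern without a result, means no-go.
    """
    reviewed = {r.concern for r in results}
    missing = [c for c in CONCERNS if c not in reviewed]
    blockers = {r.concern: r.blocking for r in results if r.blocking}
    return {
        "decision": "go" if not missing and not blockers else "no-go",
        "missing": missing,
        "blockers": blockers,
    }
```

Treating an unreviewed concern as a blocker keeps a skipped step from silently passing the review; a team could relax that rule if some areas do not apply to a given service.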

Health & Lifecycle

  • Health check endpoint (/healthz or /health) returns dependency status
  • Readiness probe distinguishes "starting" from "ready to serve"
  • Liveness probe detects deadlocks and unrecoverable states
  • Graceful shutdown drains in-flight requests before exit
  • Startup probe allows for slow initialization without false restarts
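
A minimal sketch of how liveness, readiness, and dependency health might be separated, using only the Python standard library. The `/livez` and `/readyz` paths, the `STATE` dictionary, and `check_dependencies` are assumptions made for illustration (the checklist itself only names `/healthz` and `/health`), and the liveness check here only proves the process can answer, which is weaker than real deadlock detection.

```python
import json
from http.server import BaseHTTPRequestHandler

# Hypothetical process state; a real service would update this from its
# startup and shutdown hooks.
STATE = {"started": False, "shutting_down": False}


def check_dependencies() -> dict:
    """Placeholder checks; replace with real pings to the database, cache, etc."""
    return {"database": True, "cache": True}


def health_response(path: str) -> tuple[int, dict]:
    """Return (HTTP status, JSON body) for the probe paths."""
    if path == "/livez":
        # Liveness: the process is running and able to answer at all.
        return 200, {"status": "alive"}
    if path == "/readyz":
        # Readiness: started, not draining, and every dependency reachable.
        deps = check_dependencies()
        ready = STATE["started"] and not STATE["shutting_down"] and all(deps.values())
        body = {"status": "ready" if ready else "not ready", "dependencies": deps}
        return (200 if ready else 503), body
    if path == "/healthz":
        # Health: report dependency status regardless of lifecycle phase.
        deps = check_dependencies()
        return (200 if all(deps.values()) else 503), {"dependencies": deps}
    return 404, {"error": "unknown path"}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = health_response(self.path)
        payload = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

Setting `STATE["shutting_down"] = True` at the start of shutdown makes the readiness probe fail first, so an orchestrator can stop routing new traffic while in-flight requests drain.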

Resilience

  • Circuit breakers on all external service calls
  • Retry with exponential backoff and jitter on transient failures
  • Rate limiting configured per endpoint and per client
  • Backpressure mechanisms prevent cascade failures under load
  • Timeouts set on every outbound call (HTTP, DB, queue)
  • Bulkhead isolation separates critical from non-critical paths
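
As one illustration of the retry item, here is a small sketch of exponential backoff with "full jitter" (a random delay between zero and the exponential ceiling). The `TransientError` class, the function names, and the default `base`, `cap`, and `max_attempts` values are assumptions, not values prescribed by this skill.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 5.0) -> float:
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retry(func, max_attempts: int = 4, sleep=time.sleep):
    """Call func(), retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted; surface the failure to the caller
            sleep(backoff_delay(attempt))
```

The jitter spreads retries from many clients over time so they do not all hit a recovering dependency at the same instant. Only errors classified as transient are retried; anything else propagates immediately.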

Configuration & Secrets

  • All configuration externalized (env vars, config service, or feature flags)
  • No secrets in code, images, or environment variable defaults
  • Secrets loaded from a vault (e.g., AWS Secrets Manager, HashiCorp Vault)
  • Configuration changes do not require redeployment
  • Feature flags in place for high-risk changes
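
The "no secrets in defaults" rule can be sketched as a loader that gives ordinary settings a default but fails fast when a secret is absent. The variable names (`PORT`, `LOG_LEVEL`, `DATABASE_PASSWORD`, `API_TOKEN`) are hypothetical, and the sketch assumes the platform injects vault-managed secrets into the environment at startup; a direct vault client would replace the environment lookups.

```python
import os


class ConfigError(RuntimeError):
    """Raised when required configuration is missing at startup."""


def load_config(env=os.environ) -> dict:
    """Read settings from the environment; secrets have no default value."""
    # Hypothetical variable names, chosen for illustration only.
    config = {
        "port": int(env.get("PORT", "8080")),       # non-secret: a default is fine
        "log_level": env.get("LOG_LEVEL", "INFO"),  # non-secret: a default is fine
    }
    required_secrets = ("DATABASE_PASSWORD", "API_TOKEN")
    missing = [name for name in required_secrets if not env.get(name)]
    if missing:
        # Fail fast at startup rather than running with a placeholder secret.
        raise ConfigError("missing required secrets: " + ", ".join(missing))
    config["database_password"] = env["DATABASE_PASSWORD"]
    config["api_token"] = env["API_TOKEN"]
    return config
```

Passing the environment as a parameter keeps the loader easy to test without touching the real process environment.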

Data Safety

  • Backup strategy defined and tested (RPO/RTO documented)
  • Restore procedure verified in a non-production environment
  • Database migrations are backward-compatible and reversible
  • Data retention policies implemented and enforced

Operational Readiness

  • Runbooks exist for the top 5 most likely failure scenarios
  • SLOs defined (availability, latency, error rate) with error budgets
  • SLAs communicated to dependent teams or customers
  • On-call rotation staffed and escalation path documented
  • Dashboards show golden signals (latency, traffic, errors, saturation)
  • Alerting rules configured with appropriate thresholds and severity
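
To make the error-budget item concrete, the arithmetic for an availability SLO can be sketched as below. The function names and the 30-day window are assumptions; a team may measure over a different window or budget on request counts instead of minutes.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over the window."""
    return (1.0 - slo) * window_days * 24 * 60


def budget_consumed(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Fraction of the error budget used so far (1.0 means exhausted)."""
    return downtime_minutes / error_budget_minutes(slo, window_days)
```

For example, a 99.9% availability SLO over 30 days allows roughly 43.2 minutes of downtime, so a 21.6-minute outage consumes about half of the budget.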

Maturity Levels

| Level | Name | Requirements |
|-------|------|--------------|
| L1 | MVP | Health check, basic logging, error handling, manual deploy, unit tests, README |
| L2 | Stable | Structured logging, metrics, graceful shutdown, CI/CD pipeline, integration tests, runbooks |
| L3 | Resilient | Distributed tracing, circuit breakers, auto-scaling, chaos testing, SLOs, on-call rotation |
| L4 | Optimized | Adaptive rate limiting, predictive alerting, canary deploys, full observability, error budgets, postmortem culture |

Progression Guidance

  • L1 → L2: Add structured logging, a metrics endpoint, and a CI/CD pipeline. Write runbooks for known failure modes.
  • L2 → L3: Instrument distributed tracing. Add circuit breakers to external calls. Define SLOs and set up on-call.
  • L3 → L4: Implement canary deployments. Adopt error budgets. Run regular game days. Build predictive alerting.
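
One way to turn the maturity table into a self-assessment is sketched below. The requirement strings are copied from the table, but the cumulative rule (a level counts only when every lower level is also complete), the `L0` label, and the `maturity_level` function are assumptions for illustration.

```python
# Requirement names are copied from the maturity table; treating the levels
# as cumulative is an assumption.
LEVELS = [
    ("L1", {"health check", "basic logging", "error handling",
            "manual deploy", "unit tests", "README"}),
    ("L2", {"structured logging", "metrics", "graceful shutdown",
            "CI/CD pipeline", "integration tests", "runbooks"}),
    ("L3", {"distributed tracing", "circuit breakers", "auto-scaling",
            "chaos testing", "SLOs", "on-call rotation"}),
    ("L4", {"adaptive rate limiting", "predictive alerting", "canary deploys",
            "full observability", "error budgets", "postmortem culture"}),
]


def maturity_level(capabilities: set[str]) -> tuple[str, set[str]]:
    """Return (highest level fully met, requirements missing for the next level)."""
    achieved = "L0"  # hypothetical label for "not yet at L1"
    for name, required in LEVELS:
        missing = required - capabilities
        if missing:
            return achieved, missing
        achieved = name
    return achieved, set()
```

The second return value doubles as a to-do list for the next level, which lines up with the progression guidance above.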

On-Call Rotation

  • Minimum two engineers per rotation (primary + secondary)
  • Handoff includes review of recent deploys, open issues, and known risks
  • Escalation targets defined: primary → secondary → engineering lead → VP Eng

Escalation Matrix

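| Severity | Response Time | Escalation After | Stakeholder Notification |
|----------|---------------|------------------|--------------------------|
| SEV-1 (outage) | 15 min | 30 min | Immediate: exec + customers |
| SEV-2 (degraded) | 30 min | 1 hour | Within 1 hour: eng lead |
| SEV-3 (minor) | 4 hours | Next business day | Daily standup |
| SEV-4 (cosmetic) | Next sprint | N/A | Backlog |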
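
The matrix can be encoded as data so that paging tooling computes deadlines instead of relying on memory. This sketch covers only SEV-1 and SEV-2, because "next business day" and "next sprint" depend on a team calendar. It also assumes both windows are measured from the time the alert fired, which the matrix does not state; the `MATRIX` and `deadlines` names are hypothetical.

```python
from datetime import datetime, timedelta

# Response and escalation windows taken from the escalation matrix.
# Assumption: both windows start when the alert fires.
MATRIX = {
    "SEV-1": {"respond": timedelta(minutes=15), "escalate": timedelta(minutes=30)},
    "SEV-2": {"respond": timedelta(minutes=30), "escalate": timedelta(hours=1)},
}


def deadlines(severity: str, alerted_at: datetime) -> dict:
    """Return the absolute response and escalation deadlines for an alert."""
    windows = MATRIX[severity]
    return {name: alerted_at + delta for name, delta in windows.items()}
```

An unknown severity raises `KeyError`, which is deliberate here: an unclassified alert should be triaged by a person rather than given a guessed deadline.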

Postmortem Template

  • ## Incident: [Title]
  • **Date:** YYYY-MM-DD | **Duration:** X hours | **Severity:** SEV-N
  • ### Summary
  • One-paragraph description of what happened and impact.
  • ### Timeline
  • HH:MM - First alert fired
  • HH:MM - Engineer paged, investigation started
  • HH:MM - Root cause identified
  • HH:MM - Mitigation applied
  • HH:MM - Full resolution confirmed
  • ### Root Cause
  • What broke and why. Link to code/config change if applicable.
  • ### Impact
  • Users affected: N
  • Revenue impact: $X (if applicable)
  • SLO budget consumed: X%
  • ### Action Items
  • | Action | Owner | Due Date | Status |
  • |--------|-------|----------|--------|
  • | Fix X | @eng | YYYY-MM-DD | Open |
  • ### Lessons Learned
  • What went well
  • What went poorly
  • Where we got lucky

NEVER Do

  • NEVER skip health checks: every service must expose health endpoints; no exceptions for "simple" services
  • NEVER store secrets in code or container images: use a secrets manager; never default env vars with real values
  • NEVER deploy without a rollback plan: if you cannot roll back in under 5 minutes, you are not ready to deploy
  • NEVER ignore error budget violations: when the error budget is exhausted, freeze feature work and fix reliability
  • NEVER treat logging as optional: a service without structured logging is a service you cannot debug at 3 AM
  • NEVER go to production without runbooks: if on-call cannot resolve the top 5 failure modes without the original author, the service is not production-ready
