{
  "schemaVersion": "1.0",
  "item": {
    "slug": "alerts",
    "name": "Alerts",
    "source": "tencent",
    "type": "skill",
    "category": "AI 智能",
    "sourceUrl": "https://clawhub.ai/ivangdavila/alerts",
    "canonicalUrl": "https://clawhub.ai/ivangdavila/alerts",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/alerts",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=alerts",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "SKILL.md"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "slug": "alerts",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-29T18:48:05.914Z",
      "expiresAt": "2026-05-06T18:48:05.914Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=alerts",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=alerts",
        "contentDisposition": "attachment; filename=\"alerts-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null,
        "slug": "alerts"
      },
      "scope": "item",
      "summary": "Item download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this item.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/alerts"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/alerts",
    "agentPageUrl": "https://openagent3.xyz/skills/alerts/agent",
    "manifestUrl": "https://openagent3.xyz/skills/alerts/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/alerts/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "Alert Fatigue Prevention",
        "body": "Group alerts by root cause, never by individual symptoms.\nUse labels: alertname, service, cluster - not instance IDs.\n\n# Good: One alert for database down affecting 50 pods\ngroup_by: ['alertname', 'service']\n# Bad: 50 individual alerts for each failed pod\n\nImplement severity hierarchy: P0 (pages immediately) > P1 (within 15min) > P2 (business hours) > P3 (weekly review).\nP0: Service completely down, data loss, security breach.\nP1: Degraded performance, partial outage, high error rates.\n\nSet cooldown periods to prevent alert spam.\nMinimum 5 minutes between identical alerts, 30 minutes for cost alerts.\n\nrepeat_interval: 5m  # For critical alerts\nrepeat_interval: 30m # For cost/performance alerts\n\nUse inhibition rules to suppress symptoms when root cause fires.\nIf \"Database Unreachable\" fires, silence all \"API High Latency\" alerts from same cluster."
      },
      {
        "title": "AI Agent Monitoring Patterns",
        "body": "Monitor token/API usage with exponential alerting thresholds.\nAlert at 2x, 5x, 10x normal usage - costs can spiral quickly.\nTrack: tokens per minute, cost per request, API rate limits approached.\n\nSet behavioral drift alerts on response quality degradation.\nCompare current outputs to baseline with sample prompts every hour.\nAlert when success rate drops below 85% or response time exceeds 2x baseline.\n\nMonitor for infinite loops in multi-agent workflows.\nAlert if same prompt sent >3 times in 5 minutes or agent hasn't responded in 10 minutes.\nInclude correlation IDs to trace conversation chains.\n\nTrack silent failures through downstream metrics.\nMonitor: tasks completed vs started, user satisfaction scores, retry attempts.\nThese catch errors that don't throw exceptions."
      },
      {
        "title": "Routing and Escalation Rules",
        "body": "Route by expertise domain, not arbitrary on-call schedules.\nDatabase alerts → DB team, API alerts → backend team, cost alerts → platform team.\nOnly escalate to managers for P0 incidents lasting >30 minutes.\n\nUse progressive escalation with increasing urgency.\nP1 alerts: Slack notification → 5min wait → SMS → 10min wait → phone call.\nInclude runbook links in every alert for faster resolution.\n\nSet context-aware routing based on time and impact.\nBusiness hours: Route to primary team. Off-hours: Route to on-call only for P0/P1.\nIf >100 users affected: Immediately escalate regardless of severity."
      },
      {
        "title": "Webhook Reliability Patterns",
        "body": "Always include correlation IDs for alert lifecycle management.\nGenerate UUID for each incident, use it to create/update/resolve alerts.\nEssential for bi-directional integrations with PagerDuty/Slack.\n\nImplement exponential backoff for webhook failures.\nRetry after 1s, 2s, 4s, 8s, 16s, then mark failed and escalate.\nLog webhook response codes/times for debugging delivery issues.\n\nUse webhook verification to prevent spoofing.\nValidate signatures using HMAC-SHA256 with shared secret.\nAlways check timestamp to prevent replay attacks (max 5 min old).\n\nImplement circuit breaker pattern for unreliable endpoints.\nAfter 5 consecutive failures, mark endpoint down and use backup channel.\nRe-test every 30 seconds until recovery confirmed."
      },
      {
        "title": "Status Page Integration",
        "body": "Update status page automatically when P0/P1 alerts fire.\nCreate incident, post initial assessment within 5 minutes.\nInclude ETA and workaround if available.\n\nUse component-based status updates matching your alert groups.\nMap alert labels to status page components (API, Database, Auth, etc.).\nPartial outages should show \"Degraded Performance\", not \"Operational\"."
      },
      {
        "title": "Runbook Automation",
        "body": "Embed runbook links directly in alert messages.\nFormat: \"Alert: High CPU on web-01. Runbook: https://wiki/runbooks/high-cpu-web\"\nLinks must be accessible from mobile devices for on-call engineers.\n\nTrigger automated remediation for known issues.\nAuto-restart stuck services, clear full disks, reset rate limits.\nAlways require human approval for destructive actions (scaling down, deleting data).\n\nLog all automated actions taken in response to alerts.\nInclude: timestamp, action, result, approval chain.\nEssential for post-incident reviews and compliance audits."
      }
    ],
    "body": "Alert Fatigue Prevention\n\nGroup alerts by root cause, never by individual symptoms. Use labels: alertname, service, cluster - not instance IDs.\n\n# Good: One alert for database down affecting 50 pods\ngroup_by: ['alertname', 'service']\n# Bad: 50 individual alerts for each failed pod\n\n\nImplement severity hierarchy: P0 (pages immediately) > P1 (within 15min) > P2 (business hours) > P3 (weekly review). P0: Service completely down, data loss, security breach. P1: Degraded performance, partial outage, high error rates.\n\nSet cooldown periods to prevent alert spam. Minimum 5 minutes between identical alerts, 30 minutes for cost alerts.\n\nrepeat_interval: 5m  # For critical alerts\nrepeat_interval: 30m # For cost/performance alerts\n\n\nUse inhibition rules to suppress symptoms when root cause fires. If \"Database Unreachable\" fires, silence all \"API High Latency\" alerts from same cluster.\n\nAI Agent Monitoring Patterns\n\nMonitor token/API usage with exponential alerting thresholds. Alert at 2x, 5x, 10x normal usage - costs can spiral quickly. Track: tokens per minute, cost per request, API rate limits approached.\n\nSet behavioral drift alerts on response quality degradation. Compare current outputs to baseline with sample prompts every hour. Alert when success rate drops below 85% or response time exceeds 2x baseline.\n\nMonitor for infinite loops in multi-agent workflows. Alert if same prompt sent >3 times in 5 minutes or agent hasn't responded in 10 minutes. Include correlation IDs to trace conversation chains.\n\nTrack silent failures through downstream metrics. Monitor: tasks completed vs started, user satisfaction scores, retry attempts. These catch errors that don't throw exceptions.\n\nRouting and Escalation Rules\n\nRoute by expertise domain, not arbitrary on-call schedules. Database alerts → DB team, API alerts → backend team, cost alerts → platform team. Only escalate to managers for P0 incidents lasting >30 minutes.\n\nUse progressive escalation with increasing urgency. P1 alerts: Slack notification → 5min wait → SMS → 10min wait → phone call. Include runbook links in every alert for faster resolution.\n\nSet context-aware routing based on time and impact. Business hours: Route to primary team. Off-hours: Route to on-call only for P0/P1. If >100 users affected: Immediately escalate regardless of severity.\n\nWebhook Reliability Patterns\n\nAlways include correlation IDs for alert lifecycle management. Generate UUID for each incident, use it to create/update/resolve alerts. Essential for bi-directional integrations with PagerDuty/Slack.\n\nImplement exponential backoff for webhook failures. Retry after 1s, 2s, 4s, 8s, 16s, then mark failed and escalate. Log webhook response codes/times for debugging delivery issues.\n\nUse webhook verification to prevent spoofing. Validate signatures using HMAC-SHA256 with shared secret. Always check timestamp to prevent replay attacks (max 5 min old).\n\nImplement circuit breaker pattern for unreliable endpoints. After 5 consecutive failures, mark endpoint down and use backup channel. Re-test every 30 seconds until recovery confirmed.\n\nStatus Page Integration\n\nUpdate status page automatically when P0/P1 alerts fire. Create incident, post initial assessment within 5 minutes. Include ETA and workaround if available.\n\nUse component-based status updates matching your alert groups. Map alert labels to status page components (API, Database, Auth, etc.). 
Partial outages should show \"Degraded Performance\", not \"Operational\".\n\nRunbook Automation\n\nEmbed runbook links directly in alert messages. Format: \"Alert: High CPU on web-01. Runbook: https://wiki/runbooks/high-cpu-web\" Links must be accessible from mobile devices for on-call engineers.\n\nTrigger automated remediation for known issues. Auto-restart stuck services, clear full disks, reset rate limits. Always require human approval for destructive actions (scaling down, deleting data).\n\nLog all automated actions taken in response to alerts. Include: timestamp, action, result, approval chain. Essential for post-incident reviews and compliance audits."
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/ivangdavila/alerts",
    "publisherUrl": "https://clawhub.ai/ivangdavila/alerts",
    "owner": "ivangdavila",
    "version": "1.0.0",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/alerts",
    "downloadUrl": "https://openagent3.xyz/downloads/alerts",
    "agentUrl": "https://openagent3.xyz/skills/alerts/agent",
    "manifestUrl": "https://openagent3.xyz/skills/alerts/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/alerts/agent.md"
  }
}