Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
AWS ECS production health monitoring with CloudWatch log analysis — monitors ECS service health, ALB targets, SSL certificates, and provides deep CloudWatch...
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Production health monitoring and log analysis for AWS ECS services.
- Health Checks: HTTP probes against your domain, ECS service status (desired vs running), ALB target group health, SSL certificate expiry
- Log Analysis: pulls CloudWatch logs, categorizes errors (panics, fatals, OOM, timeouts, 5xx), detects container restarts, filters health check noise
- Auto-Diagnosis: reads health status and automatically investigates failing services via log analysis
- aws CLI configured with appropriate IAM permissions: ecs:ListServices, ecs:DescribeServices, elasticloadbalancing:DescribeTargetGroups, elasticloadbalancing:DescribeTargetHealth, logs:FilterLogEvents, logs:DescribeLogGroups
- curl for HTTP health checks
- python3 for JSON processing and log analysis
- openssl for SSL certificate checks (optional)
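Before running the scripts, it can help to confirm these tools are actually on PATH. A minimal sketch (this check is not part of the package itself):

```shell
#!/usr/bin/env sh
# Report which of the given tools are missing from PATH.
check_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  printf '%s' "$missing"
}

missing_required=$(check_tools aws curl python3)
if [ -n "$missing_required" ]; then
  echo "missing required tools:$missing_required"
else
  echo "required tools present"
fi
# openssl is optional: only SSL certificate expiry checks need it.
command -v openssl >/dev/null 2>&1 || echo "openssl not found: SSL expiry checks will be skipped"
```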
All configuration is via environment variables:

| Variable | Required | Default | Description |
|---|---|---|---|
| ECS_CLUSTER | Yes | — | ECS cluster name |
| ECS_REGION | No | us-east-1 | AWS region |
| ECS_DOMAIN | No | — | Domain for HTTP/SSL checks (skip if unset) |
| ECS_SERVICES | No | auto-detect | Comma-separated service names to monitor |
| ECS_HEALTH_STATE | No | ./data/ecs-health.json | Path to write health state JSON |
| ECS_HEALTH_OUTDIR | No | ./data/ | Output directory for logs and alerts |
| ECS_LOG_PATTERN | No | /ecs/{service} | CloudWatch log group pattern ({service} is replaced) |
| ECS_HTTP_ENDPOINTS | No | — | Comma-separated name=url pairs for HTTP probes |
- ECS_HEALTH_STATE (default: ./data/ecs-health.json) — health state JSON file
- ECS_HEALTH_OUTDIR (default: ./data/) — output directory for logs, alerts, and analysis reports
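The ECS_HTTP_ENDPOINTS value is a comma-separated list of name=url pairs; a sketch of how such a value splits apart in plain shell (the endpoint names and URLs here are illustrative, not defaults):

```shell
#!/usr/bin/env sh
# Illustrative endpoint list: two probes against hypothetical URLs.
ECS_HTTP_ENDPOINTS="api=https://example.com/healthz,web=https://example.com/"

# Split on commas, then peel each pair into name and url.
old_ifs=$IFS
IFS=','
for pair in $ECS_HTTP_ENDPOINTS; do
  name=${pair%%=*}   # text before the first '='
  url=${pair#*=}     # text after the first '='
  echo "$name -> $url"
done
IFS=$old_ifs
```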
```shell
# Full check
ECS_CLUSTER=my-cluster ECS_DOMAIN=example.com ./scripts/ecs-health.sh

# JSON output only
ECS_CLUSTER=my-cluster ./scripts/ecs-health.sh --json

# Quiet mode (no alerts, just status file)
ECS_CLUSTER=my-cluster ./scripts/ecs-health.sh --quiet
```

Exit codes: 0 = healthy, 1 = unhealthy/degraded, 2 = script error.
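The exit codes make the check easy to wire into cron or CI. A minimal wrapper sketch (the status messages are placeholders, not output from the package):

```shell
#!/usr/bin/env sh
# Map the documented exit codes to human-readable status lines.
report() {
  case "$1" in
    0) echo "healthy" ;;
    1) echo "unhealthy/degraded -- investigate logs" ;;
    2) echo "script error -- check AWS credentials and permissions" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Invoke the check if the script is present (adjust the path to your install).
if [ -x ./scripts/ecs-health.sh ]; then
  ECS_CLUSTER=my-cluster ./scripts/ecs-health.sh --quiet
  report $?
fi
```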
```shell
# Pull raw logs from a service
ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh pull my-api --minutes 30

# Show errors across all services
ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh errors all --minutes 120

# Deep analysis with error categorization
ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh diagnose --minutes 60

# Detect container restarts
ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh restarts my-api

# Auto-diagnose from health state file
ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh auto-diagnose

# Summary across all services
ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh summary --minutes 120
```

Options: --minutes N (default: 60), --json, --limit N (default: 200), --verbose.
When ECS_SERVICES is not set, both scripts auto-detect services from the cluster:

```shell
aws ecs list-services --cluster $ECS_CLUSTER
```

Log groups are resolved by pattern (default /ecs/{service}). Override with ECS_LOG_PATTERN:

```shell
# If your log groups are /ecs/prod/my-api, /ecs/prod/my-frontend, etc.
ECS_LOG_PATTERN="/ecs/prod/{service}" ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh diagnose
```
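The {service} substitution is plain string replacement; a minimal standalone sketch of how a log group name resolves (not taken from the scripts themselves, which may implement it differently):

```shell
#!/usr/bin/env sh
# Resolve a CloudWatch log group name from the pattern for one service.
pattern="${ECS_LOG_PATTERN:-}"
[ -n "$pattern" ] || pattern="/ecs/{service}"   # documented default
service="my-api"

# Substitute the {service} placeholder (ECS service names contain no sed metacharacters).
log_group=$(printf '%s\n' "$pattern" | sed "s/{service}/$service/")
echo "$log_group"
```

With the default pattern this prints /ecs/my-api; with ECS_LOG_PATTERN="/ecs/prod/{service}" it prints /ecs/prod/my-api.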
The health monitor can trigger the log analyzer for auto-diagnosis when issues are detected. Set ECS_HEALTH_OUTDIR to a shared directory and run both scripts together:

```shell
export ECS_CLUSTER=my-cluster
export ECS_DOMAIN=example.com
export ECS_HEALTH_OUTDIR=./data

# Run health check (auto-triggers log analysis on failure)
./scripts/ecs-health.sh

# Or run log analysis independently
./scripts/cloudwatch-logs.sh auto-diagnose --minutes 30
```
The log analyzer classifies errors into:

- panic — Go panics
- fatal — Fatal errors
- oom — Out of memory
- timeout — Connection/request timeouts
- connection_error — Connection refused/reset
- http_5xx — HTTP 500-level responses
- python_traceback — Python tracebacks
- exception — Generic exceptions
- auth_error — Permission/authorization failures
- structured_error — JSON-structured error logs
- error — Generic ERROR-level messages

Health check noise (GET/HEAD /health from ALB) is automatically filtered from error counts and HTTP status distribution.
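The categories above boil down to pattern matching on individual log lines. A rough sketch of the idea (the match patterns here are illustrative, not the analyzer's actual rules, and cover only a few categories):

```shell
#!/usr/bin/env sh
# Classify one log line into an error category; first match wins.
classify() {
  case "$1" in
    *"panic:"*)                                   echo panic ;;
    *"fatal error"*|*FATAL*)                      echo fatal ;;
    *"Out of memory"*|*OOMKilled*)                echo oom ;;
    *"connection refused"*|*"connection reset"*)  echo connection_error ;;
    *"timed out"*|*timeout*)                      echo timeout ;;
    *ERROR*)                                      echo error ;;
    *)                                            echo unclassified ;;
  esac
}

classify 'panic: runtime error: index out of range'
classify 'dial tcp 10.0.0.5:5432: connect: connection refused'
```

Order matters: a line mentioning both a timeout and a refused connection lands in whichever category is checked first.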