Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Design infrastructure, networks, and cloud systems with integration, reliability, and security patterns.
Hand the extracted package to your coding agent with a concrete install brief rather than stepping through the installation by hand; the two briefs below cover a fresh install and an upgrade.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Core principles
- Design for failure at every layer: hardware fails, networks partition, regions go down.
- Redundancy costs money, downtime costs more: calculate the acceptable risk.
- Prefer managed services for undifferentiated work: run less, build more.
- Infrastructure as code from day one: manual changes drift and break.
- Immutable infrastructure beats patching: replace, don't repair.
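
As a rough illustration of "infrastructure as code" and "replace, don't repair", the sketch below reconciles a declared desired state against what is actually running and replaces anything that has drifted instead of patching it in place. It is plain Python; the instance dicts and the `launch`/`terminate` callables are stand-ins for whatever provider SDK or IaC tool you actually use.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Spec:
    """Desired state for one instance group, as it would live in version control."""
    image: str   # immutable machine image (AMI, container tag, ...)
    size: str
    count: int

def reconcile(desired: Spec, running: list[dict], launch, terminate) -> None:
    """Replace anything that drifted from the declared spec; never patch in place."""
    matching = [i for i in running
                if i["image"] == desired.image and i["size"] == desired.size]
    drifted = [i for i in running if i not in matching]

    # Launch replacements first, then retire drifted capacity (roughly blue/green).
    for _ in range(desired.count - len(matching)):
        launch(image=desired.image, size=desired.size)
    for inst in drifted:
        terminate(inst["id"])

# Example wiring with stub callables; real launch/terminate would call your provider's API.
if __name__ == "__main__":
    running = [{"id": "i-1", "image": "app-v41", "size": "m5.large"},
               {"id": "i-2", "image": "app-v42", "size": "m5.large"}]
    desired = Spec(image="app-v42", size="m5.large", count=2)
    reconcile(desired, running,
              launch=lambda **kw: print("launch", kw),
              terminate=lambda iid: print("terminate", iid))
```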
Cloud resources
- Multi-AZ at minimum, multi-region for critical systems: availability zones sometimes fail together.
- Right-size first, auto-scale second: the baseline must be correct.
- Reserved capacity for steady load, spot/preemptible for bursts: cost optimization requires planning.
- Egress costs add up: keep traffic within regions when possible.
- Cloud vendor lock-in is real: abstract where escape matters, accept it where it doesn't.
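
To make "reserved for steady load, spot for bursts" concrete, here is a back-of-the-envelope cost comparison. Every rate and instance count below is a made-up example, not a real provider price; the point is the shape of the calculation, not the numbers.

```python
# Rough cost model for splitting steady load onto reserved capacity and bursts onto spot.
HOURS_PER_MONTH = 730

on_demand_hr = 0.10    # $/instance-hour, hypothetical
reserved_hr = 0.06     # effective $/instance-hour with a 1-year commitment, hypothetical
spot_hr = 0.03         # $/instance-hour, interruptible, hypothetical

baseline_instances = 10   # always-on load, measured from your own metrics
burst_instances = 6       # extra capacity needed a few hours per day
burst_hours_per_day = 4

all_on_demand = (baseline_instances * HOURS_PER_MONTH
                 + burst_instances * burst_hours_per_day * 30) * on_demand_hr

mixed = (baseline_instances * HOURS_PER_MONTH * reserved_hr
         + burst_instances * burst_hours_per_day * 30 * spot_hr)

print(f"all on-demand: ${all_on_demand:,.0f}/month")
print(f"reserved + spot: ${mixed:,.0f}/month ({(1 - mixed / all_on_demand):.0%} lower)")
```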
Networking
- Private subnets for workloads, public subnets only for load balancers: minimize the attack surface.
- VPC peering and transit gateways for multi-account setups: plan the topology before scaling.
- DNS for service discovery: hardcoded IPs break migrations.
- Zero trust: authenticate and encrypt internal traffic; perimeter security isn't enough.
- Network segmentation limits blast radius: flat networks let attackers roam.
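
Planning the topology before scaling mostly means carving up address space deliberately. The sketch below uses Python's standard ipaddress module to split a hypothetical 10.0.0.0/16 VPC into per-AZ private subnets for workloads and small public subnets for load balancers; the CIDR, prefix lengths, and AZ names are illustrative.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]

# /20 blocks give 16 subnets of ~4094 usable addresses each to allocate from.
blocks = list(vpc.subnets(new_prefix=20))

plan = {}
for i, az in enumerate(azs):
    plan[az] = {
        "private": blocks[i],  # workloads live here
        # only load balancers / NAT need public addresses, so a small /24 is enough
        "public": list(blocks[len(azs) + i].subnets(new_prefix=24))[0],
    }

for az, subnets in plan.items():
    print(f"{az}: private={subnets['private']}  public={subnets['public']}")
```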
Integration patterns
- APIs for synchronous work, queues for asynchronous: match the pattern to the requirement.
- Event-driven architecture for loose coupling: producers don't know their consumers.
- Service mesh for complex microservice estates: observability and security at the network layer.
- Rate limiting and backpressure protect systems: don't let slow consumers crash fast producers.
- Dead letter queues for failed messages: don't lose data, process it later.
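
The sketch below shows backpressure and a dead letter queue in miniature, using only the standard library: a bounded queue makes slow consumers push back on producers, and messages that keep failing are parked rather than lost. A real deployment would get the same behavior from a broker such as SQS, RabbitMQ, or Kafka; the queue size and retry limit here are arbitrary.

```python
import queue

work: queue.Queue = queue.Queue(maxsize=100)   # bounded => backpressure
dead_letters: queue.Queue = queue.Queue()
MAX_ATTEMPTS = 3

def produce(msg: dict) -> None:
    # Blocks (up to the timeout) when the consumer falls behind, instead of growing unbounded.
    work.put(msg, timeout=5)

def consume(handler) -> None:
    while not work.empty():
        msg = work.get()
        try:
            handler(msg)
        except Exception:
            msg["attempts"] = msg.get("attempts", 0) + 1
            if msg["attempts"] >= MAX_ATTEMPTS:
                dead_letters.put(msg)   # keep the data for later inspection or replay
            else:
                work.put(msg)           # retry
        finally:
            work.task_done()

def handler(msg: dict) -> None:
    if msg.get("poison"):
        raise ValueError("cannot process message")

produce({"order_id": 1})
produce({"order_id": 2, "poison": True})
consume(handler)
print("dead letters:", list(dead_letters.queue))
```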
Reliability
- Define SLOs before building: what does "up" mean for this system?
- Error budgets allow controlled risk: 99.9% availability means roughly 8.8 hours of downtime per year is acceptable.
- Blast radius reduction via cell-based architecture: limit how many users one failure affects.
- Chaos engineering in staging first: break things intentionally before production breaks accidentally.
- Runbooks for every alert: 3 AM isn't debugging time.
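
Error budgets are just arithmetic on the SLO, as the small sketch below shows: how much downtime the target tolerates over a window, and how much of that budget an observed outage has burned. The SLO and outage figures are example inputs.

```python
HOURS_PER_YEAR = 24 * 365   # 8760

def allowed_downtime_hours(slo: float, window_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime the SLO tolerates over the window (e.g. 99.9% -> ~8.76 h/year)."""
    return (1.0 - slo) * window_hours

def budget_consumed(slo: float, observed_downtime_h: float, window_hours: float) -> float:
    """Fraction of the error budget already burned in this window."""
    return observed_downtime_h / allowed_downtime_hours(slo, window_hours)

slo = 0.999
print(f"yearly budget: {allowed_downtime_hours(slo):.2f} h")         # ~8.76 h
print(f"monthly budget: {allowed_downtime_hours(slo, 730):.2f} h")   # ~0.73 h
print(f"consumed this month: {budget_consumed(slo, 0.5, 730):.0%}")  # a 30-minute outage -> ~68%
```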
Disaster recovery
- RTO (recovery time) and RPO (acceptable data loss) are business decisions: architect for the requirement.
- Backups aren't recovery until they've been tested: restore regularly.
- Hot, warm, and cold standby each have trade-offs: cost versus speed of recovery.
- Cross-region replication for critical data: a single region is a single point of failure.
- DR drills reveal the real problems: that's where the plan meets reality.
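
One cheap, automatable slice of this is checking that the newest completed backup is fresh enough to satisfy the RPO at all. In the sketch below the backup timestamp is hardcoded; in practice it would come from your backup system's API, and the one-hour RPO is a placeholder for whatever the business agreed to.

```python
from datetime import datetime, timedelta, timezone

RPO = timedelta(hours=1)   # business decision: at most 1 hour of data loss

def rpo_satisfied(latest_backup_time: datetime, now: datetime | None = None) -> bool:
    """True if the most recent backup is newer than the RPO window."""
    now = now or datetime.now(timezone.utc)
    return now - latest_backup_time <= RPO

latest_backup_time = datetime.now(timezone.utc) - timedelta(minutes=40)  # stand-in value
if not rpo_satisfied(latest_backup_time):
    print("ALERT: newest backup is older than the RPO window")
else:
    print("backup recency within RPO")
```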
Security
- Defense in depth with multiple barriers: any one layer will fail.
- Least privilege for services too, not just users.
- Centralized secrets management: no secrets in code, config files, or environment variables baked into images.
- Audit logging for compliance and forensics: you'll need it after a breach.
- Patch aggressively: known vulnerabilities are actively exploited.
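
As one concrete example of centralized secrets management, the sketch below pulls a database credential from AWS Secrets Manager with boto3 at startup instead of reading it from code, config, or an image environment variable. The secret name, region, and JSON layout are placeholders; Vault, GCP Secret Manager, and similar services follow the same pattern.

```python
import json
import boto3

def load_db_credentials(secret_id: str = "prod/app/db", region: str = "us-east-1") -> dict:
    """Fetch a structured secret from the central store at process startup."""
    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_id)
    # Secrets Manager returns the payload as a string; storing it as JSON keeps it structured.
    return json.loads(response["SecretString"])

creds = load_db_credentials()
# connect(user=creds["username"], password=creds["password"], host=creds["host"])
```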
Observability
- Metrics, logs, and traces together: each tells part of the story.
- Alert on symptoms, not causes: users being down matters, high CPU might not.
- Dashboards for each service built around the golden signals: latency, traffic, errors, saturation.
- Distributed tracing across services: follow requests end to end.
- Log aggregation with a retention policy: balance cost against forensic needs.
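
The sketch below wraps a request handler and records three of the four golden signals (traffic, errors, latency) plus a per-request trace id in a structured log line; saturation usually comes from host and queue metrics instead. A real setup would export through Prometheus or OpenTelemetry rather than a dict and print(), so treat this purely as the shape of the idea.

```python
import json
import time
import uuid

signals = {"requests": 0, "errors": 0, "latency_ms": []}

def observed(handler):
    """Record traffic, errors, and latency around a handler, tagging logs with a trace id."""
    def wrapper(*args, trace_id: str | None = None, **kwargs):
        trace_id = trace_id or uuid.uuid4().hex
        start = time.perf_counter()
        signals["requests"] += 1                      # traffic
        try:
            return handler(*args, **kwargs)
        except Exception:
            signals["errors"] += 1                    # errors
            raise
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            signals["latency_ms"].append(elapsed_ms)  # latency
            print(json.dumps({"trace_id": trace_id, "handler": handler.__name__,
                              "latency_ms": round(elapsed_ms, 2)}))
    return wrapper

@observed
def get_order(order_id: int) -> dict:
    time.sleep(0.01)
    return {"order_id": order_id}

get_order(42)
print("requests:", signals["requests"], "errors:", signals["errors"])
```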
Capacity planning
- Measure the current baseline before projecting: you can't scale what you don't measure.
- Load test to find breaking points: theory differs from reality.
- Capacity leads demand: scaling takes time, so stay ahead of it.
- Cost modeling for growth scenarios: 10x users is rarely 10x cost.
- Review quarterly at minimum: patterns change.
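
Capacity leading demand is easier to reason about with a simple projection: a measured baseline, an observed growth rate, and the per-node breaking point from load tests give an estimate of when headroom runs out. All inputs in the sketch below are illustrative.

```python
import math

baseline_rps = 1_200       # measured today
monthly_growth = 0.15      # 15% month-over-month, from historical data
rps_per_node = 500         # breaking point per node, from load tests
current_nodes = 6
target_utilization = 0.6   # keep 40% headroom for spikes and node loss

for month in range(0, 13):
    projected_rps = baseline_rps * (1 + monthly_growth) ** month
    needed_nodes = math.ceil(projected_rps / (rps_per_node * target_utilization))
    flag = "  <-- add capacity before this month" if needed_nodes > current_nodes else ""
    print(f"month {month:2d}: ~{projected_rps:6.0f} rps, need {needed_nodes} nodes{flag}")
```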
Migration
- Strangler fig pattern for legacy replacement: route traffic over gradually.
- Blue-green or canary deployments for infrastructure changes: test in production safely.
- Database migrations are the hardest part: plan the data migration separately.
- Rollback plans before rollout: assume failure and prepare for it.
- Communicate maintenance windows: surprises damage trust.
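
The routing decision at the heart of a canary or strangler-fig cutover can be tiny: hash a stable identifier into a bucket and send a configurable share of traffic to the new system, so any given user sees a consistent backend. The backend names and the 10% weight below are placeholders.

```python
import hashlib

CANARY_PERCENT = 10   # dial up gradually: 1 -> 10 -> 50 -> 100

def choose_backend(user_id: str) -> str:
    """Deterministically bucket a user into 'legacy' or 'new' by hashing their id."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "new" if bucket < CANARY_PERCENT else "legacy"

sample = [f"user-{i}" for i in range(1000)]
new_share = sum(choose_backend(u) == "new" for u in sample) / len(sample)
print(f"~{new_share:.0%} of users routed to the new backend")   # should land near 10%
```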