โ† All skills
Tencent SkillHub ยท Other

Self-Hosting Mastery

Complete self-hosting and homelab operating system. Deploy, secure, monitor, and maintain self-hosted services with production-grade reliability. Use when se...


Install for OpenClaw

Quick setup
  1. Download the package from Yavira.
  2. Extract the archive and review SKILL.md first.
  3. Import or place the package into your OpenClaw setup.

Requirements

Target platform
OpenClaw
Install method
Manual import
Extraction
Extract archive
Prerequisites
OpenClaw
Primary doc
SKILL.md

Package facts

Download mode
Yavira redirect
Package format
ZIP package
Source platform
Tencent SkillHub
What's included
README.md, SKILL.md

Validation

  • Use the Yavira download entry.
  • Review SKILL.md after the package is downloaded.
  • Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

  1. Download the package from Yavira.
  2. Extract it into a folder your agent can access.
  3. Paste one of the prompts below and point your agent at the extracted folder.
New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

Source
Tencent SkillHub
Verification
Indexed source record
Version
1.0.0

Documentation

Primary doc: SKILL.md (47 sections)

Self-Hosting Mastery

Complete system for building and operating reliable self-hosted infrastructure — from first server to multi-node homelab.

Server Profile YAML

```yaml
server_profile:
  name: ""
  hardware:
    cpu: ""            # e.g., "Intel i5-12400" or "Raspberry Pi 5"
    ram_gb: 0
    storage:
      - device: ""     # e.g., "/dev/sda"
        type: ""       # ssd | hdd | nvme
        size_gb: 0
        role: ""       # boot | data | backup
    network: ""        # 1gbe | 2.5gbe | 10gbe
  os: ""               # debian | ubuntu | proxmox | unraid | truenas
  location: ""         # home | closet | rack | colo | vps
  power:
    ups: false
    wattage_idle: 0
    wattage_load: 0
    monthly_cost_estimate: ""  # electricity
  network:
    public_ip: ""      # static | dynamic | cgnat
    domain: ""
    dns_provider: ""   # cloudflare | duckdns | custom
    isp_ports_open: true  # some ISPs block 80/443
  goals:
    - ""               # media server, smart home, dev environment, etc.
  budget_monthly: ""   # electricity + domain + any VPS
```

Hardware Decision Matrix

| Budget | RAM | Storage | Good For | Example Hardware |
|---|---|---|---|---|
| $0 | 4-8GB | 64GB+ | Pi-hole, AdGuard, small tools | Raspberry Pi 4/5 |
| $50-150 | 8-16GB | 256GB+ | Docker host, 5-10 services | Used SFF PC (Dell Optiplex, Lenovo Tiny) |
| $150-400 | 16-32GB | 1TB+ | NAS + services, media server | Mini PC (Intel NUC, Beelink) |
| $400-800 | 32-64GB | 4TB+ | Full homelab, VMs + containers | Used enterprise (Dell R720, HP DL380) |
| $800+ | 64GB+ | 10TB+ | Multi-node, Proxmox cluster | Multiple nodes, dedicated NAS |

Self-Host vs SaaS Decision

Ask before self-hosting anything:

- Data sensitivity — does keeping data local matter? (passwords, health, finance = yes)
- Reliability need — can you tolerate occasional downtime? (email = risky, media = fine)
- Maintenance budget — do you have 2-4 hours/month for updates?
- Skill level — can you debug Docker/networking issues?
- Cost comparison — is the SaaS < $10/mo? Often not worth self-hosting for trivial savings.

Always self-host: password manager, DNS/ad-blocking, VPN, bookmarks, notes
Usually self-host: media server, file sync, photo backup, monitoring, git
Think twice: email (deliverability hell), calendar (sync complexity), chat (uptime expectations)
Rarely worth it: search engine (resource hungry), social media (no network effect)
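The cost-comparison question can be made concrete by annualizing both options. A minimal sketch in shell; the fee, wattage, tariff, and fixed-cost numbers are illustrative assumptions, not recommendations:

```shell
#!/bin/sh
# Rough yearly cost comparison: SaaS subscription vs self-hosting.
# All input values are illustrative assumptions -- substitute your own.

saas_monthly=8        # SaaS fee in $/month
idle_watts=15         # extra server draw attributable to the service
kwh_price=0.30        # electricity price in $/kWh
fixed_yearly=12       # domain, dynamic DNS, etc. in $/year

# Yearly SaaS cost
saas_yearly=$(awk -v m="$saas_monthly" 'BEGIN { printf "%.2f", m * 12 }')

# Yearly self-host cost: watts -> kWh/year -> $ plus fixed costs
selfhost_yearly=$(awk -v w="$idle_watts" -v p="$kwh_price" -v f="$fixed_yearly" \
  'BEGIN { printf "%.2f", (w * 24 * 365 / 1000) * p + f }')

echo "SaaS:      \$${saas_yearly}/year"
echo "Self-host: \$${selfhost_yearly}/year"
```

At these example numbers self-hosting wins on cost, but the maintenance-hours question above is usually the deciding factor, not the dollars.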

OS Selection Guide

| OS | Best For | Learning Curve | Notes |
|---|---|---|---|
| Debian 12 | Docker-only host | Low | Stable, minimal, just works |
| Ubuntu Server 24.04 | Beginners, wide docs | Low | More packages, snap controversy |
| Proxmox VE | VMs + containers | Medium | Free, enterprise features, ZFS |
| Unraid | NAS + Docker + VMs | Medium | $59-129, great UI, parity array |
| TrueNAS Scale | ZFS NAS + Docker | Medium | Free, ZFS-first, apps improving |
| NixOS | Reproducible configs | High | Declarative, steep learning curve |

Proxmox Quick Setup

```bash
# Post-install essentials

# 1. Remove enterprise repo (if no subscription)
sed -i 's/^deb/#deb/' /etc/apt/sources.list.d/pve-enterprise.list
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" \
  > /etc/apt/sources.list.d/pve-no-subscription.list
apt update && apt upgrade -y

# 2. Create a Docker LXC (lightweight container)
# Download template: Datacenter → Storage → CT Templates → Download → debian-12
# Create CT: 2 cores, 2GB RAM, 32GB disk, bridge vmbr0
# Inside CT: install Docker
apt install -y curl
curl -fsSL https://get.docker.com | sh

# 3. Enable IOMMU for GPU passthrough (if needed)
# Edit /etc/default/grub: GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
# update-grub && reboot
```

VM vs LXC vs Docker Decision

| Factor | VM | LXC | Docker |
|---|---|---|---|
| Isolation | Full (own kernel) | Partial (shared kernel) | Process-level |
| Overhead | High (1-2GB base) | Low (50-200MB) | Minimal |
| Use when | Different OS, GPU passthrough, untrusted workloads | Dedicated service host, ZFS datasets | Most services |
| Avoid when | RAM-constrained | Need Windows, custom kernel | Stateful databases (use LXC/VM) |

Rule: Docker for 90% of services. LXC for Docker hosts or isolated environments. VM for Windows, different kernel needs, or GPU passthrough.

Docker Compose Project Structure

```
/opt/stacks/                  # or ~/docker/
├── traefik/
│   ├── docker-compose.yml
│   ├── .env
│   ├── config/
│   │   └── traefik.yml
│   └── data/
│       ├── acme.json         # chmod 600
│       └── dynamic/
├── monitoring/
│   ├── docker-compose.yml
│   ├── .env
│   └── config/
├── media/
│   ├── docker-compose.yml
│   ├── .env
│   └── config/
├── productivity/
│   ├── docker-compose.yml
│   ├── .env
│   └── config/
└── scripts/
    ├── backup.sh
    ├── update-all.sh
    └── health-check.sh
```

Docker Compose Best Practices

```yaml
# Template: production-grade service
services:
  app:
    image: vendor/app:1.2.3          # ALWAYS pin version
    container_name: app              # Explicit name
    restart: unless-stopped          # Auto-restart
    networks:
      - proxy                        # Traefik network
      - internal                     # Backend network
    volumes:
      - ./config:/config             # Bind mount for config
      - app-data:/data               # Named volume for data
    environment:
      - TZ=Europe/London             # Always set timezone
      - PUID=1000                    # Match host user
      - PGID=1000
    env_file:
      - .env                         # Secrets in .env (gitignored)
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.app.rule=Host(`app.example.com`)"
      - "traefik.http.routers.app.tls.certresolver=letsencrypt"
      - "traefik.http.services.app.loadbalancer.server.port=8080"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    deploy:
      resources:
        limits:
          memory: 512M               # Prevent OOM cascades
    security_opt:
      - no-new-privileges:true       # Security hardening
    read_only: true                  # Where possible
    tmpfs:
      - /tmp

volumes:
  app-data:

networks:
  proxy:
    external: true
  internal:
```

Docker Security Checklist

- Pin all image versions (never `:latest` in production)
- Set `restart: unless-stopped` on all services
- Use `.env` files for secrets (never hardcode in compose)
- Set memory limits on all containers
- Use `security_opt: no-new-privileges:true`
- Use `read_only: true` where possible, plus tmpfs for `/tmp`
- Create separate Docker networks per stack
- Never expose database ports to 0.0.0.0
- Run containers as non-root (PUID/PGID or `user:`)
- Enable Docker content trust: `export DOCKER_CONTENT_TRUST=1`
- Prune unused images/volumes monthly: `docker system prune -af`
- Use named volumes (not anonymous) for all persistent data
- Set the `TZ` environment variable on every container
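Several items on the checklist are mechanically checkable with plain text matching. A grep-based sketch, illustrative only: a YAML-aware linter is more robust, and the demo file path is an assumption:

```shell
#!/bin/sh
# Grep-based spot checks of a docker-compose.yml against the checklist above.
# Illustrative sketch only -- not a substitute for a real compose linter.

check_compose() {
  file="$1"
  issues=0
  if grep -q ':latest' "$file"; then
    echo "WARN: ':latest' tag found -- pin image versions"
    issues=$((issues + 1))
  fi
  if ! grep -q 'restart:' "$file"; then
    echo "WARN: no restart policy set"
    issues=$((issues + 1))
  fi
  if ! grep -q 'no-new-privileges' "$file"; then
    echo "WARN: no-new-privileges not set"
    issues=$((issues + 1))
  fi
  echo "$issues issue(s) found"
}

# Demo against a deliberately non-compliant compose file
cat > /tmp/demo-compose.yml <<'EOF'
services:
  app:
    image: vendor/app:latest
EOF
check_compose /tmp/demo-compose.yml
```

Run it over every stack with `for f in /opt/stacks/*/docker-compose.yml; do check_compose "$f"; done` as part of the weekly review.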

Reverse Proxy Selection

| Proxy | Best For | SSL | Config Style | Learning Curve |
|---|---|---|---|---|
| Traefik | Docker-native, auto-discovery | Auto (ACME) | Labels + YAML | Medium |
| Caddy | Simplicity, auto-SSL | Auto (built-in) | Caddyfile | Low |
| Nginx Proxy Manager | GUI preference | Auto (UI) | Web UI | Very Low |
| Nginx (manual) | Maximum control | Manual/certbot | Config files | High |

Recommendation: Traefik for Docker power users. Caddy for simplicity. NPM for beginners.

Traefik Production Config

```yaml
# traefik/config/traefik.yml
api:
  dashboard: true
  insecure: false

entryPoints:
  web:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: websecure
          scheme: https
  websecure:
    address: ":443"
    http:
      tls:
        certResolver: letsencrypt

certificatesResolvers:
  letsencrypt:
    acme:
      email: you@example.com
      storage: /data/acme.json
      # Use DNS challenge if ISP blocks port 80
      # dnsChallenge:
      #   provider: cloudflare
      httpChallenge:
        entryPoint: web

providers:
  docker:
    exposedByDefault: false   # Explicit opt-in per service
    network: proxy
  file:
    directory: /data/dynamic
    watch: true

log:
  level: WARN

accessLog:
  filePath: /data/access.log
  bufferingSize: 100
```

Cloudflare Tunnel (Zero Port Forwarding)

For CGNAT or ISPs blocking ports — expose services without opening the firewall:

```yaml
# cloudflared/docker-compose.yml
services:
  cloudflared:
    image: cloudflare/cloudflared:2024.1.0
    container_name: cloudflared
    restart: unless-stopped
    command: tunnel run
    environment:
      - TUNNEL_TOKEN=${CF_TUNNEL_TOKEN}
    networks:
      - proxy
```

When to use Cloudflare Tunnel vs port forwarding:

- CGNAT (no public IP) → Tunnel (only option)
- ISP blocks 80/443 → Tunnel, or DNS challenge + non-standard ports
- Security-first → Tunnel (no open ports)
- Performance-first → Direct (lower latency)
- LAN-only access → Neither (use Tailscale/WireGuard)

Tier 1 โ€” Deploy First (Foundation)

| Service | Purpose | Image | RAM | Notes |
|---|---|---|---|---|
| Traefik/Caddy | Reverse proxy + SSL | traefik:v3.0 | 64MB | Gateway to everything |
| Pi-hole/AdGuard | DNS + ad blocking | pihole/pihole | 128MB | Network-wide ad blocking |
| Authelia/Authentik | SSO + 2FA | authelia/authelia | 128MB | Protect services without built-in auth |
| Uptime Kuma | Monitoring | louislam/uptime-kuma | 128MB | Know when things break |
| Watchtower | Auto-updates | containrrr/watchtower | 32MB | Optional — some prefer manual |

Tier 2 โ€” Core Services

| Service | Purpose | Alt | RAM |
|---|---|---|---|
| Vaultwarden | Password manager | Bitwarden | 64MB |
| Nextcloud | File sync + office | Seafile (lighter) | 512MB |
| Immich | Photo backup | PhotoPrism | 1-4GB |
| Jellyfin | Media server | Plex (less free) | 512MB-2GB |
| Paperless-ngx | Document management | - | 256MB |
| Home Assistant | Smart home | - | 512MB |

Tier 3 โ€” Power User

| Service | Purpose | RAM |
|---|---|---|
| Gitea/Forgejo | Git hosting | 256MB |
| n8n | Workflow automation | 256MB |
| Grafana + Prometheus | Metrics & dashboards | 512MB |
| Tandoor | Recipe management | 256MB |
| Mealie | Meal planning | 128MB |
| Linkwarden/Hoarder | Bookmark manager | 256MB |
| Stirling PDF | PDF tools | 512MB |
| IT-Tools | Developer utilities | 64MB |

RAM Planning

Total RAM needed ≈ OS base (1-2GB) + sum of service RAM + 20% headroom

Example 16GB server:

```
OS + Docker:     2.0 GB
Traefik:         0.1 GB
Pi-hole:         0.1 GB
Authelia:        0.1 GB
Uptime Kuma:     0.1 GB
Vaultwarden:     0.1 GB
Nextcloud:       0.5 GB
Immich:          2.0 GB
Jellyfin:        1.0 GB
Paperless:       0.3 GB
Home Assistant:  0.5 GB
────────────────────────
Total: 6.8 GB → 8.2 GB with headroom
Available: ~7.8 GB free for more services
```
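The headroom arithmetic can be scripted so the budget updates as services are added or removed. A sketch using awk for the floating-point sums, with the per-service estimates copied from the example above:

```shell
#!/bin/sh
# RAM budget helper: sum per-service estimates and add 20% headroom.
# Per-service numbers mirror the 16GB-server example in the text.

budget=$(awk 'BEGIN {
  total  = 2.0   # OS + Docker
  total += 0.1   # Traefik
  total += 0.1   # Pi-hole
  total += 0.1   # Authelia
  total += 0.1   # Uptime Kuma
  total += 0.1   # Vaultwarden
  total += 0.5   # Nextcloud
  total += 2.0   # Immich
  total += 1.0   # Jellyfin
  total += 0.3   # Paperless
  total += 0.5   # Home Assistant
  printf "%.1f %.1f", total, total * 1.2   # planned, with 20% headroom
}')
echo "planned / with headroom (GB): $budget"
```

Compare the second number against installed RAM before adding another Tier 2/3 service.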

DNS Architecture

```
Internet → Cloudflare DNS → Your Public IP → Router → Server
                                                        ↓
                                             Reverse Proxy (Traefik)
                                                        ↓
                 ┌──────────────────────┼──────────────────────┐
                 ↓                      ↓                      ↓
          app.domain.com        files.domain.com        media.domain.com
```

Split DNS (Access Services Locally Without Hairpin NAT)

```
# Pi-hole/AdGuard: local DNS rewrites
# Point *.home.example.com → 192.168.1.100 (server LAN IP)
# External: Cloudflare points to the public IP
# Result: LAN traffic stays local, external traffic goes through the internet
```
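Pi-hole's local DNS records are plain dnsmasq `address=` rules under the hood, so the rewrites can be generated from a host list instead of maintained by hand. A sketch; the hostnames, LAN IP, and output path are illustrative (Pi-hole reads drop-in files from `/etc/dnsmasq.d/` on the real host):

```shell
#!/bin/sh
# Generate dnsmasq-style local DNS rewrites for split DNS.
# Hostnames, LAN IP, and output path are illustrative examples.

LAN_IP="192.168.1.100"
HOSTS="app files media"

for h in $HOSTS; do
  echo "address=/${h}.home.example.com/${LAN_IP}"
done > /tmp/02-local-dns.conf

cat /tmp/02-local-dns.conf
```

Regenerating the file from the same host list you use for Traefik labels keeps internal and external names from drifting apart.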

VPN for Remote Access

| Solution | Type | Best For | Complexity |
|---|---|---|---|
| Tailscale | Mesh VPN | Easiest setup, multi-device | Very Low |
| WireGuard | Point-to-point | Performance, full control | Medium |
| Headscale | Self-hosted Tailscale | Privacy, no vendor lock | Medium-High |

Recommendation: Start with Tailscale (free for 3 users). Move to Headscale when you want full control.

Firewall Rules (UFW)

```bash
# Default deny incoming
ufw default deny incoming
ufw default allow outgoing

# Allow SSH (change port from 22!)
ufw allow 2222/tcp comment 'SSH'

# Allow HTTP/HTTPS for reverse proxy
ufw allow 80/tcp comment 'HTTP redirect'
ufw allow 443/tcp comment 'HTTPS'

# Allow local network for discovery
ufw allow from 192.168.1.0/24 comment 'LAN'

# Enable
ufw enable
```

3-2-1 Rule Implementation

- 3 copies: live data + local backup + remote backup
- 2 media: SSD/HDD (server) + external drive or NAS
- 1 offsite: cloud (Backblaze B2, Wasabi) or a second location

Backup Script Template

```bash
#!/bin/bash
# /opt/stacks/scripts/backup.sh
set -euo pipefail

BACKUP_DIR="/mnt/backup/docker"
STACKS_DIR="/opt/stacks"
DATE=$(date +%Y-%m-%d_%H%M)
RETENTION_DAYS=30

log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1"; }

# 1. Stop services that need consistent backups
log "Stopping database services..."
cd "$STACKS_DIR/productivity" && docker compose stop db

# 2. Backup Docker volumes
log "Backing up volumes..."
for vol in $(docker volume ls -q); do
  docker run --rm \
    -v "$vol":/source:ro \
    -v "$BACKUP_DIR/volumes":/backup \
    alpine tar czf "/backup/${vol}_${DATE}.tar.gz" -C /source .
done

# 3. Backup compose files and configs
log "Backing up configs..."
tar czf "$BACKUP_DIR/configs/stacks_${DATE}.tar.gz" \
  --exclude='*.log' \
  --exclude='node_modules' \
  "$STACKS_DIR"

# 4. Restart services
log "Restarting services..."
cd "$STACKS_DIR/productivity" && docker compose start db

# 5. Cleanup old backups
log "Cleaning up backups older than ${RETENTION_DAYS} days..."
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +$RETENTION_DAYS -delete

# 6. Sync to remote (Backblaze B2 example)
# rclone sync "$BACKUP_DIR" b2:my-backups/docker/ --transfers 4

# 7. Verify
BACKUP_SIZE=$(du -sh "$BACKUP_DIR" | cut -f1)
log "Backup complete. Total size: $BACKUP_SIZE"

# 8. Send notification (optional)
# curl -s "https://ntfy.sh/my-backups" -d "Backup complete: $BACKUP_SIZE"
```
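A backup you have never restored is a hope, not a backup. This sketch rehearses the restore path for one of the volume tarballs using the same tar flags as the script above, but against scratch directories so nothing touches live data; all paths here are illustrative:

```shell
#!/bin/sh
# Dry-run restore: extract a volume tarball into a scratch directory
# and compare file counts with the source. Paths are illustrative.
set -eu

SRC=/tmp/restore-demo/source
OUT=/tmp/restore-demo/restored
mkdir -p "$SRC" "$OUT"
echo one > "$SRC/a.txt"
echo two > "$SRC/b.txt"

# Simulate the backup step (same tar flags as backup.sh)
tar czf /tmp/restore-demo/vol.tar.gz -C "$SRC" .

# Restore into the scratch directory
tar xzf /tmp/restore-demo/vol.tar.gz -C "$OUT"

# Verify: file counts must match
src_count=$(find "$SRC" -type f | wc -l)
out_count=$(find "$OUT" -type f | wc -l)
[ "$src_count" -eq "$out_count" ] && echo "restore OK ($out_count files)"
```

For a real volume, swap the scratch source for a tarball produced by the backup script and restore into a throwaway `docker volume` on a test host.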

Backup Schedule

| What | Frequency | Retention | Method |
|---|---|---|---|
| Docker volumes | Daily 3 AM | 30 days | Script + cron |
| Compose files + configs | Daily 3 AM | 90 days | Script + cron |
| Database dumps | Every 6 hours | 7 days | pg_dump/mysqldump |
| Full disk image | Monthly | 3 months | Clonezilla/dd |
| Offsite sync | Daily 5 AM | 60 days | rclone to B2/Wasabi |

Backup Verification (Monthly)

- Pick a random backup from last week
- Restore it to a test VM/container
- Verify data integrity (check file counts, DB row counts)
- Time the restore process (document your RTO)
- Log results in backup-verification.md

Monitoring Stack (Docker Compose)

```yaml
# monitoring/docker-compose.yml
services:
  uptime-kuma:
    image: louislam/uptime-kuma:1
    container_name: uptime-kuma
    restart: unless-stopped
    volumes:
      - uptime-data:/app/data
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.uptime.rule=Host(`status.example.com`)"

  prometheus:
    image: prom/prometheus:v2.49.0
    container_name: prometheus
    restart: unless-stopped
    volumes:
      - ./config/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=30d'

  grafana:
    image: grafana/grafana:10.3.0
    container_name: grafana
    restart: unless-stopped
    volumes:
      - grafana-data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}

  node-exporter:
    image: prom/node-exporter:v1.7.0
    container_name: node-exporter
    restart: unless-stopped
    pid: host
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.49.0
    container_name: cadvisor
    restart: unless-stopped
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro

volumes:
  uptime-data:
  prometheus-data:
  grafana-data:
```

Alert Rules

| Metric | Warning | Critical | Action |
|---|---|---|---|
| Disk usage | >80% | >90% | Cleanup or expand |
| RAM usage | >85% | >95% | Identify memory leak, add RAM |
| CPU sustained | >80% 5min | >95% 5min | Check runaway process |
| Container restarts | >2/hour | >5/hour | Check logs, fix root cause |
| SSL cert expiry | <14 days | <3 days | Renew cert |
| Backup age | >26 hours | >48 hours | Check backup script/cron |
| Service down | >2 min | >10 min | Investigate, restart |
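The warn/critical columns map naturally onto a small helper that a cron job can call with live numbers. A sketch; the function name and example values are illustrative:

```shell
#!/bin/sh
# Map a percentage metric onto warn/crit thresholds from the table.
# Usage: threshold_status <value> <warn> <crit>  ->  OK | WARNING | CRITICAL

threshold_status() {
  value=$1; warn=$2; crit=$3
  if [ "$value" -gt "$crit" ]; then
    echo CRITICAL
  elif [ "$value" -gt "$warn" ]; then
    echo WARNING
  else
    echo OK
  fi
}

# Example: disk usage with the table's thresholds (>80% warn, >90% crit).
# In a real cron job you would feed in a live number, e.g.:
#   used=$(df --output=pcent / | tail -n 1 | tr -dc '0-9')
threshold_status 85 80 90
```

Pipe the result into ntfy/Discord (see the notification channels below in the text) only when the status is not `OK`, so alerts stay meaningful.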

Notification Channels

| Channel | Service | Best For |
|---|---|---|
| Push notification | ntfy.sh (self-hosted) | Mobile alerts |
| Chat | Discord/Slack webhook | Team alerts |
| Email | Uptime Kuma built-in | Formal notifications |
| Dashboard | Grafana + Uptime Kuma | Visual monitoring |

Server Hardening Checklist

```bash
# 1. SSH hardening — /etc/ssh/sshd_config
Port 2222                   # Change default port
PermitRootLogin no          # No root SSH
PasswordAuthentication no   # Key-only
MaxAuthTries 3
AllowUsers yourusername

# 2. Install fail2ban
apt install fail2ban -y
systemctl enable fail2ban

# 3. Automatic security updates
apt install unattended-upgrades -y
dpkg-reconfigure -plow unattended-upgrades

# 4. Disable unused services
systemctl list-unit-files --state=enabled
# Disable anything you don't need
```

Authentication Architecture

```
Internet → Traefik → Authelia/Authentik → Service
                           ↓
                  Check: authenticated?
                  Yes → forward to service
                  No  → redirect to login page + 2FA
```

Authelia (lightweight, YAML config) — good for smaller setups.
Authentik (full IdP, web UI) — good for many users/services, SAML/OIDC.

Security Scoring (0-100)

| Dimension | Weight | Score Guide |
|---|---|---|
| SSH hardened (keys, non-root, non-22) | 15 | 0 = default, 15 = fully hardened |
| Firewall active (deny-by-default) | 15 | 0 = none, 15 = UFW/iptables configured |
| Reverse proxy (no direct port exposure) | 15 | 0 = ports exposed, 15 = all behind proxy |
| SSL/TLS on all services | 10 | 0 = HTTP, 10 = HTTPS everywhere |
| Auth on all public services | 15 | 0 = open, 15 = SSO/2FA on everything |
| Container security (non-root, limits) | 10 | 0 = default, 10 = hardened |
| Auto-updates enabled | 10 | 0 = manual, 10 = automated |
| Secrets management (.env, not hardcoded) | 10 | 0 = in compose, 10 = .env + restricted perms |

Score: 0-40 = Vulnerable, 41-70 = Acceptable, 71-90 = Good, 91-100 = Hardened
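Because each weight doubles as the maximum score for its dimension, the total is a straight sum. A sketch that scores a hypothetical setup; the per-dimension values are illustrative self-assessments, not measurements:

```shell
#!/bin/sh
# Security score: each dimension scored 0..weight, total is the sum.
# Example inputs below are an illustrative self-assessment.

score=$(awk 'BEGIN {
  total  = 15   # SSH hardened (max 15)
  total += 15   # Firewall deny-by-default (max 15)
  total += 10   # Reverse proxy -- partial, some ports still exposed (max 15)
  total += 10   # SSL/TLS everywhere (max 10)
  total += 8    # Auth on public services -- no 2FA yet (max 15)
  total += 5    # Container security -- partial hardening (max 10)
  total += 10   # Auto-updates enabled (max 10)
  total += 10   # Secrets in .env (max 10)
  print total
}')

if   [ "$score" -le 40 ]; then band=Vulnerable
elif [ "$score" -le 70 ]; then band=Acceptable
elif [ "$score" -le 90 ]; then band=Good
else band=Hardened
fi
echo "Score: $score/100 ($band)"
```

Re-scoring after each hardening change makes progress visible; the two partial dimensions above would be the obvious next targets.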

Update Strategy

Option A: Manual (recommended for critical services)

```bash
#!/bin/bash
# Update script: /opt/stacks/scripts/update-all.sh
set -euo pipefail

STACKS_DIR="/opt/stacks"
LOG="/var/log/docker-updates.log"

for stack in "$STACKS_DIR"/*/; do
  if [ -f "$stack/docker-compose.yml" ]; then
    echo "[$(date)] Updating $(basename "$stack")..." | tee -a "$LOG"
    cd "$stack"
    docker compose pull 2>&1 | tee -a "$LOG"
    docker compose up -d 2>&1 | tee -a "$LOG"
  fi
done

docker image prune -f | tee -a "$LOG"
echo "[$(date)] Update complete" | tee -a "$LOG"
```

Option B: Watchtower (automated — use with caution)

```yaml
services:
  watchtower:
    image: containrrr/watchtower:1.7.1
    container_name: watchtower
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - WATCHTOWER_SCHEDULE=0 0 4 * * MON   # Monday 4 AM
      - WATCHTOWER_CLEANUP=true
      - WATCHTOWER_NOTIFICATIONS=shoutrrr
      - WATCHTOWER_NOTIFICATION_URL=discord://webhook
      - WATCHTOWER_LABEL_ENABLE=true        # Only update labeled containers

# Add this label to containers you want auto-updated:
#   com.centurylinklabs.watchtower.enable=true
```

Weekly Maintenance Checklist

- Check Uptime Kuma for any downtime events
- Review disk usage (`df -h`)
- Check container health (`docker ps --filter health=unhealthy`)
- Review fail2ban bans (`fail2ban-client status`)
- Check backup logs (last successful backup)
- Review Docker logs for errors (`docker logs --since 7d <container>`)
- Prune unused resources (`docker system prune -f`)

Monthly Maintenance

- Update all container images (read changelogs first!)
- Update the host OS (`apt update && apt upgrade`)
- Test a backup restore
- Review and rotate secrets/passwords
- Check SSL certificate expiry dates
- Review Grafana dashboards for trends
- Clean up unused Docker networks/volumes
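The certificate-expiry check is easy to script as days-remaining arithmetic. This sketch uses GNU `date`; in practice you would pull the real `notAfter` date with `openssl s_client`/`openssl x509 -noout -enddate`, and the date used here is a placeholder:

```shell
#!/bin/sh
# Days until a certificate's notAfter date (GNU date).
# To fetch the real date for a live host:
#   openssl s_client -connect app.example.com:443 </dev/null 2>/dev/null \
#     | openssl x509 -noout -enddate

days_until() {
  target=$(date -d "$1" +%s)   # expiry date -> epoch seconds
  now=$(date +%s)
  echo $(( (target - now) / 86400 ))
}

# Placeholder expiry date for the demo
left=$(days_until "2099-01-01")
if [ "$left" -lt 14 ]; then
  echo "renew soon: $left days left"
else
  echo "cert OK: $left days left"
fi
```

The 14-day cutoff matches the warning threshold in the alert-rules table; run this from cron and route a failure into your notification channel.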

Multi-Node Architecture

```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Node 1    │    │   Node 2    │    │   Node 3    │
│ (Proxy/DNS) │────│ (Services)  │────│   (NAS)     │
│  Traefik    │    │  Apps       │    │  TrueNAS    │
│  Pi-hole    │    │  Databases  │    │  NFS/SMB    │
│  Authelia   │    │  Media      │    │  Backup     │
└─────────────┘    └─────────────┘    └─────────────┘
       ↑                  ↑                  ↑
       └───────── Tailscale Mesh ────────────┘
```

Docker Compose Includes (Compose v2.20+)

```yaml
# Shared fragments
include:
  - path: ../common/traefik-labels.yml
  - path: ../common/logging.yml

services:
  app:
    # inherits common configs
```

GitOps for Homelab

```
homelab-configs/              # Git repo
├── .github/
│   └── workflows/
│       └── deploy.yml        # CI: lint + push to server
├── stacks/
│   ├── traefik/
│   ├── monitoring/
│   └── media/
├── scripts/
└── README.md
```

Workflow: edit compose locally → commit → push → CI deploys to the server.
Tools: Flux/ArgoCD (overkill for most homelabs), or a simple `git pull && docker compose up -d` triggered via webhook.

Hardware Redundancy

| Component | Solution | Cost |
|---|---|---|
| Power | UPS (APC Back-UPS 600VA+) | $60-150 |
| Storage | RAID1/ZFS mirror (not RAID0!) | 2x disk cost |
| Network | Dual NIC, managed switch | $30-100 |
| Server | Second node (cold spare or active) | $100-400 |

Rule: RAID is NOT backup. It protects against disk failure only, not ransomware/deletion/corruption.

Common Issues Decision Tree

```
Service not accessible?
├── Can you ping the server?                      → No → Network/firewall issue
├── Is the container running? (docker ps)         → No → Check logs: docker logs <name>
├── Is the port exposed? (docker port <name>)     → No → Check compose ports/networks
├── Is Traefik routing? (check Traefik dashboard) → No → Check labels, network
├── Is DNS resolving? (dig app.example.com)       → No → Check DNS provider
└── SSL error? → Check acme.json permissions (chmod 600), cert resolver logs
```

Docker Debug Commands

```bash
# Container not starting
docker logs <name> --tail 50
docker inspect <name> | jq '.[0].State'

# Network issues
docker network ls
docker network inspect <network>
docker exec <name> ping other-container

# Resource issues
docker stats                        # Live resource usage
docker system df                    # Disk usage
docker volume ls -f dangling=true   # Orphaned volumes

# Nuclear options (use carefully)
docker compose down && docker compose up -d   # Full restart
docker system prune -af --volumes             # Clean EVERYTHING
```

Performance Optimization

| Symptom | Likely Cause | Fix |
|---|---|---|
| Slow file access | HDD for database | Move DB to SSD |
| High CPU at idle | Monitoring too frequent | Increase scrape intervals |
| OOM kills | No memory limits | Set deploy.resources.limits.memory |
| Slow Nextcloud | Missing Redis cache | Add Redis container |
| Jellyfin buffering | No hardware transcoding | Enable GPU passthrough |
| Slow Docker builds | No layer caching | Use multi-stage + .dockerignore |

Vaultwarden (Password Manager)

```yaml
services:
  vaultwarden:
    image: vaultwarden/server:1.30.5
    container_name: vaultwarden
    restart: unless-stopped
    volumes:
      - vaultwarden-data:/data
    environment:
      - SIGNUPS_ALLOWED=false        # Disable after creating your account
      - WEBSOCKET_ENABLED=true
      - ADMIN_TOKEN=${ADMIN_TOKEN}   # Generate: openssl rand -base64 48
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.vault.rule=Host(`vault.example.com`)"
```

Immich (Photo Backup)

```yaml
# Use the official docker-compose.yml from:
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# Key settings:
# - Set UPLOAD_LOCATION to a large storage mount
# - Enable hardware transcoding if a GPU is available
# - Set IMMICH_MACHINE_LEARNING_URL for face detection
```

Paperless-ngx (Document Management)

```yaml
services:
  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:2.4
    container_name: paperless
    restart: unless-stopped
    volumes:
      - paperless-data:/usr/src/paperless/data
      - paperless-media:/usr/src/paperless/media
      - ./consume:/usr/src/paperless/consume   # Drop PDFs here
      - ./export:/usr/src/paperless/export
    environment:
      - PAPERLESS_OCR_LANGUAGE=eng
      - PAPERLESS_TIME_ZONE=Europe/London
      - PAPERLESS_ADMIN_USER=${ADMIN_USER}
      - PAPERLESS_ADMIN_PASSWORD=${ADMIN_PASS}
```

Homelab Quality Rubric (0-100)

| Dimension | Weight | 0 (Poor) | 50 (Decent) | 100 (Excellent) |
|---|---|---|---|---|
| Security | 20% | Default passwords, open ports | Firewall + SSL | Hardened SSH, SSO/2FA, no-new-privileges |
| Backups | 20% | None | Local only, untested | 3-2-1, automated, verified monthly |
| Monitoring | 15% | None | Uptime Kuma only | Full stack: metrics + logs + alerts |
| Documentation | 10% | Nothing written | README per stack | GitOps, full runbook, diagrams |
| Updates | 10% | Never updated | Manual quarterly | Scheduled weekly, changelogs reviewed |
| Reliability | 10% | Frequent crashes | Mostly stable | UPS, auto-restart, health checks |
| Performance | 10% | Slow, OOM kills | Adequate | Resource limits, SSD, HW transcoding |
| Scalability | 5% | Single machine, no plan | Compose organized | Multi-node ready, IaC |

10 Self-Hosting Mistakes

| # | Mistake | Fix |
|---|---|---|
| 1 | Using the `:latest` tag | Pin versions: `image:1.2.3` |
| 2 | No backups | 3-2-1 backup rule, test restores |
| 3 | Exposing ports directly | Everything behind a reverse proxy |
| 4 | Default passwords | Change immediately, use a password manager |
| 5 | No monitoring | Uptime Kuma minimum, Grafana for depth |
| 6 | "RAID = backup" mentality | RAID protects disks, not data |
| 7 | Over-engineering on day 1 | Start small, add complexity as needed |
| 8 | No documentation | Document every service, every port, every cron |
| 9 | Ignoring updates | Security patches matter, schedule updates |
| 10 | Running as root | Non-root containers, restricted SSH |

Natural Language Commands

| Say | Agent Does |
|---|---|
| "Set up a new service" | Guide through compose file creation with security best practices |
| "Audit my homelab security" | Run through the security scoring checklist |
| "Plan my backup strategy" | Design a 3-2-1 backup plan for your setup |
| "What should I self-host?" | Assess needs and recommend services by tier |
| "My container keeps crashing" | Walk through the troubleshooting decision tree |
| "Help me set up Traefik" | Generate a production Traefik config with SSL |
| "Compare NAS options" | Compare TrueNAS vs Unraid vs DIY for your needs |
| "Optimize my Docker setup" | Review compose files for security and performance |
| "Set up monitoring" | Deploy the Uptime Kuma + Prometheus + Grafana stack |
| "Plan a hardware upgrade" | Assess current usage, recommend hardware by budget |
| "Migrate from cloud to self-hosted" | Plan migration with data export and service mapping |
| "Set up remote access" | Compare and deploy VPN/Tailscale for secure remote access |

Category context

Long-tail utilities that do not fit the current primary taxonomy cleanly.

Source: Tencent SkillHub

Largest current source with strong distribution and engagement signals.

Package contents

Included in package
2 Docs
  • SKILL.md Primary doc
  • README.md Docs