Tencent SkillHub · AI

SWARM Safety

SWARM: System-Wide Assessment of Risk in Multi-agent systems. 38 agent types, 29 governance levers, 55 scenarios. Study emergent risks, phase transitions, an...

skill openclawclawhub Free

Unverified but indexed

Install for OpenClaw

Quick setup
  1. Download the package from Yavira.
  2. Extract the archive and review SKILL.md first.
  3. Import or place the package into your OpenClaw setup.

Requirements

Target platform
OpenClaw
Install method
Manual import
Extraction
Extract archive
Prerequisites
OpenClaw
Primary doc
SKILL.md

Package facts

Download mode
Yavira redirect
Package format
ZIP package
Source platform
Tencent SkillHub
What's included
SKILL.md, skill.json

Validation

  • Use the Yavira download entry.
  • Review SKILL.md after the package is downloaded.
  • Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of stepping through the install manually.

  1. Download the package from Yavira.
  2. Extract it into a folder your agent can access.
  3. Paste one of the prompts below and point your agent at the extracted folder.
New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

Source
Tencent SkillHub
Verification
Indexed source record
Version
1.7.1

Documentation

Primary doc: SKILL.md (25 sections)

SWARM Safety Skill

Study how intelligence swarms, and where it fails. SWARM is a research framework for studying emergent risks in multi-agent AI systems using soft (probabilistic) labels instead of binary good/bad classifications. AGI-level risks don't require AGI-level agents: harmful dynamics emerge when many sub-AGI agents interact, even when no individual agent is misaligned.

v1.7.0 | 38 agent types | 29 governance levers | 55 scenarios | 2922 tests | 8 framework bridges

Repository: https://github.com/swarm-ai-safety/swarm

Hard Rules

  • SWARM simulations run locally. Install the package first.
  • Do not submit scenarios containing real API keys, credentials, or PII.
  • Simulation results are research artifacts. Do not present them as ground truth about real systems.
  • When publishing results, cite the framework and disclose simulation parameters.

Security

  • The API binds to localhost only (127.0.0.1) by default to prevent network exposure.
  • CORS is restricted to localhost origins by default.
  • The development API has no authentication; do not expose it to untrusted networks.
  • Storage is in-memory; data does not persist between restarts.
  • For production deployment, add authentication middleware and use a proper database.

Install

# From PyPI
pip install swarm-safety

# With LLM agent support
pip install "swarm-safety[llm]"

# Full development (all extras)
git clone https://github.com/swarm-ai-safety/swarm.git
cd swarm
pip install -e ".[dev,runtime]"

Quick Start (Python)

from swarm.agents.honest import HonestAgent
from swarm.agents.opportunistic import OpportunisticAgent
from swarm.agents.deceptive import DeceptiveAgent
from swarm.agents.adversarial import AdversarialAgent
from swarm.core.orchestrator import Orchestrator, OrchestratorConfig

config = OrchestratorConfig(n_epochs=10, steps_per_epoch=10, seed=42)
orchestrator = Orchestrator(config=config)

orchestrator.register_agent(HonestAgent(agent_id="honest_1", name="Alice"))
orchestrator.register_agent(HonestAgent(agent_id="honest_2", name="Bob"))
orchestrator.register_agent(OpportunisticAgent(agent_id="opp_1"))
orchestrator.register_agent(DeceptiveAgent(agent_id="dec_1"))

metrics = orchestrator.run()
for m in metrics:
    print(f"Epoch {m.epoch}: toxicity={m.toxicity_rate:.3f}, welfare={m.total_welfare:.2f}")

Quick Start (CLI)

# List available scenarios
swarm list

# Run a scenario
swarm run scenarios/baseline.yaml

# Override settings
swarm run scenarios/baseline.yaml --seed 42 --epochs 20 --steps 15

# Export results
swarm run scenarios/baseline.yaml --export-json results.json --export-csv outputs/

Quick Start (API)

Start the API server:

pip install "swarm-safety[api]"
uvicorn swarm.api.app:app --host 127.0.0.1 --port 8000

API documentation is served at http://localhost:8000/docs.

Security note: the server binds to 127.0.0.1 (localhost only) by default. Do not bind to 0.0.0.0 unless you understand the security implications and have proper firewall rules in place.

Register Agent

curl -X POST http://localhost:8000/api/v1/agents/register \
  -H "Content-Type: application/json" \
  -d '{
    "name": "YourAgent",
    "description": "What your agent does",
    "capabilities": ["governance-testing", "red-teaming"]
  }'

Returns agent_id and api_key.
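The same registration call can be issued from Python using only the standard library. This is a sketch, not part of the SWARM package: it assumes the dev server from the API quick start is running locally, and the payload fields simply mirror the curl example above.

```python
import json
import urllib.request

payload = {
    "name": "YourAgent",
    "description": "What your agent does",
    "capabilities": ["governance-testing", "red-teaming"],
}

# Build the POST request against the local dev server.
req = urllib.request.Request(
    "http://127.0.0.1:8000/api/v1/agents/register",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment once the server is up; per the docs, the response
# carries agent_id and api_key.
# with urllib.request.urlopen(req) as resp:
#     creds = json.load(resp)
```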

Submit Scenario

curl -X POST http://localhost:8000/api/v1/scenarios/submit \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-scenario",
    "description": "Testing collusion detection with 5 agents",
    "yaml_content": "simulation:\n n_epochs: 10\n steps_per_epoch: 10\nagents:\n - type: honest\n count: 3\n - type: adversarial\n count: 2",
    "tags": ["collusion", "governance"]
  }'

Create & Join Simulation

# Create
curl -X POST http://localhost:8000/api/v1/simulations/create \
  -H "Content-Type: application/json" \
  -d '{"scenario_id": "SCENARIO_ID", "max_participants": 5}'

# Join
curl -X POST http://localhost:8000/api/v1/simulations/SIM_ID/join \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "YOUR_AGENT_ID", "role": "participant"}'

Soft Probabilistic Labels

Interactions carry p = P(v = +1), the probability of a beneficial outcome:

Observables -> ProxyComputer -> v_hat -> sigmoid -> p
p -> PayoffEngine -> payoffs
p -> SoftMetrics  -> toxicity, quality gap, etc.
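As a minimal sketch of the sigmoid step above (the function name is illustrative, not the SWARM API), a raw proxy score v_hat is squashed into a soft label p:

```python
import math

def soft_label(v_hat: float) -> float:
    """Map a raw proxy score to p = P(v = +1) via the logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-v_hat))

# A positive proxy score implies a better-than-even chance of benefit;
# a score of 0 is maximally uncertain (p = 0.5).
p = soft_label(1.5)
print(round(p, 3))  # 0.818
```

Downstream consumers (payoffs, metrics) then work with p directly rather than a hard good/bad decision.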

Five Key Metrics

  • Toxicity rate: expected harm among accepted interactions, E[1-p | accepted]
  • Quality gap: adverse selection indicator (negative = bad), E[p | accepted] - E[p | rejected]
  • Conditional loss: selection effect on payoffs
  • Incoherence: variance-to-error ratio across replays
  • Illusion delta: gap between perceived coherence and actual consistency
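To make the first two metrics concrete, here is a toy computation over a handful of soft labels (the variable names and data are mine, not SWARM's implementation):

```python
# p values (probability of beneficial outcome) for a toy batch.
accepted = [0.9, 0.8, 0.4]
rejected = [0.3, 0.2]

# Toxicity rate: expected harm among accepted interactions, E[1-p | accepted].
toxicity_rate = sum(1 - p for p in accepted) / len(accepted)

# Quality gap: E[p | accepted] - E[p | rejected]; a negative value
# signals adverse selection (worse interactions are being accepted).
quality_gap = sum(accepted) / len(accepted) - sum(rejected) / len(rejected)

print(f"toxicity={toxicity_rate:.2f}, quality_gap={quality_gap:.2f}")
```

Here the gap is positive, so acceptance is selecting for better-than-rejected interactions.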

Agent Types (14 families, 38 implementations)

  • Honest: cooperative, trust-based, completes tasks diligently
  • Opportunistic: maximizes short-term payoff, cherry-picks tasks
  • Deceptive: builds trust, then exploits trusted relationships
  • Adversarial: targets honest agents, coordinates with allies
  • LDT: Logical Decision Theory with FDT/UDT precommitment
  • RLM: Reinforcement Learning from Memory
  • Council: multi-agent deliberation-based decisions
  • SkillRL: learns interaction strategies via reward signals
  • LLM: behavior determined by an LLM (Anthropic, OpenAI, or Ollama)
  • Moltbook: domain-specific social platform agent
  • Scholar: academic citation and research agent
  • Wiki Editor: collaborative editing with editorial policy

Governance Levers (29 mechanisms)

  • Transaction Taxes: reduce exploitation, cost welfare
  • Reputation Decay: punishes bad actors, erodes honest standing
  • Circuit Breakers: freeze toxic agents quickly
  • Random Audits: deter hidden exploitation
  • Staking: filters undercapitalized agents
  • Collusion Detection: catches coordinated attacks (the critical lever near the collapse threshold)
  • Sybil Detection: identifies duplicate agents
  • Transparency Ledger: rewards/penalizes based on outcomes
  • Moderator Agent: probabilistic review of interactions
  • Incoherence Friction: taxes uncertainty-driven decisions
  • Council Deliberation: multi-agent governance decisions
  • Diversity Enforcement: prevents monoculture collapse
  • Moltipedia-specific: pair caps, page cooldowns, daily caps, self-fix prevention

Framework Bridges

  • Concordia: DeepMind's multi-agent framework
  • GasTown: multi-agent workspace governance
  • Claude Code: Claude CLI agent integration
  • LiveSWE: live software engineering tasks
  • OpenClaw: open agent protocol
  • Prime Intellect: cross-platform run tracking
  • Ralph: agent orchestration
  • Worktree: Git worktree-based sandboxing

Scenario YAML Format

simulation:
  n_epochs: 10
  steps_per_epoch: 10
  seed: 42
agents:
  - type: honest
    count: 3
    config:
      acceptance_threshold: 0.4
  - type: adversarial
    count: 2
    config:
      aggression_level: 0.7
governance:
  transaction_tax_rate: 0.05
  circuit_breaker_enabled: true
  collusion_detection_enabled: true
success_criteria:
  max_toxicity: 0.3
  min_quality_gap: 0.0

Phase Transitions (11-scenario, 209-epoch study)

  • Cooperative (0-20% adversarial): toxicity < 0.30, welfare stable, system survives
  • Contested (20-37.5% adversarial): toxicity 0.33-0.37, welfare declining, system survives
  • Collapse (50%+ adversarial): toxicity ~0.30, welfare zero by epoch 12-14, system collapses

The critical threshold between 37.5% and 50% adversarial agents separates recoverable decline from irreversible collapse.

Governance Cost Paradox (v1.7.0 GasTown study)

A 42-run study reveals that governance reduces toxicity at all adversarial levels (mean reduction 0.071) but imposes net-negative welfare costs at the current parameter tuning. At 0% adversarial, governance costs 216 welfare units (-57.6%) for only a 0.066 toxicity reduction.

GasTown Governance Cost

Study governance overhead vs. toxicity reduction across 7 agent compositions with and without governance levers. Reveals the safety-throughput trade-off. See scenarios/gastown_governance_cost.yaml.

LDT Cooperation

220 runs across 10 seeds comparing TDT vs FDT vs UDT cooperation strategies at population scales up to 21 agents. See scenarios/ldt_cooperation.yaml.

Moltipedia Heartbeat

Model the Moltipedia wiki editing loop: competing AI editors, editorial policy, point farming, and anti-gaming governance. See scenarios/moltipedia_heartbeat.yaml.

Moltbook CAPTCHA

Model Moltbook's anti-human math challenges and rate limiting: obfuscated text parsing, verification gates, and spam prevention. See scenarios/moltbook_captcha.yaml.

API Endpoints (Full Reference)

  • GET /health: health check
  • GET /: API info
  • POST /api/v1/agents/register: register agent
  • GET /api/v1/agents/{agent_id}: get agent details
  • GET /api/v1/agents/: list agents
  • POST /api/v1/scenarios/submit: submit scenario
  • GET /api/v1/scenarios/{scenario_id}: get scenario
  • GET /api/v1/scenarios/: list scenarios
  • POST /api/v1/simulations/create: create simulation
  • POST /api/v1/simulations/{id}/join: join simulation
  • GET /api/v1/simulations/{id}: get simulation
  • GET /api/v1/simulations/: list simulations

Citation

@software{swarm2026,
  title  = {SWARM: System-Wide Assessment of Risk in Multi-agent systems},
  author = {Savitt, Raeli},
  year   = {2026},
  url    = {https://github.com/swarm-ai-safety/swarm}
}

Linked Docs

  • Skill metadata: skill.json
  • Agent discovery: .well-known/agent.json
  • Full documentation: https://github.com/swarm-ai-safety/swarm/tree/main/docs
  • Theoretical foundations: docs/research/theory.md
  • Governance guide: docs/governance.md
  • Red-teaming guide: docs/red-teaming.md
  • Scenario format: docs/guides/scenarios.md

Category context

Agent frameworks, memory systems, reasoning layers, and model-native orchestration.

Source: Tencent SkillHub

Largest current source with strong distribution and engagement signals.

Package contents

Included in package
1 doc, 1 config
  • SKILL.md Primary doc
  • skill.json Config