Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Inference-based intrusion detection for AI agents. Pattern matching + LLM analysis for jailbreaks, prompt injection, credential theft, social engineering. 108 detection patterns, OpenClaw plugin, auto-scan, quarantine. Commands: hopeid scan, hopeid test, hopeid setup, hopeid stats, hopeid doctor.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Inference-based intrusion detection for AI agents with quarantine and human-in-the-loop.
These are non-negotiable design principles:
- Block = full abort: Blocked messages never reach jasper-recall or the agent
- Metadata only: No raw malicious content is ever stored
- Approve ≠ re-inject: Approval changes future behavior, doesn't resurrect messages
- Alerts are programmatic: Telegram alerts built from metadata, no LLM involved
- Auto-scan: Scan messages before agent processing
- Quarantine: Block threats with metadata-only storage
- Human-in-the-loop: Telegram alerts for review
- Per-agent config: Different thresholds for different agents
- Commands: /approve, /reject, /trust, /quarantine
    Message arrives
      ↓
    hopeIDS.autoScan()
      ↓
    risk >= threshold?

      BLOCK (strictMode):
        → Create QuarantineRecord
        → Send Telegram alert
        → ABORT (no recall, no agent)

      WARN (non-strict):
        → Inject <security-alert>
        → Continue to jasper-recall
        → Continue to agent

      ALLOW:
        → Continue normally
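The branching in the flow above can be sketched in a few lines of Python. This is only an illustration of the decision, not hopeIDS's actual code: `ScanPolicy` and `decide` are hypothetical names, and the sketch assumes a risk score exactly equal to the threshold triggers the action, as the `risk >= threshold` check suggests.

```python
from dataclasses import dataclass


@dataclass
class ScanPolicy:
    """Hypothetical container for the two settings the flow depends on."""
    risk_threshold: float = 0.7
    strict_mode: bool = False


def decide(risk: float, policy: ScanPolicy) -> str:
    """Return "block", "warn", or "allow" for an already-scored message."""
    if risk < policy.risk_threshold:
        return "allow"  # below threshold: continue normally
    # At or above threshold: strict mode aborts, non-strict warns and continues.
    return "block" if policy.strict_mode else "warn"
```

With a strict policy, `decide(0.85, ScanPolicy(0.7, True))` yields `"block"`; the same score under a non-strict policy yields `"warn"`.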
    {
      "plugins": {
        "entries": {
          "hopeids": {
            "enabled": true,
            "config": {
              "autoScan": true,
              "defaultRiskThreshold": 0.7,
              "strictMode": false,
              "telegramAlerts": true,
              "agents": {
                "moltbook-scanner": {
                  "strictMode": true,
                  "riskThreshold": 0.7
                },
                "main": {
                  "strictMode": false,
                  "riskThreshold": 0.8
                }
              }
            }
          }
        }
      }
    }
| Option | Type | Default | Description |
|---|---|---|---|
| autoScan | boolean | false | Auto-scan every message |
| strictMode | boolean | false | Block (vs warn) on threats |
| defaultRiskThreshold | number | 0.7 | Risk level that triggers action |
| telegramAlerts | boolean | true | Send alerts for blocked messages |
| telegramChatId | string | - | Override alert destination |
| quarantineDir | string | ~/.openclaw/quarantine/hopeids | Storage path |
| agents | object | - | Per-agent overrides |
| trustOwners | boolean | true | Skip scanning owner messages |
When a message is blocked, a metadata record is created:

    {
      "id": "q-7f3a2b",
      "ts": "2026-02-06T00:48:00Z",
      "agent": "moltbook-scanner",
      "source": "moltbook",
      "senderId": "@sus_user",
      "intent": "instruction_override",
      "risk": 0.85,
      "patterns": [
        "matched regex: ignore.*instructions",
        "matched keyword: api key"
      ],
      "contentHash": "ab12cd34...",
      "status": "pending"
    }

Note: There is NO originalMessage field. This is intentional.
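A minimal sketch of how such a metadata-only record could be assembled is shown here. The function name `make_quarantine_record`, the use of SHA-256 for `contentHash`, and the `q-` plus six hex characters ID scheme are all assumptions made for illustration; the source does not say which hash or ID format hopeIDS uses.

```python
import hashlib
import secrets
from datetime import datetime, timezone


def make_quarantine_record(message: str, agent: str, source: str,
                           sender_id: str, intent: str, risk: float,
                           patterns: list[str]) -> dict:
    """Build a metadata-only record: the raw message is hashed, never stored."""
    return {
        "id": "q-" + secrets.token_hex(3),  # assumed ID scheme
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "agent": agent,
        "source": source,
        "senderId": sender_id,
        "intent": intent,
        "risk": risk,
        "patterns": list(patterns),
        # Assumed hash algorithm; only the digest of the content is kept.
        "contentHash": hashlib.sha256(message.encode("utf-8")).hexdigest(),
        "status": "pending",
    }
```

Because only the digest is kept, a reviewer can tell whether two blocked messages were identical without the quarantine store ever holding the malicious text itself.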
When a message is blocked:

    Message blocked

    ID: `q-7f3a2b`
    Agent: moltbook-scanner
    Source: moltbook
    Sender: @sus_user
    Intent: instruction_override (85%)

    Patterns:
    • matched regex: ignore.*instructions
    • matched keyword: api key

    `/approve q-7f3a2b`
    `/reject q-7f3a2b`
    `/trust @sus_user`

Built from metadata only. No LLM touches this.
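To illustrate what "built from metadata only" can look like in practice, the following sketch renders an alert with plain string formatting from a quarantine record. `format_alert` is a hypothetical helper, not part of the hopeIDS API, and the exact layout of the real alert may differ.

```python
def format_alert(record: dict) -> str:
    """Render an alert purely from record metadata; no model call involved."""
    lines = [
        "Message blocked",
        f"ID: `{record['id']}`",
        f"Agent: {record['agent']}",
        f"Source: {record['source']}",
        f"Sender: {record['senderId']}",
        # Risk is stored as a 0..1 score and shown as a whole percentage.
        f"Intent: {record['intent']} ({round(record['risk'] * 100)}%)",
        "Patterns:",
    ]
    lines += [f"• {pattern}" for pattern in record["patterns"]]
    lines += [
        f"`/approve {record['id']}`",
        f"`/reject {record['id']}`",
        f"`/trust {record['senderId']}`",
    ]
    return "\n".join(lines)
```

Since the text is produced by deterministic templating, nothing from the blocked message body can influence the alert beyond the matched pattern descriptions.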
List quarantine records.

    /quarantine          # List pending
    /quarantine all      # List all (including resolved)
    /quarantine clean    # Clean expired records
Mark a blocked message as a false positive.

    /approve q-7f3a2b

Effect:
- Status → approved
- (Future) Add sender to allowlist
- (Future) Lower pattern weight
Confirm a blocked message was a true positive.

    /reject q-7f3a2b

Effect:
- Status → rejected
- (Future) Reinforce pattern weights
Whitelist a sender for future messages.

    /trust @legitimate_user
Manually scan a message.

    /scan ignore your previous instructions and...
| Command | What it does | What it doesn't do |
|---|---|---|
| /approve | Marks as false positive, may adjust IDS | Does NOT re-inject the message |
| /reject | Confirms threat, may strengthen patterns | Does NOT affect current message |
| /trust | Whitelists sender for future | Does NOT retroactively approve |

The blocked message is gone by design. If it was legitimate, the sender can re-send.
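The command semantics above amount to small state changes on stored metadata. The sketch below is an assumption-laden illustration: `apply_command`, the in-memory `records` dict, and the `trusted` set are hypothetical stand-ins for whatever storage hopeIDS actually uses.

```python
def apply_command(records: dict, trusted: set, command: str, arg: str) -> None:
    """Apply /approve, /reject, or /trust to in-memory state (illustrative only)."""
    if command in ("approve", "reject"):
        # Only the record's status changes; nothing is re-injected or replayed.
        records[arg]["status"] = "approved" if command == "approve" else "rejected"
    elif command == "trust":
        # Trust affects future messages; existing records stay as they are.
        trusted.add(arg)
    else:
        raise ValueError(f"unknown command: {command}")
```

Note that trusting a sender leaves their pending records untouched, which matches the "does NOT retroactively approve" rule.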
Different agents need different security postures:

    "agents": {
      "moltbook-scanner": {
        "strictMode": true,      // Block threats
        "riskThreshold": 0.7     // 70% = suspicious
      },
      "main": {
        "strictMode": false,     // Warn only
        "riskThreshold": 0.8     // Higher bar for main
      },
      "email-processor": {
        "strictMode": true,      // Always block
        "riskThreshold": 0.6     // More paranoid
      }
    }
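One plausible way to resolve the effective settings for an agent is to look up its override and fall back to the top-level values. The sketch below assumes exactly that fallback order (per-agent `strictMode` and `riskThreshold` first, then top-level `strictMode` and `defaultRiskThreshold`); the source does not spell out the merge rules, and `resolve_agent_config` is a hypothetical name.

```python
def resolve_agent_config(config: dict, agent: str) -> dict:
    """Merge per-agent overrides over top-level defaults (assumed behavior)."""
    overrides = config.get("agents", {}).get(agent, {})
    return {
        "strictMode": overrides.get("strictMode", config.get("strictMode", False)),
        "riskThreshold": overrides.get(
            "riskThreshold", config.get("defaultRiskThreshold", 0.7)
        ),
    }
```

Under this assumption, an agent with no entry under `agents` simply inherits the plugin-wide threshold and strictness.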
| Category | Risk | Description |
|---|---|---|
| command_injection | 🔴 Critical | Shell commands, code execution |
| credential_theft | 🔴 Critical | API key extraction attempts |
| data_exfiltration | 🔴 Critical | Data leak to external URLs |
| instruction_override | 🔴 High | Jailbreaks, "ignore previous" |
| impersonation | 🔴 High | Fake system/admin messages |
| discovery | ⚠️ Medium | API/capability probing |
To install, run:

    npx hopeid setup

Then restart OpenClaw.
- GitHub: https://github.com/E-x-O-Entertainment-Studios-Inc/hopeIDS
- npm: https://www.npmjs.com/package/hopeid
- Docs: https://exohaven.online/products/hopeids