Requirements
- Target platform
- OpenClaw
- Install method
- Manual import
- Extraction
- Extract archive
- Prerequisites
- OpenClaw
- Primary doc
- SKILL.md
Make AI-powered phone calls with custom personas and goals. Uses OpenAI Realtime API + Twilio for ultra-low latency voice conversations. Supports DTMF/IVR na...
Make AI-powered phone calls with custom personas and goals. Uses OpenAI Realtime API + Twilio for ultra-low latency voice conversations. Supports DTMF/IVR na...
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
Make AI-powered phone calls with custom personas and goals using OpenAI Realtime API + Twilio.
Persona Calls: Define a persona, goal, and opening line for autonomous calls Full Realtime Mode: GPT-4o powered voice conversations with <~1s latency DTMF / IVR Navigation: AI automatically navigates automated phone menus (press 1 for X, enter your account number, etc.) by generating and injecting touch-tone digits into the audio stream Provider: Supports Twilio (full realtime) and mock provider for testing Streaming Audio: Bidirectional audio via WebSocket for real-time conversations Limited Access: Unlike the standard voice_call plugin, the person on the call doesn't have access to gateway agent, reducing attack surfaces.
CredentialSourcePurposeOPENAI_API_KEYOpenAIPowers the realtime voice AI (GPT-4o)TWILIO_ACCOUNT_SIDTwilio ConsoleTwilio account identifierTWILIO_AUTH_TOKENTwilio ConsoleTwilio API authentication
CredentialSourcePurposeNGROK_AUTHTOKENngrokngrok tunnel auth (only needed if using ngrok as tunnel provider) Credentials can be set via environment variables or in the plugin config (config takes precedence).
Install the plugin via npm or copy to your OpenClaw extensions directory Enable hooks for call completion callbacks (required): { "hooks": { "enabled": true, "token": "your-secret-token" } } Generate a secure token with: openssl rand -hex 24 โ ๏ธ Security: The hooks.token is sensitive โ it authenticates internal callbacks. Keep it secret and rotate if compromised. Configure the plugin in your openclaw config: { "plugins": { "entries": { "supercall": { "enabled": true, "config": { "provider": "twilio", "fromNumber": "+15551234567", "twilio": { "accountSid": "your-account-sid", "authToken": "your-auth-token" }, "streaming": { "openaiApiKey": "your-openai-key" }, "tunnel": { "provider": "ngrok", "ngrokDomain": "your-domain.ngrok.app" } } } } } } Important: The hooks.token is required for call completion callbacks. Without it, the agent won't be notified when calls finish.
Make phone calls with custom personas: supercall( action: "persona_call", to: "+1234567890", persona: "Personal assistant to the king", goal: "Confirm the callee's availabilities for dinner next week", openingLine: "Hey, this is Michael, Alex's Assistant..." )
persona_call - Start a new call with a persona get_status - Check call status and transcript end_call - End an active call list_calls - List active persona calls
The AI automatically handles automated phone menus (IVR systems) during calls. When it hears prompts like "press 1 for sales", it uses an internal send_dtmf tool to send touch-tone digits through the audio stream. This is fully automatic โ no extra configuration or agent intervention is needed. Supported characters: 0-9, *, #, A-D, w (500ms pause) Example sequences: 1 (press 1), 1234567890# (enter account number + pound), 1w123# (press 1, wait, then enter 123#) How it works: DTMF tones are generated as ITU-standard dual-frequency pairs, encoded to ยต-law (8kHz mono), and injected directly into the Twilio media stream. No external dependencies. This means persona calls can navigate phone trees end-to-end โ e.g., "call the pharmacy, navigate through their menu, and check on my prescription status."
OptionDescriptionDefaultproviderVoice provider (twilio/mock)RequiredfromNumberCaller ID (E.164 format)Required for real providerstoNumberDefault recipient number-twilio.accountSidTwilio Account SIDTWILIO_ACCOUNT_SID envtwilio.authTokenTwilio Auth TokenTWILIO_AUTH_TOKEN envstreaming.openaiApiKeyOpenAI API key for realtimeOPENAI_API_KEY envstreaming.silenceDurationMsVAD silence duration in ms800streaming.vadThresholdVAD threshold 0-1 (higher = less sensitive)0.5streaming.streamPathWebSocket path for media stream/voice/streamtunnel.providerTunnel for webhooks (ngrok/tailscale-serve/tailscale-funnel)nonetunnel.ngrokDomainFixed ngrok domain (recommended for production)-tunnel.ngrokAuthTokenngrok auth tokenNGROK_AUTHTOKEN envFull realtime requires an OpenAI API key.
Node.js 20+ Twilio account for full realtime calls (media streams) ngrok or Tailscale for webhook tunneling (production) OpenAI API key for real-time features
This is a fully standalone skill - it does not depend on the built-in voice-call plugin. All voice calling logic is self-contained.
This plugin is not instruction-only. It runs code, spawns processes, opens network listeners, and writes to disk. The following describes exactly what happens at runtime.
When tunnel.provider is set to ngrok, the plugin spawns the ngrok CLI binary via child_process.spawn. When set to tailscale-serve or tailscale-funnel, it spawns the tailscale CLI instead. These processes run for the lifetime of the plugin and are terminated on shutdown. If tunnel.provider is none (or a publicUrl is provided directly), no external processes are spawned.
Local webhook server: The plugin opens an HTTP server (default 0.0.0.0:3335) to receive Twilio webhook callbacks and WebSocket media streams. Startup self-test: On startup, the plugin sends an HTTP POST to its own public webhook URL with an x-supercall-self-test header to verify connectivity. If publicUrl is misconfigured to point at an unintended endpoint, this self-test token could be sent there. Always verify your publicUrl or tunnel configuration before starting. Outbound API calls: The plugin makes outbound requests to the OpenAI Realtime API (WebSocket) and Twilio REST API during calls.
Twilio calls: Verified using Twilio's X-Twilio-Signature header (HMAC-SHA1). Self-test requests: Authenticated using an internal token (x-supercall-self-test) generated at startup. ngrok free-tier relaxation: On free-tier ngrok domains (.ngrok-free.app, .ngrok.io), URL reconstruction may vary due to ngrok's request rewriting; Twilio signature mismatches are logged but allowed through. Paid/custom ngrok domains (.ngrok.app) are verified strictly. This relaxation is limited to free-tier domains only and does not affect Tailscale or direct publicUrl configurations.
Call transcripts are persisted to ~/clawd/supercall-logs. These logs may contain sensitive conversation content. Review and rotate logs periodically.
Protect your credentials โ Twilio and OpenAI keys grant access to paid services Verify your public URL โ ensure publicUrl or tunnel config points where you expect before starting Rotate hooks.token periodically and if you suspect compromise Review call logs โ transcripts stored on disk may contain sensitive content
Code helpers, APIs, CLIs, browser automation, testing, and developer operations.
Largest current source with strong distribution and engagement signals.