Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Windows voice companion for OpenClaw. Custom wake word via Porcupine, local STT via faster-whisper, streamed responses over the gateway WebSocket, and ElevenLabs TTS with natural chime/thinking sounds. Supports multi-turn conversation with automatic follow-up listening, mic suppression to prevent feedback, and a system tray with pause/resume. Recommended voices: Matilda (XrExE9yKIg1WjnnlVkGX, free tier) or Ivy (MClEFoImJXBTgLwdLI5n, paid tier). Fully customizable wake word, voice, hotkey, and silence thresholds.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
A Python companion app that gives OpenClaw a voice. Say a wake word (or press a hotkey), speak naturally, and hear the AI respond, then keep talking for multi-turn conversation.

Mic → Porcupine wake word → faster-whisper STT → OpenClaw Gateway → ElevenLabs TTS → Speaker
```
# 1. Navigate to the skill scripts
cd {baseDir}/scripts

# 2. Create a virtual environment and install dependencies
python -m venv venv
venv\Scripts\pip install -r requirements.txt

# 3. Copy .env.example to .env and fill in your keys
copy .env.example .env

# 4. Run the assistant
venv\Scripts\python src\assistant.py
```
| Service | What you need | Cost |
| --- | --- | --- |
| OpenClaw gateway | Running locally on ws://127.0.0.1:18789 with a gateway token | — |
| ElevenLabs | API key + voice ID (free tier works with default voices) | Free+ |
| Picovoice | Access key from picovoice.ai (free tier works) | Free |
| Python | 3.10+ (tested on 3.14) | — |
| Microphone | Any input device | — |
```
# OpenClaw Gateway
GATEWAY_URL=ws://127.0.0.1:18789
GATEWAY_TOKEN=your-gateway-token

# ElevenLabs TTS
ELEVENLABS_API_KEY=your-api-key
ELEVENLABS_VOICE_ID=XrExE9yKIg1WjnnlVkGX  # Matilda (free tier), or MClEFoImJXBTgLwdLI5n for Ivy (paid)
ELEVENLABS_MODEL_ID=eleven_v3

# Porcupine Wake Word
PORCUPINE_ACCESS_KEY=your-access-key
PORCUPINE_MODEL_PATH=  # path to custom .ppn file (optional)

# Whisper STT
WHISPER_MODEL=base  # tiny, base, small, medium, large

# Tuning
WAKE_SENSITIVITY=0.7  # 0.0–1.0 (higher = more sensitive)
SILENCE_TIMEOUT=1.5   # seconds of silence to stop recording
HOTKEY=ctrl+shift+k   # global keyboard shortcut
```
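As a rough sketch of how these settings might be read at startup: the helper below is hypothetical (the real app may use `python-dotenv` or parse `.env` itself), but it mirrors the keys, defaults, and required values listed above.

```python
import os

def load_settings() -> dict:
    """Read voice-assistant settings from the environment, applying the
    documented defaults for optional keys. Hypothetical helper; the real
    app may load .env differently."""
    required = ["GATEWAY_TOKEN", "ELEVENLABS_API_KEY", "PORCUPINE_ACCESS_KEY"]
    missing = [k for k in required if not os.environ.get(k)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return {
        "gateway_url": os.environ.get("GATEWAY_URL", "ws://127.0.0.1:18789"),
        "gateway_token": os.environ["GATEWAY_TOKEN"],
        "voice_id": os.environ.get("ELEVENLABS_VOICE_ID", "XrExE9yKIg1WjnnlVkGX"),
        "whisper_model": os.environ.get("WHISPER_MODEL", "base"),
        "wake_sensitivity": float(os.environ.get("WAKE_SENSITIVITY", "0.7")),
        "silence_timeout": float(os.environ.get("SILENCE_TIMEOUT", "1.5")),
        "hotkey": os.environ.get("HOTKEY", "ctrl+shift+k"),
    }
```

Failing fast on the three secrets keeps misconfiguration errors at startup rather than mid-conversation.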
1. Go to the Picovoice Console
2. Create a custom wake word (e.g. "Hey Claudia", "Hey OpenClaw")
3. Download the .ppn file for your platform
4. Set PORCUPINE_MODEL_PATH in .env to the file path

Without a custom model, the assistant falls back to the built-in "hey google" wake word.
The assistant plays short audio clips when activated ("Yep!", "Hi!") and while thinking ("Hmm...", "Let me think..."). Generate these in your chosen ElevenLabs voice:

```
cd {baseDir}/scripts
venv\Scripts\python generate_chime_sounds.py
venv\Scripts\python generate_thinking_sounds.py
```

Re-run these after changing ELEVENLABS_VOICE_ID.
Use start.bat to launch without a console window (runs via pythonw.exe). The assistant appears as a system tray icon with Pause/Resume/Quit controls. For auto-start on Windows, create a shortcut to start.bat in shell:startup.
1. Wake: Porcupine detects the wake word (or the user presses the hotkey)
2. Chime: plays a random activation sound ("Yep!", "Hi!")
3. Record: records speech until 1.5s of silence (2s grace period for initial silence)
4. Thinking: plays a filler sound ("Hmm...", "Let me think...")
5. Transcribe: faster-whisper converts audio to text locally (CPU, int8)
6. Gateway: sends text to the OpenClaw gateway via WebSocket and streams the response
7. Speak: ElevenLabs converts the response to speech and plays it through the speakers
8. Follow-up: automatically listens for 5s after speaking for conversation continuity
9. Idle: returns to wake-word listening after 5s of silence

Mic suppression keeps the microphone muted during all speaker output to prevent feedback loops.
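The loop above amounts to a small state machine. The sketch below is illustrative (state and event names are not from the source) but follows the documented transitions, including the follow-up window that either returns to recording or times out back to idle:

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()       # listening for wake word / hotkey
    RECORDING = auto()  # capturing speech until silence timeout
    THINKING = auto()   # transcribing + waiting on the gateway
    SPEAKING = auto()   # playing TTS (mic suppressed)
    FOLLOW_UP = auto()  # 5s window for a follow-up turn

def next_state(state: State, event: str) -> State:
    """Illustrative transition table for the documented pipeline;
    unknown (state, event) pairs leave the state unchanged."""
    table = {
        (State.IDLE, "wake"): State.RECORDING,
        (State.RECORDING, "silence"): State.THINKING,
        (State.THINKING, "reply"): State.SPEAKING,
        (State.SPEAKING, "done"): State.FOLLOW_UP,
        (State.FOLLOW_UP, "speech"): State.RECORDING,
        (State.FOLLOW_UP, "timeout"): State.IDLE,
    }
    return table.get((state, event), state)
```

Modeling it this way makes the mic-suppression rule easy to enforce: mute whenever the state is SPEAKING, unmute otherwise.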
See references/architecture.md for source file breakdown, WebSocket protocol details, and audio pipeline internals.
See references/troubleshooting.md for common issues with mic detection, gateway connection, TTS errors, and wake word tuning.