Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Local voice I/O for OpenClaw agents. Transcribe inbound audio/voice messages using local Whisper (whisper.cpp) and generate voice replies using local Piper T...
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
Local-only voice I/O for OpenClaw agents.
- STT: transcribe.sh converts audio to text via the local Whisper binary
- TTS: speak.sh converts text to speech via the local Piper binary
- Network calls: none; both scripts run fully offline
- No cloud APIs, no API keys required
The following must be installed on the system before using this skill:

| Requirement | Purpose |
| --- | --- |
| whisper binary | Speech-to-text inference |
| ggml-base.en.bin model file | Whisper STT model |
| piper binary | Text-to-speech synthesis |
| *.onnx voice model files | Piper TTS voices |
| ffmpeg | Audio format conversion |

See README.md for installation and setup instructions.
| Variable | Default | Purpose |
| --- | --- | --- |
| WHISPER_BIN | auto-detected via which | Path to whisper binary |
| WHISPER_MODEL | ~/.cache/whisper/ggml-base.en.bin | Path to Whisper model file |
| PIPER_BIN | auto-detected via which | Path to piper binary |
| VOICECLAW_VOICES_DIR | ~/.local/share/piper/voices | Directory containing .onnx voice model files |
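The defaults in the table above can be resolved with ordinary shell parameter expansion. The following is a minimal sketch of that idea, not the code shipped in transcribe.sh or speak.sh; the helper name `resolve_bin` is hypothetical, and `command -v` is used here as the portable equivalent of `which`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the override if one is given,
# otherwise look the binary up on PATH.
resolve_bin() {
  local override="$1" name="$2"
  if [ -n "$override" ]; then
    printf '%s\n' "$override"
  else
    command -v "$name" || return 1
  fi
}

# Fall back to the documented defaults when the variables are unset or empty.
WHISPER_MODEL="${WHISPER_MODEL:-$HOME/.cache/whisper/ggml-base.en.bin}"
VOICECLAW_VOICES_DIR="${VOICECLAW_VOICES_DIR:-$HOME/.local/share/piper/voices}"

# Empty result means the binary was neither overridden nor found on PATH.
WHISPER_BIN="$(resolve_bin "${WHISPER_BIN:-}" whisper || true)"
PIPER_BIN="$(resolve_bin "${PIPER_BIN:-}" piper || true)"
```

An explicit override always wins, so `WHISPER_BIN=/opt/whisper/main` would be used even when another `whisper` is on PATH.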
which whisper && echo "STT binary: OK"
which piper && echo "TTS binary: OK"
which ffmpeg && echo "ffmpeg: OK"
ls "${WHISPER_MODEL:-$HOME/.cache/whisper/ggml-base.en.bin}" && echo "STT model: OK"
# grep . fails when no voice file is listed, so "OK" is printed only if at least one .onnx file exists
ls "${VOICECLAW_VOICES_DIR:-$HOME/.local/share/piper/voices}"/*.onnx 2>/dev/null | head -1 | grep . && echo "TTS voices: OK"
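The checks above print a line per dependency that is present. For use inside a script it can be handier to collect what is missing and fail once. A minimal sketch, assuming a hypothetical helper named `missing_commands`:

```shell
#!/usr/bin/env bash
# Print the name of every listed command that is not on PATH.
# Returns non-zero if at least one command is missing.
missing_commands() {
  local status=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      printf '%s\n' "$cmd"
      status=1
    fi
  done
  return "$status"
}

# Example: check the three binaries this skill relies on.
if missing="$(missing_commands whisper piper ffmpeg)"; then
  echo "All binaries found"
else
  echo "Missing: $(echo "$missing" | tr '\n' ' ')" >&2
fi
```

The model and voice files are not covered by this helper; the `ls` checks above remain the way to verify those.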
# Transcribe audio → text (supports ogg, mp3, m4a, wav, flac)
TRANSCRIPT=$(bash scripts/transcribe.sh /path/to/audio.ogg)

Override model path:
WHISPER_MODEL=/path/to/ggml-base.en.bin bash scripts/transcribe.sh audio.ogg
# Step 1: Generate WAV (local Piper, no network)
WAV=$(bash scripts/speak.sh "Your response here." /tmp/reply.wav en_US-lessac-medium)
# Step 2: Convert to OGG Opus (Telegram voice requirement)
ffmpeg -i "$WAV" -c:a libopus -b:a 32k /tmp/reply.ogg -y -loglevel error
# Step 3: Send via message tool (filePath=/tmp/reply.ogg)

Override voice directory:
VOICECLAW_VOICES_DIR=/path/to/voices bash scripts/speak.sh "Hello." /tmp/reply.wav
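Steps 1 and 2 above can be wrapped in a single function that also removes the intermediate WAV. This is a sketch under stated assumptions, not part of the package: `make_voice_reply` and the `SPEAK_SCRIPT` variable are hypothetical names, and the code relies on speak.sh printing the path of the WAV it wrote, as the `WAV=$(...)` usage above implies.

```shell
#!/usr/bin/env bash
# Assumed variable pointing at the package's scripts/speak.sh.
SPEAK_SCRIPT="${SPEAK_SCRIPT:-scripts/speak.sh}"

# Usage: make_voice_reply "text" /path/to/out.ogg [voice]
make_voice_reply() {
  local text="$1" out="$2" voice="${3:-en_US-lessac-medium}"
  local workdir wav status=0
  workdir="$(mktemp -d)" || return 1
  # Same two steps as documented: Piper WAV first, then OGG Opus via ffmpeg.
  wav="$(bash "$SPEAK_SCRIPT" "$text" "$workdir/reply.wav" "$voice")" &&
    ffmpeg -i "$wav" -c:a libopus -b:a 32k "$out" -y -loglevel error ||
    status=$?
  rm -rf "$workdir"   # remove the intermediate WAV, even on failure
  return "$status"
}
```

A call such as `make_voice_reply "Deployment complete." /tmp/reply.ogg` would then leave only the OGG file behind; using `mktemp -d` avoids two concurrent replies overwriting the same /tmp/reply.wav.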
| Voice | Style |
| --- | --- |
| en_US-lessac-medium | Neutral American (default) |
| en_US-amy-medium | Warm American female |
| en_US-joe-medium | American male |
| en_US-kusal-medium | Expressive American male |
| en_US-danny-low | Deep American male (fast) |
| en_GB-alba-medium | British female |
| en_GB-northern_english_male-medium | Northern British male |
- Voice in → Voice + Text out. Always respond with both a voice reply and a text reply when a voice message is received.
- Include the transcript. Show "🎙️ I heard: [transcript]" at the top of every text reply to a voice message.
- Keep voice responses concise. Piper TTS works best under ~200 words: summarize for audio, include full detail in text.
- Local only. Never use a cloud TTS/STT API. Only the local whisper and piper binaries.
- Send voice before text. Send the audio file first, then follow with the text reply.
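Two of the rules above lend themselves to small helpers. The sketch below is illustrative only and both function names are hypothetical. Note that `first_words` simply truncates, which is a crude stand-in for the summarizing the rule actually asks for; it only guarantees the audio text stays under the word limit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the first N words (default 200) of a text,
# collapsing runs of whitespace and line breaks into single spaces.
first_words() {
  local text="$1" limit="${2:-200}"
  printf '%s\n' "$text" | awk -v n="$limit" '
    { for (i = 1; i <= NF && c < n; i++) { printf "%s%s", (c ? " " : ""), $i; c++ } }
    END { print "" }'
}

# Hypothetical helper: build the text reply with the transcript line on top.
format_text_reply() {
  local transcript="$1" response="$2"
  printf '🎙️ I heard: %s\n\n%s\n' "$transcript" "$response"
}
```

Using `printf` for the reply produces real line breaks between the transcript line and the response, rather than a literal `\n` sequence inside a quoted string.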
# 1. Transcribe inbound voice message
TRANSCRIPT=$(bash path/to/voiceclaw/scripts/transcribe.sh /path/to/voice.ogg)
# 2. Compose reply and generate audio
RESPONSE="Deployment complete. All checks passed."
WAV=$(bash path/to/voiceclaw/scripts/speak.sh "$RESPONSE" /tmp/reply_$$.wav)
ffmpeg -i "$WAV" -c:a libopus -b:a 32k /tmp/reply_$$.ogg -y -loglevel error
# 3. Send voice + text
# message(action=send, filePath=/tmp/reply_$$.ogg, ...)
# reply: "🎙️ I heard: $TRANSCRIPT\n\n$RESPONSE"
| Issue | Fix |
| --- | --- |
| whisper: command not found | Ensure whisper binary is installed and in PATH |
| Whisper model not found | Set WHISPER_MODEL=/path/to/ggml-base.en.bin |
| piper: command not found | Ensure piper binary is installed and in PATH |
| Voice model missing | Set VOICECLAW_VOICES_DIR=/path/to/voices/ |
| OGG won't play on Telegram | Ensure -c:a libopus flag in ffmpeg command |
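For the last issue in the table, one way to confirm that a generated file really is Opus-encoded is to ask ffprobe for the codec of its first audio stream. This is a sketch that assumes ffprobe was installed alongside ffmpeg, which is common but not guaranteed; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Print the codec name of the first audio stream, e.g. "opus" or "vorbis".
audio_codec() {
  ffprobe -v error -select_streams a:0 \
    -show_entries stream=codec_name \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Succeeds only if the first audio stream is Opus.
is_opus() {
  [ "$(audio_codec "$1")" = "opus" ]
}
```

A check like `is_opus /tmp/reply.ogg || echo "re-encode with -c:a libopus" >&2` can then run before the file is sent.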