Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Send voice message replies in iMessage using local Kokoro-ONNX TTS. Generates native iMessage voice bubbles (CAF/Opus) that play inline with waveform — not f...
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Generate and send native iMessage voice messages using local Kokoro TTS. Voice messages appear as inline playable bubbles with waveforms — identical to voice messages recorded in Messages.app.
Your text response → Kokoro TTS (local) → afconvert (native Apple encoder) → CAF/Opus → BlueBubbles → iMessage voice bubble
    bash ${baseDir}/scripts/setup.sh
Installs: kokoro-onnx, soundfile, numpy. Downloads Kokoro models (~136MB) to ~/.cache/kokoro-onnx/.
Requires: BlueBubbles channel configured in OpenClaw (channels.bluebubbles).
Write the response text to a temp file, then pass it via --text-file to avoid shell injection:
    echo "Your response text here" > /tmp/voice_text.txt
    ${baseDir}/.venv/bin/python ${baseDir}/scripts/generate_voice_reply.py --text-file /tmp/voice_text.txt --output /tmp/voice_reply.caf
Alternatively, pass text directly (ensure proper shell escaping):
    ${baseDir}/.venv/bin/python ${baseDir}/scripts/generate_voice_reply.py --text "Your response text here" --output /tmp/voice_reply.caf
Options:
- --voice af_heart: Kokoro voice (default: af_heart)
- --speed 1.15: Playback speed (default: 1.15)
- --lang en-us: Language code (default: en-us)
Security note: The Python script uses argparse and subprocess.run with list arguments (no shell=True), so input is handled safely within the script. When calling from a shell, prefer --text-file for untrusted input to avoid shell metacharacter issues.
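The two steps above (write the text to a file, then call the generator on it) can also be driven from Python without a shell. This is a minimal sketch, not part of the skill: the helper names `build_generate_command` and `generate_voice_reply`, and the idea of passing the skill folder as `base_dir`, are assumptions for illustration. The flags and their defaults are the ones listed in the options above.

```python
import subprocess
import tempfile
from pathlib import Path


def build_generate_command(base_dir, text_file, output,
                           voice="af_heart", speed=1.15, lang="en-us"):
    """Return the argument list for generate_voice_reply.py (hypothetical helper)."""
    base = Path(base_dir)
    return [
        str(base / ".venv" / "bin" / "python"),
        str(base / "scripts" / "generate_voice_reply.py"),
        "--text-file", str(text_file),
        "--output", str(output),
        "--voice", voice,
        "--speed", str(speed),
        "--lang", lang,
    ]


def generate_voice_reply(base_dir, text, output="/tmp/voice_reply.caf"):
    """Write text to a temp file, run the generator on it, then clean up."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                     encoding="utf-8") as handle:
        handle.write(text)
        text_file = handle.name
    try:
        # A list of arguments means no shell ever parses the reply text.
        subprocess.run(build_generate_command(base_dir, text_file, output),
                       check=True)
    finally:
        Path(text_file).unlink(missing_ok=True)
    return output
```

Because the reply text only ever travels through a file, quotes, backticks, and dollar signs in it need no escaping.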
Use the message tool:
    {
      "action": "sendAttachment",
      "channel": "bluebubbles",
      "target": "+1XXXXXXXXXX",
      "path": "/tmp/voice_reply.caf",
      "filename": "Audio Message.caf",
      "contentType": "audio/x-caf",
      "asVoice": true
    }
Critical parameters for native voice bubble:
- filename must be "Audio Message.caf"
- contentType must be "audio/x-caf"
- asVoice must be true
All three are required for iMessage to render the message as an inline voice bubble with waveform instead of a file attachment.
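Since a single wrong field turns the voice bubble into a plain attachment, it can help to build the payload in one place and check it before sending. The sketch below is illustrative only: `build_voice_payload` and `missing_voice_fields` are hypothetical helpers, and how the payload reaches the message tool is left out. The field names and values come from the example above.

```python
import json

# The three fields iMessage needs in order to render a native voice bubble.
REQUIRED_VOICE_FIELDS = {
    "filename": "Audio Message.caf",
    "contentType": "audio/x-caf",
    "asVoice": True,
}


def build_voice_payload(target, path):
    """Build a sendAttachment payload with the critical fields filled in."""
    payload = {
        "action": "sendAttachment",
        "channel": "bluebubbles",
        "target": target,
        "path": path,
    }
    payload.update(REQUIRED_VOICE_FIELDS)
    return payload


def missing_voice_fields(payload):
    """Return the critical fields that are absent or carry the wrong value."""
    return [key for key, value in REQUIRED_VOICE_FIELDS.items()
            if payload.get(key) != value]


if __name__ == "__main__":
    payload = build_voice_payload("+1XXXXXXXXXX", "/tmp/voice_reply.caf")
    print(json.dumps(payload, indent=2))
```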
| Language | Female | Male |
| English | af_heart ⭐ | am_puck |
| Spanish | ef_dora | em_alex |
| French | ff_siwis | none |
| Japanese | jf_alpha | jm_beta |
| Chinese | zf_xiaobei | zm_yunjian |
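The voice table maps directly onto a small lookup that yields a value for --voice. This is a sketch under assumptions: `pick_voice` is a hypothetical helper, and its fallback rules (use the female voice when no male voice is listed, use af_heart for unlisted languages) are illustrative choices, not documented behavior of the skill.

```python
# Voice names copied from the table above; None marks a missing entry.
VOICES = {
    "English": {"female": "af_heart", "male": "am_puck"},
    "Spanish": {"female": "ef_dora", "male": "em_alex"},
    "French": {"female": "ff_siwis", "male": None},
    "Japanese": {"female": "jf_alpha", "male": "jm_beta"},
    "Chinese": {"female": "zf_xiaobei", "male": "zm_yunjian"},
}
DEFAULT_VOICE = "af_heart"  # the script's documented default voice


def pick_voice(language, gender="female"):
    """Return a voice name for --voice (fallback rules are assumptions)."""
    entry = VOICES.get(language)
    if entry is None:
        return DEFAULT_VOICE
    return entry.get(gender) or entry["female"]
```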
Reply with a voice message when:
- The user sent you a voice message (voice-for-voice)
- The user explicitly asks for an audio/voice response
Always include a text reply alongside the voice message for accessibility.
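The rule above reduces to a small decision function. The sketch below is only one way to express it; `plan_reply` and its two boolean inputs are hypothetical names, and detecting whether the incoming message was a voice message is outside its scope.

```python
def plan_reply(incoming_is_voice, user_requested_audio):
    """Return the reply parts to send; text is always included."""
    parts = ["text"]
    if incoming_is_voice or user_requested_audio:
        parts.append("voice")
    return parts
```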
macOS: CAF container, Opus codec, 48 kHz mono, 32 kbps, encoded by Apple's native afconvert. This is identical to what Messages.app produces. Fallback: MP3 via ffmpeg (works, but may not render as a native voice bubble on all iMessage versions).
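The choice between the native encoder and the fallback can be sketched as a lookup on the PATH. This is an illustration, not the script's actual logic: `choose_encoder` is a hypothetical helper, and the real encoder arguments are deliberately omitted because they are not given here.

```python
import shutil


def choose_encoder(which=shutil.which):
    """Return (encoder, file extension), preferring afconvert over ffmpeg.

    `which` is injectable so the choice can be exercised on any platform.
    Returns None when neither tool is on the PATH.
    """
    if which("afconvert"):
        return ("afconvert", ".caf")  # macOS: CAF/Opus, native voice bubble
    if which("ffmpeg"):
        return ("ffmpeg", ".mp3")  # fallback: may show as a file attachment
    return None
```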
$0. Kokoro TTS runs entirely locally. No API calls for voice generation.
- Voice message shows as file attachment: Ensure all three parameters are set: filename="Audio Message.caf", contentType="audio/x-caf", asVoice=true.
- First word clipped: The script prepends 150ms silence automatically. If still clipped, increase the silence pad in the script.
- Kokoro model not found: Run bash ${baseDir}/scripts/setup.sh.
- afconvert not found: Only available on macOS. Script falls back to ffmpeg/MP3 on Linux.
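For the clipped-first-word case, the silence pad amounts to a run of zero samples placed before the audio. The sketch below shows the arithmetic only: `prepend_silence` is a hypothetical helper working on a plain list of samples, and the 24000 Hz rate in the usage line is an assumed example value, not a documented setting of the script.

```python
def prepend_silence(samples, sample_rate, pad_ms=150):
    """Return samples with pad_ms milliseconds of silence in front."""
    pad_length = sample_rate * pad_ms // 1000
    return [0.0] * pad_length + list(samples)


if __name__ == "__main__":
    # Assumed example rate; 150 ms at 24000 Hz is 3600 samples.
    padded = prepend_silence([0.5, -0.5], sample_rate=24000)
    print(len(padded))
```

Raising `pad_ms` lengthens the lead-in at the cost of a slightly longer message.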