Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Telegram voice-to-voice for macOS Apple Silicon: transcribe inbound .ogg voice notes with yap (Speech.framework) and reply with Telegram voice notes via say+ffmpeg. Not compatible with Linux/Windows.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
This is an OpenClaw skill.
- macOS on Apple Silicon.
- yap CLI available in PATH (Speech.framework transcription). Project: https://github.com/finnvoor/yap (by finnvoor)
- ffmpeg available in PATH.
This skill is macOS-only (uses say + Speech.framework). The skill registry cannot enforce OS restrictions, so installing/running it on Linux/Windows will result in runtime failures.
Store a small per-user preference file in the workspace:
- State file: voice_state/telegram.json
- Key: Telegram sender user id (string)
- Values:
  - "voice" (default): reply with a Telegram voice note
  - "text": reply with a single text message
If the file does not exist or the sender id is missing, assume "voice".
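The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's own implementation: the function name `get_reply_mode` is hypothetical, and treating an unreadable or malformed file the same as a missing one is an assumption.

```python
import json
from pathlib import Path

STATE_FILE = Path("voice_state/telegram.json")  # workspace-relative path from the docs
DEFAULT_MODE = "voice"
VALID_MODES = {"voice", "text"}


def get_reply_mode(sender_id, state_file=STATE_FILE):
    """Return "voice" or "text" for a Telegram sender id, defaulting to "voice"."""
    try:
        state = json.loads(Path(state_file).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # Assumption: a missing or malformed file falls back to the default
        return DEFAULT_MODE
    if not isinstance(state, dict):
        return DEFAULT_MODE
    # Keys are sender ids stored as strings
    mode = state.get(str(sender_id), DEFAULT_MODE)
    return mode if mode in VALID_MODES else DEFAULT_MODE
```

Converting the sender id with `str()` keeps the lookup consistent whether the caller holds the id as an integer or as a string.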
If an inbound text message is exactly:
- /audio off → set state to "text" and confirm with a short text reply.
- /audio on → set state to "voice" and confirm with a short text reply.
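A possible way to handle the two commands is sketched below. The function name `handle_audio_command` and the confirmation wording are hypothetical; the docs only require an exact match and a short text confirmation.

```python
import json
from pathlib import Path

# Exact command text mapped to the stored preference value
COMMANDS = {"/audio off": "text", "/audio on": "voice"}


def handle_audio_command(text, sender_id, state_file="voice_state/telegram.json"):
    """Apply /audio on|off for one sender; return a confirmation, or None if not a command."""
    mode = COMMANDS.get(text)  # exact match only, no trimming or case folding
    if mode is None:
        return None
    path = Path(state_file)
    try:
        state = json.loads(path.read_text(encoding="utf-8"))
        if not isinstance(state, dict):
            state = {}
    except (FileNotFoundError, json.JSONDecodeError):
        state = {}  # start fresh when the file is missing or unreadable
    state[str(sender_id)] = mode
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    # Hypothetical confirmation text
    return f"Audio replies {'on' if mode == 'voice' else 'off'}."
```

Any other message returns `None` and leaves the state file untouched, so the caller can continue with normal message handling.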
Telegram voice notes often show up as <media:audio> in message text. OpenClaw saves the attachment to disk (typically .ogg) under: ~/.openclaw/media/inbound/
Recommended approach:
- If the inbound message context includes an attachment path, use it.
- Otherwise, take the most recent *.ogg from ~/.openclaw/media/inbound/.
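The recommended approach could look roughly like this in Python. The name `find_voice_note` is hypothetical, "most recent" is interpreted here as the latest modification time, and falling back to the directory scan when the supplied path does not exist is an assumption.

```python
from pathlib import Path

INBOUND_DIR = Path.home() / ".openclaw" / "media" / "inbound"


def find_voice_note(attachment_path=None, inbound_dir=INBOUND_DIR):
    """Prefer the attachment path from the message context; else the newest *.ogg."""
    if attachment_path:
        candidate = Path(attachment_path).expanduser()
        if candidate.is_file():
            return candidate
    inbound = Path(inbound_dir)
    if not inbound.is_dir():
        return None
    oggs = [p for p in inbound.glob("*.ogg") if p.is_file()]
    if not oggs:
        return None
    # Assumption: "most recent" means the latest modification time
    return max(oggs, key=lambda p: p.stat().st_mtime)
```

The fallback can pick up another sender's voice note when several arrive close together, which is why the attachment path is preferred whenever the message context provides one.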
- Default locale: macOS system locale.
- Optional env: YAP_LOCALE overrides the transcription locale (e.g. it-IT, en-US).
- Preferred: yap transcribe --locale "${YAP_LOCALE:-<system>}" <path.ogg>
- If YAP_LOCALE is not set, the helper script uses the macOS system locale (from defaults read -g AppleLocale).
- If transcription fails or is empty: ask the user to repeat or send text.
- Helper script: scripts/transcribe_telegram_ogg.sh [path.ogg]
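The locale fallback and the yap invocation can be sketched as follows. This is not the contents of scripts/transcribe_telegram_ogg.sh, which is not shown here. Two points are assumptions: that the AppleLocale value (typically of the form en_US, sometimes with an @ suffix) needs converting to the hyphenated form used in the examples above, and that yap writes the transcript to stdout. All function names are hypothetical.

```python
import os
import subprocess


def system_locale():
    """Read the macOS system locale via `defaults read -g AppleLocale`."""
    result = subprocess.run(
        ["defaults", "read", "-g", "AppleLocale"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def normalize_locale(value):
    # Assumption: drop any "@..." suffix and use hyphens (en_US -> en-US)
    return value.split("@", 1)[0].replace("_", "-")


def yap_command(ogg_path, env=None, fallback=system_locale):
    """Build the yap command line; YAP_LOCALE wins over the system locale."""
    env = os.environ if env is None else env
    locale = env.get("YAP_LOCALE") or fallback()
    return ["yap", "transcribe", "--locale", normalize_locale(locale), str(ogg_path)]


def transcribe(ogg_path):
    """Return the transcript, or None when yap fails or prints nothing."""
    result = subprocess.run(yap_command(ogg_path), capture_output=True, text=True)
    text = result.stdout.strip()  # assumption: the transcript is written to stdout
    return text if result.returncode == 0 and text else None
```

A `None` result from `transcribe` corresponds to the "fails or is empty" case, where the skill should ask the user to repeat the message or send text.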
Voice default: SYSTEM (uses the current macOS system voice; recommended). You can override it by passing a specific voice name to the helper script.
- Generate the reply text.
- Convert the reply text to an OGG/Opus voice note using: scripts/tts_telegram_voice.sh "<reply text>" [SYSTEM|VoiceName]
  The script prints the generated .ogg path to stdout.
- Send the .ogg back to Telegram as a voice note (not a generic audio file): use the message tool with
  - asVoice: true
  - media: <path.ogg>
  - optionally set replyTo to thread the response
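As a rough sketch of the steps above, the helper script can be driven from Python and its output turned into arguments for the message tool. The function names are hypothetical. Reading the path from the last non-empty stdout line is an assumption, and so is representing the message tool call as a plain dictionary; only the field names asVoice, media, and replyTo come from the docs.

```python
import subprocess


def synthesize_voice_note(reply_text, voice="SYSTEM", script="scripts/tts_telegram_voice.sh"):
    """Run the TTS helper script and return the .ogg path it prints."""
    result = subprocess.run(
        [script, reply_text, voice],
        capture_output=True, text=True, check=True,
    )
    return parse_ogg_path(result.stdout)


def parse_ogg_path(stdout):
    # Assumption: the generated path is the last non-empty line of stdout
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    if not lines or not lines[-1].endswith(".ogg"):
        raise ValueError("helper script did not print an .ogg path")
    return lines[-1]


def voice_message_args(ogg_path, reply_to=None):
    """Arguments for the message tool; asVoice marks the file as a voice note."""
    args = {"asVoice": True, "media": ogg_path}
    if reply_to is not None:
        args["replyTo"] = reply_to  # optional: thread the response
    return args
```

For example, `voice_message_args(synthesize_voice_note("Hello"), reply_to=message_id)` would produce a threaded voice reply, where `message_id` stands for whatever identifier the inbound message context exposes.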
When the sender's preference is "text", reply with a single text message containing both parts: Transcription: <...> Reply: <...>