Requirements
- Target platform
- OpenClaw
- Install method
- Manual import
- Extraction
- Extract archive
- Prerequisites
- OpenClaw
- Primary doc
- SKILL.md
Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model with vocabulary hints and text replacements. Requires uv (https://docs.astral.sh/uv/).
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
transcribe audio files using openai's gpt-4o-mini-transcribe model.
when receiving voice memos (especially via whatsapp), just run:
    uv run /Users/darin/clawd/skills/voice-transcribe/transcribe <audio-file>
then respond based on the transcribed content.
if darin says a word was transcribed wrong, add it to vocab.txt (for hints) or replacements.txt (for guaranteed fix). see sections below.
supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm, ogg, opus
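A caller could check a file's extension against this list before invoking the script. The following is a minimal sketch; `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names for illustration, not part of the skill itself.

```python
from pathlib import Path

# Extensions copied from the list above; the helper name is an assumption.
SUPPORTED_EXTENSIONS = {
    "mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm", "ogg", "opus",
}


def is_supported(path: str) -> bool:
    """Return True when the file extension (case-insensitive) is in the list."""
    return Path(path).suffix.lower().lstrip(".") in SUPPORTED_EXTENSIONS
```

This only inspects the file name; it does not verify that the file actually contains audio in that format.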
    # transcribe a voice memo
    transcribe /tmp/voice-memo.ogg

    # pipe to other tools
    transcribe /tmp/memo.ogg | pbcopy
add your openai api key to /Users/darin/clawd/skills/voice-transcribe/.env:
    OPENAI_API_KEY=sk-...
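How the script reads the .env file is not documented here. As a rough sketch of what parsing `KEY=value` lines can look like, assuming `#` comments and optional quotes (both assumptions), with `parse_env` as a hypothetical helper:

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines; a sketch, not a full dotenv parser."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

The text would typically come from reading the .env file, and the resulting `OPENAI_API_KEY` entry would then be handed to whatever client makes the request.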
add words to vocab.txt (one per line) to help the model recognize names/jargon:
    Clawdis
    Clawdbot
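One plausible way to turn vocab.txt into a hint is to join the words into a single string and pass it as the optional `prompt` parameter of OpenAI's transcription endpoint. Whether the skill does exactly this is an assumption; `build_vocab_prompt` is a hypothetical helper.

```python
def build_vocab_prompt(vocab_text: str) -> str:
    """Join one-word-per-line vocab text into a comma-separated hint string."""
    words = []
    for line in vocab_text.splitlines():
        word = line.strip()
        # Ignore blank lines and repeated entries, keeping file order.
        if word and word not in words:
            words.append(word)
    return ", ".join(words)
```

For the two example entries above, this produces the string `Clawdis, Clawdbot`. Hints only nudge the model, which is why the replacements file exists for cases that need a guaranteed fix.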
if the model still gets something wrong, add a replacement to replacements.txt:
    wrong spelling -> correct spelling
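The `wrong -> correct` format lends itself to a simple post-processing pass over the transcript. The sketch below assumes literal, case-sensitive substring replacement applied in file order, and that `#` lines are comments; the skill's actual matching rules are not stated, and both function names are hypothetical.

```python
def parse_replacements(text: str) -> list[tuple[str, str]]:
    """Parse lines of the form 'wrong spelling -> correct spelling'."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments (assumed convention), and malformed lines.
        if not line or line.startswith("#") or "->" not in line:
            continue
        wrong, correct = (part.strip() for part in line.split("->", 1))
        if wrong:
            pairs.append((wrong, correct))
    return pairs


def apply_replacements(transcript: str, pairs: list[tuple[str, str]]) -> str:
    """Apply each replacement in order as a literal substring swap."""
    for wrong, correct in pairs:
        transcript = transcript.replace(wrong, correct)
    return transcript
```

Because replacements run after transcription, they fix the output deterministically regardless of what the model heard, which is the "guaranteed fix" behavior described above.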
- assumes english (no language detection)
- uses gpt-4o-mini-transcribe model specifically
- caches by sha256 of audio file
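Caching by content hash means the same audio is never sent for transcription twice, even if the file is renamed. A minimal sketch of that idea follows; the cache directory layout, the `.txt` suffix, and the `transcribe` callable are all assumptions for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable


def audio_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file contents in chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_transcript(
    audio: Path, cache_dir: Path, transcribe: Callable[[Path], str]
) -> str:
    """Return a cached transcript if present, else transcribe and store it."""
    cache_file = cache_dir / f"{audio_sha256(audio)}.txt"
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    text = transcribe(audio)  # hypothetical callable that performs the API request
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(text, encoding="utf-8")
    return text
```

Keying on the hash of the bytes rather than the path means a re-encoded or edited memo gets a fresh transcription, while an identical file under a new name hits the cache.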