Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Official ElevenLabs skill for speech-to-text transcription. Convert audio to text with state-of-the-art accuracy. Supports 90+ languages, speaker diarization, and realtime streaming.
- ffmpeg installed (brew install ffmpeg on macOS)
- ELEVENLABS_API_KEY environment variable set
- Python 3.8+ (dependencies auto-install on first run)
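The prerequisites above can be checked before the first run. The following is a minimal sketch, not part of the skill itself: the function name `missing_prerequisites` and its injectable parameters are hypothetical, chosen so the check can be exercised without touching the real environment.

```python
import os
import shutil
import sys


def missing_prerequisites(env=None, which=shutil.which, version=None):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper: the three checks mirror the prerequisites listed
    in this document (ffmpeg, ELEVENLABS_API_KEY, Python 3.8+).
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else version
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if not env.get("ELEVENLABS_API_KEY"):
        problems.append("ELEVENLABS_API_KEY is not set")
    if tuple(version) < (3, 8):
        problems.append("Python 3.8+ is required")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("missing:", problem)
```

Called with no arguments, the helper inspects the current process environment; the parameters exist only to make it easy to test.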
{baseDir}/scripts/transcribe.sh <audio_file> [options]
{baseDir}/scripts/transcribe.sh --url <stream_url> [options]
{baseDir}/scripts/transcribe.sh --mic [options]
Transcribe a local audio file:
    {baseDir}/scripts/transcribe.sh recording.mp3

With speaker identification:
    {baseDir}/scripts/transcribe.sh meeting.mp3 --diarize

Get full JSON response with timestamps:
    {baseDir}/scripts/transcribe.sh interview.wav --diarize --json

Stream from a URL (e.g., live radio, podcast):
    {baseDir}/scripts/transcribe.sh --url https://npr-ice.streamguys1.com/live.mp3

Transcribe from microphone:
    {baseDir}/scripts/transcribe.sh --mic

Stream a local file in realtime (useful for testing):
    {baseDir}/scripts/transcribe.sh audio.mp3 --realtime

Suppress status messages on stderr:
    {baseDir}/scripts/transcribe.sh --mic --quiet
| Option | Description |
| --- | --- |
| --diarize | Identify different speakers in the audio |
| --lang CODE | ISO language hint (e.g., en, pt, es, fr) |
| --json | Output full JSON with timestamps and metadata |
| --events | Tag audio events (laughter, music, applause) |
| --realtime | Stream local file instead of batch processing |
| --partials | Show interim transcripts during realtime mode |
| -q, --quiet | Suppress status messages (recommended for agents) |
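When the script is driven from another program, the options above map naturally onto an argument list. The sketch below assumes only the flags documented here; the helper `build_command` and its keyword names are hypothetical, and it does not check which flag combinations the script itself accepts.

```python
def build_command(script, source=None, url=None, mic=False, diarize=False,
                  lang=None, json_output=False, events=False,
                  realtime=False, partials=False, quiet=False):
    """Assemble an argv list for transcribe.sh from the documented flags."""
    # Exactly one input must be chosen: a local file, a stream URL, or the mic.
    if sum([source is not None, url is not None, bool(mic)]) != 1:
        raise ValueError("choose exactly one of source, url, or mic")

    cmd = [script]
    if source is not None:
        cmd.append(source)
    elif url is not None:
        cmd += ["--url", url]
    else:
        cmd.append("--mic")

    if diarize:
        cmd.append("--diarize")
    if lang:
        cmd += ["--lang", lang]
    # Boolean switches, appended in the order the options are documented.
    for flag, enabled in [("--json", json_output), ("--events", events),
                          ("--realtime", realtime), ("--partials", partials),
                          ("--quiet", quiet)]:
        if enabled:
            cmd.append(flag)
    return cmd


print(build_command("./transcribe.sh", source="meeting.mp3", diarize=True, json_output=True))
```

Returning a list rather than a single string keeps file names with spaces intact when the result is passed to `subprocess.run`.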
Plain text transcription:
    The quick brown fox jumps over the lazy dog.
{
  "text": "The quick brown fox jumps over the lazy dog.",
  "language_code": "eng",
  "language_probability": 0.98,
  "words": [
    {"text": "The", "start": 0.0, "end": 0.15, "type": "word", "speaker_id": "speaker_0"}
  ]
}
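A common use of the JSON response is to turn the per-word entries into speaker turns. The sketch below assumes the field names shown in the sample above (`words`, `text`, `type`, `speaker_id`); the function `speaker_turns` is hypothetical, and the sample data is invented for illustration rather than taken from real output.

```python
import json


def speaker_turns(response):
    """Collapse consecutive words from one speaker into (speaker, text) turns."""
    turns = []
    for item in response.get("words", []):
        if item.get("type") != "word":
            continue  # assumption: entries of other types carry no spoken text
        speaker = item.get("speaker_id")
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(item["text"])
        else:
            turns.append((speaker, [item["text"]]))
    return [(speaker, " ".join(words)) for speaker, words in turns]


# Invented sample in the documented shape; timings and speakers are made up.
sample = json.loads("""
{
  "text": "The quick fox",
  "words": [
    {"text": "The", "start": 0.0, "end": 0.15, "type": "word", "speaker_id": "speaker_0"},
    {"text": "quick", "start": 0.2, "end": 0.4, "type": "word", "speaker_id": "speaker_0"},
    {"text": "fox", "start": 0.5, "end": 0.7, "type": "word", "speaker_id": "speaker_1"}
  ]
}
""")

print(speaker_turns(sample))
```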
Final transcripts print as they're committed. With --partials:
    [partial] The quick
    [partial] The quick brown fox
    The quick brown fox jumps over the lazy dog.
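A consumer of the realtime output usually wants to keep committed lines and treat interim ones as disposable. The sketch below assumes interim lines are marked with the literal `[partial] ` prefix, as in the example above; the function name `split_transcript_lines` is hypothetical.

```python
PARTIAL_PREFIX = "[partial] "  # assumed marker, taken from the example output


def split_transcript_lines(lines):
    """Return (partials, finals) from an iterable of output lines."""
    partials, finals = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        if line.startswith(PARTIAL_PREFIX):
            partials.append(line[len(PARTIAL_PREFIX):])
        else:
            finals.append(line)
    return partials, finals


output = [
    "[partial] The quick\n",
    "[partial] The quick brown fox\n",
    "The quick brown fox jumps over the lazy dog.\n",
]
partials, finals = split_transcript_lines(output)
print(finals)
```

The same function works on a live stream if it is fed the lines of a subprocess's stdout one at a time.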
- Audio: MP3, WAV, M4A, FLAC, OGG, WebM, AAC, AIFF, Opus
- Video: MP4, AVI, MKV, MOV, WMV, FLV, WebM, MPEG, 3GPP
- Limits: Up to 3 GB file size, 10 hours duration
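The format list and size limit above lend themselves to a cheap local check before any upload. This is a sketch under stated assumptions: the mapping from format names to file extensions (for example 3GPP to `.3gp`) is a guess, "3 GB" is read as 3 GiB, and the helper `input_problem` is hypothetical. The 10-hour duration limit is not checked, since that would require probing the file.

```python
import os

# Assumed extensions for the documented formats; the real script may differ.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm", ".aac", ".aiff", ".opus"}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mkv", ".mov", ".wmv", ".flv", ".webm", ".mpeg", ".3gp"}
MAX_BYTES = 3 * 1024 ** 3  # assumption: the 3 GB limit is interpreted as 3 GiB


def input_problem(path, size_bytes=None):
    """Return a description of the first problem found, or None if none."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in AUDIO_EXTENSIONS | VIDEO_EXTENSIONS:
        return f"unsupported extension: {ext or '(none)'}"
    if size_bytes is None:
        size_bytes = os.path.getsize(path)  # raises OSError if the file is missing
    if size_bytes > MAX_BYTES:
        return "file exceeds the 3 GB limit"
    return None


print(input_problem("notes.txt", size_bytes=10))
```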
The script exits with non-zero status on errors:
- Missing API key: Set the ELEVENLABS_API_KEY environment variable
- File not found: Check that the file path exists
- Missing ffmpeg: Install it with your package manager
- API errors: Check API key validity and rate limits
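Because failures are signalled through the exit status, a caller can wrap the script and turn a non-zero status into an exception. The sketch below uses only Python's standard `subprocess` module; the wrapper name `run_transcription` is hypothetical, and it assumes the script writes its error message to stderr.

```python
import subprocess
import sys


def run_transcription(cmd):
    """Run cmd; return its stdout on success, raise RuntimeError otherwise."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: the failure reason is printed on stderr.
        raise RuntimeError(
            f"transcription failed (exit {result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout.strip()


# Stand-in command so the example runs without the skill installed.
print(run_transcription([sys.executable, "-c", "print('hello')"]))
```

In real use, `cmd` would be an argument list such as `["./transcribe.sh", "file.mp3", "--quiet"]`.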
| Scenario | Command |
| --- | --- |
| Transcribe a recording | ./transcribe.sh file.mp3 |
| Meeting with multiple speakers | ./transcribe.sh meeting.mp3 --diarize |
| Live radio/podcast stream | ./transcribe.sh --url <url> |
| Voice input from user | ./transcribe.sh --mic --quiet |
| Need word timestamps | ./transcribe.sh file.mp3 --json |