Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Local STT with selectable backends - Parakeet (best accuracy) or Whisper (fastest, multilingual).
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Unified local speech-to-text using ONNX Runtime with int8 quantization. Choose your backend:
- Parakeet (default): Best accuracy for English; correctly captures names and filler words
- Whisper: Fastest inference; supports 99 languages
    # Default: Parakeet v2 (best English accuracy)
    ~/.openclaw/skills/local-stt/scripts/local-stt.py audio.ogg

    # Explicit backend selection
    ~/.openclaw/skills/local-stt/scripts/local-stt.py audio.ogg -b whisper
    ~/.openclaw/skills/local-stt/scripts/local-stt.py audio.ogg -b parakeet -m v3

    # Quiet mode (suppress progress)
    ~/.openclaw/skills/local-stt/scripts/local-stt.py audio.ogg --quiet
- -b/--backend: parakeet (default), whisper
- -m/--model: Model variant (see below)
- --no-int8: Disable int8 quantization
- -q/--quiet: Suppress progress
- --room-id: Matrix room ID for direct message
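The flags above can also be assembled programmatically. The following is a minimal sketch, not part of the skill itself: the helper names `build_stt_command` and `transcribe` are hypothetical, and it assumes the script writes the transcript to standard output, which this page does not state.

```python
import os
import subprocess

# Script path as given in the usage examples.
SCRIPT = os.path.expanduser("~/.openclaw/skills/local-stt/scripts/local-stt.py")


def build_stt_command(audio_path, backend=None, model=None, quiet=False, int8=True):
    """Assemble an argument list using only the documented flags (hypothetical helper)."""
    cmd = [SCRIPT, str(audio_path)]
    if backend is not None:
        cmd += ["-b", backend]   # parakeet (default) or whisper
    if model is not None:
        cmd += ["-m", model]     # model variant for the chosen backend
    if not int8:
        cmd.append("--no-int8")  # disable int8 quantization
    if quiet:
        cmd.append("--quiet")    # suppress progress output
    return cmd


def transcribe(audio_path, **options):
    """Run the script and return its stdout.

    Assumption: the transcript is printed to stdout.
    """
    result = subprocess.run(
        build_stt_command(audio_path, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

For example, `build_stt_command("audio.ogg", backend="parakeet", model="v3", quiet=True)` produces the same arguments as the explicit Parakeet v3 invocation with `--quiet` appended.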
Parakeet models:

| Model | Description |
| --- | --- |
| v2 (default) | English only, best accuracy |
| v3 | Multilingual |
Whisper models:

| Model | Description |
| --- | --- |
| tiny | Fastest, lower accuracy |
| base (default) | Good balance |
| small | Better accuracy |
| large-v3-turbo | Best quality, slower |
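A caller may want to check a backend/model pair before launching the script. The sketch below encodes the two model tables as a lookup, assuming that v2 and v3 belong to Parakeet and that tiny, base, small, and large-v3-turbo belong to Whisper, as the usage examples suggest. The names `MODELS` and `resolve_model` are hypothetical, and whether the script itself rejects other values is not documented here.

```python
# Model variants and defaults as listed in the tables above.
MODELS = {
    "parakeet": {"default": "v2", "variants": ("v2", "v3")},
    "whisper": {
        "default": "base",
        "variants": ("tiny", "base", "small", "large-v3-turbo"),
    },
}


def resolve_model(backend="parakeet", model=None):
    """Return the model to use for a backend, falling back to its listed default."""
    if backend not in MODELS:
        raise ValueError(f"unknown backend: {backend!r}")
    entry = MODELS[backend]
    if model is None:
        return entry["default"]
    if model not in entry["variants"]:
        raise ValueError(f"{model!r} is not a listed {backend} model")
    return model
```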
| Backend/Model | Time | RTF | Notes |
| --- | --- | --- | --- |
| Whisper Base int8 | 0.43s | 0.018x | Fastest |
| Parakeet v2 int8 | 0.60s | 0.025x | Best accuracy |
| Parakeet v3 int8 | 0.63s | 0.026x | Multilingual |
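The benchmark does not define RTF or give the clip length. Assuming RTF means real-time factor in its usual sense (processing time divided by audio duration), the clip length implied by each row can be recovered with a short calculation. The function name `implied_audio_seconds` is hypothetical.

```python
# (label, processing time in seconds, reported RTF) copied from the benchmark rows.
BENCHMARKS = [
    ("Whisper Base int8", 0.43, 0.018),
    ("Parakeet v2 int8", 0.60, 0.025),
    ("Parakeet v3 int8", 0.63, 0.026),
]


def implied_audio_seconds(processing_seconds, rtf):
    """Invert RTF = processing time / audio duration (assumed definition)."""
    return processing_seconds / rtf


for label, seconds, rtf in BENCHMARKS:
    print(f"{label}: ~{implied_audio_seconds(seconds, rtf):.1f}s of audio")
```

Under that assumption all three rows work out to roughly 24 seconds of audio, which is consistent with a single test clip having been used for every backend.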
    {
      "tools": {
        "media": {
          "audio": {
            "enabled": true,
            "models": [
              {
                "type": "cli",
                "command": "~/.openclaw/skills/local-stt/scripts/local-stt.py",
                "args": ["--quiet", "{{MediaPath}}"],
                "timeoutSeconds": 30
              }
            ]
          }
        }
      }
    }
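To illustrate what this configuration describes, the sketch below parses it and expands the `{{MediaPath}}` placeholder into a concrete command line. This is only a guess at the behavior: how OpenClaw actually performs the substitution is not documented on this page, and `expand_cli_model` is a hypothetical helper.

```python
import json
import os

# The configuration from above, embedded so the example is self-contained.
CONFIG = json.loads("""
{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "type": "cli",
            "command": "~/.openclaw/skills/local-stt/scripts/local-stt.py",
            "args": ["--quiet", "{{MediaPath}}"],
            "timeoutSeconds": 30
          }
        ]
      }
    }
  }
}
""")


def expand_cli_model(model, media_path):
    """Return (argv, timeout) for a "cli" model entry.

    Assumption: {{MediaPath}} is replaced by the path of the audio file.
    """
    command = os.path.expanduser(model["command"])
    args = [arg.replace("{{MediaPath}}", str(media_path)) for arg in model["args"]]
    return [command, *args], model["timeoutSeconds"]


model = CONFIG["tools"]["media"]["audio"]["models"][0]
argv, timeout = expand_cli_model(model, "/tmp/voice.ogg")
```

The resulting `argv` could be passed to `subprocess.run(argv, timeout=timeout)`, so that a transcription taking longer than 30 seconds is abandoned.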