
Speech to Text Skill (Yandex SpeechKit) for OpenClaw

Speech recognition from voice messages using Yandex SpeechKit (with an extensible architecture for other providers). Use when you need to convert a voice mes...


Install for OpenClaw

Quick setup
  1. Download the package from Yavira.
  2. Extract the archive and review SKILL.md first.
  3. Import or place the package into your OpenClaw setup.

Requirements

Target platform: OpenClaw
Install method: Manual import
Extraction: Extract archive
Prerequisites: OpenClaw
Primary doc: SKILL.md

Package facts

Download mode: Yavira redirect
Package format: ZIP package
Source platform: Tencent SkillHub
What's included: CLAUDE.md, README.md, SKILL.md, assets/config.example.json, check.sh, requirements.txt

Validation

  • Use the Yavira download entry.
  • Review SKILL.md after the package is downloaded.
  • Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

  1. Download the package from Yavira.
  2. Extract it into a folder your agent can access.
  3. Paste one of the prompts below and point your agent at the extracted folder.
New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

Source: Tencent SkillHub
Verification: Indexed source record
Version: 1.1.8

Documentation

Primary doc: SKILL.md (24 sections)

Purpose

This skill recognizes speech from voice messages sent via any messenger connected to OpenClaw, using various STT providers, including Yandex SpeechKit.

When to Activate

Use this skill when:
  • The user sends a voice message via any messenger connected to OpenClaw
  • You need to convert speech to text
  • Audio file transcription is required
  • A text version of a voice message is needed

1. Receive the audio file from OpenClaw

  • OpenClaw provides a local path to the audio file
  • Verify the file exists at the given path
  • Validate the file format (OGG, WAV, MP3)
  • Check file size (maximum 1 MB for Yandex SpeechKit v1 sync API)
Example path from OpenClaw: /home/user_folder/.openclaw/media/inbound/file_1---9a53bac2-0392-41e7-8300-1c08e8eec027.ogg
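The checks in this step can be sketched as a single helper. This is a minimal illustration, not the skill's actual code: the function name is hypothetical, the extension set mirrors the formats listed in this step, and reading "1 MB" as 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

# Values taken from the step above; the names are hypothetical.
SUPPORTED_EXTENSIONS = {".ogg", ".wav", ".mp3"}
MAX_SIZE_BYTES = 1024 * 1024  # assumed reading of the 1 MB sync API limit


def check_audio_file(path: str) -> Path:
    """Return the path as a Path object, or raise if any check fails."""
    audio = Path(path)
    if not audio.is_file():
        raise FileNotFoundError(f"Audio file not found: {audio}")
    if audio.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported format: {audio.suffix or '(none)'}")
    size = audio.stat().st_size
    if size > MAX_SIZE_BYTES:
        raise ValueError(f"File too large: {size} bytes (limit {MAX_SIZE_BYTES})")
    return audio
```

Raising distinct exceptions keeps each failure easy to map to a user-facing message later.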

2. Audio processing

  • Validate the audio file at the local path
  • Convert to a supported format if needed using ffmpeg
  • Verify audio quality
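A conversion step might look like the sketch below. It assumes the target is mono OGG/Opus (one of the encodings the SpeechKit v1 API accepts) and that the installed ffmpeg build includes the libopus encoder. The helper names and the output naming scheme are hypothetical.

```python
import subprocess
from pathlib import Path


def converted_path(src: str) -> str:
    """Place the converted file next to the source, e.g. voice_converted.ogg."""
    source = Path(src)
    return str(source.with_name(source.stem + "_converted.ogg"))


def build_ffmpeg_command(src: str, dst: str) -> list:
    """Build an ffmpeg command that re-encodes src to mono OGG/Opus."""
    return [
        "ffmpeg",
        "-y",               # overwrite dst if it already exists
        "-i", src,          # input file
        "-ac", "1",         # downmix to a single channel
        "-ar", "48000",     # resample to 48 kHz
        "-c:a", "libopus",  # encode with the Opus codec
        dst,
    ]


def convert_to_ogg_opus(src: str) -> str:
    """Run ffmpeg and return the path of the converted file."""
    dst = converted_path(src)
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces.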

3. Speech recognition

  • Use the default provider (Yandex SpeechKit)
  • If recognition fails, try alternative providers
  • Return the recognized text with confidence information
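The fallback behaviour described here could be sketched as a loop over providers in priority order. To keep the example short, each provider is modelled as a (name, callable) pair, which is a hypothetical simplification. The skill's real provider objects are not shown in this document.

```python
from typing import Callable, List, Tuple

# Hypothetical simplification: a provider is a name plus a recognize function.
Provider = Tuple[str, Callable[[str], str]]


def recognize_with_fallback(audio_path: str, providers: List[Provider]) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, text) from the first success."""
    errors = []
    for name, recognize in providers:
        try:
            text = recognize(audio_path)
        except Exception as exc:  # any provider failure triggers the fallback
            errors.append(f"{name}: {exc}")
            continue
        if text:
            return name, text
        errors.append(f"{name}: empty result")
    raise RuntimeError("All providers failed: " + "; ".join(errors))
```

Putting the default provider first in the list preserves the "default, then alternatives" order.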

4. Result handling

  • Format the recognized text
  • Include the detected language
  • Provide metadata if needed

Security

  • Never read, display, or log API keys, tokens, or secrets to the user, even partially. If the user asks to see their key, direct them to check ~/.openclaw/openclaw.json or .env manually.
  • Never modify openclaw.json, .env, or config.json without explicit user permission. These files contain credentials and must only be changed by the owner.
  • Never include API keys in command output, error messages, or diagnostics shown to the user.
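One way to support the last rule is to scrub known secret values from any text before it is shown. The sketch below is an illustration only. The helper name and the list of variables to scrub are assumptions, and it complements rather than replaces the rule of not printing secrets in the first place.

```python
import os

# Hypothetical list; extend it with any other secret-bearing variables.
SECRET_ENV_VARS = ("YANDEX_API_KEY",)


def redact_secrets(message: str, env=None) -> str:
    """Replace any configured secret value found in message with a placeholder."""
    env = os.environ if env is None else env
    for var in SECRET_ENV_VARS:
        secret = env.get(var)
        if secret:  # skip unset or empty variables
            message = message.replace(secret, "[REDACTED]")
    return message
```

For example, `redact_secrets("auth failed for abc123", {"YANDEX_API_KEY": "abc123"})` returns `"auth failed for [REDACTED]"`.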

Invocation

Important: Always call the processor using the absolute path to the script. Do not use cd <skill_dir> && python3 scripts/... because this triggers an approval prompt on every call: cd cannot be allowlisted.

    python3 /path/to/sergei-mikhailov-stt/scripts/stt_processor.py --file "/path/to/audio.ogg"

The script resolves all paths (config, .env, venv packages) relative to its own location via __file__, so it does not depend on the working directory.
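Resolving paths from __file__ usually follows the pattern sketched below. The layout is an assumption based on this document: the script lives in scripts/ and the configuration files sit one level up in the skill folder. The function name is hypothetical.

```python
from pathlib import Path


def skill_paths(script_file: str) -> dict:
    """Derive the skill's file locations from the script's own path."""
    scripts_dir = Path(script_file).resolve().parent  # .../<skill>/scripts
    skill_dir = scripts_dir.parent                    # .../<skill>
    return {
        "skill_dir": skill_dir,
        "config": skill_dir / "config.json",
        "env": skill_dir / ".env",
    }
```

Inside the script this would be called as `skill_paths(__file__)`, so the result is the same whatever the current working directory is.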

Quick Start

    clawhub install sergei-mikhailov-stt
    cd ~/.openclaw/workspace/skills/sergei-mikhailov-stt
    bash setup.sh

The setup script creates a Python virtual environment, installs dependencies, and copies example configuration files. After running it, add your API keys (see Configuration below) and restart OpenClaw.

On Debian/Ubuntu, you may need to install the venv package first:

    sudo apt install python3-venv

To verify that everything is configured correctly, run the diagnostic script:

    bash check.sh

It checks Python, FFmpeg, the virtual environment, dependencies, and API keys, and tells you exactly what to fix if something is missing.

1. Set API keys (recommended — via OpenClaw config)

Add credentials to ~/.openclaw/openclaw.json:

    {
      "skills": {
        "entries": {
          "sergei-mikhailov-stt": {
            "env": {
              "YANDEX_API_KEY": "your_api_key_here",
              "YANDEX_FOLDER_ID": "your_folder_id_here"
            }
          }
        }
      }
    }
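To show where these two values end up, here is a hedged sketch of building a request to the SpeechKit v1 synchronous recognition endpoint. It is not the skill's actual implementation, which this document does not show. The function name is hypothetical, and the audio is assumed to be already encoded as OGG/Opus.

```python
import urllib.parse
import urllib.request

STT_URL = "https://stt.api.cloud.yandex.net/speech/v1/stt:recognize"


def build_stt_request(audio: bytes, api_key: str, folder_id: str,
                      lang: str = "ru-RU") -> urllib.request.Request:
    """Build (but do not send) a synchronous recognition request."""
    query = urllib.parse.urlencode(
        {"folderId": folder_id, "lang": lang, "format": "oggopus"}
    )
    return urllib.request.Request(
        f"{STT_URL}?{query}",
        data=audio,  # raw audio bytes form the request body
        headers={"Authorization": f"Api-Key {api_key}"},
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(request, timeout=...)` should yield a JSON body whose `result` field holds the recognized text. The key only ever travels in the header, so it never needs to appear in logs or messages.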

2. Alternative — via .env file

Edit the .env file created by setup.sh in the skill folder:

    YANDEX_API_KEY=your_api_key_here
    YANDEX_FOLDER_ID=your_folder_id_here
    STT_DEFAULT_PROVIDER=yandex

3. Restart OpenClaw to apply changes

openclaw gateway stop && openclaw gateway start

4. Provider configuration (optional)

The config.json file (also created by setup.sh) lets you tune provider parameters:

    {
      "default_provider": "yandex",
      "providers": {
        "yandex": {
          "api_key": "${YANDEX_API_KEY}",
          "folder_id": "${YANDEX_FOLDER_ID}",
          "lang": "ru-RU"
        }
      }
    }
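The ${...} placeholders suggest that values are filled in from environment variables when the config is loaded. A minimal sketch of such an expansion follows. The function name is hypothetical, and replacing a missing variable with an empty string is an assumption about the desired behaviour.

```python
import os
import re

# Matches ${NAME} where NAME is a valid environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value, env=None):
    """Recursively replace ${NAME} in strings with values from env."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # Assumption: an unset variable expands to an empty string.
        return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), ""), value)
    if isinstance(value, dict):
        return {key: expand_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(item, env) for item in value]
    return value  # numbers, booleans, and None pass through unchanged
```

Applied to the parsed config.json, this leaves plain values such as "ru-RU" untouched and substitutes only the placeholder strings.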

1. Create the provider class

    # scripts/providers/new_provider.py
    from .base_provider import BaseSTTProvider


    class NewProvider(BaseSTTProvider):
        name = "new_provider"

        def recognize(self, audio_file_path: str, language: str = 'ru-RU') -> str:
            # Recognition implementation
            pass

        def validate_config(self, config: dict) -> bool:
            # Configuration validation
            pass

        def get_supported_formats(self) -> list:
            return ['ogg', 'wav', 'mp3']

2. Register the provider

Add to scripts/stt_processor.py in the _get_provider method:

    if provider_name == 'new_provider':
        return NewProvider(provider_config)
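As the number of providers grows, the if-chain could be replaced by a name-to-class registry. The sketch below illustrates that alternative and is not how the skill is documented to work. BaseSTTProvider here is a stand-in stub, and register, PROVIDERS, and get_provider are hypothetical names.

```python
class BaseSTTProvider:
    """Stand-in stub for the skill's real base class."""
    name = ""

    def __init__(self, config: dict):
        self.config = config


PROVIDERS = {}  # maps provider name -> provider class


def register(cls):
    """Class decorator that records a provider under its name attribute."""
    PROVIDERS[cls.name] = cls
    return cls


@register
class NewProvider(BaseSTTProvider):
    name = "new_provider"


def get_provider(provider_name: str, provider_config: dict) -> BaseSTTProvider:
    """Instantiate the provider registered under provider_name."""
    try:
        provider_cls = PROVIDERS[provider_name]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider_name}") from None
    return provider_cls(provider_config)
```

With this layout, adding a provider means decorating its class. The lookup function itself does not change.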

3. Update configuration

Add the new provider section to config.json:

    {
      "providers": {
        "new_provider": {
          "api_key": "${NEW_PROVIDER_API_KEY}",
          "model": "latest"
        }
      }
    }

Basic scenario

User: [sends a voice message]
OpenClaw: Recognized text: "Hello, how are you?"

With language specified

User: Transcribe this English voice message
OpenClaw: Recognized text (en-US): "Hello, how are you today?"

With metadata

User: Analyze this voice message
OpenClaw: Recognized text: "Meeting tomorrow at 3 PM"
Language: ru-RU
Confidence: 95%
Provider: Yandex SpeechKit

Error Handling

When the skill returns an error, explain it to the user in plain language and suggest a concrete next step. Do not show raw error messages or stack traces.

| Error | Say to the user | Next step |
|---|---|---|
| File too large | "The voice message is too long. The maximum is about 30 seconds for now." | Ask them to send a shorter message |
| Unsupported format | "This audio format is not supported." | Tell them supported formats: OGG, WAV, MP3, M4A, FLAC, AAC |
| API key invalid / HTTP 401 | "There's a problem with the Yandex SpeechKit API key." | Ask owner to check YANDEX_API_KEY in openclaw.json |
| Folder access denied / HTTP 403 | "Access to Yandex SpeechKit is denied." | Ask owner to verify the service account has ai.speechkit.user role |
| Too many requests / HTTP 429 | "Yandex SpeechKit is rate-limiting us right now." | Try again in a few seconds |
| FFmpeg not found | "Audio conversion tool (FFmpeg) is not installed on the server." | Owner needs to run brew install ffmpeg or apt install ffmpeg |
| API request timed out | "Yandex SpeechKit did not respond in time." | Try again; if it repeats, the service may be down |
| Missing YANDEX_API_KEY | "The skill is not configured yet: API keys are missing." | Owner needs to add keys to ~/.openclaw/openclaw.json |
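The HTTP rows of this mapping can be expressed as a small lookup, sketched below. The messages for 401, 403, and 429 are taken from the text above. The function name and the generic fallback message are hypothetical.

```python
# (what to say to the user, suggested next step), keyed by HTTP status code.
USER_MESSAGES = {
    401: ("There's a problem with the Yandex SpeechKit API key.",
          "Ask the owner to check YANDEX_API_KEY in openclaw.json."),
    403: ("Access to Yandex SpeechKit is denied.",
          "Ask the owner to verify the service account has the ai.speechkit.user role."),
    429: ("Yandex SpeechKit is rate-limiting us right now.",
          "Try again in a few seconds."),
}

# Hypothetical fallback for statuses not covered above.
DEFAULT_MESSAGE = ("Speech recognition failed.", "Try again later.")


def explain_http_error(status: int) -> str:
    """Turn an HTTP status into plain-language text with no raw error details."""
    message, next_step = USER_MESSAGES.get(status, DEFAULT_MESSAGE)
    return f"{message} {next_step}"
```

Keeping the wording in one table makes it hard for raw error text or stack traces to leak into replies.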

Troubleshooting (for the owner)

  • Verify API key configuration in ~/.openclaw/openclaw.json
  • Ensure ffmpeg is installed: ffmpeg -version
  • Check Yandex Cloud service account has role ai.speechkit.user
  • Check gateway logs: openclaw logs

Limitations

  • Maximum file size: 1 MB (Yandex SpeechKit v1 sync API limit, ~30 seconds of voice)
  • Maximum audio duration: 30 seconds
  • Supported formats: OGG, WAV, MP3, M4A, FLAC, AAC
  • Languages: Russian (ru-RU), English (en-US)
  • Processing time: up to 5 minutes

Requirements

  • Python 3.8+
  • FFmpeg
  • Configured API keys for STT providers

Result Metadata

On successful recognition:

    {
      "text": "Recognized text",
      "language": "ru-RU",
      "confidence": 0.95,
      "provider": "yandex",
      "processing_time": 2.5
    }
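A result of this shape could be rendered into the reply format shown in the usage examples roughly as follows. The function name is hypothetical, and printing the provider identifier as-is (rather than a display name) is a simplification.

```python
def format_result(result: dict) -> str:
    """Render a recognition result as user-facing text, one field per line."""
    lines = [f'Recognized text: "{result["text"]}"']
    if "language" in result:
        lines.append(f"Language: {result['language']}")
    if "confidence" in result:
        # 0.95 is rendered as 95%
        lines.append(f"Confidence: {result['confidence']:.0%}")
    if "provider" in result:
        lines.append(f"Provider: {result['provider']}")
    return "\n".join(lines)
```

Optional fields are skipped when absent, so the same helper covers both the basic and the metadata-rich replies.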

Category context

Messaging, meetings, inboxes, CRM, and teammate communication surfaces.

Source: Tencent SkillHub

Largest current source with strong distribution and engagement signals.

Package contents

Included in package
3 Docs · 1 Scripts · 1 Config · 1 Files
  • SKILL.md Primary doc
  • CLAUDE.md Docs
  • README.md Docs
  • check.sh Scripts
  • assets/config.example.json Config
  • requirements.txt Files