Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Convert text to speech using Microsoft Edge's TTS engine with customizable voices, direct playback, and automatic temporary file cleanup.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
The Voice skill provides enhanced text-to-speech functionality using edge-tts, allowing you to convert text to spoken audio with multiple playback options.
- Text-to-speech conversion using Microsoft Edge's TTS engine
- Support for various voice options and audio settings
- Direct playback of generated audio
- Automatic cleanup of temporary audio files
- Integration with the MEDIA system for audio playback
Before using this skill, install the required dependency:

    pip3 install edge-tts

Or use the skill's install action:

    await skill.execute({ action: 'install' });
Speak text directly without storing it to a file:

    const result = await skill.execute({
      action: 'speak', // New improved action
      text: 'Hello, how are you today?'
    });
    // Audio is played directly and the temporary file is cleaned up automatically
Convert text to speech with default settings:

    const result = await skill.execute({
      action: 'tts',
      text: 'Hello, how are you today?'
    });
    // Returns a MEDIA link to the audio file

With direct playback:

    const result = await skill.execute({
      action: 'tts',
      text: 'Hello, how are you today?',
      playImmediately: true // Plays the audio immediately after generation
    });

With custom options:

    const result = await skill.execute({
      action: 'tts',
      text: 'This is a sample of voice customization.',
      options: {
        voice: 'zh-CN-XiaoxiaoNeural',
        rate: '+10%',
        volume: '-5%',
        pitch: '+10Hz'
      }
    });
Play an existing audio file:

    const result = await skill.execute({
      action: 'play',
      filePath: '/path/to/audio/file.mp3'
    });
Get a list of available voices:

    const result = await skill.execute({ action: 'voices' });
Clean up temporary audio files older than 1 hour (default):

    const result = await skill.execute({ action: 'cleanup' });

Or specify a custom age threshold:

    const result = await skill.execute({
      action: 'cleanup',
      options: {
        hoursOld: 2 // Clean files older than 2 hours
      }
    });
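To make the age threshold concrete, here is a minimal sketch of how an age-based cleanup could work. It is not the skill's actual implementation: the helper name cleanupOldAudioFiles, the .mp3 filter, and the use of file modification time are all assumptions made for illustration.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper (not part of the skill): delete .mp3 files in `dir`
// whose modification time is more than `hoursOld` hours in the past.
// Returns the list of paths that were removed.
function cleanupOldAudioFiles(dir, hoursOld = 1, now = Date.now()) {
  const cutoff = now - hoursOld * 60 * 60 * 1000;
  const removed = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    // Skip directories and anything that is not an audio file.
    if (!entry.isFile() || !entry.name.endsWith('.mp3')) continue;
    const filePath = path.join(dir, entry.name);
    if (fs.statSync(filePath).mtimeMs < cutoff) {
      fs.unlinkSync(filePath);
      removed.push(filePath);
    }
  }
  return removed;
}
```

Passing `now` as a parameter keeps the function easy to test; in normal use it can be omitted so the current time is used.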
The following options are available for text-to-speech:

- voice: The voice to use (default: 'zh-CN-XiaoxiaoNeural')
- rate: Speech rate adjustment (default: '+0%')
- volume: Volume adjustment (default: '+0%')
- pitch: Pitch adjustment (default: '+0Hz')
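As an illustration of how these options might be applied, the sketch below merges caller options over the documented defaults, checks that rate, volume, and pitch use the signed formats shown above, and builds an argument list for the edge-tts command-line tool. The helper names resolveTtsOptions and buildEdgeTtsArgs are hypothetical, and whether the skill calls the CLI this way is an assumption; the flags used (--voice, --rate, --volume, --pitch, --text, --write-media) are those of the edge-tts CLI.

```javascript
// Defaults as documented for this skill.
const DEFAULT_TTS_OPTIONS = Object.freeze({
  voice: 'zh-CN-XiaoxiaoNeural',
  rate: '+0%',
  volume: '+0%',
  pitch: '+0Hz',
});

// Hypothetical helper: merge caller options over the defaults and
// reject values that do not follow the signed formats ('+10%', '-5%', '+10Hz').
function resolveTtsOptions(options = {}) {
  const resolved = { ...DEFAULT_TTS_OPTIONS, ...options };
  const percent = /^[+-]\d+%$/;
  const hertz = /^[+-]\d+Hz$/;
  if (!percent.test(resolved.rate)) throw new RangeError(`Invalid rate: ${resolved.rate}`);
  if (!percent.test(resolved.volume)) throw new RangeError(`Invalid volume: ${resolved.volume}`);
  if (!hertz.test(resolved.pitch)) throw new RangeError(`Invalid pitch: ${resolved.pitch}`);
  return resolved;
}

// Hypothetical helper: build an argument array for the edge-tts CLI.
// The '--flag=value' form is used for signed values so that a leading '-'
// is not mistaken for another option.
function buildEdgeTtsArgs(text, outputPath, options = {}) {
  const { voice, rate, volume, pitch } = resolveTtsOptions(options);
  return [
    '--voice', voice,
    `--rate=${rate}`,
    `--volume=${volume}`,
    `--pitch=${pitch}`,
    '--text', text,
    '--write-media', outputPath,
  ];
}

const exampleArgs = buildEdgeTtsArgs('This is a sample of voice customization.', 'sample.mp3', {
  rate: '+10%',
  volume: '-5%',
});
console.log(['edge-tts', ...exampleArgs].join(' '));
```

An array like this could be passed to child_process.execFile('edge-tts', exampleArgs), which avoids shell quoting issues with the text argument.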
edge-tts supports many neural voices in different languages, for example:

- Chinese: zh-CN-XiaoxiaoNeural, zh-CN-YunxiNeural, zh-CN-YunyangNeural
- English (US): en-US-AriaNeural, en-US-GuyNeural, en-US-JennyNeural
- English (UK): en-GB-SoniaNeural, en-GB-RyanNeural
- Japanese: ja-JP-NanamiNeural
- Korean: ko-KR-SunHiNeural

And many more. Use the 'voices' action, or run edge-tts --list-voices, for the current list.
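Voice names typically start with a locale prefix such as zh-CN or ja-JP, so a list of names can be grouped by language. The sketch below shows one way to do that; groupVoicesByLocale is a hypothetical helper, it takes a plain array of name strings (the shape of the 'voices' action result is not documented here), and the simple prefix pattern is an assumption that will not cover every locale format.

```javascript
// Hypothetical helper: group voice names by their leading locale prefix,
// e.g. 'zh-CN-XiaoxiaoNeural' -> 'zh-CN'. Names that do not match the
// simple '<lang>-<REGION>-' pattern are collected under 'unknown'.
function groupVoicesByLocale(voiceNames) {
  const groups = new Map();
  for (const name of voiceNames) {
    const match = /^([a-z]{2,3}-[A-Z]{2})-/.exec(name);
    const locale = match ? match[1] : 'unknown';
    if (!groups.has(locale)) groups.set(locale, []);
    groups.get(locale).push(name);
  }
  return groups;
}

const byLocale = groupVoicesByLocale([
  'zh-CN-XiaoxiaoNeural',
  'zh-CN-YunxiNeural',
  'ja-JP-NanamiNeural',
  'ko-KR-SunHiNeural',
]);
console.log(byLocale.get('zh-CN'));
```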
- Audio files are temporarily stored in the temp directory
- Files are automatically cleaned up after 1 hour (default)
- The direct speaking option cleans up files after 5 seconds
- Python 3.x
- pip package manager
- edge-tts library (install via pip3 install edge-tts)