Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Convert text to speech using Microsoft Edge TTS with real-time streaming, customizable voice settings, and support for multiple languages including Chinese a...
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
Text-to-speech skill using Microsoft Edge TTS engine with real-time streaming playback support.
- Edge TTS Engine: High-quality text-to-speech using Microsoft Edge
- Streaming Playback: Real-time audio streaming (audio plays while it is still being generated)
- Multiple Voices: Support for Chinese, English, Japanese, and Korean voices
- Customizable: Adjust rate, volume, and pitch
- Secure Implementation: No command injection vulnerabilities
pip install edge-tts
Windows: Download from https://github.com/GyanD/codexffmpeg/releases, extract the archive, and add its bin folder to PATH.
macOS: brew install ffmpeg
Linux: sudo apt install ffmpeg
Real-time audio generation and playback:

// Basic usage (the sample text means "Hello, I am Xiaojiu")
await skill.execute({
  action: 'stream',
  text: '你好,我是小九'
});

// With custom voice
await skill.execute({
  action: 'stream',
  text: 'Hello, how are you?',
  options: {
    voice: 'en-US-Standard-A',
    rate: '+10%',
    volume: '+0%',
    pitch: '+0Hz'
  }
});
await skill.execute({
  action: 'tts',
  text: 'Hello, how are you today?',
  options: { voice: 'zh-CN-XiaoxiaoNeural' }
});
// Returns: { success: true, media: 'MEDIA: /path/to/file.mp3' }
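The tts action reports its output file as a string with a MEDIA: prefix, so a caller that wants the bare path has to strip that prefix. The helper below is a minimal sketch of that step. The name extractMediaPath is hypothetical and not part of the skill, and the result shape is assumed to match the "Returns" comment above.

```javascript
// Hypothetical helper (not part of the skill): pull the file path out of a
// result shaped like { success: true, media: 'MEDIA: /path/to/file.mp3' }.
function extractMediaPath(result) {
  if (!result || result.success !== true || typeof result.media !== 'string') {
    return null;
  }
  const prefix = 'MEDIA:';
  if (!result.media.startsWith(prefix)) {
    return null;
  }
  // Drop the prefix and any surrounding whitespace.
  const mediaPath = result.media.slice(prefix.length).trim();
  return mediaPath.length > 0 ? mediaPath : null;
}

const sample = { success: true, media: 'MEDIA: /path/to/file.mp3' };
console.log(extractMediaPath(sample)); // /path/to/file.mp3
```

Returning null for anything unexpected keeps the caller from passing a malformed string on to a player or file API.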
await skill.execute({ action: 'speak', text: 'Hello!' });
await skill.execute({ action: 'voices' });
Language             Voice ID
Chinese (Female)     zh-CN-XiaoxiaoNeural
Chinese (Male)       zh-CN-YunxiNeural
Chinese (Male)       zh-CN-YunyangNeural
English (US Female)  en-US-Standard-A
English (US Male)    en-US-Standard-D
English (UK)         en-GB-Standard-A
Japanese             ja-JP-NanamiNeural
Korean               ko-KR-SunHiNeural
Option   Default                Description
voice    zh-CN-XiaoxiaoNeural   Voice ID
rate     +0%                    Speech rate (-50% to +100%)
volume   +0%                    Volume adjustment (-50% to +50%)
pitch    +0Hz                   Pitch adjustment
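The defaults and ranges above can be checked before a request is sent. The sketch below shows one way to do that. The defaults and ranges come from the options listed above, but checkOptions and parseSigned are hypothetical names, and requiring an explicit sign (as in '+0%') is an assumption based on the format of the defaults. How the skill itself treats out-of-range values is not documented here.

```javascript
// Defaults and ranges copied from the options above. The validation
// behaviour itself is an assumption, not the skill's documented logic.
const DEFAULTS = { voice: 'zh-CN-XiaoxiaoNeural', rate: '+0%', volume: '+0%', pitch: '+0Hz' };

// Parse a signed value such as '+10%' or '-5Hz'; return NaN if malformed.
function parseSigned(value, unit) {
  const match = new RegExp(`^([+-]\\d+)${unit}$`).exec(String(value));
  return match ? Number(match[1]) : NaN;
}

// Merge caller options over the defaults and list the fields that fail.
function checkOptions(options = {}) {
  const merged = { ...DEFAULTS, ...options };
  const errors = [];
  const rate = parseSigned(merged.rate, '%');
  if (Number.isNaN(rate) || rate < -50 || rate > 100) errors.push('rate');
  const volume = parseSigned(merged.volume, '%');
  if (Number.isNaN(volume) || volume < -50 || volume > 50) errors.push('volume');
  if (Number.isNaN(parseSigned(merged.pitch, 'Hz'))) errors.push('pitch');
  return { options: merged, errors };
}

console.log(checkOptions({ rate: '+10%' }).errors.length);  // 0
console.log(checkOptions({ rate: '+150%' }).errors.length); // 1 (rate is out of range)
```

Collecting every failing field, rather than stopping at the first, lets a caller report all problems in one pass.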
This skill takes the following measures against command injection:
Feature                       Implementation
Input Validation              Voice parameter whitelist validation; only allowed voices can be used
No Shell Execution            Uses spawn() with array arguments instead of shell command concatenation
Command Injection Prevention  All user inputs are properly validated and escaped
Path Safety                   Fixed script path prevents path traversal
// ❌ UNSAFE - Don't use exec with string concatenation
exec(`py script.py "${userText}" --voice ${userVoice}`);

// ✅ SAFE - Use spawn with array arguments
spawn('py', [scriptPath, text, '--voice', voice], { shell: false });
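To make the safe pattern above concrete, here is a minimal sketch that builds the argument array separately and wraps the spawn call in a promise. The 'py' launcher and the '--voice' flag are taken from the snippet above. The names buildTtsArgs and runTts, and the promise-based shape, are assumptions for illustration and not the skill's actual code.

```javascript
const { spawn } = require('node:child_process');

// Build the argument vector for the TTS script. Each value is its own
// array element, so no shell ever parses the text.
function buildTtsArgs(scriptPath, text, voice) {
  return [scriptPath, text, '--voice', voice];
}

// Hypothetical wrapper: resolves with the child's exit code.
function runTts(scriptPath, text, voice) {
  return new Promise((resolve, reject) => {
    const child = spawn('py', buildTtsArgs(scriptPath, text, voice), { shell: false });
    child.on('error', reject);                  // e.g. the launcher is not installed
    child.on('close', (code) => resolve(code)); // exit code of the script
  });
}

// Shell metacharacters stay inside one argument instead of being executed.
const args = buildTtsArgs('script.py', 'hi"; rm -rf ~ #', 'zh-CN-XiaoxiaoNeural');
console.log(args.length); // 4
```

Array arguments rule out shell injection, but text that begins with a hyphen could still be read as an option by the script's own argument parser. Whether the skill guards against that case is not stated here.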
Only these voices are allowed:

const allowedVoices = [
  'zh-CN-XiaoxiaoNeural',
  'zh-CN-YunxiNeural',
  'zh-CN-YunyangNeural',
  'zh-CN-YunyouNeural',
  'zh-CN-XiaomoNeural',
  'en-US-Standard-C',
  'en-US-Standard-D',
  'en-US-Wavenet-F',
  'en-GB-Standard-A',
  'en-GB-Wavenet-A',
  'ja-JP-NanamiNeural',
  'ko-KR-SunHiNeural'
];

Any invalid voice parameter will be rejected and replaced with the default voice.
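The fallback rule described above can be sketched in a few lines. The voice list and the default are copied from this document. The name resolveVoice and the use of a Set are assumptions for illustration, not the skill's actual implementation.

```javascript
// Allowlist and default copied from this document.
const ALLOWED_VOICES = new Set([
  'zh-CN-XiaoxiaoNeural', 'zh-CN-YunxiNeural', 'zh-CN-YunyangNeural',
  'zh-CN-YunyouNeural', 'zh-CN-XiaomoNeural',
  'en-US-Standard-C', 'en-US-Standard-D', 'en-US-Wavenet-F',
  'en-GB-Standard-A', 'en-GB-Wavenet-A',
  'ja-JP-NanamiNeural', 'ko-KR-SunHiNeural'
]);
const DEFAULT_VOICE = 'zh-CN-XiaoxiaoNeural';

// Hypothetical helper: return the requested voice only on an exact match,
// otherwise fall back to the default.
function resolveVoice(requested) {
  return ALLOWED_VOICES.has(requested) ? requested : DEFAULT_VOICE;
}

console.log(resolveVoice('ja-JP-NanamiNeural')); // ja-JP-NanamiNeural
console.log(resolveVoice('x; rm -rf /'));        // zh-CN-XiaoxiaoNeural
```

With the list exactly as written, en-US-Standard-A from the voice table is not in the allowlist, so under this rule a request for it would fall back to the default voice.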
Enterprise-grade security: full command injection protection; voice whitelist validation; replaced exec with spawn for secure process execution; input sanitization for all parameters.
Added streaming playback support (audio plays while it is still being generated); added ffmpeg dependency; fixed command injection vulnerability; added voice whitelist validation.
Initial release with basic TTS support