Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Voice-to-voice AI assistant using Gemini Live API. Speak to the AI and get spoken responses. Use when you want to have natural voice conversations with an AI...
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
A voice-to-voice AI assistant powered by Google's Gemini Live API. Speak to the AI and it responds with natural-sounding voice.
cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py "Your question or message"
cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py --audio /path/to/audio.ogg "optional context"
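If you would rather drive the handler from Python than from a shell, the two invocations above could be wrapped roughly as follows. This is only a sketch: `build_handler_argv`, `run_handler`, and `SKILL_DIR` are hypothetical names, not part of the package, and it assumes handler.py writes its response to standard output.

```python
import subprocess
from pathlib import Path

# Directory taken from the commands above; adjust for your own agent name.
SKILL_DIR = Path.home() / ".openclaw/agents/kashif/skills/gemini-assistant"


def build_handler_argv(message=None, audio=None):
    """Build the argument list for handler.py (hypothetical helper)."""
    argv = ["python3", "handler.py"]
    if audio is not None:
        # Audio mode: --audio <path>, optionally followed by context text.
        argv += ["--audio", str(audio)]
    if message:
        argv.append(message)
    if len(argv) == 2:
        raise ValueError("provide a message, an audio file, or both")
    return argv


def run_handler(message=None, audio=None, skill_dir=SKILL_DIR):
    """Run handler.py inside the skill directory and return its raw stdout.

    Assumes the handler prints its response to stdout.
    """
    result = subprocess.run(
        build_handler_argv(message, audio),
        cwd=skill_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the message contains spaces or quotes.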
The handler returns a JSON response:
{
  "message": "[[audio_as_voice]]\nMEDIA:/tmp/gemini_voice_xxx.ogg",
  "text": "Text response from Gemini"
}
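A caller will usually want the audio file path and the text separately. The sketch below shows one way to split the response, assuming the `message` field always carries the path on a line starting with `MEDIA:`, as in the example above. `parse_handler_response` is a hypothetical helper, not something the package provides.

```python
import json

MEDIA_PREFIX = "MEDIA:"


def parse_handler_response(raw):
    """Split the handler's JSON response into text and audio path.

    Returns a dict with "text" and "audio_path"; either may be None
    if the corresponding field is missing from the response.
    """
    data = json.loads(raw)
    audio_path = None
    # The message field is multi-line: a marker line, then MEDIA:<path>.
    for line in data.get("message", "").splitlines():
        if line.startswith(MEDIA_PREFIX):
            audio_path = line[len(MEDIA_PREFIX):].strip()
            break
    return {"text": data.get("text"), "audio_path": audio_path}
```

Scanning line by line rather than by fixed position keeps the parser working if the marker line is ever absent.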
Set your Gemini API key:
export GEMINI_API_KEY="your-api-key-here"
Or create a .env file in the skill directory:
GEMINI_API_KEY=your-api-key-here
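The source does not say how handler.py reads these two locations. As an illustration only, a lookup that prefers the environment variable and falls back to the .env file might look like this. `load_api_key` is a hypothetical helper, and the precedence order is an assumption.

```python
import os
from pathlib import Path


def load_api_key(env_file=".env", environ=None):
    """Return GEMINI_API_KEY from the environment, else from a .env file.

    Returns None when the key is not set in either place.
    """
    environ = os.environ if environ is None else environ
    key = environ.get("GEMINI_API_KEY")
    if key:
        return key

    path = Path(env_file)
    if not path.is_file():
        return None
    for line in path.read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines that are not KEY=VALUE pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == "GEMINI_API_KEY":
            # Drop optional surrounding quotes.
            return value.strip().strip('"').strip("'") or None
    return None
```

Checking the environment first lets a shell `export` override the file during one-off tests.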
The default model is gemini-2.5-flash-native-audio-preview-12-2025, chosen because it supports audio. To use a different model, edit handler.py:
MODEL = "gemini-2.0-flash-exp"  # For text-only
- google-genai>=1.0.0
- numpy>=1.24.0
- soundfile>=0.12.0
- librosa>=0.10.0 (for audio input)
- FFmpeg (for audio conversion)
- 🎙️ Voice input/output support
- 💬 Text conversations
- 🔧 Configurable system instructions
- ⚡ Fast responses with Gemini Flash