Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
One skill for all AI audio: TTS, music, SFX, and voice cloning. Routes your requests to 17+ models (ElevenLabs, fal.ai) via a single proxy. Free tier include...
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
AudioMind turns a single sentence into a fully produced podcast. It handles scripting, ElevenLabs voice narration, AI background music, and server-side audio mixing, all from one Manus command. No setup required. The public shared backend works out of the box. Just install and start creating.
Install: clawhub install audiomind

Use immediately (no configuration needed): "Use AudioMind to create a 3-minute podcast about the future of AI agents."

That's it. AudioMind uses the public shared backend by default: 20 free generations per month, no API key required.
| Variable | Required | Description |
| --- | --- | --- |
| AUDIOMIND_BACKEND_URL | Optional | Your own Vercel backend URL. Defaults to the public shared backend. |
| AUDIOMIND_API_KEY | Optional | Pro API key for unlimited generations. Get one at the landing page. |

- Free Tier (default): 20 generations/month, tracked by IP. No configuration needed.
- Pro Tier: Set AUDIOMIND_API_KEY to your Pro key for unlimited access.
- Self-hosted: Deploy your own backend from github.com/wells1137/audiomind-backend and set AUDIOMIND_BACKEND_URL to your instance.
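The variable rules above can be sketched as a small resolver. This is a minimal illustration, not the skill's actual code: the function name `resolve_config`, the returned dictionary keys, and the placeholder `DEFAULT_BACKEND_URL` are all assumptions, since the source does not state the public backend's address or how the skill reads its settings.

```python
import os

# Hypothetical placeholder: the source does not give the public backend URL.
DEFAULT_BACKEND_URL = "https://audiomind-backend.example.invalid"


def resolve_config(env=None):
    """Derive backend URL and tier from an environment-style mapping.

    Both variables are optional, matching the table above:
    - AUDIOMIND_BACKEND_URL falls back to the shared backend.
    - AUDIOMIND_API_KEY, when set, selects the Pro tier.
    """
    env = os.environ if env is None else env
    custom_url = env.get("AUDIOMIND_BACKEND_URL")
    api_key = env.get("AUDIOMIND_API_KEY") or None
    return {
        # Strip a trailing slash so endpoint paths can be appended safely.
        "backend_url": (custom_url or DEFAULT_BACKEND_URL).rstrip("/"),
        "api_key": api_key,
        "tier": "pro" if api_key else "free",
        "self_hosted": bool(custom_url),
    }
```

Calling `resolve_config({})` models the zero-configuration free tier, while passing a mapping with either variable set models the Pro or self-hosted cases.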
When you ask Manus to create a podcast, the agent performs these steps automatically:

1. Write Script: The agent uses its built-in LLM to write a structured podcast script based on your topic and desired length.
2. Generate Narration: POST {BACKEND_URL}/api/workflow/generate_tts with the script. Returns MP3 audio narrated by an ElevenLabs voice.
3. Generate Music: POST {BACKEND_URL}/api/workflow/generate_music with a mood/style prompt. Returns a background music MP3.
4. Upload Audio: The agent uploads both MP3 files using manus-upload-file to obtain public URLs for the mixing step.
5. Mix Final Audio: POST {BACKEND_URL}/api/workflow/mix_audio with { narration_url, music_url }. The backend mixes them with proper levels using ffmpeg and returns the final podcast MP3.
6. Deliver: The agent saves and presents the finished podcast to you.
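The three backend calls in the workflow above can be sketched as request builders. This is a hedged illustration only: the endpoint paths and the `narration_url` / `music_url` fields come from the description, but the payload keys `script` and `prompt`, the JSON content type, and the helper names `build_request` and `plan_podcast_requests` are assumptions. The code constructs the requests without sending them, and in the real flow the two URLs only exist after the upload step.

```python
import json
import urllib.request


def build_request(backend_url, step, payload):
    """Build (but do not send) a JSON POST for one workflow endpoint."""
    url = f"{backend_url.rstrip('/')}/api/workflow/{step}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def plan_podcast_requests(backend_url, script, music_prompt,
                          narration_url, music_url):
    """Return the narration, music, and mixing requests in workflow order."""
    return [
        # Assumed payload key "script"; the source only says "with the script".
        build_request(backend_url, "generate_tts", {"script": script}),
        # Assumed payload key "prompt" for the mood/style description.
        build_request(backend_url, "generate_music", {"prompt": music_prompt}),
        # Field names taken from the workflow description.
        build_request(backend_url, "mix_audio",
                      {"narration_url": narration_url, "music_url": music_url}),
    ]
```

Each returned object could be passed to `urllib.request.urlopen` to perform the call; keeping construction separate from sending makes the request shapes easy to inspect before any quota is consumed.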
- "Create a 5-minute podcast about the history of jazz with a smooth jazz background."
- "Make a daily news briefing about AI developments, formal tone, upbeat intro music."
- "Generate a meditation podcast, 10 minutes, calm narration, ambient soundscape."
- "Produce a tech explainer on quantum computing for a general audience."
All API keys (ElevenLabs) are stored server-side. The skill file contains zero credentials. This architecture passes VirusTotal and ClawHub security scans. See the GitHub repo for the full backend source code.
- v3.3.0: Removed local tools/start_server.sh entirely (not needed in v3 architecture). Declared FAL_KEY as optional env. Resolves all OpenClaw metadata inconsistency warnings.
- v3.1.0: Zero-config install. Public shared backend is now the default. No AUDIOMIND_BACKEND_URL setup required for free tier users.
- v3.0.1: Added openclaw.requires metadata to declare env vars and trusted network endpoints. Resolves OpenClaw security scanner warning.
- v3.0.0: Full architecture rewrite. All commercial logic moved to Vercel backend. ElevenLabs API keys are now server-side only. Passes VirusTotal security scan.