# Send Qwen3-TTS VoiceDesign to your agent
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
## Fast path
- Download the package from Yavira.
- Extract it into a folder your agent can access.
- Paste one of the prompts below and point your agent at the extracted folder.
## Suggested prompts
### New install

```text
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
```
### Upgrade existing

```text
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
```
## Machine-readable fields
```json
{
  "schemaVersion": "1.0",
  "item": {
    "slug": "qwen3-tts-voicedesign",
    "name": "Qwen3-TTS VoiceDesign",
    "source": "tencent",
    "type": "skill",
    "category": "AI 智能",
    "sourceUrl": "https://clawhub.ai/xiaoyaner0201/qwen3-tts-voicedesign",
    "canonicalUrl": "https://clawhub.ai/xiaoyaner0201/qwen3-tts-voicedesign",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadUrl": "/downloads/qwen3-tts-voicedesign",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=qwen3-tts-voicedesign",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "packageFormat": "ZIP package",
    "primaryDoc": "SKILL.md",
    "includedAssets": [
      "SKILL.md",
      "scripts/batch_seeds.sh",
      "scripts/say.sh",
      "scripts/setup.sh",
      "scripts/tts_server.py"
    ],
    "downloadMode": "redirect",
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T16:55:25.780Z",
      "expiresAt": "2026-05-07T16:55:25.780Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
        "contentDisposition": "attachment; filename=\"network-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/qwen3-tts-voicedesign"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    }
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/qwen3-tts-voicedesign",
    "downloadUrl": "https://openagent3.xyz/downloads/qwen3-tts-voicedesign",
    "agentUrl": "https://openagent3.xyz/skills/qwen3-tts-voicedesign/agent",
    "manifestUrl": "https://openagent3.xyz/skills/qwen3-tts-voicedesign/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/qwen3-tts-voicedesign/agent.md"
  }
}
```
## Documentation

### Qwen3-TTS VoiceDesign

Text-to-speech driven by natural-language voice descriptions, with a seed that fixes the timbre.

### Quick Start

```bash
# Generate speech (uses server defaults)
TTS_URL=http://your-server:8881 scripts/say.sh "Hello world!"

# Save to file
scripts/say.sh "Save this" output.mp3

# Batch compare seeds (voice exploration)
scripts/batch_seeds.sh "Hello world!" 42 123 201 456 789 /tmp/seeds
```

### Environment Variables

All configuration is done through environment variables; the text is the only required argument:

| Variable | Default | Description |
| --- | --- | --- |
| TTS_URL | http://localhost:8881 | Server base URL (client side) |
| TTS_SEED | 4096 | Random seed; controls timbre |
| TTS_INSTRUCT | (generic female voice) | Voice description prompt |
| TTS_MODEL_PATH | Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign | Model weights path |
| TTS_PORT | 8881 | Server listen port |
| TTS_HOST | 0.0.0.0 | Server bind address |
| TTS_FORMAT | mp3 | Output format: mp3 / wav |

The server reads these from a .env file in its directory. The client scripts read them from the shell environment.
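As a rough illustration of the "environment over defaults" rule, the sketch below merges an environment mapping over the documented defaults. `load_tts_config` is a hypothetical helper and is not part of the package; the real `tts_server.py` may parse its settings differently. The actual default text for `TTS_INSTRUCT` is not given here, so it is left as `None`.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the table above. TTS_INSTRUCT's real default text is
# not documented here, so this sketch leaves it unset.
DEFAULTS = {
    "TTS_URL": "http://localhost:8881",
    "TTS_SEED": "4096",
    "TTS_MODEL_PATH": "Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign",
    "TTS_PORT": "8881",
    "TTS_HOST": "0.0.0.0",
    "TTS_FORMAT": "mp3",
}


def load_tts_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: merge environment values over the documented defaults."""
    env = os.environ if env is None else env
    cfg = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    cfg["TTS_INSTRUCT"] = env.get("TTS_INSTRUCT")  # None means "use the server default voice"
    cfg["TTS_SEED"] = int(cfg["TTS_SEED"])
    cfg["TTS_PORT"] = int(cfg["TTS_PORT"])
    if cfg["TTS_FORMAT"] not in ("mp3", "wav"):
        raise ValueError(f"unsupported TTS_FORMAT: {cfg['TTS_FORMAT']!r}")
    return cfg
```

Calling `load_tts_config()` with no argument reads the current process environment; passing a plain dict makes the behavior easy to check in isolation.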

### Voice Description Example

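```text
30岁男性播音员，声音低沉磁性，
语速稳重从容，咬字清晰标准，
像新闻联播主播的专业感，又带一点温暖。
```

English: a 30-year-old male broadcaster with a deep, magnetic voice; a steady, unhurried pace; clear, standard enunciation; the professionalism of a Xinwen Lianbo news anchor, with a touch of warmth.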

Tip: Once you've found your perfect voice (description + seed), set them as server defaults in .env. Then client calls only need to pass text.

### OpenAI-Compatible

```bash
curl -X POST "$TTS_URL/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello!"}' -o speech.mp3
```

### Custom (seed + instruct override)

```bash
# "温柔女生" means "gentle young woman"
curl -X POST "$TTS_URL/tts" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello!", "seed": 201, "instruct": "温柔女生"}' -o speech.mp3
```

### GET (quick test)

```bash
curl "$TTS_URL/tts?text=Hello&seed=201" -o test.mp3
```
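The same `/tts` requests can also be issued from Python with only the standard library. The following is a minimal sketch, assuming the endpoint accepts the `text`, `seed`, and `instruct` fields shown in the curl examples and returns raw audio bytes. The function names are hypothetical, and the 120-second timeout is borrowed from the OpenClaw configuration in this document rather than from any server requirement.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def build_tts_post(base_url: str, text: str, seed: Optional[int] = None,
                   instruct: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /tts request; omitted fields fall back to server defaults."""
    payload = {"text": text}
    if seed is not None:
        payload["seed"] = seed
    if instruct is not None:
        payload["instruct"] = instruct
    return urllib.request.Request(
        base_url.rstrip("/") + "/tts",
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_tts_get_url(base_url: str, text: str, seed: Optional[int] = None) -> str:
    """Build the quick-test GET URL with a properly escaped query string."""
    params = {"text": text}
    if seed is not None:
        params["seed"] = seed
    return base_url.rstrip("/") + "/tts?" + urllib.parse.urlencode(params)


def synthesize(request, out_path: str, timeout: float = 120.0) -> None:
    """Send a request (a Request object or a URL string) and write the audio bytes to out_path."""
    with urllib.request.urlopen(request, timeout=timeout) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
```

For example, `synthesize(build_tts_post("http://localhost:8881", "Hello!", seed=201), "speech.mp3")` mirrors the custom curl call. Building the query with `urlencode` also keeps non-ASCII text safe in the GET form.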

### Seed Mechanics

The same description and seed give the same timbre. Different seeds give completely different voices.

⚠️ Seed values have no meaningful ordering: seeds 42 and 43 can sound completely different. Finding a voice is like opening blind boxes.

Workflow: fix the description → batch 30-40 seeds → listen → shortlist 2-3 → compare across scenarios → pick.
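The first two steps of that workflow can be sketched as a small planning helper: pick a reproducible set of candidate seeds, then derive one request URL and one output file per seed while the description stays at the server default. Everything here is illustrative: the helper names, the `seed_<n>.mp3` file naming, and the `max_seed` upper bound are assumptions, and `scripts/batch_seeds.sh` may behave differently.

```python
import random
import urllib.parse
from pathlib import Path
from typing import Iterable, List, Tuple


def pick_candidate_seeds(count: int = 35, max_seed: int = 100_000, rng_seed: int = 0) -> List[int]:
    """Draw `count` distinct seeds; max_seed is an assumed bound, not a documented limit."""
    rng = random.Random(rng_seed)  # fixed rng_seed makes the candidate list repeatable
    return sorted(rng.sample(range(max_seed), count))


def plan_seed_batch(base_url: str, text: str, seeds: Iterable[int],
                    out_dir: str, fmt: str = "mp3") -> List[Tuple[int, str, Path]]:
    """Return (seed, request URL, output path) for each seed, using the same text throughout."""
    jobs = []
    for seed in seeds:
        query = urllib.parse.urlencode({"text": text, "seed": seed})
        url = f"{base_url.rstrip('/')}/tts?{query}"
        jobs.append((seed, url, Path(out_dir) / f"seed_{seed}.{fmt}"))
    return jobs
```

Each planned URL can then be fetched and saved, after which the listening and shortlisting steps remain manual.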

### Deploy Your Own

```bash
# One-click setup (Python 3.10+ and CUDA GPU required)
bash scripts/setup.sh ./my-tts

# Configure voice in .env
echo 'TTS_SEED=201' >> ./my-tts/.env
echo 'TTS_INSTRUCT=Your voice description here' >> ./my-tts/.env

# Start server
bash scripts/setup.sh start ./my-tts
```

Setup installs: qwen-tts, soundfile, pydub, uvicorn, fastapi, torch (CUDA).
Downloads VoiceDesign model (~3.5GB) via ModelScope (China) or HuggingFace.

Requirements: CUDA GPU with 4GB+ VRAM, Python 3.10+, ~4GB disk.
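Before running the setup script, it may help to check these requirements up front. The sketch below is a hypothetical preflight helper, not something shipped in the package. It checks the Python version and free disk space, and treats the presence of `nvidia-smi` on the PATH as a rough hint that an NVIDIA driver is installed. It does not verify the 4GB VRAM requirement.

```python
import shutil
import sys
from typing import List, Optional, Tuple


def preflight(
    target_dir: str = ".",
    version: Optional[Tuple[int, int]] = None,
    free_bytes: Optional[int] = None,
    has_nvidia_smi: Optional[bool] = None,
) -> List[str]:
    """Return a list of problems found; an empty list means none were detected."""
    # Each value can be injected for testing; otherwise it is read from the system.
    if version is None:
        version = sys.version_info[:2]
    if free_bytes is None:
        free_bytes = shutil.disk_usage(target_dir).free
    if has_nvidia_smi is None:
        has_nvidia_smi = shutil.which("nvidia-smi") is not None

    problems = []
    if tuple(version) < (3, 10):
        problems.append(f"Python 3.10+ required, found {version[0]}.{version[1]}")
    if free_bytes < 4 * 1024**3:
        problems.append("less than ~4GB of free disk space")
    if not has_nvidia_smi:
        problems.append("nvidia-smi not found on PATH; a CUDA GPU or driver may be missing")
    return problems
```

Running `print(preflight("."))` in the directory where you plan to deploy lists anything that looks off.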

### Scripts

| Script | Purpose |
| --- | --- |
| scripts/say.sh | Generate speech: `say.sh "text" [output.mp3]` |
| scripts/batch_seeds.sh | Compare seeds: `batch_seeds.sh "text" seed1 seed2 ...` |
| scripts/tts_server.py | FastAPI server (fully env-configurable) |
| scripts/setup.sh | One-click deploy (venv + deps + model download) |

### OpenClaw Integration

In openclaw.json:

```json
{
  "env": { "OPENAI_TTS_BASE_URL": "http://<your-server>:8881/v1" },
  "messages": {
    "tts": {
      "provider": "openai",
      "openai": { "apiKey": "dummy", "model": "qwen3-tts", "voice": "default" },
      "timeoutMs": 120000
    }
  }
}
```

### Server Management

```bash
# Health check
curl -s $TTS_URL/health

# Start (foreground)
python tts_server.py

# Start (background, Linux/macOS)
nohup python tts_server.py > server.log 2>&1 &

# Auto-restart (Windows — scheduled task + guard script)
# Create tts_guard.bat:
#   @echo off
#   :loop
#   python tts_server.py
#   timeout /t 10
#   goto loop
# Register: schtasks /create /tn "TTS-Guard" /tr "tts_guard.bat" /sc onlogon /rl highest

# Auto-restart (Linux — systemd)
# See setup.sh output for systemd unit template

# Stop
# Linux/macOS: kill $(lsof -ti:8881)
# Windows: for /f "tokens=5" %a in ('netstat -aon ^| findstr :8881') do taskkill /PID %a /F
```
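When the server has just been started, the health endpoint can be polled until it answers. The sketch below is one way to do that in Python, assuming `/health` returns HTTP 200 once the server is ready (the response body is not documented here, so it is ignored). The function names and the 12 x 5 second retry budget are illustrative choices.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to url answers with HTTP 200, False on any connection or HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(
    base_url: str,
    attempts: int = 12,
    delay: float = 5.0,
    probe: Callable[[str], bool] = http_ok,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll <base_url>/health up to `attempts` times, sleeping `delay` seconds between tries."""
    health_url = base_url.rstrip("/") + "/health"
    for attempt in range(attempts):
        if probe(health_url):
            return True
        if attempt < attempts - 1:  # no need to sleep after the final failed attempt
            sleep(delay)
    return False
```

`wait_until_healthy("http://localhost:8881")` returns `False` rather than raising if the server never comes up, which keeps it easy to use in a startup script. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.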

### Troubleshooting

- Connection refused → Server not running; start it
- 30s+ first request → Cold start (model loading ~60s); subsequent requests 10-15s
- Behind proxy → Set `NO_PROXY=<server_ip>` on client side
- Windows firewall → `netsh advfirewall firewall add rule name="TTS" dir=in action=allow protocol=TCP localport=8881`
- No flash-attn on Windows → Expected; falls back to PyTorch SDPA (slower but works)
- PowerShell corrupts Chinese → Edit .env/config via Python or SCP, not PowerShell `Set-Content`
- Process dies on SSH disconnect → Use scheduled task (Windows) or systemd (Linux) instead of foreground

### Voice Design Tips

Describe like casting a voice actor:

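- Age/gender: "18岁女大学生" (18-year-old female college student) / "30岁男性播音员" (30-year-old male broadcaster)
- Texture: "柔和温暖" (soft and warm) / "清脆明亮" (crisp and bright) / "低沉磁性" (deep and magnetic)
- Emotion: "轻柔细腻" (gentle and delicate) / "活泼开朗" (lively and cheerful)
- Accent: "南方口音软糯" (soft southern accent) / "台湾腔" (Taiwanese accent) / "东北大碴子味" (strong Northeastern accent)
- Metaphor: "像棉花糖" (like cotton candy) / "像播音主持" (like a broadcast host); metaphors help the model capture the feeling

⚠️ Timbre ≠ description. The description controls style and emotion; the seed controls timbre. Don't put personality traits such as "灵动俏皮" (lively and playful) in the description; that is the seed's job.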
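One way to keep descriptions consistent while testing seeds is to assemble them from the five facets listed above. The helper below is purely illustrative: `compose_description` is a hypothetical name, and joining the parts with a Chinese comma is an assumption about a reasonable prompt format, not a requirement of the model.

```python
from typing import Optional


def compose_description(
    age_gender: str,
    texture: str,
    emotion: Optional[str] = None,
    accent: Optional[str] = None,
    metaphor: Optional[str] = None,
) -> str:
    """Join the non-empty facets into one description string, in the order listed above."""
    parts = [age_gender, texture, emotion, accent, metaphor]
    cleaned = [p.strip() for p in parts if p and p.strip()]
    return "，".join(cleaned) + "。"
```

The result can be used as the `TTS_INSTRUCT` value or passed as the `instruct` field of a `/tts` request.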
## Trust
- Source: tencent
- Verification: Indexed source record
- Publisher: xiaoyaner0201
- Version: 1.0.0
## Source health
- Status: healthy
- Source download looks usable.
- Yavira can redirect you to the upstream package for this source.
- Health scope: source
- Reason: direct_download_ok
- Checked at: 2026-04-30T16:55:25.780Z
- Expires at: 2026-05-07T16:55:25.780Z
- Recommended action: Download for OpenClaw
## Links
- [Detail page](https://openagent3.xyz/skills/qwen3-tts-voicedesign)
- [Send to Agent page](https://openagent3.xyz/skills/qwen3-tts-voicedesign/agent)
- [JSON manifest](https://openagent3.xyz/skills/qwen3-tts-voicedesign/agent.json)
- [Markdown brief](https://openagent3.xyz/skills/qwen3-tts-voicedesign/agent.md)
- [Download page](https://openagent3.xyz/downloads/qwen3-tts-voicedesign)