# Send Qwen3-TTS MLX to your agent
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
## Fast path
- Download the package from Yavira.
- Extract it into a folder your agent can access.
- Paste one of the prompts below and point your agent at the extracted folder.
## Suggested prompts
### New install

```text
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
```
### Upgrade existing

```text
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
```
## Machine-readable fields
```json
{
  "schemaVersion": "1.0",
  "item": {
    "slug": "qwen3-tts-mlx",
    "name": "Qwen3 Tts Mlx",
    "source": "tencent",
    "type": "skill",
    "category": "AI 智能",
    "sourceUrl": "https://clawhub.ai/h1bomb/qwen3-tts-mlx",
    "canonicalUrl": "https://clawhub.ai/h1bomb/qwen3-tts-mlx",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadUrl": "/downloads/qwen3-tts-mlx",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=qwen3-tts-mlx",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "packageFormat": "ZIP package",
    "primaryDoc": "SKILL.md",
    "includedAssets": [
      "SKILL.md",
      "references/dubbing_format.md",
      "scripts/batch_dubbing.py",
      "scripts/run_tts.py"
    ],
    "downloadMode": "redirect",
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T16:55:25.780Z",
      "expiresAt": "2026-05-07T16:55:25.780Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
        "contentDisposition": "attachment; filename=\"network-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/qwen3-tts-mlx"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    }
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/qwen3-tts-mlx",
    "downloadUrl": "https://openagent3.xyz/downloads/qwen3-tts-mlx",
    "agentUrl": "https://openagent3.xyz/skills/qwen3-tts-mlx/agent",
    "manifestUrl": "https://openagent3.xyz/skills/qwen3-tts-mlx/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/qwen3-tts-mlx/agent.md"
  }
}
```
## Documentation

### Qwen3-TTS MLX

Run Qwen3-TTS locally on Apple Silicon (M1/M2/M3/M4) using MLX. Supports 11 languages, 9 built-in voices, voice cloning, and voice design from text descriptions.

### When to Use

- Generate speech fully offline on a Mac
- Produce narration, audiobooks, podcasts, or video voiceovers
- Create multilingual TTS with controllable style and emotion
- Clone any voice from a short audio sample
- Design custom voices from text descriptions

### Install

pip install mlx-audio
brew install ffmpeg
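
Before the first run, it can help to confirm that both dependencies are visible. The following is a minimal sketch, assuming the `mlx-audio` package is importable as `mlx_audio` (the name the Python examples in this document import) and that `ffmpeg` should be on `PATH`. `check_dependencies` is a hypothetical helper name, not part of the package.

```python
import importlib.util
import shutil


def check_dependencies(module="mlx_audio", binary="ffmpeg"):
    """Report whether the Python module and the command-line binary can be found."""
    return {
        # find_spec returns None when the module is not installed.
        module: importlib.util.find_spec(module) is not None,
        # shutil.which returns None when the binary is not on PATH.
        binary: shutil.which(binary) is not None,
    }


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

A `missing` entry suggests rerunning the matching install command above.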

### Basic Usage

python scripts/run_tts.py custom-voice \\
  --text "Hello, welcome to local text to speech." \\
  --voice Ryan \\
  --output output.wav

### With Style Control

python scripts/run_tts.py custom-voice \\
  --text "Breaking news: local AI model achieves human-level speech." \\
  --voice Uncle_Fu \\
  --instruct "news anchor tone, calm and authoritative" \\
  --output news.wav

### Model Variants

| Variant | Model | Size | Memory | Use Case |
| --- | --- | --- | --- | --- |
| CustomVoice | mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-4bit | ~1 GB | ~4 GB | Built-in voices + style control (recommended) |
| VoiceDesign | mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-5bit | ~2 GB | ~5 GB | Create voices from text descriptions |
| Base | mlx-community/Qwen3-TTS-12Hz-0.6B-Base-4bit | ~1 GB | ~4 GB | Voice cloning from reference audio |

### Supported Languages

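| Language | Code | Notes |
| --- | --- | --- |
| Auto-detect | auto | Default, detects from text |
| Chinese | Chinese | Mandarin |
| English | English | |
| Japanese | Japanese | |
| Korean | Korean | |
| French | French | |
| German | German | |
| Spanish | Spanish | |
| Portuguese | Portuguese | |
| Italian | Italian | |
| Russian | Russian | |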

### Built-in Voices

| Voice | Language | Character |
| --- | --- | --- |
| Vivian | Chinese | Female, bright, young |
| Serena | Chinese | Female, gentle, soft |
| Uncle_Fu | Chinese | Male, authoritative, news anchor |
| Dylan | Chinese | Male, Beijing dialect |
| Eric | Chinese | Male, Sichuan dialect |
| Ryan | English | Male, energetic |
| Aiden | English | Male, clear, neutral |
| Ono_Anna | Japanese | Female |
| Sohee | Korean | Female |

Voice Selection Guide:

| Scenario | Recommended Voice |
| --- | --- |
| Chinese news/narration | Uncle_Fu |
| Chinese casual/lively | Eric |
| Chinese female, professional | Vivian |
| Chinese female, storytelling | Serena |
| English energetic content | Ryan |
| English neutral/educational | Aiden |
| Japanese content | Ono_Anna |
| Korean content | Sohee |
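
If voice selection needs to happen in a script, the guide above can be expressed as a lookup table. This is only a sketch: the scenario labels and voice names are copied from the guide, while `RECOMMENDED_VOICES` and `pick_voice` are hypothetical names introduced here. The fallback to `Vivian` mirrors the documented `--voice` default.

```python
# Scenario -> voice pairs copied from the Voice Selection Guide.
RECOMMENDED_VOICES = {
    "chinese news/narration": "Uncle_Fu",
    "chinese casual/lively": "Eric",
    "chinese female, professional": "Vivian",
    "chinese female, storytelling": "Serena",
    "english energetic content": "Ryan",
    "english neutral/educational": "Aiden",
    "japanese content": "Ono_Anna",
    "korean content": "Sohee",
}


def pick_voice(scenario, default="Vivian"):
    """Return the recommended voice for a scenario label, or the default voice."""
    return RECOMMENDED_VOICES.get(scenario.strip().lower(), default)


print(pick_voice("English neutral/educational"))
```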

### 1) CustomVoice

Use built-in voices with optional emotion/style control via --instruct.

python scripts/run_tts.py custom-voice \\
  --text "This is amazing news!" \\
  --voice Vivian \\
  --instruct "excited and happy" \\
  --output excited.wav

Style instruction examples:

- "calm and warm" - Soft, friendly delivery
- "news anchor, authoritative" - Professional broadcast style
- "excited and energetic" - High energy, enthusiastic
- "sad and melancholic" - Emotional, somber tone
- "whispering, intimate" - Quiet, close-mic feel

### 2) VoiceDesign

Create a completely new voice by describing it in natural language.

python scripts/run_tts.py voice-design \\
  --text "Welcome to our podcast." \\
  --instruct "warm, mature male narrator with low pitch and gentle tone" \\
  --output podcast_intro.wav

Voice description examples:

- "young cheerful female with high pitch"
- "elderly wise male with deep resonant voice"
- "professional female news anchor, clear articulation"
- "friendly young male, casual and relaxed"

### 3) VoiceClone

Clone any voice from a reference audio sample (5-10 seconds recommended).

python scripts/run_tts.py voice-clone \\
  --text "This is my cloned voice speaking new content." \\
  --ref_audio reference.wav \\
  --ref_text "The exact transcript of the reference audio" \\
  --output cloned.wav

Tips for voice cloning:

- Use clean audio without background noise
- 5-10 seconds of speech works best
- Provide an accurate transcript of the reference
- Reference and output language should match
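
The 5-10 second guideline can be checked before cloning. The sketch below uses Python's standard `wave` module, so it assumes the reference clip is an uncompressed PCM WAV file; other formats would need converting first (for example with ffmpeg). `wav_duration_seconds` and `is_good_reference_length` are hypothetical helper names.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of an uncompressed PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_good_reference_length(path, min_s=5.0, max_s=10.0):
    """True when the clip falls inside the recommended 5-10 second range."""
    return min_s <= wav_duration_seconds(path) <= max_s
```

This checks length only; background noise and transcript accuracy still need a manual review.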

### CLI Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| --text | Yes | - | Text to synthesize |
| --voice | No | Vivian | Built-in voice (CustomVoice only) |
| --lang_code | No | auto | Language code |
| --instruct | No | - | Style control or voice description |
| --speed | No | 1.0 | Speech speed multiplier |
| --temperature | No | 0.7 | Sampling temperature (higher = more variation) |
| --model | No | (per mode) | Override default model |
| --output | No | - | Output file path |
| --out-dir | No | ./outputs | Output directory when --output not set |
| --ref_audio | VoiceClone | - | Reference audio file |
| --ref_text | VoiceClone | - | Reference audio transcript |
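
When driving the CLI from another script, assembling the arguments as a list avoids shell-quoting problems with text that contains spaces or quotes. The sketch below uses only flags and defaults listed above; `build_custom_voice_command` is a hypothetical helper, and it assumes `scripts/run_tts.py` is reachable from the working directory.

```python
import shlex
import sys


def build_custom_voice_command(text, voice="Vivian", instruct=None,
                               lang_code="auto", speed=1.0, output=None,
                               script="scripts/run_tts.py"):
    """Build the argv list for a custom-voice run from the documented flags."""
    cmd = [sys.executable, script, "custom-voice",
           "--text", text,
           "--voice", voice,
           "--lang_code", lang_code,
           "--speed", str(speed)]
    # Optional flags are appended only when a value was supplied.
    if instruct:
        cmd += ["--instruct", instruct]
    if output:
        cmd += ["--output", output]
    return cmd


cmd = build_custom_voice_command(
    "Hello, welcome to local text to speech.",
    voice="Ryan",
    output="output.wav",
)
# Print the command for inspection; nothing is executed here.
print(shlex.join(cmd))
```

To run the command, pass the list to `subprocess.run(cmd, check=True)`.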

### Using generate_audio (recommended)

from mlx_audio.tts.generate import generate_audio

# CustomVoice with style control
generate_audio(
    text="Hello from Qwen3-TTS!",
    model="mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-4bit",
    voice="Ryan",
    lang_code="english",
    instruct="friendly and warm",
    output_path=".",
    file_prefix="hello",
    audio_format="wav",
    join_audio=True,
    verbose=True,
)

### Using Model directly

from mlx_audio.tts.utils import load
import soundfile as sf
import numpy as np

# Load model
model = load("mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-4bit")

# Generate audio (returns a generator)
audio_chunks = []
for chunk in model.generate_custom_voice(
    text="Hello from Qwen3-TTS.",
    speaker="Ryan",
    language="english",
    instruct="clear, steady delivery"
):
    if hasattr(chunk, 'audio') and chunk.audio is not None:
        audio_chunks.append(chunk.audio)

# Combine and save
audio = np.concatenate(audio_chunks)
sf.write("output.wav", audio, 24000)

### VoiceDesign

from mlx_audio.tts.generate import generate_audio

generate_audio(
    text="Welcome to the show.",
    model="mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-5bit",
    instruct="warm, friendly female narrator with medium pitch",
    lang_code="english",
    output_path=".",
    file_prefix="voice_design",
    join_audio=True,
)

### VoiceClone

from mlx_audio.tts.generate import generate_audio

generate_audio(
    text="New content in the cloned voice.",
    model="mlx-community/Qwen3-TTS-12Hz-0.6B-Base-4bit",
    ref_audio="reference.wav",
    ref_text="Transcript of the reference audio",
    output_path=".",
    file_prefix="cloned",
    join_audio=True,
)

### Batch Processing

Use scripts/batch_dubbing.py for processing multiple lines:

python scripts/batch_dubbing.py \\
  --input dubbing.json \\
  --out-dir outputs

See references/dubbing_format.md for the JSON format.

### Performance

| Metric | Value |
| --- | --- |
| Sample rate | 24,000 Hz |
| Real-time factor | ~0.7x (faster than real-time) |
| Peak memory | ~4-6 GB |
| First run | Downloads model (~1-2 GB) |
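
These figures allow a rough time estimate. The sketch below assumes the common definition of real-time factor (generation time divided by audio duration), which the source does not spell out, and treats ~0.7 as an approximate value that will vary by machine and model. The function names are hypothetical.

```python
SAMPLE_RATE_HZ = 24_000   # sample rate stated above
REAL_TIME_FACTOR = 0.7    # approximate figure stated above


def audio_seconds(num_samples, sample_rate=SAMPLE_RATE_HZ):
    """Convert a sample count into seconds of audio."""
    return num_samples / sample_rate


def estimated_generation_seconds(audio_s, rtf=REAL_TIME_FACTOR):
    """Estimate wall-clock generation time, assuming RTF = generation time / audio time."""
    return audio_s * rtf


# One minute of audio at 24 kHz is 1,440,000 samples.
one_minute = audio_seconds(1_440_000)
print(f"{one_minute:.0f} s of audio, about {estimated_generation_seconds(one_minute):.0f} s to generate")
```

The estimate excludes the one-time model download on first run.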

### Troubleshooting

| Issue | Solution |
| --- | --- |
| Slow generation | Use 4-bit CustomVoice model |
| Unnatural pauses | Add punctuation, keep sentences short |
| Wrong language detected | Specify --lang_code explicitly |
| Voice cloning quality | Use cleaner reference audio, accurate transcript |
| Tokenizer warnings | Harmless, can be ignored |
| Out of memory | Close other apps, use 4-bit model |
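
For the "keep sentences short" advice, long input can be split into sentences and synthesized piece by piece. This is a minimal sketch, not part of the package: `split_into_sentences` is a hypothetical helper that splits after `.`, `!`, or `?` when followed by whitespace, and after the CJK marks `。！？` unconditionally. It will mis-split abbreviations such as "e.g. this".

```python
import re

# Latin punctuation must be followed by whitespace (so "1.5" stays intact);
# CJK sentence marks split even without whitespace.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+|(?<=[。！？])")


def split_into_sentences(text):
    """Split text into sentences, keeping the terminal punctuation."""
    parts = _SENTENCE_BOUNDARY.split(text.strip())
    return [part.strip() for part in parts if part.strip()]


for sentence in split_into_sentences("Hello there. How are you? Fine!"):
    print(sentence)
```

Each returned sentence can then be passed as a separate `--text` value.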
## Trust
- Source: tencent
- Verification: Indexed source record
- Publisher: h1bomb
- Version: 2.1.0
## Source health
- Status: healthy
- Source download looks usable.
- Yavira can redirect you to the upstream package for this source.
- Health scope: source
- Reason: direct_download_ok
- Checked at: 2026-04-30T16:55:25.780Z
- Expires at: 2026-05-07T16:55:25.780Z
- Recommended action: Download for OpenClaw
## Links
- [Detail page](https://openagent3.xyz/skills/qwen3-tts-mlx)
- [Send to Agent page](https://openagent3.xyz/skills/qwen3-tts-mlx/agent)
- [JSON manifest](https://openagent3.xyz/skills/qwen3-tts-mlx/agent.json)
- [Markdown brief](https://openagent3.xyz/skills/qwen3-tts-mlx/agent.md)
- [Download page](https://openagent3.xyz/downloads/qwen3-tts-mlx)