{
  "schemaVersion": "1.0",
  "item": {
    "slug": "voice-stt-tts",
    "name": "Voice messaging setup",
    "source": "tencent",
    "type": "skill",
    "category": "通讯协作",
    "sourceUrl": "https://clawhub.ai/aksenkin/voice-stt-tts",
    "canonicalUrl": "https://clawhub.ai/aksenkin/voice-stt-tts",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/voice-stt-tts",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=voice-stt-tts",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "SKILL.md"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T16:55:25.780Z",
      "expiresAt": "2026-05-07T16:55:25.780Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
        "contentDisposition": "attachment; filename=\"network-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/voice-stt-tts"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/voice-stt-tts",
    "agentPageUrl": "https://openagent3.xyz/skills/voice-stt-tts/agent",
    "manifestUrl": "https://openagent3.xyz/skills/voice-stt-tts/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/voice-stt-tts/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "Voice Messages (STT + TTS) for OpenClaw 🎙️",
        "body": "Complete voice message setup using faster-whisper for transcription and Edge TTS for voice replies."
      },
      {
        "title": "What we configure",
        "body": "✅ STT (Speech-to-Text) — transcribe voice messages via faster-whisper\n✅ TTS (Text-to-Speech) — voice replies via Edge TTS\n🎯 Result: voice → text → reply with voice"
      },
      {
        "title": "1. Create virtual environment (venv)",
        "body": "For Ubuntu create an isolated venv:\n\npython3 -m venv ~/.openclaw/workspace/voice-messages"
      },
      {
        "title": "2. Install faster-whisper",
        "body": "Install packages in venv:\n\n~/.openclaw/workspace/voice-messages/bin/pip install faster-whisper\n\nWhat gets installed:\n\nfaster-whisper — Python library for transcription\nDependencies: ctranslate2, onnxruntime, huggingface-hub, av, numpy, and others.\nSize: ~250 MB"
      },
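      {
        "title": "Optional: verify the install",
        "body": "A quick sanity check (a sketch, not part of the upstream skill docs): confirm the venv's Python can import the library before wiring it into OpenClaw.\n\n~/.openclaw/workspace/voice-messages/bin/python -c \"import faster_whisper; print('ok')\"\n\nIf this prints ok, the venv is ready; an ImportError means the pip install above did not complete."
      },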
      {
        "title": "Path and content",
        "body": "File: ~/.openclaw/workspace/voice-messages/transcribe.py\n\n#!/usr/bin/env python3\nimport argparse\nfrom faster_whisper import WhisperModel\n\n\ndef transcribe(audio_path: str, model_name: str = \"small\", lang: str = \"en\", device: str = \"cpu\") -> str:\n    model = WhisperModel(\n        model_name,\n        device=device,\n        compute_type=\"int8\" if device == \"cpu\" else \"float16\",\n    )\n    segments, _ = model.transcribe(audio_path, language=lang, vad_filter=True)\n    text = \" \".join(seg.text.strip() for seg in segments if seg.text and seg.text.strip()).strip()\n    return text\n\n\ndef main():\n    p = argparse.ArgumentParser()\n    p.add_argument(\"--audio\", required=True)\n    p.add_argument(\"--model\", default=\"small\")\n    p.add_argument(\"--lang\", default=\"en\")\n    p.add_argument(\"--device\", default=\"cpu\", choices=[\"cpu\", \"cuda\"])\n    args = p.parse_args()\n\n    text = transcribe(args.audio, args.model, args.lang, args.device)\n    print(text if text else \"\")\n\n\nif __name__ == \"__main__\":\n    main()\n\nWhat the script does:\n\nAccepts audio file path (--audio)\nLoads Whisper model (--model): small by default\nSets language (--lang): en for English\nTranscribes with VAD filter (Voice Activity Detection)\nOutputs clean text to stdout"
      },
      {
        "title": "Make file executable:",
        "body": "chmod +x ~/.openclaw/workspace/voice-messages/transcribe.py"
      },
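      {
        "title": "Optional: test the script manually",
        "body": "Before configuring OpenClaw, you can run the script directly against any local audio file (sample.ogg below is a placeholder for your own recording):\n\n~/.openclaw/workspace/voice-messages/bin/python \\\n  ~/.openclaw/workspace/voice-messages/transcribe.py \\\n  --audio sample.ogg --lang en --model small\n\nThe first run also downloads the Whisper model, so this is a convenient way to warm the cache (~/.cache/huggingface/) before real traffic arrives."
      },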
      {
        "title": "1. Configure STT (tools.media.audio)",
        "body": "Add to ~/.openclaw/openclaw.json:\n\n{\n  \"tools\": {\n    \"media\": {\n      \"audio\": {\n        \"enabled\": true,\n        \"maxBytes\": 20971520,\n        \"models\": [\n          {\n            \"type\": \"cli\",\n            \"command\": \"~/.openclaw/workspace/voice-messages/bin/python\",\n            \"args\": [\n              \"~/.openclaw/workspace/voice-messages/transcribe.py\",\n              \"--audio\",\n              \"{{MediaPath}}\",\n              \"--lang\",\n              \"en\",\n              \"--model\",\n              \"small\"\n            ],\n            \"timeoutSeconds\": 120\n          }\n        ]\n      }\n    }\n  }\n}\n\nParameters:\n\nParameterValueDescriptionenabledtrueEnable audio transcriptionmaxBytes20971520Max file size (20 MB)type\"cli\"Model type: CLI commandcommandPython pathPath to python in venvargsargument arrayArguments for script{{MediaPath}}placeholderReplaced with audio file pathtimeoutSeconds120Transcription timeout (2 minutes)"
      },
      {
        "title": "2. Configure TTS (messages.tts)",
        "body": "Add to ~/.openclaw/openclaw.json:\n\n{\n  \"messages\": {\n    \"tts\": {\n      \"auto\": \"inbound\",\n      \"provider\": \"edge\",\n      \"edge\": {\n        \"voice\": \"en-US-JennyNeural\",\n        \"lang\": \"en-US\"\n      }\n    }\n  }\n}\n\nParameters:\n\nParameterValueDescriptionauto\"inbound\"Key mode! — reply with voice only on incoming voice messagesprovider\"edge\"TTS provider (free, no API key)voice\"en-US-JennyNeural\"Voice (see available below)lang\"en-US\"Locale (en-US for US english)"
      },
      {
        "title": "3. Full configuration example",
        "body": "{\n  \"tools\": {\n    \"media\": {\n      \"audio\": {\n        \"enabled\": true,\n        \"maxBytes\": 20971520,\n        \"models\": [\n          {\n            \"type\": \"cli\",\n            \"command\": \"~/.openclaw/workspace/voice-messages/bin/python\",\n            \"args\": [\n              \"~/.openclaw/workspace/voice-messages/transcribe.py\",\n              \"--audio\",\n              \"{{MediaPath}}\",\n              \"--lang\",\n              \"en\",\n              \"--model\",\n              \"small\"\n            ],\n            \"timeoutSeconds\": 120\n          }\n        ]\n      }\n    },\n  },\n  \"messages\": {\n    \"tts\": {\n      \"auto\": \"inbound\",\n      \"provider\": \"edge\",\n      \"edge\": {\n        \"voice\": \"en-US-JennyNeural\",\n        \"lang\": \"en-US\"\n      }\n    },\n    \"ackReactionScope\": \"group-mentions\"\n  }\n}"
      },
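      {
        "title": "Optional: validate the merged config",
        "body": "openclaw.json must remain valid JSON after merging the snippets above; a quick check (assumes jq is installed, as it is used elsewhere in this guide):\n\njq empty ~/.openclaw/openclaw.json && echo \"config OK\"\n\njq exits non-zero on a parse error (for example a trailing comma), which is the most common mistake when hand-editing the config."
      },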
      {
        "title": "Restart Gateway",
        "body": "# Method 1: via openclaw CLI\nopenclaw gateway restart\n\n# Method 2: via systemd\nsystemctl --user restart openclaw-gateway\n\n# Check status\nsystemctl --user status openclaw-gateway\n# Should show: active (running)"
      },
      {
        "title": "Test STT (transcription)",
        "body": "Action: Send a voice message to your Telegram bot\n\nExpected result:\n\n[Audio] User text: [Telegram ...] <media:audio> Transcript: <transcribed text>\n\nExample response:\n\n[Audio] User text: [Telegram kd (@someuser) id:12345678 +5s ...] <media:audio> Transcript: Hello. How are you?"
      },
      {
        "title": "Test TTS (voice replies)",
        "body": "Action: After successful transcription, bot should send a voice reply\n\nExpected result:\n\nVoice file arrives in Telegram\nVoice note (round bubble)\n\nExpected behavior:\n\nIncoming voice → bot replies with voice\nText messages → bot replies with text (this is normal!)"
      },
      {
        "title": "Female voices",
        "body": "VoiceIDUsage exampleJennyen-US-JennyNeural← currentAnaen-US-AnaNeuralSofter"
      },
      {
        "title": "Male voices",
        "body": "VoiceIDUsage exampleDmitryen-US-RogerNeuralMore bass\n\nHow to change voice:\n\ncat ~/.openclaw/openclaw.json | \\\n  jq '.messages.tts.edge.voice = \"en-US-MichelleNeural\"' > ~/.openclaw/openclaw.json.tmp\nmv ~/.openclaw/openclaw.json.tmp ~/.openclaw/openclaw.json\nsystemctl --user restart openclaw-gateway"
      },
      {
        "title": "Adjusting speed, pitch, volume",
        "body": "{\n  \"messages\": {\n    \"tts\": {\n      \"edge\": {\n        \"voice\": \"en-US-JennyNeural\",\n        \"lang\": \"en-US\",\n        \"rate\": \"+10%\",      // Speed: -50% to +100%\n        \"pitch\": \"-5%\",     // Pitch: -50% to +50%\n        \"volume\": \"+5%\"     // Volume: -100% to +100%\n      }\n    }\n  }\n}"
      },
      {
        "title": "Problem: Voice not transcribed",
        "body": "Logs show:\n\n[ERROR] Transcription failed\n\nPossible causes:\n\nFile too large — > 20 MB\n# Solution: Increase maxBytes in config\nmaxBytes: 52428800  # 50 MB\n\n\n\nTimeout — transcription took > 2 minutes\n# Solution: Increase timeoutSeconds\ntimeoutSeconds: 180  # 3 minutes\n\n\n\nModel not downloaded — first run\n# Solution: Wait while it downloads (1-2 minutes)\n# Models are cached in ~/.cache/huggingface/"
      },
      {
        "title": "Problem: No voice reply",
        "body": "Possible causes:\n\nReply too short (< 10 characters)\n\nTTS skips very short replies\nSolution: this is expected behavior\n\n\n\nauto: \"inbound\" but text message\n\nTTS in inbound mode replies with voice only on voice messages\nText messages get text replies — this is correct!\n\n\n\nEdge TTS unavailable\n# Check\ncurl -s \"https://speech.platform.bing.com/consumer/api/v1/tts\" | head -c 100\n# If error — temporarily unavailable"
      },
      {
        "title": "Transcription time (Raspberry Pi 4/ARM)",
        "body": "Whisper ModelEst. timeQualitytiny~5-10 secLowbase~10-20 secMediumsmall~20-40 secHigh ← currentmedium~40-80 secVery highlarge~80-160 secMaximum\n\nRecommendation: For Raspberry Pi use small or base. medium/large will be very slow."
      },
      {
        "title": "Where Whisper models are stored",
        "body": "~/.cache/huggingface/\n\nModels download automatically on first run."
      },
      {
        "title": "Done! 🎉",
        "body": "After completing these steps:\n\n✅ faster-whisper installed in venv\n✅ transcribe.py script created\n✅ OpenClaw configured (STT + TTS)\n✅ Gateway restarted\n✅ Voice messages working\n\nNow your Telegram bot:\n\n🎙️ Accepts voice → transcribes via faster-whisper\n🎤 Replies with voice → generates via Edge TTS\n💬 Accepts text → replies with text (as usual)\n\nUseful links:\n\nOpenClaw docs: https://docs.openclaw.ai\nTTS docs: https://docs.openclaw.ai/tts\nAudio docs: https://docs.openclaw.ai/nodes/audio\nInstall skills: npx clawhub search voice\n\nCreated: 2026-03-01 for OpenClaw 2026.2.26"
      }
    ]
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/aksenkin/voice-stt-tts",
    "publisherUrl": "https://clawhub.ai/aksenkin/voice-stt-tts",
    "owner": "aksenkin",
    "version": "1.0.3",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/voice-stt-tts",
    "downloadUrl": "https://openagent3.xyz/downloads/voice-stt-tts",
    "agentUrl": "https://openagent3.xyz/skills/voice-stt-tts/agent",
    "manifestUrl": "https://openagent3.xyz/skills/voice-stt-tts/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/voice-stt-tts/agent.md"
  }
}