{
  "schemaVersion": "1.0",
  "item": {
    "slug": "zhipu-asr",
    "name": "Zhipu AI ASR",
    "source": "tencent",
    "type": "skill",
    "category": "AI 智能",
    "sourceUrl": "https://clawhub.ai/franklu0819-lang/zhipu-asr",
    "canonicalUrl": "https://clawhub.ai/franklu0819-lang/zhipu-asr",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/zhipu-asr",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=zhipu-asr",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "README.md",
      "SKILL.md",
      "_meta.json",
      "package.json",
      "scripts/speech_to_text.sh"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T16:55:25.780Z",
      "expiresAt": "2026-05-07T16:55:25.780Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
        "contentDisposition": "attachment; filename=\"network-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/zhipu-asr"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/zhipu-asr",
    "agentPageUrl": "https://openagent3.xyz/skills/zhipu-asr/agent",
    "manifestUrl": "https://openagent3.xyz/skills/zhipu-asr/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/zhipu-asr/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "Zhipu AI Automatic Speech Recognition (ASR)",
        "body": "Transcribe Chinese audio files to text using Zhipu AI's GLM-ASR model."
      },
      {
        "title": "Setup",
        "body": "1. Get an API key from the Zhipu AI Console.\n\n2. Set it in your environment:\n\nexport ZHIPU_API_KEY=\"your-key-here\""
      },
      {
        "title": "Supported Audio Formats",
        "body": "WAV - Recommended, best quality\nMP3 - Widely supported\nOGG - Auto-converted to MP3\nM4A - Auto-converted to MP3\nAAC - Auto-converted to MP3\nFLAC - Auto-converted to MP3\nWMA - Auto-converted to MP3\n\nNote: The API accepts only WAV and MP3. The script uses ffmpeg to convert other formats to MP3 automatically, so any format that ffmpeg supports will work."
      },
      {
        "title": "File Constraints",
        "body": "Maximum file size: 25 MB\nMaximum duration: 30 seconds\nRecommended sample rate: 16000 Hz or higher\nAudio channels: Mono or stereo"
      },
      {
        "title": "Basic Transcription",
        "body": "Transcribe an audio file with default settings:\n\nbash scripts/speech_to_text.sh recording.wav"
      },
      {
        "title": "Transcription with Context",
        "body": "Provide previous transcription or context for better accuracy:\n\nbash scripts/speech_to_text.sh recording.wav \"这是之前的转录内容，有助于提高准确性\""
      },
      {
        "title": "Transcription with Hotwords",
        "body": "Use custom vocabulary to improve recognition of specific terms:\n\nbash scripts/speech_to_text.sh recording.mp3 \"\" \"人名,地名,专业术语,公司名称\""
      },
      {
        "title": "Full Options",
        "body": "Combine context and hotwords:\n\nbash scripts/speech_to_text.sh recording.wav \"会议记录片段\" \"张三,李四,项目名称\"\n\nParameters:\n\naudio_file (required): Path to the audio file (.wav or .mp3; other formats are converted to MP3 with ffmpeg)\nprompt (optional): Previous transcription or context text (max 8000 characters)\nhotwords (optional): Comma-separated list of specific terms (max 100 terms)"
      },
      {
        "title": "Context Prompts",
        "body": "Why use context prompts:\n\nImproves accuracy in long conversations\nHelps with domain-specific terminology\nMaintains consistency across multiple segments\n\nWhen to use:\n\nMulti-part conversations or meetings\nTechnical or specialized content\nContinuing from previous transcriptions\n\nExample:\n\nbash scripts/speech_to_text.sh part2.wav \"第一部分的转录内容：讨论了项目进展和下一步计划\""
      },
      {
        "title": "Hotwords",
        "body": "What are hotwords:\nCustom vocabulary list that boosts recognition accuracy for specific terms.\n\nBest use cases:\n\nProper names (people, places)\nDomain-specific terminology\nCompany names and products\nTechnical jargon\nIndustry-specific terms\n\nExamples:\n\n# Medical transcription\nbash scripts/speech_to_text.sh medical.wav \"\" \"患者,症状,诊断,治疗方案\"\n\n# Business meeting\nbash scripts/speech_to_text.sh meeting.wav \"\" \"张经理,李总,项目代号,预算\"\n\n# Tech discussion\nbash scripts/speech_to_text.sh tech.wav \"\" \"API,数据库,算法,框架\""
      },
      {
        "title": "Transcribe a Meeting",
        "body": "# Part 1\nbash scripts/speech_to_text.sh meeting_part1.wav\n\n# Part 2 with context\nbash scripts/speech_to_text.sh meeting_part2.wav \"第一部分讨论了项目进度\" \"张总,李经理,项目名称\"\n\n# Part 3 with context\nbash scripts/speech_to_text.sh meeting_part3.wav \"前两部分讨论了项目进度和预算\" \"张总,李经理,项目名称\""
      },
      {
        "title": "Transcribe a Lecture",
        "body": "bash scripts/speech_to_text.sh lecture.wav \"\" \"教授,课程名称,专业术语1,专业术语2\""
      },
      {
        "title": "Process Multiple Files",
        "body": "for file in recording_*.wav; do\n    bash scripts/speech_to_text.sh \"$file\"\ndone"
      },
      {
        "title": "Audio Quality Tips",
        "body": "Best practices for accurate transcription:\n\nClear audio source\n\nMinimize background noise\nUse good quality microphone\nSpeak clearly and at moderate pace\n\n\n\nOptimal audio settings\n\nSample rate: 16000 Hz or higher\nBit depth: 16-bit or higher\nSingle channel (mono) is sufficient\n\n\n\nFile preparation\n\nRemove silence from beginning/end\nNormalize audio levels\nEnsure consistent volume"
      },
      {
        "title": "Output Format",
        "body": "The script outputs JSON with:\n\nid: Task ID\ncreated: Request timestamp (Unix timestamp)\nrequest_id: Unique request identifier\nmodel: Model name used\ntext: Transcribed text\n\nExample output:\n\n{\n  \"id\": \"task-12345\",\n  \"created\": 1234567890,\n  \"request_id\": \"req-abc123\",\n  \"model\": \"glm-asr-2512\",\n  \"text\": \"你好，这是转录的文本内容\"\n}"
      },
      {
        "title": "Troubleshooting",
        "body": "File Size Issues:\n\nSplit audio files larger than 25 MB\nReduce sample rate or bit depth\nUse compression (MP3) for smaller files\n\nDuration Issues:\n\nSplit recordings longer than 30 seconds\nProcess segments separately\nUse context prompts to maintain continuity\n\nPoor Accuracy:\n\nImprove audio quality\nUse hotwords for specific terms\nProvide context prompts\nEnsure clear speech and minimal noise\n\nFormat Issues:\n\nEnsure the file is .wav or .mp3, or another format that ffmpeg can convert\nCheck that the file is not corrupted\nVerify that the audio plays in a standard player"
      },
      {
        "title": "Limitations",
        "body": "Maximum audio duration: 30 seconds per request\nFile size limit: 25 MB\nMaximum hotwords: 100 terms\nContext prompt limit: 8000 characters\nBest performance with Chinese language audio"
      },
      {
        "title": "Performance Notes",
        "body": "Typical transcription time: 1-3 seconds\nReal-time or faster for most audio\nProcessing time scales with audio quality and length"
      }
    ],
    "body": "Zhipu AI Automatic Speech Recognition (ASR)\n\nTranscribe Chinese audio files to text using Zhipu AI's GLM-ASR model.\n\nSetup\n\n1. Get an API key from the Zhipu AI Console.\n\n2. Set it in your environment:\n\nexport ZHIPU_API_KEY=\"your-key-here\"\n\nSupported Audio Formats\nWAV - Recommended, best quality\nMP3 - Widely supported\nOGG - Auto-converted to MP3\nM4A - Auto-converted to MP3\nAAC - Auto-converted to MP3\nFLAC - Auto-converted to MP3\nWMA - Auto-converted to MP3\n\nNote: The API accepts only WAV and MP3. The script uses ffmpeg to convert other formats to MP3 automatically, so any format that ffmpeg supports will work.\n\nFile Constraints\nMaximum file size: 25 MB\nMaximum duration: 30 seconds\nRecommended sample rate: 16000 Hz or higher\nAudio channels: Mono or stereo\nUsage\nBasic Transcription\n\nTranscribe an audio file with default settings:\n\nbash scripts/speech_to_text.sh recording.wav\n\nTranscription with Context\n\nProvide previous transcription or context for better accuracy:\n\nbash scripts/speech_to_text.sh recording.wav \"这是之前的转录内容，有助于提高准确性\"\n\nTranscription with Hotwords\n\nUse custom vocabulary to improve recognition of specific terms:\n\nbash scripts/speech_to_text.sh recording.mp3 \"\" \"人名,地名,专业术语,公司名称\"\n\nFull Options\n\nCombine context and hotwords:\n\nbash scripts/speech_to_text.sh recording.wav \"会议记录片段\" \"张三,李四,项目名称\"\n\n\nParameters:\n\naudio_file (required): Path to the audio file (.wav or .mp3; other formats are converted to MP3 with ffmpeg)\nprompt (optional): Previous transcription or context text (max 8000 characters)\nhotwords (optional): Comma-separated list of specific terms (max 100 terms)\nFeatures\nContext Prompts\n\nWhy use context prompts:\n\nImproves accuracy in long conversations\nHelps with domain-specific terminology\nMaintains consistency across multiple segments\n\nWhen to use:\n\nMulti-part conversations or meetings\nTechnical or specialized content\nContinuing from previous transcriptions\n\nExample:\n\nbash scripts/speech_to_text.sh part2.wav \"第一部分的转录内容：讨论了项目进展和下一步计划\"\n\nHotwords\n\nWhat are hotwords: Custom vocabulary list that boosts recognition accuracy for specific terms.\n\nBest use cases:\n\nProper names (people, places)\nDomain-specific terminology\nCompany names and products\nTechnical jargon\nIndustry-specific terms\n\nExamples:\n\n# Medical transcription\nbash scripts/speech_to_text.sh medical.wav \"\" \"患者,症状,诊断,治疗方案\"\n\n# Business meeting\nbash scripts/speech_to_text.sh meeting.wav \"\" \"张经理,李总,项目代号,预算\"\n\n# Tech discussion\nbash scripts/speech_to_text.sh tech.wav \"\" \"API,数据库,算法,框架\"\n\nWorkflow Examples\nTranscribe a Meeting\n# Part 1\nbash scripts/speech_to_text.sh meeting_part1.wav\n\n# Part 2 with context\nbash scripts/speech_to_text.sh meeting_part2.wav \"第一部分讨论了项目进度\" \"张总,李经理,项目名称\"\n\n# Part 3 with context\nbash scripts/speech_to_text.sh meeting_part3.wav \"前两部分讨论了项目进度和预算\" \"张总,李经理,项目名称\"\n\nTranscribe a Lecture\nbash scripts/speech_to_text.sh lecture.wav \"\" \"教授,课程名称,专业术语1,专业术语2\"\n\nProcess Multiple Files\nfor file in recording_*.wav; do\n    bash scripts/speech_to_text.sh \"$file\"\ndone\n\nAudio Quality Tips\n\nBest practices for accurate transcription:\n\nClear audio source\n\nMinimize background noise\nUse good quality microphone\nSpeak clearly and at moderate pace\n\nOptimal audio settings\n\nSample rate: 16000 Hz or higher\nBit depth: 16-bit or higher\nSingle channel (mono) is sufficient\n\nFile preparation\n\nRemove silence from beginning/end\nNormalize audio levels\nEnsure consistent volume\nOutput Format\n\nThe script outputs JSON with:\n\nid: Task ID\ncreated: Request timestamp (Unix timestamp)\nrequest_id: Unique request identifier\nmodel: Model name used\ntext: Transcribed text\n\nExample output:\n\n{\n  \"id\": \"task-12345\",\n  \"created\": 1234567890,\n  \"request_id\": \"req-abc123\",\n  \"model\": \"glm-asr-2512\",\n  \"text\": \"你好，这是转录的文本内容\"\n}\n\nTroubleshooting\n\nFile Size Issues:\n\nSplit audio files larger than 25 MB\nReduce sample rate or bit depth\nUse compression (MP3) for smaller files\n\nDuration Issues:\n\nSplit recordings longer than 30 seconds\nProcess segments separately\nUse context prompts to maintain continuity\n\nPoor Accuracy:\n\nImprove audio quality\nUse hotwords for specific terms\nProvide context prompts\nEnsure clear speech and minimal noise\n\nFormat Issues:\n\nEnsure the file is .wav or .mp3, or another format that ffmpeg can convert\nCheck that the file is not corrupted\nVerify that the audio plays in a standard player\nLimitations\nMaximum audio duration: 30 seconds per request\nFile size limit: 25 MB\nMaximum hotwords: 100 terms\nContext prompt limit: 8000 characters\nBest performance with Chinese language audio\nPerformance Notes\nTypical transcription time: 1-3 seconds\nReal-time or faster for most audio\nProcessing time scales with audio quality and length"
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/franklu0819-lang/zhipu-asr",
    "publisherUrl": "https://clawhub.ai/franklu0819-lang/zhipu-asr",
    "owner": "franklu0819-lang",
    "version": "1.0.2",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/zhipu-asr",
    "downloadUrl": "https://openagent3.xyz/downloads/zhipu-asr",
    "agentUrl": "https://openagent3.xyz/skills/zhipu-asr/agent",
    "manifestUrl": "https://openagent3.xyz/skills/zhipu-asr/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/zhipu-asr/agent.md"
  }
}