{
  "schemaVersion": "1.0",
  "item": {
    "slug": "ima-all-ai",
    "name": "IMA Studio All AI Generation",
    "source": "tencent",
    "type": "skill",
    "category": "开发工具",
    "sourceUrl": "https://clawhub.ai/allenfancy-gan/ima-all-ai",
    "canonicalUrl": "https://clawhub.ai/allenfancy-gan/ima-all-ai",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/ima-all-ai",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=ima-all-ai",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "_meta.json",
      "requirements.txt",
      "clawhub.json",
      "SKILL.md",
      "scripts/ima_create.py",
      "scripts/ima_logger.py"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T16:55:25.780Z",
      "expiresAt": "2026-05-07T16:55:25.780Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
        "contentDisposition": "attachment; filename=\"network-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/ima-all-ai"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/ima-all-ai",
    "agentPageUrl": "https://openagent3.xyz/skills/ima-all-ai/agent",
    "manifestUrl": "https://openagent3.xyz/skills/ima-all-ai/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/ima-all-ai/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "⚠️ 重要：模型 ID 参考",
        "body": "CRITICAL: When calling the script, you MUST use the exact model_id (second column), NOT the friendly model name. Do NOT infer model_id from the friendly name (e.g., ❌ nano-banana-pro is WRONG; ✅ gemini-3-pro-image is CORRECT).\n\nQuick Reference Table:"
      },
      {
        "title": "图像模型 (Image Models)",
        "body": "友好名称 (Friendly Name)model_id说明 (Notes)Nano Banana2gemini-3.1-flash-image❌ NOT nano-banana-2, 预算选择 4-13 ptsNano Banana Progemini-3-pro-image❌ NOT nano-banana-pro, 高质量 10-18 ptsSeeDream 4.5doubao-seedream-4.5✅ Recommended default, 5 ptsMidjourneymidjourney✅ Same as friendly name, 8-10 pts"
      },
      {
        "title": "视频模型 (Video Models)",
        "body": "友好名称 (Friendly Name)model_id (t2v)model_id (i2v)说明 (Notes)Wan 2.6wan2.6-t2vwan2.6-i2v⚠️ Note -t2v/-i2v suffixKling O1kling-video-o1kling-video-o1⚠️ Note video- prefixKling 2.6kling-v2-6kling-v2-6⚠️ Note v prefixHailuo 2.3MiniMax-Hailuo-2.3MiniMax-Hailuo-2.3⚠️ Note MiniMax- prefixHailuo 2.0MiniMax-Hailuo-02MiniMax-Hailuo-02⚠️ Note 02 not 2.0Google Veo 3.1veo-3.1-generate-previewveo-3.1-generate-preview⚠️ Note -generate-preview suffixSora 2 Prosora-2-prosora-2-pro✅ StraightforwardPixversepixversepixverse✅ Same as friendly name"
      },
      {
        "title": "音乐模型 (Music Models)",
        "body": "友好名称 (Friendly Name)model_id说明 (Notes)Suno (sonic v4)sonic⚠️ Simplified to sonicDouBao BGMGenBGM❌ NOT doubao-bgmDouBao SongGenSong❌ NOT doubao-song"
      },
      {
        "title": "语音模型 (Speech/TTS Models)",
        "body": "友好名称 (Friendly Name)model_id说明 (Notes)seed-tts-2.0seed-tts-2.0✅ Same as friendly name (default)\n\nHow to get the correct model_id:\n\nCheck this table first\nUse --list-models --task-type <type> to query available models\nRefer to command examples in this SKILL.md\n\nExample:\n\n# ❌ WRONG: Inferring from friendly name\n--model-id nano-banana-pro\n\n# ✅ CORRECT: Using exact model_id from table\n--model-id gemini-3-pro-image"
      },
      {
        "title": "⚠️ MANDATORY PRE-CHECK: Read Knowledge Base First!",
        "body": "If ima-knowledge-ai is not installed: Skip all \"Read …\" steps below; use only this SKILL's 📥 User Input Parsing (media type → task_type) and the Recommended Defaults / model tables for each media type.\n\nBEFORE executing ANY multi-media generation task, you MUST:\n\nCheck for workflow complexity — Read ima-knowledge-ai/references/workflow-design.md if:\n\nUser mentions: \"MV\"、\"宣传片\"、\"完整作品\"、\"配乐\"、\"soundtrack\"\nTask spans multiple media types (image + video, video + music, etc.)\nComplex multi-step workflows that need task decomposition\n\n\n\nCheck for visual consistency needs — Read ima-knowledge-ai/references/visual-consistency.md if:\n\nUser mentions: \"系列\"、\"多张\"、\"同一个\"、\"角色\"、\"续\"、\"series\"、\"same\"\nTask involves: multiple images/videos, character continuity, product shots\nSecond+ request about same subject (e.g., \"旺财在游泳\" after \"生成旺财照片\")\n\n\n\nCheck video modes — Read ima-knowledge-ai/references/video-modes.md if:\n\nAny video generation task\nNeed to understand: image_to_video vs reference_image_to_video difference\n\n\n\nCheck model selection — Read ima-knowledge-ai/references/model-selection.md if:\n\nUnsure which model to use\nNeed cost/quality trade-off guidance\nUser specifies budget or quality requirements\n\nWhy this matters:\n\nMulti-media workflows need proper task sequencing (e.g., video duration → matching music duration)\nAI generation defaults to 独立生成 each time — without reference images, results will be inconsistent\nWrong video mode = wrong result (image_to_video ≠ reference_image_to_video)\nModel choice affects cost and quality significantly\n\nExample multi-media workflow:\n\nUser: \"帮我做个产品宣传MV，有背景音乐，主角是旺财小狗\"\n\n❌ Wrong: \n  1. Generate dog image (random look)\n  2. Generate video (different dog)\n  3. Generate music (unrelated)\n\n✅ Right:\n  1. Read workflow-design.md + visual-consistency.md\n  2. Generate Master Reference: 旺财小狗图片\n  3. 
Generate video shots using image_to_video with 旺财 as first frame\n  4. Get video duration (e.g., 15s)\n  5. Generate BGM with matching duration and mood\n\nHow to check:\n\n# Step 0: Determine media type first (image / video / music / speech)\n# From user request: \"画\"/\"生成图\"/\"image\" → image; \"视频\"/\"video\" → video; \"音乐\"/\"歌\"/\"music\"/\"BGM\" → music; \"语音\"/\"朗读\"/\"TTS\"/\"speech\" → speech\n# Then choose task_type and model from the corresponding section (image: text_to_image/image_to_image; video: text_to_video/...; music: text_to_music; speech: text_to_speech)\n\n# Step 1: Read knowledge base based on task type\nif multi_media_workflow:\n    read(\"~/.openclaw/skills/ima-knowledge-ai/references/workflow-design.md\")\n\nif \"same subject\" or \"series\" or \"character\":\n    read(\"~/.openclaw/skills/ima-knowledge-ai/references/visual-consistency.md\")\n\nif video_generation:\n    read(\"~/.openclaw/skills/ima-knowledge-ai/references/video-modes.md\")\n\n# Step 2: Execute with proper sequencing and reference images\n# (see workflow-design.md for specific patterns)\n\nBottom line: simple single-media requests can proceed directly; complex multi-media workflows must read the knowledge base first, no exceptions."
      },
      {
        "title": "📥 User Input Parsing (Media Type & Task Routing)",
        "body": "Purpose: So that any agent parses user intent consistently, first determine the media type from the user's request, then choose task_type and model."
      },
      {
        "title": "1. User phrasing → media type (do this first)",
        "body": "User intent / keywordsMedia typetask_type examples画 / 生成图 / 图片 / image / 画一张 / 图生图imagetext_to_image, image_to_image视频 / 生成视频 / video / 图生视频 / 文生视频videotext_to_video, image_to_video, first_last_frame_to_video, reference_image_to_video音乐 / 歌 / BGM / 背景音乐 / music / 作曲musictext_to_music语音 / 朗读 / TTS / 语音合成 / 配音 / speech / read aloud / text-to-speechspeechtext_to_speech\n\nIf the request mixes media (e.g. \"宣传片+配乐\"), treat as multi-media workflow: read workflow-design.md, then plan image → video → music steps and use the correct task_type for each step."
      },
      {
        "title": "2. Model and parameter parsing",
        "body": "Image: For model name → model_id and size/aspect_ratio parsing, follow the same rules as in ima-image-ai skill (User Input Parsing section).\n\n\nVideo: For task_type (t2v / i2v / first_last / reference), model alias → model_id, and duration/resolution/aspect_ratio, follow ima-video-ai skill (User Input Parsing section).\n\n\nMusic: Suno (sonic) vs DouBao BGM/Song — infer from \"BGM\"/\"背景音乐\" → BGM; \"带歌词\"/\"人声\" → Suno or Song. Use model_id sonic, GenBGM, GenSong per \"Recommended Defaults\" and \"Music Generation\" tables below.\n\n\nSpeech (TTS): Get model_id from GET /open/v1/product/list?category=text_to_speech or run script with --task-type text_to_speech --list-models. Map user intent to parameters using product form_config:\nUser intent / phrasingParameter (if in form_config)Notes女声 / 女声朗读 / female voicevoice_id / voice_typeUse value from form_config options男声 / 男声朗读 / male voicevoice_id / voice_typeUse value from form_config options语速快/慢 / speed up/slowspeede.g. 0.8–1.2音调 / pitchpitchIf supported大声/小声 / volumevolumeIf supported\nIf the user does not specify, use form_config defaults. Pass extra params via --extra-params '{\"speed\":1.0}'. Only send parameters present in the product’s credit_rules/attributes or form_config (script reflection strips others on retry)."
      },
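      {
        "title": "Example: TTS Invocation (Illustrative)",
        "body": "A minimal sketch combining the TTS parameter mapping above with the bundled script's documented flags; the speed value is an illustrative placeholder, and seed-tts-2.0 is the default model_id listed in this document:\n\n# 1. List available TTS models\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_speech --list-models\n\n# 2. Generate speech (only pass parameters present in the product's form_config)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_speech \\\n  --model-id seed-tts-2.0 --prompt \"Text to be spoken here.\" \\\n  --extra-params '{\"speed\":1.0}' --user-id {user_id} --output-json"
      },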
      {
        "title": "⚙️ How This Skill Works",
        "body": "For transparency: This skill uses a bundled Python script (scripts/ima_create.py) to call the IMA Open API. The script:\n\nSends your prompt to two IMA-owned domains (see \"Network Endpoints\" below)\nUses --user-id only locally as a key for storing your model preferences\nReturns image/video/music URLs when generation is complete\n\nWhat gets sent to IMA servers:\n\n✅ Your prompt/description (image/video/music)\n✅ Model selection (SeeDream/Wan/Suno/etc.)\n✅ Generation parameters (size, duration, style, etc.)\n❌ NO API key in prompts (key is used for authentication only)\n❌ NO user_id (it's only used locally)\n\nWhat's stored locally:\n\n~/.openclaw/memory/ima_prefs.json - Your model preferences (< 1 KB)\n~/.openclaw/logs/ima_skills/ - Generation logs (auto-deleted after 7 days)"
      },
      {
        "title": "🌐 Network Endpoints Used",
        "body": "DomainOwnerPurposeData SentPrivacyapi.imastudio.comIMA StudioMain API (product list, task creation, task polling)Prompts, model IDs, generation params, your API keyStandard HTTPS, data processed for AI generationimapi.liveme.comIMA StudioImage/Video upload service (presigned URL generation)Your API key, file metadata (MIME type, extension)Standard HTTPS, used for image/video tasks only*.aliyuncs.com, *.esxscloud.comAlibaba Cloud (OSS)Image/video storage (file upload, CDN delivery)Raw image/video bytes (via presigned URL, NO API key)IMA-managed OSS buckets, presigned URLs expire after 7 days\n\nKey Points:\n\nMusic tasks (text_to_music) and TTS tasks (text_to_speech) only use api.imastudio.com.\nImage/video tasks require imapi.liveme.com to obtain presigned URLs for uploading input images.\nYour API key is sent to both api.imastudio.com and imapi.liveme.com (both owned by IMA Studio).\nVerify network calls: tcpdump -i any -n 'host api.imastudio.com or host imapi.liveme.com'. See this document: 🌐 Network Endpoints Used and ⚠️ Credential Security Notice for full disclosure."
      },
      {
        "title": "⚠️ Credential Security Notice",
        "body": "Your API key is sent to both IMA-owned domains:\n\nAuthorization: Bearer ima_xxx... → api.imastudio.com (main API)\nQuery param appUid=ima_xxx... → imapi.liveme.com (upload service)\n\nSecurity best practices:\n\n🧪 Use test keys for experiments: Generate a separate API key for testing.\n🔍 Monitor usage: Check https://imastudio.com/dashboard for unauthorized activity.\n⏱️ Rotate keys: Regenerate your API key periodically (monthly recommended).\n📊 Review logs: Check ~/.openclaw/logs/ima_skills/ for unexpected API calls.\n\nWhy two domains? IMA Studio uses a microservices architecture:\n\napi.imastudio.com: Core AI generation API\nimapi.liveme.com: Specialized image/video upload service (shared infrastructure)\n\nBoth domains are operated by IMA Studio. The same API key grants access to both services."
      },
      {
        "title": "Agent Execution (Internal Reference)",
        "body": "Note for users: You can review the script source at scripts/ima_create.py anytime.\nThe agent uses this script to simplify API calls. Music tasks use only api.imastudio.com, while image/video tasks also call imapi.liveme.com for file uploads (see \"Network Endpoints\" above).\n\nUse the bundled script internally for all task types — it ensures correct parameter construction:\n\n# ─── Image Generation ──────────────────────────────────────────────────────────\n\n# Basic text-to-image (default model)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_image \\\n  --model-id doubao-seedream-4.5 --prompt \"a cute puppy on grass, photorealistic\" \\\n  --user-id {user_id} --output-json\n\n# Text-to-image with size override (Nano Banana2)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_image \\\n  --model-id gemini-3.1-flash-image --prompt \"city skyline at sunset, 4K\" \\\n  --size 2k --user-id {user_id} --output-json\n\n# Image-to-image with input URL\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type image_to_image \\\n  --model-id doubao-seedream-4.5 --prompt \"turn into oil painting style\" \\\n  --input-images https://example.com/photo.jpg --user-id {user_id} --output-json\n\n# ─── Video Generation ──────────────────────────────────────────────────────────\n\n# Basic text-to-video (default model, 5s 720P)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_video \\\n  --model-id wan2.6-t2v --prompt \"a puppy dancing happily, cinematic\" \\\n  --user-id {user_id} --output-json\n\n# Text-to-video with extra params (10s 1080P)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_video \\\n  --model-id wan2.6-t2v --prompt \"dramatic ocean waves, sunset\" \\\n  --extra-params '{\"duration\":10,\"resolution\":\"1080P\",\"aspect_ratio\":\"16:9\"}' \\\n  --user-id {user_id} 
--output-json\n\n# Image-to-video (animate static image)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type image_to_video \\\n  --model-id wan2.6-i2v --prompt \"camera slowly zooms in, gentle movement\" \\\n  --input-images https://example.com/photo.jpg --user-id {user_id} --output-json\n\n# First-last frame video (two images)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type first_last_frame_to_video \\\n  --model-id kling-video-o1 --prompt \"smooth transition between frames\" \\\n  --input-images https://example.com/frame1.jpg https://example.com/frame2.jpg \\\n  --user-id {user_id} --output-json\n\n# ─── Music Generation ──────────────────────────────────────────────────────────\n\n# Basic text-to-music (Suno default)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_music \\\n  --model-id sonic --prompt \"upbeat electronic music, 120 BPM, no vocals\" \\\n  --user-id {user_id} --output-json\n\n# Music with custom lyrics (Suno custom mode)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_music \\\n  --model-id sonic --prompt \"pop ballad, emotional\" \\\n  --extra-params '{\"custom_mode\":true,\"lyrics\":\"Your custom lyrics here...\",\"vocal_gender\":\"female\"}' \\\n  --user-id {user_id} --output-json\n\n# Background music (DouBao BGM)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_music \\\n  --model-id GenBGM --prompt \"relaxing ambient music for meditation\" \\\n  --user-id {user_id} --output-json\n\n# ─── Text-to-Speech (TTS) ─────────────────────────────────────────────────────\n\n# List TTS models first to get model_id, then generate speech\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_speech --list-models\n\n# TTS: use model_id from list above (prompt = text to speak)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key 
$IMA_API_KEY --task-type text_to_speech \\\n  --model-id <model_id from list> --prompt \"Text to be spoken here.\" \\\n  --user-id {user_id} --output-json\n\nThe script outputs JSON with url, model_name, credit — use these values in the UX protocol messages below. The script internals (product list query, parameter construction, polling) are invisible to users."
      },
      {
        "title": "Overview",
        "body": "Call IMA Open API to create AI-generated content. All endpoints require an ima_* API key. The core flow is: query products → create task → poll until done."
      },
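      {
        "title": "Example: Product List Query (Illustrative)",
        "body": "The query-products step in the Overview can be sketched with curl, assuming the GET /open/v1/product/list endpoint and Bearer-token header documented elsewhere in this SKILL; the exact response shape is not guaranteed here:\n\ncurl -sS -H \"Authorization: Bearer $IMA_API_KEY\" \\\n  \"https://api.imastudio.com/open/v1/product/list?category=text_to_image\"\n\nTake model_id and credit from the matching product entry, then create the task and poll until done (the bundled script automates this full flow)."
      },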
      {
        "title": "🔒 Security & Transparency Policy",
        "body": "This skill is community-maintained and open for inspection."
      },
      {
        "title": "✅ What Users CAN Do",
        "body": "Full transparency:\n\n✅ Review all source code: Check scripts/ima_create.py and ima_logger.py anytime\n✅ Verify network calls: Music tasks use api.imastudio.com only; image/video tasks also use imapi.liveme.com (see \"Network Endpoints\" section)\n✅ Inspect local data: View ~/.openclaw/memory/ima_prefs.json and log files\n✅ Control privacy: Delete preferences/logs anytime, or disable file writes (see below)\n\nConfiguration allowed:\n\n✅ Set API key in environment or agent config:\n\nEnvironment variable: export IMA_API_KEY=ima_your_key_here\nOpenClaw/MCP config: Add IMA_API_KEY to agent's environment configuration\nGet your key at: https://imastudio.com\n\n\n✅ Use scoped/test keys: Test with limited API keys, rotate after testing\n✅ Disable file writes: Make prefs/logs read-only or symlink to /dev/null\n\nData control:\n\n✅ View stored data: cat ~/.openclaw/memory/ima_prefs.json\n✅ Delete preferences: rm ~/.openclaw/memory/ima_prefs.json (resets to defaults)\n✅ Delete logs: rm -rf ~/.openclaw/logs/ima_skills/ (auto-cleanup after 7 days anyway)"
      },
      {
        "title": "⚠️ Advanced Users: Fork & Modify",
        "body": "If you need to modify this skill for your use case:\n\nFork the repository (don't modify the original)\nUpdate your fork with your changes\nTest thoroughly with limited API keys\nDocument your changes for troubleshooting\n\nNote: Modified skills may break API compatibility or introduce security issues. Official support only covers the unmodified version."
      },
      {
        "title": "❌ What to AVOID (Security Risks)",
        "body": "Actions that could compromise security:\n\n❌ Sharing API keys publicly or in skill files\n❌ Modifying API endpoints to unknown servers\n❌ Disabling SSL/TLS certificate verification\n❌ Logging sensitive user data (prompts, IDs, etc.)\n❌ Bypassing authentication or billing mechanisms\n\nWhy this matters:\n\nAPI Compatibility: Skill logic aligns with IMA Open API schema\nSecurity: Malicious modifications could leak credentials or bypass billing\nSupport: Modified skills may not be supported\nCommunity: Breaking changes affect all users"
      },
      {
        "title": "📋 Privacy & Data Handling Summary",
        "body": "What this skill does with your data:\n\nData TypeSent to IMA?Stored Locally?User ControlPrompts (image/video/music)✅ Yes (required for generation)❌ NoNone (required)API key✅ Yes (authentication header)❌ NoSet via env varuser_id (optional CLI arg)❌ Never (local preference key only)✅ Yes (as prefs file key)Change --user-id valueModel preferences❌ No✅ Yes (~/.openclaw)Delete anytimeGeneration logs❌ No✅ Yes (~/.openclaw)Auto-cleanup 7 days\n\nPrivacy recommendations:\n\nUse test/scoped API keys for initial testing\nNote: --user-id is never sent to IMA servers - it's only used locally as a key for storing preferences in ~/.openclaw/memory/ima_prefs.json\nReview source code at scripts/ima_create.py to verify network calls (search for create_task function)\nRotate API keys after testing or if compromised\n\nGet your IMA API key: Visit https://imastudio.com to register and get started."
      },
      {
        "title": "🔧 For Skill Maintainers Only",
        "body": "Version control:\n\nAll changes must go through Git with proper version bumps (semver)\nCHANGELOG.md must document all changes\nProduction deployments require code review\n\nFile checksums (optional):\n\n# Verify skill integrity\nsha256sum SKILL.md scripts/ima_create.py\n\nIf users report issues, verify file integrity first."
      },
      {
        "title": "🧠 User Preference Memory (Image)",
        "body": "User preferences have highest priority when they exist. But preferences are only saved when users explicitly express model preferences — not from automatic model selection."
      },
      {
        "title": "Storage: ~/.openclaw/memory/ima_prefs.json",
        "body": "Single file, shared across all IMA skills:\n\n{\n  \"user_{user_id}\": {\n    \"text_to_image\":  { \"model_id\": \"doubao-seedream-4.5\", \"model_name\": \"SeeDream 4.5\", \"credit\": 5,  \"last_used\": \"2026-02-27T03:07:27Z\" },\n    \"image_to_image\": { \"model_id\": \"doubao-seedream-4.5\", \"model_name\": \"SeeDream 4.5\", \"credit\": 5,  \"last_used\": \"2026-02-27T03:07:27Z\" },\n    \"text_to_speech\": { \"model_id\": \"<from product list>\", \"model_name\": \"...\", \"credit\": 2, \"last_used\": \"...\" }\n  }\n}"
      },
      {
        "title": "Model Selection Flow (Image Generation)",
        "body": "Step 1: Get knowledge-ai recommendation (if installed)\n\nknowledge_recommended_model = read_ima_knowledge_ai()  # e.g., \"SeeDream 4.5\"\n\nStep 2: Check user preference\n\nuser_pref = load_prefs().get(f\"user_{user_id}\", {}).get(task_type)  # e.g., {\"model_id\": \"midjourney\", ...}\n\nStep 3: Decide which model to use\n\nif user_pref exists:\n    use_model = user_pref[\"model_id\"]  # Highest priority\nelse:\n    use_model = knowledge_recommended_model or fallback_default\n\nStep 4: Check for mismatch (for later hint)\n\nif user_pref exists and knowledge_recommended_model != user_pref[\"model_id\"]:\n    mismatch = True  # Will add hint in success message"
      },
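      {
        "title": "Model Selection Flow: Consolidated Sketch (Illustrative)",
        "body": "The four steps above can be combined into one illustrative Python sketch; load_prefs and read_ima_knowledge_ai are hypothetical helpers standing in for the prefs-file read and the knowledge-base lookup, and the fallback value is taken from the Recommended Defaults table:\n\nFALLBACK_DEFAULT = \"doubao-seedream-4.5\"  # fallback from Recommended Defaults\n\ndef select_model(user_id, task_type):\n    recommended = read_ima_knowledge_ai()  # None if ima-knowledge-ai is not installed\n    user_pref = load_prefs().get(f\"user_{user_id}\", {}).get(task_type)\n    if user_pref:\n        model = user_pref[\"model_id\"]  # user preference: highest priority\n        mismatch = recommended is not None and recommended != model\n    else:\n        model = recommended or FALLBACK_DEFAULT\n        mismatch = False\n    return model, mismatch  # mismatch → add a gentle hint in the success message"
      },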
      {
        "title": "When to Write (User Explicit Preference ONLY)",
        "body": "✅ Save preference when user explicitly specifies a model:\n\nUser saysAction用XXX / 换成XXX / 改用XXXSwitch to model XXX + save as preference以后都用XXX / 默认用XXX / always use XXXSave + confirm: ✅ 已记住！以后图片生成默认用 [XXX]我喜欢XXX / 我更喜欢XXXSave as preference\n\n❌ Do NOT save when:\n\nAgent auto-selects from knowledge-ai → not user preference\nAgent uses fallback default → not user preference\nUser says generic quality requests (see \"Clear Preference\" below) → clear preference instead"
      },
      {
        "title": "When to Clear (User Abandons Preference)",
        "body": "🗑️ Clear preference when user wants automatic selection:\n\nUser saysAction用最好的 / 用最合适的 / best / recommendedClear pref + use knowledge-ai recommendation推荐一个 / 你选一个 / 自动选择Clear pref + use knowledge-ai recommendation用默认的 / 用新的Clear pref + use knowledge-ai recommendation试试别的 / 换个试试 (without specific model)Clear pref + use knowledge-ai recommendation重新推荐Clear pref + use knowledge-ai recommendation\n\nImplementation:\n\ndel prefs[f\"user_{user_id}\"][task_type]\nsave_prefs(prefs)"
      },
      {
        "title": "⭐ Model Selection Priority (Image)",
        "body": "Selection flow:\n\nUser preference (if exists) → Highest priority, always respect\nima-knowledge-ai skill (if installed) → Professional recommendation based on task\nFallback defaults → Use table below (only if neither 1 nor 2 exists)\n\nImportant notes:\n\nUser preference is only saved when user explicitly specifies a model (see \"When to Write\" above)\nKnowledge-ai is always consulted (even when user pref exists) to detect mismatches\nWhen mismatch detected → add gentle hint in success message (does NOT interrupt generation)\n\nThe defaults below are FALLBACK only. User preferences have highest priority, then knowledge-ai recommendations.\n\nWhen using user preference for image generation, show a line like:\n\n🎨 根据你的使用习惯，将用 [Model Name] 帮你生成…\n• 模型：[Model Name]（你的常用模型）\n• 预计耗时：[X ~ Y 秒]\n• 消耗积分：[N pts]"
      },
      {
        "title": "Preference Change Confirmation",
        "body": "When user switches to a different model than their saved preference:\n\n💡 你之前喜欢用 [Old Model]，这次换成了 [New Model]。\n要把 [New Model] 设为以后的默认吗？\n回复「是」保存 / 回复「否」仅本次使用"
      },
      {
        "title": "⭐ Recommended Defaults",
        "body": "These are fallback defaults — only used when no user preference exists.\nAlways default to the newest and most popular model. Do NOT default to the cheapest.\n\nTask TypeDefault Modelmodel_idversion_idCostWhytext_to_imageSeeDream 4.5doubao-seedream-4.5doubao-seedream-4-5-2511285 ptsLatest doubao flagship, photorealistic 4Ktext_to_image (budget)Nano Banana2gemini-3.1-flash-imagegemini-3.1-flash-image4 ptsFastest and cheapest optiontext_to_image (premium)Nano Banana Progemini-3-pro-imagegemini-3-pro-image-preview10/10/18 ptsPremium quality, 1K/2K/4K optionstext_to_image (artistic)Midjourney 🎨midjourneyv68/10 ptsArtist-level aesthetics, creative stylesimage_to_imageSeeDream 4.5doubao-seedream-4.5doubao-seedream-4-5-2511285 ptsLatest, best i2i qualityimage_to_image (budget)Nano Banana2gemini-3.1-flash-imagegemini-3.1-flash-image4 ptsCheapest optionimage_to_image (premium)Nano Banana Progemini-3-pro-imagegemini-3-pro-image-preview10 ptsPremium qualityimage_to_image (artistic)Midjourney 🎨midjourneyv68/10 ptsArtist-level aesthetics, style transfertext_to_videoWan 2.6wan2.6-t2vwan2.6-t2v25 pts🔥 Most popular t2v, balanced costtext_to_video (premium)Hailuo 2.3MiniMax-Hailuo-2.3MiniMax-Hailuo-2.338 ptsHigher qualitytext_to_video (budget)Vidu Q2viduq2viduq25 ptsLowest cost t2vimage_to_videoWan 2.6wan2.6-i2vwan2.6-i2v25 pts🔥 Most popular i2v, 1080Pimage_to_video (premium)Kling 2.6kling-v2-6kling-v2-640-160 ptsPremium Kling i2vfirst_last_frame_to_videoKling O1kling-video-o1kling-video-o148 ptsNewest Kling reasoning modelreference_image_to_videoKling O1kling-video-o1kling-video-o148 ptsBest reference fidelitytext_to_musicSuno (sonic-v4)sonicsonic25 ptsLatest Suno engine, best qualitytext_to_speech(query product list)———Run --task-type text_to_speech --list-models; use first or user-preferred model_id\n\nPremium options:\n\nImage: Nano Banana Pro — Highest quality with size control (1K/2K/4K), higher cost (10-18 pts for text_to_image, 10 pts for 
image_to_image)\nVideo: Kling O1, Sora 2 Pro, Google Veo 3.1 — Premium quality with longer duration options\n\nQuick selection guide (production as of 2026-02-27, sorted by popularity):\n\nImage (4 models available) → SeeDream 4.5 (5, default); artistic → Midjourney 🎨 (8-10); budget → Nano Banana2 (4, 512px); premium → Nano Banana Pro (10-18)\n🔥 Video from text (most popular) → Wan 2.6 (25, balanced); premium → Hailuo 2.3 (38); budget → Vidu Q2 (5)\n🔥 Video from image (most popular) → Wan 2.6 (25)\nMusic → Suno (25); DouBao BGM/Song (30 each)\nCheapest → Nano Banana2 512px (4) for image; Vidu Q2 (5) for video\n\nSelection guide by use case:\n\nImage Generation:\n\nGeneral image generation → SeeDream 4.5 (5pts)\nCustom aspect ratio (16:9, 9:16, 4:3, etc.) → SeeDream 4.5 🌟 or Nano Banana Pro/2 🆕 (native support)\nBudget-conscious / fast generation → Nano Banana2 (4pts)\nHighest quality with size control (1K/2K/4K) → Nano Banana Pro (text_to_image: 10-18pts, image_to_image: 10pts)\nArtistic/creative styles, illustrations, paintings → Midjourney 🎨 (8-10pts)\nStyle transfer / image editing → SeeDream 4.5 (5pts) or Midjourney 🎨 (artistic)\n\nVideo Generation:\n\nGeneral video generation → Wan 2.6 (25pts, most popular)\nPremium cinematic quality → Google Veo 3.1 (70-330pts) or Sora 2 Pro (122+pts)\nBudget video → Vidu Q2 (5pts) or Hailuo 2.0 (5pts)\nWith audio support → Kling O1 (48+pts) or Google Veo 3.1 (70+pts)\nFirst/last frame animation → Kling O1 (48+pts)\nReference image consistency → Kling O1 (48+pts) or Google Veo 3.1 (70+pts)\n\nMusic Generation:\n\nCustom song with lyrics, vocals, style → Suno sonic-v5 (25pts, default, ~2min)\n\nFull control: custom_mode, lyrics, vocal_gender, tags, negative_tags\nBest for: complete songs, vocal tracks, artistic compositions\n\n\nBackground music / ambient loop → DouBao BGM (30pts, ~30s)\n\nSimplified: prompt-only, no advanced parameters\nBest for: video backgrounds, ambient music, short loops\n\n\nSimple song generation → 
DouBao Song (30pts, ~30s)\n\nSimplified: prompt-only\nBest for: quick song generation, structured vocal compositions\n\n\nUser explicitly asks for cheapest → DouBao BGM/Song (6pts each) — only if explicitly requested\n\nSpeech (TTS) Generation:\n\nText-to-speech / 语音合成 / 朗读 → text_to_speech. Always query GET /open/v1/product/list?category=text_to_speech (or --list-models) to get current model_id and credit. No fixed default; use first available or user preference. Voice/speed/format parameters: see \"Model and parameter parsing\" (TTS table) and \"Speech (TTS) — text_to_speech\" in this document.\n\n⚠️ Technical Note for Suno:\n\nmodel_version inside parameters.parameters (e.g., \"sonic-v5\") is different from the outer model_version field (which is \"sonic\"). Always set both correctly when creating Suno tasks.\n\n⚠️ Production Image Models (4 available):\n\nSeeDream 4.5 (doubao-seedream-4.5) — 5 pts, default\nMidjourney 🎨 (midjourney) — 8/10 pts for 480p/720p, artistic styles\nNano Banana2 (gemini-3.1-flash-image) — 4/6/10/13 pts for 512px/1K/2K/4K\nNano Banana Pro (gemini-3-pro-image) — 10/10/18 pts for 1K/2K/4K\n\nAll other image models mentioned in older documentation are no longer available in production.\n\n🌟 Parameter Support Notes (All Task Types):"
      },
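The fallback lookup described above can be kept in a small table keyed by task type. A minimal sketch, assuming the model_id/credit values from the fallback table; the `FALLBACK_DEFAULTS` dict and `fallback_model` function are illustrative names, not part of the shipped scripts:

```python
# Illustrative fallback-default table (model_id / credit copied from the doc's fallback table).
FALLBACK_DEFAULTS = {
    "text_to_image":             {"model_id": "doubao-seedream-4.5", "credit": 5},
    "image_to_image":            {"model_id": "doubao-seedream-4.5", "credit": 5},
    "text_to_video":             {"model_id": "wan2.6-t2v",          "credit": 25},
    "image_to_video":            {"model_id": "wan2.6-i2v",          "credit": 25},
    "first_last_frame_to_video": {"model_id": "kling-video-o1",      "credit": 48},
    "reference_image_to_video":  {"model_id": "kling-video-o1",      "credit": 48},
    "text_to_music":             {"model_id": "sonic",               "credit": 25},
}

def fallback_model(task_type: str) -> str:
    """Return the fallback model_id for a task type (used only when no
    user preference and no knowledge-ai recommendation exists)."""
    if task_type == "text_to_speech":
        # TTS has no fixed default: the caller must query the live product list.
        raise LookupError("query /open/v1/product/list?category=text_to_speech")
    return FALLBACK_DEFAULTS[task_type]["model_id"]
```

Remember these are fallbacks only: an explicit user preference or a knowledge-ai recommendation always wins over this table.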
      {
        "title": "Image Models (text_to_image / image_to_image)",
        "body": "🆕 MAJOR UPDATE: Nano Banana series now has NATIVE aspect_ratio support!\n\nNano Banana Pro: ✅ Supports aspect_ratio (1:1, 16:9, 9:16, 4:3, 3:4) NATIVELY\nNano Banana2: ✅ Supports aspect_ratio (1:1, 16:9, 9:16, 4:3, 3:4) NATIVELY\nSeeDream 4.5: ✅ Supports 8 ratios via virtual params (1:1, 16:9, 9:16, 4:3, 3:4, 2:3, 3:2, 21:9)\nMidjourney: ❌ 1:1 only (fixed 1024x1024)\n\naspect_ratio support details:\n\n✅ aspect_ratio:\n\nSeeDream 4.5: ✅ Supports 8 ratios via virtual params (1:1, 16:9, 9:16, 4:3, 3:4, 2:3, 3:2, 21:9)\nNano Banana2: ✅ Native support for 5 ratios (1:1, 16:9, 9:16, 4:3, 3:4)\nNano Banana Pro: ✅ Native support for 5 ratios (1:1, 16:9, 9:16, 4:3, 3:4)\nMidjourney: ❌ 1:1 only (fixed 1024x1024)\n\n\n✅ size:\n\nNano Banana2: 512px, 1K, 2K, 4K (via different attribute_ids, 4-13 pts)\nNano Banana Pro: 1K, 2K, 4K (via different attribute_ids, 10-18 pts)\nSeeDream 4.5: Adaptive default (5 pts)\nMidjourney: 480p/720p (via attribute_id, 8/10 pts)\n\n\n❌ 8K: No model supports 8K (max is 4K via Nano Banana Pro)\n❌ Non-standard aspect ratios (7:3, 8:5, etc.): Not supported. Use closest supported ratio or video models.\n✅ n: Multiple outputs supported (1-4), credit × n\n\nWhen user requests unsupported combinations for images:\n\nMidjourney + aspect_ratio (16:9, etc.): Recommend SeeDream 4.5 or Nano Banana series instead\n❌ Midjourney 暂不支持自定义 aspect_ratio（仅支持 1024x1024 方形）\n\n✅ 推荐方案：\n  1. SeeDream 4.5（支持虚拟参数 aspect_ratio）\n     • 支持比例：1:1, 16:9, 9:16, 4:3, 3:4, 2:3, 3:2, 21:9\n     • 成本：5 积分（性价比最佳）\n  2. Nano Banana Pro/2（原生支持 aspect_ratio）\n     • 支持比例：1:1, 16:9, 9:16, 4:3, 3:4\n     • 成本：4-18 积分（按尺寸）\n\n需要我帮你用 SeeDream 4.5 生成吗？\n\n\nAny model + 8K: Inform user no model supports 8K, max is 4K (Nano Banana Pro)\nAny model + non-standard ratio (7:3, 8:5, etc.): Non-standard ratio, not supported. Suggest closest supported ratio (e.g., 21:9 for ultra-wide, 2:3 for portrait)"
      },
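The aspect-ratio support matrix above lends itself to a simple capability check before task creation. A minimal sketch, with the ratio sets copied from the matrix; `SUPPORTED_RATIOS` and `models_for_ratio` are illustrative names, not an official API:

```python
# Aspect ratios each production image model accepts (native or via virtual params),
# per the support matrix above.
SUPPORTED_RATIOS = {
    "doubao-seedream-4.5":    {"1:1", "16:9", "9:16", "4:3", "3:4", "2:3", "3:2", "21:9"},
    "gemini-3.1-flash-image": {"1:1", "16:9", "9:16", "4:3", "3:4"},
    "gemini-3-pro-image":     {"1:1", "16:9", "9:16", "4:3", "3:4"},
    "midjourney":             {"1:1"},  # fixed 1024x1024 square only
}

def models_for_ratio(ratio: str) -> list[str]:
    """Return model_ids that can honor the requested aspect ratio.
    An empty list means the ratio is non-standard: suggest the closest supported one."""
    return [m for m, ratios in SUPPORTED_RATIOS.items() if ratio in ratios]
```

For example, a 16:9 request should never be routed to Midjourney, and a 21:9 request can only go to SeeDream 4.5.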
      {
        "title": "Video Models (text_to_video / image_to_video / first_last_frame / reference_image)",
        "body": "✅ resolution: 540P, 720P, 1080P, 2K, 4K (model-dependent, higher res = higher cost)\n✅ aspect_ratio: 16:9, 9:16, 1:1, 4:3 (model-dependent, check form_config)\n✅ duration: 4s, 5s, 10s, 15s (model-dependent, longer = higher cost)\n⚠️ generate_audio: Supported by Veo 3.1, Kling O1, Hailuo (check form_config)\n✅ prompt_extend: AI-powered prompt enhancement (most models support)\n✅ negative_prompt: Content exclusion (most models support)\n✅ shot_type: Single/multi-shot control (model-dependent)\n✅ seed: Reproducibility control (most models support, -1 = random)\n✅ n: Multiple outputs (1-4), credit × n\n\n🆕 Special Case: Pixverse Model Parameter (v1.0.7+)\n\nAuto-Inference Logic for Pixverse V5.5/V5/V4:\n\nProblem: Pixverse V5.5, V5, V4 lack model field in form_config from Product List API\nBackend Requirement: Backend requires model parameter (e.g., \"v5.5\", \"v5\", \"v4\")\nAuto-Fix: System automatically extracts version from model_name and injects it\n\nExample: model_name: \"Pixverse V5.5\" → auto-inject model: \"v5.5\"\nExample: model_name: \"Pixverse V4\" → auto-inject model: \"v4\"\n\n\nNote: V4.5 and V3.5 include model in form_config (no auto-inference needed)\nRelevant Task Types: All video modes (text_to_video, image_to_video, first_last_frame_to_video, reference_image_to_video)\n\nError Prevention:\n\nWithout auto-inference: err_code=400017 err_msg=Invalid value for model\nWith auto-inference (v1.0.7+): Pixverse V5.5/V5/V4 work seamlessly ✅"
      },
      {
        "title": "Music Models (text_to_music)",
        "body": "Suno sonic-v5 (Full-Featured):\n\n✅ custom_mode: Suno only (enables vocal_gender, lyrics, tags support)\n✅ vocal_gender: Suno only (male/female/mixed, requires custom_mode=True)\n✅ lyrics: Suno only (custom lyrics support, requires custom_mode=True)\n✅ make_instrumental: Suno only (force instrumental, no vocals)\n✅ auto_lyrics: Suno only (AI-generated lyrics)\n✅ tags: Suno only (genre/style tags)\n✅ negative_tags: Suno only (exclude unwanted styles)\n✅ title: Suno only (song title)\n❌ duration: Fixed-length output (DouBao ~30s, Suno ~2min, not user-controllable)\n✅ n: Multiple outputs supported (1-2), credit × n\n\nDouBao BGM/Song (Simplified):\n\n✅ prompt: Text description only\n❌ No advanced parameters (no custom_mode, lyrics, vocal control)\n❌ duration: Fixed ~30s output\n\n🎵 Suno Prompt Writing Guide (for gpt_description_prompt):\n\nWhen using Suno, structure your prompt with these elements:\n\nGenre/Style:\n\nExamples: \"lo-fi hip hop\", \"orchestral cinematic\", \"upbeat pop\", \"dark ambient\", \"indie folk\", \"electronic dance\"\n\n\n\nTempo/BPM:\n\nExamples: \"80 BPM\", \"fast tempo\", \"slow ballad\", \"moderate pace 110 BPM\"\n\n\n\nVocals Control:\n\nNo vocals: \"no vocals\" → set make_instrumental=true\nWith vocals: \"female vocals\" → set vocal_gender=\"female\"\nMale vocals: \"male vocals\" → set vocal_gender=\"male\"\nMixed: Set vocal_gender=\"mixed\"\n\n\n\nMood/Emotion:\n\nExamples: \"happy and energetic\", \"melancholic\", \"tense and dramatic\", \"peaceful and calming\"\n\n\n\nNegative Tags (exclude styles):\n\nUse negative_tags: \"heavy metal, distortion, screaming\" to exclude unwanted elements\n\n\n\nDuration Hint:\n\nExamples: \"60 seconds\", \"30 second loop\", \"2 minute track\"\nNote: Suno typically generates ~2min, not strictly controllable\n\nExample Suno prompts:\n\n\"upbeat lo-fi hip hop, 90 BPM, no vocals, relaxed and chill\"\n→ Set: make_instrumental=true\n\n\"emotional pop ballad, slow tempo, female vocals, 
melancholic\"\n→ Set: vocal_gender=\"female\"\n\n\"orchestral cinematic trailer music, epic and dramatic, 120 BPM, no vocals\"\n→ Set: make_instrumental=true, tags=\"orchestral,cinematic,epic\"\n\n\"acoustic indie folk, gentle guitar, male vocals, warm and nostalgic\"\n→ Set: vocal_gender=\"male\", tags=\"acoustic,indie,folk\"\n\n⚠️ Technical Note for Suno:\n\nmodel_version inside parameters.parameters (e.g., \"sonic-v5\") is different from the outer model_version field (which is \"sonic\"). Always set both correctly."
      },
      {
        "title": "Common Parameter Patterns",
        "body": "n (batch generation): Supported by ALL models. Cost = base_credit × n. Creates n independent resources.\nseed: Supported by most models (-1 = random, >0 = reproducible results)\nprompt_extend: AI-powered prompt enhancement (video models only)"
      },
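The batch-cost rule above is simple multiplication, but it is worth encoding the per-type `n` limits too. A minimal sketch (assumes `max_n` of 4 for image/video and 2 for music, per the sections above; `batch_cost` is an illustrative helper, not a shipped function):

```python
def batch_cost(base_credit: int, n: int, max_n: int = 4) -> int:
    """Total credit cost of a batch: every model charges base_credit per output.

    max_n is 4 for image/video models and 2 for music (pass max_n=2 there)."""
    if not 1 <= n <= max_n:
        raise ValueError(f"n must be between 1 and {max_n}")
    return base_credit * n
```

For example, three Wan 2.6 videos at 25 pts each cost 75 pts and create three independent resources.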
      {
        "title": "Decision Tree: When User Requests Unsupported Features",
        "body": "User asks for custom aspect ratio image (e.g. \"7:3 landscape\")\n  → ❌ Image models don't support custom ratios\n  → ✅ Solution: \"图片模型不支持自定义比例。建议用视频模型(Wan 2.6 t2v)生成16:9视频，然后截取首帧作为图片。\"\n\nUser asks for 8K image\n  → ❌ No model supports 8K\n  → ✅ Solution: \"当前最高支持4K分辨率(Nano Banana Pro，18积分)。要使用吗？\"\n\nUser asks for video with audio\n  → Check model: Veo 3.1 / Kling O1 / Hailuo have generate_audio\n  → ✅ Solution: \"Veo 3.1 和 Kling O1 支持音频生成(需在参数中设置 generate_audio=True)。要用哪个？\"\n\nUser asks for long music (e.g. \"5 minute track\")\n  → ❌ Duration not user-controllable\n  → ✅ Solution: \"Suno 生成约2分钟音乐。需要更长时长可以生成多段后拼接。\"\n\nUser asks for 30s video\n  → Check model: Most models max 15s\n  → ✅ Solution: \"当前最长15秒。可选模型：Wan 2.6(15s, 75积分), Kling O1(10s, 96积分)。\"\n\nWhen user requests unsupported combinations:\n\nVideo + audio (unsupported model) → \"该模型不支持音频。建议用 Veo 3.1 或 Kling O1 (支持 generate_audio 参数)\"\nMusic + custom duration → \"音乐时长由模型固定(Suno约2分钟,DouBao约30秒),无法自定义\"\nVideo duration > 15s → \"当前最长15秒。可选模型：Wan 2.6(15s, 75积分), Kling O1(10s, 96积分)\"\n\nNote: Image-specific unsupported combinations (Midjourney + aspect_ratio, 8K, non-standard ratios) are documented in the \"Image Models\" section above."
      },
      {
        "title": "🧠 User Preference Memory (Video)",
        "body": "User preferences have highest priority when they exist. But preferences are only saved when users explicitly express model preferences — not from automatic model selection."
      },
      {
        "title": "Storage: ~/.openclaw/memory/ima_prefs.json",
        "body": "{\n  \"user_{user_id}\": {\n    \"text_to_video\":              { \"model_id\": \"wan2.6-t2v\",      \"model_name\": \"Wan 2.6\",  \"credit\": 25, \"last_used\": \"...\" },\n    \"image_to_video\":             { \"model_id\": \"wan2.6-i2v\",      \"model_name\": \"Wan 2.6\",  \"credit\": 25, \"last_used\": \"...\" },\n    \"first_last_frame_to_video\":  { \"model_id\": \"kling-video-o1\", \"model_name\": \"Kling O1\", \"credit\": 48, \"last_used\": \"...\" },\n    \"reference_image_to_video\":   { \"model_id\": \"kling-video-o1\", \"model_name\": \"Kling O1\", \"credit\": 48, \"last_used\": \"...\" }\n  }\n}"
      },
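Reading and updating the preference file above can be sketched as pure dict operations (persisting to `~/.openclaw/memory/ima_prefs.json` is a trivial `json.dump` on top). The function names are illustrative, and the timestamp format is an assumption:

```python
import time

def save_pref(prefs: dict, user_id: str, task_type: str, model: dict) -> dict:
    """Record an explicit user model choice. Only call this when the user
    names a model themselves -- never for auto-selected or fallback models."""
    entry = dict(model)  # e.g. {"model_id": ..., "model_name": ..., "credit": ...}
    entry["last_used"] = time.strftime("%Y-%m-%dT%H:%M:%S")
    prefs.setdefault(f"user_{user_id}", {})[task_type] = entry
    return prefs

def clear_pref(prefs: dict, user_id: str, task_type: str) -> dict:
    """Drop a saved preference when the user asks for automatic selection again."""
    prefs.get(f"user_{user_id}", {}).pop(task_type, None)
    return prefs
```

`clear_pref` mirrors the `del prefs[f"user_{user_id}"][task_type]` implementation shown later, but tolerates a missing key.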
      {
        "title": "Model Selection Flow (Video Generation)",
        "body": "Step 1: Get knowledge-ai recommendation (if installed)\n\nknowledge_recommended_model = read_ima_knowledge_ai()  # e.g., \"Wan 2.6\"\n\nStep 2: Check user preference\n\nuser_pref = load_prefs().get(f\"user_{user_id}\", {}).get(task_type)  # e.g., {\"model_id\": \"kling-video-o1\", ...}\n\nStep 3: Decide which model to use\n\nif user_pref exists:\n    use_model = user_pref[\"model_id\"]  # Highest priority\nelse:\n    use_model = knowledge_recommended_model or fallback_default\n\nStep 4: Check for mismatch (for later hint)\n\nif user_pref exists and knowledge_recommended_model != user_pref[\"model_id\"]:\n    mismatch = True  # Will add hint in success message"
      },
      {
        "title": "When to Write (User Explicit Preference ONLY)",
        "body": "✅ Save preference when user explicitly specifies a model:\n\nUser saysAction用XXX / 换成XXX / 改用XXXSwitch to model XXX + save as preference以后都用XXX / 默认用XXX / always use XXXSave + confirm: ✅ 已记住！以后视频生成默认用 [XXX]我喜欢XXX / 我更喜欢XXXSave as preference\n\n❌ Do NOT save when:\n\nAgent auto-selects from knowledge-ai → not user preference\nAgent uses fallback default → not user preference\nUser says generic quality requests (see \"Clear Preference\" below) → clear preference instead"
      },
      {
        "title": "When to Clear (User Abandons Preference)",
        "body": "🗑️ Clear preference when user wants automatic selection:\n\nUser saysAction用最好的 / 用最合适的 / best / recommendedClear pref + use knowledge-ai recommendation推荐一个 / 你选一个 / 自动选择Clear pref + use knowledge-ai recommendation用默认的 / 用新的Clear pref + use knowledge-ai recommendation试试别的 / 换个试试 (without specific model)Clear pref + use knowledge-ai recommendation重新推荐Clear pref + use knowledge-ai recommendation\n\nImplementation:\n\ndel prefs[f\"user_{user_id}\"][task_type]\nsave_prefs(prefs)"
      },
      {
        "title": "⭐ Model Selection Priority (Video)",
        "body": "Selection flow:\n\nUser preference (if exists) → Highest priority, always respect\nima-knowledge-ai skill (if installed) → Professional recommendation based on task\nFallback defaults → Use table below (only if neither 1 nor 2 exists)\n\nImportant notes:\n\nUser preference is only saved when user explicitly specifies a model (see \"When to Write\" above)\nKnowledge-ai is always consulted (even when user pref exists) to detect mismatches\nWhen mismatch detected → add gentle hint in success message (does NOT interrupt generation)\n\nThe defaults below are FALLBACK only. User preferences have highest priority, then knowledge-ai recommendations."
      },
      {
        "title": "💬 User Experience Protocol (IM / Feishu / Discord) v2.0 🆕",
        "body": "v2.0 Updates (aligned with ima-image-ai v1.3):\n\nAdded Step 0 for correct message ordering (fixes group chat bug)\nAdded Step 5 for explicit task completion\nEnhanced Midjourney support with proper timing estimates\nNow 6 steps total (0-5): Acknowledgment → Pre-Gen → Progress → Success/Failure → Done\n\nThis skill runs inside IM platforms (Feishu, Discord via OpenClaw).\nGeneration takes 10 seconds (music) up to 6 minutes (video). Never let users wait in silence.\nAlways follow all 6 steps below, every single time."
      },
      {
        "title": "🚫 Never Say to Users",
        "body": "The following are internal implementation details. Never mention them in any user-facing message, under any circumstances:\n\n❌ Never say✅ What users care aboutima_create.py / 脚本 / script—自动化脚本 / automation script—自动处理产品列表查询—自动解析参数和配置—智能轮询 / polling / 轮询—product list / 商品列表接口—attribute_id / model_version / form_config—API 调用 / HTTP 请求—任何技术参数名模型名称、积分、生成时间\n\nUser messages must only contain: model name, estimated/actual time, credits consumed, result URL, and natural language status updates."
      },
      {
        "title": "Estimated Generation Time (All Task Types)",
        "body": "Task TypeModelEstimated TimePoll EverySend Progress Everytext_to_imageSeeDream 4.525~60s5s20sNano Banana2 💚20~40s5s15sNano Banana Pro60~120s5s30sMidjourney 🎨40~90s8s25simage_to_imageSeeDream 4.525~60s5s20sNano Banana2 💚20~40s5s15sNano Banana Pro60~120s5s30sMidjourney 🎨40~90s8s25stext_to_videoWan 2.6, Hailuo 2.0/2.3, Vidu Q2, Pixverse60~120s8s30sSeeDance 1.5 Pro, Kling 2.6, Veo 3.190~180s8s40sKling O1, Sora 2 Pro180~360s8s60simage_to_videoSame ranges as text_to_video—8s40sfirst_last_frame / referenceKling O1, Veo 3.1180~360s8s60stext_to_musicDouBao BGM / Song10~25s5s10sSuno (sonic-v5)20~45s5s15stext_to_speech(varies by model)5~30s3s10s\n\nestimated_max_seconds = upper bound of the range (e.g. 60 for SeeDream 4.5, 40 for Nano Banana2, 120 for Nano Banana Pro, 90 for Midjourney, 180 for Kling 2.6, 360 for Kling O1)."
      },
      {
        "title": "Step 0 — Initial Acknowledgment Reply (Normal Reply) 🆕",
        "body": "⚠️ CRITICAL: This step is essential for correct message ordering in IM platforms (Feishu, Discord).\n\nBefore doing anything else, reply to the user with a friendly acknowledgment message using your normal reply (not message tool). This reply will automatically appear FIRST in the conversation.\n\nExample acknowledgment messages:\n\nFor images:\n\n好的!来帮你画一只萌萌的猫咪 🐱\n\n收到！马上为你生成一张 16:9 的风景照 🏔️\n\nOK! Starting image generation with SeeDream 4.5 🎨\n\nFor videos:\n\n好的!来帮你生成一段视频 🎬\n\n收到！开始用 Wan 2.6 生成视频 🎥\n\nFor music:\n\n好的!来帮你创作一首音乐 🎵\n\nRules:\n\nKeep it short and warm (< 15 words)\nMatch the user's language (Chinese/English)\nInclude relevant emoji (🐱/🎨/🎬/🎵/✨)\nThis is your ONLY normal reply — all subsequent updates use message tool\n\nWhy this matters:\n\nNormal replies automatically appear FIRST in the conversation thread\nmessage tool pushes appear in chronological order AFTER your initial reply\nThis ensures users see: \"好的!\" → \"🎨 开始生成...\" → \"⏳ 进度...\" → \"✅ 成功!\" (correct order)\nWithout Step 0, the confirmation might appear LAST, confusing users"
      },
      {
        "title": "Step 1 — Pre-Generation Notification (Push via message tool)",
        "body": "After Step 0 reply, use the message tool to push a notification immediately:\n\n[Emoji] 开始生成 [内容类型]，请稍候…\n• 模型：[Model Name]\n• 预计耗时：[X ~ Y 秒]\n• 消耗积分：[N pts]\n\nEmoji by content type:\n\n图片 → 🎨\n视频 → 🎬（加注:视频生成需要较长时间，我会定时汇报进度）\n音乐 → 🎵\n\nCost transparency (new requirement):\n\nAlways show credit cost with model tier context\nFor expensive models (>50 pts), offer cheaper alternative proactively\nExamples:\n\nBalanced (default): \"使用 Wan 2.6（25 积分，最新 Wan）\"\nPremium (user explicit): \"使用高端模型 Kling O1（48-120 积分），质量最佳\"\nPremium (auto-selected): \"使用 Wan 2.6（25 积分）。若需更高质量可选 Kling O1（48 积分起）\"\nBudget (user asked): \"使用 Vidu Q2（5 积分，最省钱）\"\n\nAdapt language to match the user (Chinese / English). For video, always add a note that it takes longer. For expensive models, always mention cheaper alternatives unless user explicitly requested premium."
      },
      {
        "title": "Step 2 — Progress Updates",
        "body": "Poll the task detail API every [Poll Every] seconds per the table.\nSend a progress update every [Send Progress Every] seconds.\n\n⏳ 正在生成中… [P]%\n已等待 [elapsed]s，预计最长 [max]s\n\nProgress formula:\n\nP = min(95, floor(elapsed_seconds / estimated_max_seconds * 100))\n\nCap at 95% — never reach 100% until the API confirms success\nIf elapsed > estimated_max: freeze at 95%, append 「快了，稍等一下…」\nFor video with max=360s: at 120s → 33%, at 250s → 69%, at 400s → 95% (frozen)"
      },
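The progress formula above is directly executable; a minimal sketch (the function name is illustrative) that reproduces the worked video example of 120s → 33%, 250s → 69%, 400s → 95% frozen:

```python
import math

def progress_percent(elapsed_s: float, estimated_max_s: float) -> int:
    """P = min(95, floor(elapsed / estimated_max * 100)).

    Capped at 95: never report 100% until the API confirms success.
    When elapsed exceeds the estimate, the value freezes at 95."""
    return min(95, math.floor(elapsed_s / estimated_max_s * 100))
```

When the value freezes at 95, append the reassurance line 「快了，稍等一下…」 to the progress message.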
      {
        "title": "Step 3 — Success Notification",
        "body": "When task status = success:\n\nFor Video Tasks (text_to_video / image_to_video / first_last_frame / reference_image)\n\n3.1 Send video player first (IM platforms like Feishu will render inline player):\n\n# Get result URL from script output or task detail API\nresult = get_task_result(task_id)\nvideo_url = result[\"medias\"][0][\"url\"]\n\n# Build caption\ncaption = f\"\"\"✅ 视频生成成功！\n• 模型：[Model Name]\n• 耗时：预计 [X~Y]s，实际 [actual]s\n• 消耗积分：[N pts]\n\n[视频描述]\"\"\"\n\n# Add mismatch hint if user pref conflicts with knowledge-ai recommendation\nif user_pref_exists and knowledge_recommended_model != used_model:\n    caption += f\"\"\"\n\n💡 提示：当前任务也许用 {knowledge_recommended_model} 也会不错（{reason}，{cost} pts）\"\"\"\n\n# Send video with caption (use message tool if available)\nmessage(\n    action=\"send\",\n    media=video_url,  # ⚠️ Use HTTPS URL directly, NOT local file path\n    caption=caption\n)\n\nImportant:\n\nHint is non-intrusive — does NOT interrupt generation\nOnly shown when user pref conflicts with knowledge-ai recommendation\nUser can ignore the hint; video is already delivered\n\n3.2 Then send link as text (for copying/sharing):\n\n# Send link message immediately after video\nmessage(action=\"send\", text=f\"🔗 视频链接（可复制分享）：\\n{video_url}\")\n\n⚠️ Critical for video:\n\nSend video player FIRST (inline preview)\nSend text link SECOND (for copying)\nInclude first-frame thumbnail URL if available: result[\"medias\"][0][\"cover\"]\n\nFor Image Tasks (text_to_image / image_to_image)\n\n# Build caption\ncaption = f\"\"\"✅ 图片生成成功！\n• 模型：[Model Name]\n• 耗时：预计 [X~Y]s，实际 [actual]s\n• 消耗积分：[N pts]\n\n🔗 原始链接：{image_url}\"\"\"\n\n# Add mismatch hint if user pref conflicts with knowledge-ai recommendation\nif user_pref_exists and knowledge_recommended_model != used_model:\n    caption += f\"\"\"\n\n💡 提示：当前任务也许用 {knowledge_recommended_model} 也会不错（{reason}，{cost} pts）\"\"\"\n\n# Send image with caption\nmessage(\n    action=\"send\",\n    media=image_url,\n    
caption=caption\n)\n\nImportant:\n\nHint is non-intrusive — does NOT interrupt generation\nOnly shown when user pref conflicts with knowledge-ai recommendation\nUser can ignore the hint; image is already delivered\n\nFor Music Tasks (text_to_music)\n\nSend audio file with player:\n\n✅ 音乐生成成功！\n• 模型：[Model Name]\n• 耗时：预计 [X~Y]s，实际 [actual]s\n• 消耗积分：[N pts]\n• 时长：约 [duration]\n\n[音频URL或直接发送音频文件]\n\nFor TTS Tasks (text_to_speech) — Full UX Protocol (Steps 0–5)\n\nStep 0 — Initial acknowledgment (normal reply)\nFirst reply with a short acknowledgment, e.g.: 好的，正在帮你把这段文字转成语音。 / OK, converting this text to speech.\n\nStep 1 — Pre-generation (message tool)\nPush once:\n\n🔊 开始语音合成，请稍候…\n• 模型：[Model Name]\n• 预计耗时：[X ~ Y 秒]\n• 消耗积分：[N pts]\n\nStep 2 — Progress\nPoll every 2–5s. Every 10–15s send: ⏳ 语音合成中… [P]%，已等待 [elapsed]s，预计最长 [max]s. Cap progress at 95% until API returns success.\n\nStep 3 — Success (message tool)\nWhen resource_status == 1 and status != \"failed\", send media = medias[0].url and caption:\n\n✅ 语音合成成功！\n• 模型：[Model Name]\n• 耗时：实际 [actual]s\n• 消耗积分：[N pts]\n🔗 原始链接：[url]\n\nUse the URL from the API (do not use local file paths).\n\nStep 4 — Failure (message tool)\nOn failure, send user-friendly message. 
TTS error translation (do not expose raw API errors):\n\n| Technical | ✅ Say (CN) | ✅ Say (EN) |\n| --- | --- | --- |\n| 401 Unauthorized | 密钥无效或未授权，请至 imaclaw.ai 生成新密钥 | API key invalid; generate at imaclaw.ai |\n| 4008 Insufficient points | 积分不足，请至 imaclaw.ai 购买积分 | Insufficient points; buy at imaclaw.ai |\n| Invalid product attribute | 参数配置异常，请稍后重试 | Configuration error, try again later |\n| Error 6006 / 6010 | 积分或参数不匹配，请换模型或重试 | Points/params mismatch, try another model |\n| resource_status == 2 / status failed | 语音合成失败，建议换模型或缩短文本 | Synthesis failed, try another model or shorter text |\n| timeout | 合成超时，请稍后重试 | Timed out, try again later |\n| Network error | 网络不稳定，请检查后重试 | Network unstable, check and retry |\n| Text too long (TTS) | 文本过长，请缩短后重试 | Text too long, please shorten |\n\nLinks: API key — https://www.imaclaw.ai/imaclaw/apikey ；Credits — https://www.imaclaw.ai/imaclaw/subscription\n\nStep 5 — Done\nAfter Step 0–4, no further reply needed. Do not send duplicate confirmations."
      },
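The success criterion used in Step 3 ("all medias have resource_status == 1 and status != failed") can be sketched as a small predicate over the task-detail response. A minimal sketch, assuming the medias[] field shapes documented in the TTS section below; `task_succeeded` is an illustrative name:

```python
def task_succeeded(detail: dict) -> bool:
    """True when every media in the task detail is ready.

    resource_status: 0 or null = processing, 1 = ready, 2 = failed, 3 = deleted
    (per the response-shape table in this document); status 'failed' always loses."""
    medias = detail.get("medias") or []
    if not medias:
        return False  # nothing produced yet: keep polling
    return all(
        (m.get("resource_status") or 0) == 1 and m.get("status") != "failed"
        for m in medias
    )
```

Only after this returns True should the result URL (`medias[0]["url"]`) be sent; always send the HTTPS URL from the API, never a local file path.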
      {
        "title": "Step 4 — Failure Notification",
        "body": "When task status = failed or any API/network error, send:\n\n❌ [内容类型]生成失败\n• 原因：[natural_language_error_message]\n• 建议改用：\n  - [Alt Model 1]（[特点]，[N pts]）\n  - [Alt Model 2]（[特点]，[N pts]）\n\n需要我帮你用其他模型重试吗？\n\n⚠️ CRITICAL: Error Message Translation\n\nNEVER show technical error messages to users. Always translate API errors into natural language.\nAPI key & credits: 密钥与积分管理入口为 imaclaw.ai（与 imastudio.com 同属 IMA 平台）。Key and subscription management: imaclaw.ai (same IMA platform as imastudio.com).\n\nTechnical Error❌ Never Say✅ Say Instead (Chinese)✅ Say Instead (English)401 Unauthorized 🆕Invalid API key / 401 Unauthorized❌ API密钥无效或未授权<br>💡 生成新密钥: https://www.imaclaw.ai/imaclaw/apikey❌ API key is invalid or unauthorized<br>💡 Generate API Key: https://www.imaclaw.ai/imaclaw/apikey4008 Insufficient points 🆕Insufficient points / Error 4008❌ 积分不足，无法创建任务<br>💡 购买积分: https://www.imaclaw.ai/imaclaw/subscription❌ Insufficient points to create this task<br>💡 Buy Credits: https://www.imaclaw.ai/imaclaw/subscription\"Invalid product attribute\" / \"Insufficient points\"Invalid product attribute生成参数配置异常，请稍后重试Configuration error, please try again laterError 6006 (credit mismatch)Error 6006积分计算异常，系统正在修复Points calculation error, system is fixingError 6009 (no matching rule)Error 6009参数组合不匹配，已自动调整Parameter mismatch, auto-adjustedError 6010 (attribute_id mismatch)Attribute ID does not match模型参数不匹配，请尝试其他模型Model parameters incompatible, try another modelerror 400 (bad request)error 400 / Bad request请求参数有误，请稍后重试Invalid request parameters, please try againresource_status == 2Resource status 2 / Failed生成过程遇到问题，建议换个模型试试Generation failed, please try another modelstatus == \"failed\" (no details)Task failed这次生成没成功，要不换个模型试试？Generation unsuccessful, try a different model?timeoutTask timed out / Timeout error生成时间过长已超时，建议用更快的模型Generation took too long, try a faster modelNetwork error / Connection refusedConnection refused / Network error网络连接不稳定，请检查网络后重试Network connection unstable, 
check network and retryRate limit exceeded429 Too Many Requests / Rate limit请求过于频繁，请稍等片刻再试Too many requests, please wait a momentPrompt moderation (Sora only)Content policy violation提示词包含敏感内容，请修改后重试Prompt contains restricted content, please modifyModel unavailableModel not available / 503 Service Unavailable当前模型暂时不可用，建议换个模型Model temporarily unavailable, try another modelLyrics format error (Suno only) 🎵Invalid lyrics format歌词格式有误，请调整后重试Lyrics format error, adjust and retryPrompt too short/long (Music) 🎵Prompt length invalid音乐描述过短或过长，请调整到合适长度 (建议20-100字)Music description too short or long, adjust to appropriate length (20-100 chars recommended)Text too long (TTS) 🔊TTS text length文本过长，请缩短后重试Text too long, please shorten and retry\n\nGeneric fallback (when error is unknown):\n\nChinese: 生成过程遇到问题，请稍后重试或换个模型试试\nEnglish: Generation encountered an issue, please try again or use another model\n\nBest Practices:\n\nFocus on user action: Tell users what to do next, not what went wrong technically\nBe reassuring: Use phrases like \"建议换个模型试试\" instead of \"失败了\"\nAvoid blame: Never say \"你的提示词有问题\" → say \"提示词需要调整一下\"\nProvide alternatives: Always suggest 1-2 alternative models in the failure message\n🆕 Include actionable links (v1.0.8+): For 401/4008 errors, provide clickable links to API key generation or credit purchase pages\n🎵 Music-specific (v1.2.0+):\n\nFor Suno lyrics errors, suggest simplifying lyrics or using auto-generated lyrics (auto_lyrics=true)\nFor prompt length errors, give example length (e.g., \"建议20-100字\")\nFor BGM requests, recommend DouBao BGM over Suno\n\n\n🔊 TTS-specific: Use the TTS error translation table in \"For TTS Tasks (text_to_speech)\" above; suggest another model via --list-models or shortening text."
      },
      {
        "title": "Step 5 — Done (No Further Action Needed) 🆕",
        "body": "After sending Step 3 (success) or Step 4 (failure):\n\nDO NOT send any additional messages unless the user asks a follow-up question\nThe task is complete — wait for the user's next request\nUser preference has been saved (if generation succeeded)\nThe conversation is ready for the next generation request\n\nWhy this step matters:\n\nPrevents unnecessary \"anything else?\" messages that clutter the chat\nAllows users to naturally continue the conversation when ready\nRespects the asynchronous nature of IM platforms\n\nException: If the user explicitly asks \"还有别的吗？\" or similar, then respond naturally.\n\n🆕 Enhanced Error Handling (v1.0.8):\n\nThe Reflection mechanism (3 automatic retries) now provides specific, actionable suggestions for common errors:\n\n401 Unauthorized: System suggests generating a new API key with clickable link\n4008 Insufficient Points: System suggests purchasing credits with clickable link\n500 Internal Server Error: Automatic parameter degradation (size, resolution, duration, quality)\n6009 No Rule Match: Automatic parameter completion from credit_rules\n6010 Attribute Mismatch: Automatic credit_rule reselection\nTimeout: Helpful info with dashboard link for background task status\n\nAll error handling is automatic and transparent — users receive natural language explanations with next steps.\n\nFailure fallback by task type:\n\nTask TypeFailed ModelFirst AltSecond Alttext_to_imageSeeDream 4.5Nano Banana2 (4pts, fast)Nano Banana Pro (10-18pts, premium)text_to_imageNano Banana2SeeDream 4.5 (5pts, better quality)Nano Banana Pro (10-18pts)text_to_imageNano Banana ProSeeDream 4.5 (5pts)Nano Banana2 (4pts, budget)image_to_imageSeeDream 4.5Nano Banana2 (4pts, fast)Nano Banana Pro (10pts)image_to_imageNano Banana2SeeDream 4.5 (5pts)Nano Banana Pro (10pts)image_to_imageNano Banana ProSeeDream 4.5 (5pts)Nano Banana2 (4pts)text_to_videoKling O1Wan 2.6 (25pts)Vidu Q2 (5pts)text_to_videoGoogle Veo 3.1Kling O1 (48pts)Sora 2 Pro 
(122pts)text_to_videoAnyWan 2.6 (25pts, most popular)Hailuo 2.0 (5pts)image_to_videoWan 2.6Kling O1 (48pts)Hailuo 2.0 i2v (25pts)image_to_videoAnyWan 2.6 (25pts, most popular)Vidu Q2 Pro (20pts)first_last / referenceKling O1Kling 2.6 (80pts)Veo 3.1 (70pts+)text_to_music 🎵SunoDouBao BGM (30pts, 背景音乐)DouBao Song (30pts, 歌曲生成)text_to_music 🎵DouBao BGMDouBao Song (30pts)Suno (25pts, 功能最强)text_to_music 🎵DouBao SongDouBao BGM (30pts)Suno (25pts, 功能最强)text_to_speech 🔊(any)Query --list-models for alternativesUse another model_id from product list\n\nMusic-specific failure guidance:\n\nIf Suno fails → Recommend DouBao BGM (for background music) or DouBao Song (for songs)\nIf DouBao BGM fails → Try DouBao Song first (similar pricing), then Suno (more powerful)\nIf DouBao Song fails → Try DouBao BGM first (similar pricing), then Suno (more powerful)\nFor lyrics errors in Suno → Suggest simplifying lyrics or using auto_lyrics=true\nFor prompt length errors → Recommend 20-100 characters\n\nTTS-specific failure guidance:\n\nIf TTS fails → Run --task-type text_to_speech --list-models and suggest another model_id; or shorten text / simplify content. Use the TTS error translation table in \"For TTS Tasks\" above for user-facing messages."
      },
      {
        "title": "Supported Models at a Glance",
        "body": "Source: production GET /open/v1/product/list (2026-02-27). Model count reduced significantly. Always query product list API at runtime."
      },
      {
        "title": "Image Generation (4 models each)",
        "body": "CategoryNamemodel_idCosttext_to_imageSeeDream 4.5 🌟doubao-seedream-4.55 ptstext_to_imageMidjourney 🎨midjourney8/10 pts (480p/720p)text_to_imageNano Banana2 💚gemini-3.1-flash-image4/6/10/13 ptstext_to_imageNano Banana Progemini-3-pro-image10/10/18 ptsimage_to_imageSeeDream 4.5 🌟doubao-seedream-4.55 ptsimage_to_imageMidjourney 🎨midjourney8/10 pts (480p/720p)image_to_imageNano Banana2 💚gemini-3.1-flash-image4/6/10/13 ptsimage_to_imageNano Banana Progemini-3-pro-image10 pts\n\nMidjourney attribute_ids: 5451/5452 (text_to_image), 5453/5454 (image_to_image)\nNano Banana2 size options: 512px (4pts), 1K (6pts), 2K (10pts), 4K (13pts)\nNano Banana Pro size options: 1K (10pts), 2K (10pts), 4K (18pts for t2i / 10pts for i2i)\n\nImage Model Capabilities (Parameter Support)\n\n⚠️ Critical: Models have varying parameter support. Custom aspect ratios are now supported by multiple models.\n\nModelCustom Aspect RatioMax ResolutionSize OptionsNotesSeeDream 4.5✅ (via virtual params)4K (adaptive)8 aspect ratiosSupports 1:1, 16:9, 9:16, 4:3, 3:4, 2:3, 3:2, 21:9 (5 pts)Nano Banana2✅ Native support 🆕4K (4096×4096)512px/1K/2K/4K + aspect ratiosSupports 1:1, 16:9, 9:16, 4:3, 3:4; size via attribute_idNano Banana Pro✅ Native support 🆕4K (4096×4096)1K/2K/4K + aspect ratiosSupports 1:1, 16:9, 9:16, 4:3, 3:4; size via attribute_idMidjourney 🎨❌ (1:1 only)1024px (square)480p/720p via attribute_idFixed 1024x1024, artistic style focus\n\nKey Capabilities:\n\n✅ Aspect ratio control: SeeDream 4.5 (virtual params), Nano Banana Pro/2 (native support)\n❌ 8K: Not supported by any model (max is 4K)\n✅ Size control: Nano Banana2, Nano Banana Pro, and Midjourney support multiple size options via different attribute_ids\n✅ Budget option: Nano Banana2 is the cheapest at 4 pts for 512px, but 4K costs 13pts\n🎨 Artistic styles: Midjourney excels at creative, artistic, and illustration styles\n💡 Best value: SeeDream 4.5 at 5pts offers aspect ratio flexibility; Nano Banana2 512px at 4pts for 
fastest/cheapest"
      },
      {
        "title": "Video Generation",
        "body": "CategoryNamemodel_idCost Rangetext_to_video (14)Wan 2.6 🔥wan2.6-t2v25-120 ptsHailuo 2.3MiniMax-Hailuo-2.332+ ptsHailuo 2.0MiniMax-Hailuo-025+ ptsVidu Q2viduq25-70 ptsSeeDance 1.5 Prodoubao-seedance-1.5-pro20+ ptsSora 2 Prosora-2-pro122+ ptsKling O1kling-video-o148-120 ptsKling 2.6kling-v2-680+ ptsGoogle Veo 3.1veo-3.1-generate-preview70-330 ptsPixverse V5.5 / V5 / V4.5 / V4 / V3.5pixverse12-48 ptsimage_to_video (14)Wan 2.6 🔥wan2.6-i2v25-120 ptsHailuo 2.3 / 2.0MiniMax-Hailuo-2.3/0225-32 ptsVidu Q2 Providuq2-pro20-70 ptsSeeDance 1.5 Prodoubao-seedance-1.5-pro47+ ptsSora 2 Prosora-2-pro122+ ptsKling O1 / 2.6kling-video-o1/v2-648-120 ptsGoogle Veo 3.1veo-3.1-generate-preview70-330 ptsPixverse V5.5-V3.5pixverse12-48 ptsfirst_last_frame (11)Kling O1 🌟kling-video-o148-120 ptsKling 2.6kling-v2-680+ ptsOthers (9)Hailuo 2.0, Vidu Q2 Pro, SeeDance 1.5 Pro, Veo 3.1, Pixverse V5.5-V3.5—reference_image (6)Kling O1 🌟kling-video-o148-120 ptsGoogle Veo 3.1veo-3.1-generate-preview70-330 ptsOthers (4)Vidu Q2, Pixverse V5.5/V5/V4.5—\n\n| text_to_video | SeeDance 1.5 Pro / 1.0 Pro | doubao-seedance-1.5-pro / doubao-seedance-1.0-pro | 16 / 15 pts |\n| text_to_video | Sora 2 Pro / Sora 2 | sora-2-pro / sora-2 | 120 / 35 pts |\n| text_to_video | Kling O1 / 2.6 / 2.5 Turbo / 1.6 | kling-video-o1 / kling-v2-6 / kling-v2-5-turbo / kling-v1-6 | 48 / 80 / 24 / 32 pts |\n| text_to_video | Google Veo 3.1 Fast / 3.1 / 3.0 | veo-3.1-fast-generate-preview / veo-3.1-generate-preview / veo-3.0-generate-preview | 55 / 140 / 280 pts |\n| text_to_video | Pixverse V3.5–V5.5 | pixverse | 12 pts |\n| image_to_video | Wan 2.6 / 2.6 Flash / 2.5 / 2.2 Plus | wan2.6-i2v / wan2.6-i2v-flash / wan2.5-i2v-preview / wan2.2-i2v-plus | 25 / 12 / 12 / 10 pts |\n| image_to_video | Kling 2.1 Master | kling-v2-1-master | 150 pts |\n| first_last_frame_to_video | Kling O1 | kling-video-o1 | 70 pts |\n| reference_image_to_video | Kling O1 / Vidu Q2 / Q1 | kling-video-o1 / viduq2 / viduq1 | 48 / 10 / 25 pts 
|"
      },
      {
        "title": "Music Generation",
        "body": "CategoryNamemodel_idCostNotestext_to_musicSunosonic25 ptssonic-v5; custom_mode, lyrics, vocal_gendertext_to_musicDouBao BGMGenBGM30 ptsBackground musictext_to_musicDouBao SongGenSong30 ptsSong generation"
      },
      {
        "title": "Speech (TTS) — text_to_speech",
        "body": "Models and credits are not fixed. Always call GET /open/v1/product/list?category=text_to_speech (or run the script with --task-type text_to_speech --list-models) to get current model_id, attribute_id, and credit.\n\nima-all-ai has complete TTS capability: This document and the bundled ima_create.py provide full TTS support (routing, parameters, create/poll, UX protocol Steps 0–5, error translation). The ima-tts-ai skill is an optional standalone package with the same specification.\n\nTTS Task Detail — Response Shape\n\nPoll POST /open/v1/tasks/detail until completion. For TTS, medias[] uses the same structure as other IMA audio tasks:\n\nFieldTypeMeaningresource_statusint or null0=处理中, 1=可用, 2=失败, 3=已删除；null 视为 0statusstring\"pending\" / \"processing\" / \"success\" / \"failed\"urlstringAudio URL when resource_status=1 (mp3/wav)duration_strstringOptional, e.g. \"12s\"formatstringOptional, e.g. \"mp3\", \"wav\"\n\nSuccess example: When all medias have resource_status == 1 and status != \"failed\", read medias[0].url (or watermark_url). Example: {\"medias\":[{\"resource_status\":1,\"status\":\"success\",\"url\":\"https://cdn.../output.mp3\",\"duration_str\":\"12s\",\"format\":\"mp3\"}]}.\n\nTTS Create Task — Request Shape\n\ntask_type: \"text_to_speech\". No image input: src_img_url: [], input_images: []. prompt (text to speak) must be inside parameters[].parameters, not at top level. Extra fields (e.g. 
voice_id, speed) come from product form_config; pass via --extra-params and only include params present in the product’s credit_rules/form_config.\n\nTTS Common Mistakes\n\nMistakeFixprompt at top levelPut prompt inside parameters[].parameters (script does this)Wrong or missing attribute_idAlways call product list first; use credit_rulesSingle pollPoll until all medias have resource_status == 1Ignoring status when resource_status=1Check status != \"failed\"Sending params not in form_config/credit_rulesUse only params from product list; script reflection strips others on retry\n\nAlways call GET /open/v1/product/list?category=<type> first to get the live attribute_id and form_config defaults required for task creation.\n\nThere are two equivalent route systems serving the same backend logic:\n\nRouteAuthUse Case/open/v1/Authorization: Bearer ima_* onlyThird-party / agent access/api/v3/Token + API Key (dual auth)Frontend App\n\nThis skill documents the /open/v1/ Open API. All business logic (credit validation, N-flattening, risk control) runs identically on both paths."
      },
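      {
        "title": "TTS Create — Illustrative Request Sketch",
        "body": "A minimal sketch of a text_to_speech create request, following the generic Request Structure in API 2. The angle-bracket placeholders (attribute_id, credit, model_version, voice_id) are illustrative, not real values: always take the live values from GET /open/v1/product/list?category=text_to_speech and its form_config before sending, and mirror credit into cast.\n\nPOST /open/v1/tasks/create\n{\n  \"task_type\": \"text_to_speech\",\n  \"src_img_url\": [],\n  \"parameters\": [{\n    \"attribute_id\":  <from credit_rules[].attribute_id>,\n    \"model_id\":      \"seed-tts-2.0\",\n    \"model_version\": \"<type=3 leaf id from product list>\",\n    \"category\":      \"text_to_speech\",\n    \"credit\":        <credit_rules[].points>,\n    \"app\": \"ima\", \"platform\": \"web\",\n    \"parameters\": {\n      \"prompt\":       \"Hello, this is the text to speak.\",\n      \"voice_id\":     \"<only if present in form_config>\",\n      \"n\":            1,\n      \"input_images\": [],\n      \"cast\":         {\"points\": <credit>, \"attribute_id\": <attribute_id>}\n    }\n  }]\n}"
      },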
      {
        "title": "Environment",
        "body": "Base URL: https://api.imastudio.com\n\nRequired/recommended headers for all /open/v1/ endpoints:\n\nHeaderRequiredValueNotesAuthorization✅Bearer ima_your_api_key_hereAPI key authenticationx-app-source✅ima_skillsFixed value — identifies skill-originated requestsx_app_languagerecommendeden / zhProduct label language; defaults to en if omitted\n\nAuthorization: Bearer ima_your_api_key_here\nx-app-source: ima_skills\nx_app_language: en"
      },
      {
        "title": "📤 When to Upload Images (Quick Reference)",
        "body": "The IMA Open API does NOT accept raw bytes or base64 images. All image inputs must be public HTTPS URLs.\n\nTask TypeInput Required?Upload Before Create?Notestext_to_image❌ No—Prompt onlyimage_to_image✅ Yes (1 image)✅ Upload firstSingle input imagetext_to_video❌ No—Prompt onlyimage_to_video✅ Yes (1 image)✅ Upload firstSingle input imagefirst_last_frame_to_video✅ Yes (2 images)✅ Upload firstFirst + last framereference_image_to_video✅ Yes (1+ images)✅ Upload firstReference image(s)text_to_music❌ No—Prompt onlytext_to_speech❌ No—Prompt only (text to speak)\n\nUpload flow:\n\nUser provides local file path or bytes → call prepare_image_url() (see section below)\nUser provides HTTPS URL → use directly, no upload needed\nUse the returned CDN URL (fdl) as the value for input_images / src_img_url\n\nExample workflow (image_to_image):\n\n# User provides local file\nimage_url = prepare_image_url(\"/path/to/photo.jpg\", api_key)\n# → Returns: https://ima-ga.esxscloud.com/webAgent/privite/2026/02/27/..._uuid.jpeg\n\n# Then create task with this URL\ncreate_task(\n    task_type=\"image_to_image\",\n    input_images=[image_url],  # Use uploaded URL\n    prompt=\"turn into oil painting\"\n)"
      },
      {
        "title": "⚠️ MANDATORY: Always Query Product List First",
        "body": "CRITICAL: You MUST call /open/v1/product/list BEFORE creating any task.\nThe attribute_id field is REQUIRED in the create request. If it is 0 or missing, you get:\n\"Invalid product attribute\" → \"Insufficient points\" → task fails completely.\nNEVER construct a create request from the model table alone. Always fetch the product first."
      },
      {
        "title": "How to get attribute_id (all task types)",
        "body": "# Query product list with the correct category\nGET /open/v1/product/list?app=ima&platform=web&category=<task_type>\n# task_type: text_to_image | image_to_image | text_to_video | image_to_video |\n#            first_last_frame_to_video | reference_image_to_video | text_to_music | text_to_speech\n\n# Walk the V2 tree to find your target model (type=3 leaf nodes only)\nfor group in response[\"data\"]:\n    for version in group.get(\"children\", []):\n        if version[\"type\"] == \"3\" and version[\"model_id\"] == target_model_id:\n            attribute_id  = version[\"credit_rules\"][0][\"attribute_id\"]\n            credit        = version[\"credit_rules\"][0][\"points\"]\n            model_version = version[\"id\"]    # = version_id / model_version\n            model_name    = version[\"name\"]\n            form_defaults = {f[\"field\"]: f[\"value\"] for f in version[\"form_config\"]}\n            break"
      },
      {
        "title": "Quick Reference: Known attribute_ids",
        "body": "Pre-queried values for convenience. Always call the product list at runtime for accuracy.\n\nModelTask Typemodel_idattribute_idcreditNotestext_to_imageSeeDream 4.5text_to_imagedoubao-seedream-4.523415 ptsDefault, balancedNano Banana Pro (1K)text_to_imagegemini-3-pro-image239910 pts1024×1024Nano Banana Pro (2K)text_to_imagegemini-3-pro-image240010 pts2048×2048Nano Banana Pro (4K)text_to_imagegemini-3-pro-image240118 pts4096×4096text_to_videoWan 2.6 (720P, 5s)text_to_videowan2.6-t2v205725 ptsDefault, balancedWan 2.6 (1080P, 5s)text_to_videowan2.6-t2v205840 pts—Wan 2.6 (720P, 10s)text_to_videowan2.6-t2v205950 pts—Wan 2.6 (1080P, 10s)text_to_videowan2.6-t2v206080 pts—Wan 2.6 (720P, 15s)text_to_videowan2.6-t2v206175 pts—Wan 2.6 (1080P, 15s)text_to_videowan2.6-t2v2062120 pts—Kling O1 (5s, std)text_to_videokling-video-o1231348 ptsLatest KlingKling O1 (5s, pro)text_to_videokling-video-o1231460 pts—Kling O1 (10s, std)text_to_videokling-video-o1231596 pts—Kling O1 (10s, pro)text_to_videokling-video-o12316120 pts—text_to_musicSuno (sonic-v4)text_to_musicsonic237025 ptsDefaultDouBao BGMtext_to_musicGenBGM439930 pts—DouBao Songtext_to_musicGenSong439830 pts—All othersany—→ query /open/v1/product/list—Always runtime query\n\n⚠️ Production warning: attribute_id and credit values change frequently in production. Always call /open/v1/product/list at runtime; above table is pre-queried reference only (2026-02-27)."
      },
      {
        "title": "Common Mistakes (and resulting errors)",
        "body": "MistakeErrorattribute_id is 0 or missing\"Invalid product attribute\" + \"Insufficient points\"attribute_id outdated (production changed)Same errors; always query product list firstattribute_id doesn't match parameter combinationError 6010: \"Attribute ID does not match the calculated rule\"prompt at outer parameters[] levelPrompt ignored; wrong routingcast missing from inner parameters.parametersBilling validation failurecredit value wrong or missingError 6006model_name / model_version missingWrong backend routingSkipped product list, used table values directlyAll of the above\n\n⚠️ Critical for Google Veo 3.1 and multi-rule models:\n\nModels like Google Veo 3.1 have multiple credit_rules, each with a different attribute_id for different parameter combinations:\n\n720p + 4s + optimized → attribute_id A\n720p + 8s + optimized → attribute_id B\n4K + 4s + high → attribute_id C\n\nThe script automatically selects the correct attribute_id by matching your parameters (duration, resolution, compression_quality, generate_audio) against each rule's attributes. If the match fails, you get error 6010.\n\nFix: The bundled script now checks these video-specific parameters for smart credit_rule selection. Always use the script, not manual API construction."
      },
      {
        "title": "Core Flow",
        "body": "1. GET /open/v1/product/list?app=ima&platform=web&category=<type>\n   → REQUIRED: Get attribute_id, credit, model_version, model_name, form_config defaults\n\n[If input image required]\n2. Upload image → get public HTTPS URL\n   → See \"Image Upload\" section below\n\n3. POST /open/v1/tasks/create\n   → Must include: attribute_id, model_name, model_version, credit, cast, prompt (nested!)\n\n4. POST /open/v1/tasks/detail  {\"task_id\": \"...\"}\n   → Poll until medias[].resource_status == 1\n   → Extract url from completed media"
      },
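      {
        "title": "Core Flow — Minimal Polling Sketch",
        "body": "The four Core Flow steps can be compressed into one Python sketch (assumes the requests library; IMA_API_KEY, pick_version, and build_create_payload are illustrative placeholders, not part of the bundled script). Step 1 supplies attribute_id/credit from the product list; steps 3–4 create the task and poll it:\n\nimport time, requests\n\nBASE = \"https://api.imastudio.com\"\nHEADERS = {\"Authorization\": f\"Bearer {IMA_API_KEY}\", \"x-app-source\": \"ima_skills\"}\n\n# 1. Product list → pick a type=3 leaf (see \"How to get attribute_id\")\nproducts = requests.get(f\"{BASE}/open/v1/product/list\",\n                        params={\"app\": \"ima\", \"platform\": \"web\", \"category\": \"text_to_image\"},\n                        headers=HEADERS).json()[\"data\"]\nleaf = pick_version(products, \"doubao-seedream-4.5\")   # hypothetical helper\n\n# 3. Create task (payload per \"Request Structure\": prompt nested, cast mirrors credit)\ntask = requests.post(f\"{BASE}/open/v1/tasks/create\",\n                     json=build_create_payload(leaf, prompt=\"a mountain sunset\"),\n                     headers=HEADERS).json()[\"data\"]\n\n# 4. Poll until ALL medias reach resource_status == 1 (null counts as 0)\nwhile True:\n    detail = requests.post(f\"{BASE}/open/v1/tasks/detail\",\n                           json={\"task_id\": task[\"id\"]}, headers=HEADERS).json()[\"data\"]\n    medias = detail.get(\"medias\") or []\n    if any(m.get(\"status\") == \"failed\" or (m.get(\"resource_status\") or 0) in (2, 3) for m in medias):\n        raise RuntimeError(\"generation failed\")\n    if medias and all((m.get(\"resource_status\") or 0) == 1 for m in medias):\n        break\n    time.sleep(3)\n\nurls = [m[\"url\"] for m in medias]"
      },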
      {
        "title": "Image Upload (Required Before Image Tasks)",
        "body": "The IMA Open API does NOT accept raw bytes or base64 images. All image inputs must be public HTTPS URLs.\n\nWhen a user provides an image (local file, bytes, base64), you must upload it first and get a URL. This is exactly what the IMA frontend does before every image task."
      },
      {
        "title": "Real Upload Flow (from IMA Frontend Source)",
        "body": "The frontend uses a two-step presigned URL flow via the IM platform:\n\nStep 1: GET /api/rest/oss/getuploadtoken   → returns { ful, fdl }\n          ful = presigned PUT URL (upload destination, expires ~7 days)\n          fdl = final CDN download URL (use this as input_images value)\n\nStep 2: PUT {ful}  with raw image bytes + Content-Type header\n          → image is stored in Aliyun OSS: zhubite-imagent-bot.oss-us-east-1.aliyuncs.com\n          → accessible via CDN: https://ima-ga.esxscloud.com/..."
      },
      {
        "title": "Step 1: Get Upload Token",
        "body": "GET https://imapi-qa.liveme.com/api/rest/oss/getuploadtoken\n\nRequired query parameters (11 total — sourced directly from frontend generateUploadInfo):\n\nParameterExampleDescriptionappUidima_xxx...Use IMA API key directly — no separate login neededappIdwebAgentApp identifier (fixed)appKey32jdskjdk320eewApp secret (fixed, used for sign generation)cmimTokenima_xxx...Use IMA API key directly — same as appUidsign117CF6CF...IM auth HMAC: SHA1(\"webAgent|32jdskjdk320eew|{timestamp}|{nonce}\").upper()timestamp1772042430Unix timestamp (seconds), generated per requestnonceCxI1FLI5ajLJZ1jlxZmegRandom nonce string, generated per requestfServicepriviteFixed: storage service typefTypepicturepicture for images, video, audiofSuffixjpegFile extension: jpeg, png, mp4, mp3fContentTypeimage/jpegMIME type of the file\n\n简化认证：直接使用 IMA API key 填充 appUid 和 cmimToken 参数，无需单独获取凭证。\n\nResponse:\n\n{\n  \"ful\": \"https://zhubite-imagent-bot.oss-us-east-1.aliyuncs.com/webAgent/privite/2026/02/26/..._uuid.jpeg?Expires=...&OSSAccessKeyId=...&Signature=...\",\n  \"fdl\": \"https://ima-ga.esxscloud.com/webAgent/privite/2026/02/26/..._uuid.jpeg\",\n  \"ful_expire\": \"...\",\n  \"fdl_expire\": \"...\",\n  \"fdl_key\": \"...\"\n}"
      },
      {
        "title": "Step 2: Upload Image via Presigned URL",
        "body": "PUT {ful}\nContent-Type: image/jpeg\nBody: [raw image bytes]\n\nNo auth headers needed — the presigned URL already encodes the credentials."
      },
      {
        "title": "Step 3: Use fdl as the Image URL",
        "body": "After the PUT succeeds, use fdl (the CDN URL) as the value for input_images / src_img_url."
      },
      {
        "title": "Python Implementation",
        "body": "import hashlib, time, uuid, requests, mimetypes\n\n# ── 🌐 IMA Upload Service Endpoint (IMA-owned, for image/video uploads) ──────\nIMA_IM_BASE = \"https://imapi-qa.liveme.com\"   # prod: https://imapi.liveme.com\n\n# ── 🔑 Hardcoded APP_KEY (Public, Shared Across All Users) ──────────────────\n# This APP_KEY is a PUBLIC identifier used by IMA Studio's image/video upload \n# service. It is NOT a secret—it's intentionally shared across all users and \n# embedded in the IMA web frontend. This key is used to generate HMAC signatures \n# for upload token requests, but your IMA API key (ima_xxx...) is the ACTUAL \n# authentication credential. Think of APP_KEY as a \"client ID\" rather than a \n# \"client secret.\"\n#\n# ⚠️ Security Note: Your ima_xxx... API key is the sensitive credential. It is \n# sent to imapi.liveme.com as query parameters (appUid, cmimToken). Always use \n# test keys for experiments and rotate your API key regularly.\n#\n# 📖 See SECURITY.md for complete disclosure and network verification guide.\nAPP_ID    = \"webAgent\"\nAPP_KEY   = \"32jdskjdk320eew\"   # Public shared key (used for HMAC sign generation)\nAPP_UID   = \"<your_app_uid>\"    # POST /api/v3/login/app → data.user_id\nAPP_TOKEN = \"<your_app_token>\"  # POST /api/v3/login/app → data.token\n\n\ndef _gen_sign() -> tuple[str, str, str]:\n    \"\"\"Generate per-request (sign, timestamp, nonce).\"\"\"\n    nonce = uuid.uuid4().hex[:21]\n    ts    = str(int(time.time()))\n    raw   = f\"{APP_ID}|{APP_KEY}|{ts}|{nonce}\"\n    sign  = hashlib.sha1(raw.encode()).hexdigest().upper()\n    return sign, ts, nonce\n\n\ndef get_upload_token(app_uid: str, app_token: str,\n                     suffix: str, content_type: str) -> dict:\n    \"\"\"Step 1: Get presigned upload URL from IMA's upload service.\n    \n    Calls GET imapi.liveme.com/api/rest/oss/getuploadtoken with exactly 11 params.\n    Returns: { \"ful\": \"<presigned PUT URL>\", \"fdl\": \"<CDN download URL>\" }\n    \n    
Args:\n        app_uid: Your IMA API key (ima_xxx...), used as appUid parameter\n        app_token: Your IMA API key (ima_xxx...), used as cmimToken parameter\n        suffix: File extension (jpeg, png, mp4, mp3)\n        content_type: MIME type (image/jpeg, video/mp4, etc.)\n    \n    Security Note:\n        Your IMA API key (ima_xxx...) is sent to imapi.liveme.com as query \n        parameters (appUid, cmimToken). This is IMA Studio's image/video upload \n        service, separate from the main api.imastudio.com API. Both domains are \n        owned by IMA Studio—this is part of IMA's microservices architecture.\n        \n        Why two domains?\n        - api.imastudio.com: Core AI generation API (product list, task creation)\n        - imapi.liveme.com: Specialized upload service (presigned URL generation)\n        \n        Your API key grants access to both services. For security verification, \n        see SECURITY.md section \"Network Traffic Verification.\"\n    \"\"\"\n    sign, ts, nonce = _gen_sign()\n    r = requests.get(\n        f\"{IMA_IM_BASE}/api/rest/oss/getuploadtoken\",\n        params={\n            # App Key params\n            \"appUid\":       app_uid,       # APP_UID\n            \"appId\":        APP_ID,\n            \"appKey\":       APP_KEY,\n            \"cmimToken\":    app_token,     # APP_TOKEN\n            \"sign\":         sign,\n            \"timestamp\":    ts,\n            \"nonce\":        nonce,\n            # File params\n            \"fService\":     \"privite\",     # fixed\n            \"fType\":        \"picture\",     # picture / video / audio\n            \"fSuffix\":      suffix,        # jpeg / png / mp4 / mp3\n            \"fContentType\": content_type,\n        },\n    )\n    r.raise_for_status()\n    return r.json()[\"data\"]\n\n\ndef upload_image_to_oss(image_bytes: bytes, content_type: str, ful: str) -> None:\n    \"\"\"Step 2: PUT image bytes to the presigned OSS URL. 
No auth needed.\"\"\"\n    resp = requests.put(ful, data=image_bytes, headers={\"Content-Type\": content_type})\n    resp.raise_for_status()\n\n\ndef prepare_image_url(source, api_key: str) -> str:\n    \"\"\"\n    Full workflow: upload any image and return the CDN URL (fdl).\n    \n    Args:\n        source: file path (str), raw bytes, or already-public HTTPS URL\n        api_key: IMA API key for upload authentication\n    \n    Returns: public HTTPS CDN URL ready to use as input_images value\n    \"\"\"\n    # Already a public URL → use directly, no upload needed\n    if isinstance(source, str) and source.startswith(\"https://\"):\n        return source\n    \n    # Read file bytes\n    if isinstance(source, str):\n        ext = source.rsplit(\".\", 1)[-1].lower() if \".\" in source else \"jpeg\"\n        with open(source, \"rb\") as f:\n            image_bytes = f.read()\n        content_type = mimetypes.guess_type(source)[0] or \"image/jpeg\"\n    else:\n        image_bytes = source\n        ext = \"jpeg\"\n        content_type = \"image/jpeg\"\n\n    # Step 1: Get presigned URL using API key directly\n    token_data = get_upload_token(api_key, ext, content_type)\n    ful = token_data[\"ful\"]\n    fdl = token_data[\"fdl\"]\n\n    # Step 2: Upload to OSS\n    upload_image_to_oss(image_bytes, content_type, ful)\n\n    # Step 3: Return CDN URL\n    return fdl   # use this as input_images / src_img_url value\n\nOSS path format: webAgent/privite/{YYYY}/{MM}/{DD}/{timestamp}_{uid}_{uuid}.{ext}\nCDN base: https://ima-ga.esxscloud.com/\nOSS bucket: zhubite-imagent-bot.oss-us-east-1.aliyuncs.com"
      },
      {
        "title": "Task Types (category values)",
        "body": "categoryCapabilityInputtext_to_imageText → Imagepromptimage_to_imageImage → Imageprompt + input_imagestext_to_videoText → Videopromptimage_to_videoImage → Videoprompt + input_imagesfirst_last_frame_to_videoFirst+Last Frame → Videoprompt + src_img_url[2]reference_image_to_videoReference Image → Videoprompt + src_img_url[1+]text_to_musicText → Musicprompttext_to_speechText → Speechprompt (text to speak)"
      },
      {
        "title": "Detail API status values",
        "body": "Each media in medias[] has two fields:\n\nFieldTypeValuesDescriptionresource_statusint (or null)0, 1, 2, 30=处理中, 1=可用, 2=失败, 3=已删除。API 可能返回 null，需当作 0。statusstring\"pending\", \"processing\", \"success\", \"failed\"任务状态文案。轮询时以 resource_status 为准；status == \"failed\" 表示失败。\n\nPoll on resource_status first, then ensure status is not \"failed\":\n\nresource_statusstatusMeaningAction0 or nullpending / processing处理中Keep polling; do not stop (null = 0)1success (or completed)完成Read url; stop only when all medias are 11failed失败 (status 优先)Stop, handle error2any失败Stop, handle error3any已删除Stop\n\nImportant: (1) Treat resource_status: null as 0. (2) Stop only when all medias have resource_status == 1. (3) When resource_status=1, still check status != \"failed\"."
      },
      {
        "title": "API 1: Product List",
        "body": "GET /open/v1/product/list?app=ima&platform=web&category=text_to_image\n\nInternally calls downstream /v1/products/listv2. Returns a V2 tree structure: type=2 nodes are model groups, type=3 nodes are versions (leaves). Only type=3 nodes contain credit_rules and form_config.\n\nwebAgent is auto-converted to ima by the gateway — you can use either value for app.\n\n[\n  {\n    \"id\": \"SeeDream\",\n    \"type\": \"2\",\n    \"name\": \"SeeDream\",\n    \"model_id\": \"\",\n    \"children\": [\n      {\n        \"id\": \"doubao-seedream-4-0-250828\",\n        \"type\": \"3\",\n        \"name\": \"SeeDream 4.0\",\n        \"model_id\": \"doubao-seedream-4.0\",\n        \"credit_rules\": [\n          { \"attribute_id\": 332, \"points\": 5, \"attributes\": { \"default\": \"enabled\" } }\n        ],\n        \"form_config\": [\n          { \"field\": \"size\", \"type\": \"tags\", \"value\": \"1K\",\n            \"options\": [{\"label\":\"1K\",\"value\":\"1K\"}, {\"label\":\"2K\",\"value\":\"2K\"}] }\n        ]\n      }\n    ]\n  }\n]\n\nHow to pick a version for task creation:\n\nTraverse nodes to find type=3 leaves (versions)\nUse model_id and id (= model_version) from the leaf\nPick credit_rules[].attribute_id matching your desired quality/size (attributes field shows the config)\nUse form_config[].value as default parameters values\n\ncredit_rules[].attribute_id → required for task creation as attribute_id.\ncredit_rules[].points → required for task creation as credit and cast.points."
      },
      {
        "title": "API 2: Create Task",
        "body": "POST /open/v1/tasks/create"
      },
      {
        "title": "Request Structure",
        "body": "{\n  \"task_type\": \"text_to_image\",\n  \"enable_multi_model\": false,\n  \"src_img_url\": [],\n  \"upload_img_src\": \"\",\n  \"parameters\": [\n    {\n      \"attribute_id\": 8538,\n      \"model_id\":      \"doubao-seedream-4.5\",\n      \"model_name\":    \"SeeDream 4.5\",\n      \"model_version\": \"doubao-seedream-4-5-251128\",\n      \"app\":           \"ima\",\n      \"platform\":      \"web\",\n      \"category\":      \"text_to_image\",\n      \"credit\":        5,\n      \"parameters\": {\n        \"prompt\":       \"a beautiful mountain sunset, photorealistic\",\n        \"size\":         \"4k\",\n        \"n\":            1,\n        \"input_images\": [],\n        \"cast\":         {\"points\": 5, \"attribute_id\": 8538}\n      }\n    }\n  ]\n}"
      },
      {
        "title": "Field Reference",
        "body": "FieldRequiredDescriptiontask_type✅Must match parameters[].categoryparameters[].attribute_id✅From credit_rules[].attribute_id in product listparameters[].model_id✅From type=3 leaf node model_idparameters[].model_version✅From type=3 leaf node idparameters[].app✅Use ima (or webAgent, auto-converted)parameters[].platform✅Use webparameters[].category✅Must match top-level task_typeparameters[].credit✅Must equal credit_rules[].points. Error 6006 if wrong.parameters[].parameters.prompt✅The actual prompt text used by downstream serviceparameters[].parameters.cast✅{\"points\": N, \"attribute_id\": N} — mirrors creditparameters[].parameters.n✅Number of outputs (usually 1). Gateway flattens N>1 into separate resources.parameters[].parameters.input_imagesimage tasksArray of input image URLstop-level src_img_urlmulti-imageArray for first_last_frame / reference tasks"
      },
      {
        "title": "N-Field Flattening (Gateway Internal Logic)",
        "body": "When n > 1, the gateway automatically:\n\nGenerates n independent resourceBizId values\nDeducts credits n times (one per resource)\nCreates n separate tasks in the downstream service\n\nResponse medias[] will contain n items. Poll until all have resource_status == 1."
      },
      {
        "title": "Response",
        "body": "{\n  \"code\": 0,\n  \"data\": {\n    \"id\": \"task_abc123\",\n    \"biz_id\": \"biz_xxx\",\n    \"task_type\": \"text_to_image\",\n    \"medias\": [],\n    \"generate_count\": 1,\n    \"created_at\": 1700000000000,\n    \"timeout_at\": 1700000300000\n  }\n}\n\ndata.id = task ID for polling. timeout_at = Unix ms deadline."
      },
      {
        "title": "API 3: Task Detail (Poll)",
        "body": "POST /open/v1/tasks/detail\n{\"task_id\": \"<id from create response>\"}\n\nPoll every 2–5s (8s+ for video). Completed response:\n\n{\n  \"id\": \"task_abc\",\n  \"medias\": [{\n    \"resource_status\": 1,\n    \"status\": \"success\",\n    \"url\": \"https://cdn.../output.jpg\",\n    \"cover\": \"https://cdn.../cover.jpg\",\n    \"format\": \"jpg\",\n    \"width\": 1024,\n    \"height\": 1024\n  }]\n}\n\nPolling stop condition (must implement exactly):\n\nTreat resource_status: null (or missing) as 0 (processing). Do not stop when you see null; backend may serialize Go *int as null.\nStop only when ALL medias[].resource_status == 1 and no status == \"failed\". If you return on the first media with resource_status == 1 while others are still 0, the task is not fully done and you will keep polling or get inconsistent state.\nStop immediately if any status == \"failed\" or resource_status == 2 or resource_status == 3."
      },
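      {
        "title": "Polling Stop Condition — Helper Sketch",
        "body": "The stop rules above can be captured in one small helper (a sketch; the function name poll_state is illustrative, not part of the bundled script). It classifies a detail response's medias[] as \"done\", \"failed\", or \"pending\":\n\ndef poll_state(medias: list[dict]) -> str:\n    if not medias:\n        return \"pending\"              # medias may be empty right after create\n    for m in medias:\n        rs = m.get(\"resource_status\")\n        rs = 0 if rs is None else rs  # null may serialize from Go *int → treat as 0\n        if m.get(\"status\") == \"failed\" or rs in (2, 3):\n            return \"failed\"           # status \"failed\" wins even when rs == 1\n    if all((m.get(\"resource_status\") or 0) == 1 for m in medias):\n        return \"done\"                 # ALL medias at 1, none failed\n    return \"pending\"\n\n# poll_state([{\"resource_status\": None, \"status\": \"processing\"}]) → \"pending\"\n# poll_state([{\"resource_status\": 1, \"status\": \"failed\"}]) → \"failed\""
      },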
      {
        "title": "text_to_image ✅ Verified",
        "body": "No image input. src_img_url: [], input_images: []. See API 2 for full example."
      },
      {
        "title": "text_to_video ✅ Verified",
        "body": "Extra fields vs text_to_image — all from form_config defaults:\n\n{\n  \"task_type\": \"text_to_video\",\n  \"src_img_url\": [],\n  \"parameters\": [{\n    \"attribute_id\":  4838,\n    \"model_id\":      \"wan2.6-t2v\",\n    \"model_name\":    \"Wan 2.6\",\n    \"model_version\": \"wan2.6-t2v\",\n    \"category\":      \"text_to_video\",\n    \"credit\":        3,\n    \"app\": \"ima\", \"platform\": \"web\",\n    \"parameters\": {\n      \"prompt\":          \"a puppy dancing happily, sunny meadow\",\n      \"negative_prompt\": \"\",\n      \"prompt_extend\":   false,\n      \"duration\":        5,\n      \"resolution\":      \"1080P\",\n      \"aspect_ratio\":    \"16:9\",\n      \"shot_type\":       \"single\",\n      \"seed\":            -1,\n      \"n\":               1,\n      \"input_images\":    [],\n      \"cast\":            {\"points\": 3, \"attribute_id\": 4838}\n    }\n  }]\n}\n\nVideo-specific fields from form_config: duration (seconds), resolution, aspect_ratio, shot_type, negative_prompt, prompt_extend.\nPoll every 8s (video generation is slower). Response medias[].cover = first-frame thumbnail."
      },
      {
        "title": "text_to_music",
        "body": "No image input. src_img_url: [], input_images: []."
      },
      {
        "title": "image_to_image ✅ Verified",
        "body": "{\n  \"task_type\": \"image_to_image\",\n  \"src_img_url\": [\"https://...input.jpg\"],\n  \"parameters\": [{\n    \"attribute_id\":  8560,\n    \"model_id\":      \"doubao-seedream-4.5\",\n    \"model_version\": \"doubao-seedream-4-5-251128\",\n    \"category\":      \"image_to_image\",\n    \"credit\":        5,\n    \"app\": \"ima\", \"platform\": \"web\",\n    \"parameters\": {\n      \"prompt\":       \"turn into oil painting style\",\n      \"size\":         \"4k\",\n      \"n\":            1,\n      \"input_images\": [\"https://...input.jpg\"],\n      \"cast\":         {\"points\": 5, \"attribute_id\": 8560}\n    }\n  }]\n}\n\n⚠️ size must be from form_config options (e.g. \"2k\", \"4k\", \"2048x2048\"). \"adaptive\" is NOT valid for SeeDream 4.5 i2i — causes error 400.\nTop-level src_img_url and parameters.input_images must both contain the input image URL.\nSome i2i models (e.g. doubao-seededit-3.0-i2i) may not be available in test environments — fall back to SeeDream 4.5."
      },
      {
        "title": "image_to_video / first_last_frame_to_video / reference_image_to_video",
        "body": "{\n  \"src_img_url\": [\"https://first-frame.jpg\", \"https://last-frame.jpg\"]\n}\n\nIndex 0 = first frame (or reference), index 1 = last frame (first_last_frame only)."
      },
      {
        "title": "Common Mistakes",
        "body": "MistakeFixattribute_id not from credit_rulesAlways fetch product list firstcredit value wrongMust exactly match credit_rules[].points — error 6006prompt at wrong locationPut prompt in parameters[].parameters.prompt (nested), not only at top levelPolling biz_id instead of idUse id (task ID) for /tasks/detailSingle-poll instead of loopPoll until resource_status == 1 for ALL mediasMissing app / platform in parametersRequired fields — use ima / webcategory mismatchparameters[].category must match top-level task_typeresource_status == 2 not handledCheck for failure, don't loop foreverstatus == \"failed\" ignoredresource_status=1 + status=\"failed\" means actual failuren > 1 and only checking first mediaAll n media items must reach resource_status == 1"
      },
      {
        "title": "Complete Python Example",
        "body": "See the Python example sections throughout this documentation for implementation guidance covering all 7 task types."
      },
      {
        "title": "Supported Models & Search Terms",
        "body": "Image: SeeDream 4.5 (see dream), Midjourney (MJ), Nano Banana 2, Nano Banana Pro\nVideo: Wan 2.6, Kling O1, Kling 2.6, Google Veo 3.1 (veo), Sora 2 Pro, Pixverse V5.5, Hailuo 2.0, Hailuo 2.3, MiniMax Hailuo, SeeDance 1.5 Pro, Vidu Q2\nMusic: Suno sonic v4, Suno sonic v5, DouBao BGM (GenBGM), DouBao Song (GenSong)\nTTS: seed-tts-2.0 (seed tts, text-to-speech)\n\nCapabilities: multimodal AI creation, all-in-one, image generation, video generation, music generation, text-to-speech, text-to-image, image-to-video, text-to-music"
      }
    ],
    "body": "IMA AI Creation\n⚠️ 重要：模型 ID 参考\n\nCRITICAL: When calling the script, you MUST use the exact model_id (second column), NOT the friendly model name. Do NOT infer model_id from the friendly name (e.g., ❌ nano-banana-pro is WRONG; ✅ gemini-3-pro-image is CORRECT).\n\nQuick Reference Table:\n\n图像模型 (Image Models)\n友好名称 (Friendly Name)\tmodel_id\t说明 (Notes)\nNano Banana2\tgemini-3.1-flash-image\t❌ NOT nano-banana-2, 预算选择 4-13 pts\nNano Banana Pro\tgemini-3-pro-image\t❌ NOT nano-banana-pro, 高质量 10-18 pts\nSeeDream 4.5\tdoubao-seedream-4.5\t✅ Recommended default, 5 pts\nMidjourney\tmidjourney\t✅ Same as friendly name, 8-10 pts\n视频模型 (Video Models)\n友好名称 (Friendly Name)\tmodel_id (t2v)\tmodel_id (i2v)\t说明 (Notes)\nWan 2.6\twan2.6-t2v\twan2.6-i2v\t⚠️ Note -t2v/-i2v suffix\nKling O1\tkling-video-o1\tkling-video-o1\t⚠️ Note video- prefix\nKling 2.6\tkling-v2-6\tkling-v2-6\t⚠️ Note v prefix\nHailuo 2.3\tMiniMax-Hailuo-2.3\tMiniMax-Hailuo-2.3\t⚠️ Note MiniMax- prefix\nHailuo 2.0\tMiniMax-Hailuo-02\tMiniMax-Hailuo-02\t⚠️ Note 02 not 2.0\nGoogle Veo 3.1\tveo-3.1-generate-preview\tveo-3.1-generate-preview\t⚠️ Note -generate-preview suffix\nSora 2 Pro\tsora-2-pro\tsora-2-pro\t✅ Straightforward\nPixverse\tpixverse\tpixverse\t✅ Same as friendly name\n音乐模型 (Music Models)\n友好名称 (Friendly Name)\tmodel_id\t说明 (Notes)\nSuno (sonic v4)\tsonic\t⚠️ Simplified to sonic\nDouBao BGM\tGenBGM\t❌ NOT doubao-bgm\nDouBao Song\tGenSong\t❌ NOT doubao-song\n语音模型 (Speech/TTS Models)\n友好名称 (Friendly Name)\tmodel_id\t说明 (Notes)\nseed-tts-2.0\tseed-tts-2.0\t✅ Same as friendly name (default)\n\nHow to get the correct model_id:\n\nCheck this table first\nUse --list-models --task-type <type> to query available models\nRefer to command examples in this SKILL.md\n\nExample:\n\n# ❌ WRONG: Inferring from friendly name\n--model-id nano-banana-pro\n\n# ✅ CORRECT: Using exact model_id from table\n--model-id gemini-3-pro-image\n\n⚠️ MANDATORY PRE-CHECK: Read Knowledge Base First!\n\nIf ima-knowledge-ai 
is not installed: Skip all \"Read …\" steps below; use only this SKILL's 📥 User Input Parsing (media type → task_type) and the Recommended Defaults / model tables for each media type.\n\nBEFORE executing ANY multi-media generation task, you MUST:\n\nCheck for workflow complexity — Read ima-knowledge-ai/references/workflow-design.md if:\n\nUser mentions: \"MV\"、\"宣传片\"、\"完整作品\"、\"配乐\"、\"soundtrack\"\nTask spans multiple media types (image + video, video + music, etc.)\nComplex multi-step workflows that need task decomposition\n\nCheck for visual consistency needs — Read ima-knowledge-ai/references/visual-consistency.md if:\n\nUser mentions: \"系列\"、\"多张\"、\"同一个\"、\"角色\"、\"续\"、\"series\"、\"same\"\nTask involves: multiple images/videos, character continuity, product shots\nSecond+ request about the same subject (e.g., \"旺财在游泳\" (\"旺财 is swimming\") after \"生成旺财照片\" (\"generate a photo of 旺财\"))\n\nCheck video modes — Read ima-knowledge-ai/references/video-modes.md if:\n\nAny video generation task\nNeed to understand: image_to_video vs reference_image_to_video difference\n\nCheck model selection — Read ima-knowledge-ai/references/model-selection.md if:\n\nUnsure which model to use\nNeed cost/quality trade-off guidance\nUser specifies budget or quality requirements\n\nWhy this matters:\n\nMulti-media workflows need proper task sequencing (e.g., video duration → matching music duration)\nAI generation produces independent results each time — without reference images, results will be inconsistent\nWrong video mode = wrong result (image_to_video ≠ reference_image_to_video)\nModel choice affects cost and quality significantly\n\nExample multi-media workflow:\n\nUser: \"帮我做个产品宣传MV，有背景音乐，主角是旺财小狗\" (\"Make me a product promo MV with background music, starring the puppy 旺财\")\n\n❌ Wrong: \n  1. Generate dog image (random look)\n  2. Generate video (different dog)\n  3. Generate music (unrelated)\n\n✅ Right:\n  1. Read workflow-design.md + visual-consistency.md\n  2. Generate Master Reference: an image of the puppy 旺财\n  3. Generate video shots using image_to_video with 旺财 as first frame\n  4. Get video duration (e.g., 15s)\n  5.
Generate BGM with matching duration and mood\n\n\nHow to check:\n\n# Step 0: Determine media type first (image / video / music / speech)\n# From user request: \"画\"/\"生成图\"/\"image\" → image; \"视频\"/\"video\" → video; \"音乐\"/\"歌\"/\"music\"/\"BGM\" → music; \"语音\"/\"朗读\"/\"TTS\"/\"speech\" → speech\n# Then choose task_type and model from the corresponding section (image: text_to_image/image_to_image; video: text_to_video/...; music: text_to_music; speech: text_to_speech)\n\n# Step 1: Read knowledge base based on task type\nif multi_media_workflow:\n    read(\"~/.openclaw/skills/ima-knowledge-ai/references/workflow-design.md\")\n\nif \"same subject\" or \"series\" or \"character\":\n    read(\"~/.openclaw/skills/ima-knowledge-ai/references/visual-consistency.md\")\n\nif video_generation:\n    read(\"~/.openclaw/skills/ima-knowledge-ai/references/video-modes.md\")\n\n# Step 2: Execute with proper sequencing and reference images\n# (see workflow-design.md for specific patterns)\n\n\nNo exceptions — for simple single-media requests, you can proceed directly. For complex multi-media workflows, read the knowledge base first.\n\n📥 User Input Parsing (Media Type & Task Routing)\n\nPurpose: So that any agent parses user intent consistently, first determine the media type from the user's request, then choose task_type and model.\n\n1. User phrasing → media type (do this first)\nUser intent / keywords\tMedia type\ttask_type examples\n画 / 生成图 / 图片 / image / 画一张 / 图生图\timage\ttext_to_image, image_to_image\n视频 / 生成视频 / video / 图生视频 / 文生视频\tvideo\ttext_to_video, image_to_video, first_last_frame_to_video, reference_image_to_video\n音乐 / 歌 / BGM / 背景音乐 / music / 作曲\tmusic\ttext_to_music\n语音 / 朗读 / TTS / 语音合成 / 配音 / speech / read aloud / text-to-speech\tspeech\ttext_to_speech\n\nIf the request mixes media (e.g. \"宣传片+配乐\"), treat as multi-media workflow: read workflow-design.md, then plan image → video → music steps and use the correct task_type for each step.\n\n2. 
Model and parameter parsing\n\nImage: For model name → model_id and size/aspect_ratio parsing, follow the same rules as in ima-image-ai skill (User Input Parsing section).\n\nVideo: For task_type (t2v / i2v / first_last / reference), model alias → model_id, and duration/resolution/aspect_ratio, follow ima-video-ai skill (User Input Parsing section).\n\nMusic: Suno (sonic) vs DouBao BGM/Song — infer from \"BGM\"/\"背景音乐\" → BGM; \"带歌词\"/\"人声\" → Suno or Song. Use model_id sonic, GenBGM, GenSong per \"Recommended Defaults\" and \"Music Generation\" tables below.\n\nSpeech (TTS): Get model_id from GET /open/v1/product/list?category=text_to_speech or run script with --task-type text_to_speech --list-models. Map user intent to parameters using product form_config:\n\nUser intent / phrasing\tParameter (if in form_config)\tNotes\n女声 / 女声朗读 / female voice\tvoice_id / voice_type\tUse value from form_config options\n男声 / 男声朗读 / male voice\tvoice_id / voice_type\tUse value from form_config options\n语速快/慢 / speed up/slow\tspeed\te.g. 0.8–1.2\n音调 / pitch\tpitch\tIf supported\n大声/小声 / volume\tvolume\tIf supported\n\nIf the user does not specify, use form_config defaults. Pass extra params via --extra-params '{\"speed\":1.0}'. Only send parameters present in the product’s credit_rules/attributes or form_config (script reflection strips others on retry).\n\n⚙️ How This Skill Works\n\nFor transparency: This skill uses a bundled Python script (scripts/ima_create.py) to call the IMA Open API. 
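Before the script is invoked, the agent routes the request using the User Input Parsing rules above. As a hypothetical sketch of that routing (keyword lists are abbreviated, the `route_request` name is invented for illustration, and the model_id strings come verbatim from the reference tables in this SKILL.md):

```python
# Hypothetical sketch of the "User Input Parsing" routing described above.
# Keyword lists are abbreviated; model_id strings are copied exactly from
# the Quick Reference tables (never inferred from friendly names).

MEDIA_KEYWORDS = {
    "image":  ["画", "生成图", "图片", "image"],
    "video":  ["视频", "video", "图生视频", "文生视频"],
    "music":  ["音乐", "歌", "BGM", "背景音乐", "music"],
    "speech": ["语音", "朗读", "TTS", "配音", "speech"],
}

# Fallback defaults only — user preference and knowledge-ai override these.
DEFAULT_MODEL = {
    "text_to_image": "doubao-seedream-4.5",
    "image_to_image": "doubao-seedream-4.5",
    "text_to_video": "wan2.6-t2v",
    "image_to_video": "wan2.6-i2v",
    "text_to_music": "sonic",
    "text_to_speech": None,  # no fixed default: query --list-models
}

def route_request(text, has_input_image=False):
    """Map a raw user request to (media_type, task_type, model_id)."""
    media = next(
        (m for m, kws in MEDIA_KEYWORDS.items() if any(k in text for k in kws)),
        "image",  # fallback; a real agent should ask the user instead
    )
    if media == "image":
        task = "image_to_image" if has_input_image else "text_to_image"
    elif media == "video":
        task = "image_to_video" if has_input_image else "text_to_video"
    else:
        task = f"text_to_{media}"
    return media, task, DEFAULT_MODEL[task]
```

In practice the agent also consults saved user preferences and the ima-knowledge-ai recommendation before settling on a model (see the Model Selection Priority sections).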
The script:\n\nSends your prompt to two IMA-owned domains (see \"Network Endpoints\" below)\nUses --user-id only locally as a key for storing your model preferences\nReturns image/video/music URLs when generation is complete\n\nWhat gets sent to IMA servers:\n\n✅ Your prompt/description (image/video/music)\n✅ Model selection (SeeDream/Wan/Suno/etc.)\n✅ Generation parameters (size, duration, style, etc.)\n❌ NO API key in prompts (key is used for authentication only)\n❌ NO user_id (it's only used locally)\n\nWhat's stored locally:\n\n~/.openclaw/memory/ima_prefs.json - Your model preferences (< 1 KB)\n~/.openclaw/logs/ima_skills/ - Generation logs (auto-deleted after 7 days)\n🌐 Network Endpoints Used\nDomain\tOwner\tPurpose\tData Sent\tPrivacy\napi.imastudio.com\tIMA Studio\tMain API (product list, task creation, task polling)\tPrompts, model IDs, generation params, your API key\tStandard HTTPS, data processed for AI generation\nimapi.liveme.com\tIMA Studio\tImage/Video upload service (presigned URL generation)\tYour API key, file metadata (MIME type, extension)\tStandard HTTPS, used for image/video tasks only\n*.aliyuncs.com, *.esxscloud.com\tAlibaba Cloud (OSS)\tImage/video storage (file upload, CDN delivery)\tRaw image/video bytes (via presigned URL, NO API key)\tIMA-managed OSS buckets, presigned URLs expire after 7 days\n\nKey Points:\n\nMusic tasks (text_to_music) and TTS tasks (text_to_speech) only use api.imastudio.com.\nImage/video tasks require imapi.liveme.com to obtain presigned URLs for uploading input images.\nYour API key is sent to both api.imastudio.com and imapi.liveme.com (both owned by IMA Studio).\nVerify network calls: tcpdump -i any -n 'host api.imastudio.com or host imapi.liveme.com'. See this document: 🌐 Network Endpoints Used and ⚠️ Credential Security Notice for full disclosure.\n⚠️ Credential Security Notice\n\nYour API key is sent to both IMA-owned domains:\n\nAuthorization: Bearer ima_xxx... 
→ api.imastudio.com (main API)\nQuery param appUid=ima_xxx... → imapi.liveme.com (upload service)\n\nSecurity best practices:\n\n🧪 Use test keys for experiments: Generate a separate API key for testing.\n🔍 Monitor usage: Check https://imastudio.com/dashboard for unauthorized activity.\n⏱️ Rotate keys: Regenerate your API key periodically (monthly recommended).\n📊 Review logs: Check ~/.openclaw/logs/ima_skills/ for unexpected API calls.\n\nWhy two domains? IMA Studio uses a microservices architecture:\n\napi.imastudio.com: Core AI generation API\nimapi.liveme.com: Specialized image/video upload service (shared infrastructure)\n\nBoth domains are operated by IMA Studio. The same API key grants access to both services.\n\nAgent Execution (Internal Reference)\n\nNote for users: You can review the script source at scripts/ima_create.py anytime.\nThe agent uses this script to simplify API calls. Music tasks use only api.imastudio.com, while image/video tasks also call imapi.liveme.com for file uploads (see \"Network Endpoints\" above).\n\nUse the bundled script internally for all task types — it ensures correct parameter construction:\n\n# ─── Image Generation ──────────────────────────────────────────────────────────\n\n# Basic text-to-image (default model)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_image \\\n  --model-id doubao-seedream-4.5 --prompt \"a cute puppy on grass, photorealistic\" \\\n  --user-id {user_id} --output-json\n\n# Text-to-image with size override (Nano Banana2)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_image \\\n  --model-id gemini-3.1-flash-image --prompt \"city skyline at sunset, 4K\" \\\n  --size 2k --user-id {user_id} --output-json\n\n# Image-to-image with input URL\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type image_to_image \\\n  --model-id doubao-seedream-4.5 --prompt \"turn into oil painting style\" \\\n  
--input-images https://example.com/photo.jpg --user-id {user_id} --output-json\n\n# ─── Video Generation ──────────────────────────────────────────────────────────\n\n# Basic text-to-video (default model, 5s 720P)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_video \\\n  --model-id wan2.6-t2v --prompt \"a puppy dancing happily, cinematic\" \\\n  --user-id {user_id} --output-json\n\n# Text-to-video with extra params (10s 1080P)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_video \\\n  --model-id wan2.6-t2v --prompt \"dramatic ocean waves, sunset\" \\\n  --extra-params '{\"duration\":10,\"resolution\":\"1080P\",\"aspect_ratio\":\"16:9\"}' \\\n  --user-id {user_id} --output-json\n\n# Image-to-video (animate static image)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type image_to_video \\\n  --model-id wan2.6-i2v --prompt \"camera slowly zooms in, gentle movement\" \\\n  --input-images https://example.com/photo.jpg --user-id {user_id} --output-json\n\n# First-last frame video (two images)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type first_last_frame_to_video \\\n  --model-id kling-video-o1 --prompt \"smooth transition between frames\" \\\n  --input-images https://example.com/frame1.jpg https://example.com/frame2.jpg \\\n  --user-id {user_id} --output-json\n\n# ─── Music Generation ──────────────────────────────────────────────────────────\n\n# Basic text-to-music (Suno default)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_music \\\n  --model-id sonic --prompt \"upbeat electronic music, 120 BPM, no vocals\" \\\n  --user-id {user_id} --output-json\n\n# Music with custom lyrics (Suno custom mode)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_music \\\n  --model-id sonic --prompt \"pop ballad, emotional\" \\\n  --extra-params 
'{\"custom_mode\":true,\"lyrics\":\"Your custom lyrics here...\",\"vocal_gender\":\"female\"}' \\\n  --user-id {user_id} --output-json\n\n# Background music (DouBao BGM)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_music \\\n  --model-id GenBGM --prompt \"relaxing ambient music for meditation\" \\\n  --user-id {user_id} --output-json\n\n# ─── Text-to-Speech (TTS) ─────────────────────────────────────────────────────\n\n# List TTS models first to get model_id, then generate speech\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_speech --list-models\n\n# TTS: use model_id from list above (prompt = text to speak)\npython3 {baseDir}/scripts/ima_create.py \\\n  --api-key $IMA_API_KEY --task-type text_to_speech \\\n  --model-id <model_id from list> --prompt \"Text to be spoken here.\" \\\n  --user-id {user_id} --output-json\n\n\nThe script outputs JSON with url, model_name, credit — use these values in the UX protocol messages below. The script internals (product list query, parameter construction, polling) are invisible to users.\n\nOverview\n\nCall IMA Open API to create AI-generated content. All endpoints require an ima_* API key. 
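As a sketch of what such an authenticated call looks like: the host, product-list route, and Bearer header format below are taken from the Network Endpoints and Credential Security sections of this doc, but treat the exact query and response fields as assumptions to verify against scripts/ima_create.py.

```python
# Builds the pieces of a product-list request; no network call is made here.
# Host and Authorization format are documented in this SKILL.md; the exact
# route/params should be verified against the bundled script.

API_HOST = "https://api.imastudio.com"

def build_product_list_request(api_key, category):
    """Return (url, headers, params) for GET /open/v1/product/list."""
    if not api_key.startswith("ima_"):
        raise ValueError("IMA API keys are expected to start with 'ima_'")
    url = f"{API_HOST}/open/v1/product/list"
    headers = {"Authorization": f"Bearer {api_key}"}
    params = {"category": category}
    return url, headers, params
```

An HTTP client such as requests would then issue `requests.get(url, headers=headers, params=params)`, create the task, and poll it until completion — which is exactly what the bundled script automates.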
The core flow is: query products → create task → poll until done.\n\n🔒 Security & Transparency Policy\n\nThis skill is community-maintained and open for inspection.\n\n✅ What Users CAN Do\n\nFull transparency:\n\n✅ Review all source code: Check scripts/ima_create.py and ima_logger.py anytime\n✅ Verify network calls: Music tasks use api.imastudio.com only; image/video tasks also use imapi.liveme.com (see \"Network Endpoints\" section)\n✅ Inspect local data: View ~/.openclaw/memory/ima_prefs.json and log files\n✅ Control privacy: Delete preferences/logs anytime, or disable file writes (see below)\n\nConfiguration allowed:\n\n✅ Set API key in environment or agent config:\nEnvironment variable: export IMA_API_KEY=ima_your_key_here\nOpenClaw/MCP config: Add IMA_API_KEY to agent's environment configuration\nGet your key at: https://imastudio.com\n✅ Use scoped/test keys: Test with limited API keys, rotate after testing\n✅ Disable file writes: Make prefs/logs read-only or symlink to /dev/null\n\nData control:\n\n✅ View stored data: cat ~/.openclaw/memory/ima_prefs.json\n✅ Delete preferences: rm ~/.openclaw/memory/ima_prefs.json (resets to defaults)\n✅ Delete logs: rm -rf ~/.openclaw/logs/ima_skills/ (auto-cleanup after 7 days anyway)\n⚠️ Advanced Users: Fork & Modify\n\nIf you need to modify this skill for your use case:\n\nFork the repository (don't modify the original)\nUpdate your fork with your changes\nTest thoroughly with limited API keys\nDocument your changes for troubleshooting\n\nNote: Modified skills may break API compatibility or introduce security issues. 
Official support only covers the unmodified version.\n\n❌ What to AVOID (Security Risks)\n\nActions that could compromise security:\n\n❌ Sharing API keys publicly or in skill files\n❌ Modifying API endpoints to unknown servers\n❌ Disabling SSL/TLS certificate verification\n❌ Logging sensitive user data (prompts, IDs, etc.)\n❌ Bypassing authentication or billing mechanisms\n\nWhy this matters:\n\nAPI Compatibility: Skill logic aligns with IMA Open API schema\nSecurity: Malicious modifications could leak credentials or bypass billing\nSupport: Modified skills may not be supported\nCommunity: Breaking changes affect all users\n📋 Privacy & Data Handling Summary\n\nWhat this skill does with your data:\n\nData Type\tSent to IMA?\tStored Locally?\tUser Control\nPrompts (image/video/music)\t✅ Yes (required for generation)\t❌ No\tNone (required)\nAPI key\t✅ Yes (authentication header)\t❌ No\tSet via env var\nuser_id (optional CLI arg)\t❌ Never (local preference key only)\t✅ Yes (as prefs file key)\tChange --user-id value\nModel preferences\t❌ No\t✅ Yes (~/.openclaw)\tDelete anytime\nGeneration logs\t❌ No\t✅ Yes (~/.openclaw)\tAuto-cleanup 7 days\n\nPrivacy recommendations:\n\nUse test/scoped API keys for initial testing\nNote: --user-id is never sent to IMA servers - it's only used locally as a key for storing preferences in ~/.openclaw/memory/ima_prefs.json\nReview source code at scripts/ima_create.py to verify network calls (search for create_task function)\nRotate API keys after testing or if compromised\n\nGet your IMA API key: Visit https://imastudio.com to register and get started.\n\n🔧 For Skill Maintainers Only\n\nVersion control:\n\nAll changes must go through Git with proper version bumps (semver)\nCHANGELOG.md must document all changes\nProduction deployments require code review\n\nFile checksums (optional):\n\n# Verify skill integrity\nsha256sum SKILL.md scripts/ima_create.py\n\n\nIf users report issues, verify file integrity first.\n\n🧠 User Preference Memory 
(Image)\n\nUser preferences have highest priority when they exist. But preferences are only saved when users explicitly express model preferences — not from automatic model selection.\n\nStorage: ~/.openclaw/memory/ima_prefs.json\n\nSingle file, shared across all IMA skills:\n\n{\n  \"user_{user_id}\": {\n    \"text_to_image\":  { \"model_id\": \"doubao-seedream-4.5\", \"model_name\": \"SeeDream 4.5\", \"credit\": 5,  \"last_used\": \"2026-02-27T03:07:27Z\" },\n    \"image_to_image\": { \"model_id\": \"doubao-seedream-4.5\", \"model_name\": \"SeeDream 4.5\", \"credit\": 5,  \"last_used\": \"2026-02-27T03:07:27Z\" },\n    \"text_to_speech\": { \"model_id\": \"<from product list>\", \"model_name\": \"...\", \"credit\": 2, \"last_used\": \"...\" }\n  }\n}\n\nModel Selection Flow (Image Generation)\n\nStep 1: Get knowledge-ai recommendation (if installed)\n\nknowledge_recommended_model = read_ima_knowledge_ai()  # e.g., \"SeeDream 4.5\"\n\n\nStep 2: Check user preference\n\nuser_pref = load_prefs().get(f\"user_{user_id}\", {}).get(task_type)  # e.g., {\"model_id\": \"midjourney\", ...}\n\n\nStep 3: Decide which model to use\n\nif user_pref exists:\n    use_model = user_pref[\"model_id\"]  # Highest priority\nelse:\n    use_model = knowledge_recommended_model or fallback_default\n\n\nStep 4: Check for mismatch (for later hint)\n\nif user_pref exists and knowledge_recommended_model != user_pref[\"model_id\"]:\n    mismatch = True  # Will add hint in success message\n\nWhen to Write (User Explicit Preference ONLY)\n\n✅ Save preference when user explicitly specifies a model:\n\nUser says\tAction\n用XXX / 换成XXX / 改用XXX\tSwitch to model XXX + save as preference\n以后都用XXX / 默认用XXX / always use XXX\tSave + confirm: ✅ 已记住！以后图片生成默认用 [XXX]\n我喜欢XXX / 我更喜欢XXX\tSave as preference\n\n❌ Do NOT save when:\n\nAgent auto-selects from knowledge-ai → not user preference\nAgent uses fallback default → not user preference\nUser says generic quality requests (see \"Clear Preference\" below) → 
clear preference instead\nWhen to Clear (User Abandons Preference)\n\n🗑️ Clear preference when user wants automatic selection:\n\nUser says\tAction\n用最好的 / 用最合适的 / best / recommended\tClear pref + use knowledge-ai recommendation\n推荐一个 / 你选一个 / 自动选择\tClear pref + use knowledge-ai recommendation\n用默认的 / 用新的\tClear pref + use knowledge-ai recommendation\n试试别的 / 换个试试 (without specific model)\tClear pref + use knowledge-ai recommendation\n重新推荐\tClear pref + use knowledge-ai recommendation\n\nImplementation:\n\ndel prefs[f\"user_{user_id}\"][task_type]\nsave_prefs(prefs)\n\n⭐ Model Selection Priority (Image)\n\nSelection flow:\n\nUser preference (if exists) → Highest priority, always respect\nima-knowledge-ai skill (if installed) → Professional recommendation based on task\nFallback defaults → Use table below (only if neither 1 nor 2 exists)\n\nImportant notes:\n\nUser preference is only saved when user explicitly specifies a model (see \"When to Write\" above)\nKnowledge-ai is always consulted (even when user pref exists) to detect mismatches\nWhen mismatch detected → add gentle hint in success message (does NOT interrupt generation)\n\nThe defaults below are FALLBACK only. User preferences have highest priority, then knowledge-ai recommendations.\n\nWhen using user preference for image generation, show a line like:\n\n🎨 根据你的使用习惯，将用 [Model Name] 帮你生成…\n• 模型：[Model Name]（你的常用模型）\n• 预计耗时：[X ~ Y 秒]\n• 消耗积分：[N pts]\n\nPreference Change Confirmation\n\nWhen user switches to a different model than their saved preference:\n\n💡 你之前喜欢用 [Old Model]，这次换成了 [New Model]。\n要把 [New Model] 设为以后的默认吗？\n回复「是」保存 / 回复「否」仅本次使用\n\n⭐ Recommended Defaults\n\nThese are fallback defaults — only used when no user preference exists.\nAlways default to the newest and most popular model. 
Do NOT default to the cheapest.\n\nTask Type\tDefault Model\tmodel_id\tversion_id\tCost\tWhy\ntext_to_image\tSeeDream 4.5\tdoubao-seedream-4.5\tdoubao-seedream-4-5-251128\t5 pts\tLatest doubao flagship, photorealistic 4K\ntext_to_image (budget)\tNano Banana2\tgemini-3.1-flash-image\tgemini-3.1-flash-image\t4 pts\tFastest and cheapest option\ntext_to_image (premium)\tNano Banana Pro\tgemini-3-pro-image\tgemini-3-pro-image-preview\t10/10/18 pts\tPremium quality, 1K/2K/4K options\ntext_to_image (artistic)\tMidjourney 🎨\tmidjourney\tv6\t8/10 pts\tArtist-level aesthetics, creative styles\nimage_to_image\tSeeDream 4.5\tdoubao-seedream-4.5\tdoubao-seedream-4-5-251128\t5 pts\tLatest, best i2i quality\nimage_to_image (budget)\tNano Banana2\tgemini-3.1-flash-image\tgemini-3.1-flash-image\t4 pts\tCheapest option\nimage_to_image (premium)\tNano Banana Pro\tgemini-3-pro-image\tgemini-3-pro-image-preview\t10 pts\tPremium quality\nimage_to_image (artistic)\tMidjourney 🎨\tmidjourney\tv6\t8/10 pts\tArtist-level aesthetics, style transfer\ntext_to_video\tWan 2.6\twan2.6-t2v\twan2.6-t2v\t25 pts\t🔥 Most popular t2v, balanced cost\ntext_to_video (premium)\tHailuo 2.3\tMiniMax-Hailuo-2.3\tMiniMax-Hailuo-2.3\t38 pts\tHigher quality\ntext_to_video (budget)\tVidu Q2\tviduq2\tviduq2\t5 pts\tLowest cost t2v\nimage_to_video\tWan 2.6\twan2.6-i2v\twan2.6-i2v\t25 pts\t🔥 Most popular i2v, 1080P\nimage_to_video (premium)\tKling 2.6\tkling-v2-6\tkling-v2-6\t40-160 pts\tPremium Kling i2v\nfirst_last_frame_to_video\tKling O1\tkling-video-o1\tkling-video-o1\t48 pts\tNewest Kling reasoning model\nreference_image_to_video\tKling O1\tkling-video-o1\tkling-video-o1\t48 pts\tBest reference fidelity\ntext_to_music\tSuno (sonic-v4)\tsonic\tsonic\t25 pts\tLatest Suno engine, best quality\ntext_to_speech\t(query product list)\t—\t—\t—\tRun --task-type text_to_speech --list-models; use first or user-preferred model_id\n\nPremium options:\n\nImage: Nano Banana Pro — Highest quality with size control (1K/2K/4K), 
higher cost (10-18 pts for text_to_image, 10 pts for image_to_image)\nVideo: Kling O1, Sora 2 Pro, Google Veo 3.1 — Premium quality with longer duration options\n\nQuick selection guide (production as of 2026-02-27, sorted by popularity):\n\nImage (4 models available) → SeeDream 4.5 (5, default); artistic → Midjourney 🎨 (8-10); budget → Nano Banana2 (4, 512px); premium → Nano Banana Pro (10-18)\n🔥 Video from text (most popular) → Wan 2.6 (25, balanced); premium → Hailuo 2.3 (38); budget → Vidu Q2 (5)\n🔥 Video from image (most popular) → Wan 2.6 (25)\nMusic → Suno (25); DouBao BGM/Song (30 each)\nCheapest → Nano Banana2 512px (4) for image; Vidu Q2 (5) for video\n\nSelection guide by use case:\n\nImage Generation:\n\nGeneral image generation → SeeDream 4.5 (5pts)\nCustom aspect ratio (16:9, 9:16, 4:3, etc.) → SeeDream 4.5 🌟 or Nano Banana Pro/2 🆕 (native support)\nBudget-conscious / fast generation → Nano Banana2 (4pts)\nHighest quality with size control (1K/2K/4K) → Nano Banana Pro (text_to_image: 10-18pts, image_to_image: 10pts)\nArtistic/creative styles, illustrations, paintings → Midjourney 🎨 (8-10pts)\nStyle transfer / image editing → SeeDream 4.5 (5pts) or Midjourney 🎨 (artistic)\n\nVideo Generation:\n\nGeneral video generation → Wan 2.6 (25pts, most popular)\nPremium cinematic quality → Google Veo 3.1 (70-330pts) or Sora 2 Pro (122+pts)\nBudget video → Vidu Q2 (5pts) or Hailuo 2.0 (5pts)\nWith audio support → Kling O1 (48+pts) or Google Veo 3.1 (70+pts)\nFirst/last frame animation → Kling O1 (48+pts)\nReference image consistency → Kling O1 (48+pts) or Google Veo 3.1 (70+pts)\n\nMusic Generation:\n\nCustom song with lyrics, vocals, style → Suno sonic-v5 (25pts, default, ~2min)\nFull control: custom_mode, lyrics, vocal_gender, tags, negative_tags\nBest for: complete songs, vocal tracks, artistic compositions\nBackground music / ambient loop → DouBao BGM (30pts, ~30s)\nSimplified: prompt-only, no advanced parameters\nBest for: video backgrounds, ambient music, 
short loops\nSimple song generation → DouBao Song (30pts, ~30s)\nSimplified: prompt-only\nBest for: quick song generation, structured vocal compositions\nUser explicitly asks for cheapest → DouBao BGM/Song (6pts each) — only if explicitly requested\n\nSpeech (TTS) Generation:\n\nText-to-speech / 语音合成 / 朗读 → text_to_speech. Always query GET /open/v1/product/list?category=text_to_speech (or --list-models) to get current model_id and credit. No fixed default; use first available or user preference. Voice/speed/format parameters: see \"Model and parameter parsing\" (TTS table) and \"Speech (TTS) — text_to_speech\" in this document.\n\n⚠️ Technical Note for Suno:\n\nmodel_version inside parameters.parameters (e.g., \"sonic-v5\") is different from the outer model_version field (which is \"sonic\"). Always set both correctly when creating Suno tasks.\n\n⚠️ Production Image Models (4 available):\n\nSeeDream 4.5 (doubao-seedream-4.5) — 5 pts, default\nMidjourney 🎨 (midjourney) — 8/10 pts for 480p/720p, artistic styles\nNano Banana2 (gemini-3.1-flash-image) — 4/6/10/13 pts for 512px/1K/2K/4K\nNano Banana Pro (gemini-3-pro-image) — 10/10/18 pts for 1K/2K/4K\n\nAll other image models mentioned in older documentation are no longer available in production.\n\n🌟 Parameter Support Notes (All Task Types):\n\nImage Models (text_to_image / image_to_image)\n\n🆕 MAJOR UPDATE: Nano Banana series now has NATIVE aspect_ratio support!\n\nNano Banana Pro: ✅ Supports aspect_ratio (1:1, 16:9, 9:16, 4:3, 3:4) NATIVELY\nNano Banana2: ✅ Supports aspect_ratio (1:1, 16:9, 9:16, 4:3, 3:4) NATIVELY\nSeeDream 4.5: ✅ Supports 8 ratios via virtual params (1:1, 16:9, 9:16, 4:3, 3:4, 2:3, 3:2, 21:9)\nMidjourney: ❌ 1:1 only (fixed 1024x1024)\n\naspect_ratio support details:\n\n✅ aspect_ratio:\nSeeDream 4.5: ✅ Supports 8 ratios via virtual params (1:1, 16:9, 9:16, 4:3, 3:4, 2:3, 3:2, 21:9)\nNano Banana2: ✅ Native support for 5 ratios (1:1, 16:9, 9:16, 4:3, 3:4)\nNano Banana Pro: ✅ Native support for 5 
ratios (1:1, 16:9, 9:16, 4:3, 3:4)\nMidjourney: ❌ 1:1 only (fixed 1024x1024)\n✅ size:\nNano Banana2: 512px, 1K, 2K, 4K (via different attribute_ids, 4-13 pts)\nNano Banana Pro: 1K, 2K, 4K (via different attribute_ids, 10-18 pts)\nSeeDream 4.5: Adaptive default (5 pts)\nMidjourney: 480p/720p (via attribute_id, 8/10 pts)\n❌ 8K: No model supports 8K (max is 4K via Nano Banana Pro)\n❌ Non-standard aspect ratios (7:3, 8:5, etc.): Not supported. Use closest supported ratio or video models.\n✅ n: Multiple outputs supported (1-4), credit × n\n\nWhen user requests unsupported combinations for images:\n\nMidjourney + aspect_ratio (16:9, etc.): Recommend SeeDream 4.5 or Nano Banana series instead\n❌ Midjourney 暂不支持自定义 aspect_ratio（仅支持 1024x1024 方形）\n\n✅ 推荐方案：\n  1. SeeDream 4.5（支持虚拟参数 aspect_ratio）\n     • 支持比例：1:1, 16:9, 9:16, 4:3, 3:4, 2:3, 3:2, 21:9\n     • 成本：5 积分（性价比最佳）\n  2. Nano Banana Pro/2（原生支持 aspect_ratio）\n     • 支持比例：1:1, 16:9, 9:16, 4:3, 3:4\n     • 成本：4-18 积分（按尺寸）\n\n需要我帮你用 SeeDream 4.5 生成吗？\n\nAny model + 8K: Inform user no model supports 8K, max is 4K (Nano Banana Pro)\nAny model + non-standard ratio (7:3, 8:5, etc.): Non-standard ratio, not supported. 
Suggest closest supported ratio (e.g., 21:9 for ultra-wide, 2:3 for portrait)\nVideo Models (text_to_video / image_to_video / first_last_frame / reference_image)\n✅ resolution: 540P, 720P, 1080P, 2K, 4K (model-dependent, higher res = higher cost)\n✅ aspect_ratio: 16:9, 9:16, 1:1, 4:3 (model-dependent, check form_config)\n✅ duration: 4s, 5s, 10s, 15s (model-dependent, longer = higher cost)\n⚠️ generate_audio: Supported by Veo 3.1, Kling O1, Hailuo (check form_config)\n✅ prompt_extend: AI-powered prompt enhancement (most models support)\n✅ negative_prompt: Content exclusion (most models support)\n✅ shot_type: Single/multi-shot control (model-dependent)\n✅ seed: Reproducibility control (most models support, -1 = random)\n✅ n: Multiple outputs (1-4), credit × n\n🆕 Special Case: Pixverse Model Parameter (v1.0.7+)\n\nAuto-Inference Logic for Pixverse V5.5/V5/V4:\n\nProblem: Pixverse V5.5, V5, V4 lack model field in form_config from Product List API\nBackend Requirement: Backend requires model parameter (e.g., \"v5.5\", \"v5\", \"v4\")\nAuto-Fix: System automatically extracts version from model_name and injects it\nExample: model_name: \"Pixverse V5.5\" → auto-inject model: \"v5.5\"\nExample: model_name: \"Pixverse V4\" → auto-inject model: \"v4\"\nNote: V4.5 and V3.5 include model in form_config (no auto-inference needed)\nRelevant Task Types: All video modes (text_to_video, image_to_video, first_last_frame_to_video, reference_image_to_video)\n\nError Prevention:\n\nWithout auto-inference: err_code=400017 err_msg=Invalid value for model\nWith auto-inference (v1.0.7+): Pixverse V5.5/V5/V4 work seamlessly ✅\nMusic Models (text_to_music)\n\nSuno sonic-v5 (Full-Featured):\n\n✅ custom_mode: Suno only (enables vocal_gender, lyrics, tags support)\n✅ vocal_gender: Suno only (male/female/mixed, requires custom_mode=True)\n✅ lyrics: Suno only (custom lyrics support, requires custom_mode=True)\n✅ make_instrumental: Suno only (force instrumental, no vocals)\n✅ auto_lyrics: Suno only 
(AI-generated lyrics)\n✅ tags: Suno only (genre/style tags)\n✅ negative_tags: Suno only (exclude unwanted styles)\n✅ title: Suno only (song title)\n❌ duration: Fixed-length output (DouBao ~30s, Suno ~2min, not user-controllable)\n✅ n: Multiple outputs supported (1-2), credit × n\n\nDouBao BGM/Song (Simplified):\n\n✅ prompt: Text description only\n❌ No advanced parameters (no custom_mode, lyrics, vocal control)\n❌ duration: Fixed ~30s output\n\n🎵 Suno Prompt Writing Guide (for gpt_description_prompt):\n\nWhen using Suno, structure your prompt with these elements:\n\nGenre/Style:\n\nExamples: \"lo-fi hip hop\", \"orchestral cinematic\", \"upbeat pop\", \"dark ambient\", \"indie folk\", \"electronic dance\"\n\nTempo/BPM:\n\nExamples: \"80 BPM\", \"fast tempo\", \"slow ballad\", \"moderate pace 110 BPM\"\n\nVocals Control:\n\nNo vocals: \"no vocals\" → set make_instrumental=true\nWith vocals: \"female vocals\" → set vocal_gender=\"female\"\nMale vocals: \"male vocals\" → set vocal_gender=\"male\"\nMixed: Set vocal_gender=\"mixed\"\n\nMood/Emotion:\n\nExamples: \"happy and energetic\", \"melancholic\", \"tense and dramatic\", \"peaceful and calming\"\n\nNegative Tags (exclude styles):\n\nUse negative_tags: \"heavy metal, distortion, screaming\" to exclude unwanted elements\n\nDuration Hint:\n\nExamples: \"60 seconds\", \"30 second loop\", \"2 minute track\"\nNote: Suno typically generates ~2min, not strictly controllable\n\nExample Suno prompts:\n\n\"upbeat lo-fi hip hop, 90 BPM, no vocals, relaxed and chill\"\n→ Set: make_instrumental=true\n\n\"emotional pop ballad, slow tempo, female vocals, melancholic\"\n→ Set: vocal_gender=\"female\"\n\n\"orchestral cinematic trailer music, epic and dramatic, 120 BPM, no vocals\"\n→ Set: make_instrumental=true, tags=\"orchestral,cinematic,epic\"\n\n\"acoustic indie folk, gentle guitar, male vocals, warm and nostalgic\"\n→ Set: vocal_gender=\"male\", tags=\"acoustic,indie,folk\"\n\n\n⚠️ Technical Note for Suno:\n\nmodel_version 
inside parameters.parameters (e.g., \"sonic-v5\") is different from the outer model_version field (which is \"sonic\"). Always set both correctly.\n\nCommon Parameter Patterns\nn (batch generation): Supported by ALL models. Cost = base_credit × n. Creates n independent resources.\nseed: Supported by most models (-1 = random, >0 = reproducible results)\nprompt_extend: AI-powered prompt enhancement (video models only)\nDecision Tree: When User Requests Unsupported Features\nUser asks for custom aspect ratio image (e.g. \"7:3 landscape\")\n  → ❌ Image models don't support custom ratios\n  → ✅ Solution: \"图片模型不支持自定义比例。建议用视频模型(Wan 2.6 t2v)生成16:9视频，然后截取首帧作为图片。\"\n\nUser asks for 8K image\n  → ❌ No model supports 8K\n  → ✅ Solution: \"当前最高支持4K分辨率(Nano Banana Pro，18积分)。要使用吗？\"\n\nUser asks for video with audio\n  → Check model: Veo 3.1 / Kling O1 / Hailuo have generate_audio\n  → ✅ Solution: \"Veo 3.1 和 Kling O1 支持音频生成(需在参数中设置 generate_audio=True)。要用哪个？\"\n\nUser asks for long music (e.g. \"5 minute track\")\n  → ❌ Duration not user-controllable\n  → ✅ Solution: \"Suno 生成约2分钟音乐。需要更长时长可以生成多段后拼接。\"\n\nUser asks for 30s video\n  → Check model: Most models max 15s\n  → ✅ Solution: \"当前最长15秒。可选模型：Wan 2.6(15s, 75积分), Kling O1(10s, 96积分)。\"\n\n\nWhen user requests unsupported combinations:\n\nVideo + audio (unsupported model) → \"该模型不支持音频。建议用 Veo 3.1 或 Kling O1 (支持 generate_audio 参数)\"\nMusic + custom duration → \"音乐时长由模型固定(Suno约2分钟,DouBao约30秒),无法自定义\"\nVideo duration > 15s → \"当前最长15秒。可选模型：Wan 2.6(15s, 75积分), Kling O1(10s, 96积分)\"\n\nNote: Image-specific unsupported combinations (Midjourney + aspect_ratio, 8K, non-standard ratios) are documented in the \"Image Models\" section above.\n\n🧠 User Preference Memory (Video)\n\nUser preferences have highest priority when they exist. 
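The priority rules are the same ones used in the Image section; a minimal in-memory sketch (the real skill persists this dict to ~/.openclaw/memory/ima_prefs.json, and the function names here are illustrative, not the script's actual API):

```python
# In-memory sketch of preference priority: user pref > knowledge-ai > fallback.
# The real skill stores this dict in ~/.openclaw/memory/ima_prefs.json,
# keyed as "user_{user_id}" -> task_type -> {"model_id": ..., ...}.

def choose_model(prefs, user_id, task_type, knowledge_rec=None, fallback=None):
    """Return (model_id, mismatch) — mismatch flags a knowledge-ai hint."""
    pref = prefs.get(f"user_{user_id}", {}).get(task_type)
    if pref:
        mismatch = knowledge_rec is not None and knowledge_rec != pref["model_id"]
        return pref["model_id"], mismatch
    return (knowledge_rec or fallback), False

def save_pref(prefs, user_id, task_type, model_id):
    """Write ONLY when the user explicitly names a model (e.g. 以后都用XXX)."""
    prefs.setdefault(f"user_{user_id}", {})[task_type] = {"model_id": model_id}

def clear_pref(prefs, user_id, task_type):
    """Clear when the user asks for automatic selection (e.g. 用最好的)."""
    prefs.get(f"user_{user_id}", {}).pop(task_type, None)
```

A mismatch never blocks generation; it only adds a gentle hint to the success message.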
But preferences are only saved when users explicitly express model preferences — not from automatic model selection.\n\nStorage: ~/.openclaw/memory/ima_prefs.json\n{\n  \"user_{user_id}\": {\n    \"text_to_video\":              { \"model_id\": \"wan2.6-t2v\",      \"model_name\": \"Wan 2.6\",  \"credit\": 25, \"last_used\": \"...\" },\n    \"image_to_video\":             { \"model_id\": \"wan2.6-i2v\",      \"model_name\": \"Wan 2.6\",  \"credit\": 25, \"last_used\": \"...\" },\n    \"first_last_frame_to_video\":  { \"model_id\": \"kling-video-o1\", \"model_name\": \"Kling O1\", \"credit\": 48, \"last_used\": \"...\" },\n    \"reference_image_to_video\":   { \"model_id\": \"kling-video-o1\", \"model_name\": \"Kling O1\", \"credit\": 48, \"last_used\": \"...\" }\n  }\n}\n\nModel Selection Flow (Video Generation)\n\nStep 1: Get knowledge-ai recommendation (if installed)\n\nknowledge_recommended_model = read_ima_knowledge_ai()  # e.g., \"Wan 2.6\"\n\n\nStep 2: Check user preference\n\nuser_pref = load_prefs().get(f\"user_{user_id}\", {}).get(task_type)  # e.g., {\"model_id\": \"kling-video-o1\", ...}\n\n\nStep 3: Decide which model to use\n\nif user_pref exists:\n    use_model = user_pref[\"model_id\"]  # Highest priority\nelse:\n    use_model = knowledge_recommended_model or fallback_default\n\n\nStep 4: Check for mismatch (for later hint)\n\nif user_pref exists and knowledge_recommended_model != user_pref[\"model_id\"]:\n    mismatch = True  # Will add hint in success message\n\nWhen to Write (User Explicit Preference ONLY)\n\n✅ Save preference when user explicitly specifies a model:\n\nUser says\tAction\n用XXX / 换成XXX / 改用XXX\tSwitch to model XXX + save as preference\n以后都用XXX / 默认用XXX / always use XXX\tSave + confirm: ✅ 已记住！以后视频生成默认用 [XXX]\n我喜欢XXX / 我更喜欢XXX\tSave as preference\n\n❌ Do NOT save when:\n\nAgent auto-selects from knowledge-ai → not user preference\nAgent uses fallback default → not user preference\nUser says generic quality requests (see \"Clear 
Preference\" below) → clear preference instead\nWhen to Clear (User Abandons Preference)\n\n🗑️ Clear preference when user wants automatic selection:\n\nUser says\tAction\n用最好的 / 用最合适的 / best / recommended\tClear pref + use knowledge-ai recommendation\n推荐一个 / 你选一个 / 自动选择\tClear pref + use knowledge-ai recommendation\n用默认的 / 用新的\tClear pref + use knowledge-ai recommendation\n试试别的 / 换个试试 (without specific model)\tClear pref + use knowledge-ai recommendation\n重新推荐\tClear pref + use knowledge-ai recommendation\n\nImplementation:\n\ndel prefs[f\"user_{user_id}\"][task_type]\nsave_prefs(prefs)\n\n⭐ Model Selection Priority (Video)\n\nSelection flow:\n\n1. User preference (if exists) → Highest priority, always respect\n2. ima-knowledge-ai skill (if installed) → Professional recommendation based on task\n3. Fallback defaults → Use table below (only if neither 1 nor 2 exists)\n\nImportant notes:\n\nUser preference is only saved when user explicitly specifies a model (see \"When to Write\" above)\nKnowledge-ai is always consulted (even when user pref exists) to detect mismatches\nWhen mismatch detected → add gentle hint in success message (does NOT interrupt generation)\n\nThe defaults below are FALLBACK only. User preferences have highest priority, then knowledge-ai recommendations.\n\n💬 User Experience Protocol (IM / Feishu / Discord) v2.0 🆕\n\nv2.0 Updates (aligned with ima-image-ai v1.3):\n\nAdded Step 0 for correct message ordering (fixes group chat bug)\nAdded Step 5 for explicit task completion\nEnhanced Midjourney support with proper timing estimates\nNow 6 steps total (0-5): Acknowledgment → Pre-Gen → Progress → Success/Failure → Done\n\nThis skill runs inside IM platforms (Feishu, Discord via OpenClaw).\nGeneration takes from 10 seconds (music) up to 6 minutes (video). Never let users wait in silence.\nAlways follow all 6 steps below, every single time.\n\n🚫 Never Say to Users\n\nThe following are internal implementation details.
Never mention them in any user-facing message, under any circumstances:\n\n❌ Never say\t✅ What users care about\nima_create.py / 脚本 / script\t—\n自动化脚本 / automation script\t—\n自动处理产品列表查询\t—\n自动解析参数和配置\t—\n智能轮询 / polling / 轮询\t—\nproduct list / 商品列表接口\t—\nattribute_id / model_version / form_config\t—\nAPI 调用 / HTTP 请求\t—\n任何技术参数名\t模型名称、积分、生成时间\n\nUser messages must only contain: model name, estimated/actual time, credits consumed, result URL, and natural language status updates.\n\nEstimated Generation Time (All Task Types)\nTask Type\tModel\tEstimated Time\tPoll Every\tSend Progress Every\ntext_to_image\tSeeDream 4.5\t25~60s\t5s\t20s\n\tNano Banana2 💚\t20~40s\t5s\t15s\n\tNano Banana Pro\t60~120s\t5s\t30s\n\tMidjourney 🎨\t40~90s\t8s\t25s\nimage_to_image\tSeeDream 4.5\t25~60s\t5s\t20s\n\tNano Banana2 💚\t20~40s\t5s\t15s\n\tNano Banana Pro\t60~120s\t5s\t30s\n\tMidjourney 🎨\t40~90s\t8s\t25s\ntext_to_video\tWan 2.6, Hailuo 2.0/2.3, Vidu Q2, Pixverse\t60~120s\t8s\t30s\n\tSeeDance 1.5 Pro, Kling 2.6, Veo 3.1\t90~180s\t8s\t40s\n\tKling O1, Sora 2 Pro\t180~360s\t8s\t60s\nimage_to_video\tSame ranges as text_to_video\t—\t8s\t40s\nfirst_last_frame / reference\tKling O1, Veo 3.1\t180~360s\t8s\t60s\ntext_to_music\tDouBao BGM / Song\t10~25s\t5s\t10s\n\tSuno (sonic-v5)\t20~45s\t5s\t15s\ntext_to_speech\t(varies by model)\t5~30s\t3s\t10s\n\nestimated_max_seconds = upper bound of the range (e.g. 60 for SeeDream 4.5, 40 for Nano Banana2, 120 for Nano Banana Pro, 90 for Midjourney, 180 for Kling 2.6, 360 for Kling O1).\n\nStep 0 — Initial Acknowledgment Reply (Normal Reply) 🆕\n\n⚠️ CRITICAL: This step is essential for correct message ordering in IM platforms (Feishu, Discord).\n\nBefore doing anything else, reply to the user with a friendly acknowledgment message using your normal reply (not message tool). This reply will automatically appear FIRST in the conversation.\n\nExample acknowledgment messages:\n\nFor images:\n\n好的!来帮你画一只萌萌的猫咪 🐱\n\n收到！马上为你生成一张 16:9 的风景照 🏔️\n\nOK! 
Starting image generation with SeeDream 4.5 🎨\n\n\nFor videos:\n\n好的!来帮你生成一段视频 🎬\n\n收到！开始用 Wan 2.6 生成视频 🎥\n\n\nFor music:\n\n好的!来帮你创作一首音乐 🎵\n\n\nRules:\n\nKeep it short and warm (< 15 words)\nMatch the user's language (Chinese/English)\nInclude relevant emoji (🐱/🎨/🎬/🎵/✨)\nThis is your ONLY normal reply — all subsequent updates use message tool\n\nWhy this matters:\n\nNormal replies automatically appear FIRST in the conversation thread\nmessage tool pushes appear in chronological order AFTER your initial reply\nThis ensures users see: \"好的!\" → \"🎨 开始生成...\" → \"⏳ 进度...\" → \"✅ 成功!\" (correct order)\nWithout Step 0, the confirmation might appear LAST, confusing users\nStep 1 — Pre-Generation Notification (Push via message tool)\n\nAfter Step 0 reply, use the message tool to push a notification immediately:\n\n[Emoji] 开始生成 [内容类型]，请稍候…\n• 模型：[Model Name]\n• 预计耗时：[X ~ Y 秒]\n• 消耗积分：[N pts]\n\n\nEmoji by content type:\n\n图片 → 🎨\n视频 → 🎬（加注:视频生成需要较长时间，我会定时汇报进度）\n音乐 → 🎵\n\nCost transparency (new requirement):\n\nAlways show credit cost with model tier context\nFor expensive models (>50 pts), offer cheaper alternative proactively\nExamples:\nBalanced (default): \"使用 Wan 2.6（25 积分，最新 Wan）\"\nPremium (user explicit): \"使用高端模型 Kling O1（48-120 积分），质量最佳\"\nPremium (auto-selected): \"使用 Wan 2.6（25 积分）。若需更高质量可选 Kling O1（48 积分起）\"\nBudget (user asked): \"使用 Vidu Q2（5 积分，最省钱）\"\n\nAdapt language to match the user (Chinese / English). For video, always add a note that it takes longer. 
For expensive models, always mention cheaper alternatives unless user explicitly requested premium.\n\nStep 2 — Progress Updates\n\nPoll the task detail API every [Poll Every] seconds per the table.\nSend a progress update every [Send Progress Every] seconds.\n\n⏳ 正在生成中… [P]%\n已等待 [elapsed]s，预计最长 [max]s\n\n\nProgress formula:\n\nP = min(95, floor(elapsed_seconds / estimated_max_seconds * 100))\n\nCap at 95% — never reach 100% until the API confirms success\nIf elapsed > estimated_max: freeze at 95%, append 「快了，稍等一下…」\nFor video with max=360s: at 120s → 33%, at 250s → 69%, at 400s → 95% (frozen)\nStep 3 — Success Notification\n\nWhen task status = success:\n\nFor Video Tasks (text_to_video / image_to_video / first_last_frame / reference_image)\n\n3.1 Send video player first (IM platforms like Feishu will render inline player):\n\n# Get result URL from script output or task detail API\nresult = get_task_result(task_id)\nvideo_url = result[\"medias\"][0][\"url\"]\n\n# Build caption\ncaption = f\"\"\"✅ 视频生成成功！\n• 模型：[Model Name]\n• 耗时：预计 [X~Y]s，实际 [actual]s\n• 消耗积分：[N pts]\n\n[视频描述]\"\"\"\n\n# Add mismatch hint if user pref conflicts with knowledge-ai recommendation\nif user_pref_exists and knowledge_recommended_model != used_model:\n    caption += f\"\"\"\n\n💡 提示：当前任务也许用 {knowledge_recommended_model} 也会不错（{reason}，{cost} pts）\"\"\"\n\n# Send video with caption (use message tool if available)\nmessage(\n    action=\"send\",\n    media=video_url,  # ⚠️ Use HTTPS URL directly, NOT local file path\n    caption=caption\n)\n\n\nImportant:\n\nHint is non-intrusive — does NOT interrupt generation\nOnly shown when user pref conflicts with knowledge-ai recommendation\nUser can ignore the hint; video is already delivered\n\n3.2 Then send link as text (for copying/sharing):\n\n# Send link message immediately after video\nmessage(action=\"send\", text=f\"🔗 视频链接（可复制分享）：\\n{video_url}\")\n\n\n⚠️ Critical for video:\n\nSend video player FIRST (inline preview)\nSend text link SECOND 
(for copying)\nInclude first-frame thumbnail URL if available: result[\"medias\"][0][\"cover\"]\nFor Image Tasks (text_to_image / image_to_image)\n# Build caption\ncaption = f\"\"\"✅ 图片生成成功！\n• 模型：[Model Name]\n• 耗时：预计 [X~Y]s，实际 [actual]s\n• 消耗积分：[N pts]\n\n🔗 原始链接：{image_url}\"\"\"\n\n# Add mismatch hint if user pref conflicts with knowledge-ai recommendation\nif user_pref_exists and knowledge_recommended_model != used_model:\n    caption += f\"\"\"\n\n💡 提示：当前任务也许用 {knowledge_recommended_model} 也会不错（{reason}，{cost} pts）\"\"\"\n\n# Send image with caption\nmessage(\n    action=\"send\",\n    media=image_url,\n    caption=caption\n)\n\n\nImportant:\n\nHint is non-intrusive — does NOT interrupt generation\nOnly shown when user pref conflicts with knowledge-ai recommendation\nUser can ignore the hint; image is already delivered\nFor Music Tasks (text_to_music)\n\nSend audio file with player:\n\n✅ 音乐生成成功！\n• 模型：[Model Name]\n• 耗时：预计 [X~Y]s，实际 [actual]s\n• 消耗积分：[N pts]\n• 时长：约 [duration]\n\n[音频URL或直接发送音频文件]\n\nFor TTS Tasks (text_to_speech) — Full UX Protocol (Steps 0–5)\n\nStep 0 — Initial acknowledgment (normal reply)\nFirst reply with a short acknowledgment, e.g.: 好的，正在帮你把这段文字转成语音。 / OK, converting this text to speech.\n\nStep 1 — Pre-generation (message tool)\nPush once:\n\n🔊 开始语音合成，请稍候…\n• 模型：[Model Name]\n• 预计耗时：[X ~ Y 秒]\n• 消耗积分：[N pts]\n\n\nStep 2 — Progress\nPoll every 2–5s. Every 10–15s send: ⏳ 语音合成中… [P]%，已等待 [elapsed]s，预计最长 [max]s. Cap progress at 95% until API returns success.\n\nStep 3 — Success (message tool)\nWhen resource_status == 1 and status != \"failed\", send media = medias[0].url and caption:\n\n✅ 语音合成成功！\n• 模型：[Model Name]\n• 耗时：实际 [actual]s\n• 消耗积分：[N pts]\n🔗 原始链接：[url]\n\n\nUse the URL from the API (do not use local file paths).\n\nStep 4 — Failure (message tool)\nOn failure, send user-friendly message. 
TTS error translation (do not expose raw API errors):\n\nTechnical\t✅ Say (CN)\t✅ Say (EN)\n401 Unauthorized\t密钥无效或未授权，请至 imaclaw.ai 生成新密钥\tAPI key invalid; generate at imaclaw.ai\n4008 Insufficient points\t积分不足，请至 imaclaw.ai 购买积分\tInsufficient points; buy at imaclaw.ai\nInvalid product attribute\t参数配置异常，请稍后重试\tConfiguration error, try again later\nError 6006 / 6010\t积分或参数不匹配，请换模型或重试\tPoints/params mismatch, try another model\nresource_status == 2 / status failed\t语音合成失败，建议换模型或缩短文本\tSynthesis failed, try another model or shorter text\ntimeout\t合成超时，请稍后重试\tTimed out, try again later\nNetwork error\t网络不稳定，请检查后重试\tNetwork unstable, check and retry\nText too long (TTS)\t文本过长，请缩短后重试\tText too long, please shorten\n\nLinks: API key — https://www.imaclaw.ai/imaclaw/apikey ；Credits — https://www.imaclaw.ai/imaclaw/subscription\n\nStep 5 — Done\nAfter Step 0–4, no further reply needed. Do not send duplicate confirmations.\n\nStep 4 — Failure Notification\n\nWhen task status = failed or any API/network error, send:\n\n❌ [内容类型]生成失败\n• 原因：[natural_language_error_message]\n• 建议改用：\n  - [Alt Model 1]（[特点]，[N pts]）\n  - [Alt Model 2]（[特点]，[N pts]）\n\n需要我帮你用其他模型重试吗？\n\n\n⚠️ CRITICAL: Error Message Translation\n\nNEVER show technical error messages to users. 
Always translate API errors into natural language.\nAPI key & credits: 密钥与积分管理入口为 imaclaw.ai（与 imastudio.com 同属 IMA 平台）。Key and subscription management: imaclaw.ai (same IMA platform as imastudio.com).\n\nTechnical Error\t❌ Never Say\t✅ Say Instead (Chinese)\t✅ Say Instead (English)\n401 Unauthorized 🆕\tInvalid API key / 401 Unauthorized\t❌ API密钥无效或未授权<br>💡 生成新密钥: https://www.imaclaw.ai/imaclaw/apikey\t❌ API key is invalid or unauthorized<br>💡 Generate API Key: https://www.imaclaw.ai/imaclaw/apikey\n4008 Insufficient points 🆕\tInsufficient points / Error 4008\t❌ 积分不足，无法创建任务<br>💡 购买积分: https://www.imaclaw.ai/imaclaw/subscription\t❌ Insufficient points to create this task<br>💡 Buy Credits: https://www.imaclaw.ai/imaclaw/subscription\n\"Invalid product attribute\" / \"Insufficient points\"\tInvalid product attribute\t生成参数配置异常，请稍后重试\tConfiguration error, please try again later\nError 6006 (credit mismatch)\tError 6006\t积分计算异常，系统正在修复\tPoints calculation error, system is fixing\nError 6009 (no matching rule)\tError 6009\t参数组合不匹配，已自动调整\tParameter mismatch, auto-adjusted\nError 6010 (attribute_id mismatch)\tAttribute ID does not match\t模型参数不匹配，请尝试其他模型\tModel parameters incompatible, try another model\nerror 400 (bad request)\terror 400 / Bad request\t请求参数有误，请稍后重试\tInvalid request parameters, please try again\nresource_status == 2\tResource status 2 / Failed\t生成过程遇到问题，建议换个模型试试\tGeneration failed, please try another model\nstatus == \"failed\" (no details)\tTask failed\t这次生成没成功，要不换个模型试试？\tGeneration unsuccessful, try a different model?\ntimeout\tTask timed out / Timeout error\t生成时间过长已超时，建议用更快的模型\tGeneration took too long, try a faster model\nNetwork error / Connection refused\tConnection refused / Network error\t网络连接不稳定，请检查网络后重试\tNetwork connection unstable, check network and retry\nRate limit exceeded\t429 Too Many Requests / Rate limit\t请求过于频繁，请稍等片刻再试\tToo many requests, please wait a moment\nPrompt moderation (Sora only)\tContent policy violation\t提示词包含敏感内容，请修改后重试\tPrompt 
contains restricted content, please modify\nModel unavailable\tModel not available / 503 Service Unavailable\t当前模型暂时不可用，建议换个模型\tModel temporarily unavailable, try another model\nLyrics format error (Suno only) 🎵\tInvalid lyrics format\t歌词格式有误，请调整后重试\tLyrics format error, adjust and retry\nPrompt too short/long (Music) 🎵\tPrompt length invalid\t音乐描述过短或过长，请调整到合适长度 (建议20-100字)\tMusic description too short or long, adjust to appropriate length (20-100 chars recommended)\nText too long (TTS) 🔊\tTTS text length\t文本过长，请缩短后重试\tText too long, please shorten and retry\n\nGeneric fallback (when error is unknown):\n\nChinese: 生成过程遇到问题，请稍后重试或换个模型试试\nEnglish: Generation encountered an issue, please try again or use another model\n\nBest Practices:\n\nFocus on user action: Tell users what to do next, not what went wrong technically\nBe reassuring: Use phrases like \"建议换个模型试试\" instead of \"失败了\"\nAvoid blame: Never say \"你的提示词有问题\" → say \"提示词需要调整一下\"\nProvide alternatives: Always suggest 1-2 alternative models in the failure message\n🆕 Include actionable links (v1.0.8+): For 401/4008 errors, provide clickable links to API key generation or credit purchase pages\n🎵 Music-specific (v1.2.0+):\nFor Suno lyrics errors, suggest simplifying lyrics or using auto-generated lyrics (auto_lyrics=true)\nFor prompt length errors, give example length (e.g., \"建议20-100字\")\nFor BGM requests, recommend DouBao BGM over Suno\n🔊 TTS-specific: Use the TTS error translation table in \"For TTS Tasks (text_to_speech)\" above; suggest another model via --list-models or shortening text.\nStep 5 — Done (No Further Action Needed) 🆕\n\nAfter sending Step 3 (success) or Step 4 (failure):\n\nDO NOT send any additional messages unless the user asks a follow-up question\nThe task is complete — wait for the user's next request\nUser preference has been saved (if generation succeeded)\nThe conversation is ready for the next generation request\n\nWhy this step matters:\n\nPrevents unnecessary \"anything else?\" 
messages that clutter the chat\nAllows users to naturally continue the conversation when ready\nRespects the asynchronous nature of IM platforms\n\nException: If the user explicitly asks \"还有别的吗？\" or similar, then respond naturally.\n\n🆕 Enhanced Error Handling (v1.0.8):\n\nThe Reflection mechanism (3 automatic retries) now provides specific, actionable suggestions for common errors:\n\n401 Unauthorized: System suggests generating a new API key with clickable link\n4008 Insufficient Points: System suggests purchasing credits with clickable link\n500 Internal Server Error: Automatic parameter degradation (size, resolution, duration, quality)\n6009 No Rule Match: Automatic parameter completion from credit_rules\n6010 Attribute Mismatch: Automatic credit_rule reselection\nTimeout: Helpful info with dashboard link for background task status\n\nAll error handling is automatic and transparent — users receive natural language explanations with next steps.\n\nFailure fallback by task type:\n\nTask Type\tFailed Model\tFirst Alt\tSecond Alt\ntext_to_image\tSeeDream 4.5\tNano Banana2 (4pts, fast)\tNano Banana Pro (10-18pts, premium)\ntext_to_image\tNano Banana2\tSeeDream 4.5 (5pts, better quality)\tNano Banana Pro (10-18pts)\ntext_to_image\tNano Banana Pro\tSeeDream 4.5 (5pts)\tNano Banana2 (4pts, budget)\nimage_to_image\tSeeDream 4.5\tNano Banana2 (4pts, fast)\tNano Banana Pro (10pts)\nimage_to_image\tNano Banana2\tSeeDream 4.5 (5pts)\tNano Banana Pro (10pts)\nimage_to_image\tNano Banana Pro\tSeeDream 4.5 (5pts)\tNano Banana2 (4pts)\ntext_to_video\tKling O1\tWan 2.6 (25pts)\tVidu Q2 (5pts)\ntext_to_video\tGoogle Veo 3.1\tKling O1 (48pts)\tSora 2 Pro (122pts)\ntext_to_video\tAny\tWan 2.6 (25pts, most popular)\tHailuo 2.0 (5pts)\nimage_to_video\tWan 2.6\tKling O1 (48pts)\tHailuo 2.0 i2v (25pts)\nimage_to_video\tAny\tWan 2.6 (25pts, most popular)\tVidu Q2 Pro (20pts)\nfirst_last / reference\tKling O1\tKling 2.6 (80pts)\tVeo 3.1 (70pts+)\ntext_to_music 🎵\tSuno\tDouBao BGM 
(30pts, 背景音乐)\tDouBao Song (30pts, 歌曲生成)\ntext_to_music 🎵\tDouBao BGM\tDouBao Song (30pts)\tSuno (25pts, 功能最强)\ntext_to_music 🎵\tDouBao Song\tDouBao BGM (30pts)\tSuno (25pts, 功能最强)\ntext_to_speech 🔊\t(any)\tQuery --list-models for alternatives\tUse another model_id from product list\n\nMusic-specific failure guidance:\n\nIf Suno fails → Recommend DouBao BGM (for background music) or DouBao Song (for songs)\nIf DouBao BGM fails → Try DouBao Song first (similar pricing), then Suno (more powerful)\nIf DouBao Song fails → Try DouBao BGM first (similar pricing), then Suno (more powerful)\nFor lyrics errors in Suno → Suggest simplifying lyrics or using auto_lyrics=true\nFor prompt length errors → Recommend 20-100 characters\n\nTTS-specific failure guidance:\n\nIf TTS fails → Run --task-type text_to_speech --list-models and suggest another model_id; or shorten text / simplify content. Use the TTS error translation table in \"For TTS Tasks\" above for user-facing messages.\nSupported Models at a Glance\n\nSource: production GET /open/v1/product/list (2026-02-27). Model count reduced significantly. 
Always query product list API at runtime.\n\nImage Generation (4 models each)\nCategory\tName\tmodel_id\tCost\ntext_to_image\tSeeDream 4.5 🌟\tdoubao-seedream-4.5\t5 pts\ntext_to_image\tMidjourney 🎨\tmidjourney\t8/10 pts (480p/720p)\ntext_to_image\tNano Banana2 💚\tgemini-3.1-flash-image\t4/6/10/13 pts\ntext_to_image\tNano Banana Pro\tgemini-3-pro-image\t10/10/18 pts\nimage_to_image\tSeeDream 4.5 🌟\tdoubao-seedream-4.5\t5 pts\nimage_to_image\tMidjourney 🎨\tmidjourney\t8/10 pts (480p/720p)\nimage_to_image\tNano Banana2 💚\tgemini-3.1-flash-image\t4/6/10/13 pts\nimage_to_image\tNano Banana Pro\tgemini-3-pro-image\t10 pts\n\nMidjourney attribute_ids: 5451/5452 (text_to_image), 5453/5454 (image_to_image)\nNano Banana2 size options: 512px (4pts), 1K (6pts), 2K (10pts), 4K (13pts)\nNano Banana Pro size options: 1K (10pts), 2K (10pts), 4K (18pts for t2i / 10pts for i2i)\n\nImage Model Capabilities (Parameter Support)\n\n⚠️ Critical: Models have varying parameter support. Custom aspect ratios are now supported by multiple models.\n\nModel\tCustom Aspect Ratio\tMax Resolution\tSize Options\tNotes\nSeeDream 4.5\t✅ (via virtual params)\t4K (adaptive)\t8 aspect ratios\tSupports 1:1, 16:9, 9:16, 4:3, 3:4, 2:3, 3:2, 21:9 (5 pts)\nNano Banana2\t✅ Native support 🆕\t4K (4096×4096)\t512px/1K/2K/4K + aspect ratios\tSupports 1:1, 16:9, 9:16, 4:3, 3:4; size via attribute_id\nNano Banana Pro\t✅ Native support 🆕\t4K (4096×4096)\t1K/2K/4K + aspect ratios\tSupports 1:1, 16:9, 9:16, 4:3, 3:4; size via attribute_id\nMidjourney 🎨\t❌ (1:1 only)\t1024px (square)\t480p/720p via attribute_id\tFixed 1024x1024, artistic style focus\n\nKey Capabilities:\n\n✅ Aspect ratio control: SeeDream 4.5 (virtual params), Nano Banana Pro/2 (native support)\n❌ 8K: Not supported by any model (max is 4K)\n✅ Size control: Nano Banana2, Nano Banana Pro, and Midjourney support multiple size options via different attribute_ids\n✅ Budget option: Nano Banana2 is the cheapest at 4 pts for 512px, but 4K costs 13pts\n🎨 
Artistic styles: Midjourney excels at creative, artistic, and illustration styles\n💡 Best value: SeeDream 4.5 at 5pts offers aspect ratio flexibility; Nano Banana2 512px at 4pts for fastest/cheapest\nVideo Generation\nCategory\tName\tmodel_id\tCost Range\ntext_to_video (14)\tWan 2.6 🔥\twan2.6-t2v\t25-120 pts\n\tHailuo 2.3\tMiniMax-Hailuo-2.3\t32+ pts\n\tHailuo 2.0\tMiniMax-Hailuo-02\t5+ pts\n\tVidu Q2\tviduq2\t5-70 pts\n\tSeeDance 1.5 Pro\tdoubao-seedance-1.5-pro\t20+ pts\n\tSora 2 Pro\tsora-2-pro\t122+ pts\n\tKling O1\tkling-video-o1\t48-120 pts\n\tKling 2.6\tkling-v2-6\t80+ pts\n\tGoogle Veo 3.1\tveo-3.1-generate-preview\t70-330 pts\n\tPixverse V5.5 / V5 / V4.5 / V4 / V3.5\tpixverse\t12-48 pts\nimage_to_video (14)\tWan 2.6 🔥\twan2.6-i2v\t25-120 pts\n\tHailuo 2.3 / 2.0\tMiniMax-Hailuo-2.3/02\t25-32 pts\n\tVidu Q2 Pro\tviduq2-pro\t20-70 pts\n\tSeeDance 1.5 Pro\tdoubao-seedance-1.5-pro\t47+ pts\n\tSora 2 Pro\tsora-2-pro\t122+ pts\n\tKling O1 / 2.6\tkling-video-o1/v2-6\t48-120 pts\n\tGoogle Veo 3.1\tveo-3.1-generate-preview\t70-330 pts\n\tPixverse V5.5-V3.5\tpixverse\t12-48 pts\nfirst_last_frame (11)\tKling O1 🌟\tkling-video-o1\t48-120 pts\n\tKling 2.6\tkling-v2-6\t80+ pts\n\tOthers (9)\tHailuo 2.0, Vidu Q2 Pro, SeeDance 1.5 Pro, Veo 3.1, Pixverse V5.5-V3.5\t—\nreference_image (6)\tKling O1 🌟\tkling-video-o1\t48-120 pts\n\tGoogle Veo 3.1\tveo-3.1-generate-preview\t70-330 pts\n\tOthers (4)\tVidu Q2, Pixverse V5.5/V5/V4.5\t—\n\nAdditional model variants (per-variant pricing):\n\n| Category | Name | model_id | Cost |\n| --- | --- | --- | --- |\n| text_to_video | SeeDance 1.5 Pro / 1.0 Pro | doubao-seedance-1.5-pro / doubao-seedance-1.0-pro | 16 / 15 pts |\n| text_to_video | Sora 2 Pro / Sora 2 | sora-2-pro / sora-2 | 120 / 35 pts |\n| text_to_video | Kling O1 / 2.6 / 2.5 Turbo / 1.6 | kling-video-o1 / kling-v2-6 / kling-v2-5-turbo / kling-v1-6 | 48 / 80 / 24 / 32 pts |\n| text_to_video | Google Veo 3.1 Fast / 3.1 / 3.0 | veo-3.1-fast-generate-preview / veo-3.1-generate-preview / veo-3.0-generate-preview | 55 / 140 / 280 pts |\n| text_to_video | Pixverse V3.5–V5.5 | pixverse | 12 pts |\n| image_to_video | Wan 2.6 / 2.6 Flash / 2.5 / 2.2 Plus | wan2.6-i2v / wan2.6-i2v-flash / wan2.5-i2v-preview / wan2.2-i2v-plus | 25 / 12 / 12 / 10 pts |\n| image_to_video | Kling 2.1 Master | kling-v2-1-master | 150 pts |\n| first_last_frame_to_video | Kling O1 | kling-video-o1 | 70 pts |\n| reference_image_to_video | Kling O1 / Vidu Q2 / Q1 | kling-video-o1 / viduq2 / viduq1 | 48 / 10 / 25 pts |\n\nMusic Generation\nCategory\tName\tmodel_id\tCost\tNotes\ntext_to_music\tSuno\tsonic\t25 pts\tsonic-v5; custom_mode, lyrics, vocal_gender\ntext_to_music\tDouBao BGM\tGenBGM\t30 pts\tBackground music\ntext_to_music\tDouBao Song\tGenSong\t30 pts\tSong generation\nSpeech (TTS) — text_to_speech\n\nModels and credits are not fixed. Always call GET /open/v1/product/list?category=text_to_speech (or run the script with --task-type text_to_speech --list-models) to get the current model_id, attribute_id, and credit.\n\nima-all-ai has complete TTS capability: This document and the bundled ima_create.py provide full TTS support (routing, parameters, create/poll, UX protocol Steps 0–5, error translation). The ima-tts-ai skill is an optional standalone package with the same specification.\n\nTTS Task Detail — Response Shape\n\nPoll POST /open/v1/tasks/detail until completion. For TTS, medias[] uses the same structure as other IMA audio tasks:\n\nField\tType\tMeaning\nresource_status\tint or null\t0=processing, 1=available, 2=failed, 3=deleted; null is treated as 0\nstatus\tstring\t\"pending\" / \"processing\" / \"success\" / \"failed\"\nurl\tstring\tAudio URL when resource_status=1 (mp3/wav)\nduration_str\tstring\tOptional, e.g. \"12s\"\nformat\tstring\tOptional, e.g. \"mp3\", \"wav\"\n\nSuccess example: When all medias have resource_status == 1 and status != \"failed\", read medias[0].url (or watermark_url).
Example: {\"medias\":[{\"resource_status\":1,\"status\":\"success\",\"url\":\"https://cdn.../output.mp3\",\"duration_str\":\"12s\",\"format\":\"mp3\"}]}.\n\nTTS Create Task — Request Shape\n\ntask_type: \"text_to_speech\". No image input: src_img_url: [], input_images: []. prompt (text to speak) must be inside parameters[].parameters, not at top level. Extra fields (e.g. voice_id, speed) come from product form_config; pass via --extra-params and only include params present in the product’s credit_rules/form_config.\n\nTTS Common Mistakes\nMistake\tFix\nprompt at top level\tPut prompt inside parameters[].parameters (script does this)\nWrong or missing attribute_id\tAlways call product list first; use credit_rules\nSingle poll\tPoll until all medias have resource_status == 1\nIgnoring status when resource_status=1\tCheck status != \"failed\"\nSending params not in form_config/credit_rules\tUse only params from product list; script reflection strips others on retry\n\nAlways call GET /open/v1/product/list?category=<type> first to get the live attribute_id and form_config defaults required for task creation.\n\nThere are two equivalent route systems serving the same backend logic:\n\nRoute\tAuth\tUse Case\n/open/v1/\tAuthorization: Bearer ima_* only\tThird-party / agent access\n/api/v3/\tToken + API Key (dual auth)\tFrontend App\n\nThis skill documents the /open/v1/ Open API. 
All business logic (credit validation, N-flattening, risk control) runs identically on both paths.\n\nEnvironment\n\nBase URL: https://api.imastudio.com\n\nRequired/recommended headers for all /open/v1/ endpoints:\n\nHeader\tRequired\tValue\tNotes\nAuthorization\t✅\tBearer ima_your_api_key_here\tAPI key authentication\nx-app-source\t✅\tima_skills\tFixed value — identifies skill-originated requests\nx_app_language\trecommended\ten / zh\tProduct label language; defaults to en if omitted\nAuthorization: Bearer ima_your_api_key_here\nx-app-source: ima_skills\nx_app_language: en\n\n📤 When to Upload Images (Quick Reference)\n\nThe IMA Open API does NOT accept raw bytes or base64 images. All image inputs must be public HTTPS URLs.\n\nTask Type\tInput Required?\tUpload Before Create?\tNotes\ntext_to_image\t❌ No\t—\tPrompt only\nimage_to_image\t✅ Yes (1 image)\t✅ Upload first\tSingle input image\ntext_to_video\t❌ No\t—\tPrompt only\nimage_to_video\t✅ Yes (1 image)\t✅ Upload first\tSingle input image\nfirst_last_frame_to_video\t✅ Yes (2 images)\t✅ Upload first\tFirst + last frame\nreference_image_to_video\t✅ Yes (1+ images)\t✅ Upload first\tReference image(s)\ntext_to_music\t❌ No\t—\tPrompt only\ntext_to_speech\t❌ No\t—\tPrompt only (text to speak)\n\nUpload flow:\n\nUser provides local file path or bytes → call prepare_image_url() (see section below)\nUser provides HTTPS URL → use directly, no upload needed\nUse the returned CDN URL (fdl) as the value for input_images / src_img_url\n\nExample workflow (image_to_image):\n\n# User provides local file\nimage_url = prepare_image_url(\"/path/to/photo.jpg\", api_key)\n# → Returns: https://ima-ga.esxscloud.com/webAgent/privite/2026/02/27/..._uuid.jpeg\n\n# Then create task with this URL\ncreate_task(\n    task_type=\"image_to_image\",\n    input_images=[image_url],  # Use uploaded URL\n    prompt=\"turn into oil painting\"\n)\n\n⚠️ MANDATORY: Always Query Product List First\n\nCRITICAL: You MUST call /open/v1/product/list BEFORE 
creating any task.\nThe attribute_id field is REQUIRED in the create request. If it is 0 or missing, you get:\n\"Invalid product attribute\" → \"Insufficient points\" → task fails completely.\nNEVER construct a create request from the model table alone. Always fetch the product first.\n\nHow to get attribute_id (all task types)\n# Query product list with the correct category\nGET /open/v1/product/list?app=ima&platform=web&category=<task_type>\n# task_type: text_to_image | image_to_image | text_to_video | image_to_video |\n#            first_last_frame_to_video | reference_image_to_video | text_to_music | text_to_speech\n\n# Walk the V2 tree to find your target model (type=3 leaf nodes only)\nfor group in response[\"data\"]:\n    for version in group.get(\"children\", []):\n        if version[\"type\"] == \"3\" and version[\"model_id\"] == target_model_id:\n            attribute_id  = version[\"credit_rules\"][0][\"attribute_id\"]\n            credit        = version[\"credit_rules\"][0][\"points\"]\n            model_version = version[\"id\"]    # = version_id / model_version\n            model_name    = version[\"name\"]\n            form_defaults = {f[\"field\"]: f[\"value\"] for f in version[\"form_config\"]}\n            break\n\nQuick Reference: Known attribute_ids\n\nPre-queried values for convenience. 
Always call the product list at runtime for accuracy.\n\nModel\tTask Type\tmodel_id\tattribute_id\tcredit\tNotes\ntext_to_image\t\t\t\t\t\nSeeDream 4.5\ttext_to_image\tdoubao-seedream-4.5\t2341\t5 pts\tDefault, balanced\nNano Banana Pro (1K)\ttext_to_image\tgemini-3-pro-image\t2399\t10 pts\t1024×1024\nNano Banana Pro (2K)\ttext_to_image\tgemini-3-pro-image\t2400\t10 pts\t2048×2048\nNano Banana Pro (4K)\ttext_to_image\tgemini-3-pro-image\t2401\t18 pts\t4096×4096\ntext_to_video\t\t\t\t\t\nWan 2.6 (720P, 5s)\ttext_to_video\twan2.6-t2v\t2057\t25 pts\tDefault, balanced\nWan 2.6 (1080P, 5s)\ttext_to_video\twan2.6-t2v\t2058\t40 pts\t—\nWan 2.6 (720P, 10s)\ttext_to_video\twan2.6-t2v\t2059\t50 pts\t—\nWan 2.6 (1080P, 10s)\ttext_to_video\twan2.6-t2v\t2060\t80 pts\t—\nWan 2.6 (720P, 15s)\ttext_to_video\twan2.6-t2v\t2061\t75 pts\t—\nWan 2.6 (1080P, 15s)\ttext_to_video\twan2.6-t2v\t2062\t120 pts\t—\nKling O1 (5s, std)\ttext_to_video\tkling-video-o1\t2313\t48 pts\tLatest Kling\nKling O1 (5s, pro)\ttext_to_video\tkling-video-o1\t2314\t60 pts\t—\nKling O1 (10s, std)\ttext_to_video\tkling-video-o1\t2315\t96 pts\t—\nKling O1 (10s, pro)\ttext_to_video\tkling-video-o1\t2316\t120 pts\t—\ntext_to_music\t\t\t\t\t\nSuno (sonic-v4)\ttext_to_music\tsonic\t2370\t25 pts\tDefault\nDouBao BGM\ttext_to_music\tGenBGM\t4399\t30 pts\t—\nDouBao Song\ttext_to_music\tGenSong\t4398\t30 pts\t—\nAll others\tany\t—\t→ query /open/v1/product/list\t—\tAlways runtime query\n\n⚠️ Production warning: attribute_id and credit values change frequently in production. 
Always call /open/v1/product/list at runtime; above table is pre-queried reference only (2026-02-27).\n\nCommon Mistakes (and resulting errors)\nMistake\tError\nattribute_id is 0 or missing\t\"Invalid product attribute\" + \"Insufficient points\"\nattribute_id outdated (production changed)\tSame errors; always query product list first\nattribute_id doesn't match parameter combination\tError 6010: \"Attribute ID does not match the calculated rule\"\nprompt at outer parameters[] level\tPrompt ignored; wrong routing\ncast missing from inner parameters.parameters\tBilling validation failure\ncredit value wrong or missing\tError 6006\nmodel_name / model_version missing\tWrong backend routing\nSkipped product list, used table values directly\tAll of the above\n\n⚠️ Critical for Google Veo 3.1 and multi-rule models:\n\nModels like Google Veo 3.1 have multiple credit_rules, each with a different attribute_id for different parameter combinations:\n\n720p + 4s + optimized → attribute_id A\n720p + 8s + optimized → attribute_id B\n4K + 4s + high → attribute_id C\n\nThe script automatically selects the correct attribute_id by matching your parameters (duration, resolution, compression_quality, generate_audio) against each rule's attributes. If the match fails, you get error 6010.\n\nFix: The bundled script now checks these video-specific parameters for smart credit_rule selection. Always use the script, not manual API construction.\n\nCore Flow\n1. GET /open/v1/product/list?app=ima&platform=web&category=<type>\n   → REQUIRED: Get attribute_id, credit, model_version, model_name, form_config defaults\n\n[If input image required]\n2. Upload image → get public HTTPS URL\n   → See \"Image Upload\" section below\n\n3. POST /open/v1/tasks/create\n   → Must include: attribute_id, model_name, model_version, credit, cast, prompt (nested!)\n\n4. 
POST /open/v1/tasks/detail  {\"task_id\": \"...\"}\n   → Poll until medias[].resource_status == 1\n   → Extract url from completed media\n\nImage Upload (Required Before Image Tasks)\n\nThe IMA Open API does NOT accept raw bytes or base64 images. All image inputs must be public HTTPS URLs.\n\nWhen a user provides an image (local file, bytes, base64), you must upload it first and get a URL. This is exactly what the IMA frontend does before every image task.\n\nReal Upload Flow (from IMA Frontend Source)\n\nThe frontend uses a two-step presigned URL flow via the IM platform:\n\nStep 1: GET /api/rest/oss/getuploadtoken   → returns { ful, fdl }\n          ful = presigned PUT URL (upload destination, expires ~7 days)\n          fdl = final CDN download URL (use this as input_images value)\n\nStep 2: PUT {ful}  with raw image bytes + Content-Type header\n          → image is stored in Aliyun OSS: zhubite-imagent-bot.oss-us-east-1.aliyuncs.com\n          → accessible via CDN: https://ima-ga.esxscloud.com/...\n\nStep 1: Get Upload Token\nGET https://imapi-qa.liveme.com/api/rest/oss/getuploadtoken\n\n\nRequired query parameters (11 total — sourced directly from frontend generateUploadInfo):\n\nParameter\tExample\tDescription\nappUid\tima_xxx...\tUse IMA API key directly — no separate login needed\nappId\twebAgent\tApp identifier (fixed)\nappKey\t32jdskjdk320eew\tApp secret (fixed, used for sign generation)\ncmimToken\tima_xxx...\tUse IMA API key directly — same as appUid\nsign\t117CF6CF...\tIM auth signature: SHA1(\"webAgent|32jdskjdk320eew|{timestamp}|{nonce}\").upper()\ntimestamp\t1772042430\tUnix timestamp (seconds), generated per request\nnonce\tCxI1FLI5ajLJZ1jlxZmeg\tRandom nonce string, generated per request\nfService\tprivite\tFixed: storage service type\nfType\tpicture\tpicture for images, video, audio\nfSuffix\tjpeg\tFile extension: jpeg, png, mp4, mp3\nfContentType\timage/jpeg\tMIME type of the file\n\n
Simplified auth: use your IMA API key directly for the appUid and cmimToken parameters — no separate credentials are needed.\n\nResponse:\n\n{\n  \"ful\": \"https://zhubite-imagent-bot.oss-us-east-1.aliyuncs.com/webAgent/privite/2026/02/26/..._uuid.jpeg?Expires=...&OSSAccessKeyId=...&Signature=...\",\n  \"fdl\": \"https://ima-ga.esxscloud.com/webAgent/privite/2026/02/26/..._uuid.jpeg\",\n  \"ful_expire\": \"...\",\n  \"fdl_expire\": \"...\",\n  \"fdl_key\": \"...\"\n}\n\nStep 2: Upload Image via Presigned URL\nPUT {ful}\nContent-Type: image/jpeg\nBody: [raw image bytes]\n\n\nNo auth headers needed — the presigned URL already encodes the credentials.\n\nStep 3: Use fdl as the Image URL\n\nAfter the PUT succeeds, use fdl (the CDN URL) as the value for input_images / src_img_url.\n\nPython Implementation\nimport hashlib, time, uuid, requests, mimetypes\n\n# ── 🌐 IMA Upload Service Endpoint (IMA-owned, for image/video uploads) ──────\nIMA_IM_BASE = \"https://imapi-qa.liveme.com\"   # prod: https://imapi.liveme.com\n\n# ── 🔑 Hardcoded APP_KEY (Public, Shared Across All Users) ──────────────────\n# This APP_KEY is a PUBLIC identifier used by IMA Studio's image/video upload \n# service. It is NOT a secret—it's intentionally shared across all users and \n# embedded in the IMA web frontend. This key is used to generate request signatures \n# for upload token requests, but your IMA API key (ima_xxx...) is the ACTUAL \n# authentication credential. Think of APP_KEY as a \"client ID\" rather than a \n# \"client secret.\"\n#\n# ⚠️ Security Note: Your ima_xxx... API key is the sensitive credential. It is \n# sent to imapi.liveme.com as query parameters (appUid, cmimToken). 
Always use \n# test keys for experiments and rotate your API key regularly.\n#\n# 📖 See SECURITY.md for complete disclosure and network verification guide.\nAPP_ID    = \"webAgent\"\nAPP_KEY   = \"32jdskjdk320eew\"   # Public shared key (used for sign generation)\nAPP_UID   = \"<your_app_uid>\"    # Your IMA API key (ima_xxx...) — used as appUid\nAPP_TOKEN = \"<your_app_token>\"  # Your IMA API key (ima_xxx...) — used as cmimToken\n\n\ndef _gen_sign() -> tuple[str, str, str]:\n    \"\"\"Generate per-request (sign, timestamp, nonce).\"\"\"\n    nonce = uuid.uuid4().hex[:21]\n    ts    = str(int(time.time()))\n    raw   = f\"{APP_ID}|{APP_KEY}|{ts}|{nonce}\"\n    sign  = hashlib.sha1(raw.encode()).hexdigest().upper()\n    return sign, ts, nonce\n\n\ndef get_upload_token(app_uid: str, app_token: str,\n                     suffix: str, content_type: str) -> dict:\n    \"\"\"Step 1: Get presigned upload URL from IMA's upload service.\n    \n    Calls GET imapi.liveme.com/api/rest/oss/getuploadtoken with exactly 11 params.\n    Returns: { \"ful\": \"<presigned PUT URL>\", \"fdl\": \"<CDN download URL>\" }\n    \n    Args:\n        app_uid: Your IMA API key (ima_xxx...), used as appUid parameter\n        app_token: Your IMA API key (ima_xxx...), used as cmimToken parameter\n        suffix: File extension (jpeg, png, mp4, mp3)\n        content_type: MIME type (image/jpeg, video/mp4, etc.)\n    \n    Security Note:\n        Your IMA API key (ima_xxx...) is sent to imapi.liveme.com as query \n        parameters (appUid, cmimToken). This is IMA Studio's image/video upload \n        service, separate from the main api.imastudio.com API. Both domains are \n        owned by IMA Studio—this is part of IMA's microservices architecture.\n        \n        Why two domains?\n        - api.imastudio.com: Core AI generation API (product list, task creation)\n        - imapi.liveme.com: Specialized upload service (presigned URL generation)\n        \n        Your API key grants access to both services. 
For security verification, \n        see SECURITY.md section \"Network Traffic Verification.\"\n    \"\"\"\n    sign, ts, nonce = _gen_sign()\n    r = requests.get(\n        f\"{IMA_IM_BASE}/api/rest/oss/getuploadtoken\",\n        params={\n            # App Key params\n            \"appUid\":       app_uid,       # APP_UID\n            \"appId\":        APP_ID,\n            \"appKey\":       APP_KEY,\n            \"cmimToken\":    app_token,     # APP_TOKEN\n            \"sign\":         sign,\n            \"timestamp\":    ts,\n            \"nonce\":        nonce,\n            # File params\n            \"fService\":     \"privite\",     # fixed\n            \"fType\":        \"picture\",     # picture / video / audio\n            \"fSuffix\":      suffix,        # jpeg / png / mp4 / mp3\n            \"fContentType\": content_type,\n        },\n    )\n    r.raise_for_status()\n    return r.json()[\"data\"]\n\n\ndef upload_image_to_oss(image_bytes: bytes, content_type: str, ful: str) -> None:\n    \"\"\"Step 2: PUT image bytes to the presigned OSS URL. 
No auth needed.\"\"\"\n    resp = requests.put(ful, data=image_bytes, headers={\"Content-Type\": content_type})\n    resp.raise_for_status()\n\n\ndef prepare_image_url(source, api_key: str) -> str:\n    \"\"\"\n    Full workflow: upload any image and return the CDN URL (fdl).\n    \n    Args:\n        source: file path (str), raw bytes, or already-public HTTPS URL\n        api_key: IMA API key for upload authentication\n    \n    Returns: public HTTPS CDN URL ready to use as input_images value\n    \"\"\"\n    # Already a public URL → use directly, no upload needed\n    if isinstance(source, str) and source.startswith(\"https://\"):\n        return source\n    \n    # Read file bytes\n    if isinstance(source, str):\n        ext = source.rsplit(\".\", 1)[-1].lower() if \".\" in source else \"jpeg\"\n        with open(source, \"rb\") as f:\n            image_bytes = f.read()\n        content_type = mimetypes.guess_type(source)[0] or \"image/jpeg\"\n    else:\n        image_bytes = source\n        ext = \"jpeg\"\n        content_type = \"image/jpeg\"\n\n    # Step 1: Get presigned URL — the API key serves as both appUid and cmimToken\n    token_data = get_upload_token(api_key, api_key, ext, content_type)\n    ful = token_data[\"ful\"]\n    fdl = token_data[\"fdl\"]\n\n    # Step 2: Upload to OSS\n    upload_image_to_oss(image_bytes, content_type, ful)\n\n    # Step 3: Return CDN URL\n    return fdl   # use this as input_images / src_img_url value\n\n\nOSS path format: webAgent/privite/{YYYY}/{MM}/{DD}/{timestamp}_{uid}_{uuid}.{ext} CDN base: https://ima-ga.esxscloud.com/ OSS bucket: zhubite-imagent-bot.oss-us-east-1.aliyuncs.com\n\nQuick Reference\nTask Types (category values)\ncategory\tCapability\tInput\ntext_to_image\tText → Image\tprompt\nimage_to_image\tImage → Image\tprompt + input_images\ntext_to_video\tText → Video\tprompt\nimage_to_video\tImage → Video\tprompt + input_images\nfirst_last_frame_to_video\tFirst+Last Frame → Video\tprompt + src_img_url[2]\nreference_image_to_video\tReference Image 
→ Video\tprompt + src_img_url[1+]\ntext_to_music\tText → Music\tprompt\ntext_to_speech\tText → Speech\tprompt (text to speak)\nDetail API status values\n\nEach media in medias[] has two fields:\n\nField\tType\tValues\tDescription\nresource_status\tint (or null)\t0, 1, 2, 3\t0=processing, 1=available, 2=failed, 3=deleted. The API may return null; treat it as 0.\nstatus\tstring\t\"pending\", \"processing\", \"success\", \"failed\"\tHuman-readable task status. When polling, resource_status is authoritative; status == \"failed\" means failure.\n\nPoll on resource_status first, then ensure status is not \"failed\":\n\nresource_status\tstatus\tMeaning\tAction\n0 or null\tpending / processing\tProcessing\tKeep polling; do not stop (null = 0)\n1\tsuccess (or completed)\tDone\tRead url; stop only when all medias are 1\n1\tfailed\tFailed (status takes precedence)\tStop, handle error\n2\tany\tFailed\tStop, handle error\n3\tany\tDeleted\tStop\n\nImportant: (1) Treat resource_status: null as 0. (2) Stop only when all medias have resource_status == 1. (3) When resource_status=1, still check status != \"failed\".\n\nAPI 1: Product List\nGET /open/v1/product/list?app=ima&platform=web&category=text_to_image\n\n\nInternally calls downstream /v1/products/listv2. Returns a V2 tree structure: type=2 nodes are model groups, type=3 nodes are versions (leaves). 
Only type=3 nodes contain credit_rules and form_config.\n\nwebAgent is auto-converted to ima by the gateway — you can use either value for app.\n\n[\n  {\n    \"id\": \"SeeDream\",\n    \"type\": \"2\",\n    \"name\": \"SeeDream\",\n    \"model_id\": \"\",\n    \"children\": [\n      {\n        \"id\": \"doubao-seedream-4-0-250828\",\n        \"type\": \"3\",\n        \"name\": \"SeeDream 4.0\",\n        \"model_id\": \"doubao-seedream-4.0\",\n        \"credit_rules\": [\n          { \"attribute_id\": 332, \"points\": 5, \"attributes\": { \"default\": \"enabled\" } }\n        ],\n        \"form_config\": [\n          { \"field\": \"size\", \"type\": \"tags\", \"value\": \"1K\",\n            \"options\": [{\"label\":\"1K\",\"value\":\"1K\"}, {\"label\":\"2K\",\"value\":\"2K\"}] }\n        ]\n      }\n    ]\n  }\n]\n\n\nHow to pick a version for task creation:\n\nTraverse nodes to find type=3 leaves (versions)\nUse model_id and id (= model_version) from the leaf\nPick credit_rules[].attribute_id matching your desired quality/size (attributes field shows the config)\nUse form_config[].value as default parameters values\n\ncredit_rules[].attribute_id → required for task creation as attribute_id. 
credit_rules[].points → required for task creation as credit and cast.points.\n\nAPI 2: Create Task\nPOST /open/v1/tasks/create\n\nRequest Structure\n{\n  \"task_type\": \"text_to_image\",\n  \"enable_multi_model\": false,\n  \"src_img_url\": [],\n  \"upload_img_src\": \"\",\n  \"parameters\": [\n    {\n      \"attribute_id\": 8538,\n      \"model_id\":      \"doubao-seedream-4.5\",\n      \"model_name\":    \"SeeDream 4.5\",\n      \"model_version\": \"doubao-seedream-4-5-251128\",\n      \"app\":           \"ima\",\n      \"platform\":      \"web\",\n      \"category\":      \"text_to_image\",\n      \"credit\":        5,\n      \"parameters\": {\n        \"prompt\":       \"a beautiful mountain sunset, photorealistic\",\n        \"size\":         \"4k\",\n        \"n\":            1,\n        \"input_images\": [],\n        \"cast\":         {\"points\": 5, \"attribute_id\": 8538}\n      }\n    }\n  ]\n}\n\nField Reference\nField\tRequired\tDescription\ntask_type\t✅\tMust match parameters[].category\nparameters[].attribute_id\t✅\tFrom credit_rules[].attribute_id in product list\nparameters[].model_id\t✅\tFrom type=3 leaf node model_id\nparameters[].model_version\t✅\tFrom type=3 leaf node id\nparameters[].app\t✅\tUse ima (or webAgent, auto-converted)\nparameters[].platform\t✅\tUse web\nparameters[].category\t✅\tMust match top-level task_type\nparameters[].credit\t✅\tMust equal credit_rules[].points. Error 6006 if wrong.\nparameters[].parameters.prompt\t✅\tThe actual prompt text used by downstream service\nparameters[].parameters.cast\t✅\t{\"points\": N, \"attribute_id\": N} — mirrors credit\nparameters[].parameters.n\t✅\tNumber of outputs (usually 1). 
Gateway flattens N>1 into separate resources.\nparameters[].parameters.input_images\timage tasks\tArray of input image URLs\ntop-level src_img_url\tmulti-image\tArray for first_last_frame / reference tasks\nN-Field Flattening (Gateway Internal Logic)\n\nWhen n > 1, the gateway automatically:\n\nGenerates n independent resourceBizId values\nDeducts credits n times (one per resource)\nCreates n separate tasks in the downstream service\n\nResponse medias[] will contain n items. Poll until all have resource_status == 1.\n\nResponse\n{\n  \"code\": 0,\n  \"data\": {\n    \"id\": \"task_abc123\",\n    \"biz_id\": \"biz_xxx\",\n    \"task_type\": \"text_to_image\",\n    \"medias\": [],\n    \"generate_count\": 1,\n    \"created_at\": 1700000000000,\n    \"timeout_at\": 1700000300000\n  }\n}\n\n\ndata.id = task ID for polling. timeout_at = Unix ms deadline.\n\nAPI 3: Task Detail (Poll)\nPOST /open/v1/tasks/detail\n{\"task_id\": \"<id from create response>\"}\n\n\nPoll every 2–5s (8s+ for video). Completed response:\n\n{\n  \"id\": \"task_abc\",\n  \"medias\": [{\n    \"resource_status\": 1,\n    \"status\": \"success\",\n    \"url\": \"https://cdn.../output.jpg\",\n    \"cover\": \"https://cdn.../cover.jpg\",\n    \"format\": \"jpg\",\n    \"width\": 1024,\n    \"height\": 1024\n  }]\n}\n\n\nPolling stop condition (must implement exactly):\n\nTreat resource_status: null (or missing) as 0 (processing). Do not stop when you see null; backend may serialize Go *int as null.\nStop only when ALL medias[].resource_status == 1 and no status == \"failed\". If you return on the first media with resource_status == 1 while others are still 0, the task is not fully done and you will keep polling or get inconsistent state.\nStop immediately if any status == \"failed\" or resource_status == 2 or resource_status == 3.\nTask Type Examples\ntext_to_image ✅ Verified\n\nNo image input. src_img_url: [], input_images: []. 
See API 2 for full example.\n\ntext_to_video ✅ Verified\n\nExtra fields vs text_to_image — all from form_config defaults:\n\n{\n  \"task_type\": \"text_to_video\",\n  \"src_img_url\": [],\n  \"parameters\": [{\n    \"attribute_id\":  4838,\n    \"model_id\":      \"wan2.6-t2v\",\n    \"model_name\":    \"Wan 2.6\",\n    \"model_version\": \"wan2.6-t2v\",\n    \"category\":      \"text_to_video\",\n    \"credit\":        3,\n    \"app\": \"ima\", \"platform\": \"web\",\n    \"parameters\": {\n      \"prompt\":          \"a puppy dancing happily, sunny meadow\",\n      \"negative_prompt\": \"\",\n      \"prompt_extend\":   false,\n      \"duration\":        5,\n      \"resolution\":      \"1080P\",\n      \"aspect_ratio\":    \"16:9\",\n      \"shot_type\":       \"single\",\n      \"seed\":            -1,\n      \"n\":               1,\n      \"input_images\":    [],\n      \"cast\":            {\"points\": 3, \"attribute_id\": 4838}\n    }\n  }]\n}\n\n\nVideo-specific fields from form_config: duration (seconds), resolution, aspect_ratio, shot_type, negative_prompt, prompt_extend. Poll every 8s (video generation is slower). Response medias[].cover = first-frame thumbnail.\n\ntext_to_music\n\nNo image input. src_img_url: [], input_images: [].\n\nimage_to_image ✅ Verified\n{\n  \"task_type\": \"image_to_image\",\n  \"src_img_url\": [\"https://...input.jpg\"],\n  \"parameters\": [{\n    \"attribute_id\":  8560,\n    \"model_id\":      \"doubao-seedream-4.5\",\n    \"model_version\": \"doubao-seedream-4-5-251128\",\n    \"category\":      \"image_to_image\",\n    \"credit\":        5,\n    \"app\": \"ima\", \"platform\": \"web\",\n    \"parameters\": {\n      \"prompt\":       \"turn into oil painting style\",\n      \"size\":         \"4k\",\n      \"n\":            1,\n      \"input_images\": [\"https://...input.jpg\"],\n      \"cast\":         {\"points\": 5, \"attribute_id\": 8560}\n    }\n  }]\n}\n\n\n⚠️ size must be from form_config options (e.g. 
\"2k\", \"4k\", \"2048x2048\"). \"adaptive\" is NOT valid for SeeDream 4.5 i2i — causes error 400. Top-level src_img_url and parameters.input_images must both contain the input image URL. Some i2i models (e.g. doubao-seededit-3.0-i2i) may not be available in test environments — fall back to SeeDream 4.5.\n\nimage_to_video / first_last_frame_to_video / reference_image_to_video\n{\n  \"src_img_url\": [\"https://first-frame.jpg\", \"https://last-frame.jpg\"]\n}\n\n\nIndex 0 = first frame (or reference), index 1 = last frame (first_last_frame only).\n\nCommon Mistakes\nMistake\tFix\nattribute_id not from credit_rules\tAlways fetch product list first\ncredit value wrong\tMust exactly match credit_rules[].points — error 6006\nprompt at wrong location\tPut prompt in parameters[].parameters.prompt (nested), not only at top level\nPolling biz_id instead of id\tUse id (task ID) for /tasks/detail\nSingle-poll instead of loop\tPoll until resource_status == 1 for ALL medias\nMissing app / platform in parameters\tRequired fields — use ima / web\ncategory mismatch\tparameters[].category must match top-level task_type\nresource_status == 2 not handled\tCheck for failure, don't loop forever\nstatus == \"failed\" ignored\tresource_status=1 + status=\"failed\" means actual failure\nn > 1 and only checking first media\tAll n media items must reach resource_status == 1\nComplete Python Example\n\nSee the Python example sections throughout this documentation for implementation guidance covering all 7 task types.\n\nSupported Models & Search Terms\n\nImage: SeeDream 4.5 (see dream), Midjourney (MJ), Nano Banana 2, Nano Banana Pro Video: Wan 2.6, Kling O1, Kling 2.6, Google Veo 3.1 (veo), Sora 2 Pro, Pixverse V5.5, Hailuo 2.0, Hailuo 2.3, MiniMax Hailuo, SeeDance 1.5 Pro, Vidu Q2 Music: Suno sonic v4, Suno sonic v5, DouBao BGM (GenBGM), DouBao Song (GenSong) TTS: seed-tts-2.0 (seed tts, text-to-speech)\n\nCapabilities: multimodal AI creation, all-in-one, image generation, video 
generation, music generation, text-to-speech, text-to-image, image-to-video, text-to-music"
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/allenfancy-gan/ima-all-ai",
    "publisherUrl": "https://clawhub.ai/allenfancy-gan/ima-all-ai",
    "owner": "allenfancy-gan",
    "version": "1.0.9",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/ima-all-ai",
    "downloadUrl": "https://openagent3.xyz/downloads/ima-all-ai",
    "agentUrl": "https://openagent3.xyz/skills/ima-all-ai/agent",
    "manifestUrl": "https://openagent3.xyz/skills/ima-all-ai/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/ima-all-ai/agent.md"
  }
}