Tencent SkillHub · Developer Tools

Ms Qwen Vl

调用魔搭社区（ModelScope）Qwen3-VL 多模态 API 进行视觉解析。使用 OpenAI SDK 兼容方式调用，支持图片内容描述、OCR 文字提取、视觉问答、对象检测等功能。用户提到"魔搭"、"ModelScope"、"Qwen-VL"、"多模态视觉"、"解析图片"等关键词时应触发。

skill openclawclawhub Free

0 Downloads

0 Stars

0 Installs

0 Score

High Signal

⬇ 0 downloads ★ 0 stars Unverified but indexed

Install for OpenClaw

Quick setup

Download the package from Yavira.
Extract the archive and review SKILL.md first.
Import or place the package into your OpenClaw setup.

Requirements

Target platform: OpenClaw
Install method: Manual import
Extraction: Extract archive
Prerequisites: OpenClaw
Primary doc: SKILL.md

Package facts

Download mode: Yavira redirect
Package format: ZIP package
Source platform: Tencent SkillHub
What's included: README.md, SKILL.md, references/api-guide.md, references/models.md, requirements.txt, scripts/ms_qwen_vl.py

Validation

Use the Yavira download entry.
Review SKILL.md after the package is downloaded.
Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

Download the package from Yavira.
Extract it into a folder your agent can access.
Paste one of the prompts below and point your agent at the extracted folder.

New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.

Open Send to Agent page Open JSON manifest Open Markdown brief

Trust & source

Release facts

Source: Tencent SkillHub
Verification: Indexed source record
Version: 0.1.0

Provenance

Publisher: crocketc
Source page: View original listing
Canonical URL: Open canonical page

Documentation

ClawHub primary doc Primary doc: SKILL.md 12 sections Open source page

MS-Qwen-VL Skill

基于 ModelScope Qwen3-VL 系列模型的多模态视觉识别技能，使用 OpenAI SDK 兼容方式调用。

功能特点

OpenAI SDK 兼容：使用标准 OpenAI SDK 调用 API 多种任务支持：图像描述、OCR、视觉问答、目标检测、图表解析双模型模式：默认快速模型（30B）+ 精细高精度模型（235B）灵活输入：支持本地图片和 URL

安装与配置

# 安装依赖 pip install -r requirements.txt # 配置 API Key cp .env.example .env 编辑 .env 文件，填入从 https://modelscope.cn/my/myaccesstoken 获取的 API Key： MODELSCOPE_API_KEY=your_api_key_here

重要：处理本地图片

当用户提供本地图片路径时（如桌面截图），必须使用 Python 脚本处理： python scripts/ms_qwen_vl.py "<图片路径>" --task <任务类型> 脚本会自动将本地文件转换为 ModelScope API 需要的 base64 格式。

处理 URL 图片

当用户提供网络 URL 时，同样使用上述命令，脚本会自动识别： python scripts/ms_qwen_vl.py "<URL>" --task <任务类型>

Claude Code 对话示例

场景 1：分析桌面截图用户: 请帮我描述这张图片 C:\Users\...\Desktop\screenshot.png 助手: [执行] python scripts/ms_qwen_vl.py "C:\Users\...\Desktop\screenshot.png" 场景 2：OCR 识别本地图片用户: 识别这张图中的文字: D:\Documents\invoice.jpg 助手: [执行] python scripts/ms_qwen_vl.py "D:\Documents\invoice.jpg" --task ocr 场景 3：分析网络图片用户: 分析这张图片 https://example.com/photo.jpg 助手: [执行] python scripts/ms_qwen_vl.py "https://example.com/photo.jpg" --task describe 场景 4：视觉问答用户: 这张图里有几个人？C:\Users\...\Desktop\photo.png 助手: [执行] python scripts/ms_qwen_vl.py "C:\Users\...\Desktop\photo.png" --task ask --question "图片里有几个人？"

任务类型对照

用户需求--task 参数描述图片内容describe识别文字/OCRocr回答关于图片的问题ask（需要 --question）检测物体detect解析图表chart

快速使用

# 图像描述（默认） python scripts/ms_qwen_vl.py image.jpg # OCR 文字识别 python scripts/ms_qwen_vl.py image.jpg --task ocr # 视觉问答 python scripts/ms_qwen_vl.py image.jpg --task ask --question "图片里有什么？" # 使用精细模式（235B 模型） python scripts/ms_qwen_vl.py image.jpg --task describe --precise Python 代码调用： from scripts.ms_qwen_vl import analyze_image result = analyze_image("image.jpg", task="ocr") print(result)

任务类型

任务参数说明图像描述describe详细描述图片内容（默认）OCR 识别ocr识别图片中的文字视觉问答ask回答关于图片的问题目标检测detect检测图片中的物体图表解析chart解析图表数据

环境变量

变量名说明MODELSCOPE_API_KEYAPI 密钥（必需）MODELSCOPE_MODEL默认模型（可选）MODELSCOPE_MODEL_PRECISE精细模式模型（可选）

scripts/

ms_qwen_vl.py - 核心解析脚本，提供 analyze_image() 统一接口

references/

api-guide.md - OpenAI SDK 兼容调用方式详细说明 models.md - Qwen3-VL 系列模型及推荐使用场景

Category context

Code helpers, APIs, CLIs, browser automation, testing, and developer operations.

Source: Tencent SkillHub

Largest current source with strong distribution and engagement signals.

Package contents

Included in package

4 Docs1 Scripts1 Files

SKILL.md Primary doc
README.md Docs
references/api-guide.md Docs
references/models.md Docs
scripts/ms_qwen_vl.py Scripts
requirements.txt Files