Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Extract metadata and content from WeChat Official Account articles. Use when user needs to parse WeChat article URLs (mp.weixin.qq.com), extract article info...
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
Extract metadata and content from WeChat Official Account (微信公众号) articles.
- Parse WeChat article URLs (mp.weixin.qq.com)
- Extract article metadata: title, author, description, publish time
- Extract account info: name, avatar, alias, description
- Get article content (HTML)
- Get cover image URL
- Support multiple article types: post, video, image, voice, text, repost
- Handle various error cases: deleted content, expired links, access limits
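To make the URL-parsing capability concrete, here is a minimal sketch of checking that a link points at mp.weixin.qq.com and reading its query parameters with Node's built-in `URL` class. The helper name `parseWeChatArticleUrl` is hypothetical and not part of the skill. The parameter names `mid`, `idx`, and `sn` are assumed from the `msg_mid`, `msg_idx`, and `msg_sn` fields in the response format, and the sample URL values are made up.

```javascript
// Hypothetical helper, not part of the skill: reads the query parameters
// of a WeChat article URL using the built-in WHATWG URL API.
function parseWeChatArticleUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // not a valid absolute URL
  }
  if (url.hostname !== 'mp.weixin.qq.com') return null;

  const params = url.searchParams;
  // Assumption: mid and idx are numeric, as in the documented response format.
  const toNumber = (value) => (value === null ? null : Number(value));
  return {
    biz: params.get('__biz'),
    mid: toNumber(params.get('mid')),
    idx: toNumber(params.get('idx')),
    sn: params.get('sn'),
  };
}

// Made-up parameter values, for illustration only.
console.log(
  parseWeChatArticleUrl('https://mp.weixin.qq.com/s?__biz=MzA5&mid=2650000001&idx=1&sn=abc123')
);
```

Links that carry no query string yield `null` for every field, so this check is only a pre-filter; the skill's own `extract` remains the source of truth for metadata.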
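const { extract } = require('./scripts/extract.js');

// CommonJS modules have no top-level await, so call extract() inside an async function.
async function main() {
  const result = await extract('https://mp.weixin.qq.com/s?__biz=...');
  // Returns: { done: true, code: 0, data: {...} }
}

main();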
// Extract from HTML you have already fetched, passing the source URL as an option.
const html = await fetch(url).then(r => r.text());
const result = await extract(html, { url: sourceUrl });
const result = await extract(url, {
  shouldReturnContent: true,       // Return HTML content (default: true)
  shouldReturnRawMeta: false,      // Return raw metadata (default: false)
  shouldFollowTransferLink: true,  // Follow migrated account links (default: true)
  shouldExtractMpLinks: false,     // Extract embedded mp.weixin links (default: false)
  shouldExtractTags: false,        // Extract article tags (default: false)
  shouldExtractRepostMeta: false   // Extract repost source info (default: false)
});
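Because every option has a default, callers only need to pass the flags they want to change. The sketch below shows one way to keep a reusable preset and layer per-call overrides on top of it. `METADATA_ONLY` and `buildOptions` are hypothetical names, and the assumption that `shouldReturnContent: false` omits the HTML body is inferred from the option comment above, not verified against the skill's code.

```javascript
// Hypothetical preset, not part of the skill: metadata-only extraction.
// Flags left out here fall back to the defaults documented above.
const METADATA_ONLY = {
  shouldReturnContent: false, // assumed to omit the HTML body
  shouldExtractTags: true,
};

// Later spreads win, so per-call overrides take precedence over the preset.
function buildOptions(overrides = {}) {
  return { ...METADATA_ONLY, ...overrides };
}

console.log(buildOptions({ shouldExtractMpLinks: true }));
```

The resulting object could then be passed as the second argument to `extract`.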
{
  done: true,
  code: 0,
  data: {
    // Account info
    account_name: "Official account name",
    account_alias: "WeChat ID",
    account_avatar: "Avatar URL",
    account_description: "Account description",
    account_id: "Original ID",
    account_biz: "biz parameter",
    account_biz_number: 1234567890,
    account_qr_code: "QR code URL",

    // Article info
    msg_title: "Article title",
    msg_desc: "Article summary",
    msg_content: "HTML content",
    msg_cover: "Cover image URL",
    msg_author: "Author",
    msg_type: "post", // post|video|image|voice|text|repost
    msg_has_copyright: true,
    msg_publish_time: Date,
    msg_publish_time_str: "2024/01/15 10:30:00",

    // Link params
    msg_link: "Article link",
    msg_source_url: "Read-original link",
    msg_sn: "sn parameter",
    msg_mid: 1234567890,
    msg_idx: 1
  }
}
{ done: false, code: 1001, msg: "无法获取文章信息" }
| Code | Message | Description |
| --- | --- | --- |
| 1000 | 文章获取失败 | General failure |
| 1001 | 无法获取文章信息 | Missing title or publish time |
| 1002 | 请求失败 | HTTP request failed |
| 1003 | 响应为空 | Empty response |
| 1004 | 访问过于频繁 | Rate limited |
| 1005 | 脚本解析失败 | Script parsing error |
| 1006 | 公众号已迁移 | Account migrated |
| 2001 | 请提供文章内容或链接 | Missing input |
| 2002 | 链接已过期 | Link expired |
| 2003 | 内容涉嫌侵权 | Content removed (copyright) |
| 2004 | 无法获取迁移后的链接 | Migration link failed |
| 2005 | 内容已被发布者删除 | Content deleted by author |
| 2006 | 内容因违规无法查看 | Content blocked |
| 2007 | 内容发送失败 | Failed to send |
| 2008 | 系统出错 | System error |
| 2009 | 不支持的链接 | Unsupported URL |
| 2010 | 内容获取失败 | Content fetch failed |
| 2011 | 涉嫌过度营销 | Marketing/spam content |
| 2012 | 账号已被屏蔽 | Account blocked |
| 2013 | 账号已自主注销 | Account deleted |
| 2014 | 内容被投诉 | Content reported |
| 2015 | 账号处于迁移流程中 | Account migrating |
| 2016 | 冒名侵权 | Impersonation |
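A caller will typically branch on `done` and then decide what to do with the error code. The following is a minimal sketch of that pattern. `summarizeResult` and `RETRYABLE_CODES` are hypothetical names, and the choice of which codes count as transient is an assumption made for illustration; the skill itself does not classify codes this way.

```javascript
// Assumption: these codes describe transient conditions (request failed,
// empty response, rate limited, system error) that may succeed on retry.
const RETRYABLE_CODES = new Set([1002, 1003, 1004, 2008]);

// Hypothetical helper: turns an extract() result into a small status object.
function summarizeResult(result) {
  if (result.done) {
    return { ok: true, retry: false, title: result.data.msg_title };
  }
  return {
    ok: false,
    retry: RETRYABLE_CODES.has(result.code),
    reason: `${result.code}: ${result.msg}`,
  };
}

// Sample objects shaped like the documented success and error responses.
console.log(summarizeResult({ done: true, code: 0, data: { msg_title: 'Example title' } }));
console.log(summarizeResult({ done: false, code: 1004, msg: '访问过于频繁' }));
console.log(summarizeResult({ done: false, code: 2005, msg: '内容已被发布者删除' }));
```

Codes in the 2000 range mostly describe the state of the article or account, so retrying them is unlikely to help; 2008 is treated as retryable here only by assumption.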
Required npm packages:
- cheerio: HTML parsing
- dayjs: Date formatting
- request-promise: HTTP requests
- qs: Query string parsing
- lodash.unescape: HTML entities
- Handles various WeChat page structures and anti-scraping measures
- Automatically detects article type from page content
- Supports extracting from Sogou WeChat search results (weixin.sogou.com)
- Some fields may be null depending on article type and page structure
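Since some fields may be null, code that consumes `data` should supply fallbacks rather than assume every field is present. Here is a minimal sketch: `describeArticle` is a hypothetical helper, and falling back from `msg_author` to `account_name` is an illustrative choice, not behavior of the skill.

```javascript
// Hypothetical helper: builds a one-line summary while tolerating null fields.
function describeArticle(data) {
  const title = data.msg_title ?? '(untitled)';
  // Illustrative fallback: use the account name when no author is given.
  const author = data.msg_author ?? data.account_name ?? '(unknown author)';
  const published =
    data.msg_publish_time instanceof Date
      ? data.msg_publish_time.toISOString()
      : '(unknown date)';
  return `${title} by ${author}, published ${published}`;
}

console.log(
  describeArticle({
    msg_title: 'Example title',
    msg_author: null,
    account_name: 'Example account',
    msg_publish_time: new Date('2024-01-15T02:30:00Z'),
  })
);
```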