{
  "schemaVersion": "1.0",
  "item": {
    "slug": "clean-skill",
    "name": "Clean Skill",
    "source": "tencent",
    "type": "skill",
    "category": "数据分析",
    "sourceUrl": "https://clawhub.ai/zhongrenfei1-hub/clean-skill",
    "canonicalUrl": "https://clawhub.ai/zhongrenfei1-hub/clean-skill",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/clean-skill",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=clean-skill",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "SKILL.md",
      "examples/example_usage.py",
      "examples/test_high_consistency.py",
      "references/api_limitations.md",
      "references/data_schema.md",
      "references/sentiment_analysis.md"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-23T16:43:11.935Z",
      "expiresAt": "2026-04-30T16:43:11.935Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=4claw-imageboard",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=4claw-imageboard",
        "contentDisposition": "attachment; filename=\"4claw-imageboard-1.0.1.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/clean-skill"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/clean-skill",
    "agentPageUrl": "https://openagent3.xyz/skills/clean-skill/agent",
    "manifestUrl": "https://openagent3.xyz/skills/clean-skill/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/clean-skill/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "Restaurant Review Cross-Check",
        "body": "Cross-reference restaurant data from Xiaohongshu and Dianping to provide validated recommendations."
      },
      {
        "title": "Quick Start",
        "body": "Query restaurants by location and cuisine type:\n\n# Basic query\ncrosscheck-restaurants \"上海静安区\" \"日式料理\"\n\n# With filters\ncrosscheck-restaurants \"北京朝阳区\" \"火锅\" --min-rating 4.5 --min-reviews 100"
      },
      {
        "title": "1. Data Collection",
        "body": "Query both platforms simultaneously:\n\nDianping:\n\nFetch restaurants matching location + cuisine\nExtract: name, rating, review_count, price_range, address, tags\n\nXiaohongshu:\n\nSearch notes/posts matching location + cuisine\nExtract: restaurant_name, engagement_metrics (likes/saves), sentiment_score\nNote: Xiaohongshu data requires scraping as no public API"
      },
      {
        "title": "2. Data Matching",
        "body": "Match restaurants across platforms using fuzzy matching:\n\nRestaurant name similarity (Levenshtein distance)\nLocation proximity (address matching)\nHandle name variations (e.g., \"银座寿司\" vs \"银座寿司静安店\")\n\nSee scripts/match_restaurants.py for matching logic."
      },
      {
        "title": "3. Consistency Analysis",
        "body": "Calculate consistency score based on:\n\nRating correlation (0-1): Correlation between platform ratings\nEngagement validation (0-1): Do high ratings correlate with high engagement?\nSentiment alignment (0-1): Do user sentiments align across platforms?\n\nFormula: consistency_score = (rating_corr * 0.5) + (engagement_val * 0.3) + (sentiment_align * 0.2)"
      },
      {
        "title": "4. Recommendation Score",
        "body": "Calculate final recommendation score:\n\nrecommendation_score = (\n    (dianping_rating * 0.4) +\n    (xhs_engagement_normalized * 0.3) +\n    (consistency_score * 0.3)\n) * 10\n\nOutput: 0-10 scale, where >8.0 = high confidence recommendation"
      },
      {
        "title": "Output Format",
        "body": "📍 [Location] [Cuisine Type] 餐厅推荐\n\n1. [Restaurant Name]\n   🏆 推荐指数: X.X/10\n   ⭐ 大众点评: X.X (Xk评价)\n   💬 小红书: X.X⭐ (X笔记)\n   📍 地址: [Address]\n   💰 人均: ¥[Price]\n   ✅ 一致性: [高/中/低] - [Brief explanation]\n   \n   📊 平台对比:\n   - 大众点评标签: [Tags]\n   - 小红书热词: [Keywords]\n   \n   ⚠️ 注意: [Any discrepancies or warnings]\n\n[Continue for top 5-10 restaurants...]"
      },
      {
        "title": "Thresholds",
        "body": "Min rating: 4.0/5.0 (configurable)\nMin reviews: 50 on Dianping, 20 notes on Xiaohongshu (configurable)\nMax results: Top 10 restaurants by recommendation score\nHigh consistency: Score > 0.7\nMedium consistency: Score 0.5-0.7\nLow consistency: Score < 0.5 (flag for manual review)"
      },
      {
        "title": "Dianping",
        "body": "Method: Web scraping (Dianping API requires business partnership)\nBase URL: https://www.dianping.com\nRate limiting: 1 request/2 seconds minimum\nAnti-scraping: Use residential proxies, rotate user agents\n\nSee scripts/fetch_dianping.py for implementation."
      },
      {
        "title": "Xiaohongshu",
        "body": "Method: Web scraping (no public API)\nBase URL: https://www.xiaohongshu.com\nRate limiting: 1 request/3 seconds minimum\nAuthentication: Cookies required for full access\n\nSee scripts/fetch_xiaohongshu.py for implementation."
      },
      {
        "title": "Configuration",
        "body": "Edit scripts/config.py to set:\n\nDEFAULT_THRESHOLDS = {\n    \"min_rating\": 4.0,\n    \"min_dianping_reviews\": 50,\n    \"min_xhs_notes\": 20,\n    \"max_results\": 10\n}\n\nPROXY_CONFIG = {\n    \"use_proxy\": True,\n    \"proxy_list\": [\"http://proxy1:port\", \"http://proxy2:port\"]\n}"
      },
      {
        "title": "Error Handling",
        "body": "No matches found: Suggest broader search terms or nearby areas\nPlatform timeout: Retry with exponential backoff, max 3 attempts\nRate limiting detected: Pause for 60 seconds, rotate proxy\nLow confidence results: Flag results with consistency < 0.5 for manual review"
      },
      {
        "title": "Sentiment Analysis",
        "body": "Xiaohongshu posts use NLP to extract:\n\nFood quality mentions\nService quality mentions\nAtmosphere mentions\nPrice/value mentions\n\nSee references/sentiment_analysis.md for methodology."
      },
      {
        "title": "Fuzzy Matching",
        "body": "Handle restaurant name variations:\n\nChain stores (e.g., \"海底捞火锅\" vs \"海底捞静安店\")\nAbbreviations (e.g., \"鼎泰丰\" vs \"鼎泰丰上海店\")\nTranslation differences\n\nUses thefuzz library for similarity scoring."
      },
      {
        "title": "Dependencies",
        "body": "pip install requests beautifulsoup4 pandas numpy thefuzz selenium lxml\n\nSee scripts/requirements.txt for complete list."
      },
      {
        "title": "Troubleshooting",
        "body": "Issue: Xiaohongshu returns empty results\n\nSolution: Check if cookies expired, re-authenticate\n\nIssue: Dianping blocks requests\n\nSolution: Reduce request rate, rotate proxies\n\nIssue: Poor matching between platforms\n\nSolution: Adjust similarity threshold in match_restaurants.py"
      },
      {
        "title": "References",
        "body": "Data schema documentation\nSentiment analysis guide\nAPI limitations"
      }
    ],
    "body": "Restaurant Review Cross-Check\n\nCross-reference restaurant data from Xiaohongshu and Dianping to provide validated recommendations.\n\nQuick Start\n\nQuery restaurants by location and cuisine type:\n\n# Basic query\ncrosscheck-restaurants \"上海静安区\" \"日式料理\"\n\n# With filters\ncrosscheck-restaurants \"北京朝阳区\" \"火锅\" --min-rating 4.5 --min-reviews 100\n\nWorkflow\n1. Data Collection\n\nQuery both platforms simultaneously:\n\nDianping:\n\nFetch restaurants matching location + cuisine\nExtract: name, rating, review_count, price_range, address, tags\n\nXiaohongshu:\n\nSearch notes/posts matching location + cuisine\nExtract: restaurant_name, engagement_metrics (likes/saves), sentiment_score\nNote: Xiaohongshu data requires scraping as no public API\n2. Data Matching\n\nMatch restaurants across platforms using fuzzy matching:\n\nRestaurant name similarity (Levenshtein distance)\nLocation proximity (address matching)\nHandle name variations (e.g., \"银座寿司\" vs \"银座寿司静安店\")\n\nSee scripts/match_restaurants.py for matching logic.\n\n3. Consistency Analysis\n\nCalculate consistency score based on:\n\nRating correlation (0-1): Correlation between platform ratings\nEngagement validation (0-1): Do high ratings correlate with high engagement?\nSentiment alignment (0-1): Do user sentiments align across platforms?\n\nFormula: consistency_score = (rating_corr * 0.5) + (engagement_val * 0.3) + (sentiment_align * 0.2)\n\n4. Recommendation Score\n\nCalculate final recommendation score:\n\nrecommendation_score = (\n    (dianping_rating * 0.4) +\n    (xhs_engagement_normalized * 0.3) +\n    (consistency_score * 0.3)\n) * 10\n\n\nOutput: 0-10 scale, where >8.0 = high confidence recommendation\n\nOutput Format\n📍 [Location] [Cuisine Type] 餐厅推荐\n\n1. [Restaurant Name]\n   🏆 推荐指数: X.X/10\n   ⭐ 大众点评: X.X (Xk评价)\n   💬 小红书: X.X⭐ (X笔记)\n   📍 地址: [Address]\n   💰 人均: ¥[Price]\n   ✅ 一致性: [高/中/低] - [Brief explanation]\n   \n   📊 平台对比:\n   - 大众点评标签: [Tags]\n   - 小红书热词: [Keywords]\n   \n   ⚠️ 注意: [Any discrepancies or warnings]\n\n[Continue for top 5-10 restaurants...]\n\nThresholds\nMin rating: 4.0/5.0 (configurable)\nMin reviews: 50 on Dianping, 20 notes on Xiaohongshu (configurable)\nMax results: Top 10 restaurants by recommendation score\nHigh consistency: Score > 0.7\nMedium consistency: Score 0.5-0.7\nLow consistency: Score < 0.5 (flag for manual review)\nAPI & Data Sources\nDianping\nMethod: Web scraping (Dianping API requires business partnership)\nBase URL: https://www.dianping.com\nRate limiting: 1 request/2 seconds minimum\nAnti-scraping: Use residential proxies, rotate user agents\n\nSee scripts/fetch_dianping.py for implementation.\n\nXiaohongshu\nMethod: Web scraping (no public API)\nBase URL: https://www.xiaohongshu.com\nRate limiting: 1 request/3 seconds minimum\nAuthentication: Cookies required for full access\n\nSee scripts/fetch_xiaohongshu.py for implementation.\n\nConfiguration\n\nEdit scripts/config.py to set:\n\nDEFAULT_THRESHOLDS = {\n    \"min_rating\": 4.0,\n    \"min_dianping_reviews\": 50,\n    \"min_xhs_notes\": 20,\n    \"max_results\": 10\n}\n\nPROXY_CONFIG = {\n    \"use_proxy\": True,\n    \"proxy_list\": [\"http://proxy1:port\", \"http://proxy2:port\"]\n}\n\nError Handling\nNo matches found: Suggest broader search terms or nearby areas\nPlatform timeout: Retry with exponential backoff, max 3 attempts\nRate limiting detected: Pause for 60 seconds, rotate proxy\nLow confidence results: Flag results with consistency < 0.5 for manual review\nAdvanced Features\nSentiment Analysis\n\nXiaohongshu posts use NLP to extract:\n\nFood quality mentions\nService quality mentions\nAtmosphere mentions\nPrice/value mentions\n\nSee references/sentiment_analysis.md for methodology.\n\nFuzzy Matching\n\nHandle restaurant name variations:\n\nChain stores (e.g., \"海底捞火锅\" vs \"海底捞静安店\")\nAbbreviations (e.g., \"鼎泰丰\" vs \"鼎泰丰上海店\")\nTranslation differences\n\nUses thefuzz library for similarity scoring.\n\nDependencies\npip install requests beautifulsoup4 pandas numpy thefuzz selenium lxml\n\n\nSee scripts/requirements.txt for complete list.\n\nTroubleshooting\n\nIssue: Xiaohongshu returns empty results\n\nSolution: Check if cookies expired, re-authenticate\n\nIssue: Dianping blocks requests\n\nSolution: Reduce request rate, rotate proxies\n\nIssue: Poor matching between platforms\n\nSolution: Adjust similarity threshold in match_restaurants.py\nReferences\nData schema documentation\nSentiment analysis guide\nAPI limitations"
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/zhongrenfei1-hub/clean-skill",
    "publisherUrl": "https://clawhub.ai/zhongrenfei1-hub/clean-skill",
    "owner": "zhongrenfei1-hub",
    "version": "1.1.0",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/clean-skill",
    "downloadUrl": "https://openagent3.xyz/downloads/clean-skill",
    "agentUrl": "https://openagent3.xyz/skills/clean-skill/agent",
    "manifestUrl": "https://openagent3.xyz/skills/clean-skill/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/clean-skill/agent.md"
  }
}