Playwright Scraper Skill

Playwright-based web scraping OpenClaw Skill with anti-bot protection. Successfully tested on complex sites like Discuss.com.hk.

Install for OpenClaw

Quick setup

Download the package from Yavira.
Extract the archive and review SKILL.md first.
Import or place the package into your OpenClaw setup.

Requirements

Target platform: OpenClaw
Install method: Manual import
Extraction: Extract archive
Prerequisites: OpenClaw
Primary doc: SKILL.md

Package facts

Download mode: Yavira redirect
Package format: ZIP package
Source platform: Tencent SkillHub
What's included: CHANGELOG.md, CONTRIBUTING.md, INSTALL.md, README.md, README_ZH.md, SKILL.md

Validation

Use the Yavira download entry.
Review SKILL.md after the package is downloaded.
Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

Download the package from Yavira.
Extract it into a folder your agent can access.
Paste one of the prompts below and point your agent at the extracted folder.

New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

Source: Tencent SkillHub
Verification: Indexed source record
Version: 1.2.0

Provenance

Publisher: waisimon

Documentation

Playwright Scraper Skill

A Playwright-based web scraping OpenClaw Skill with anti-bot protection. Choose the best approach based on the target website's anti-bot level.

🎯 Use Case Matrix

| Target Website | Anti-Bot Level | Recommended Method | Script |
|---|---|---|---|
| Regular Sites | Low | web_fetch tool | N/A (built-in) |
| Dynamic Sites | Medium | Playwright Simple | scripts/playwright-simple.js |
| Cloudflare Protected | High | Playwright Stealth ⭐ | scripts/playwright-stealth.js |
| YouTube | Special | deep-scraper | Install separately |
| Reddit | Special | reddit-scraper | Install separately |

📦 Installation

    cd playwright-scraper-skill
    npm install
    npx playwright install chromium

1️⃣ Simple Sites (No Anti-Bot)

Use OpenClaw's built-in web_fetch tool:

    # Invoke directly in OpenClaw
    Hey, fetch me the content from https://example.com

2️⃣ Dynamic Sites (Requires JavaScript)

Use Playwright Simple:

    node scripts/playwright-simple.js "https://example.com"

Example output:

    {
      "url": "https://example.com",
      "title": "Example Domain",
      "content": "...",
      "elapsedSeconds": "3.45"
    }
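If another program consumes this output, it helps to validate the shape before using it. The following is a minimal sketch, assuming the script prints exactly one JSON object on stdout with the fields shown in the example output; `parseScrapeOutput` is a hypothetical helper name and is not part of the package.

```javascript
// Hypothetical helper (not shipped with the skill): validates and normalizes
// the JSON printed by scripts/playwright-simple.js.
// Assumption: stdout contains a single JSON object shaped like the example.
function parseScrapeOutput(stdout) {
  const data = JSON.parse(stdout);

  // The example output shows these three fields as strings.
  for (const key of ['url', 'title', 'content']) {
    if (typeof data[key] !== 'string') {
      throw new TypeError(`missing or non-string field: ${key}`);
    }
  }

  // elapsedSeconds is a string in the example ("3.45"); convert it to a number.
  return { ...data, elapsedSeconds: Number(data.elapsedSeconds) };
}
```

Pass it the stdout captured from the command above. Malformed JSON raises a `SyntaxError` from `JSON.parse`, and a missing field raises a `TypeError`, so callers can tell the two failure modes apart.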

3️⃣ Anti-Bot Protected Sites (Cloudflare etc.)

Use Playwright Stealth:

    node scripts/playwright-stealth.js "https://m.discuss.com.hk/#hot"

Features:

- Hide automation markers (navigator.webdriver = false)
- Realistic User-Agent (iPhone, Android)
- Random delays to mimic human behavior
- Screenshot and HTML saving support
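The first three features can be sketched with standard Playwright calls. This is an illustration of the technique, not the code in scripts/playwright-stealth.js: the helper names, the User-Agent string, and the delay range are all assumptions.

```javascript
// Illustrative iPhone Safari User-Agent (assumption; the shipped script may differ).
const MOBILE_UA =
  'Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) ' +
  'AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1';

// Returns a random integer in [minMs, maxMs], used to space out actions.
// rng is injectable so the function can be tested deterministically.
function randomDelayMs(minMs, maxMs, rng = Math.random) {
  return Math.floor(minMs + rng() * (maxMs - minMs + 1));
}

// Runs inside the page before any site script executes,
// so navigator.webdriver reads as false from the first script onward.
function hideWebdriver() {
  Object.defineProperty(navigator, 'webdriver', { get: () => false });
}

// `browser` is a Playwright Browser, for example from chromium.launch().
async function newStealthPage(browser, userAgent = MOBILE_UA) {
  const context = await browser.newContext({ userAgent });
  await context.addInitScript(hideWebdriver);
  return context.newPage();
}

// Usage with Playwright installed (inside an async function):
//   const { chromium } = require('playwright');
//   const browser = await chromium.launch();
//   const page = await newStealthPage(browser);
//   await page.goto('https://m.discuss.com.hk/#hot');
//   await page.waitForTimeout(randomDelayMs(1000, 3000));
```

Registering the init script on the context rather than on a single page means every page opened from that context gets the same patch.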

4️⃣ YouTube Video Transcripts

Use deep-scraper (install separately):

    # Install deep-scraper skill
    npx clawhub install deep-scraper

    # Use it
    cd skills/deep-scraper
    node assets/youtube_handler.js "https://www.youtube.com/watch?v=VIDEO_ID"

scripts/playwright-simple.js

- Use Case: Regular dynamic websites
- Speed: Fast (3-5 seconds)
- Anti-Bot: None
- Output: JSON (title, content, URL)

scripts/playwright-stealth.js ⭐

- Use Case: Sites with Cloudflare or anti-bot protection
- Speed: Medium (5-20 seconds)
- Anti-Bot: Medium-High (hides automation, realistic UA)
- Output: JSON + Screenshot + HTML file
- Verified: 100% success on Discuss.com.hk

1. Try web_fetch First

If the site doesn't load its content dynamically, use OpenClaw's web_fetch tool; it is the fastest option.

2. Need JavaScript? Use Playwright Simple

If you need to wait for JavaScript rendering, use playwright-simple.js.

3. Getting Blocked? Use Stealth

If you encounter 403 or Cloudflare challenges, use playwright-stealth.js.

4. Special Sites Need Specialized Skills

- YouTube → deep-scraper
- Reddit → reddit-scraper
- Twitter → bird skill
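The four steps above amount to a small routing decision. The sketch below is one way to encode it; `chooseMethod` and the domain-to-skill table are hypothetical (the source names the sites and skills, but not the domains matched here).

```javascript
// Assumed domain mapping for the special sites named above.
const SPECIAL_SITES = {
  'youtube.com': 'deep-scraper',
  'reddit.com': 'reddit-scraper',
  'twitter.com': 'bird',
};

// Hypothetical routing helper: returns the tool, script, or skill to try.
//   needsJavaScript: the page only renders its content after running scripts
//   blocked: a previous attempt hit a 403 or a Cloudflare challenge
function chooseMethod(url, { needsJavaScript = false, blocked = false } = {}) {
  const host = new URL(url).hostname.replace(/^www\./, '');

  // Step 4: special sites go to their dedicated skills, including subdomains.
  for (const [domain, skill] of Object.entries(SPECIAL_SITES)) {
    if (host === domain || host.endsWith('.' + domain)) return skill;
  }

  if (blocked) return 'scripts/playwright-stealth.js'; // step 3
  if (needsJavaScript) return 'scripts/playwright-simple.js'; // step 2
  return 'web_fetch'; // step 1: cheapest option first
}
```

For example, `chooseMethod('https://m.discuss.com.hk/#hot', { blocked: true })` selects the stealth script, while a plain call with no flags falls through to `web_fetch`.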

🔧 Customization

All scripts support environment variables:

    # Set screenshot path
    SCREENSHOT_PATH=/path/to/screenshot.png node scripts/playwright-stealth.js URL

    # Set wait time (milliseconds)
    WAIT_TIME=10000 node scripts/playwright-simple.js URL

    # Enable headful mode (show browser)
    HEADLESS=false node scripts/playwright-stealth.js URL

    # Save HTML
    SAVE_HTML=true node scripts/playwright-stealth.js URL

    # Custom User-Agent
    USER_AGENT="Mozilla/5.0 ..." node scripts/playwright-stealth.js URL
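A wrapper script that wants to honor the same variables could read them as follows. This is a sketch only: `readConfig` is a hypothetical name, and the defaults (5000 ms wait, headless on, HTML saving off) are assumptions that may not match the shipped scripts.

```javascript
// Hypothetical reader for the environment variables listed above.
// Defaults are assumptions, not values confirmed by the package.
function readConfig(env = process.env) {
  const waitTime = Number.parseInt(env.WAIT_TIME ?? '5000', 10);
  if (!Number.isFinite(waitTime) || waitTime < 0) {
    throw new RangeError(`WAIT_TIME must be a non-negative integer, got "${env.WAIT_TIME}"`);
  }

  return {
    screenshotPath: env.SCREENSHOT_PATH ?? null,
    waitTime, // milliseconds
    headless: env.HEADLESS !== 'false', // only the literal "false" shows the browser
    saveHtml: env.SAVE_HTML === 'true',
    userAgent: env.USER_AGENT ?? null,
  };
}
```

Taking `env` as a parameter keeps the function easy to test without mutating `process.env`.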

📊 Performance Comparison

| Method | Speed | Anti-Bot | Success Rate (Discuss.com.hk) |
|---|---|---|---|
| web_fetch | ⚡ Fastest | ❌ None | 0% |
| Playwright Simple | 🚀 Fast | ⚠️ Low | 20% |
| Playwright Stealth | ⏱️ Medium | ✅ Medium | 100% ✅ |
| Puppeteer Stealth | ⏱️ Medium | ✅ Medium-High | ~80% |
| Crawlee (deep-scraper) | 🐢 Slow | ❌ Detected | 0% |
| Chaser (Rust) | ⏱️ Medium | ❌ Detected | 0% |

🛡️ Anti-Bot Techniques Summary

Lessons learned from our testing:

✅ Effective Anti-Bot Measures

- Hide navigator.webdriver: essential
- Realistic User-Agent: use real devices (iPhone, Android)
- Mimic human behavior: random delays, scrolling
- Avoid framework signatures: Crawlee and Selenium are easily detected
- Use addInitScript (Playwright): inject before page load

❌ Ineffective Anti-Bot Measures

- Only changing the User-Agent: not enough
- Using high-level frameworks (Crawlee): more easily detected
- Docker isolation: doesn't help with Cloudflare

Issue: 403 Forbidden

Solution: Use playwright-stealth.js

Issue: Cloudflare Challenge Page

Solution:

- Increase the wait time (10-15 seconds)
- Try headless: false (headful mode sometimes has a higher success rate)
- Consider using proxy IPs
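Instead of a fixed sleep, a script can poll until the challenge page goes away, up to the 10-15 second budget suggested above. The sketch below rests on a heuristic that is an assumption, not something the source states: that the challenge interstitial can be recognized by a page title containing "Just a moment". `waitForChallengeToClear` is a hypothetical helper.

```javascript
// Heuristic (assumption): the challenge interstitial is identified by its title.
const CHALLENGE_TITLE = /just a moment/i;

// page.title() can throw while the page is mid-navigation; treat that as "unknown".
async function safeTitle(page) {
  try {
    return await page.title();
  } catch {
    return null;
  }
}

// Polls the page title until it no longer looks like a challenge page.
// Returns true if the challenge cleared, false if the time budget ran out.
async function waitForChallengeToClear(page, { timeoutMs = 15000, pollMs = 1000 } = {}) {
  let waited = 0;
  while (true) {
    const title = await safeTitle(page);
    if (title !== null && !CHALLENGE_TITLE.test(title)) return true;
    if (waited >= timeoutMs) return false;
    await page.waitForTimeout(pollMs);
    waited += pollMs;
  }
}
```

A `false` result is the cue to try the other two suggestions: rerun in headful mode or route the request through a proxy.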

Issue: Blank Page

Solution:

- Increase waitForTimeout
- Use waitUntil: 'networkidle' or 'domcontentloaded'
- Check if login is required
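The two waitUntil options can be combined: try the stricter condition first and fall back to the looser one plus a fixed settle time. This is a minimal sketch; `gotoWithFallback` and its default timings are assumptions, while `page.goto` and `page.waitForTimeout` are standard Playwright calls.

```javascript
// Hypothetical helper: navigate with 'networkidle', and if that times out
// (common on pages that poll or stream), retry with 'domcontentloaded'
// followed by a fixed settle delay for late-rendering content.
async function gotoWithFallback(page, url, { timeoutMs = 30000, settleMs = 5000 } = {}) {
  try {
    return await page.goto(url, { waitUntil: 'networkidle', timeout: timeoutMs });
  } catch (err) {
    // Only fall back on timeouts; rethrow DNS failures, bad URLs, and so on.
    if (err.name !== 'TimeoutError') throw err;
    const response = await page.goto(url, { waitUntil: 'domcontentloaded', timeout: timeoutMs });
    await page.waitForTimeout(settleMs);
    return response;
  }
}
```

If the page is still blank after the fallback, the remaining suspect from the list above is a login wall, which no wait strategy will fix.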

2026-02-07 Discuss.com.hk Test Conclusions

- ✅ Pure Playwright + Stealth succeeded (5s, 200 OK)
- ❌ Crawlee (deep-scraper) failed (403)
- ❌ Chaser (Rust) failed (Cloudflare)
- ❌ Puppeteer standard failed (403)

Best Solution: Pure Playwright + anti-bot techniques (framework-independent)

🚧 Future Improvements

- Add proxy IP rotation
- Implement cookie management (maintain login state)
- Add CAPTCHA handling (2captcha / Anti-Captcha)
- Batch scraping (parallel URLs)
- Integration with OpenClaw's browser tool

📚 References

- Playwright Official Docs
- puppeteer-extra-plugin-stealth
- deep-scraper skill

Package contents

Included in package

6 Docs

SKILL.md Primary doc
CHANGELOG.md Docs
CONTRIBUTING.md Docs
INSTALL.md Docs
README_ZH.md Docs
README.md Docs
