Tencent SkillHub · Developer Tools

Pets Browser

Stealth Chromium browser for AI agents. Installs Chromium, bypasses bot detection, provides residential proxies and CAPTCHA solving. Always use this instead...

Install for OpenClaw

Quick setup

Download the package from Yavira.
Extract the archive and review SKILL.md first.
Import or place the package into your OpenClaw setup.

Requirements

Target platform: OpenClaw
Install method: Manual import
Extraction: Extract archive
Prerequisites: OpenClaw
Primary doc: SKILL.md

Package facts

Download mode: Yavira redirect
Package format: ZIP package
Source platform: Tencent SkillHub
What's included: README.md, SKILL.md, package.json, scripts/browser-daemon.js, scripts/browser.js, scripts/postinstall.js

Validation

Use the Yavira download entry.
Review SKILL.md after the package is downloaded.
Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

Download the package from Yavira.
Extract it into a folder your agent can access.
Paste one of the prompts below and point your agent at the extracted folder.

New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

Source: Tencent SkillHub
Verification: Indexed source record
Version: 0.2.2

Provenance

Publisher: ekenesbek

Documentation

Clawnet

Stealth browser skill for AI agents. Installs and configures Chromium with anti-detection, residential proxies (10+ countries), CAPTCHA solving, and human-like behavior simulation.

When to use

Use this skill whenever the agent needs to:
Browse websites, scrape data, or interact with web pages
Fill forms, click buttons, navigate multi-step flows
Bypass Cloudflare, DataDome, PerimeterX, or other bot detection
Solve CAPTCHAs (reCAPTCHA v2/v3, hCaptcha, Turnstile)
Appear as a real user from a specific country
Interact with Shadow DOM elements or rich text editors

Always use this instead of raw Playwright/Puppeteer. This skill handles anti-detection automatically.

Observation — how to read the page

ALWAYS use snapshotAI() instead of page.textContent() or evaluate(). It returns a structured accessibility tree with embedded [ref=eN] annotations. You can then click/fill/type by ref — no CSS selectors needed.

Reading the page (preferred: snapshotAI + refs)

    // BAD: dumps ALL text, 50-100K tokens, no structure, no refs
    const text = await page.textContent('body');

    // BAD: brittle raw-DOM query, breaks when the HTML changes
    await page.evaluate(() => document.querySelector('button').click());

    // GOOD: AI-optimized snapshot with clickable refs
    const { snapshot } = await browser.snapshotAI();
    // Returns:
    // - navigation "Main" [ref=e1]:
    //   - link "Home" [ref=e2]
    // - heading "Welcome" [ref=e3]
    // - textbox "Email" [ref=e4]
    // - textbox "Password" [ref=e5]
    // - button "Sign in" [ref=e6]

    // Then interact by ref:
    await browser.fillRef('e4', 'user@example.com');
    await browser.fillRef('e5', 'secret');
    await browser.clickRef('e6');
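When the agent drives the page from a script rather than reading the snapshot itself, it can help to look a ref up by role and accessible name. The sketch below is illustrative only: `findRef` is a hypothetical helper, not part of the skill's API, and it assumes every snapshot line has the `- role "Name" [ref=eN]` shape shown above.

```javascript
// Hypothetical helper (not part of the skill): find the ref for an element
// in snapshotAI() output by its role and accessible name.
function findRef(snapshot, role, name) {
  // Assumed line shape: - role "Name" [ref=eN]
  const linePattern = /^\s*-\s+(\w+)\s+"([^"]*)"\s+\[ref=(e\d+)\]/;
  for (const line of snapshot.split('\n')) {
    const match = linePattern.exec(line);
    if (match && match[1] === role && match[2] === name) {
      return match[3];
    }
  }
  return null; // no matching element in this snapshot
}

const sample = [
  '- navigation "Main" [ref=e1]:',
  '  - link "Home" [ref=e2]',
  '- heading "Welcome" [ref=e3]',
  '- textbox "Email" [ref=e4]',
  '- textbox "Password" [ref=e5]',
  '- button "Sign in" [ref=e6]',
].join('\n');

console.log(findRef(sample, 'textbox', 'Email'));
console.log(findRef(sample, 'button', 'Sign in'));
```

The returned string can be passed straight to fillRef() or clickRef(). A null result is a signal to take a fresh snapshot rather than guess.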

Alternative: snapshot() (YAML without refs)

    // Compact accessibility tree without refs: use when you don't need to interact
    const tree = await browser.snapshot();
    const interactive = await browser.snapshot({ interactiveOnly: true });
    const formTree = await browser.snapshot({ selector: 'form' });

Observation workflow

Before every action, follow this sequence:
1. Dismiss overlays and accept cookies: after every page.goto(), call await browser.dismissOverlays() to auto-close cookie banners, consent popups, and notification prompts. If a cookie banner or consent dialog is still visible in the snapshot, click "Accept" / "Accept all" / "Принять" (Russian for "Accept") before doing anything else. Never skip this step: cookie overlays block interaction with page elements underneath.
2. Snapshot: const { snapshot } = await browser.snapshotAI() to see the page with refs
3. Read text: await browser.extractText() if you need clean readable text (menus, prices, articles)
4. Visual check: await browser.takeScreenshot() only if you need to see colors, layout, maps, or images
5. Act by ref: await browser.clickRef('e4'), await browser.fillRef('e5', 'text'), etc.
6. Verify: await browser.snapshotAI() again to confirm the action worked
7. Batch: use batchActions() for multi-step flows

Targeting elements — use refs from snapshotAI()

ALWAYS use refs from snapshotAI() output. NEVER use CSS selectors or evaluate() with regex.

    // BAD: brittle CSS selectors that break when the HTML changes
    await page.click('#login_field');
    await page.fill('input[name="email"]', 'user@example.com');

    // BAD: regex on raw DOM, blind guessing
    // (querySelectorAll returns a NodeList, which has no find(), hence Array.from)
    await page.evaluate(() =>
      Array.from(document.querySelectorAll('button')).find(b => /sign in/i.test(b.innerText))?.click());

    // GOOD: ref-based from snapshotAI() output
    const { snapshot } = await browser.snapshotAI();
    // snapshot shows: textbox "Email" [ref=e4], button "Sign in" [ref=e6]
    await browser.fillRef('e4', 'user@example.com');
    await browser.clickRef('e6');

    // ALSO GOOD: semantic locators (when you know the label)
    await page.getByLabel('Email').fill('user@example.com');
    await page.getByLabel('Password').fill('secret');
    await page.getByRole('button', { name: 'Sign in' }).click();

    // Also available:
    await page.getByPlaceholder('Search...').fill('query');
    await page.getByText('Welcome back').isVisible();
    await page.getByRole('link', { name: 'Home' }).click();
    await page.getByRole('checkbox', { name: 'Remember me' }).check();

When you see - textbox "Email" in the snapshot, use page.getByRole('textbox', { name: 'Email' }). When you see - button "Submit", use page.getByRole('button', { name: 'Submit' }).

When to fall back to CSS selectors

Only use CSS selectors when:
The element has no accessible name or role (rare in modern sites)
You need to target by data-testid or other test attributes
Shadow DOM elements are not reachable by semantic locators (use shadowFill/shadowClickButton)

Multi-tab — parallel tasks

Use multiple tabs only when the user needs different websites open at the same time. One tab per website/service — not one tab per action.

When to open a new tab vs reuse the current one

New tab: a different website or service that the user may want to come back to:
"Order a taxi AND book a restaurant" → 2 tabs (Uber + OpenTable)
"Compare prices on Amazon and eBay" → 2 tabs

Same tab: same website, sequential actions:
"Order a taxi for me, then for my friend" → 1 tab (Uber), two orders one after another
"Book a table for Saturday, then book another for Sunday" → 1 tab (OpenTable), two bookings
"Search for Air Jordans, then search for Nike Dunks" → 1 tab (Nike), two searches

Think like a human: you wouldn't open a second Uber tab to order a second ride. You'd finish the first ride, then start the second one in the same tab.
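The one-tab-per-site rule can be expressed as grouping tasks by hostname. This is a minimal sketch, assuming each task carries a URL; `planTabs` and the task objects are hypothetical and not part of the skill.

```javascript
// Hypothetical planner: one tab per distinct site; tasks on the same site
// share a tab and run one after another.
function planTabs(tasks) {
  const bySite = new Map();
  for (const task of tasks) {
    // Treat www.uber.com and uber.com as the same site.
    const site = new URL(task.url).hostname.replace(/^www\./, '');
    if (!bySite.has(site)) bySite.set(site, []);
    bySite.get(site).push(task.name);
  }
  return [...bySite].map(([site, names]) => ({ site, tasks: names }));
}

const plan = planTabs([
  { name: 'taxi for me', url: 'https://www.uber.com' },
  { name: 'taxi for my friend', url: 'https://uber.com/ride' },
  { name: 'dinner booking', url: 'https://www.opentable.com' },
]);
console.log(plan.length); // number of tabs to open
```

Each entry in the plan corresponds to one newTab() call (or the first tab from launchBrowser()); the tasks inside an entry are handled sequentially in that tab.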

Opening tabs

launchBrowser() gives you the first tab. Open more with newTab():

    const { launchBrowser } = require('clawnet/scripts/browser');

    // First tab comes from launchBrowser()
    const taxi = await launchBrowser({ country: 'us', mobile: false });
    await taxi.page.goto('https://uber.com');

    // Open more tabs; each returns its own result object
    const resto = await taxi.newTab({ url: 'https://opentable.com', label: 'restaurant' });
    const shop = await taxi.newTab({ url: 'https://nike.com', label: 'sneakers' });

Each tab object (taxi, resto, shop) has the full API: page.goto(), snapshotAI(), clickRef(), fillRef(), takeScreenshot(), etc., all scoped to that tab.

Working with tabs

Rule: keep a named variable per tab. This is how you "remember" which tab is which.

    // Work on the taxi tab
    await taxi.page.goto('https://uber.com/ride');
    const { snapshot } = await taxi.snapshotAI();
    await taxi.fillRef('e5', '123 Main St'); // pickup address
    await taxi.clickRef('e9'); // "Request ride"

    // Switch to the restaurant tab: just use the variable
    const { snapshot: restoSnap } = await resto.snapshotAI();
    await resto.fillRef('e3', '2 guests');
    await resto.fillRef('e4', 'March 8, 7pm');
    await resto.clickRef('e7'); // "Find a table"

    // Switch to sneakers
    await shop.snapshotAI();
    await shop.clickRef('e12'); // "Air Jordan 1"

No explicit "switch tab" call is needed; just use the right variable. Each variable is bound to its tab.

Checking all tabs

    const { tabs } = await taxi.listTabs();
    // [
    //   { tabId: "t_a1b2c3", url: "https://uber.com/ride", label: "", active: false },
    //   { tabId: "t_d4e5f6", url: "https://opentable.com/...", label: "restaurant", active: false },
    //   { tabId: "t_g7h8i9", url: "https://nike.com/...", label: "sneakers", active: true },
    // ]

Going back to a tab

If you lost the variable (e.g., across script invocations), use switchTab(tabId):

    // From listTabs() you know the tabId
    const uberTab = await taxi.switchTab('t_a1b2c3');
    await uberTab.snapshotAI(); // see what's on the Uber tab now
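To pick the right tabId out of the listTabs() result, a small lookup by label or hostname is usually enough. The helper below is a sketch under assumptions: `findTab` is hypothetical, and the tab objects only mirror the fields shown in the listTabs() example.

```javascript
// Hypothetical helper: find a tab in listTabs() output by label and/or hostname.
function findTab(tabs, query) {
  const found = tabs.find((tab) => {
    if (query.label !== undefined && tab.label !== query.label) return false;
    if (query.host !== undefined && new URL(tab.url).hostname !== query.host) return false;
    return true;
  });
  return found || null;
}

// Sample data shaped like the listTabs() output.
const tabs = [
  { tabId: 't_a1b2c3', url: 'https://uber.com/ride', label: '', active: false },
  { tabId: 't_d4e5f6', url: 'https://opentable.com/s', label: 'restaurant', active: false },
  { tabId: 't_g7h8i9', url: 'https://nike.com/w', label: 'sneakers', active: true },
];

console.log(findTab(tabs, { host: 'uber.com' }).tabId);
console.log(findTab(tabs, { label: 'sneakers' }).tabId);
```

The resulting tabId is what switchTab() expects. The first tab has an empty label in the example, which is why matching by hostname is offered as a fallback.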

Closing a tab

    await shop.closeTab(); // close the sneakers tab
    // shop variable is now stale, don't use it

Multi-tab workflow pattern

When the user gives you multiple parallel tasks:
1. Plan: identify separate tasks (taxi, restaurant, sneakers)
2. Open tabs: one newTab() per task, save each to a named variable
3. Work round-robin: do a chunk of work on each tab, take screenshots
4. Report: show the user screenshots from each tab so they see all progress
5. Go back: when the user says "cancel the taxi" or "check the menu", switch to the right tab variable

Example: user says "Order a taxi, book a table, and find sneakers"

    // Phase 1: open all tabs
    const taxi = await launchBrowser({ country: 'us', mobile: false });
    const resto = await taxi.newTab({ url: 'https://opentable.com' });
    const shop = await taxi.newTab({ url: 'https://nike.com' });

    // Phase 2: start each task (snapshot first, per the observation workflow;
    // the ref numbers below are illustrative)
    await taxi.page.goto('https://uber.com');
    await taxi.snapshotAI();
    await taxi.fillRef('e3', 'Airport'); // destination
    const taxiSS = await taxi.takeScreenshot();

    await resto.snapshotAI();
    await resto.fillRef('e2', 'Italian'); // cuisine search
    await resto.clickRef('e5'); // search
    const restoSS = await resto.takeScreenshot();

    await shop.snapshotAI();
    await shop.fillRef('e1', 'Air Jordan'); // search
    await shop.clickRef('e3'); // search button
    const shopSS = await shop.takeScreenshot();

    // Phase 3: report to user (ALL tabs' screenshots)
    // "Here's what I've set up: [taxi screenshot] [restaurant screenshot] [shop screenshot]"

    // Phase 4: user says "cancel the taxi, check restaurant prices"
    await taxi.clickRef('e15'); // "Cancel" button
    const cancelSS = await taxi.takeScreenshot();
    const { text } = await resto.extractText(); // read menu prices
    const pricesSS = await resto.takeScreenshot();

Key rules

One tab per website/service, not one tab per action. Sequential tasks on the same site happen in one tab
New tab only for a different site that the user may want to come back to
One variable per tab: don't reuse variables, name them by purpose
Tabs share cookies: login on one tab is visible on all tabs (same browser context)
Screenshots from each tab: always show the user what's happening on each tab
Don't open too many tabs: 2-4 is practical, more gets confusing for both you and the user
Tabs survive between script runs: the daemon keeps them alive. Use listTabs() to rediscover them

Screenshot rules

ALWAYS attach a screenshot when communicating with the user. The user cannot see the browser — you are their eyes. Every message to the user MUST include a screenshot. No exceptions.

When to take screenshots

Every message you send to the user must have a screenshot attached. Specifically:
Before asking for confirmation: "Book this table?" + screenshot of the filled form. The user must SEE what they are confirming.
When reporting an error: "No slots available" + screenshot proving the result. Without a screenshot, the user has no reason to trust you.
When unable to complete an action: "Authorization failed" + screenshot showing what happened.
After every key step: filled form, selected date, entered address, etc.
When completing the task (MANDATORY): "Done! Order placed" + screenshot of the final result/confirmation page. The user must see proof that the action was completed.

How to take screenshots

Use the built-in helpers returned by launchBrowser():

    const { page, takeScreenshot, screenshotAndReport } = await launchBrowser();

    // Option 1: just the base64 screenshot
    const base64 = await takeScreenshot();

    // Option 2: screenshot + message bundled together
    const report = await screenshotAndReport("Form filled. Confirm booking?");
    // → { message: "Form filled...", screenshot: "iVBOR...", mimeType: "image/png" }

Or directly via Playwright:

    const screenshot = await page.screenshot({ type: 'png' });
    const base64 = screenshot.toString('base64');
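Before attaching a screenshot to a message, a cheap sanity check can confirm that the string really decodes to a PNG. The sketch below relies only on the standard 8-byte PNG file signature; `isPngBase64` is a hypothetical helper, not something the skill exports.

```javascript
// Hypothetical check: does a base64 string start with the PNG file signature?
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

function isPngBase64(base64) {
  if (typeof base64 !== 'string') return false;
  // 12 base64 characters decode to 9 bytes, enough to cover the 8-byte signature
  // without decoding the whole image.
  const header = Buffer.from(base64.slice(0, 12), 'base64').subarray(0, PNG_SIGNATURE.length);
  return header.equals(PNG_SIGNATURE);
}

// Stand-in for a real screenshot: the signature followed by a few arbitrary bytes.
const fakePng = Buffer.concat([PNG_SIGNATURE, Buffer.from([1, 2, 3, 4])]).toString('base64');
console.log(isPngBase64(fakePng));
console.log(isPngBase64(Buffer.from('not an image').toString('base64')));
```

This is why the example report above shows a screenshot value beginning with "iVBOR": that prefix is the base64 encoding of the PNG signature.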

Rules

Never tell the user "the form is empty", "widget is disabled", or "no results" without a screenshot as proof. Never ask for confirmation without showing the current state of the page. Never say "Done!" without a screenshot of the final result. The user must see what you see. Always.

Installation

    clawhub install clawnet

This installs Chromium via Playwright automatically. No separate browser download needed.

Important (Linux/Docker): If Chromium fails to launch with missing library errors (e.g. libnspr4.so), install system dependencies:

    npx playwright install-deps chromium
    # or manually:
    apt-get update && apt-get install -y libnss3 libnspr4 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 libxkbcommon0 libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libpango-1.0-0 libpangocairo-1.0-0 libcairo2 libasound2

First-Run Onboarding

launchBrowser() is zero-config — it auto-registers a new agent and saves credentials if none exist. However, on first use the agent SHOULD check if the user already has credentials to avoid creating duplicate agents:

Step 1: Check for existing credentials

    const fs = require('fs');
    const path = require('path');
    const os = require('os');

    const CREDS_FILE = path.join(os.homedir(), '.clawnet', 'agent-credentials.json');

    let hasCreds = false;
    try {
      hasCreds = Boolean(JSON.parse(fs.readFileSync(CREDS_FILE, 'utf-8')).agentId);
    } catch {}

If credentials exist on disk → skip to Step 3 (just launch).

Step 2: Ask the user

Ask the user: "Do you have existing Clawnet credentials (agentId:agentSecret), or should I create new ones?"

If the user provides credentials → save them to CREDS_FILE:

    // userInput is the "agentId:agentSecret" string the user provided
    const [agentId, agentSecret] = userInput.split(':');
    fs.mkdirSync(path.dirname(CREDS_FILE), { recursive: true, mode: 0o700 });
    fs.writeFileSync(
      CREDS_FILE,
      JSON.stringify({ agentId, agentSecret, createdAt: new Date().toISOString() }, null, 2),
      { mode: 0o600 }
    );

If the user says "new" or "no" → skip to Step 3. launchBrowser() will auto-register.
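User-supplied credentials are worth validating before they are written to disk. The following is a minimal sketch: `parseCredentials` is a hypothetical helper, and it assumes the agentId never contains a colon, so everything after the first colon is treated as the secret.

```javascript
// Hypothetical helper: split "agentId:agentSecret" on the FIRST colon only,
// and reject input that is missing either part.
function parseCredentials(input) {
  const trimmed = String(input).trim();
  const sep = trimmed.indexOf(':');
  if (sep <= 0 || sep === trimmed.length - 1) {
    throw new Error('Expected credentials in the form agentId:agentSecret');
  }
  return {
    agentId: trimmed.slice(0, sep),
    agentSecret: trimmed.slice(sep + 1),
  };
}

console.log(parseCredentials('  1234-abcd:s3cr3t  '));
```

Rejecting malformed input up front avoids saving a credentials file with an undefined agentSecret, which would otherwise only surface as an authentication failure on the next launch.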

Step 3: Launch

    const { launchBrowser } = require('clawnet/scripts/browser');
    const { browser, page } = await launchBrowser({ country: 'us' });

That's it. No env vars needed. launchBrowser() will:
Find credentials on disk (or auto-register a new agent with the API)
Fetch managed proxy + CAPTCHA keys
Launch stealth Chromium with residential proxy

After first launch, show credentials and support contacts to the user:

    Your Clawnet credentials:
    agentId: <agentId>
    agentSecret: <agentSecret>
    Save these for future use or transfer to another agent.
    If you run into any issues: [Discord](https://discord.com/invite/wtA85mJM) | [Telegram](https://t.me/+nPjZFg8PJc4zZTIy)

Trial model

On the first launchBrowser() call, a 2-hour free trial starts. After expiry, getCredentials() returns upgradeUrl. Show it to the user:

    Trial expired. Subscribe to continue: <upgradeUrl>
    Or set your own proxy/CAPTCHA keys (BYO mode).

After payment

Subscription activates automatically within seconds (webhook). No manual steps needed — the next launchBrowser() call will receive managed credentials.

Transfer / Recovery / Rotation

To transfer/recover on another agent, provide the same agentId + agentSecret during install. Backend rule: one subscriptionId can be linked to only one agentId at a time. To rotate a compromised secret, keep the same agentId and issue a new agentSecret (authorized by current secret or recovery code). Old secret is invalidated immediately.

Cancel subscription

If the user asks to cancel their subscription, call the cancel endpoint:

    // fs and CREDS_FILE are defined as in Step 1 of the onboarding flow
    const creds = JSON.parse(fs.readFileSync(CREDS_FILE, 'utf-8'));
    const apiUrl = process.env.CN_API_URL || 'https://api.clawpets.io/clawnet/v1';
    const token = `CN1.${creds.agentId}.${creds.agentSecret}`;
    const resp = await fetch(`${apiUrl}/cancel-subscription`, {
      method: 'POST',
      headers: { Authorization: `Bearer ${token}` },
    });
    const result = await resp.json();
    // { canceled: true, accessUntil: "2026-04-02T00:00:00Z", message: "..." }

Show the result to the user:

    Your subscription has been canceled. Access remains until <accessUntil>.
    If you change your mind, you can resubscribe anytime.
    Need help? [Discord](https://discord.com/invite/wtA85mJM) | [Telegram](https://t.me/+nPjZFg8PJc4zZTIy)

Option A: Managed credentials (default, recommended)

The onboarding flow above sets everything up automatically. Environment variables used:

    CN_API_URL=https://api.clawpets.io/clawnet/v1
    # Set automatically by onboarding, or manually:
    CN_AGENT_TOKEN=CN1.<agentId>.<agentSecret>
    # Or separately:
    CN_AGENT_ID=<agent-uuid>
    CN_AGENT_SECRET=<agent-secret>

The skill will automatically fetch Decodo proxy credentials and a 2captcha API key on launch.
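The CN_AGENT_TOKEN value is the agentId and agentSecret joined under a CN1 prefix. A small sketch of building and splitting that string follows; both function names are hypothetical, and the parser assumes the agentId (a UUID) contains no dots.

```javascript
// Hypothetical helpers for the CN1.<agentId>.<agentSecret> token format.
function buildAgentToken(agentId, agentSecret) {
  return `CN1.${agentId}.${agentSecret}`;
}

function parseAgentToken(token) {
  // Assumption: agentId has no dots, so anything after the second dot is the secret.
  const [version, agentId, ...rest] = String(token).split('.');
  const agentSecret = rest.join('.');
  if (version !== 'CN1' || !agentId || agentSecret === '') {
    throw new Error('Not a CN1 agent token');
  }
  return { agentId, agentSecret };
}

const token = buildAgentToken('123e4567-e89b-12d3-a456-426614174000', 'secret.with.dots');
console.log(parseAgentToken(token).agentId);
```

Either the combined token or the separate CN_AGENT_ID / CN_AGENT_SECRET pair carries the same information, so a script only needs to set one of the two forms.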

Option B: BYO (Bring Your Own)

Set proxy and CAPTCHA credentials directly:

    CN_PROXY_PROVIDER=decodo   # decodo | brightdata | iproyal | nodemaven
    CN_PROXY_USER=your-proxy-user
    CN_PROXY_PASS=your-proxy-pass
    CN_PROXY_COUNTRY=us        # us, gb, de, nl, jp, fr, ca, au, sg, ro, br, in
    TWOCAPTCHA_KEY=your-2captcha-key
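A quick pre-flight check of the BYO variables can catch typos before a launch fails. This sketch is not part of the skill: `checkByoEnv` is hypothetical, and treating all five variables as required is an assumption, since the skill may apply defaults for some of them.

```javascript
// Allowed values copied from the BYO variable list.
const PROVIDERS = ['decodo', 'brightdata', 'iproyal', 'nodemaven'];
const COUNTRIES = ['us', 'gb', 'de', 'nl', 'jp', 'fr', 'ca', 'au', 'sg', 'ro', 'br', 'in'];

// Hypothetical pre-flight check; returns a list of human-readable problems.
function checkByoEnv(env) {
  const problems = [];
  for (const key of ['CN_PROXY_USER', 'CN_PROXY_PASS', 'TWOCAPTCHA_KEY']) {
    if (!env[key]) problems.push(`${key} is not set`);
  }
  if (!PROVIDERS.includes(env.CN_PROXY_PROVIDER)) {
    problems.push(`CN_PROXY_PROVIDER must be one of: ${PROVIDERS.join(', ')}`);
  }
  if (!COUNTRIES.includes(env.CN_PROXY_COUNTRY)) {
    problems.push(`CN_PROXY_COUNTRY must be one of: ${COUNTRIES.join(', ')}`);
  }
  return problems;
}

const problems = checkByoEnv({
  CN_PROXY_PROVIDER: 'decodo',
  CN_PROXY_USER: 'your-proxy-user',
  CN_PROXY_PASS: 'your-proxy-pass',
  CN_PROXY_COUNTRY: 'us',
  TWOCAPTCHA_KEY: 'your-2captcha-key',
});
console.log(problems.length === 0 ? 'BYO config looks complete' : problems.join('\n'));
```

In a real script the argument would be process.env; passing a plain object keeps the check easy to exercise in isolation.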

Option C: No proxy (local testing)

CN_NO_PROXY=1

Browser lifecycle

DO NOT close the browser between steps. The browser persists automatically via a background daemon. Just call launchBrowser() at the start of each script: it reconnects to the existing browser with all your tabs, cookies, and login sessions intact.

    // Script 1: agent logs into a site
    const b = await launchBrowser({ country: 'us' });
    await b.page.goto('https://example.com/login');
    await b.fillRef('e2', 'user@example.com');
    await b.clickRef('e5');
    // Script ends, browser stays alive

    // Script 2 (later): agent continues where it left off
    const b = await launchBrowser({ country: 'us' });
    // Same browser, same tab, same cookies, still logged in
    await b.snapshotAI(); // sees the logged-in page

What NOT to do

    // BAD: kills the browser, loses all state
    await browser.close();
    await closeBrowser();

    // BAD: opening a new browser when you already have one
    const b1 = await launchBrowser();
    // ... do some work ...
    const b2 = await launchBrowser(); // this REUSES b1, doesn't create a new browser

When to actually close

Only close the browser when the user explicitly says they're done with ALL browser tasks:
"Close the browser"
"I'm done, clean up"
"Shut everything down"

Otherwise, leave it running. The daemon auto-shuts down after 5 minutes of inactivity anyway.

Quick start

    const { launchBrowser } = require('clawnet/scripts/browser');

    // Launch stealth browser with US residential proxy
    const b = await launchBrowser({
      country: 'us',
      mobile: false, // Desktop Chrome (true = iPhone 15 Pro)
      headless: true,
    });

    // Browse normally; anti-detection is automatic
    await b.page.goto('https://example.com');

    // Read the page
    const { snapshot } = await b.snapshotAI();

    // Interact by ref
    await b.fillRef('e4', 'user@example.com');
    await b.clickRef('e6');

    // Solve CAPTCHA if present (helper bound to this tab, returned by launchBrowser)
    const result = await b.solveCaptcha({ verbose: true });

    // Take a screenshot for the user
    const ss = await b.takeScreenshot();

    // DO NOT close: the browser stays alive for the next step

importCredentials(agentId, agentSecret)

Save user-provided agent credentials to disk. Use when transferring an existing account to a new machine.

    const { importCredentials } = require('clawnet/scripts/browser');
    const result = importCredentials('your-uuid', 'your-secret');
    // { ok: true, agentId: 'your-uuid' }

launchBrowser(opts)

Launch a stealth Chromium browser with residential proxy.

| Option | Type | Default | Description |
|---|---|---|---|
| country | string | 'us' | Proxy country: us, gb, de, nl, jp, fr, ca, au, sg, ro, br, in |
| mobile | boolean | true | true = iPhone 15 Pro, false = Desktop Chrome |
| headless | boolean | true | Run headless |
| useProxy | boolean | true | Enable residential proxy |
| session | string | random | Sticky session ID (same IP across requests) |
| profile | string | 'default' | Persistent profile name (null = ephemeral) |
| reuse | boolean | true | Reuse running browser for this profile (new tab, same process) |
| logLevel | string | 'actions' | 'off', 'actions', or 'verbose'. Env: CN_LOG_LEVEL |
| task | string | null | User's prompt / task description. Recorded in the session log for context. |

Returns: { browser, ctx, page, logger, tabId, newTab, listTabs, closeTab, switchTab, humanClick, humanMouseMove, humanType, humanScroll, humanRead, solveCaptcha, takeScreenshot, screenshotAndReport, snapshot, snapshotAI, dumpInteractiveElements, clickRef, fillRef, typeRef, selectRef, hoverRef, extractText, getCookies, setCookies, clearCookies, batchActions, sleep, rand, getSessionLog }

solveCaptcha(page, opts)

Auto-detect and solve CAPTCHA on the current page. Supports reCAPTCHA v2/v3, hCaptcha, Cloudflare Turnstile.

| Option | Type | Default | Description |
|---|---|---|---|
| apiKey | string | env TWOCAPTCHA_KEY | 2captcha API key |
| timeout | number | 120000 | Max wait time in ms |
| verbose | boolean | false | Log progress |

Returns: { token, type, sitekey }

takeScreenshot(page, opts)

Take a screenshot and return it as a base64-encoded PNG string.

| Option | Type | Default | Description |
|---|---|---|---|
| fullPage | boolean | false | Capture the full scrollable page |

Returns: string (base64 PNG)

screenshotAndReport(page, message, opts)

Take a screenshot and pair it with a message. Returns an object ready to attach to an LLM response.

| Option | Type | Default | Description |
|---|---|---|---|
| fullPage | boolean | false | Capture the full scrollable page |

Returns: { message, screenshot, mimeType }, where screenshot is a base64 PNG

snapshot(page, opts) / snapshot(opts) (from launchBrowser return)

Capture a compact accessibility tree of the page. Returns YAML string. Use this instead of page.textContent(). See "Observation" section above.

| Option | Type | Default | Description |
|---|---|---|---|
| selector | string | 'body' | CSS selector to scope the snapshot |
| interactiveOnly | boolean | false | Keep only interactive elements (buttons, inputs, links) |
| maxLength | number | 20000 | Truncate output to N characters |
| timeout | number | 5000 | Playwright timeout in ms |

Returns: string (YAML accessibility tree)

snapshotAI(opts) — AI-optimized snapshot with refs ⭐ PREFERRED

Returns a structured accessibility tree with embedded [ref=eN] annotations. Use this as the primary way to read pages.

    const { snapshot, refs, truncated } = await browser.snapshotAI();
    // snapshot: "- heading \"Welcome\" [ref=e1]\n- textbox \"Email\" [ref=e2]\n- button \"Sign in\" [ref=e3]"
    // refs: { e1: true, e2: true, e3: true }

| Option | Type | Default | Description |
|---|---|---|---|
| maxChars | number | 20000 | Truncate snapshot to N characters |
| timeout | number | 5000 | Playwright timeout in ms |

Returns: { snapshot: string, refs: Object, truncated?: boolean }

clickRef(ref, opts) — Click element by ref

    await browser.clickRef('e3'); // left click
    await browser.clickRef('e3', { doubleClick: true }); // double click

fillRef(ref, value, opts) — Fill input by ref

await browser.fillRef('e2', 'user@example.com');

typeRef(ref, text, opts) — Type text by ref

    await browser.typeRef('e2', 'hello'); // instant fill
    await browser.typeRef('e2', 'hello', { slowly: true }); // human-like typing
    await browser.typeRef('e2', 'hello', { submit: true }); // type + Enter

selectRef(ref, value, opts) — Select option by ref

await browser.selectRef('e5', 'US');

hoverRef(ref, opts) — Hover element by ref

await browser.hoverRef('e1'); // reveal tooltip/dropdown

newTab(opts) — Open a new tab

Opens a new browser tab and returns a new result object scoped to that tab. All methods on the returned object (page.goto, snapshotAI, clickRef, etc.) operate on the new tab.

| Option | Type | Default | Description |
|---|---|---|---|
| url | string | - | Navigate to this URL immediately |
| label | string | '' | Human-readable label for the tab |

    const tab2 = await browser.newTab({ url: 'https://opentable.com', label: 'restaurant' });
    await tab2.snapshotAI(); // snapshot of opentable.com

listTabs() — List all open tabs

Returns all open tabs with their IDs, URLs, labels, and active status.

    const { tabs } = await browser.listTabs();
    // [{ tabId: "t_abc", url: "https://...", label: "restaurant", active: true, createdAt: "..." }]

closeTab(tabId?) — Close a tab

Closes the specified tab (or the current tab if no tabId given).

    await tab2.closeTab(); // close this tab
    await browser.closeTab('t_abc'); // close by ID

switchTab(tabId) — Switch to a tab

Returns a new result object scoped to the specified tab. Use when you need to return to a tab whose variable you lost (e.g., across script invocations).

    const { tabs } = await browser.listTabs();
    const uberTab = await browser.switchTab(tabs[0].tabId);
    await uberTab.snapshotAI();

extractText(opts) (from launchBrowser return) / extractText(page, opts)

Extract clean readable text from the page, stripping navigation, ads, modals, and noise. Use when you need to READ the page content (menus, prices, articles) rather than interact with UI elements.

| Option | Type | Default | Description |
|---|---|---|---|
| mode | string | 'readability' | 'readability' strips noise, 'raw' returns body.innerText |
| maxChars | number | unlimited | Truncate text to N characters |

Returns: { url, title, text, truncated }

    // Read a restaurant menu
    const { text } = await extractText({ mode: 'readability' });
    // → "Pizza Menu\n\nMargherita\nClassic pizza with mozzarella...\nFrom 399 ₽\n\n..."

    // Raw mode for simple pages
    const { text: raw } = await extractText({ mode: 'raw', maxChars: 5000 });

When to use extractText() vs snapshot():
extractText(): reading text content (menus, prices, articles, descriptions)
snapshot(): understanding page structure and finding interactive elements (buttons, inputs, links)

getCookies(urls?) / setCookies(cookies) / clearCookies()

Manage browser cookies. Use for session persistence, login state checks, and cookie transfer between tasks.

    // Check if logged in
    const cookies = await getCookies('https://example.com');
    const hasAuth = cookies.some(c => c.name === 'session_id');

    // Set cookies (e.g., from a previous session)
    await setCookies([
      { name: 'session_id', value: 'abc123', url: 'https://example.com' },
      { name: 'lang', value: 'en', url: 'https://example.com' },
    ]);

    // Clear all cookies (logout)
    await clearCookies();

batchActions(actions, opts) (from launchBrowser return) / batchActions(page, actions, opts)

Execute multiple actions sequentially in a single call. Reduces LLM round-trips for multi-step flows.

| Option | Type | Default | Description |
|---|---|---|---|
| stopOnError | boolean | false | Halt on first failure |
| delayBetween | number | 50 | Delay between actions in ms, for realism |

Each action: { action, selector, text, value, key, ms, options }

Supported actions: click, fill, type, press, hover, select, scroll, focus, wait, waitForSelector, humanClick, humanType, snapshot

Returns: { results: [{index, success, result?, error?}], total, successful, failed }

    // Fill a booking form in one call
    const result = await batchActions([
      { action: 'fill', selector: '#name', text: 'John' },
      { action: 'fill', selector: '#phone', text: '+1234567890' },
      { action: 'select', selector: '#guests', value: '2' },
      { action: 'humanClick', selector: '#submit' },
    ], { stopOnError: true });
    // result.successful === 4, result.failed === 0
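When a batch partly fails, the results array can be turned into a short failure report. The sketch below assumes that each result's index is the zero-based position of the action in the input array, which the documented return shape suggests but does not state; `describeFailures` is a hypothetical helper.

```javascript
// Hypothetical helper: describe every failed action in a batchActions() result.
// Assumption: result.index is the zero-based position in the actions array.
function describeFailures(actions, batchResult) {
  return batchResult.results
    .filter((r) => !r.success)
    .map((r) => {
      const { action, selector } = actions[r.index];
      return `#${r.index} ${action} ${selector}: ${r.error}`;
    });
}

// Sample input and a result object shaped like the documented return value.
const actions = [
  { action: 'fill', selector: '#name', text: 'John' },
  { action: 'select', selector: '#guests', value: '2' },
  { action: 'humanClick', selector: '#submit' },
];
const batchResult = {
  results: [
    { index: 0, success: true },
    { index: 1, success: false, error: 'option not found' },
    { index: 2, success: true },
  ],
  total: 3,
  successful: 2,
  failed: 1,
};

console.log(describeFailures(actions, batchResult).join('\n'));
```

A report like this pairs naturally with a screenshot when telling the user which step of a multi-step flow did not go through.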

humanType(page, selector, text)

Type text with human-like speed (60-220ms/char) and occasional micro-pauses.
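To illustrate what a 60-220 ms per-character cadence looks like, here is a sketch that draws one delay per character from that range. It is not the skill's implementation: `typingDelays` is hypothetical, the distribution is plain uniform, and the occasional micro-pauses are left out.

```javascript
// Illustrative only: one delay per character, uniformly drawn from 60-220 ms.
function typingDelays(text, random = Math.random) {
  const MIN_MS = 60;
  const MAX_MS = 220;
  return Array.from(text, () => Math.round(MIN_MS + random() * (MAX_MS - MIN_MS)));
}

const delays = typingDelays('hello');
console.log(delays.length); // one delay per typed character
```

The random source is injectable so the timing can be made deterministic in tests.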

humanClick(page, x, y)

Click with natural Bezier curve mouse movement.
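A Bezier-shaped mouse path can be sketched as points sampled along a cubic curve between the current position and the target. The code below is an illustration of that idea, not the skill's implementation; `bezierPath`, the control-point placement, and the jitter size are all assumptions.

```javascript
// Illustrative only: sample a cubic Bezier curve from `from` to `to`.
// Control points are placed at 30% and 70% of the way, with random jitter
// so that repeated moves do not trace the same line.
function bezierPath(from, to, steps = 20, random = Math.random) {
  const jitter = () => (random() - 0.5) * 100;
  const c1 = { x: from.x + (to.x - from.x) * 0.3 + jitter(), y: from.y + (to.y - from.y) * 0.3 + jitter() };
  const c2 = { x: from.x + (to.x - from.x) * 0.7 + jitter(), y: from.y + (to.y - from.y) * 0.7 + jitter() };
  const points = [];
  for (let i = 0; i <= steps; i++) {
    const t = i / steps;
    const u = 1 - t;
    points.push({
      x: u * u * u * from.x + 3 * u * u * t * c1.x + 3 * u * t * t * c2.x + t * t * t * to.x,
      y: u * u * u * from.y + 3 * u * u * t * c1.y + 3 * u * t * t * c2.y + t * t * t * to.y,
    });
  }
  return points;
}

const path = bezierPath({ x: 10, y: 20 }, { x: 300, y: 150 });
console.log(path.length); // steps + 1 sampled points
```

Each sampled point would be fed to a mouse-move call in turn, ending exactly on the click target.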

humanScroll(page, direction, amount)

Smooth multi-step scroll with jitter. Direction: 'down' or 'up'.

humanRead(page, minMs, maxMs)

Pause as if reading the page. Optional light scroll.

shadowFill(page, selector, value)

Fill an input inside Shadow DOM (works where page.fill() fails).

shadowClickButton(page, buttonText)

Click a button by text label, searching through Shadow DOM.

pasteIntoEditor(page, editorSelector, text)

Paste text into Lexical, Draft.js, Quill, ProseMirror, or contenteditable editors.

dumpInteractiveElements(page, opts) / dumpInteractiveElements(opts) (from launchBrowser return)

List all interactive elements using the accessibility tree. Equivalent to snapshot({ interactiveOnly: true }). Returns a compact YAML string with only buttons, inputs, links, and other interactive elements. Falls back to DOM querySelectorAll on Playwright < 1.49.

| Option | Type | Default | Description |
|---|---|---|---|
| selector | string | 'body' | CSS selector to scope the dump |

getSessionLogs()

List all session log files, newest first. Returns [{ sessionId, file, mtime, size }].

getSessionLog(sessionId)

Read a specific session log by ID. Returns an array of log entries.

Action logging

Every browser session records comprehensive structured logs in ~/.clawnet/logs/<session-id>.jsonl. The log captures the full picture: user's task → every agent action → page events → errors.

What's logged

The logging system uses a Proxy on the Playwright page object to capture every method call, including chained locators like page.getByRole('button', { name: 'Submit' }).click().

Automatically captured:
User task: the task parameter from launchBrowser({ task: "..." })
All page actions: goto, click, fill, type, press, check, hover, selectOption, etc.
All locator chains: getByRole → click, getByLabel → fill, locator → nth → click, etc.
Observation calls: snapshot(), takeScreenshot(), dumpInteractiveElements()
Page events: navigations, popups, dialogs, downloads, page errors
human* helpers: humanClick, humanType, humanScroll, etc.
CAPTCHA: solveCaptcha attempts and results
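The Proxy technique can be shown in miniature with a plain object standing in for the Playwright page. This is a simplified sketch, not the skill's logger: `withActionLog` and `fakePage` are hypothetical, calls are synchronous, and only the final call of a chain is recorded.

```javascript
// Simplified sketch: wrap an object in a Proxy that records method chains.
// Methods returning an object (e.g. a locator) are wrapped again so the chain
// continues; any other return value ends the chain and writes a log entry.
function withActionLog(target, log, chain = []) {
  return new Proxy(target, {
    get(obj, prop) {
      const value = obj[prop];
      if (typeof value !== 'function') return value;
      return (...args) => {
        const step = `${String(prop)}(${args.map((a) => JSON.stringify(a)).join(', ')})`;
        const nextChain = [...chain, step];
        const result = value.apply(obj, args);
        const isChainable =
          result !== null && typeof result === 'object' && typeof result.then !== 'function';
        if (isChainable) return withActionLog(result, log, nextChain);
        log.push(nextChain.join(' → '));
        return result;
      };
    },
  });
}

// Stand-in for a Playwright page (hypothetical, synchronous).
const fakePage = {
  goto(url) { return 'ok'; },
  getByRole(role, options) { return { click() { return 'clicked'; } }; },
};

const log = [];
const page = withActionLog(fakePage, log);
page.goto('https://example.com');
page.getByRole('button', { name: 'Submit' }).click();
console.log(log.join('\n'));
```

Because the wrapper sits on the page object itself, calling code needs no changes: the same page.getByRole(...).click() call is both executed and recorded.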

Log levels

| Level | What's logged | Use case |
|---|---|---|
| off | Nothing | Production, no overhead |
| actions (default) | User task, navigation, clicks, fills, typing, locator chains, observation calls, page events, human* helpers, errors | Standard debugging: see what the agent does |
| verbose | All above + textContent results, evaluate expressions, HTTP 4xx/5xx, console errors/warnings, logger.note() | Deep debugging: see what the agent reads and what goes wrong on the page |

Set via launchBrowser({ logLevel: 'verbose', task: 'Book a table at Aurora' }) or env CN_LOG_LEVEL=verbose.

Example log output (actions level)

    {"ts":"...","action":"launch","country":"ru","mobile":true,"profile":"default","logLevel":"actions"}
    {"ts":"...","action":"task","prompt":"Log in to Telegram and send the message Hello"}
    {"ts":"...","action":"goto","method":"goto","args":["https://web.telegram.org"],"chain":"goto(\"https://web.telegram.org\")","url":"about:blank","ok":true,"status":200}
    {"ts":"...","action":"navigated","url":"https://web.telegram.org/a/"}
    {"ts":"...","action":"snapshot","selector":"body","interactiveOnly":false,"length":3842,"url":"https://web.telegram.org/a/"}
    {"ts":"...","action":"locator","chain":"getByRole(\"link\", {\"name\":\"Log in by phone Number\"})","url":"https://web.telegram.org/a/"}
    {"ts":"...","action":"click","method":"click","args":[],"chain":"getByRole(\"link\", {\"name\":\"Log in by phone Number\"}) → click()","url":"https://web.telegram.org/a/","ok":true}
    {"ts":"...","action":"navigated","url":"https://web.telegram.org/a/#/login"}
    {"ts":"...","action":"fill","method":"fill","args":["77054595958"],"chain":"getByLabel(\"Phone number\") → fill(\"77054595958\")","url":"https://web.telegram.org/a/#/login","ok":true}
    {"ts":"...","action":"screenshot","url":"https://web.telegram.org/a/#/login"}
    {"ts":"...","action":"humanClick","args":["page",100,200],"url":"https://web.telegram.org/a/#/login","ok":true}
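Since the log is one JSON object per line, it can be inspected with a few lines of code. The sketch below parses JSONL text and picks out entries whose ok field is false; treating ok: false as the failure marker is an assumption based on the ok: true fields in the example, and both function names are hypothetical.

```javascript
// Parse JSONL text (one JSON object per line), skipping blank lines.
function parseSessionLog(jsonl) {
  return jsonl
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Assumption: a failed action is recorded with ok === false.
function failedActions(entries) {
  return entries.filter((entry) => entry.ok === false);
}

const sampleLog = [
  '{"ts":"...","action":"goto","args":["https://example.com"],"ok":true,"status":200}',
  '{"ts":"...","action":"navigated","url":"https://example.com/"}',
  '{"ts":"...","action":"click","chain":"getByRole(\\"button\\") → click()","ok":false}',
  '',
].join('\n');

const entries = parseSessionLog(sampleLog);
console.log(failedActions(entries).map((e) => e.action));
```

Entries without an ok field, such as navigated events, are left out of the failure list rather than counted as errors.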

Recording user task

Always pass the user's request via task so the log has full context:

    const { page, logger } = await launchBrowser({
      task: 'Book a table at Aurora on March 8, 19:00, 2 guests',
      logLevel: 'verbose',
      country: 'ru',
    });

Agent reasoning with logger.note()

At verbose level, the agent can record its reasoning:

    logger.note('Navigating to booking page to check available slots');
    await page.goto('https://restaurant.com/booking');
    logger.note('Form is empty; need to fill date, time, guests before checking');

Reading logs

    const { launchBrowser, getSessionLogs, getSessionLog } = require('clawnet/scripts/browser');

    // List recent sessions
    const sessions = getSessionLogs();
    // [{ sessionId: 'abc-123', mtime: '2026-03-01T...', size: 4096 }, ...]

    // Read a specific session
    const log = getSessionLog(sessions[0].sessionId);
    // [{ ts: '...', action: 'task', prompt: 'Log in to Telegram...' },
    //  { ts: '...', action: 'goto', method: 'goto', args: ['https://web.telegram.org'], ... },
    //  { ts: '...', action: 'click', chain: 'getByRole("link") → click()', ... }, ...]

    // Or from the current session
    const { getSessionLog: currentLog } = await launchBrowser();
    // ... do work ...
    const entries = currentLog();
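Once the session list and entries are loaded, a common next step is to pick the newest session and look for failed actions. The sketch below works on plain arrays shaped like the values in the comments above; newestSession and failedActions are hypothetical helpers, the sample data is invented for illustration, and it assumes mtime is an ISO 8601 string and that failures are marked with ok: false.

```javascript
// Pick the session with the latest mtime without relying on list order.
function newestSession(sessions) {
  const sorted = [...sessions].sort(
    (a, b) => Date.parse(b.mtime) - Date.parse(a.mtime)
  );
  return sorted[0] ?? null;
}

// Keep only entries explicitly marked as failed (assumed convention: ok === false).
function failedActions(entries) {
  return entries.filter((entry) => entry.ok === false);
}

// Invented sample data in the shape shown above.
const sampleSessions = [
  { sessionId: 'abc-123', mtime: '2026-03-01T10:00:00Z', size: 4096 },
  { sessionId: 'def-456', mtime: '2026-03-02T09:30:00Z', size: 2048 },
];
const sampleEntries = [
  { ts: '...', action: 'goto', chain: 'goto("https://example.com")', ok: true },
  { ts: '...', action: 'click', chain: 'getByRole("button") → click()', ok: false },
];

console.log(newestSession(sampleSessions).sessionId);
console.log(failedActions(sampleEntries).map((e) => e.chain));
```

In real use, the arrays would come from getSessionLogs() and getSessionLog(sessionId) instead of the sample data.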

getCredentials()

Fetch managed proxy and CAPTCHA credentials from the Clawnet API. Called automatically by launchBrowser() on a fresh launch (not on reuse). The first call starts the 2-hour trial clock. Requires CN_API_URL and agent credentials (from install, CN_AGENT_TOKEN, or CN_AGENT_ID + CN_AGENT_SECRET).
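Because the first call starts the trial clock, it can help to check the environment before launching. The following is a minimal pre-flight sketch that looks only at the environment variables named above; describeAgentAuth is a hypothetical helper, and it cannot see credentials saved at install time, so a negative result is not conclusive.

```javascript
// Report which environment-based credential mode is available, if any.
function describeAgentAuth(env = process.env) {
  if (!env.CN_API_URL) {
    return { ok: false, reason: 'CN_API_URL is not set' };
  }
  if (env.CN_AGENT_TOKEN) {
    return { ok: true, mode: 'token' };
  }
  if (env.CN_AGENT_ID && env.CN_AGENT_SECRET) {
    return { ok: true, mode: 'id+secret' };
  }
  // Credentials stored by the installer are not visible here (assumption).
  return { ok: false, reason: 'no agent credentials found in the environment' };
}

console.log(describeAgentAuth({ CN_API_URL: 'https://api.example.test', CN_AGENT_TOKEN: 't' }));
```

The URL in the usage line is a placeholder, not a real Clawnet endpoint.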

makeProxy(sessionId, country)

Build proxy config from environment variables. Supports Decodo, Bright Data, IPRoyal, NodeMaven.

Supported proxy providers

| Provider | Env prefix | Sticky sessions | Countries |
| --- | --- | --- | --- |
| Decodo (default) | CN_PROXY_* | Port-based (10001-49999) | 10+ |
| Bright Data | CN_PROXY_* | Session string | 195+ |
| IPRoyal | CN_PROXY_* | Password suffix | 190+ |
| NodeMaven | CN_PROXY_* | Session string | 150+ |
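For the port-based row, one plausible way to keep a session sticky is to map the session ID deterministically onto the 10001-49999 port range. The sketch below illustrates that idea only: stickyPort and its hash are assumptions, and makeProxy may pick ports differently.

```javascript
const PORT_MIN = 10001;
const PORT_MAX = 49999;

// Map a session ID to a stable port in [PORT_MIN, PORT_MAX].
// The rolling hash is illustrative, not the package's algorithm.
function stickyPort(sessionId) {
  let hash = 0;
  for (const ch of String(sessionId)) {
    // Keep the hash in unsigned 32-bit range on every step.
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
  }
  return PORT_MIN + (hash % (PORT_MAX - PORT_MIN + 1));
}

console.log(stickyPort('abc-123') === stickyPort('abc-123'));
```

The same session ID always yields the same port, so repeated requests in a session would leave through the same proxy endpoint, while different IDs spread across the range.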

Login to a website

    const { launchBrowser } = require('clawnet/scripts/browser');

    const { page, snapshot } = await launchBrowser({ country: 'us', mobile: false });
    await page.goto('https://github.com/login');

    // Observe the page first to see what's available
    const tree = await snapshot({ interactiveOnly: true });
    // tree shows: textbox "Username or email address", textbox "Password", button "Sign in"

    // Use semantic locators that match the snapshot
    await page.getByLabel('Username or email address').fill('myuser');
    await page.getByLabel('Password').fill('mypass');
    await page.getByRole('button', { name: 'Sign in' }).click();

Scrape with CAPTCHA bypass

    const { launchBrowser, solveCaptcha } = require('clawnet/scripts/browser');

    const { page, snapshot } = await launchBrowser({ country: 'de' });
    await page.goto('https://protected-site.com');

    // Auto-detect and solve any CAPTCHA
    try {
      await solveCaptcha(page, { verbose: true });
    } catch (e) {
      console.log('No CAPTCHA found or solving failed:', e.message);
    }

    // Read the content area compactly
    const content = await snapshot({ selector: '.content' });

Fill Shadow DOM forms

    const { launchBrowser, shadowFill, shadowClickButton } = require('clawnet/scripts/browser');

    const { page } = await launchBrowser();
    await page.goto('https://app-with-shadow-dom.com');

    await shadowFill(page, 'input[name="email"]', 'user@example.com');
    await shadowClickButton(page, 'Submit');

Category context

Code helpers, APIs, CLIs, browser automation, testing, and developer operations.

Source: Tencent SkillHub

Largest current source with strong distribution and engagement signals.

Package contents

Included in package

3 Scripts · 2 Docs · 1 Config

SKILL.md Primary doc
README.md Docs
scripts/browser-daemon.js Scripts
scripts/browser.js Scripts
scripts/postinstall.js Scripts
package.json Config