Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Automate browser actions locally via browser-use CLI/Python: open pages, click/type, screenshot, extract HTML/links, debug sessions, and capture login QR codes.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
Prefer browser-use (CLI/Python) over the OpenClaw browser tool here; the OpenClaw browser tool may fail if no supported system browser is present. Use persistent sessions for multi-step flows: --session <name>.
Open:
browser-use --session demo open https://example.com

Inspect (sometimes state returns 0 elements on heavy/JS sites):
browser-use --session demo --json state | jq '.data | {url,title,elements:(.elements|length)}'

Screenshot (always works; best debugging primitive):
browser-use --session demo screenshot /home/node/.openclaw/workspace/page.png

HTML for link discovery (works even when state is empty):
browser-use --session demo --json get html > /tmp/page_html.json
python3 - <<'PY'
import json,re
html=json.load(open('/tmp/page_html.json')).get('data',{}).get('html','')
urls=set(re.findall(r"https?://[^\s\"'<>]+", html))
for u in sorted([u for u in urls if any(k in u for k in ['demo','login','console','qr','qrcode'])])[:200]:
    print(u)
PY

Lightweight DOM queries via JS (useful when state is empty):
browser-use --session demo --json eval "location.href"
browser-use --session demo --json eval "document.title"
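For scripted flows, the jq filter above can be mirrored in Python to decide when to fall back to get html or eval. This is a minimal sketch: the function names are hypothetical, and the sample payload is invented to match only the fields the jq filter reads (.data.url, .data.title, .data.elements).

```python
import json


def summarize_state(payload: dict) -> dict:
    """Mirror the jq filter: pull url, title, and element count from .data."""
    data = payload.get("data") or {}
    elements = data.get("elements") or []
    return {
        "url": data.get("url"),
        "title": data.get("title"),
        "elements": len(elements),
    }


def needs_html_fallback(payload: dict) -> bool:
    """True when state reported no elements, so get html / eval should be tried."""
    return summarize_state(payload)["elements"] == 0


# Hypothetical payload, shaped like the fields the jq filter reads.
sample = json.loads(
    '{"data": {"url": "https://example.com", "title": "Example Domain", "elements": []}}'
)
print(summarize_state(sample))
print(needs_html_fallback(sample))
```

In practice the payload would come from the output of browser-use --session demo --json state rather than from a literal string.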
Use Python for Agent runs when the CLI run path requires Browser-Use cloud keys or when you need strict control over LLM parameters.
Create .env (or export env vars) with:
OPENAI_API_KEY=...
OPENAI_BASE_URL=https://api.moonshot.cn/v1

Then run the bundled script:
source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
python /home/node/.openclaw/workspace/skills/browser-use-local/scripts/run_agent_kimi.py

Kimi/Moonshot quirks observed in practice (fixes):
- temperature must be 1 for kimi-k2.5.
- frequency_penalty must be 0 for kimi-k2.5.
- Moonshot can reject strict JSON Schema used for structured output. Enable:
  remove_defaults_from_schema=True
  remove_min_items_from_schema=True

If you get a 400 error mentioning response_format.json_schema ... keyword 'default' is not allowed or min_items unsupported, those two flags are the first thing to set.
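To make the effect of those two flags concrete, here is a rough sketch of the kind of schema cleanup they imply: dropping the default and minItems keywords before the schema is sent. It is not browser-use's implementation. The function name strip_keywords and the sample schema are hypothetical, and mapping the "min_items" in the error message to the JSON Schema keyword minItems is an assumption.

```python
from typing import Any

# Assumption: these are the keywords the two flags remove.
BANNED = frozenset({"default", "minItems"})


def strip_keywords(schema: Any, banned: frozenset = BANNED) -> Any:
    """Return a copy of a JSON Schema with the banned keywords removed at every level."""
    if isinstance(schema, list):
        return [strip_keywords(item, banned) for item in schema]
    if not isinstance(schema, dict):
        return schema
    cleaned = {}
    for key, value in schema.items():
        if key in banned:
            continue
        if key == "properties" and isinstance(value, dict):
            # Property names are data, not keywords: keep every name, clean each subschema.
            cleaned[key] = {name: strip_keywords(sub, banned) for name, sub in value.items()}
        else:
            cleaned[key] = strip_keywords(value, banned)
    return cleaned


# Hypothetical structured-output schema for illustration.
schema = {
    "type": "object",
    "properties": {
        "links": {"type": "array", "minItems": 1, "items": {"type": "string"}},
        "note": {"type": "string", "default": ""},
    },
    "required": ["links"],
}
print(strip_keywords(schema))
```

The original schema is left untouched, so the same model definition can still be used for local validation after the response comes back.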
Screenshot the page and crop candidate regions (fast, robust). If HTML contains data:image/png;base64,..., extract and decode it.
Use scripts/crop_candidates.py to generate multiple likely QR crops from a screenshot.
source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
python skills/browser-use-local/scripts/crop_candidates.py \
  --in /home/node/.openclaw/workspace/login.png \
  --outdir /home/node/.openclaw/workspace/qr_crops
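The idea of candidate crops can be sketched as plain box arithmetic. This is not the bundled crop_candidates.py, whose heuristics are not shown here. The function name, the choice of scales, and the assumption that login QR codes tend to sit near the page center or in one half of a split layout are all illustrative. The boxes use the (left, upper, right, lower) order that Pillow's Image.crop accepts.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower)


def candidate_boxes(width: int, height: int) -> List[Box]:
    """Return crop boxes that might contain a QR code (assumed heuristics)."""
    side = min(width, height)
    cx, cy = width // 2, height // 2
    boxes: List[Box] = []
    # Centered squares at a few scales (assumption: the QR is often mid-page).
    for scale in (1.0, 0.75, 0.5):
        half = int(side * scale) // 2
        boxes.append((cx - half, cy - half, cx + half, cy + half))
    # Left and right halves, for split-layout login pages.
    boxes.append((0, 0, cx, height))
    boxes.append((cx, 0, width, height))
    return boxes


for box in candidate_boxes(1280, 720):
    print(box)
```

Each box could then be passed to an image library's crop call and saved as a separate file, so the crops can be checked one by one.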
source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
browser-use --session demo --json get html > /tmp/page_html.json
python skills/browser-use-local/scripts/extract_data_images.py \
  --in /tmp/page_html.json \
  --outdir /home/node/.openclaw/workspace/data_imgs
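For reference, extracting inline images from HTML comes down to finding base64 data URIs and decoding them. The sketch below is an illustration using only the standard library, not the contents of extract_data_images.py. The function name, the regex, and the list of image subtypes are assumptions.

```python
import base64
import re

# Assumption: only these image subtypes are of interest.
DATA_URI = re.compile(r"data:image/(png|jpeg|gif|webp);base64,([A-Za-z0-9+/=]+)")


def extract_data_images(html: str) -> list:
    """Return (extension, raw_bytes) for each decodable base64 image data URI."""
    images = []
    for ext, payload in DATA_URI.findall(html):
        try:
            images.append((ext, base64.b64decode(payload, validate=True)))
        except ValueError:
            # binascii.Error is a ValueError subclass: skip truncated or malformed payloads.
            continue
    return images


# Tiny demo: embed the 8-byte PNG signature as a data URI and recover it.
png_magic = b"\x89PNG\r\n\x1a\n"
demo_html = '<img src="data:image/png;base64,%s">' % base64.b64encode(png_magic).decode()
for ext, raw in extract_data_images(demo_html):
    print(ext, len(raw))
```

With a real page, the html string would be read from /tmp/page_html.json via the data.html field, as in the link-discovery snippet, and each decoded image written to its own file for inspection.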
- state shows elements: 0: use get html + regex discovery, plus screenshots; use eval to query DOM.
- Page readiness timeout warnings: usually harmless; rely on screenshot + HTML.
- CLI flags order: global flags go before the subcommand:
  ✅ browser-use --browser chromium --json open https://...
  ❌ browser-use open https://... --browser chromium
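When driving the CLI from Python, one way to avoid the flag-order mistake is to assemble the argument list in a single helper that always puts global flags first. This is a small sketch: build_command is a hypothetical helper, and it uses only the flags that appear above (--browser, --session, --json).

```python
import shlex


def build_command(subcommand, *, session=None, browser=None, json_output=False):
    """Build a browser-use argv with global flags placed before the subcommand."""
    argv = ["browser-use"]
    if browser:
        argv += ["--browser", browser]
    if session:
        argv += ["--session", session]
    if json_output:
        argv.append("--json")
    # Subcommand and its own arguments always come last.
    return argv + list(subcommand)


cmd = build_command(["open", "https://example.com"], browser="chromium", json_output=True)
print(shlex.join(cmd))
# prints: browser-use --browser chromium --json open https://example.com
```

The resulting list can be passed directly to subprocess.run, which avoids shell quoting issues for URLs and eval expressions.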