Better Browser Use
A hybrid SOTA browser-use skill with tiered stealth escalation and Turnstile CAPTCHA trigger avoidance. Includes early WebMCP functionality.
Install / Use
/learn @yoloshii/Better Browser Use
Category: Development & Engineering
better-browser-use
Agentic browser automation with ARIA snapshots, three stealth tiers, and human-like behavior simulation.
An AI agent controls the browser by observing ARIA accessibility trees (not screenshots or HTML), reasoning about page state, and executing actions through element refs. Sessions persist cookies, storage, and fingerprints across runs. Anti-bot protection is handled through progressive stealth escalation.
How It Works
Agent                       Server                        Browser
│                             │                              │
├─ launch(tier=1, url) ──────►│─── open browser ────────────►│
│◄──── {session_id} ──────────│                              │
│                             │                              │
├─ snapshot ─────────────────►│─── ARIA tree ───────────────►│
│◄──── @e1 link "Login"       │◄── {tree, refs} ─────────────│
│      @e2 input "Email"      │                              │
│      @e3 button "Submit"    │                              │
│                             │                              │
├─ click @e1 ────────────────►│─── humanized click ─────────►│
│◄──── {page_changed: true}   │◄── result ───────────────────│
│                             │                              │
├─ snapshot ─────────────────►│ (new refs after nav)         │
│◄──── @e4 input "Password"   │                              │
│      ...                    │                              │
The agent loop: snapshot (observe) → reason (decide) → act (execute) → repeat.
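The loop can be sketched in a few lines of Python. This is a minimal illustration, not the project's own client: `send` (posts one JSON request and returns the decoded reply) and `decide` (the agent's reasoning step) are hypothetical callables supplied by the caller, and `max_steps` is an assumed safety cap.

```python
from typing import Callable, Optional

# Hypothetical: posts one JSON request to the server, returns the decoded reply.
Send = Callable[[dict], dict]
# Hypothetical: maps a snapshot reply to the next action, or None when finished.
Decide = Callable[[dict], Optional[dict]]


def run_agent_loop(send: Send, decide: Decide, session_id: str,
                   max_steps: int = 20) -> int:
    """Repeat snapshot -> decide -> act; return the number of actions executed."""
    steps = 0
    while steps < max_steps:
        # Observe: fetch the current ARIA tree and refs.
        snapshot = send({"op": "snapshot", "session_id": session_id,
                         "compact": True})
        # Reason: let the caller-supplied policy pick the next action.
        step = decide(snapshot)
        if step is None:
            break
        # Act: execute the chosen action against the same session.
        send({"op": "action", "session_id": session_id,
              "action": step["action"], "params": step.get("params", {})})
        steps += 1
    return steps
```

Taking a fresh snapshot before every decision keeps refs valid, since they change after navigation.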
Quick Start
Install
git clone https://github.com/yoloshii/better-browser-use.git
cd better-browser-use
pip install cloakbrowser 'pyee>=13,<14'
pip install 'playwright>=1.51,<1.56' && playwright install chromium
pip install aiohttp 'pydantic>=2.0' markdownify python-dotenv
Configure (optional)
cp .env.example .env
# Edit .env with your auth token, proxy, CAPTCHA solver keys, etc.
Start Server
python scripts/server.py --port 8500
Use
# Launch a browser session
curl -s -X POST http://127.0.0.1:8500/ \
-H 'Content-Type: application/json' \
-d '{"op":"launch","tier":1,"url":"https://example.com"}'
# Get ARIA snapshot with element refs
curl -s -X POST http://127.0.0.1:8500/ \
-H 'Content-Type: application/json' \
-d '{"op":"snapshot","session_id":"<id>","compact":true}'
# Click an element
curl -s -X POST http://127.0.0.1:8500/ \
-H 'Content-Type: application/json' \
-d '{"op":"action","session_id":"<id>","action":"click","params":{"ref":"@e1"}}'
# Close session
curl -s -X POST http://127.0.0.1:8500/ \
-H 'Content-Type: application/json' \
-d '{"op":"close","session_id":"<id>"}'
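The same requests can be sent from Python using only the standard library. This is a sketch that mirrors the curl calls above; the helper names `build_request` and `call` are made up for illustration, and it assumes the server always replies with a JSON body.

```python
import json
import urllib.request

SERVER_URL = "http://127.0.0.1:8500/"  # matches the --port 8500 example


def build_request(payload: dict, url: str = SERVER_URL) -> urllib.request.Request:
    """Build the same JSON POST that the curl examples send."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(payload: dict, url: str = SERVER_URL, timeout: float = 60.0) -> dict:
    """Send one request and decode the reply (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(payload, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `call({"op": "launch", "tier": 1, "url": "https://example.com"})` would correspond to the first curl command.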
Stealth Tiers
Three browser engines with progressive anti-detection:
| Tier | Engine | Tracker Blocking | Humanization | Use Case |
|------|--------|:---:|:---:|------|
| 1 | Playwright (Chromium) | - | Opt-in | General browsing, friendly sites |
| 2 | CloakBrowser (C++ patched Chromium) / Patchright fallback | Yes | Auto | Moderate anti-bot (26 C++ source patches, no navigator.webdriver leak) |
| 3 | Camoufox (Firefox C++ fork) | Yes | Auto | Turnstile, DataDome, PerimeterX — with GeoIP + residential proxy |
Dependencies auto-install on first use per tier.
# Tier 1 (default)
{"op": "launch", "tier": 1, "url": "https://example.com"}
# Tier 2 — stealth Chromium
{"op": "launch", "tier": 2, "url": "https://protected-site.com"}
# Tier 3 — anti-detect Firefox with fingerprint
{"op": "launch", "tier": 3, "url": "https://heavily-protected.com", "profile": "my-identity"}
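One way a caller might drive escalation across tiers is sketched below. How the skill itself decides to escalate is not described here, so treat this as an assumption: `send` is a hypothetical transport function, `is_blocked` is a hypothetical predicate that inspects a snapshot reply for signs of a challenge page, and the launch reply is assumed to carry `session_id` as in the diagram above.

```python
from typing import Callable, Tuple


def launch_with_escalation(send: Callable[[dict], dict], url: str,
                           is_blocked: Callable[[dict], bool],
                           max_tier: int = 3) -> Tuple[int, str]:
    """Try tier 1 first and move up one tier each time the page looks blocked."""
    for tier in range(1, max_tier + 1):
        reply = send({"op": "launch", "tier": tier, "url": url})
        session_id = reply["session_id"]
        snapshot = send({"op": "snapshot", "session_id": session_id,
                         "compact": True})
        if not is_blocked(snapshot):
            return tier, session_id
        # Blocked: release this session before trying the next tier.
        send({"op": "close", "session_id": session_id})
    raise RuntimeError(f"still blocked at tier {max_tier}: {url}")
```

Starting at the lowest tier keeps the cheaper engine in use for sites that do not need stealth.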
Actions
Core
| Action | Params | Description |
|--------|--------|-------------|
| navigate | {url} | Go to URL |
| click | {ref} | Click element by ref |
| dblclick | {ref} | Double-click element by ref |
| rightclick | {ref} | Right-click (context menu) element by ref |
| hover | {ref} | Hover over element (reveals dropdowns, tooltips) |
| drag | {source_ref, target_ref} | Drag one element to another |
| check | {ref} | Check a checkbox (no-op if already checked) |
| uncheck | {ref} | Uncheck a checkbox (no-op if already unchecked) |
| fill | {ref, value} | Clear + fill (forms) |
| type | {ref, text, delay_ms?} | Character-by-character typing (search, compose) |
| scroll | {direction, amount} | up/down, pixels or "page" |
| press | {key, ref?} | Keyboard: "Enter", "Tab", "Escape" |
| select | {ref, value} | Dropdown selection |
| wait | {ms} | Explicit wait (max 30s) |
| evaluate | {js, deep_query?, frame_url?} | Execute JavaScript. Set deep_query: true to inject deepQuery(sel) / deepQueryAll(sel) helpers that pierce shadow DOM boundaries. Requires BROWSER_USE_EVALUATE=1. |
| screenshot | {full_page?} | Base64 PNG |
| snapshot | {compact?, max_depth?} | ARIA tree + refs |
| done | {success?, result?} | Mark task as complete with optional result text |
| solve_captcha | {} | Auto-detect and solve CAPTCHA (CapSolver → 2Captcha fallback) |
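
A small helper can assemble action payloads in the shape used by the curl examples. This is an illustrative sketch: `action_request` is a made-up name, and the client-side check on `wait` is an assumption based on the 30 s limit in the table; the server may enforce that limit differently.

```python
MAX_WAIT_MS = 30_000  # the table above caps `wait` at 30 s


def action_request(session_id: str, action: str, **params) -> dict:
    """Build an {"op": "action"} payload; reject waits above the documented cap."""
    if action == "wait" and params.get("ms", 0) > MAX_WAIT_MS:
        raise ValueError(f"wait is limited to {MAX_WAIT_MS} ms")
    return {"op": "action", "session_id": session_id,
            "action": action, "params": params}


# Example: fill a field, then submit with the keyboard.
steps = [
    action_request("<id>", "fill", ref="@e2", value="user@example.com"),
    action_request("<id>", "press", key="Enter"),
]
```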
Tabs & Navigation
| Action | Params | Description |
|--------|--------|-------------|
| go_back | {} | Browser back |
| go_forward | {} | Browser forward |
| tab_new | {url?} | Open new tab |
| tab_switch | {index} | Switch tab (0-based) |
| tab_close | {index} | Close tab |
| cookies_get | {domain?} | Get cookies |
| cookies_set | {cookies} | Set cookies |
| cookies_export | {path, domain?} | Export cookies to JSON file |
| cookies_import | {path} | Import cookies from JSON file |
Search & Discovery
| Action | Params | Description |
|--------|--------|-------------|
| search_page | {query, max_results?} | Text search across visible page content. Case-insensitive. |
| find_elements | {text?, role?} | Find refs matching criteria in current snapshot. |
| extract | {max_chars?, include_links?} | Full page content as Markdown. |
| get_value | {ref} | Get current value of input/textarea/select. |
| get_attributes | {ref} | Get all HTML attributes of an element. |
| get_bbox | {ref} | Get bounding box {x, y, width, height} in viewport coordinates. |
WebMCP (Chrome 147+)
| Action | Params | Description |
|--------|--------|-------------|
| webmcp_discover | {} | Probe page for structured tools (imperative + declarative forms). |
| webmcp_call | {tool, args} | Call a discovered WebMCP tool with structured arguments. |
File & Coordinate
| Action | Params | Description |
|--------|--------|-------------|
| upload_file | {ref, path} | Upload file to input[type=file] near ref. |
| get_downloads | {} | List files downloaded in this session. |
| click_coordinate | {x, y} | Click at viewport coordinates (last resort for non-ARIA elements). |
Stealth
| Action | Params | Description |
|--------|--------|-------------|
| rotate_fingerprint | {geo?} | Inject JS to rotate navigator fingerprint (Tier 1-2 only; Tier 3 Camoufox handles natively). |
Viewport & Capture
| Action | Params | Description |
|--------|--------|-------------|
| resize | {width, height} | Resize viewport (320-7680 x 200-4320). |
| pdf | {format?, print_background?} | Save page as PDF (base64). Headless Chromium only. |
Console & Storage
| Action | Params | Description |
|--------|--------|-------------|
| console | {level?, clear?} | Get captured JS console messages. Filter by level (error/warning/log). |
| storage_get | {type?, key?} | Read localStorage (default) or sessionStorage. Omit key for all entries. |
| storage_set | {type?, key, value} | Write to localStorage or sessionStorage. |
Batch Actions
Execute multiple actions in a single request with op: "actions":
{"op": "actions", "session_id": "<id>", "actions": [
{"action": "navigate", "params": {"url": "https://example.com"}},
{"action": "snapshot", "params": {}}
], "stop_on_error": true}
Returns {"success": true, "results": [...], "stopped_at": null}. Max 20 actions per batch. Ref maps propagate between steps.
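Because a batch holds at most 20 actions, a longer plan has to be split across several requests. The sketch below shows one way to do that; `build_batches` is a hypothetical helper. The text above states that ref maps propagate between steps of one batch, so whether refs survive across separate batch requests is an open question here, and taking a fresh snapshot at the start of each chunk is the cautious choice.

```python
MAX_BATCH = 20  # documented per-request limit


def build_batches(session_id: str, actions: list, stop_on_error: bool = True) -> list:
    """Split a list of actions into {"op": "actions"} payloads of at most 20 each."""
    return [
        {
            "op": "actions",
            "session_id": session_id,
            "actions": actions[i:i + MAX_BATCH],
            "stop_on_error": stop_on_error,
        }
        for i in range(0, len(actions), MAX_BATCH)
    ]
```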
ARIA Snapshots & Refs
Pages are observed through ARIA accessibility trees, not raw HTML. Each interactive element gets a ref (@e1, @e2, ...):
Page: https://github.com/login | Title: Sign in to GitHub
Tab 1 of 1
- main
  - heading "Sign in to GitHub" @e1 [level=1]
  - form
    - text "Username or email address"
    - textbox @e2
    - text "Password"
    - textbox @e3
    - button "Sign in" @e4
  - link "Forgot password?" @e5
Use refs in actions: {"action": "fill", "params": {"ref": "@e2", "value": "user@example.com"}}.
Refs reset on every new snapshot. If an action returns "ref not found", take a new snapshot.
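If only the text form of the tree is at hand, refs can be recovered with a regular expression. This is a sketch that assumes each line looks like the example above (a role, an optional quoted name, then the ref); the server reply may already include a structured ref map, in which case that should be preferred. Names containing escaped quotes are not handled.

```python
import re

# role, optional "name", then @eN; a leading * or ~ diff marker is tolerated.
REF_LINE = re.compile(
    r'^\s*[*~]?-\s+(?P<role>[\w-]+)(?:\s+"(?P<name>[^"]*)")?\s+(?P<ref>@e\d+)'
)


def parse_refs(tree_text: str) -> dict:
    """Map each ref in a text snapshot to its role and (possibly missing) name."""
    refs = {}
    for line in tree_text.splitlines():
        match = REF_LINE.match(line)
        if match:
            refs[match.group("ref")] = {
                "role": match.group("role"),
                "name": match.group("name"),
            }
    return refs
```

A lookup such as `parse_refs(tree)["@e4"]` then tells the agent which element a ref pointed at in that snapshot, which is useful for logging before the refs reset.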
Snapshot diff: Changes since the previous snapshot are marked in the tree:
- button "Submit" @e1
*- button "Confirm" @e2 <-- NEW since last snapshot
~- button "Updated" @e3 <-- CHANGED (same element, different name)
- textbox "Email" @e4
[removed since last snapshot]
- link "Old Link" <-- REMOVED
Response includes new_element_count, changed_element_count, and removed_element_count.
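The markers can also be read straight from the tree text. The sketch below assumes the format shown above: `*-` for new, `~-` for changed, and removed entries listed after a `[removed since last snapshot]` line. It treats any trailing `<--` text as an annotation and drops it. `summarize_diff` is a hypothetical helper; the counts in the response remain the authoritative source.

```python
import re

# A diff-marked line: * (new) or ~ (changed), then the first ref on the line.
MARKED = re.compile(r'^\s*(?P<mark>[*~])-\s.*?(?P<ref>@e\d+)')
REMOVED_HEADER = "[removed since last snapshot]"


def summarize_diff(tree_text: str) -> dict:
    """Collect new and changed refs, plus descriptions of removed elements."""
    summary = {"new": [], "changed": [], "removed": []}
    in_removed = False
    for line in tree_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(REMOVED_HEADER):
            in_removed = True
            continue
        if in_removed:
            # Removed entries carry no ref, so keep their text description.
            if stripped.startswith("- "):
                summary["removed"].append(stripped[2:].split("<--")[0].strip())
            continue
        match = MARKED.match(line)
        if match:
            key = "new" if match.group("mark") == "*" else "changed"
            summary[key].append(match.group("ref"))
    return summary
```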
SPA Re-detection
SPAs (e.g., x.com → x.com/home) often redirect via JavaScript after navigate returns. The server detects this automatically:
- On navigate: If the final URL differs from the requested URL, the response includes spa_redirect: true and the redirect is noted in extracted_content.
- On snapshot: If the current URL differs from the last navigated URL, the response includes spa_navigation: true, spa_from, and spa_to. This catches delayed SPA routing that happens after the initial navigation completes.
Use these signals to confirm you're on the expected page after navigation.
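A client might fold these fields into a single check after each navigate or snapshot. The sketch below uses only the field names mentioned above; `spa_signal` and the shape of its return value are made up for illustration, and the rest of the reply format is not assumed.

```python
from typing import Optional


def spa_signal(reply: dict) -> Optional[dict]:
    """Summarize the SPA fields of a navigate or snapshot reply, or return None."""
    if reply.get("spa_navigation"):
        # Reported on snapshot: the URL moved after the last navigate.
        return {"kind": "navigation",
                "from": reply.get("spa_from"),
                "to": reply.get("spa_to")}
    if reply.get("spa_redirect"):
        # Reported on navigate: the final URL differs from the requested one.
        return {"kind": "redirect", "note": reply.get("extracted_content")}
    return None
```

When the result is not None, comparing the `to` value against the page the agent expected is a cheap way to decide whether to continue or re-plan.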
Loop Detection
The server detects repetitive action patterns and returns escalating warnings:
