Rayobrowse
Stealth Chromium browser for large-scale web scraping.
Overview
rayobrowse is a Chromium-based stealth browser for web scraping, AI agents, and automation workflows. It runs on headless Linux servers (no GPU required) and works with any tool that speaks CDP: Playwright, Puppeteer, Selenium, OpenClaw, Scrapy, and custom automation scripts.
Standard headless Chromium gets blocked immediately by modern bot detection. rayobrowse fixes this with realistic fingerprints (user agent, screen resolution, WebGL, fonts, timezone, and dozens of other signals) that make each session look like a real device.
It runs inside Docker (x86_64 and ARM64) and is actively used in production on Rayobyte's scraping API to scrape millions of pages per day across some of the most difficult, high-value websites.
Quick Start
1. Set up environment
cp .env.example .env
Open .env and set STEALTH_BROWSER_ACCEPT_TERMS=true to confirm you agree to the LICENSE. The daemon will not create browsers until this is set.
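A quick pre-flight check can confirm that the flag is set before you start the container. This is a minimal sketch: the helper names `parse_env`, `terms_accepted`, and `check_env_file` are illustrative and not part of rayobrowse, and the parser handles only plain `KEY=value` lines.

```python
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def terms_accepted(env_text: str) -> bool:
    """Return True when STEALTH_BROWSER_ACCEPT_TERMS is exactly 'true'."""
    return parse_env(env_text).get("STEALTH_BROWSER_ACCEPT_TERMS") == "true"


def check_env_file(path: str = ".env") -> bool:
    """Report whether the given .env file exists and has the flag set."""
    env_path = Path(path)
    return env_path.exists() and terms_accepted(env_path.read_text())
```

Calling `check_env_file()` from the repository root returns `False` if the file is missing or the flag is unset, which is a hint to revisit this step.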
2. Start the container
docker compose up -d
Docker automatically pulls the correct image for your architecture (x86_64 or ARM64).
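Scripts that start the container and connect right away may want to wait until the daemon port accepts connections. The sketch below polls port 9222 (the port used in the examples in this README) with the standard library. `wait_for_port` is a hypothetical helper, and an open port is assumed, not guaranteed, to mean the daemon is ready to create browsers.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 9222,
                  timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A typical use is `if not wait_for_port(): raise SystemExit("daemon not reachable")` before opening a CDP connection.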
3. Connect and automate
Any CDP client can connect directly to the /connect endpoint. No SDK install required.
# pip install playwright && playwright install
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    browser = p.chromium.connect_over_cdp(
        "ws://localhost:9222/connect?headless=false&os=windows"
    )
    page = browser.new_context().new_page()
    page.goto("https://example.com")
    print(page.title())
    input("Browser open. View it at http://localhost:6080/vnc.html, then press Enter to close...")
    browser.close()
View the browser live at http://localhost:6080/vnc.html (noVNC).
For more control (listing, deleting, managing multiple browsers), install the Python SDK:
pip install -r requirements.txt
python examples/playwright_example.py
Upgrading
To upgrade to the latest version of rayobrowse:
# Pull the latest Docker image and restart the container
docker compose pull && docker compose up -d
# Upgrade the Python SDK
pip install --upgrade -r requirements.txt
The Docker image and Python SDK are versioned independently:
- Docker image (rayobyte/rayobrowse:latest): contains the Chromium binary, fingerprint engine, and daemon server
- Python SDK (rayobrowse on PyPI): lightweight client for create_browser()
Both are updated regularly. The SDK maintains backward compatibility with older daemon versions, but upgrading both together is recommended for the best experience.
Requirements
- Docker — the browser runs inside a container
- Python 3.10+ — for the SDK client and examples
- 2GB+ RAM available (~300MB per browser instance)
Works on Linux, Windows (native or WSL2), and macOS. Both x86_64 (amd64) and ARM64 (Apple Silicon, AWS Graviton) are supported — the Docker image is built and tested for both architectures, and Docker automatically pulls the correct one.
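The RAM figure above gives a rough way to size a host. The following sketch turns it into arithmetic; `max_concurrent_browsers` and the `reserved_mb` headroom parameter are illustrative assumptions, and the ~300MB per-instance figure is approximate, so treat the result as an upper bound and not a guarantee.

```python
def max_concurrent_browsers(available_mb: int, per_browser_mb: int = 300,
                            reserved_mb: int = 0) -> int:
    """Estimate how many browser instances fit in the available memory.

    reserved_mb is headroom kept aside for the daemon and the OS (an assumption;
    the README does not state a figure for this overhead).
    """
    usable = max(available_mb - reserved_mb, 0)
    return usable // per_browser_mb


print(max_concurrent_browsers(2048))                    # 2GB host, no headroom -> 6
print(max_concurrent_browsers(8192, reserved_mb=1024))  # 8GB host, 1GB reserved -> 23
```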
What's in the pip package vs. the Docker image
| Component | Where it lives |
| --- | --- |
| rayobrowse Python SDK (create_browser(), client) | pip install rayobrowse — lightweight, pure-Python |
| Chromium binary, fingerprint engine, daemon server | Docker image (rayobyte/rayobrowse) |
The SDK is intentionally minimal — it issues HTTP requests to the daemon and returns CDP WebSocket URLs. All browser-level logic runs inside the container.
Why This Exists
Browser automation is becoming the backbone of web interaction, not just for scraping, but for AI agents, workflow automation, and any tool that needs to navigate the real web. Projects like OpenClaw, Scrapy, Firecrawl, and dozens of others all need a browser to do their job. The problem is that standard headless Chromium gets detected and blocked by most websites. Every one of these tools hits the same wall.
rayobrowse gives them a browser that actually works. It looks like a real device, with a matching fingerprint across user agent, screen resolution, WebGL, fonts, timezone, and every other signal that detection systems check. Any tool that speaks CDP (Chrome DevTools Protocol) can connect and automate without getting blocked.
We needed a browser that:
- Uses Chromium (71% browser market share, so blending in with the majority is key)
- Runs reliably on headless Linux servers with no GPU
- Works with any CDP client (Playwright, Selenium, Puppeteer, AI agents, custom tools)
- Uses real-world, diverse fingerprints
- Can be deployed and updated at scale
- Is commercially maintained long-term
Since no existing solution met these requirements, we built rayobrowse. It's developed as part of our scraping platform, so it'll be commercially supported and up-to-date with the latest anti-scraping techniques.
Architecture
<p align="center"> <img src="assets/architecture.png" alt="rayobrowse architecture"> </p>
rayobrowse runs as a Docker container that bundles the custom Chromium binary, fingerprint engine, and a daemon server. Your code runs on the host and connects over CDP.
There are two ways to get a browser:
| Method | How it works | Best for |
| --- | --- | --- |
| /connect endpoint | Connect to ws://localhost:9222/connect?headless=true&os=windows. A stealth browser is auto-created on connection and cleaned up on disconnect. | Third-party tools (OpenClaw, Scrapy, Firecrawl), quick scripts, any CDP client |
| Python SDK | Call create_browser() to get a CDP WebSocket URL, then connect with your automation library. | Fine-grained control, multiple browsers, custom lifecycle management |
The /connect endpoint is the simplest path. Point any CDP-capable tool at a single static URL and it just works. The Python SDK gives you more control over browser creation, listing, and deletion.
The noVNC viewer on :6080 lets you watch browser sessions in real time, useful for debugging and demos.
Zero system dependencies on your host machine beyond Docker. No Xvfb, no font packages, no Chromium install.
How It Works
Chromium Fork
rayobrowse tracks upstream Chromium releases and applies a focused set of patches (using a plaster approach similar to Brave):
- Normalize and harden exposed browser APIs
- Reduce fingerprint entropy leaks
- Improve automation compatibility
- Preserve native Chromium behavior where possible
Updates are continuously validated against internal test targets before release.
Fingerprint Injection
At startup, each session is assigned a real-world device profile covering:
- User agent, platform, and OS metadata
- Screen resolution and media features
- Graphics and rendering attributes (Canvas, WebGL)
- Fonts matching the target OS
- Locale, timezone, and WebRTC configuration
Profiles are selected dynamically from a database of thousands of real-world fingerprints collected using the same techniques that major anti-bot companies use.
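One way to see the assigned profile is to read a few of these signals back from the page. The sketch below is an assumption-laden illustration: `COLLECT_SIGNALS_JS`, `UA_MARKERS`, `ua_matches_os`, and `collect_signals` are hypothetical names, and the user-agent substrings are a common-sense heuristic, not rayobrowse's own consistency rules.

```python
# JavaScript evaluated in the page to read a handful of the signals listed above.
COLLECT_SIGNALS_JS = """() => ({
    userAgent: navigator.userAgent,
    platform: navigator.platform,
    hardwareConcurrency: navigator.hardwareConcurrency,
    screen: [screen.width, screen.height],
    timezone: Intl.DateTimeFormat().resolvedOptions().timeZone,
})"""

# Substrings typically present in Chrome user agents for each OS (heuristic).
UA_MARKERS = {
    "windows": "Windows NT",
    "macos": "Macintosh",
    "android": "Android",
    "linux": "Linux",
}


def ua_matches_os(signals: dict, os_name: str) -> bool:
    """Check that the reported user agent looks like the requested OS."""
    ua = signals.get("userAgent", "")
    if os_name == "linux":
        # Android user agents also contain "Linux", so exclude them.
        return "Linux" in ua and "Android" not in ua
    return UA_MARKERS[os_name] in ua


def collect_signals(ws_url: str = "ws://localhost:9222/connect?headless=true&os=windows") -> dict:
    """Connect over CDP and return the signals the page reports."""
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(ws_url)
        page = browser.new_context().new_page()
        page.goto("https://example.com")
        signals = page.evaluate(COLLECT_SIGNALS_JS)
        browser.close()
        return signals
```

With the container running, `ua_matches_os(collect_signals(), "windows")` gives a quick sanity check that the session presents the OS requested in the URL.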
Automation Layer
rayobrowse exposes standard Chromium interfaces and avoids non-standard hooks that increase detection risk. Automation connects through native CDP and operates on unmodified page contexts — your existing Playwright, Selenium, and Puppeteer scripts work as-is.
CI & Validation
Every release passes through automated testing including fingerprint consistency checks, detection regression tests, and stability benchmarks. Releases are only published once they pass all validation stages.
Features
Fingerprint Spoofing
Use your own static fingerprint or pull from our database of thousands of real-world fingerprints. Vectors emulated include:
- OS (Windows, Android thoroughly tested; macOS and Linux experimental)
- WebRTC and DNS leak protection
- Canvas and WebGL
- Fonts (matched to target OS)
- Screen resolution
- hardwareConcurrency
- Timezone matching with proxy geolocation (via MaxMind GeoLite2)
- ...and much more
Human Mouse
Optional human-like mouse movement and clicking, inspired by HumanCursor. Use Playwright's page.click() and page.mouse.move() as you normally do — our system applies natural mouse curves and realistic click timing automatically.
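As a sketch of what "as you normally do" can look like, the helper below uses only ordinary Playwright calls (`locator().bounding_box()`, `mouse.move()`, `click()`). `center_of` and `hover_then_click` are hypothetical names, and nothing here enables or configures the human-mouse feature itself; how that option is switched on is not covered in this section.

```python
def center_of(box: dict) -> tuple:
    """Return the center point of a Playwright bounding-box dict."""
    return (box["x"] + box["width"] / 2, box["y"] + box["height"] / 2)


def hover_then_click(page, selector: str) -> None:
    """Move the pointer to an element's center, then click it.

    `page` is a Playwright Page connected to rayobrowse. The calls are plain
    Playwright; any curve or timing adjustments happen on the browser side.
    """
    box = page.locator(selector).bounding_box()
    if box is None:
        # Playwright returns None when the element is not visible.
        raise ValueError(f"{selector!r} is not visible")
    x, y = center_of(box)
    page.mouse.move(x, y)
    page.click(selector)
```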

Proxy Support
Route traffic through any HTTP proxy, just as you would with standard Playwright.
Headless or Headful
Run headful mode on headless Linux servers (handled inside the container via Xvnc). Watch sessions live through the built-in noVNC viewer.
Usage
rayobrowse works with Playwright, Selenium, Puppeteer, and any tool that speaks CDP. See the examples/ folder for ready-to-run scripts.
Using /connect (simplest)
Connect any CDP client directly to the /connect endpoint. No SDK needed.
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    browser = p.chromium.connect_over_cdp(
        "ws://localhost:9222/connect?headless=true&os=windows"
    )
    page = browser.new_context().new_page()
    page.goto("https://example.com")
    print(page.title())
    browser.close()
Customize the browser via query parameters:
ws://localhost:9222/connect?headless=true&os=windows&proxy=http://user:pass@host:port
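When the URL is assembled in code, a small helper keeps the query string consistent. This is a minimal sketch: `build_connect_url` is a hypothetical helper (not part of the SDK), it covers only the `headless`, `os`, and `proxy` parameters shown above, and it assumes the daemon decodes any percent-encoded characters in the usual way.

```python
from typing import Optional
from urllib.parse import urlencode


def build_connect_url(host: str = "localhost", port: int = 9222, *,
                      headless: bool = True, os_name: str = "linux",
                      proxy: Optional[str] = None) -> str:
    """Build a /connect WebSocket URL from the documented query parameters."""
    params = {"headless": "true" if headless else "false", "os": os_name}
    if proxy:
        params["proxy"] = proxy
    # Leave ":", "/" and "@" unescaped so the proxy value matches the form
    # shown in the example above; other reserved characters are encoded.
    return f"ws://{host}:{port}/connect?{urlencode(params, safe=':/@')}"


print(build_connect_url(os_name="windows", proxy="http://user:pass@host:port"))
```

The returned string can be passed straight to `connect_over_cdp()` in the Playwright example above.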
All /connect parameters:
| Parameter | Default | Description |
|-----------|---------|-------------|
| headless | true | true or false |
| os | linux | Fingerprint OS: windows, linux, android, macos |
| browser_name | chrome | Browse
