Rayobrowse

Stealth Chromium browser for large-scale web scraping.

<img src="assets/rayobrowse.png" alt="rayobrowse"> Self-hosted Chromium stealth browser for web scraping and automation.

Overview

rayobrowse is a Chromium-based stealth browser for web scraping, AI agents, and automation workflows. It runs on headless Linux servers (no GPU required) and works with any tool that speaks CDP: Playwright, Puppeteer, Selenium, OpenClaw, Scrapy, and custom automation scripts.

Standard headless Chromium gets blocked immediately by modern bot detection. rayobrowse fixes this with realistic fingerprints (user agent, screen resolution, WebGL, fonts, timezone, and dozens of other signals) that make each session look like a real device.

It runs inside Docker (x86_64 and ARM64) and is actively used in production on Rayobyte's scraping API to scrape millions of pages per day across some of the most difficult, high-value websites.

Quick Start

1. Set up environment

cp .env.example .env

Open .env and set STEALTH_BROWSER_ACCEPT_TERMS=true to confirm you agree to the LICENSE. The daemon will not create browsers until this is set.
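Because the daemon refuses to create browsers until the terms flag is set, a small preflight check can catch a forgotten edit before the container starts. The following is a minimal sketch, not part of rayobrowse: the helper name `terms_accepted` is hypothetical, and it assumes the `.env` file uses plain `KEY=value` lines with the literal value `true`.

```python
from pathlib import Path


def terms_accepted(env_path: str = ".env") -> bool:
    """Return True if the file sets STEALTH_BROWSER_ACCEPT_TERMS=true."""
    path = Path(env_path)
    if not path.is_file():
        return False
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "STEALTH_BROWSER_ACCEPT_TERMS":
            # Only the exact value "true" counts as acceptance in this sketch.
            return value.strip() == "true"
    return False
```

Calling `terms_accepted()` from the project directory before `docker compose up -d` gives an early warning if the flag is missing or still set to another value.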

2. Start the container

docker compose up -d

Docker automatically pulls the correct image for your architecture (x86_64 or ARM64).
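After the container starts, it can take a moment before its ports accept connections. The sketch below polls a TCP port until it opens, using only the standard library. The helper name `wait_for_port` is hypothetical, and an open port only shows that something is listening, not that the daemon is fully ready.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only proves that something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the ports used in this README, `wait_for_port("localhost", 9222)` checks the CDP endpoint and `wait_for_port("localhost", 6080)` checks the noVNC viewer.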

3. Connect and automate

Any CDP client can connect directly to the /connect endpoint. No SDK install required.

# pip install playwright && playwright install
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.connect_over_cdp(
        "ws://localhost:9222/connect?headless=false&os=windows"
    )
    page = browser.new_context().new_page()
    page.goto("https://example.com")
    print(page.title())
    input("Browser open — view at http://localhost:6080/vnc.html. Press Enter to close...")
    browser.close()

View the browser live at http://localhost:6080/vnc.html (noVNC).

For more control (listing, deleting, managing multiple browsers), install the Python SDK:

pip install -r requirements.txt
python examples/playwright_example.py

Upgrading

To upgrade to the latest version of rayobrowse:

# Pull the latest Docker image and restart the container
docker compose pull && docker compose up -d

# Upgrade the Python SDK
pip install --upgrade -r requirements.txt

The Docker image and Python SDK are versioned independently:

Docker image (rayobyte/rayobrowse:latest) — contains Chromium binary, fingerprint engine, daemon server
Python SDK (rayobrowse on PyPI) — lightweight client for create_browser()

Both are updated regularly. The SDK maintains backward compatibility with older daemon versions, but upgrading both together is recommended for the best experience.

Requirements

Docker — the browser runs inside a container
Python 3.10+ — for the SDK client and examples
2GB+ RAM available (~300MB per browser instance)
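
The per-instance figure above allows a rough capacity estimate. This is a back-of-the-envelope sketch only: the function name and the `reserved_mb` headroom for the operating system and daemon are assumptions, and real memory use depends on the pages being loaded.

```python
def max_browsers(available_mb: int, per_browser_mb: int = 300, reserved_mb: int = 512) -> int:
    """Estimate how many browser instances fit in the available memory."""
    # reserved_mb is an assumed headroom, not a documented rayobrowse figure.
    usable = available_mb - reserved_mb
    return max(usable // per_browser_mb, 0)


print(max_browsers(2048))  # → 5
print(max_browsers(8192))  # → 25
```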

Works on Linux, Windows (native or WSL2), and macOS. Both x86_64 (amd64) and ARM64 (Apple Silicon, AWS Graviton) are supported — the Docker image is built and tested for both architectures, and Docker automatically pulls the correct one.

What's in the pip package vs. the Docker image

| Component | Where it lives |
| --- | --- |
| rayobrowse Python SDK (create_browser(), client) | pip install rayobrowse (lightweight, pure-Python) |
| Chromium binary, fingerprint engine, daemon server | Docker image (rayobyte/rayobrowse) |

The SDK is intentionally minimal — it issues HTTP requests to the daemon and returns CDP WebSocket URLs. All browser-level logic runs inside the container.

Why This Exists

Browser automation is becoming the backbone of web interaction: not just for scraping, but also for AI agents, workflow automation, and any tool that needs to navigate the real web. Projects like OpenClaw, Scrapy, Firecrawl, and dozens of others all need a browser to do their job. The problem is that standard headless Chromium gets detected and blocked by most websites, so every one of these tools hits the same wall.

rayobrowse gives them a browser that actually works. It looks like a real device, with a matching fingerprint across user agent, screen resolution, WebGL, fonts, timezone, and every other signal that detection systems check. Any tool that speaks CDP (Chrome DevTools Protocol) can connect and automate without getting blocked.

We needed a browser that:

Uses Chromium (71% browser market share; blending in is key)
Runs reliably on headless Linux servers with no GPU
Works with any CDP client (Playwright, Selenium, Puppeteer, AI agents, custom tools)
Uses real-world, diverse fingerprints
Can be deployed and updated at scale
Is commercially maintained long-term

Since no existing solution met these requirements, we built rayobrowse. It's developed as part of our scraping platform, so it'll be commercially supported and up-to-date with the latest anti-scraping techniques.

Architecture

rayobrowse runs as a Docker container that bundles the custom Chromium binary, fingerprint engine, and a daemon server. Your code runs on the host and connects over CDP.

There are two ways to get a browser:

| Method | How it works | Best for |
| --- | --- | --- |
| /connect endpoint | Connect to ws://localhost:9222/connect?headless=true&os=windows. A stealth browser is auto-created on connection and cleaned up on disconnect. | Third-party tools (OpenClaw, Scrapy, Firecrawl), quick scripts, any CDP client |
| Python SDK | Call create_browser() to get a CDP WebSocket URL, then connect with your automation library. | Fine-grained control, multiple browsers, custom lifecycle management |

The /connect endpoint is the simplest path. Point any CDP-capable tool at a single static URL and it just works. The Python SDK gives you more control over browser creation, listing, and deletion.

The noVNC viewer on :6080 lets you watch browser sessions in real time, useful for debugging and demos.

Zero system dependencies on your host machine beyond Docker. No Xvfb, no font packages, no Chromium install.

How It Works

Chromium Fork

rayobrowse tracks upstream Chromium releases and applies a focused set of patches (using a plaster approach similar to Brave):

Normalize and harden exposed browser APIs
Reduce fingerprint entropy leaks
Improve automation compatibility
Preserve native Chromium behavior where possible

Updates are continuously validated against internal test targets before release.

Fingerprint Injection

At startup, each session is assigned a real-world device profile covering:

User agent, platform, and OS metadata
Screen resolution and media features
Graphics and rendering attributes (Canvas, WebGL)
Fonts matching the target OS
Locale, timezone, and WebRTC configuration

Profiles are selected dynamically from a database of thousands of real-world fingerprints collected using the same techniques that major anti-bot companies use.

Automation Layer

rayobrowse exposes standard Chromium interfaces and avoids non-standard hooks that increase detection risk. Automation connects through native CDP and operates on unmodified page contexts — your existing Playwright, Selenium, and Puppeteer scripts work as-is.

CI & Validation

Every release passes through automated testing including fingerprint consistency checks, detection regression tests, and stability benchmarks. Releases are only published once they pass all validation stages.

Features

Fingerprint Spoofing

Use your own static fingerprint or pull from our database of thousands of real-world fingerprints. Vectors emulated include:

OS (Windows and Android thoroughly tested; macOS and Linux experimental)
WebRTC and DNS leak protection
Canvas and WebGL
Fonts (matched to target OS)
Screen resolution
hardwareConcurrency
Timezone matching with proxy geolocation (via MaxMind GeoLite2)
...and much more

Human Mouse

Optional human-like mouse movement and clicking, inspired by HumanCursor. Use Playwright's page.click() and page.mouse.move() as you normally do — our system applies natural mouse curves and realistic click timing automatically.

Human-like mouse movement demonstration

Proxy Support

Route traffic through any HTTP proxy, just as you would with standard Playwright.

Headless or Headful

Run headful mode on headless Linux servers (handled inside the container via Xvnc). Watch sessions live through the built-in noVNC viewer.

Usage

rayobrowse works with Playwright, Selenium, Puppeteer, and any tool that speaks CDP. See the examples/ folder for ready-to-run scripts.

Using `/connect` (simplest)

Connect any CDP client directly to the /connect endpoint. No SDK needed.

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.connect_over_cdp(
        "ws://localhost:9222/connect?headless=true&os=windows"
    )
    page = browser.new_context().new_page()
    page.goto("https://example.com")
    print(page.title())
    browser.close()

Customize the browser via query parameters:

ws://localhost:9222/connect?headless=true&os=windows&proxy=http://user:pass@host:port
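
When the parameters come from configuration rather than a hard-coded string, the URL can be assembled programmatically. The sketch below is not part of rayobrowse: `build_connect_url` is a hypothetical helper, and it percent-encodes values such as the proxy URL on the assumption that the daemon decodes query strings in the standard way (the example above passes the proxy unencoded).

```python
from urllib.parse import urlencode


def build_connect_url(host: str = "localhost", port: int = 9222, **params) -> str:
    """Build a /connect WebSocket URL from keyword parameters."""
    query = {}
    for key, value in params.items():
        if value is None:
            continue  # Omit unset options so the daemon's defaults apply.
        # Booleans are written as lowercase true/false, matching the URLs above.
        query[key] = str(value).lower() if isinstance(value, bool) else str(value)
    base = f"ws://{host}:{port}/connect"
    return f"{base}?{urlencode(query)}" if query else base


print(build_connect_url(headless=True, os="windows"))
# → ws://localhost:9222/connect?headless=true&os=windows
```

The returned string can be passed directly to `connect_over_cdp()`. If the daemon turns out to expect the proxy value unencoded, build that one parameter by hand instead.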

All /connect parameters:

| Parameter | Default | Description |
|-----------|---------|-------------|
| headless | true | true or false |
| os | linux | Fingerprint OS: windows, linux, android, macos |
| browser_name | chrome | Browse
