Crw

⚡ Lightweight Firecrawl alternative in Rust — 92% coverage, 5.5x faster, 6.6 MB idle RAM. Web scraper & crawler with MCP server for Claude, LLM extraction, JS rendering.

<p align="center"> <a href="https://fastcrw.com"> <img src="docs/fastcrw-banner.png" alt="fastCRW" height="120" /> </a> <p align="center">The web scraper built for AI agents. Single binary. Zero config.</p> <p align="center"> <a href="https://crates.io/crates/crw-server"><img src="https://img.shields.io/crates/v/crw-server.svg" alt="crates.io"></a> <a href="https://github.com/us/crw/actions"><img src="https://github.com/us/crw/workflows/CI/badge.svg" alt="CI"></a> <a href="LICENSE"><img src="https://img.shields.io/badge/license-AGPL--3.0-blue.svg" alt="License"></a> <a href="https://github.com/us/crw/stargazers"><img src="https://img.shields.io/github/stars/us/crw?style=social" alt="GitHub Stars"></a> <a href="https://fastcrw.com"><img src="https://img.shields.io/badge/Managed%20Cloud-fastcrw.com-blueviolet" alt="fastcrw.com"></a> <a href="https://discord.gg/kkFh2SC8"><img src="https://img.shields.io/badge/Discord-Join%20Community-7289da?logo=discord&logoColor=white" alt="Discord"></a> </p> <p align="center"> Works with: Claude Code · Cursor · Windsurf · Cline · Copilot · Continue.dev · Codex </p> <p align="center"> <a href="docs/docs/mcp.md">MCP Integration</a> &bull; <a href="docs/docs/installation.md">Installation</a> &bull; <a href="docs/docs/rest-api.md">API Reference</a> &bull; <a href="https://fastcrw.com">Cloud</a> &bull; <a href="docs/docs/js-rendering.md">JS Rendering</a> &bull; <a href="docs/docs/configuration.md">Configuration</a> &bull; <a href="https://discord.gg/kkFh2SC8">Discord</a> </p> <p align="center"> <b>English</b> | <a href="README.zh-CN.md">中文</a> </p> </p>

Don't want to self-host? fastcrw.com is the managed cloud — global proxy network, auto-scaling, dashboard, and API keys. Same Firecrawl-compatible API. Get 500 free credits →

CRW is the open-source web scraper built for AI agents. Built-in MCP server (stdio + HTTP), single binary, 6.6 MB idle RAM. Give Claude Code, Cursor, or any MCP client web scraping superpowers in 30 seconds. Firecrawl-compatible API — 5.5x faster, 75x less memory, 92% coverage on 1K real-world URLs.

Built-in MCP server. Single binary. No Redis. No Node.js.

cargo install crw-mcp
claude mcp add crw -- crw-mcp

What's New

0.0.14 (2026-03-25)

Features

  • mcp: auto-download LightPanda binary for zero-config JS rendering (41f443b)
  • mcp: auto-spawn headless Chrome for JS rendering in embedded mode (9a6b0ae)

Bug Fixes

  • ci: move crw-mcp to Tier 4 in release workflow and add workflow_dispatch (d7584a8)

0.0.13 (2026-03-24)

Features

  • mcp: add embedded mode — self-contained MCP server, no crw-server needed (75e5450)

Bug Fixes

  • ci: switch release-please to simple type for Rust workspace support (51cd420)

v0.0.12

  • Readability drill-down — when <main> or <article> wraps >90% of body, the extractor now searches inside for narrower content elements (.main-page-content, .article-content, .entry-content, etc.) instead of discarding. Fixes MDN pages returning 35 chars and StackOverflow returning only the question
  • Base64 image stripping — data: URI images are removed in both HTML cleaning (lol_html) and markdown post-processing (regex safety net). Eliminates massive base64 blobs from Reddit and similar sites
  • Select/dropdown removal — <select> elements removed in onlyMainContent mode; dropdown/city-selector/location-selector noise patterns added. Fixes Hürriyet city dropdown leaking into content
  • Extended scored selectors — added .main-page-content, .js-post-body, .s-prose, #question, .page-content, #page-content, [role="article"] for better MDN, StackOverflow, and generic site coverage
  • Smarter fallback chain — when primary extraction produces too-short markdown, both fallbacks (cleaned HTML and basic clean) are tried and the longer result is picked, instead of short-circuiting on non-empty but insufficient content
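The "smarter fallback chain" item can be illustrated with a short Python sketch. This is not CRW's Rust implementation: `pick_markdown`, the extractor callables, and the 200-character threshold are all hypothetical names and values chosen for illustration, since the changelog does not say what counts as "too short". Keeping the primary result in the comparison is also an assumption.

```python
from typing import Callable, Sequence

# ASSUMPTION: the changelog does not state the real "too short" cutoff.
MIN_MARKDOWN_CHARS = 200

Extractor = Callable[[str], str]


def pick_markdown(
    html: str,
    primary: Extractor,
    fallbacks: Sequence[Extractor],
    min_chars: int = MIN_MARKDOWN_CHARS,
) -> str:
    """Return the primary result if it is long enough, else the longest candidate."""
    result = primary(html).strip()
    if len(result) >= min_chars:
        return result
    # Do not stop at the first non-empty fallback: run every fallback and
    # keep whichever output is longest. The primary result stays in the
    # pool (an assumption) so a short answer still beats empty fallbacks.
    candidates = [result] + [fallback(html).strip() for fallback in fallbacks]
    return max(candidates, key=len)
```

With a primary extractor that returns only a few characters, the function returns whichever of the two fallbacks produced more text, rather than the first one that returned anything.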

Full changelog →

Why CRW?

CRW gives you Firecrawl's API with a fraction of the resource usage. No runtime dependencies, no Redis, no Node.js — just a single binary you can deploy anywhere.

| Metric | CRW (self-hosted) | fastcrw.com (cloud) | Firecrawl | Crawl4AI | Spider |
|---|---|---|---|---|---|
| Coverage (1K URLs) | 92.0% | 92.0% | 77.2% | — | 99.9% |
| Avg Latency | 833ms | 833ms | 4,600ms | — | — |
| P50 Latency | 446ms | 446ms | — | — | 45ms (static) |
| Noise Rejection | 88.4% | 88.4% | noise 6.8% | noise 11.3% | noise 4.2% |
| Idle RAM | 6.6 MB | 0 (managed) | ~500 MB+ | — | cloud-only |
| Cold start | 85 ms | 0 (always-on) | 30–60 s | — | — |
| HTTP scrape | ~30 ms | ~30 ms | ~200 ms+ | ~480 ms | ~45 ms |
| Proxy network | BYO | Global (built-in) | Built-in | — | Cloud-only |
| Cost / 1K scrapes | $0 (self-hosted) | From $13/mo | $0.83–5.33 | $0 | $0.65 |
| Dependencies | single binary | None (API) | Node + Redis + PG + RabbitMQ | Python + Playwright | Rust / cloud |
| License | AGPL-3.0 | Managed | AGPL-3.0 | Apache-2.0 | MIT |

<details> <summary><b>Full benchmark details</b></summary>

CRW vs Firecrawl — Tested on Firecrawl scrape-content-dataset-v1 (1,000 real-world URLs, JS rendering enabled):

  • CRW covers 92% of URLs vs Firecrawl's 77.2% — 15 percentage points higher
  • CRW is 5.5x faster on average (833ms vs 4,600ms)
  • CRW uses ~75x less RAM at idle (6.6 MB vs ~500 MB+)
  • Firecrawl requires 5 containers (Node.js, Redis, PostgreSQL, RabbitMQ, Playwright) — CRW is a single binary
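The headline ratios follow directly from the raw figures in the bullets above. A quick Python check, using only numbers quoted in this README (the variable names are illustrative, and Firecrawl's idle RAM is taken at its stated lower bound of 500 MB):

```python
# Figures copied from the benchmark bullets above.
crw_coverage, firecrawl_coverage = 92.0, 77.2     # percent of 1,000 URLs
crw_latency_ms, firecrawl_latency_ms = 833, 4600  # average latency
crw_idle_mb, firecrawl_idle_mb = 6.6, 500         # idle RAM ("~500 MB+" lower bound)

coverage_gap = crw_coverage - firecrawl_coverage  # percentage points
speedup = firecrawl_latency_ms / crw_latency_ms   # times faster on average
ram_ratio = firecrawl_idle_mb / crw_idle_mb       # times less idle RAM

print(f"coverage gap: {coverage_gap:.1f} points")
print(f"speedup:      {speedup:.2f}x")
print(f"RAM ratio:    {ram_ratio:.1f}x")
```

The coverage gap comes out just under the rounded "15 percentage points", and the RAM ratio lands between 75x and 76x, consistent with the "~75x" claim.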

Crawl4AI vs Firecrawl vs Spider — Independent benchmark by Spider.cloud:

| Metric | Spider | Firecrawl | Crawl4AI |
|--------|--------|-----------|----------|
| Static throughput | 182 p/s | 27 p/s | 19 p/s |
| Success (static) | 100% | 99.5% | 99% |
| Success (SPA) | 100% | 96.6% | 93.7% |
| Success (anti-bot) | 99.6% | 88.4% | 72% |
| Latency (static) | 45ms | 310ms | 480ms |
| Latency (SPA) | 820ms | 1,400ms | 1,650ms |

Firecrawl independent review — Scrapeway benchmark: 64.3% success rate, $5.11/1K scrapes, 0% on LinkedIn/Twitter.

Resource comparison:

| Metric | CRW | Firecrawl |
|---|---|---|
| Min RAM | ~7 MB | 4 GB |
| Recommended RAM | ~64 MB (under load) | 8–16 GB |
| Docker images | single ~8 MB binary | ~2–3 GB total |
| Cold start | 85 ms | 30–60 seconds |
| Containers needed | 1 (+optional sidecar) | 5 |

</details>

Features

  • 🔧 MCP server — built-in stdio + HTTP transport for Claude Code, Cursor, Windsurf, and any MCP client
  • 🔌 Firecrawl-compatible API — same endpoint family and familiar request/response ergonomics
  • 📄 6 output formats — markdown, HTML, cleaned HTML, raw HTML, plain text, links, structured JSON
  • 🤖 LLM structured extraction — send a JSON schema, get validated structured data back (Anthropic tool_use + OpenAI function calling)
  • 🌐 JS rendering — auto-detect SPAs with shell heuristics, render via LightPanda, Playwright, or Chrome (CDP)
  • 🕷️ BFS crawler — async crawl with rate limiting, robots.txt, sitemap support, concurrent jobs
  • 🔒 Security — SSRF protection (private IPs, cloud metadata, IPv6), constant-time auth, dangerous URI filtering
  • 🐳 Docker ready — multi-stage build with LightPanda sidecar
  • 🎯 CSS selector & XPath — extract specific DOM elements before Markdown conversion
  • ✂️ Chunking & filtering — split content into topic/sentence/regex chunks; rank by BM25 or cosine similarity
  • 🕵️ Stealth mode — browser-like UA rotation and header injection to reduce bot detection
  • 🌐 Per-request proxy — override the global proxy per scrape request
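To make the Security bullet concrete, here is a minimal Python sketch of the kind of pre-flight check that SSRF protection implies: refusing non-HTTP schemes and literal IPs in private, loopback, or link-local ranges (the cloud metadata address 169.254.169.254 is link-local). It is an illustration only, not CRW's actual Rust code. `is_blocked_target` is a hypothetical name, and a real check would also have to resolve DNS names and test every resolved address.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_blocked_target(url: str) -> bool:
    """Return True if the URL should be refused before any request is made."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        return True  # file:, javascript:, data:, and similar schemes
    host = parts.hostname
    if not host:
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name. This sketch only rejects obvious localhost aliases;
        # it does NOT resolve the name, which a real check must do.
        return host == "localhost" or host.endswith(".localhost")
    # Unwrap IPv4-mapped IPv6 addresses such as ::ffff:127.0.0.1.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local  # covers 169.254.169.254 (cloud metadata)
        or ip.is_unspecified
        or ip.is_reserved
        or ip.is_multicast
    )
```

For example, `is_blocked_target("http://169.254.169.254/latest/meta-data/")` is refused while `is_blocked_target("https://example.com/")` passes this literal-IP stage.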

Cloud vs Self-Hosted

| Feature | Self-hosted | Cloud (fastcrw.com) |
|---|---|---|
| Setup | cargo install crw-server | Sign up → get API key |
| Infrastructure | You manage | Fully managed |
| Proxy | Bring your own | Global proxy network |
| Scaling | Manual | Auto-scaling |
| API | Firecrawl-compatible | Same Firecrawl-compatible API |

Both use the same Firecrawl-compatible API — your code works with either. Switch between self-hosted and cloud by changing the base URL.

Quick Start

MCP (AI agents — recommended):

cargo install crw-mcp
claude mcp add crw -- crw-mcp

That's it. Claude Code now has crw_scrape, crw_crawl, crw_map tools. For Cursor, Windsurf, Cline, and other MCP clients, see MCP Server.

CLI (no server needed):

cargo install crw-cli
crw https://example.com

Self-hosted server:

cargo install crw-server
crw-server

Enable JS rendering (optional):

crw-server setup

This downloads LightPanda and creates a config.local.toml for JS rendering. See JS Rendering for details.

Cloud (no setup):

curl -X POST https://fastcrw.com/api/v1/scrape \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Get your API key at [fastcrw.com](https://fastcrw.com).
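The same request can be built from Python, which also shows what "switch by changing the base URL" looks like in practice. This is a sketch using only the standard library. The cloud base URL, Bearer header, and JSON body come from the curl example above. The self-hosted address (`http://localhost:3000`) and the assumption that a self-hosted server serves the same `/v1/scrape` path are not stated in this README, so check them against your own configuration. `build_scrape_request` is a hypothetical helper name.

```python
import json
import urllib.request
from typing import Optional

CLOUD_BASE = "https://fastcrw.com/api"      # from the curl example above
SELF_HOSTED_BASE = "http://localhost:3000"  # ASSUMPTION: host and port not given in this README


def build_scrape_request(
    base_url: str, url: str, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST request to the scrape endpoint."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/scrape",  # ASSUMPTION: same path when self-hosted
        data=body,
        headers=headers,
        method="POST",
    )


cloud_req = build_scrape_request(CLOUD_BASE, "https://example.com", api_key="YOUR_API_KEY")
local_req = build_scrape_request(SELF_HOSTED_BASE, "https://example.com")
```

To send either request, pass it to `urllib.request.urlopen(...)` and decode the response with `json.load`. Only the base URL and the optional API key differ between the two.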
