🔥 Firecrawl

Turn websites into LLM-ready data.

Firecrawl is an API that scrapes, crawls, and extracts structured data from any website, powering AI agents and apps with real-time context from the web.

Looking for our MCP server? It lives in a separate repository.

This repository is in development, and we're still integrating custom modules into the monorepo. It's not fully ready for self-hosted deployment yet, but you can run it locally.

Psst. Hey, you, join our stargazers :)

Why Firecrawl?

LLM-ready output: Clean markdown, structured JSON, screenshots, HTML, and more
Industry-leading reliability: >80% coverage on benchmark evaluations, outperforming every other provider tested
Handles the hard stuff: Proxies, JavaScript rendering, and dynamic content that breaks other scrapers
Customization: Exclude tags, crawl behind auth walls, max depth, and more
Media parsing: Automatic text extraction from PDFs, DOCX, and images
Actions: Click, scroll, input, wait, and more before extracting
Batch processing: Scrape thousands of URLs asynchronously
Change tracking: Monitor website content changes over time

Quick Start

Make Your First API Request

curl -X POST 'https://api.firecrawl.dev/v2/scrape' \
  -H 'Authorization: Bearer fc-YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{"url": "https://example.com"}'

Response:

{
  "success": true,
  "data": {
    "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
    "metadata": {
      "title": "Example Domain",
      "sourceURL": "https://example.com"
    }
  }
}
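
For readers who would rather not shell out to curl, the same request can be sketched with only the Python standard library. This is a minimal illustration, not the official SDK: the helper names `build_scrape_request` and `send` and the 30-second timeout are assumptions made for this example, while the endpoint, headers, and body mirror the curl command above.

```python
import json
import urllib.request

API_URL = "https://api.firecrawl.dev/v2/scrape"  # endpoint from the curl example


def build_scrape_request(url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body.

    Needs network access and a valid API key; the timeout is an assumed default.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request is side-effect free, so it is safe to inspect offline.
request = build_scrape_request("https://example.com", "fc-YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Calling `send(request)` with a real key should return a dictionary shaped like the response above.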

Install the Firecrawl Skill & CLI

The Firecrawl Skill is an easy way for AI agents such as Claude Code, Antigravity and OpenCode to use Firecrawl through the CLI.

Install and configure the skill for all detected AI coding agents:

npx -y firecrawl-cli@latest init --all --browser

After installing, restart your agent for it to discover the new skill.

You can also install the CLI globally:

npm install -g firecrawl-cli

Authenticate with your API key:

# Interactive login (opens browser)
firecrawl login --browser

# Or login with API key directly
firecrawl login --api-key fc-YOUR_API_KEY

# Or set via environment variable
export FIRECRAWL_API_KEY=fc-YOUR_API_KEY
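
If your own scripts should pick up the same `FIRECRAWL_API_KEY` variable, a small lookup helper keeps the key out of source code. This is a sketch: `resolve_api_key` is a hypothetical helper written for this example, not part of the CLI or SDK.

```python
import os
from typing import Optional


def resolve_api_key(explicit_key: Optional[str] = None) -> str:
    """Return an explicitly passed key, else the FIRECRAWL_API_KEY environment variable."""
    key = explicit_key or os.environ.get("FIRECRAWL_API_KEY", "")
    if not key:
        raise RuntimeError("Set FIRECRAWL_API_KEY or pass an API key explicitly")
    return key
```

The returned value can then be passed wherever a key is expected, for example `Firecrawl(api_key=resolve_api_key())`.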

Try a quick scrape:

firecrawl https://example.com --only-main-content

See the full Skill + CLI documentation for all available commands including search, map, crawl, agent, and browser automation.

Feature Overview

| Feature | Description |
|---------|-------------|
| Scrape | Convert any URL to markdown, HTML, screenshots, or structured JSON |
| Search | Search the web and get full page content from results |
| Browse | Let agents safely interact with the web |
| Map | Discover all URLs on a website instantly |
| Crawl | Scrape all URLs of a website with a single request |
| Agent | Automated data gathering, just describe what you need |

Scrape

Convert any URL to clean markdown, HTML, or structured data.

curl -X POST 'https://api.firecrawl.dev/v2/scrape' \
  -H 'Authorization: Bearer fc-YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://docs.firecrawl.dev",
    "formats": ["markdown", "html"]
  }'

Response:

{
  "success": true,
  "data": {
    "markdown": "# Firecrawl Docs\n\nTurn websites into LLM-ready data...",
    "html": "<!DOCTYPE html><html>...",
    "metadata": {
      "title": "Quickstart | Firecrawl",
      "description": "Firecrawl allows you to turn entire websites into LLM-ready markdown",
      "sourceURL": "https://docs.firecrawl.dev",
      "statusCode": 200
    }
  }
}
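
Before using the markdown, it helps to check both the API-level `success` flag and the target page's own `statusCode`. The sketch below does that on a trimmed copy of the response above. The function name `extract_markdown` and the `error` key read on failure are assumptions for illustration; the failure shape is not documented in this README.

```python
def extract_markdown(payload: dict) -> str:
    """Return the markdown from a scrape response, raising if the scrape failed."""
    if not payload.get("success"):
        # ASSUMPTION: failed responses carry an "error" message field.
        raise ValueError(f"scrape failed: {payload.get('error', 'unknown error')}")
    data = payload.get("data") or {}
    status = (data.get("metadata") or {}).get("statusCode")
    if status is not None and status >= 400:
        raise ValueError(f"target page returned HTTP {status}")
    return data.get("markdown", "")


# Trimmed copy of the response shown above.
sample = {
    "success": True,
    "data": {
        "markdown": "# Firecrawl Docs\n\nTurn websites into LLM-ready data...",
        "html": "<!DOCTYPE html><html>...",
        "metadata": {
            "title": "Quickstart | Firecrawl",
            "sourceURL": "https://docs.firecrawl.dev",
            "statusCode": 200,
        },
    },
}

print(extract_markdown(sample).splitlines()[0])  # prints "# Firecrawl Docs"
```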

Extract Structured Data (JSON Mode)

Extract structured data using a schema:

from firecrawl import Firecrawl
from pydantic import BaseModel

app = Firecrawl(api_key="fc-YOUR_API_KEY")

class CompanyInfo(BaseModel):
    company_mission: str
    is_open_source: bool
    is_in_yc: bool

result = app.scrape(
    'https://firecrawl.dev',
    formats=[{"type": "json", "schema": CompanyInfo.model_json_schema()}]
)

print(result.json)

Output:

{"company_mission": "Turn websites into LLM-ready data", "is_open_source": true, "is_in_yc": true}

Or extract with just a prompt (no schema):

result = app.scrape(
    'https://firecrawl.dev',
    formats=[{"type": "json", "prompt": "Extract the company mission"}]
)
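
If you prefer not to depend on pydantic, the schema can also be written by hand as a plain dictionary, and the extracted result can be sanity-checked before use. The sketch below is a rough hand-written equivalent of `CompanyInfo`, not the exact output of `model_json_schema()`. `check_extraction` is a hypothetical helper that only covers the two JSON types used here.

```python
# Hand-written JSON Schema, roughly equivalent to the CompanyInfo model above.
COMPANY_INFO_SCHEMA = {
    "type": "object",
    "properties": {
        "company_mission": {"type": "string"},
        "is_open_source": {"type": "boolean"},
        "is_in_yc": {"type": "boolean"},
    },
    "required": ["company_mission", "is_open_source", "is_in_yc"],
}

# Same shape as the `formats` argument in the scrape call above.
formats = [{"type": "json", "schema": COMPANY_INFO_SCHEMA}]

# Only the JSON types used in this schema are mapped (an intentional limitation).
PYTHON_TYPES = {"string": str, "boolean": bool}


def check_extraction(result: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the result fits the schema."""
    problems = []
    for name in schema.get("required", []):
        if name not in result:
            problems.append(f"missing field: {name}")
    for name, spec in schema.get("properties", {}).items():
        expected = PYTHON_TYPES.get(spec.get("type"))
        if name in result and expected and not isinstance(result[name], expected):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
    return problems


extracted = {
    "company_mission": "Turn websites into LLM-ready data",
    "is_open_source": True,
    "is_in_yc": True,
}
print(check_extraction(extracted, COMPANY_INFO_SCHEMA))  # prints []
```

LLM-based extraction can omit or mistype fields, so a check like this is a cheap guard before the data reaches downstream code.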

Scrape Formats

Available formats: markdown, html, rawHtml, screenshot, links, json, branding

Get a screenshot

doc = app.scrape("https://firecrawl.dev", formats=["screenshot"])
print(doc.screenshot)  # Base64 encoded image
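
To write the screenshot to disk, the Base64 string has to be decoded first. The sketch below assumes the value is Base64 text, as the comment above states, optionally prefixed with a data URI header. If your response carries a URL instead, download it rather than decoding it. Both helper names are hypothetical.

```python
import base64
from pathlib import Path


def decode_screenshot(value: str) -> bytes:
    """Decode a Base64 screenshot string, tolerating a data URI prefix."""
    # Strip a prefix such as "data:image/png;base64," if one is present.
    if value.startswith("data:") and "," in value:
        value = value.split(",", 1)[1]
    return base64.b64decode(value)


def save_screenshot(value: str, path: str) -> Path:
    """Decode the screenshot and write the raw image bytes to `path`."""
    target = Path(path)
    target.write_bytes(decode_screenshot(value))
    return target
```

With the `doc` object from the example above, `save_screenshot(doc.screenshot, "page.png")` would write the image next to your script.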

Extract brand identity (colors, fonts, typography)

doc = app.scrape("https://firecrawl.dev", formats=["branding"])
print(doc.branding)  # {"colors": {...}, "fonts": [...], "typography": {...}}

Actions (Interact Before Scraping)

Click, type, scroll, and more before extracting:

doc = app.scrape(
    url="https://example.com/login",
    formats=["markdown"],
    actions=[
        {"type": "write", "text": "user@example.com"},
        {"type": "press", "key": "Tab"},
        {"type": "write", "text": "password"},
        {"type": "click", "selector": 'button[type="submit"]'},
        {"type": "wait", "milliseconds": 2000},
        {"type": "screenshot"}
    ]
)
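
Action lists are plain dictionaries, so a typo in a key only surfaces once the request is sent. One way to catch mistakes earlier is to build and validate the list locally. In this sketch, `login_actions` and `validate_actions` are hypothetical helpers, and the required-field table covers only the action types shown in the example above, not the full set the API accepts.

```python
# Required fields per action type, limited to the types used in the example above.
REQUIRED_FIELDS = {
    "write": {"text"},
    "press": {"key"},
    "click": {"selector"},
    "wait": {"milliseconds"},
    "screenshot": set(),
}


def validate_actions(actions: list) -> None:
    """Raise ValueError on an unknown action type or a missing required field."""
    for index, action in enumerate(actions):
        kind = action.get("type")
        if kind not in REQUIRED_FIELDS:
            raise ValueError(f"action {index}: unknown type {kind!r}")
        missing = REQUIRED_FIELDS[kind] - action.keys()
        if missing:
            raise ValueError(f"action {index}: missing {sorted(missing)}")


def login_actions(email: str, password: str,
                  submit_selector: str = 'button[type="submit"]',
                  settle_ms: int = 2000) -> list:
    """Build the same login sequence as the example above."""
    return [
        {"type": "write", "text": email},
        {"type": "press", "key": "Tab"},
        {"type": "write", "text": password},
        {"type": "click", "selector": submit_selector},
        {"type": "wait", "milliseconds": settle_ms},
        {"type": "screenshot"},
    ]


actions = login_actions("user@example.com", "password")
validate_actions(actions)  # passes silently when every action is well-formed
print(len(actions))  # prints 6
```

The validated list can be passed as the `actions` argument of the scrape call shown above.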

Search

Search the web and optionally scrape the results.

curl -X POST 'https://api.firecrawl.dev/v2/search' \
  -H 'Authorization: Bearer fc-YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "firecrawl web scraping",
    "limit": 5
  }'

Response:

{
  "success": true,
  "data": {
    "web": [
      {
        "url": "https://www.firecrawl.dev/",
        "title": "Firecrawl - The Web Data API for AI",
        "description": "The web crawling, scraping, and search API for AI.",
        "position": 1
      }
    ],
    "images": [...],
    "news": [...]
  }
}
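
The web results arrive as a list under `data.web`, each with a `position`. A short sketch of pulling out ranked titles and URLs follows. The helper name `top_web_results` is an assumption, and the second entry in the sample is invented purely to show the sorting.

```python
def top_web_results(payload: dict, limit: int = 5) -> list:
    """Return (title, url) pairs from data.web, ordered by their position field."""
    if not payload.get("success"):
        raise ValueError("search request was not successful")
    web = (payload.get("data") or {}).get("web", [])
    # Results without a position sort last.
    ranked = sorted(web, key=lambda item: item.get("position", float("inf")))
    return [(item.get("title", ""), item.get("url", "")) for item in ranked[:limit]]


sample = {
    "success": True,
    "data": {
        "web": [
            # Hypothetical second hit, listed first to demonstrate sorting.
            {"url": "https://example.com/", "title": "Example", "position": 2},
            {
                "url": "https://www.firecrawl.dev/",
                "title": "Firecrawl - The Web Data API for AI",
                "description": "The web crawling, scraping, and search API for AI.",
                "position": 1,
            },
        ]
    },
}

for title, url in top_web_results(sample):
    print(f"{title}: {url}")
```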

Search with Content Scraping

Get the full content of search results:

from firecrawl import Firecrawl

firecrawl = Firecrawl(api_key="fc-YOUR_API_KEY")

results = firecrawl.search(
    "firecrawl web scraping",
    limit=3,
    scrape_options={
        "formats": ["markdown", "links"]
    }
)
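
If you call the REST endpoint directly instead of using the SDK, the same options would go into the JSON body. The sketch below builds that body without sending it. Note the assumption: the camelCase key `scrapeOptions` is inferred from the SDK's `scrape_options` argument and is not confirmed by this README, so check the API reference before relying on it. `build_search_body` is a hypothetical helper.

```python
import json
from typing import Optional


def build_search_body(query: str, limit: int = 5,
                      formats: Optional[list] = None) -> dict:
    """Build a JSON body for POST /v2/search, optionally requesting page content."""
    body = {"query": query, "limit": limit}
    if formats:
        # ASSUMPTION: REST key name mirrors the SDK's scrape_options in camelCase.
        body["scrapeOptions"] = {"formats": list(formats)}
    return body


body = build_search_body("firecrawl web scraping", limit=3,
                         formats=["markdown", "links"])
print(json.dumps(body, indent=2))
```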

Browse

Give your agents a secure browser environment. Let them run code safely to gather data and take action on the web.

curl -X POST 'https://api.firecrawl.dev/v2/browser' \
  -H 'Authorization: Bearer fc-YOUR_API_KEY' \
  -H 'Content-Type: application/json'

Response:

{
  "success": true,
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "cdpUrl": "wss://cdp-proxy.firecrawl.dev/cdp/550e8400-e29b-41d4-a716-446655440000",
  "liveViewUrl": "https://liveview.firecrawl.dev/550e8400-e29b-41d4-a716-446655440000"
}
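
The session response is small enough to wrap in a typed record, which makes the ID and URLs easier to pass around. This sketch parses the response shown above. The `BrowserSession` class and `parse_session` function are hypothetical names for this example, not SDK types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrowserSession:
    id: str
    cdp_url: str        # WebSocket endpoint for Chrome DevTools Protocol clients
    live_view_url: str  # URL for watching the session in a browser


def parse_session(payload: dict) -> BrowserSession:
    """Convert the /v2/browser response into a BrowserSession record."""
    if not payload.get("success"):
        raise ValueError("browser session could not be created")
    return BrowserSession(
        id=payload["id"],
        cdp_url=payload["cdpUrl"],
        live_view_url=payload["liveViewUrl"],
    )


# Copy of the response shown above.
sample = {
    "success": True,
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "cdpUrl": "wss://cdp-proxy.firecrawl.dev/cdp/550e8400-e29b-41d4-a716-446655440000",
    "liveViewUrl": "https://liveview.firecrawl.dev/550e8400-e29b-41d4-a716-446655440000",
}

session = parse_session(sample)
print(session.id)
```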

Execute Code in the Browser

Run Playwright code, Python, or bash commands remotely:

import Firecrawl from '@mendable/firecrawl-js';

const firecrawl = new Firecrawl({ apiKey: "fc-YOUR_API_KEY" });

// 1. Launch a session
const session = await firecrawl.browser();

// 2. Execute code
const result = await firecrawl.browserExecute(session.id, {
  code: `
    await page.goto("https://news.ycombinator.com");
    const title = await page.title();
    console.log(title);
  `,
  language: "node",
});
console.log(result.result); // "Hacker News"

// 3. Close
await firecrawl.deleteBrowser(session.id);

Persistent Sessions

Save and reuse browser state (cookies, localStorage) across sessions:

const session = await firecrawl.browser({
  ttl: 600,
  profile: {
