Firecrawl
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
Install / Use
/learn @firecrawl/FirecrawlREADME
🔥 Firecrawl
Turn websites into LLM-ready data.
Firecrawl is an API that scrapes, crawls, and extracts structured data from any website, powering AI agents and apps with real-time context from the web.
Looking for our MCP? Check out the repo here.
This repository is in development, and we're still integrating custom modules into the mono repo. It's not fully ready for self-hosted deployment yet, but you can run it locally.
Pst. Hey, you, join our stargazers :)
<a href="https://github.com/firecrawl/firecrawl"> <img src="https://img.shields.io/github/stars/firecrawl/firecrawl.svg?style=social&label=Star&maxAge=2592000" alt="GitHub stars"> </a>Why Firecrawl?
- LLM-ready output: Clean markdown, structured JSON, screenshots, HTML, and more
- Industry-leading reliability: >80% coverage on benchmark evaluations, outperforming every other provider tested
- Handles the hard stuff: Proxies, JavaScript rendering, and dynamic content that breaks other scrapers
- Customization: Exclude tags, crawl behind auth walls, max depth, and more
- Media parsing: Automatic text extraction from PDFs, DOCX, and images
- Actions: Click, scroll, input, wait, and more before extracting
- Batch processing: Scrape thousands of URLs asynchronously
- Change tracking: Monitor website content changes over time
Quick Start
Sign up at firecrawl.dev to get your API key and start extracting data in seconds. Try the playground to test it out.
Make Your First API Request
curl -X POST 'https://api.firecrawl.dev/v2/scrape' \
-H 'Authorization: Bearer fc-YOUR_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"url": "https://example.com"}'
Response:
{
"success": true,
"data": {
"markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
"metadata": {
"title": "Example Domain",
"sourceURL": "https://example.com"
}
}
}
Install the Firecrawl Skill & CLI
The Firecrawl Skill is an easy way for AI agents such as Claude Code, Antigravity and OpenCode to use Firecrawl through the CLI.
Install and configure the skill for all detected AI coding agents:
npx -y firecrawl-cli@latest init --all --browser
After installing, restart your agent for it to discover the new skill.
You can also install the CLI globally:
npm install -g firecrawl-cli
Authenticate with your API key:
# Interactive login (opens browser)
firecrawl login --browser
# Or login with API key directly
firecrawl login --api-key fc-YOUR_API_KEY
# Or set via environment variable
export FIRECRAWL_API_KEY=fc-YOUR_API_KEY
Try a quick scrape:
firecrawl https://example.com --only-main-content
See the full Skill + CLI documentation for all available commands including search, map, crawl, agent, and browser automation.
Feature Overview
| Feature | Description | |---------|-------------| | Scrape | Convert any URL to markdown, HTML, screenshots, or structured JSON | | Search | Search the web and get full page content from results | | Browse | Let agents safely interact with the web | | Map | Discover all URLs on a website instantly | | Crawl | Scrape all URLs of a website with a single request | | Agent | Automated data gathering, just describe what you need |
Scrape
Convert any URL to clean markdown, HTML, or structured data.
curl -X POST 'https://api.firecrawl.dev/v2/scrape' \
-H 'Authorization: Bearer fc-YOUR_API_KEY' \
-H 'Content-Type: application/json' \
-d '{
"url": "https://docs.firecrawl.dev",
"formats": ["markdown", "html"]
}'
Response:
{
"success": true,
"data": {
"markdown": "# Firecrawl Docs\n\nTurn websites into LLM-ready data...",
"html": "<!DOCTYPE html><html>...",
"metadata": {
"title": "Quickstart | Firecrawl",
"description": "Firecrawl allows you to turn entire websites into LLM-ready markdown",
"sourceURL": "https://docs.firecrawl.dev",
"statusCode": 200
}
}
}
Extract Structured Data (JSON Mode)
Extract structured data using a schema:
from firecrawl import Firecrawl
from pydantic import BaseModel
app = Firecrawl(api_key="fc-YOUR_API_KEY")
class CompanyInfo(BaseModel):
company_mission: str
is_open_source: bool
is_in_yc: bool
result = app.scrape(
'https://firecrawl.dev',
formats=[{"type": "json", "schema": CompanyInfo.model_json_schema()}]
)
print(result.json)
{"company_mission": "Turn websites into LLM-ready data", "is_open_source": true, "is_in_yc": true}
Or extract with just a prompt (no schema):
result = app.scrape(
'https://firecrawl.dev',
formats=[{"type": "json", "prompt": "Extract the company mission"}]
)
Scrape Formats
Available formats: markdown, html, rawHtml, screenshot, links, json, branding
Get a screenshot
doc = app.scrape("https://firecrawl.dev", formats=["screenshot"])
print(doc.screenshot) # Base64 encoded image
Extract brand identity (colors, fonts, typography)
doc = app.scrape("https://firecrawl.dev", formats=["branding"])
print(doc.branding) # {"colors": {...}, "fonts": [...], "typography": {...}}
Actions (Interact Before Scraping)
Click, type, scroll, and more before extracting:
doc = app.scrape(
url="https://example.com/login",
formats=["markdown"],
actions=[
{"type": "write", "text": "user@example.com"},
{"type": "press", "key": "Tab"},
{"type": "write", "text": "password"},
{"type": "click", "selector": 'button[type="submit"]'},
{"type": "wait", "milliseconds": 2000},
{"type": "screenshot"}
]
)
Search
Search the web and optionally scrape the results.
curl -X POST 'https://api.firecrawl.dev/v2/search' \
-H 'Authorization: Bearer fc-YOUR_API_KEY' \
-H 'Content-Type: application/json' \
-d '{
"query": "firecrawl web scraping",
"limit": 5
}'
Response:
{
"success": true,
"data": {
"web": [
{
"url": "https://www.firecrawl.dev/",
"title": "Firecrawl - The Web Data API for AI",
"description": "The web crawling, scraping, and search API for AI.",
"position": 1
}
],
"images": [...],
"news": [...]
}
}
Search with Content Scraping
Get the full content of search results:
from firecrawl import Firecrawl
firecrawl = Firecrawl(api_key="fc-YOUR_API_KEY")
results = firecrawl.search(
"firecrawl web scraping",
limit=3,
scrape_options={
"formats": ["markdown", "links"]
}
)
Browse
Give your agents a secure browser environment. Let them run code safely to gather data and take action on the web.
curl -X POST 'https://api.firecrawl.dev/v2/browser' \
-H 'Authorization: Bearer fc-YOUR_API_KEY' \
-H 'Content-Type: application/json'
Response:
{
"success": true,
"id": "550e8400-e29b-41d4-a716-446655440000",
"cdpUrl": "wss://cdp-proxy.firecrawl.dev/cdp/550e8400-e29b-41d4-a716-446655440000",
"liveViewUrl": "https://liveview.firecrawl.dev/550e8400-e29b-41d4-a716-446655440000"
}
Execute Code in the Browser
Run Playwright code, Python, or bash commands remotely:
import Firecrawl from '@mendable/firecrawl-js';
const firecrawl = new Firecrawl({ apiKey: "fc-YOUR_API_KEY" });
// 1. Launch a session
const session = await firecrawl.browser();
// 2. Execute code
const result = await firecrawl.browserExecute(session.id, {
code: `
await page.goto("https://news.ycombinator.com");
const title = await page.title();
console.log(title);
`,
language: "node",
});
console.log(result.result); // "Hacker News"
// 3. Close
await firecrawl.deleteBrowser(session.id);
Persistent Sessions
Save and reuse browser state (cookies, localStorage) across sessions:
const session = await firecrawl.browser({
ttl: 600,
profile: {
