Scraperr
Self-hosted web scraper.
A powerful self-hosted web scraping solution
<div> <img src="https://img.shields.io/badge/MongoDB-%234ea94b.svg?style=for-the-badge&logo=mongodb&logoColor=white" alt="MongoDB" /> <img src="https://img.shields.io/badge/FastAPI-005571?style=for-the-badge&logo=fastapi" alt="FastAPI" /> <img src="https://img.shields.io/badge/Next-black?style=for-the-badge&logo=next.js&logoColor=white" alt="Next JS" /> <img src="https://img.shields.io/badge/tailwindcss-%2338B2AC.svg?style=for-the-badge&logo=tailwind-css&logoColor=white" alt="TailwindCSS" /> </div>
📋 Overview
Scrape websites without writing a single line of code.
<div align="center"> <img src="https://github.com/jaypyles/www-scrape/blob/master/docs/main_page.png" alt="Scraperr Main Interface" width="800px"> </div>
📚 Check out the docs for a comprehensive quickstart guide and detailed information.
✨ Key Features
- XPath-Based Extraction: Precisely target page elements
- Queue Management: Submit and manage multiple scraping jobs
- Domain Spidering: Option to scrape all pages within the same domain
- Custom Headers: Add JSON headers to your scraping requests
- Media Downloads: Automatically download images, videos, and other media
- Results Visualization: View scraped data in a structured table format
- Data Export: Export your results in Markdown and CSV formats
- Notification Channels: Send completion notifications through various channels
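To illustrate what XPath-based extraction means in practice, here is a minimal Python sketch. It is not Scraperr's own code: the page snippet, the `extract` helper, and the XPath expressions are all hypothetical, and it relies on the limited XPath subset in Python's standard library, which only handles well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page snippet (real pages usually need an HTML parser).
PAGE = """
<html>
  <body>
    <div class="product"><h2>Widget</h2><span class="price">9.99</span></div>
    <div class="product"><h2>Gadget</h2><span class="price">19.50</span></div>
  </body>
</html>
""".strip()


def extract(markup: str, xpath: str) -> list[str]:
    """Return the stripped text of every element matching the XPath expression."""
    root = ET.fromstring(markup)
    return [(el.text or "").strip() for el in root.findall(xpath)]


# Each XPath targets one kind of element, much like one column in a results table.
names = extract(PAGE, ".//div[@class='product']/h2")
prices = extract(PAGE, ".//span[@class='price']")

for name, price in zip(names, prices):
    print(f"{name}: {price}")
```

Each expression selects one field per matching element, so pairing the results row by row gives the kind of structured table described under Results Visualization.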
🚀 Getting Started
Docker
make up
Helm
Refer to the docs for helm deployment: https://scraperr-docs.pages.dev/guides/helm-deployment
⚖️ Legal and Ethical Guidelines
When using Scraperr, please remember to:
- Respect `robots.txt`: Always check a website's `robots.txt` file to verify which pages permit scraping
- Terms of Service: Adhere to each website's Terms of Service regarding data extraction
- Rate Limiting: Implement reasonable delays between requests to avoid overloading servers
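As a rough illustration of the `robots.txt` and rate-limiting guidelines above, the following Python sketch filters a list of URLs against a `robots.txt` policy and picks a delay between requests. It is a standalone example, not part of Scraperr: the `robots.txt` content, the `example-bot` user agent, the URLs, and the helper names are assumptions made for the demonstration.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download the site's own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, urls: list[str], user_agent: str) -> list[str]:
    """Keep only the URLs the policy permits for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def polite_delay(parser: RobotFileParser, user_agent: str, default: float = 1.0) -> float:
    """Use the site's Crawl-delay if it declares one, else a default pause."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


def crawl_politely(urls, fetch, delay, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)
        fetch(url)


parser = build_parser(ROBOTS_TXT)
candidates = [
    "https://example.com/",
    "https://example.com/private/data",
    "https://example.com/blog",
]
queue = allowed_urls(parser, candidates, "example-bot")
delay = polite_delay(parser, "example-bot")
print(queue, delay)
```

In a real run, `crawl_politely(queue, fetch, delay)` would be called with a function that performs the request. The `sleep` parameter is injectable only so the pause can be skipped in tests.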
Disclaimer: Scraperr is intended for use only on websites that explicitly permit scraping. The creator accepts no responsibility for misuse of this tool.
💬 Join the Community
Get support, report bugs, and chat with other users and contributors.
📄 License
This project is licensed under the MIT License. See the LICENSE file for details.
👏 Contributions
Development made easier with the webapp template.
To get started, run `make build up-dev`.
