30 skills found
unclecode / Crawl4ai
🚀🤖 Crawl4AI: Open-source, LLM-friendly web crawler and scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
any4ai / AnyCrawl
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
oxylabs / Oxylabs AI Studio Py
Structured data gathering from any website using an AI-powered scraper, crawler, and browser automation. Scraping and crawling with natural language prompts. Equip your LLM agents with fresh data. AI Studio Python SDK for intelligent web data gathering.
watercrawl / WaterCrawl
Transform Web Content into LLM-Ready Data
paulpierre / Markdown Crawler
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
BrowserCash / Teracrawl
High-performance web crawler API optimized for LLMs. Turn any search or website into clean Markdown using remote browsers. A Firecrawl alternative.
eddyhhlure1Eddy / News Analyzer
An open-source RSS crawler with an LLM interface that can use an LLM to analyze news feeds.
Aavache / LLMWebCrawler
A web crawler based on LLMs, implemented with Ray and Hugging Face. The embeddings are saved into a vector database for fast clustering and retrieval. Use it for your RAG.
Sriram-PR / Doc Scraper
Go web crawler to scrape documentation sites and convert content to clean Markdown for LLM ingestion (RAG, training data).
pc8544 / Website Crawler
Extract data from websites in LLM-ready JSON or CSV format. Crawl or scrape an entire website with Website Crawler.
oxylabs / Oxylabs AI Studio Js
Structured data gathering from any website using an AI-powered scraper, crawler, and browser automation. Scraping and crawling with natural language prompts. Equip your LLM agents with fresh data. AI Studio JS SDK for intelligent web data gathering.
rowyio / LLM Web Crawler
No-code / low-code web scraper and crawler for LLM apps and AI workflows. Plug and play with your own logic and customize it flexibly and scalably on BuildShip.
hoangsonww / AI Gov Content Curator
💡 An end-to-end solution for aggregating, summarizing, and displaying news articles using an AI-powered backend, an automated CRON crawler & newsletter emailer, and a responsive Next.js frontend. It integrates technologies like Express.js, MongoDB, Puppeteer, and GenAI/LLMs to deliver up-to-date, curated content to government staff and other users.
eavae / Feilian
LLM-based crawler
lennyerik / Crawl4ai Proxy
A simple proxy server to integrate crawl4ai with OpenWebUI
Kenn3o3 / Easy LLM ArXiv Paper Crawler
A Python tool to crawl historical arXiv papers from specified categories, filter them using a custom LLM prompt via Alibaba Cloud's DashScope API, and export results to a CSV file with paper names and PDF links. Ideal for researchers seeking comprehensive, tailored paper collections.
GramosoftAI / GcrawlAI
Turn any website into clean, LLM-ready data. Open-source web crawler with stealth mode, distributed crawling, real-time WebSocket progress, and Markdown output. Power your AI apps with GcrawlAI.
us / Crw
⚡ Lightweight Firecrawl alternative in Rust: 91.5% coverage, 5x faster, 3MB RAM. Web scraper & crawler with MCP server for Claude, LLM extraction, JS rendering.
malvads / Mojo
A non-sucking, cross-platform, extremely fast C++ crawler that converts entire websites into LLM-readable data
xVc323 / Omnidocs
Automated documentation crawler that generates LLM-friendly Markdown from any docs site. Export as single or multi-file, ready for AI ingestion.