scrapy / Scrapy: Scrapy, a fast high-level web crawling & scraping framework for Python.
D4Vinci / Scrapling: 🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
lorien / Awesome Web Scraping: List of libraries, tools and APIs for web scraping and data processing.
adbar / Trafilatura: Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
oxylabs / Oxylabs AI Studio Py: Structured data gathering from any website using AI-powered scraper, crawler, and browser automation. Scraping and crawling with natural language prompts. Equip your LLM agents with fresh data. AI Studio python SDK for intelligent web data gathering.
awolfly9 / IPProxyTool: Python IP proxy tool using Scrapy crawling. Crawls large numbers of free proxy IPs and extracts the valid ones for use.
watercrawl / WaterCrawl: Transform Web Content into LLM-Ready Data
Darwin-lfl / Langmanus: A community-driven AI automation framework that builds upon the incredible work of the open source community. Our goal is to combine language models with specialized tools for tasks like web search, crawling, and Python code execution, while giving back to the community that made this possible.
tavily-ai / Tavily Python: The Tavily Python SDK allows for easy interaction with the Tavily API, offering the full range of our search, extract, crawl, map, and research functionalities directly from your Python programs. Easily integrate smart search, content extraction, and research capabilities into your applications, harnessing Tavily's powerful features.
scrapfly / Scrapfly Scrapers: Scalable Python web scraping scripts for +40 popular domains
commoncrawl / Cc Pyspark: Process Common Crawl data with Python and Spark
0xdsm / Pinkerton: 🕵️ Python project to crawl for JavaScript files and search for secrets like API keys, authorization tokens, hardcoded credentials, etc.
shaohua0116 / ICLR2019 OpenReviewData: Script that crawls metadata from the ICLR OpenReview webpage. Tutorials on installing and using Selenium and ChromeDriver on Ubuntu.
tzuhsial / InstagramCrawler: A non-API Python program to crawl public photos, posts, or followers
MarshalX / Telegram Crawler: 🕷 Automatically detect changes made to the official Telegram sites, clients and servers.
opensemanticsearch / Open Semantic Etl: Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & Named Entity Recognition) & data enrichment (annotation) pipelines & ingestor to Solr or Elastic search index & linked data graph database
serpwings / Static Wordpress: Python Library for Static WordPress (Automated Crawling, Post-Processing and Hosting)
jmg / Crawley: Pythonic Crawling / Scraping Framework based on Non Blocking I/O operations.
chenjr0719 / Facebook Page Crawler: A Python crawler that uses the Facebook Graph API to crawl a fan page's public posts, comments, and reactions.
WwwwwyDev / Crawlipt: The script for Selenium in Python. Make automated testing easier! Drive Selenium with JSON scripts.