10 skills found
apache / Stormcrawler: A scalable, mature and versatile web crawler based on Apache Storm
commoncrawl / News Crawl: News crawling with StormCrawler - stores content as WARC
DigitalPebble / Stormcrawlerfight: Crawl configurations for benchmarking / testing StormCrawler
DigitalPebble / Stormcrawler Docker: Resources for running StormCrawler with Docker services
sebastian-nagel / Warc Crawler: Process web archives (WARC format) with StormCrawler and index content into Elasticsearch or Solr
DigitalPebble / Ansible Storm: Ansible playbook for deploying a Storm cluster
apache / Stormcrawler Site: Source for the Apache StormCrawler web site
anveshv18 / StormCrawler Documentation: StormCrawler Documentation
cnf271 / Stormcrawlertest: Stormcrawler with Elasticsearch
HPI-BP2017N2 / Crawler: Based on Stormcrawler to crawl a list of domains and hand the pages to a data store