Crawler

An open source example of the Count Love crawler.

Count Love Crawler

Installation

To isolate the crawler and its dependencies, install it in a Python virtual environment.

Tested with Python 3.9, but it should be compatible with a range of versions.
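
As a minimal sketch of the virtual environment step, the standard library's venv module can create the environment from Python itself. The directory name "venv" and the helper name create_env are assumptions for illustration, not part of the project.

```python
import venv


def create_env(env_dir="venv", with_pip=True):
    """Create a virtual environment in env_dir (hypothetical helper).

    with_pip=True asks venv to bootstrap pip inside the new environment,
    which is needed before installing requirements.txt there.
    """
    venv.create(env_dir, with_pip=with_pip)
```

After calling create_env(), activate the environment in the usual way for your shell before installing the dependencies.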

Install dependencies

To install dependencies, run:

pip install -r requirements.txt

Set up SQLite database

The SQLite3 database stores the source list, crawler queue, and content extracted from pages. To create the database, run:

sqlite3 data.db < schema.sql
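
If the sqlite3 command-line tool is not available, the same step can be approximated with Python's built-in sqlite3 module. This is a sketch only: create_database is a hypothetical helper, and it assumes schema.sql contains plain SQL statements that executescript can run.

```python
import sqlite3
from pathlib import Path


def create_database(db_path="data.db", schema_path="schema.sql"):
    """Create (or open) db_path and run every statement in schema_path."""
    schema = Path(schema_path).read_text(encoding="utf-8")
    conn = sqlite3.connect(db_path)
    try:
        # executescript runs multiple semicolon-separated statements at once.
        conn.executescript(schema)
        conn.commit()
    finally:
        conn.close()
```

Calling create_database() with no arguments uses the same file names as the command above.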

Running crawl

To run the crawl, run:

python crawler.py

While the crawl is running, details and diagnostic information are logged to "crawl.log". Because the Sources table is initially empty, running python crawler.py has no effect until a source is added. Here's an example of how to add a source by interacting directly with the database table:

sqlite3 data.db
INSERT INTO Sources VALUES (NULL, 'https://nytimes.com', 'New York, NY', 1, datetime('now'), NULL);
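
The same insert can be sketched from Python with a parameterized query, which avoids quoting problems in URLs and place names. The helper names and the parameter names (url, location, flag) are assumptions: the statement only mirrors the position of each value in the INSERT above, and the meaning of the fourth value (1) is not documented here.

```python
import sqlite3


def add_source(conn, url, location, flag=1):
    """Insert one row into Sources, in the same value order as the manual INSERT.

    The parameter names are hypothetical; check schema.sql for the real columns.
    """
    conn.execute(
        "INSERT INTO Sources VALUES (NULL, ?, ?, ?, datetime('now'), NULL)",
        (url, location, flag),
    )
    conn.commit()


def count_sources(conn):
    """Return the number of rows currently in the Sources table."""
    return conn.execute("SELECT COUNT(*) FROM Sources").fetchone()[0]
```

For example, after conn = sqlite3.connect("data.db"), calling add_source(conn, "https://nytimes.com", "New York, NY") adds the source shown above, and count_sources(conn) confirms the table is no longer empty.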

Rerunning python crawler.py will now print a list of potential articles with protest keywords to the console.
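
To check on a crawl through its log, a small helper can print the last few lines of "crawl.log". This is a sketch; the function name tail and the default of 20 lines are assumptions, and it makes no assumption about the log's line format.

```python
from collections import deque


def tail(path="crawl.log", n=20):
    """Return the last n lines of a text file without trailing newlines."""
    with open(path, encoding="utf-8", errors="replace") as f:
        # A deque with maxlen keeps only the most recent n lines while reading.
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]
```

Running print("\n".join(tail())) in the crawler's directory shows the most recent log entries.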
