
Scrapelib

⛏ a library for scraping unreliable pages

Install / Use

/learn @jamesturk/Scrapelib
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

scrapelib is a library for making requests to less-than-reliable websites.

This repository has moved to Codeberg; GitHub will remain a read-only mirror.

Source: https://codeberg.org/jpt/scrapelib

Documentation: https://jamesturk.github.io/scrapelib/

Issues: https://codeberg.org/jpt/scrapelib/issues


Features

scrapelib originated as part of the Open States project, which scrapes the websites of all 50 state legislatures, and was therefore designed with features useful for sites that have intermittent errors or require rate-limiting.

Advantages of using scrapelib over using requests as-is:

  • HTTP(S) and FTP requests via an identical API
  • support for simple caching with pluggable cache backends
  • highly-configurable request throttling
  • configurable retries for non-permanent site failures
  • all of the power of the superb requests library

Installation

scrapelib is on PyPI and can be installed with any standard Python package manager, e.g. pip install scrapelib.

Example Usage


  import scrapelib
  s = scrapelib.Scraper(requests_per_minute=10)

  # Grab Google front page
  s.get('http://google.com')

  # Will be throttled to 10 HTTP requests per minute
  while True:
      s.get('http://example.com')

GitHub Stars: 212
Category: Development
Updated: 2mo ago
Forks: 40

Languages

Python

Security Score

100/100

Audited on Jan 9, 2026

No findings