Scrapelib

⛏ a library for scraping unreliable pages


scrapelib is a library for making requests to less-than-reliable websites.

This repository has moved to Codeberg; GitHub will remain as a read-only mirror.

Source: https://codeberg.org/jpt/scrapelib

Documentation: https://jamesturk.github.io/scrapelib/

Issues: https://codeberg.org/jpt/scrapelib/issues

Features

scrapelib originated as part of the Open States project, which scrapes the websites of all 50 state legislatures. As a result, it was designed with features that are desirable when dealing with sites that have intermittent errors or require rate-limiting.

Advantages of using scrapelib over using requests as-is:

HTTP(S) and FTP requests via an identical API
support for simple caching with pluggable cache backends
highly configurable request throttling
configurable retries for non-permanent site failures
all of the power of the superb requests library
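
To make the "configurable retries" idea concrete, here is a minimal standard-library sketch of retrying a flaky call with a growing wait between attempts. It is not scrapelib's implementation: `TransientError`, `fetch_with_retries`, and its parameters are hypothetical names chosen for illustration, and the doubling backoff is an assumption rather than documented scrapelib behavior.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for a failure worth retrying (e.g. a timeout)."""


def fetch_with_retries(
    fetch: Callable[[], T],
    attempts: int = 3,
    wait_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() up to `attempts` times, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait 1x, 2x, 4x, ... the base interval before trying again.
            sleep(wait_seconds * (2 ** attempt))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch easy to test without real delays; only errors explicitly marked as transient are retried, so permanent failures surface immediately.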

Installation

scrapelib is on PyPI and can be installed with any standard package management tool, for example `pip install scrapelib`.

Example Usage


  import scrapelib
  s = scrapelib.Scraper(requests_per_minute=10)

  # Grab Google front page
  s.get('http://google.com')

  # Will be throttled to 10 HTTP requests per minute
  while True:
      s.get('http://example.com')
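
The throttling in the example above comes down to simple arithmetic: 10 requests per minute means at least 60 / 10 = 6 seconds between requests. The following is a rough sketch of that idea using only the standard library. The `Throttle` class and its `wait` method are hypothetical and are not part of scrapelib, whose internal approach may differ.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Hypothetical helper that spaces calls at least `min_interval` seconds apart."""

    def __init__(
        self,
        requests_per_minute: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if requests_per_minute <= 0:
            raise ValueError("requests_per_minute must be positive")
        self.min_interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous request

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        # Record when this request is actually allowed to go out.
        self._last = now + delay
        return delay
```

Calling `wait()` immediately before each request would reproduce the spacing: the first call returns at once, and later calls block only for whatever part of the 6 seconds has not already elapsed.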

