REDD
Reddit Extraction and Data Dumper
A modern, async-ready Python library for extracting Reddit data. No API keys required.
https://github.com/user-attachments/assets/2c377e2a-ac44-4596-89cd-60ce45ce3f4c
Table of Contents
- Features
- Installation
- Quick Start
- API Reference
- Architecture
- Examples
- Contributing
- Disclaimer
- License
1. Features
- No API keys – uses Reddit's public `.json` endpoints.
- Sync and async – choose `Redd` or `AsyncRedd` depending on your stack.
- Typed models – frozen dataclasses instead of raw dictionaries.
- Hexagonal architecture – swap HTTP adapters without touching business logic.
- Auto-pagination – fetch hundreds of posts with a single call.
- User-Agent rotation – built-in rotation to reduce ban risk.
- Proxy support – pass a proxy URL and scrape at scale.
- Throttling – configurable random sleep between paginated requests.
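User-Agent rotation usually means drawing a fresh header from a pool on every request. A minimal sketch of the idea (the pool and helper below are illustrative, not REDD's internals):

```python
import itertools

# Hypothetical pool; REDD ships its own list internally.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Mozilla/5.0 (X11; Linux x86_64)",
]
_rotation = itertools.cycle(USER_AGENTS)

def next_headers() -> dict:
    """Return request headers carrying the next User-Agent in the cycle."""
    return {"User-Agent": next(_rotation)}
```

Cycling deterministically (rather than choosing at random) guarantees every agent in the pool gets used before any repeats.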
2. Installation
With uv (recommended):

```shell
uv add redd
```

With pip:

```shell
pip install redd
```

For async support (requires httpx):

```shell
uv add redd httpx
```
3. Quick Start
3.1. Synchronous usage
```python
from redd import Redd, Category, TimeFilter

with Redd() as r:
    # Search Reddit
    results = r.search("Python programming", limit=5)
    for item in results:
        print(f"  {item.title}")

    # Fetch top posts from a subreddit
    posts = r.get_subreddit_posts(
        "Python",
        limit=10,
        category=Category.TOP,
        time_filter=TimeFilter.WEEK,
    )
    for post in posts:
        print(f"  [{post.score:>5}] {post.title}")

    # Get full post details with comments
    detail = r.get_post("/r/Python/comments/abc123/example_post/")
    print(f"  {detail.title} -- {len(detail.comments)} comments")

    # Scrape user activity
    items = r.get_user("spez", limit=10)
    for item in items:
        print(f"  [{item.kind}] {item.title or item.body[:80]}")
```
3.2. Asynchronous usage
```python
import asyncio
from redd import AsyncRedd

async def main():
    async with AsyncRedd() as r:
        results = await r.search("machine learning", limit=5)
        for item in results:
            print(item.title)

asyncio.run(main())
```
3.3. Configuration
```python
r = Redd(
    proxy="http://user:pass@host:port",  # Optional proxy
    timeout=15.0,                        # Request timeout in seconds
    rotate_user_agent=True,              # Rotate UA per request
    throttle=(1.0, 3.0),                 # Random sleep range between pages
)
```
4. API Reference
4.1. Clients
| Class | Description |
|-------|-------------|
| Redd | Synchronous client (requests) |
| AsyncRedd | Asynchronous client (httpx) |
Both clients support context managers and expose the same API surface.
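The context-manager contract guarantees `close()` runs even when an exception escapes the block. A minimal sketch of the pattern both clients follow (the class here is illustrative, not REDD's internals):

```python
class FakeClient:
    """Illustrative stand-in showing the context-manager contract."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow exceptions

    def close(self):
        self.closed = True  # release HTTP resources here

with FakeClient() as c:
    assert not c.closed  # still open inside the block
# on exit, close() has been called automatically
```

Calling `close()` manually remains available for code that cannot use a `with` block, e.g. long-lived clients stored on an application object.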
4.2. Methods
| Method | Description |
|--------|-------------|
| search(query, *, limit, sort, after, before) | Search all of Reddit |
| search_subreddit(subreddit, query, *, limit, sort, after, before) | Search within a subreddit |
| get_post(permalink) | Get full post details and comment tree |
| get_user(username, *, limit) | Get a user's recent activity |
| get_subreddit_posts(subreddit, *, limit, category, time_filter) | Fetch subreddit listings |
| get_user_posts(username, *, limit, category, time_filter) | Fetch a user's submitted posts |
| download_image(image_url, *, output_dir) | Download an image |
| close() | Release HTTP resources |
4.3. Models
All models are frozen dataclasses.
| Model | Fields |
|-------|--------|
| SearchResult | title, url, description, subreddit |
| PostDetail | title, author, body, score, url, subreddit, created_utc, num_comments, comments |
| Comment | author, body, score, replies |
| SubredditPost | title, author, permalink, score, num_comments, created_utc, subreddit, url, image_url, thumbnail_url |
| UserItem | kind, subreddit, url, created_utc, title, body |
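Frozen dataclasses reject mutation after construction, which keeps scraped records safe to share across threads and usable as dictionary keys. A minimal sketch mirroring the `Comment` shape from the table (field types are assumptions; REDD's actual definitions may differ):

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Comment:
    author: str
    body: str
    score: int
    replies: tuple = ()  # nested Comment instances

c = Comment(author="spez", body="hello", score=42)
try:
    c.score = 0  # frozen: any assignment raises
except FrozenInstanceError:
    pass
```

Frozen dataclasses also get structural equality for free, so two records parsed from identical JSON compare equal.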
4.4. Enums
| Enum | Values |
|------|--------|
| Category | HOT, TOP, NEW, RISING |
| UserCategory | HOT, TOP, NEW |
| TimeFilter | HOUR, DAY, WEEK, MONTH, YEAR, ALL |
| SortOrder | RELEVANCE, HOT, TOP, NEW, COMMENTS |
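These enums presumably map to the query values Reddit's listing endpoints accept. A hedged sketch of two of them (the string values and the URL shape are assumptions, not confirmed by the source):

```python
from enum import Enum

class Category(str, Enum):
    HOT = "hot"
    TOP = "top"
    NEW = "new"
    RISING = "rising"

class TimeFilter(str, Enum):
    HOUR = "hour"
    DAY = "day"
    WEEK = "week"
    MONTH = "month"
    YEAR = "year"
    ALL = "all"

# Illustrative: building a listing URL from enum values
url = f"https://www.reddit.com/r/Python/{Category.TOP.value}.json?t={TimeFilter.WEEK.value}"
```

Typed enums catch invalid values at call time (`Category("hottest")` raises `ValueError`) instead of silently producing a bad request.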
4.5. Exceptions
| Exception | Description |
|-----------|-------------|
| ReddError | Base exception for all REDD errors |
| HttpError | HTTP request failed after retries |
| ParseError | Reddit's JSON could not be parsed into domain models |
| NotFoundError | Requested resource does not exist |
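Because every error derives from `ReddError`, callers can catch narrowly or broadly. A sketch of the hierarchy as the table describes it (docstrings paraphrase the table; constructor details are assumptions):

```python
class ReddError(Exception):
    """Base exception for all REDD errors."""

class HttpError(ReddError):
    """HTTP request failed after retries."""

class ParseError(ReddError):
    """Reddit's JSON could not be parsed into domain models."""

class NotFoundError(ReddError):
    """Requested resource does not exist."""

# Catching the base class handles any library failure in one place:
try:
    raise NotFoundError("/r/doesnotexist")
except ReddError as exc:
    handled = type(exc).__name__
```

Code that cares about the distinction (e.g. retrying on `HttpError` but not on `NotFoundError`) can catch the subclasses individually.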
5. Architecture
REDD follows hexagonal architecture (ports and adapters), separating business logic from I/O concerns:
```mermaid
graph LR
    subgraph "Public API"
        A["Redd (sync)"]
        B["AsyncRedd (async)"]
    end
    subgraph "Core"
        C["Parsing Layer"]
        D["Domain Models"]
        E["Enums"]
    end
    subgraph "Ports"
        F["HttpPort"]
        G["AsyncHttpPort"]
    end
    subgraph "Adapters"
        H["RequestsHttpAdapter"]
        I["HttpxAsyncAdapter"]
    end
    A --> C
    B --> C
    C --> D
    C --> E
    A --> F
    B --> G
    F -.implements.-> H
    G -.implements.-> I
    H --> J["reddit.com"]
    I --> J
```
Directory layout
```
src/redd/
├── __init__.py          # Public API surface
├── _client.py           # Sync client (Redd)
├── _async_client.py     # Async client (AsyncRedd)
├── _parsing.py          # JSON to domain model parsing (I/O-free)
├── _exceptions.py       # Error hierarchy
│
├── domain/              # Pure domain layer
│   ├── models.py        # Frozen dataclasses
│   └── enums.py         # Type-safe enumerations
│
├── ports/               # Abstract interfaces
│   └── http.py          # HttpPort and AsyncHttpPort protocols
│
└── adapters/            # Concrete implementations
    ├── http_sync.py     # requests-based adapter
    └── http_async.py    # httpx-based adapter
```
The parsing module has no I/O dependencies. Clients interact with the HTTP layer exclusively through protocol-based ports, making it straightforward to swap adapters, mock dependencies in tests, or add new transports.
6. Examples
See the examples/ directory for runnable scripts.
Fetch hot posts from a subreddit (subreddit_hot_posts.py):
```python
from redd import Category, Redd

with Redd() as r:
    posts = r.get_subreddit_posts("brdev", limit=10, category=Category.HOT)
    for i, post in enumerate(posts, 1):
        print(f"{i:>2}. [{post.score:>5}] {post.title}")
        print(f"    by u/{post.author} – {post.num_comments} comments")
        print(f"    {post.url}")
        print()
```
Sample output:

```
 1. [   91] Qual o plano B de vocês caso a área piore muito?
     by u/Spiritual_Pangolin18 – 185 comments
     https://www.reddit.com/r/brdev/comments/1rnytuh/...

 2. [   83] Fuçando minhas coisas, encontrei um código de 600 linhas em Portugol
     by u/Dramatic-Revenue-802 – 7 comments
     https://www.reddit.com/r/brdev/comments/1ro269a/...
```
7. Contributing
Contributions are welcome. Please read CONTRIBUTING.md for guidelines on setting up the project, running tests, and submitting changes.
8. Disclaimer
Use responsibly. Reddit may rate-limit or ban IPs that make excessive requests. Consider using rotating proxies for large-scale scraping.
9. License
MIT. See LICENSE for details.
Copyright (c) 2025 Elias Biondo
