REDD

Reddit Extraction and Data Dumper

A modern, async-ready Python library for extracting Reddit data. No API keys required.

https://github.com/user-attachments/assets/2c377e2a-ac44-4596-89cd-60ce45ce3f4c


Table of Contents

  1. Features
  2. Installation
  3. Quick Start
  4. API Reference
  5. Architecture
  6. Examples
  7. Contributing
  8. Disclaimer
  9. License

1. Features

  • No API keys: uses Reddit's public .json endpoints.
  • Sync and async: choose Redd or AsyncRedd depending on your stack.
  • Typed models: frozen dataclasses instead of raw dictionaries.
  • Hexagonal architecture: swap HTTP adapters without touching business logic.
  • Auto-pagination: fetch hundreds of posts with a single call.
  • User-Agent rotation: built-in rotation to reduce ban risk.
  • Proxy support: pass a proxy URL and scrape at scale.
  • Throttling: configurable random sleep between paginated requests.
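
For readers unfamiliar with the public .json endpoints mentioned above, the sketch below shows how a listing URL of that kind is typically composed. It is an illustration only: `listing_url` and `BASE_URL` are hypothetical names, not part of REDD, and the library's internal URL handling may differ.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.reddit.com"  # assumption: the public site root


def listing_url(subreddit, category="hot", *, limit=25, time_filter=None, after=None):
    """Build a public JSON listing URL for a subreddit (hypothetical helper)."""
    params = {"limit": limit}
    if time_filter is not None:
        params["t"] = time_filter   # time window, relevant for "top" listings
    if after is not None:
        params["after"] = after     # cursor returned with the previous page
    return f"{BASE_URL}/r/{subreddit}/{category}.json?{urlencode(params)}"


print(listing_url("Python", "top", limit=10, time_filter="week"))
```

Appending `.json` to a listing path is what lets a client read the data without credentials; the `after` cursor is what makes auto-pagination possible.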

2. Installation

With uv (recommended):

uv add redd

With pip:

pip install redd

For async support (requires httpx):

uv add redd httpx

3. Quick Start

3.1. Synchronous usage

from redd import Redd, Category, TimeFilter

with Redd() as r:
    # Search Reddit
    results = r.search("Python programming", limit=5)
    for item in results:
        print(f"  {item.title}")

    # Fetch top posts from a subreddit
    posts = r.get_subreddit_posts(
        "Python",
        limit=10,
        category=Category.TOP,
        time_filter=TimeFilter.WEEK,
    )
    for post in posts:
        print(f"  [{post.score:>5}] {post.title}")

    # Get full post details with comments
    detail = r.get_post("/r/Python/comments/abc123/example_post/")
    print(f"  {detail.title} -- {len(detail.comments)} comments")

    # Scrape user activity
    items = r.get_user("spez", limit=10)
    for item in items:
        print(f"  [{item.kind}] {item.title or item.body[:80]}")

3.2. Asynchronous usage

import asyncio
from redd import AsyncRedd

async def main():
    async with AsyncRedd() as r:
        results = await r.search("machine learning", limit=5)
        for item in results:
            print(item.title)

asyncio.run(main())
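
The main benefit of an async client is running several requests concurrently. The sketch below shows that pattern with `asyncio.gather`. To stay runnable offline it uses `StubAsyncClient`, a hypothetical stand-in that only imitates the `search` call shown above; with the real library you would pass an `AsyncRedd` instance instead, assuming its `search` behaves as documented.

```python
import asyncio


class StubAsyncClient:
    """Offline stand-in with an awaitable search(); NOT the real AsyncRedd."""

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        return False

    async def search(self, query, *, limit=5):
        await asyncio.sleep(0.01)  # simulate network latency
        return [f"{query} result {i}" for i in range(1, limit + 1)]


async def search_many(client, queries, *, limit=5):
    """Run one search per query concurrently and map each query to its results."""
    results = await asyncio.gather(*(client.search(q, limit=limit) for q in queries))
    return dict(zip(queries, results))  # gather preserves input order


async def main():
    async with StubAsyncClient() as client:
        return await search_many(client, ["python", "rust"], limit=2)


found = asyncio.run(main())
print(found)
```

When pointing this at Reddit itself, keep the number of concurrent requests small; the rate-limit caveat in the Disclaimer applies to concurrent traffic as well.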

3.3. Configuration

r = Redd(
    proxy="http://user:pass@host:port",  # Optional proxy
    timeout=15.0,                        # Request timeout in seconds
    rotate_user_agent=True,              # Rotate UA per request
    throttle=(1.0, 3.0),                 # Random sleep range between pages
)
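
To make the `throttle` option more concrete, here is a minimal sketch of how cursor-based pagination with a random pause between pages could work. It is not REDD's actual implementation: `paginate` and `fetch_page` are hypothetical names, and the demo runs against in-memory pages with a recorded (not real) sleep.

```python
import random
import time


def paginate(fetch_page, limit, throttle=(1.0, 3.0), sleep=time.sleep):
    """Collect up to `limit` items by following `after` cursors.

    `fetch_page(after)` is a hypothetical callable returning (items, next_after).
    """
    items, after = [], None
    while len(items) < limit:
        page, after = fetch_page(after)
        items.extend(page)
        if after is None or not page:
            break  # no further pages available
        if len(items) < limit:
            sleep(random.uniform(*throttle))  # random pause before the next request
    return items[:limit]


# Offline demonstration: three fake pages, and pauses recorded instead of slept.
PAGES = {None: (["a", "b"], "p2"), "p2": (["c", "d"], "p3"), "p3": (["e"], None)}
pauses = []
result = paginate(lambda after: PAGES[after], limit=5, throttle=(0.1, 0.2), sleep=pauses.append)
print(result, len(pauses))
```

Drawing the pause from a range rather than using a fixed delay makes the request pattern less regular, which is presumably why the option takes a `(min, max)` tuple.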

4. API Reference

4.1. Clients

| Class | Description |
|-------|-------------|
| Redd | Synchronous client (requests) |
| AsyncRedd | Asynchronous client (httpx) |

Both clients support context managers and expose the same API surface.

4.2. Methods

| Method | Description |
|--------|-------------|
| search(query, *, limit, sort, after, before) | Search all of Reddit |
| search_subreddit(subreddit, query, *, limit, sort, after, before) | Search within a subreddit |
| get_post(permalink) | Get full post details and comment tree |
| get_user(username, *, limit) | Get a user's recent activity |
| get_subreddit_posts(subreddit, *, limit, category, time_filter) | Fetch subreddit listings |
| get_user_posts(username, *, limit, category, time_filter) | Fetch a user's submitted posts |
| download_image(image_url, *, output_dir) | Download an image |
| close() | Release HTTP resources |

4.3. Models

All models are frozen dataclasses.

| Model | Fields |
|-------|--------|
| SearchResult | title, url, description, subreddit |
| PostDetail | title, author, body, score, url, subreddit, created_utc, num_comments, comments |
| Comment | author, body, score, replies |
| SubredditPost | title, author, permalink, score, num_comments, created_utc, subreddit, url, image_url, thumbnail_url |
| UserItem | kind, subreddit, url, created_utc, title, body |
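
As an illustration of what frozen dataclasses mean in practice, the sketch below defines a stand-in `Comment` with the field names listed above and walks a nested reply tree. The field types, the tuple used for `replies`, and the `count_comments` helper are assumptions made for this example, not REDD's actual definitions.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Comment:
    # Stand-in mirroring the documented field names; the types are assumptions.
    author: str
    body: str
    score: int
    replies: tuple = ()


def count_comments(comments):
    """Count comments including all nested replies (hypothetical helper)."""
    return sum(1 + count_comments(c.replies) for c in comments)


tree = (
    Comment("alice", "Top-level", 10, replies=(
        Comment("bob", "Reply", 3, replies=(Comment("carol", "Nested reply", 1),)),
    )),
    Comment("dave", "Another top-level", 5),
)

print(count_comments(tree))  # 4

try:
    tree[0].score = 99  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    print("models are read-only")
```

Because instances cannot be mutated, they are safe to share between threads or tasks; use `dataclasses.replace` when a modified copy is needed.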

4.4. Enums

| Enum | Values |
|------|--------|
| Category | HOT, TOP, NEW, RISING |
| UserCategory | HOT, TOP, NEW |
| TimeFilter | HOUR, DAY, WEEK, MONTH, YEAR, ALL |
| SortOrder | RELEVANCE, HOT, TOP, NEW, COMMENTS |

4.5. Exceptions

| Exception | Description |
|-----------|-------------|
| ReddError | Base exception for all REDD errors |
| HttpError | HTTP request failed after retries |
| ParseError | Reddit's JSON could not be parsed into domain models |
| NotFoundError | Requested resource does not exist |
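
One way to handle these exceptions is to catch the most specific class first and fall back to the base class. The sketch below uses local stand-in classes so that it runs without the library installed. It assumes each exception derives directly from `ReddError`, which the table implies but does not spell out; `attempt` and `failing` are hypothetical helpers.

```python
class ReddError(Exception):
    """Stand-in for the documented base exception."""


class HttpError(ReddError):
    """Stand-in: HTTP request failed after retries."""


class ParseError(ReddError):
    """Stand-in: JSON could not be parsed into domain models."""


class NotFoundError(ReddError):
    """Stand-in: requested resource does not exist."""


def attempt(fetch):
    """Call fetch() and translate failures into a (status, value) pair."""
    try:
        return "ok", fetch()
    except NotFoundError:
        return "missing", None       # e.g. a deleted post: usually safe to skip
    except HttpError:
        return "retry-later", None   # transient network trouble or rate limiting
    except ReddError:
        return "failed", None        # any other error from the hierarchy


def failing(exc):
    """Build a zero-argument callable that raises exc (demo helper)."""
    def _raise():
        raise exc
    return _raise


print(attempt(lambda: "payload"))
print(attempt(failing(NotFoundError("no such post"))))
print(attempt(failing(ParseError("unexpected JSON shape"))))
```

With the real package, the classes would be imported rather than defined locally; check the package's public exports for the exact import path.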


5. Architecture

REDD follows hexagonal architecture (ports and adapters), separating business logic from I/O concerns:

graph LR
    subgraph Public API
        A["Redd (sync)"]
        B["AsyncRedd (async)"]
    end

    subgraph Core
        C["Parsing Layer"]
        D["Domain Models"]
        E["Enums"]
    end

    subgraph Ports
        F["HttpPort"]
        G["AsyncHttpPort"]
    end

    subgraph Adapters
        H["RequestsHttpAdapter"]
        I["HttpxAsyncAdapter"]
    end

    A --> C
    B --> C
    C --> D
    C --> E
    A --> F
    B --> G
    F -.implements.-> H
    G -.implements.-> I
    H --> J["reddit.com"]
    I --> J

Directory layout

src/redd/
├── __init__.py           # Public API surface
├── _client.py            # Sync client (Redd)
├── _async_client.py      # Async client (AsyncRedd)
├── _parsing.py           # JSON to domain model parsing (I/O-free)
├── _exceptions.py        # Error hierarchy
│
├── domain/               # Pure domain layer
│   ├── models.py         # Frozen dataclasses
│   └── enums.py          # Type-safe enumerations
│
├── ports/                # Abstract interfaces
│   └── http.py           # HttpPort and AsyncHttpPort protocols
│
└── adapters/             # Concrete implementations
    ├── http_sync.py      # requests-based adapter
    └── http_async.py     # httpx-based adapter

The parsing module has no I/O dependencies. Clients interact with the HTTP layer exclusively through protocol-based ports, making it straightforward to swap adapters, mock dependencies in tests, or add new transports.
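
The sketch below illustrates the ports-and-adapters idea described above: logic that depends only on a protocol can be exercised with an in-memory adapter. The `get_json` method name, `FakeHttpAdapter`, and `top_titles` are hypothetical and chosen for this example; REDD's real `HttpPort` may define a different interface.

```python
from typing import Any, Protocol


class HttpPort(Protocol):
    # Hypothetical method name and signature; the real port may differ.
    def get_json(self, url: str) -> Any: ...


class FakeHttpAdapter:
    """In-memory adapter returning canned payloads, useful in unit tests."""

    def __init__(self, responses):
        self._responses = responses
        self.requested = []  # records every URL asked for

    def get_json(self, url):
        self.requested.append(url)
        return self._responses[url]


def top_titles(http: HttpPort, subreddit: str) -> list:
    """Toy client logic that depends only on the port, never on a concrete adapter."""
    url = f"https://www.reddit.com/r/{subreddit}/top.json"
    payload = http.get_json(url)
    return [child["data"]["title"] for child in payload["data"]["children"]]


# Canned payload shaped like a Reddit listing response.
canned = {
    "https://www.reddit.com/r/Python/top.json": {
        "kind": "Listing",
        "data": {"children": [{"kind": "t3", "data": {"title": "Hello"}}], "after": None},
    }
}
fake = FakeHttpAdapter(canned)
print(top_titles(fake, "Python"))
```

Because `Protocol` uses structural typing, `FakeHttpAdapter` satisfies the port without inheriting from it, which keeps test doubles free of any dependency on the adapter package.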


6. Examples

See the examples/ directory for runnable scripts.

Fetch hot posts from a subreddit (subreddit_hot_posts.py):

from redd import Category, Redd

with Redd() as r:
    posts = r.get_subreddit_posts("brdev", limit=10, category=Category.HOT)

    for i, post in enumerate(posts, 1):
        print(f"{i:>2}. [{post.score:>5}] {post.title}")
        print(f"     by u/{post.author} - {post.num_comments} comments")
        print(f"     {post.url}")
        print()

Sample output:

 1. [   91] Qual o plano B de vocês caso a área piore muito?
     by u/Spiritual_Pangolin18 - 185 comments
     https://www.reddit.com/r/brdev/comments/1rnytuh/...

 2. [   83] Fuçando minhas coisas, encontrei um código de 600 linhas em Portugol
     by u/Dramatic-Revenue-802 - 7 comments
     https://www.reddit.com/r/brdev/comments/1ro269a/...

7. Contributing

Contributions are welcome. Please read CONTRIBUTING.md for guidelines on setting up the project, running tests, and submitting changes.


8. Disclaimer

Use responsibly. Reddit may rate-limit or ban IPs that make excessive requests. Consider using rotating proxies for large-scale scraping.


9. License

MIT. See LICENSE for details.

Copyright (c) 2025 Elias Biondo
