ZenScraper

ZenScraper is an asynchronous scraper built with Python and Playwright designed for efficiently retrieving tweets from X.com (formerly Twitter). It supports scraping original tweets, retweets, and filtering tweets by date.

Key Features

  • Flexible Scraping: Choose to scrape original tweets, retweets, or both.
  • Date Filtering: Filter tweets based on specific date ranges (--since-after, --before).
  • Session Authentication: Uses cookies for authenticated scraping sessions.
  • Configurable Output: Writes scraped data either as JSON with structured metadata or as cleaned plain text.
  • Headless or Visible Mode: Operate in headless mode for automation or visible mode for debugging.
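
The Configurable Output feature can be pictured with a small sketch. This is not ZenScraper's actual writer: the function name `write_output` and the `text` field are hypothetical, and the real output schema is not documented here. It only illustrates picking the format from the file extension, as the `--output` option (.json or .txt) suggests.

```python
import json
from pathlib import Path


def write_output(tweets, path):
    """Write tweets as JSON or plain text, chosen by the file extension."""
    target = Path(path)
    if target.suffix.lower() == ".txt":
        # Text mode: one tweet per line, internal whitespace collapsed.
        # Assumes each record has a "text" key (hypothetical field name).
        lines = [" ".join(t["text"].split()) for t in tweets]
        target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    else:
        # JSON mode: keep every field of each record.
        target.write_text(
            json.dumps(tweets, ensure_ascii=False, indent=2), encoding="utf-8"
        )
    return target
```

Keying on the extension keeps the command line to a single output option; an explicit format flag would be an equally reasonable design.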

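Requirements

  • Python 3 and pip
  • Playwright with its browser binaries (installed by the commands under Installation)
  • An X.com account for generating the x_cookies.json session file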

Installation

Clone the repository and install the required dependencies:

git clone https://github.com/0Day3xpl0it/zenscraper.git
cd zenscraper
chmod +x *.py
pip install -r requirements.txt
playwright install

Next, generate an authenticated session cookie:

python3 grab_x_cookies.py

This script will create the x_cookies.json file necessary for authenticated scraping.
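
If you want to sanity-check x_cookies.json before a run, something like the following may help. It is a sketch under an assumption: that the file holds a JSON list of cookie objects in the shape Playwright accepts (each with `name` and `value`, plus either `url` or both `domain` and `path`). The README does not specify the file's layout, and `load_cookies` is a hypothetical helper, not part of ZenScraper.

```python
import json
from pathlib import Path


def load_cookies(path="x_cookies.json"):
    """Load cookies from disk and check the fields Playwright requires."""
    cookies = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(cookies, list):
        raise ValueError("expected a JSON list of cookie objects")
    for cookie in cookies:
        # Playwright's add_cookies needs name and value, plus either a url
        # or a domain/path pair.
        if "name" not in cookie or "value" not in cookie:
            raise ValueError("cookie is missing 'name' or 'value'")
        if "url" not in cookie and not ("domain" in cookie and "path" in cookie):
            raise ValueError(f"cookie {cookie['name']!r} needs url or domain+path")
    return cookies
```

A list that passes this check could then be handed to `await context.add_cookies(cookies)` on a Playwright browser context.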

Usage

Basic command structure:

python3 zenscraper.py --username <username> [options]

Example with Time Filters

Scrape tweets from the @elonmusk account within a specific date range:

python3 zenscraper.py --username elonmusk --since-after 2025-01-01T00:00:00 --before 2025-02-01T00:00:00 --type tweets --output elonmusk_jan.json --scrolls 40 --max 200

This command collects up to 200 original tweets from January 2025, saving the output to elonmusk_jan.json.
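
To illustrate what the date window means, here is a minimal sketch of the comparison. It assumes both bounds are exclusive and that timestamps are naive ISO 8601 strings like the ones in the command above; ZenScraper itself may apply the filter differently (for example through X search). The `in_window` helper, the `timestamp` field, and the sample records are made up for the example.

```python
from datetime import datetime


def in_window(timestamp, since_after=None, before=None):
    """Return True if an ISO 8601 timestamp lies strictly inside the window."""
    moment = datetime.fromisoformat(timestamp)
    if since_after is not None and moment <= datetime.fromisoformat(since_after):
        return False
    if before is not None and moment >= datetime.fromisoformat(before):
        return False
    return True


# Made-up sample records, for illustration only.
tweets = [
    {"text": "too early", "timestamp": "2024-12-31T23:59:59"},
    {"text": "in range", "timestamp": "2025-01-15T12:00:00"},
    {"text": "too late", "timestamp": "2025-02-01T00:00:00"},
]
kept = [
    t for t in tweets
    if in_window(t["timestamp"], "2025-01-01T00:00:00", "2025-02-01T00:00:00")
]
print([t["text"] for t in kept])  # prints ['in range']
```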

Command-Line Options

| Option          | Description                                  | Default Value       |
| --------------- | -------------------------------------------- | ------------------- |
| --username      | (Required) X.com username to scrape          | -                   |
| --type          | Content type: tweets, retweets, bio, or all  | all                 |
| --output        | Output file (.json or .txt)                  | <username>.json     |
| --since-after   | Include tweets after this date (ISO 8601)    | None                |
| --before        | Include tweets before this date (ISO 8601)   | None                |
| --scrolls       | Number of scroll actions                     | 30                  |
| --max           | Maximum tweets to retrieve                   | 50                  |
| --no-headless   | Display browser during scraping              | Headless by default |
| --delay         | Add delay for throttling                     | 2                   |
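
For readers wrapping the tool in their own scripts, the options above could be mirrored with `argparse` roughly as follows. This is a sketch built only from the table, not a copy of zenscraper.py's real parser. The helper names are hypothetical, and treating `--delay` as a float is an assumption.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented options and defaults."""
    parser = argparse.ArgumentParser(prog="zenscraper.py")
    parser.add_argument("--username", required=True)
    parser.add_argument(
        "--type", choices=["tweets", "retweets", "bio", "all"], default="all"
    )
    parser.add_argument("--output", default=None)
    parser.add_argument("--since-after", default=None)
    parser.add_argument("--before", default=None)
    parser.add_argument("--scrolls", type=int, default=30)
    parser.add_argument("--max", type=int, default=50)
    parser.add_argument("--no-headless", action="store_true")
    # Assumption: the delay is numeric; its unit is not stated in the table.
    parser.add_argument("--delay", type=float, default=2)
    return parser


def parse_args(argv):
    """Parse argv and fill in the <username>.json default for --output."""
    args = build_parser().parse_args(argv)
    if args.output is None:
        args.output = f"{args.username}.json"
    return args
```

The `--output` default is resolved after parsing because it depends on the value of `--username`.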

TODO

  • Add functionality to expand full text for tweets and retweets (complete - 5/8/25)
  • Add functionality to retrieve additional tweet data types (complete - 5/8/25)
  • Add functionality to grab all user bio data (complete - 5/9/25)
  • Add functionality to effectively grab replies and thread them to parent conversations

Important Notes

  • A valid x_cookies.json file is required for authenticated scraping.
  • Include multiple user-agent strings in user_agents.txt for request rotation.
  • Date options do not currently work with retweets, because the X search function does not show retweets.
  • The scraper leverages asynchronous Playwright operations for optimal speed and efficiency.
  • Consider scraping with a backup X account rather than your primary one, in case the account used for scraping runs into issues.
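
The user-agent rotation mentioned above could look something like this. It assumes user_agents.txt holds one user-agent string per line and that a string is picked at random for each session. The README does not say how ZenScraper actually rotates them, and both helper names are hypothetical.

```python
import random
from pathlib import Path


def load_user_agents(path="user_agents.txt"):
    """Read one user-agent string per line, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def pick_user_agent(agents, rng=random):
    """Choose a user agent at random for the next browser context."""
    if not agents:
        raise ValueError("no user-agent strings available")
    return rng.choice(agents)
```

With Playwright, the chosen string could be supplied through the `user_agent` argument of `browser.new_context(...)`.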

Contributing

Contributions are welcome! Open an issue or submit a pull request for improvements.

License

ZenScraper is licensed under the MIT License. See LICENSE for details.
