Paper Picnic 2.0

A weekly basket with the latest published research in political science. On Fridays at 2 AM UTC, we query the Crossref API for new research articles that appeared in the previous 7 days across many journals in political science and adjacent fields. The results are published at paper-picnic.com/.

The crawler (the backend) lives in the main branch, while the website is rendered from the gh-pages branch.

Setup

Local Development

  1. Install Python 3.11

    pyenv install 3.11
    pyenv local 3.11
    
  2. Create virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  3. Install dependencies

    pip install -r requirements.txt
    
    # For development (includes testing tools)
    pip install -r requirements-dev.txt
    
  4. Configure environment variables

    Create a .env file in the project root:

    OPENAI_APIKEY=your_openai_api_key
    CROSSREF_EMAIL=your_email@example.com
    
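    How these variables are read at runtime is not shown in this README; a minimal sketch, assuming the project loads them with python-dotenv:

    from dotenv import load_dotenv  # assumes python-dotenv is installed
    import os

    load_dotenv()  # reads .env from the project root into os.environ

    openai_key = os.environ["OPENAI_APIKEY"]       # article classification
    crossref_email = os.environ["CROSSREF_EMAIL"]  # polite Crossref requests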

GitHub Actions Setup

After forking the repository, you need to configure repository settings:

  1. Enable Workflow Permissions

    • Go to Settings > Actions > General
    • Scroll to "Workflow permissions"
    • Select "Read and write permissions"
  2. Set Repository Secrets

    • Go to Settings > Secrets and variables > Actions
    • Add the following secrets:
      • OPENAI_APIKEY - OpenAI API key for article classification
      • CROSSREF_EMAIL - Your email for polite Crossref API requests
      • RESEND_API_KEY - Resend.com API key for email notifications
      • RESEND_EMAIL_FROM - Sender email address
      • RESEND_EMAIL_TO - Recipient email address

Usage

Local Crawl

Run the crawler:

python main.py

Use the parameters in ./src/config.py to disable some features of the crawler for local testing.
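
The exact flag names live in src/config.py and may differ from these; purely as an illustration, a local-testing setup might look like:

CRAWL_WINDOW_DAYS = 3   # hypothetical: shrink the crawl window for faster runs
UPDATE_MEMORY = False   # hypothetical: don't write to memory/*.csv while testing
USE_AI_FILTER = False   # hypothetical: skip the GPT-4o-mini classification calls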

Run Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=src --cov-report=term --cov-report=html

# Run specific test file
pytest tests/test_parsers.py

Project Structure

picnic/
├── src/                      # Source code modules
│   ├── config.py             # Configuration and constants
│   ├── crossref_client.py    # Crossref API client
│   ├── openai_client.py      # OpenAI API integration
│   ├── osf_client.py         # OSF API client
│   ├── parsers.py            # Response parsing
│   ├── filters.py            # Article filtering logic
│   ├── data_processor.py     # Data cleaning and deduplication
│   ├── json_renderer.py      # JSON output formatting
│   └── stats_updater.py      # Statistics management
├── main.py                   # Main crawl script
├── tests/                    # Unit tests
├── parameters/               # Journal/OSF configurations
├── memory/                   # Crawl history for deduplication
├── output/                   # Generated JSON files and statistics
├── notification/             # Email notification system (Node.js)
├── .github/workflows/        # GitHub Actions automation
└── requirements.txt          # Python dependencies

How It Works

The crawler (main.py) runs two parallel workflows:

1. Crossref Journal Crawl

  1. Tests Crossref API endpoints (public vs polite) to select the faster one
  2. Queries the /works endpoint with batched ISSNs from parameters/journals.json (a sketch of this query follows the list)
  3. Searches both created and published dates (default: 14 days ago to 1 day ago)
  4. Parses metadata and removes duplicates using memory/doi.csv
  5. Merges journal info and applies filters:
    • Standard: Removes editorials, ToCs, errata by title pattern
    • Nature: Keeps only articles with /s in URL
    • Science: Keeps only articles with abstracts ≥200 chars
    • AI (optional): Uses GPT-4o-mini to classify social science relevance
  6. Outputs to output/publications.json
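
As an illustration of the batched query in steps 2–3, a minimal sketch against the Crossref REST API using requests (the ISSNs, dates, and batch size are illustrative, the real client is src/crossref_client.py, and created dates use the analogous from-created-date/until-created-date filters):

import requests

CROSSREF_EMAIL = "your_email@example.com"  # routes requests to Crossref's polite pool

def fetch_new_works(issns, from_date, until_date):
    """Query Crossref /works for a batch of ISSNs within a publication-date window."""
    # Repeating the issn filter ORs the journals; the date filters bound the window.
    filters = ",".join(f"issn:{i}" for i in issns)
    filters += f",from-pub-date:{from_date},until-pub-date:{until_date}"
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"filter": filters, "rows": 1000, "mailto": CROSSREF_EMAIL},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["message"]["items"]

# One illustrative batch; the real ISSN batches come from parameters/journals.json
items = fetch_new_works(["0003-0554", "1537-5943"], "2026-01-30", "2026-02-05")

Deduplication then amounts to dropping items whose DOI already appears in memory/doi.csv before the filters in step 5 run.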

2. OSF Preprints Crawl

  1. Loads subject filter from parameters/osf_subjects.json ("Social and Behavioral Sciences")
  2. Queries the OSF API date-by-date within the crawl window (sketched after this list)
  3. Parses metadata, deduplicates versions (keeps the latest), and removes previously seen preprints using memory/osf_ids.csv
  4. Outputs to output/preprints.json
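
A sketch of the date-by-date query in step 2, assuming the public OSF API v2 (JSON:API format) and the requests library; src/osf_client.py is the authoritative implementation:

import requests

def fetch_preprints_for_day(day):
    """Fetch all OSF preprints created on one date (YYYY-MM-DD), following pagination."""
    url = "https://api.osf.io/v2/preprints/"
    params = {"filter[date_created]": day}  # assumed filter key: records created that day
    results = []
    while url:
        resp = requests.get(url, params=params, timeout=30)
        resp.raise_for_status()
        payload = resp.json()
        results.extend(payload["data"])
        url = payload["links"].get("next")  # None once the last page is reached
        params = None  # the next-page URL already embeds the query string
    return results

Subject filtering against parameters/osf_subjects.json and the version/memory deduplication of step 3 then operate on these parsed records.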

3. Statistics & Automation

Behavior is configurable via src/config.py (crawl window, memory updates, filter toggles, etc.).
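
The two crawls above are described as parallel; one way to run two I/O-bound crawls concurrently is sketched below (the entry-point names are hypothetical, and main.py may structure this differently):

from concurrent.futures import ThreadPoolExecutor

def run_crossref_crawl(): ...  # hypothetical entry point for workflow 1
def run_osf_crawl(): ...       # hypothetical entry point for workflow 2

# Both crawls spend most of their time waiting on HTTP responses, so threads suffice.
with ThreadPoolExecutor(max_workers=2) as pool:
    crossref = pool.submit(run_crossref_crawl)
    osf = pool.submit(run_osf_crawl)
    crossref.result()  # re-raises any exception from the worker thread
    osf.result()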

History

The first version of the crawler went live in August 2024. Paper Picnic 2.0, rewritten in Python by Claude Code based on the original R version, launched in February 2026 after running side-by-side with the legacy crawler since January. The legacy R crawler remains available in the main_v0 branch, and the original website in gh-pages_v0.
