<h1 align="center">Oh, God! My idea comes true.</h1> <h3 align="center"> AI-powered research workflow: paper discovery → LLM analysis → scholar tracking → Paper2Code → multi-agent studio </h3> <a href="https://github.com/jerry609/PaperBot/actions"> <img src="https://img.shields.io/github/actions/workflow/status/jerry609/PaperBot/ci.yml?branch=dev&label=CI&logo=github" alt="CI"> </a> <a href="https://github.com/jerry609/PaperBot/issues/232"> <img src="https://img.shields.io/badge/roadmap-2026-blue?logo=roadmap" alt="Roadmap"> </a> <img src="https://img.shields.io/badge/version-0.1.0-007ec6?logo=github" alt="Version 0.1.0"> <img src="https://img.shields.io/badge/license-MIT-green" alt="License"> <img src="https://img.shields.io/badge/Python-3.10+-blue?logo=python&logoColor=white" alt="Python"> <img src="https://img.shields.io/badge/Next.js-16-black?logo=nextdotjs" alt="Next.js"> <img src="https://img.shields.io/badge/Platform-Win%20|%20Mac%20|%20Linux-lightgrey" alt="Platform"> <img src="https://img.shields.io/badge/Downloads-xx-lightgrey?style=social&logo=icloud" alt="Downloads"> <a href="#getting-started">Getting Started</a> · <a href="#features">Features</a> · <a href="https://github.com/jerry609/PaperBot/issues/232">Roadmap</a> · <a href="#architecture">Architecture</a> · <a href="#contributing">Contributing</a>

About

"Oh, God! My idea comes true." is an end-to-end research assistant that automates the paper discovery → analysis → reproduction pipeline. It combines multi-source search, LLM-powered evaluation, scholar tracking, and code generation into a unified workflow with Web, CLI, and API interfaces.

Backend Python + FastAPI (SSE streaming) · Frontend Next.js + Ink CLI · Sources arXiv / Semantic Scholar / OpenAlex / HuggingFace Daily Papers / papers.cool

Screenshots

<details open> <summary>Web Dashboard</summary>

Current dashboard layout focused on the active research question, the workflow console, and decision-critical alerts.

Dashboard

| Research Workspace | AgentSwarm Studio | |--------------------|-------------------| | | Studio |

| LLM-as-Judge Radar | Email Push | |---------------------|------------| | Judge | Email |

</details> <details> <summary>Terminal UI (Ink)</summary>

CLI

</details>

Features

Discovery & Analysis

Multi-source search — Aggregate arXiv, Semantic Scholar, OpenAlex, HF Daily Papers, papers.cool with cross-query dedup and scoring
DailyPaper — Automated daily report generation with SSE streaming, LLM enrichment (summary / trends / insight), and multi-channel push (Email / Slack / DingTalk / Telegram / Discord / WeCom / Feishu)
LLM-as-Judge — 5-dimensional scoring (Relevance / Novelty / Rigor / Impact / Clarity) with multi-round calibration, automatic filtering of low-quality papers
Deadline Radar — Conference deadline tracking with CCF ranking and research track matching

Knowledge Management

Paper Library — Save, organize, and export papers (BibTeX / RIS / Markdown / CSL-JSON / Zotero sync)
Structured Cards — LLM-extracted method / dataset / conclusion / limitations with DB caching
Related Work — Draft generation from saved papers with [AuthorYear] citation format
Memory System — Research memory with FTS5 + BM25 search, context engine for personalized recommendations
MemoryBench Suite — Retrieval / context / isolation / injection / performance / ROI / effectiveness benchmarks for the memory and Paper2Code stack

Reproduction & Studio

Paper2Code — Paper → code skeleton (Planning → Analysis → Generation → Verification) with self-healing debugging
AgentSwarm — Multi-agent orchestration platform with Claude Code integration, Runbook file management, Diff/Snapshot, and sandbox execution (Docker / E2B)
Scholar Tracking — Multi-agent monitoring with PIS influence scoring (citation velocity, trend momentum)
Deep Review — Simulated peer review (screening → critique → decision)

Getting Started

Install

# Use python3 for macOS/Linux
python -m venv .venv && source .venv/bin/activate
pip install -e .

Configure

cp env.example .env
# Set at least one LLM key: OPENAI_API_KEY=sk-...

<details> <summary>LLM routing configuration</summary>

Multiple LLM backends supported via ModelRouter:

| Task Type | Route | Example Models | |-----------|-------|----------------| | default / extraction / summary | default | gpt-4o-mini / MiniMax M2.1 | | analysis / reasoning / judge | reasoning | DeepSeek R1 / GLM 4.7 | | code | code | gpt-4o |

</details> <details> <summary>Push notification configuration</summary>

DailyPaper supports Email / Slack / DingTalk / Telegram / Discord / WeCom / Feishu push.

Web UI — Configure in the Topic Workflow settings panel (recommended).

Environment variables:

PAPERBOT_NOTIFY_ENABLED=true
PAPERBOT_NOTIFY_CHANNELS=email,slack
PAPERBOT_NOTIFY_SMTP_HOST=smtp.qq.com
PAPERBOT_NOTIFY_SMTP_PORT=587
PAPERBOT_NOTIFY_SMTP_USERNAME=your@qq.com
PAPERBOT_NOTIFY_SMTP_PASSWORD=your-auth-code
PAPERBOT_NOTIFY_EMAIL_FROM=your@qq.com
PAPERBOT_NOTIFY_EMAIL_TO=recipient@example.com

</details>

Run

# Database migration (first time)
alembic upgrade head

# API server
# Use python3 for macOS/Linux
python -m uvicorn src.paperbot.api.main:app --reload --port 8000

# Web dashboard (separate terminal)
cd web && npm install && npm run dev

# Background jobs (optional)
arq paperbot.infrastructure.queue.arq_worker.WorkerSettings

CLI Usage

# Daily paper with LLM + Judge + push
python -m paperbot.presentation.cli.main daily-paper \
  -q "LLM reasoning" -q "code generation" \
  --with-llm --with-judge --save --notify

# Topic search
python -m paperbot.presentation.cli.main topic-search \
  -q "ICL compression" --source arxiv_api --source hf_daily

# Scholar tracking
python main.py track --summary

# Paper2Code
python main.py gen-code --title "..." --abstract "..." --output-dir ./output

# Deep review
python main.py review --title "..." --abstract "..."

Architecture

Architecture

Editable source: Excalidraw · draw.io

Module Status

Full maturity matrix and progress: Roadmap #232

| Status | Modules | |--------|---------| | Production | Topic Search · DailyPaper · LLM-as-Judge · Push/Notify · Model Provider · Deadline Radar · Paper Library | | Usable | Scholar Tracking · Deep Review · Paper2Code · Memory · Context Engine · Discovery · AgentSwarm · Harvest · Import/Sync | | Planned | DB Modernization #231 · Obsidian Integration #159 |

MemoryBench Evaluation

Aligned with LongMemEval (ICLR 2025), LoCoMo (ACL 2024), Mem0, Letta. Full methodology: evals/memory/README.md · Epic #283

<details open> <summary>Retrieval Quality — 40 queries, 45 memories, 2 users (FTS5 + BM25)</summary>

| Metric | Target | Result | | |--------|--------|--------|-| | Recall@5 | ≥ 0.80 | 0.873 | :white_check_mark: | | MRR@10 | ≥ 0.65 | 0.731 | :white_check_mark: | | nDCG@10 | ≥ 0.70 | 0.747 | :white_check_mark: | | Hit@10 | — | 1.000 | |

Breakdown by LoCoMo question type:

| Type | Recall@5 | MRR@10 | |------|----------|--------| | single-hop (24) | 0.931 | 0.770 | | multi-hop (6) | 0.708 | 0.583 | | temporal (2) | 1.000 | 0.417 | | acronym (4) | 0.708 | 0.875 |

</details> <details> <summary>Scope Isolation + CRUD — zero-leak enforcement, Mem0 lifecycle</summary>

| Check | Result | |-------|--------| | Cross-user leak rate | 0 (zero tolerance) | | Cross-scope leak rate | 0 (zero tolerance) | | CRUD Update (old content gone) | PASS | | CRUD Delete (soft-delete enforced) | PASS | | CRUD Dedup (exact duplicate skipped) | PASS |

</details> <details> <summary>Context Extraction — L0-L3 layer assembly, Letta alignment</summary>

| Test | Result | |------|--------| | Layer completeness (L0 profile → L3 paper) | 8/8 PASS | | Graceful degradation (missing paper / empty user) | 3/3 PASS | | Context precision (query → relevant memories) | 100% (3/3) | | Token budget guard (300 token cap) | 215 tokens | | TrackRouter accuracy (query → correct track) | 100% (5/5) |

</details> <details> <summary>Injection Robustness — offline pattern detection</summary>

| Metric | Target | Result | |--------|--------|--------| | Pollution rate (missed malicious) | ≤ 2% | 0.0% (6/6 caught) | | False positive rate (benign flagged) | — | 0.0% (0/6 flagged) |

Covers: instruction override, tag escape, special token injection, role hijack, Unicode bypass, privilege escalation.

</details>

# Run full MemoryBench suite (~6s, fully offline, no API keys needed)
PYTHONPATH=src pytest -q evals/memory/test_retrieval_bench.py \
  evals/memory/test_scope_isolation.py \
  evals/memory/test_context_extraction.py \
  evals/memory/test_injection_robustness.py -s

Roadmap

Roadmap #232 — Living roadmap organized by functional area, with checkbox tracking and Epic links.

Active Epics:

| Epic | Area | Status | |------|------|--------| | [#197](https://github.com/jerry609/PaperBot/issues

PaperBot

Install / Use

README

About

Screenshots

Features

Discovery & Analysis

Knowledge Management

Reproduction & Studio

Getting Started

Install

Configure

Run

CLI Usage

Architecture

Module Status

MemoryBench Evaluation

Roadmap