PaperBot
Academic Personal AI Infrastructure
Install / Use
/learn @jerry609/PaperBotREADME
About
"Oh, God! My idea comes true." is an end-to-end research assistant that automates the paper discovery → analysis → reproduction pipeline. It combines multi-source search, LLM-powered evaluation, scholar tracking, and code generation into a unified workflow with Web, CLI, and API interfaces.
Backend Python + FastAPI (SSE streaming) · Frontend Next.js + Ink CLI · Sources arXiv / Semantic Scholar / OpenAlex / HuggingFace Daily Papers / papers.cool
Screenshots
<details open> <summary><b>Web Dashboard</b></summary> <br>Current dashboard layout focused on the active research question, the workflow console, and decision-critical alerts.

| Research Workspace | AgentSwarm Studio |
|--------------------|-------------------|
|
|
|
| LLM-as-Judge Radar | Email Push |
|---------------------|------------|
|
|
|

Features
Discovery & Analysis
- Multi-source search — Aggregate arXiv, Semantic Scholar, OpenAlex, HF Daily Papers, papers.cool with cross-query dedup and scoring
- DailyPaper — Automated daily report generation with SSE streaming, LLM enrichment (summary / trends / insight), and multi-channel push (Email / Slack / DingTalk / Telegram / Discord / WeCom / Feishu)
- LLM-as-Judge — 5-dimensional scoring (Relevance / Novelty / Rigor / Impact / Clarity) with multi-round calibration, automatic filtering of low-quality papers
- Deadline Radar — Conference deadline tracking with CCF ranking and research track matching
Knowledge Management
- Paper Library — Save, organize, and export papers (BibTeX / RIS / Markdown / CSL-JSON / Zotero sync)
- Structured Cards — LLM-extracted method / dataset / conclusion / limitations with DB caching
- Related Work — Draft generation from saved papers with [AuthorYear] citation format
- Memory System — Research memory with FTS5 + BM25 search, context engine for personalized recommendations
- MemoryBench Suite — Retrieval / context / isolation / injection / performance / ROI / effectiveness benchmarks for the memory and Paper2Code stack
Reproduction & Studio
- Paper2Code — Paper → code skeleton (Planning → Analysis → Generation → Verification) with self-healing debugging
- AgentSwarm — Multi-agent orchestration platform with Claude Code integration, Runbook file management, Diff/Snapshot, and sandbox execution (Docker / E2B)
- Scholar Tracking — Multi-agent monitoring with PIS influence scoring (citation velocity, trend momentum)
- Deep Review — Simulated peer review (screening → critique → decision)
Getting Started
Install
# Use python3 for macOS/Linux
python -m venv .venv && source .venv/bin/activate
pip install -e .
Configure
cp env.example .env
# Set at least one LLM key: OPENAI_API_KEY=sk-...
<details>
<summary>LLM routing configuration</summary>
Multiple LLM backends supported via ModelRouter:
| Task Type | Route | Example Models | |-----------|-------|----------------| | default / extraction / summary | default | gpt-4o-mini / MiniMax M2.1 | | analysis / reasoning / judge | reasoning | DeepSeek R1 / GLM 4.7 | | code | code | gpt-4o |
</details> <details> <summary>Push notification configuration</summary>DailyPaper supports Email / Slack / DingTalk / Telegram / Discord / WeCom / Feishu push.
Web UI — Configure in the Topic Workflow settings panel (recommended).
Environment variables:
PAPERBOT_NOTIFY_ENABLED=true
PAPERBOT_NOTIFY_CHANNELS=email,slack
PAPERBOT_NOTIFY_SMTP_HOST=smtp.qq.com
PAPERBOT_NOTIFY_SMTP_PORT=587
PAPERBOT_NOTIFY_SMTP_USERNAME=your@qq.com
PAPERBOT_NOTIFY_SMTP_PASSWORD=your-auth-code
PAPERBOT_NOTIFY_EMAIL_FROM=your@qq.com
PAPERBOT_NOTIFY_EMAIL_TO=recipient@example.com
</details>
Run
# Database migration (first time)
alembic upgrade head
# API server
# Use python3 for macOS/Linux
python -m uvicorn src.paperbot.api.main:app --reload --port 8000
# Web dashboard (separate terminal)
cd web && npm install && npm run dev
# Background jobs (optional)
arq paperbot.infrastructure.queue.arq_worker.WorkerSettings
CLI Usage
# Daily paper with LLM + Judge + push
python -m paperbot.presentation.cli.main daily-paper \
-q "LLM reasoning" -q "code generation" \
--with-llm --with-judge --save --notify
# Topic search
python -m paperbot.presentation.cli.main topic-search \
-q "ICL compression" --source arxiv_api --source hf_daily
# Scholar tracking
python main.py track --summary
# Paper2Code
python main.py gen-code --title "..." --abstract "..." --output-dir ./output
# Deep review
python main.py review --title "..." --abstract "..."
Architecture
<!-- TODO: 待重绘高清架构图 -->
Editable source: Excalidraw · draw.io
Module Status
Full maturity matrix and progress: Roadmap #232
| Status | Modules | |--------|---------| | Production | Topic Search · DailyPaper · LLM-as-Judge · Push/Notify · Model Provider · Deadline Radar · Paper Library | | Usable | Scholar Tracking · Deep Review · Paper2Code · Memory · Context Engine · Discovery · AgentSwarm · Harvest · Import/Sync | | Planned | DB Modernization #231 · Obsidian Integration #159 |
MemoryBench Evaluation
<details open> <summary><b>Retrieval Quality</b> — 40 queries, 45 memories, 2 users (FTS5 + BM25)</summary>Aligned with LongMemEval (ICLR 2025), LoCoMo (ACL 2024), Mem0, Letta. Full methodology:
evals/memory/README.md· Epic #283
| Metric | Target | Result | | |--------|--------|--------|-| | Recall@5 | ≥ 0.80 | 0.873 | :white_check_mark: | | MRR@10 | ≥ 0.65 | 0.731 | :white_check_mark: | | nDCG@10 | ≥ 0.70 | 0.747 | :white_check_mark: | | Hit@10 | — | 1.000 | |
Breakdown by LoCoMo question type:
| Type | Recall@5 | MRR@10 | |------|----------|--------| | single-hop (24) | 0.931 | 0.770 | | multi-hop (6) | 0.708 | 0.583 | | temporal (2) | 1.000 | 0.417 | | acronym (4) | 0.708 | 0.875 |
</details> <details> <summary><b>Scope Isolation + CRUD</b> — zero-leak enforcement, Mem0 lifecycle</summary>| Check | Result | |-------|--------| | Cross-user leak rate | 0 (zero tolerance) | | Cross-scope leak rate | 0 (zero tolerance) | | CRUD Update (old content gone) | PASS | | CRUD Delete (soft-delete enforced) | PASS | | CRUD Dedup (exact duplicate skipped) | PASS |
</details> <details> <summary><b>Context Extraction</b> — L0-L3 layer assembly, Letta alignment</summary>| Test | Result | |------|--------| | Layer completeness (L0 profile → L3 paper) | 8/8 PASS | | Graceful degradation (missing paper / empty user) | 3/3 PASS | | Context precision (query → relevant memories) | 100% (3/3) | | Token budget guard (300 token cap) | 215 tokens | | TrackRouter accuracy (query → correct track) | 100% (5/5) |
</details> <details> <summary><b>Injection Robustness</b> — offline pattern detection</summary>| Metric | Target | Result | |--------|--------|--------| | Pollution rate (missed malicious) | ≤ 2% | 0.0% (6/6 caught) | | False positive rate (benign flagged) | — | 0.0% (0/6 flagged) |
Covers: instruction override, tag escape, special token injection, role hijack, Unicode bypass, privilege escalation.
</details># Run full MemoryBench suite (~6s, fully offline, no API keys needed)
PYTHONPATH=src pytest -q evals/memory/test_retrieval_bench.py \
evals/memory/test_scope_isolation.py \
evals/memory/test_context_extraction.py \
evals/memory/test_injection_robustness.py -s
Roadmap
Roadmap #232 — Living roadmap organized by functional area, with checkbox tracking and Epic links.
Active Epics:
| Epic | Area | Status | |------|------|--------| | [#197](https://github.com/jerry609/PaperBot/issues
