AutoResearchClaw
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper.
<table> <tr> <td width="18%"> <a href="docs/showcase/SHOWCASE.md"><img src="docs/showcase/thumbnails/paper_I_random_matrix-01.png" width="120" alt="Sample Paper"/></a> </td> <td valign="middle"> <b>Generated Paper Showcase</b><br><br> <b>8 papers across 8 domains</b> (math, statistics, biology, computing, NLP, RL, vision, robustness), generated fully autonomously with zero human intervention.<br><br> <a href="docs/showcase/SHOWCASE.md"><img src="https://img.shields.io/badge/View_Full_Showcase-All_8_Papers-d73a49?style=for-the-badge" alt="View Showcase"></a> </td> </tr> </table>
We're looking for testers! Try the pipeline with your own research idea, from any field, and tell us what you think. Your feedback directly shapes the next version. Testing Guide | Chinese Testing Guide | Japanese Testing Guide
News
- [03/22/2026] v0.3.2: Cross-Platform Support + Major Stability. AutoResearchClaw now runs on any ACP-compatible agent backend (Claude Code, Codex CLI, Copilot CLI, Gemini CLI, Kimi CLI) and supports messaging platforms (Discord, Telegram, Lark, WeChat) via the OpenClaw bridge. A new CLI-agent code generation backend delegates Stages 10 & 13 to external CLI agents with budget control and timeout management. Also includes an anti-fabrication system (VerifiedRegistry + experiment diagnosis & repair loop), 100+ bug fixes, modular executor refactoring, `--resume` auto-detection, LLM retry hardening, and community-reported fixes.
- [03/18/2026] v0.3.1: OpenCode Beast Mode + Community Contributions. The new "Beast Mode" routes complex code generation to OpenCode with automatic complexity scoring and graceful fallback. Added Novita AI provider support, thread-safety hardening, more robust LLM output parsing, and 20+ bug fixes from community PRs and internal audit.
- [03/17/2026] v0.3.0: MetaClaw Integration. AutoResearchClaw now supports MetaClaw cross-run learning: pipeline failures become structured lessons, which become reusable skills injected into all 23 stages. +18.3% robustness in controlled experiments. Opt-in (`metaclaw_bridge.enabled: true`), fully backward-compatible. See Integration Guide.
- [03/16/2026] v0.2.0: Three multi-agent subsystems (CodeAgent, BenchmarkAgent, FigureAgent), hardened Docker sandbox with network-policy-aware execution, 4-round paper quality audit (AI-slop detection, 7-dim review scoring, NeurIPS checklist), and 15+ bug fixes from production runs.
- [03/15/2026] v0.1.0: We release AutoResearchClaw, a fully autonomous 23-stage research pipeline that turns a single research idea into a conference-ready paper. No human intervention required.
One Command. One Paper.
pip install -e . && researchclaw setup && researchclaw init && researchclaw run --topic "Your research idea here" --auto-approve
What Is This?
You think it. AutoResearchClaw writes it.
Drop a research topic and get back a full academic paper with real literature from OpenAlex, Semantic Scholar & arXiv, hardware-aware sandbox experiments (GPU/MPS/CPU auto-detected), statistical analysis, multi-agent peer review, and conference-ready LaTeX targeting NeurIPS/ICML/ICLR. No babysitting. No copy-pasting. No hallucinated references.
<table>
<tr><td><code>paper_draft.md</code></td><td>Full academic paper (Introduction, Related Work, Method, Experiments, Results, Conclusion)</td></tr>
<tr><td><code>paper.tex</code></td><td>Conference-ready LaTeX (NeurIPS / ICLR / ICML templates)</td></tr>
<tr><td><code>references.bib</code></td><td>Real BibTeX references from OpenAlex, Semantic Scholar and arXiv, auto-pruned to match inline citations</td></tr>
<tr><td><code>verification_report.json</code></td><td>4-layer citation integrity + relevance verification (arXiv, CrossRef, DataCite, LLM)</td></tr>
<tr><td><code>experiment runs/</code></td><td>Generated code + sandbox results + structured JSON metrics</td></tr>
<tr><td><code>charts/</code></td><td>Auto-generated condition comparison charts with error bars and confidence intervals</td></tr>
<tr><td><code>reviews.md</code></td><td>Multi-agent peer review with methodology-evidence consistency checks</td></tr>
<tr><td><code>evolution/</code></td><td>Self-learning lessons extracted from each run</td></tr>
<tr><td><code>deliverables/</code></td><td>All final outputs in one folder, compile-ready for Overleaf</td></tr>
</table>

The pipeline runs end-to-end without human intervention. When experiments fail, it self-heals. When hypotheses don't hold, it pivots. When citations are fake, it kills them.
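To sanity-check a finished run against the outputs listed above, something like the following could work. It is only a sketch: the file names come from the table, but treating them as direct children of the deliverables folder is an assumption, and `missing_deliverables` is a hypothetical helper, not part of AutoResearchClaw.

```python
from pathlib import Path

# Names taken from the output table. Their exact location inside
# deliverables/ is an assumption, not documented behavior.
EXPECTED_FILES = [
    "paper_draft.md",
    "paper.tex",
    "references.bib",
    "verification_report.json",
    "reviews.md",
]
EXPECTED_DIRS = ["charts"]


def missing_deliverables(deliverables_dir):
    """Return the expected artifact names that are absent from the folder."""
    root = Path(deliverables_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

An empty list means every expected artifact was found; anything else names what to look for in the run logs.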
Run it anywhere. AutoResearchClaw isn't locked to a single platform. Use it standalone via CLI, plug it into OpenClaw, or wire it up through any ACP-compatible agent: Claude Code, Codex CLI, Copilot CLI, Gemini CLI, Kimi CLI, you name it. And because OpenClaw bridges to messaging platforms, you can kick off a full research run from Discord, Telegram, Lark (Feishu), WeChat, or wherever your team already hangs out. One topic in, one paper out, no matter where you type it.
Quick Start
# 1. Clone & install
git clone https://github.com/aiming-lab/AutoResearchClaw.git
cd AutoResearchClaw
python3 -m venv .venv && source .venv/bin/activate
pip install -e .
# 2. Setup (interactive: installs OpenCode beast mode, checks Docker/LaTeX)
researchclaw setup
# 3. Configure
researchclaw init # Interactive: choose LLM provider, creates config.arc.yaml
# Or manually: cp config.researchclaw.example.yaml config.arc.yaml
# 4. Run
export OPENAI_API_KEY="sk-..."
researchclaw run --config config.arc.yaml --topic "Your research idea" --auto-approve
Output: `artifacts/rc-YYYYMMDD-HHMMSS-<hash>/deliverables/` holds compile-ready LaTeX, BibTeX, experiment code, and charts.
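Because each run gets its own timestamped folder, a small helper can locate the newest one. The sketch below assumes run folders follow the `rc-YYYYMMDD-HHMMSS-<hash>` pattern shown above and that the hash is alphanumeric; `latest_deliverables` is a hypothetical name, not a function the package provides.

```python
import re
from pathlib import Path

# Matches run folders such as rc-20260322-141503-ab12cd.
# The hash alphabet and length are assumptions.
RUN_DIR_PATTERN = re.compile(r"^rc-(\d{8})-(\d{6})-[0-9A-Za-z]+$")


def latest_deliverables(artifacts_root="artifacts"):
    """Return the deliverables/ path of the newest run, or None if no run exists."""
    root = Path(artifacts_root)
    if not root.is_dir():
        return None
    runs = []
    for child in root.iterdir():
        match = RUN_DIR_PATTERN.match(child.name)
        if child.is_dir() and match:
            # Date and time are zero-padded, so string order equals time order.
            runs.append((match.group(1) + match.group(2), child))
    if not runs:
        return None
    newest = max(runs, key=lambda item: item[0])[1]
    # The path is returned even if the run has not written deliverables yet.
    return newest / "deliverables"
```

Calling `latest_deliverables()` from the repository root returns a `Path` you can hand to a LaTeX build or an upload script.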
<details>
<summary>Minimum required config</summary>

project:
  name: "my-research"
research:
  topic: "Your research topic here"
llm:
  base_url: "https://api.openai.com/v1"
  api_key_env: "OPENAI_API_KEY"
  primary_model: "gpt-4o"
  fallback_models: ["gpt-4o-mini"]
experiment:
  mode: "sandbox"
  sandbox:
    python_path: ".venv/bin/python"
</details>
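Two settings in this config point outside the file: `api_key_env` names an environment variable, and `python_path` names an interpreter on disk. A quick preflight check before a long run might look like the sketch below. It mirrors the config as a plain dict to avoid a YAML dependency, and `preflight` is a hypothetical helper, not something `researchclaw` ships.

```python
import os
from pathlib import Path

# Mirrors the relevant keys of the config as a plain dict,
# so the sketch needs no YAML parser.
config = {
    "llm": {"api_key_env": "OPENAI_API_KEY", "primary_model": "gpt-4o"},
    "experiment": {
        "mode": "sandbox",
        "sandbox": {"python_path": ".venv/bin/python"},
    },
}


def preflight(cfg, environ=os.environ):
    """Return a list of readable problems; an empty list means nothing obvious is wrong."""
    problems = []
    key_var = cfg.get("llm", {}).get("api_key_env")
    if not key_var:
        problems.append("llm.api_key_env is not set")
    elif not environ.get(key_var):
        problems.append(f"environment variable {key_var} is empty or missing")
    experiment = cfg.get("experiment", {})
    if experiment.get("mode") == "sandbox":
        python_path = experiment.get("sandbox", {}).get("python_path")
        if not python_path or not Path(python_path).exists():
            problems.append(f"sandbox python not found: {python_path}")
    return problems
```

Running `preflight(config)` in the shell where you plan to launch the pipeline catches a missing API key or virtual environment before any stage starts.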
What Makes It Different
| Capability | How It Works |
|-----------|-------------|
| PIVOT / REFINE Loop | Stage 15 autonomously decides: PROCEED, REFINE (tweak params), or PIVOT (new direction). Artifacts auto-versioned. |
| Multi-Agent Debate | Hypothesis generation, result analysis, and peer review each use structured multi-perspective debate. |
| Self-Learning | Lessons extracted per run (decision rationale, runtime warnings, metric anomalies) with 30-day time-decay. Future runs learn from past mistakes. |
| Knowledge Base | Every run builds structured KB across 6 categories (decisions, experiments, findings, literature, questions, reviews). |
| Sentinel Watchdog | Background quality monitor: NaN/Inf detection, paper-evidence consistency, citation relevance scoring, anti-fabrication guard. |
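The Self-Learning row mentions a 30-day time-decay without giving the formula. One plausible reading is an exponential decay with a 30-day half-life, sketched below. This is an assumption for illustration only: the project may use a different curve or a hard cutoff, and `lesson_weight` and `rank_lessons` are hypothetical names.

```python
import math

HALF_LIFE_DAYS = 30.0  # assumption: "30-day time-decay" read as a half-life


def lesson_weight(age_days, half_life_days=HALF_LIFE_DAYS):
    """Exponential decay: weight 1.0 at age 0, 0.5 after one half-life."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return math.exp(-math.log(2) * age_days / half_life_days)


def rank_lessons(lessons):
    """Sort (text, age_days) pairs so the most heavily weighted lessons come first."""
    return sorted(lessons, key=lambda item: lesson_weight(item[1]), reverse=True)
```

Under this reading, a lesson from yesterday outweighs one from two months ago by roughly a factor of four, so stale advice fades without being deleted outright.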
