X2Strategy

Any Research Input → Strategy Spec → Executable Code → Backtest → Diagnosis

Getting Started · How It Works · Examples · Docs · 简体中文

Turn quantitative finance research — papers, drafts, reports, or strategy ideas — into validated, executable trading strategies. Automatically.

</div>

Highlights

🔬 Multi-Format Input — PDF papers, Markdown drafts, DOCX reports, plain text. Auto-detected.
🧠 5-Layer LLM Extraction — Multi-strategy detection → indicators → signal logic → execution plan → risk controls.
✅ Verified Code Generation — AST validation + Backtrader structural checks + indicator registry, not just "generate and hope".
📊 Automated Backtesting — Execute, extract metrics, and diagnose against paper-reported performance.
🤖 Agent-Native — Works as an Agent Skill (/x2strategy) in VS Code Copilot, Claude Code, or any compatible agent.
💰 ~$0.1 per paper — DeepSeek-powered. Any LiteLLM-supported provider works.

How It Works

                        ┌──────────────────────────────────────────────────────────────┐
                        │                    X2Strategy                              │
                        │                                                              │
  PDF / MD / DOCX / TXT │   ┌─────────┐   ┌───────────┐   ┌──────────┐   ┌─────────┐ │
  ─────────────────────►│   │  Parse   ├──►│  Extract   ├──►│ Generate ├──►│Backtest │ │
                        │   │ (parser) │   │ (L0 → L4) │   │  (code)  │   │+ Diagnose││
                        │   └─────────┘   └───────────┘   └──────────┘   └─────────┘ │
                        │        ▼              ▼               ▼             ▼        │
                        │   PaperContent   StrategySpec   Backtrader.py   Report.md   │
                        └──────────────────────────────────────────────────────────────┘

| Stage | Input | Output | What Happens | |:------|:------|:-------|:-------------| | Parse | Any document | PaperContent | Format-aware extraction (PyMuPDF / direct read / python-docx) | | Extract | PaperContent | StrategySpec[] | 5-layer LLM: detect strategies → extract indicators, logic, execution, risk | | Generate | StrategySpec | strategy.py | Data module → signal module → backtest module → integration | | Validate | strategy.py | Pass / Fail | AST syntax + Backtrader structure + indicator existence checks | | Backtest | strategy.py | Metrics | Subprocess execution with timeout, metric extraction | | Diagnose | Metrics | report.md | Compare against paper-reported results, flag deviations |

Getting Started

Option A: As an Agent Skill (Recommended)

Agent Skills is an open standard. Clone into the agent's skill directory — it auto-discovers SKILL.md and registers the /x2strategy slash command.

<table> <tr><td><b>GitHub Copilot</b></td><td>

git clone https://github.com/ALAGENT-HKU/x2strategy.git ~/.copilot/skills/x2strategy

</td></tr> <tr><td><b>Claude Code</b></td><td>

git clone https://github.com/ALAGENT-HKU/x2strategy.git ~/.claude/skills/x2strategy

</td></tr> <tr><td><b>Project-scoped</b></td><td>

git clone https://github.com/ALAGENT-HKU/x2strategy.git .github/skills/x2strategy

</td></tr> </table>

Then install dependencies:

cd ~/.copilot/skills/x2strategy   # or wherever you cloned
# if you haven't installed uv, run `pip install uv`
uv sync --extra codegen                  # core + backtrader + yfinance + akshare

[!IMPORTANT] The directory name must be x2strategy (matching the name field in SKILL.md). Once installed, type /x2strategy in chat or the agent auto-activates when relevant.

Option B: Standalone CLI

git clone https://github.com/ALAGENT-HKU/x2strategy.git && cd x2strategy
uv sync --extra codegen    # core + backtest
uv sync --extra agent      # + FAISS semantic search (for 100+ page papers)
uv sync --extra dev        # + pytest

<details> <summary>pip alternative</summary>

python -m venv .venv && source .venv/bin/activate
pip install -e ".[codegen,agent,dev]"

</details>

Quick Start

# 1. Configure
cp .env.example .env          # add your API key (DEEPSEEK_API_KEY recommended)

# 2. Extract strategy specs from any input format
uv run python scripts/analyze.py paper.pdf -o library/my_paper/
uv run python scripts/analyze.py strategy_draft.md -o library/my_draft/
uv run python scripts/analyze.py report.docx -o library/my_report/

# 3. Generate Backtrader code from spec
uv run python scripts/generate.py library/my_paper/spec.json --strategy-index 0

# 4. Validate + backtest
uv run python scripts/validate_strategy.py library/my_paper/strategy.py
uv run python scripts/backtest.py library/my_paper/strategy.py -o library/my_paper/results/

Or use the agent skill — just say:

"Analyze this paper and implement the main strategy" + attach a PDF

The agent handles everything: parsing, extraction, code generation, validation, backtesting, and diagnosis.

Supported Input Formats

| Format | Extensions | Parser | Notes | |:-------|:-----------|:-------|:------| | PDF | .pdf | PyMuPDF → Mode A (direct) or Mode B (FAISS) | Full support, covering 95%+ of papers | | Markdown | .md .markdown | Direct text read | Ideal for strategy drafts and notes | | Word | .docx | python-docx (uv sync --extra docx) | Internal research reports | | Plain text | .txt | Direct read | Raw strategy descriptions |

Format is auto-detected from file extension. No configuration needed.

Examples

Pre-generated outputs from real papers are available in examples/:

| Paper | Strategies Detected | Artifacts | |:------|:-------------------|:----------| | Tactical Asset Allocation (Faber 2007) | 1 — GTAA with SMA timing | spec + code | | Pairs Trading (Goncalves-Pinto et al.) | 3 — Distance, Stationarity, Cointegration | spec | | Value and Momentum (Asness et al.) | 2 — Value Factor, Momentum Factor | spec |

<details> <summary>Example output structure</summary>

library/tactical_aa/
├── content.json          # Parsed paper content
├── content.md            # Human-readable paper summary
├── spec.json             # Structured strategy specification
├── spec.md               # Human-readable spec
├── metadata.json         # Run metadata (model, timing, etc.)
├── strategy.py           # Generated Backtrader code
├── validation_report.md  # AST + structural validation results
└── results/
    ├── backtest_output.txt
    └── diagnosis_report.md

</details>

Project Structure

x2strategy/
├── paper2spec/                 # Phase 1: Document → Structured Spec
│   ├── parser.py               #   Multi-format parser (PDF / MD / DOCX / TXT)
│   ├── extractor.py            #   PaperContent → ExtractionResult (L0-L4)
│   ├── models.py               #   Data models (PaperContent, StrategySpec, etc.)
│   ├── prompts.py              #   5-layer extraction prompt templates
│   ├── llm.py                  #   LiteLLM unified interface
│   ├── render.py               #   JSON → Markdown rendering
│   └── search.py               #   arXiv + SSRN paper search
│
├── spec2code/                  # Phase 2: Spec → Code → Backtest → Diagnosis
│   ├── prompts.py              #   Data / Signal / Backtest / Integration templates
│   ├── validator.py            #   AST + structural + indicator validation
│   ├── executor.py             #   Subprocess-based backtest execution
│   ├── analyzer.py             #   Result comparison + diagnosis report
│   └── models.py               #   CodeModules, ValidationResult
│
├── references/                 # Verified domain knowledge (not LLM hallucinations)
│   ├── backtrader_patterns.md  #   Source-verified Backtrader patterns
│   ├── indicator_cookbook.md    #   Official indicator params (from bt source code)
│   ├── data_sources.md         #   yfinance + akshare API docs
│   ├── paper2spec.md           #   Paper2Spec deep-dive guide
│   └── spec2code.md            #   Spec2Code deep-dive guide
│
├── scripts/                    # CLI entry points
│   ├── analyze.py              #   Full paper2spec pipeline
│   ├── generate.py             #   Full spec2code pipeline
│   └── validate_strategy.py    #   Standalone validation
│
├── schemas/                    # JSON Schema definitions
├── examples/                   # Pre-generated reference outputs
├── tests/                      # 180+ unit & integration tests
├── SKILL.md                    # Agent Skill entry point
└── pyproject.toml              # Project config & dependencies

Key Design Decisions

Why Reference Docs, Not Prompts?

LLMs frequently hallucinate Backtrader API details:

SMA default period is 30, not 20
RSI uses SmoothedMovingAverage, not EMA
BollingerBands lines are .top/.mid/.bot, not .upper/.lower

Our references/ directory contains source-code-verified knowledge. The agent reads these docs on demand — zero hallucination on API details.

</td> <td width="50%">

X2strategy

Install / Use

README