QuantaAlpha
🎯 Overview
QuantaAlpha transforms how you discover quantitative alpha factors by combining LLM intelligence with evolutionary strategies. Just describe your research direction, and watch as factors are automatically mined, evolved, and validated through self-evolving trajectories.
<p align="center">💬 Research Direction → 🧩 Diversified Planning → 🔄 Trajectory → ✅ Validated Alpha Factors</p>

Demo: below is a short walkthrough of the full flow, from research direction to factor mining and the backtesting UI.
<div align="center"> <video src="https://github.com/user-attachments/assets/726511ce-a384-4727-a7be-948a2cf05e4b" controls style="max-width: 90%; border-radius: 8px; box-shadow: 0 4px 8px rgba(0,0,0,0.1);"> Your browser does not support the video tag. <a href="https://github.com/user-attachments/assets/726511ce-a384-4727-a7be-948a2cf05e4b">Watch the demo video</a>. </video> <p style="font-size: 12px; color: #666; margin-top: 8px;"> ▶ Click to play the QuantaAlpha end-to-end workflow demo. </p> </div>

📊 Performance
1. Factor Performance
<div align="center"> <img src="docs/images/figure3.png" width="90%" alt="Zero-Shot Transfer" style="border-radius: 8px; box-shadow: 0 4px 8px rgba(0,0,0,0.1);"/> <p style="font-size: 12px; color: #666;">CSI 300 factors transferred to CSI 500/S&P 500</p> </div>

2. Key Results
<div align="center">

| Dimension | Metric | Performance |
| :---: | :---: | :---: |
| Predictive Power | Information Coefficient (IC) | 0.1501 |
| | Rank IC | 0.1465 |
| Strategy Return | Annualized Excess Return (ARR) | 27.75% |
| | Max Drawdown (MDD) | 7.98% |
| | Calmar Ratio (CR) | 3.4774 |
</div>

<div align="center"> <img src="docs/images/主实验.png" width="90%" alt="Main Experiment Results" style="border-radius: 8px; box-shadow: 0 4px 8px rgba(0,0,0,0.1);"/> </div>

🚀 Quick Start
<p align="center" style="font-size: 13px; color: #666; margin-top: 10px;"> 🔬 Experiments: paper reproduction settings & metric definitions — <a href="experiment/README_EXPERIMENT.md"><b>English</b></a> · <a href="experiment/README_EXPERIMENT_CN.md"><b>中文</b></a> </p>

1. Clone & Install
```bash
git clone https://github.com/QuantaAlpha/QuantaAlpha.git
cd QuantaAlpha

conda create -n quantaalpha python=3.10
conda activate quantaalpha

# Install the package in development mode
SETUPTOOLS_SCM_PRETEND_VERSION=0.1.0 pip install -e .

# Install additional dependencies
pip install -r requirements.txt
```
2. Configure Environment
```bash
cp configs/.env.example .env
```

Edit `.env` with your settings:
```bash
# === Required: Data Paths ===
QLIB_DATA_DIR=/path/to/your/qlib/cn_data   # Qlib data directory
DATA_RESULTS_DIR=/path/to/your/results     # Output directory

# === Required: LLM API ===
OPENAI_API_KEY=your-api-key
OPENAI_BASE_URL=https://your-llm-provider/v1   # e.g. DashScope, OpenAI
CHAT_MODEL=deepseek-v3                         # or gpt-4, qwen-max, etc.
REASONING_MODEL=deepseek-v3
```
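Before launching a run, it can save time to confirm that every required setting is actually present. A minimal sketch of such a check (the `missing_settings` helper is illustrative, not part of QuantaAlpha):

```python
REQUIRED_KEYS = [
    "QLIB_DATA_DIR", "DATA_RESULTS_DIR", "OPENAI_API_KEY",
    "OPENAI_BASE_URL", "CHAT_MODEL", "REASONING_MODEL",
]

def missing_settings(env: dict) -> list:
    """Return the required keys that are absent or empty."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]

sample = {
    "QLIB_DATA_DIR": "/path/to/your/qlib/cn_data",
    "DATA_RESULTS_DIR": "/path/to/your/results",
    "OPENAI_API_KEY": "your-api-key",
    "OPENAI_BASE_URL": "https://your-llm-provider/v1",
    "CHAT_MODEL": "deepseek-v3",
}
print(missing_settings(sample))  # -> ['REASONING_MODEL']
```

In practice you would pass `dict(os.environ)` (after loading `.env`) instead of the sample dict.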
3. Prepare Data
QuantaAlpha requires two kinds of data: Qlib market data (for backtesting) and pre-computed price-volume HDF5 files (for factor mining). Both are provided on Hugging Face for convenience.
Dataset: https://huggingface.co/datasets/QuantaAlpha/qlib_csi300
| File | Description | Size | Usage |
| :--- | :--- | :--- | :--- |
| cn_data.zip | Qlib raw market data (A-share, 2016–2025) | 493 MB | Required for Qlib initialization & backtesting |
| daily_pv.h5 | Pre-computed full price-volume data | 398 MB | Required for factor mining |
| daily_pv_debug.h5 | Pre-computed debug subset (smaller) | 1.41 MB | Required for factor mining (debug/validation) |
Why provide HDF5 files? The system can auto-generate `daily_pv.h5` from Qlib data on first run, but that process is very slow. Downloading the pre-built HDF5 files saves significant time.
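To give a feel for what a daily price-volume file contains, here is an illustrative sketch. The exact `daily_pv.h5` schema is defined by the project; the column names and instrument codes below are assumptions, showing only the conventional (datetime, instrument) MultiIndex panel layout:

```python
import pandas as pd

# Two trading days x two instruments, OHLCV-style columns (illustrative only).
idx = pd.MultiIndex.from_product(
    [pd.to_datetime(["2024-01-02", "2024-01-03"]), ["SH600000", "SZ000001"]],
    names=["datetime", "instrument"],
)
panel = pd.DataFrame(
    {"open": 1.0, "high": 1.1, "low": 0.9, "close": 1.05, "volume": 1e6},
    index=idx,
)
print(list(panel.index.names))  # ['datetime', 'instrument']
print(panel.shape)              # (4, 5)
```

A panel like this can be persisted to HDF5 with `panel.to_hdf("daily_pv.h5", key="data")` (requires PyTables).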
Step 1: Download
```bash
# Option A: Using huggingface-cli (recommended)
pip install huggingface_hub
huggingface-cli download QuantaAlpha/qlib_csi300 --repo-type dataset --local-dir ./hf_data

# Option B: Using wget
mkdir -p hf_data
wget -P hf_data https://huggingface.co/datasets/QuantaAlpha/qlib_csi300/resolve/main/cn_data.zip
wget -P hf_data https://huggingface.co/datasets/QuantaAlpha/qlib_csi300/resolve/main/daily_pv.h5
wget -P hf_data https://huggingface.co/datasets/QuantaAlpha/qlib_csi300/resolve/main/daily_pv_debug.h5
```
Step 2: Extract & Place Files
```bash
# 1. Extract Qlib data
unzip hf_data/cn_data.zip -d ./data/qlib

# 2. Place HDF5 files into the default data directories
mkdir -p git_ignore_folder/factor_implementation_source_data
mkdir -p git_ignore_folder/factor_implementation_source_data_debug
cp hf_data/daily_pv.h5 git_ignore_folder/factor_implementation_source_data/daily_pv.h5
cp hf_data/daily_pv_debug.h5 git_ignore_folder/factor_implementation_source_data_debug/daily_pv.h5
```
Note: `daily_pv_debug.h5` must be renamed to `daily_pv.h5` when placed in the debug directory, as the `cp` command above does.
Step 3: Configure Paths in .env
```bash
# Point to the extracted Qlib data directory (must contain calendars/, features/, instruments/)
QLIB_DATA_DIR=./data/qlib/cn_data

# Output directory for experiment results
DATA_RESULTS_DIR=./data/results
```
The HDF5 data directories can also be customized via environment variables if you prefer a different location:
```bash
# Optional: override default HDF5 data paths
FACTOR_CoSTEER_DATA_FOLDER=/your/custom/path/factor_source_data
FACTOR_CoSTEER_DATA_FOLDER_DEBUG=/your/custom/path/factor_source_data_debug
```
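The override semantics are the usual environment-variable fallback pattern: an explicit variable wins, otherwise the default folder is used. A small sketch (the `resolve_data_folder` helper is hypothetical, mirroring the behaviour described above rather than quoting QuantaAlpha's code):

```python
import os

def resolve_data_folder(env_var: str, default: str) -> str:
    """Return the environment override if set, else the default path."""
    return os.environ.get(env_var, default)

# With the variable set, the override is used.
os.environ["FACTOR_CoSTEER_DATA_FOLDER"] = "/your/custom/path/factor_source_data"
print(resolve_data_folder(
    "FACTOR_CoSTEER_DATA_FOLDER",
    "git_ignore_folder/factor_implementation_source_data",
))  # -> /your/custom/path/factor_source_data

# With the variable unset, the default directory is returned.
os.environ.pop("FACTOR_CoSTEER_DATA_FOLDER_DEBUG", None)
print(resolve_data_folder(
    "FACTOR_CoSTEER_DATA_FOLDER_DEBUG",
    "git_ignore_folder/factor_implementation_source_data_debug",
))  # -> git_ignore_folder/factor_implementation_source_data_debug
```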
4. Run Factor Mining
```bash
./run.sh "<your input>"

# Example: Run with a research direction
./run.sh "Price-Volume Factor Mining"

# Example: Run with custom factor library suffix
./run.sh "Microstructure Factors" "exp_micro"
```
The experiment automatically mines, evolves, and validates alpha factors, saving all discovered factors to `all_factors_library*.json`.
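Because the factor library is plain JSON, you can inspect it with standard tooling. A minimal sketch; note that the factor names, expressions, and list-of-objects layout below are assumptions for illustration, not the actual `all_factors_library*.json` schema:

```python
import json

# A stand-in for the file contents (hypothetical schema).
library_text = json.dumps([
    {"name": "vol_price_corr_10d", "expression": "Corr($close, $volume, 10)"},
    {"name": "ret_5d", "expression": "$close / Ref($close, 5) - 1"},
])

factors = json.loads(library_text)
print(len(factors))                  # 2
print([f["name"] for f in factors])  # ['vol_price_corr_10d', 'ret_5d']
```

In practice you would replace `library_text` with `open("all_factors_library.json").read()`.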
5. Independent Backtesting
After mining, combine factors from the library for a full-period backtest:
```bash
# Backtest with custom factors only
python -m quantaalpha.backtest.run_backtest \
    -c configs/backtest.yaml \
    --factor-source custom \
    --factor-json all_factors_library.json

# Combine with Alpha158(20) baseline factors
python -m quantaalpha.backtest.run_backtest \
    -c configs/backtest.yaml \
    --factor-source combined \
    --factor-json all_factors_library.json

# Dry run (load factors only, skip backtest)
python -m quantaalpha.backtest.run_backtest \
    -c configs/backtest.yaml \
    --factor-source custom \
    --factor-json all_factors_library.json \
    --dry-run -v
```
Results are saved to the directory specified in `configs/backtest.yaml` (`experiment.output_dir`).
📘 Need help? Check our comprehensive User Guide for advanced configuration, experiment reproduction, and detailed usage examples.
🖥️ Web UI
QuantaAlpha provides a web-based dashboard where you can complete the entire workflow through a visual interface — no command line needed.