AlphaForge
Official implementation for AAAI2025: AlphaForge: A Framework to Mine and Dynamically Combine Formulaic Alpha Factors
Install / Use
/learn @DulyHao/AlphaForgeREADME
AlphaForge(AFF)
Data Preparation
Similar to AlphaGen, We Use Qlib as data save tool and download data from free & open-source data source baostock.
Please install Qlib Qlib first
Then download stock data through running data_collection/fetch_baostock_data.py
The next, Modify the correspoding /path/for/qlib_data in gan.utils.data.py to the data you downloaded (the dafault setting is ~/.qlib/qlib_data/cn_data_rolling)
Run Our Model
stage1: Minning alpha factors
python train_AFF.py --instruments=csi300 --train_end_year=2020 --seeds=[0,1,2,3,4] --save_name=test --zoo_size=100
Here,
instrumentsis the dataset to use, e.g.,csi300,csi500.seedsis random seed list, e.g.,[0,1,2]or[0].train_end_yearis the last year of training set, when train_end_year is 2020,the train,valid and test set is seperately:2010-01-01 to 2020-12-31,2021-01-01 to 2021-12-31,2022-01-01 to 2022-12-31save_nameis the prefix when saving running results.zoo_sizeis the num of factors to save at stage 1 mining model.
stage2: Combining alpha factors
python combine_AFF.py --instruments=csi300 --train_end_year=2020 --seeds=[0,1,2,3,4] --save_name=test --n_factors=10 --window=inf
Here instruments,train_end_year,seeds,save_name,` must be the same as it in stage 1
n_factorsis the num of factors used at each day, it should be less than or equal tozoo_sizein stage 1.windowis the slicing window that is used to evaluate the alpha factors in order to dynamicly select and cobine.
stage3: Show the results
You could run the ipython notebook file
exp_AFF_calc_result.ipynb
to generate and concat experiment result.
Run baseline experiments
The experiment process of other models is similar to running our AFF model, Except that none of the other models have a combine step.
GP:
train: train_RL.py, show result: exp_RL_calc_result.ipynb
RL:
train: train_RL.py, show result: exp_RL_calc_result.ipynb
DSO:
train: train_RL.py, show result: exp_RL_calc_result.ipynb
ML models including XGBoost, LightGBM and MLP:
train & show results: exp_ML_train_and_result.ipynb
