> "...Minimizing the mean square error on future experience." - Richard S. Sutton
# <a name="title"></a>BTGym

Scalable, event-driven, RL-friendly backtesting library. Built on top of Backtrader with the OpenAI Gym environment API.

Backtrader is an open-source algorithmic trading library:
GitHub: http://github.com/mementum/backtrader
Documentation and community:
http://www.backtrader.com/
OpenAI Gym is...,
well, everyone knows Gym:
GitHub: http://github.com/openai/gym
Documentation and community:
https://gym.openai.com/
## <a name="outline"></a>Outline

The general purpose of this project is to provide a Gym-integrated framework for running reinforcement learning experiments in [close to] real-world algorithmic trading environments.
**DISCLAIMER:**
The code presented here is research/development grade. It can be unstable, buggy, poorly performing, and is subject to change.

Note that this package is neither an out-of-the-box moneymaker nor does it provide ready-to-converge RL solutions. Think of it as a framework for setting up experiments with complex, non-stationary, stochastic environments.

As a research project, BTGym in its current stage can hardly deliver an easy end-user experience, in the sense that setting up meaningful experiments will require some practical programming experience as well as general knowledge of reinforcement learning theory.
News and update notes
## <a name="contents"></a>Contents
- Installation
- Quickstart
- Description
- Documentation and community
- Known bugs and limitations
- Roadmap
- Update news
## <a name="install"></a>Installation

It is highly recommended to run BTGym in a dedicated virtual environment.

Clone or copy the btgym repository to local disk, `cd` to it and run `pip install -e .` to install the package and all dependencies:

```bash
git clone https://github.com/Kismuz/btgym.git
cd btgym
pip install -e .
```

To update to the latest version:

```bash
cd btgym
git pull
pip install --upgrade -e .
```
Notes:

- BTGym requires Matplotlib version 2.0.2; downgrade your installation if you have version 2.1:

  ```bash
  pip install matplotlib==2.0.2
  ```

- The `lsof` utility should be installed on your OS, which may not be the case by default for some Linux distributions; see: https://en.wikipedia.org/wiki/Lsof
## <a name="start"></a>Quickstart

Making a gym environment with all parameters set to defaults is as simple as:

```python
from btgym import BTgymEnv

MyEnvironment = BTgymEnv(filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',)
```
Adding more controls may look like:

```python
from gym import spaces
from btgym import BTgymEnv

MyEnvironment = BTgymEnv(
    filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',
    episode_duration={'days': 2, 'hours': 23, 'minutes': 55},
    drawdown_call=50,
    state_shape=dict(raw=spaces.Box(low=0, high=1, shape=(30, 4))),
    port=5555,
    verbose=1,
)
```
See more options at Documentation: Quickstart >>
and how-to's in Examples directory >>.
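Either way, the resulting environment follows the standard Gym episode loop: `reset`, then `step` until `done`. A minimal sketch of that loop (shown with a trivial stub environment so the snippet is self-contained; with BTGym you would drive the `BTgymEnv` instance created above instead):

```python
import random

class StubEnv:
    """Trivial stand-in exposing the Gym episode interface.

    Purely illustrative: swap in a real BTgymEnv instance to run
    an actual backtest episode."""
    def reset(self):
        self.t = 0
        return [0.0]  # initial observation

    def step(self, action):
        self.t += 1
        done = self.t >= 5  # episode termination condition
        return [float(self.t)], 0.0, done, {}  # obs, reward, done, info

env = StubEnv()
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    # BTGym discrete action semantics: 0: hold, 1: buy, 2: sell, 3: close
    action = random.choice([0, 1, 2, 3])
    obs, reward, done, info = env.step(action)
    total_reward += reward
```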
## <a name="description"></a>General description

### <a name="problem"></a>Problem setting
- **Discrete actions setup:** consider a setup with one riskless asset acting as broker account cash and `K` (by default, one) risky assets. For every risky asset there exists a track of historic price records referred to as a `data-line`. Apart from the asset data lines there [optionally] exists a number of exogenous data lines holding information and statistics, e.g. economic indexes, encoded news, macroeconomic indicators, weather forecasts etc., which are considered relevant to decision-making. It is supposed for this setup that:
    - there are no interest rates for any asset;
    - broker actions are fixed-size market orders (`buy`, `sell`, `close`); short selling is permitted;
    - transaction costs are modelled via broker commission;
    - 'market liquidity' and 'capital impact' assumptions are met;
    - time indexes match for all data lines provided;
- The problem is modelled as a discrete-time, finite-horizon, partially observable Markov decision process for equity/currency trading:
    - for every asset traded, the agent action space is discrete: (`0`: hold [do nothing], `1`: buy, `2`: sell, `3`: close [position]);
    - the environment is episodic: a maximum episode duration and episode termination conditions are set;
    - for every timestep of the episode the agent is given an environment state observation as a tensor of the last `m` time-embedded preprocessed values for every data-line included, and emits actions according to some stochastic policy;
    - the agent's goal is to maximize expected cumulative capital by learning an optimal policy;
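The fixed-size market-order and commission assumptions above can be illustrated with a toy bookkeeping sketch (hypothetical, simplified accounting for a single long/short position; this is not BTGym's internal broker logic, and all names below are made up for illustration):

```python
def run_episode(prices, actions, order_size=100, commission=0.0001):
    """Toy illustration of the discrete action semantics:
    fixed-size market orders with a broker commission.
    Position is +1 (long), -1 (short) or 0 (flat)."""
    cash, position, entry = 1000.0, 0, 0.0
    for price, action in zip(prices, actions):
        if action in (1, 2) and position == 0:   # buy / sell opens a position
            position = 1 if action == 1 else -1
            entry = price
            cash -= commission * order_size * price
        elif action == 3 and position != 0:      # close realises the P&L
            cash += position * (price - entry) * order_size
            cash -= commission * order_size * price
            position = 0
    return cash

final = run_episode(
    prices=[1.10, 1.12, 1.11],
    actions=[1, 0, 3],   # buy, hold, close
)
```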
- **Continuous actions setup [BETA]:** this setup closely relates to the continuous portfolio optimisation problem definition; it differs from the setup above in that:
    - base broker actions are real numbers: `a[i] in [0,1], 0<=i<=K, SUM{a[i]} = 1` for the `K` risky assets added; each action is a market target order to adjust the portfolio to get a share of `a[i]*100%` for the `i`-th asset;
    - the entire single-step broker action is a dictionary of the form: `{cash_name: a[0], asset_name_1: a[1], ..., asset_name_K: a[K]}`;
    - short selling is not permitted;

    For RL it implies having a continuous action space as a `K+1`-dim vector.
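One common way to turn an unconstrained `K+1`-dim policy output into such an action is a softmax projection onto the simplex, so every share lands in `[0, 1]` and the shares sum to `1`. A sketch (the helper and the asset names are illustrative, not part of the BTGym API):

```python
import math

def to_portfolio_action(logits, names):
    """Map an unconstrained real vector of size K+1 to portfolio target
    shares a[i] in [0, 1] with SUM{a[i]} = 1, via a numerically stable
    softmax. `names` lists the cash asset first, then the K risky assets."""
    assert len(logits) == len(names)
    m = max(logits)
    exp = [math.exp(x - m) for x in logits]  # shift by max for stability
    total = sum(exp)
    return {name: e / total for name, e in zip(names, exp)}

action = to_portfolio_action([0.0, 1.0, 1.0], ['cash', 'EURUSD', 'GBPUSD'])
# all shares lie in [0, 1] and sum to 1 by construction
```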
### <a name="data"></a>Data selection options for backtest agent training

Notice: the data shaping approach is under development; expect some changes. [7.01.18]
- random sampling: the historic price change dataset is divided into training, cross-validation and testing subsets. Since agent actions do not influence the market, it is possible to randomly sample a continuous subset of training data for every episode. [Seems to be] the most data-efficient method. Cross-validation and testing are performed later, as usual, on the most "recent" data;
- sequential sampling: the full dataset is fed sequentially, episode by episode, as if the agent were performing real-time trading. The most reality-like and least data-efficient option; a natural non-stationarity remedy;
- sliding time-window sampling: a mixture of the above; an episode is sampled randomly from a comparatively short time period that slides from the furthest to the most recent training data. Should be less prone to overfitting than random sampling.
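The sliding time-window scheme can be sketched as follows (a hypothetical helper over a plain Python list, not the BTGym data-sampling API):

```python
import random

def sample_episodes(data, episode_len, window_size, step):
    """Sliding time-window sampling sketch: the window slides from the
    oldest to the most recent training data; each episode is a random
    continuous slice drawn from inside the current window."""
    for start in range(0, len(data) - window_size + 1, step):
        window = data[start:start + window_size]
        lo = random.randrange(len(window) - episode_len + 1)
        yield window[lo:lo + episode_len]

prices = list(range(100))  # toy stand-in for a price series
episodes = list(sample_episodes(prices, episode_len=10, window_size=30, step=20))
# 4 windows -> 4 episodes, each a continuous 10-point slice
```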
## <a name="reference"></a>Documentation and Community
- Read Docs and API Reference.
- Browse Development Wiki.
- Review opened and closed Issues.
- Go to BTGym Slack channel. If you are new - use this invite link to join.
## <a name="issues"></a>Known bugs and limitations
- requires Matplotlib version 2.0.2;
- matplotlib backend warning: appears when importing pyplot and using the `%matplotlib inline` magic before btgym import. It's recommended to import backtrader and btgym first to ensure proper backend choice;
- not tested with Python < 3.5;
- doesn't seem to work correctly under Windows; partially done;
- by default, is configured to accept Forex 1 min. data from www.HistData.com;
- ~~only random data sampling is implemented;~~
- ~~no built-in dataset splitting to training/cv/testing subsets;~~ done
- ~~only one equity/currency pair can be traded~~ done
- ~~no 'skip-frames' implementation within environment;~~ done
- ~~no plotting features, except if using pycharm integration observer.~~ ~~Not sure if it is suited for intraday strategies.~~ [partially] done
- ~~making new environment kills all processes using specified network port. Watch out your jupyter kernels.~~ fixed
## <a name="roadmap"></a>TODO's and Road Map
- [x] refine logic for parameters applying priority (engine vs strategy vs kwargs vs defaults);
- [X] API reference;
- [x] examples;
- [x] frame-skipping feature;
- [x] dataset tr/cv/t approach;
- [x] state rendering;
- [x] proper rendering for entire episode;
- [x] tensorboard integration;
- [x] multiple agents asynchronous operation feature (e.g. for A3C):
- [x] dedicated data server;
- [x] multi-modal observation space shape;
- [x] A3C implementation for BTgym;
- [x] UNREAL implementation for BTgym;
- [x] PPO implementation for BTgym;
- [ ] RL^2 / MAML / DARLA adaptations - IN PROGRESS;
- [x] learning from demonstrations; - partially done
- [ ] risk-sensitive agents implementation;
- [x] sequential and sliding time-window sampling;
- [x] multiple instruments trading;
- [x] docker image; CPU version, Signalprime contribution;
- [ ] TF serving model serialisation functionality;
## <a name="news"></a>News and updates
- 10.01.2019: docker CPU version is now available, contributed by Signalprime (https://github.com/signalprime); see `btgym/docker/README.md` for details;

- 9.02.2019: Introduction to analytic data model notebook added to `model_based_stat_arb` examples folder.

- 25.01.2019: updates:
    - `lstm_policy` class now requires both `internal` and `external` observation sub-spaces to be present, and allows both to be one-level nested sub-spaces themselves (was only true for `external`); all declared sub-spaces get encoded by separate convolution encoders;
    - a policy deterministic action option is implemented for discrete action spaces and can be utilised by `syncro_runner`; by default it is enabled for test episodes;
