...Minimizing the mean square error on future experience.  - Richard S. Sutton

<a name="title"></a>BTGym

Scalable, event-driven, RL-friendly backtesting library. Built on top of Backtrader with an OpenAI Gym environment API.

Backtrader is an open-source algorithmic trading library:
GitHub: http://github.com/mementum/backtrader
Documentation and community:
http://www.backtrader.com/

OpenAI Gym is..., well, everyone knows Gym:
GitHub: http://github.com/openai/gym
Documentation and community:
https://gym.openai.com/


<a name="outline"></a>Outline

The general purpose of this project is to provide a Gym-integrated framework for running reinforcement learning experiments in [close to] real-world algorithmic trading environments.

DISCLAIMER:
Code presented here is research/development grade.
It can be unstable, buggy, or poorly performing, and is subject to change.

Note that this package is neither an out-of-the-box moneymaker, nor does it provide ready-to-converge RL solutions.
Think of it as a framework for setting up experiments with complex non-stationary stochastic environments.

As a research project, BTGym in its current stage can hardly deliver an easy end-user experience, in the sense that
setting up meaningful experiments will require some practical programming experience as well as general knowledge
of reinforcement learning theory.

<a name="install"></a>Installation

It is highly recommended to run BTGym in a dedicated virtual environment.

Clone or copy the btgym repository to your local disk, cd into it and run pip install -e . to install the package and all dependencies:

git clone https://github.com/Kismuz/btgym.git

cd btgym

pip install -e .

To update to the latest version:

cd btgym

git pull

pip install --upgrade -e .

Notes:
  1. BTGym requires Matplotlib version 2.0.2; downgrade your installation if you have version 2.1:

    pip install matplotlib==2.0.2

  2. The lsof utility must be installed on your OS; it may not be present by default on some Linux distributions, see: https://en.wikipedia.org/wiki/Lsof


<a name="start"></a>Quickstart

Making a Gym environment with all parameters set to defaults is as simple as:

from btgym import BTgymEnv

MyEnvironment = BTgymEnv(filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',)

Adding more controls may look like:

from gym import spaces
from btgym import BTgymEnv

MyEnvironment = BTgymEnv(filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',
                         episode_duration={'days': 2, 'hours': 23, 'minutes': 55},
                         drawdown_call=50,
                         state_shape=dict(raw=spaces.Box(low=0,high=1,shape=(30,4))),
                         port=5555,
                         verbose=1,
                         )

See more options in the Documentation: Quickstart,
and how-tos in the Examples directory.
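
Because the environment follows the OpenAI Gym API, an episode can be driven with the usual reset/step loop. The sketch below is only an illustration and is not part of the BTGym codebase: `run_episode` and `StubEnv` are hypothetical names, and the loop assumes the classic Gym convention in which `step()` returns `(observation, reward, done, info)`. `StubEnv` merely stands in for a real environment so that the snippet runs without data files; with BTGym installed, one would presumably pass a `BTgymEnv` instance instead.

```python
import random


def run_episode(env, policy, max_steps=1000):
    """Drive one episode with the classic Gym loop and return the total reward.

    Assumes env.reset() returns an observation and env.step(action)
    returns (observation, reward, done, info).
    """
    observation = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    env.close()
    return total_reward


class StubEnv:
    """Hypothetical stand-in for a Gym-style environment (not BTGym itself)."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 0 else 0.0  # toy reward: pays only for 'hold'
        done = self.t >= self.episode_length
        return [float(self.t)], reward, done, {}

    def close(self):
        pass


# Random policy over four discrete actions (hold, buy, sell, close).
rng = random.Random(0)
total = run_episode(StubEnv(), lambda obs: rng.randrange(4))
```

Passing the policy as a plain callable keeps the loop independent of any particular agent implementation.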

<a name="description"></a> General description

<a name="problem"></a> Problem setting

  • Discrete actions setup: consider a setup with one riskless asset acting as broker account cash and K (by default, one) risky assets. For every risky asset there exists a track of historic price records, referred to as a data line. Apart from the asset data lines, there [optionally] exist a number of exogenous data lines holding information and statistics considered relevant to decision-making, e.g. economic indexes, encoded news, macroeconomic indicators, weather forecasts etc. This setup assumes that:

    1. there are no interest rates for any asset;
    2. broker actions are fixed-size market orders (buy, sell, close); short selling is permitted;
    3. transaction costs are modelled via broker commission;
    4. 'market liquidity' and 'capital impact' assumptions are met;
    5. time indexes match for all data lines provided;
  • The problem is modelled as a discrete-time, finite-horizon, partially observable Markov decision process for equity/currency trading:

    • for every asset traded, the agent's action space is discrete (0: hold [do nothing], 1: buy, 2: sell, 3: close [position]);
    • the environment is episodic: maximum episode duration and episode termination conditions are set;
    • at every timestep of the episode, the agent is given an environment state observation as a tensor of the last m time-embedded preprocessed values for every data line included, and emits actions according to some stochastic policy;
    • the agent's goal is to maximize expected cumulative capital by learning an optimal policy;
  • Continuous actions setup [BETA]: this setup is closely related to the continuous portfolio optimisation problem definition; it differs from the setup above in that:

    1. base broker actions are real numbers: a[i] in [0,1], 0<=i<=K, SUM{a[i]} = 1 for K risky assets added; each action is a market target order to adjust the portfolio so that the i-th asset gets a share of a[i]*100%;
    2. the entire single-step broker action is a dictionary of the form: {cash_name: a[0], asset_name_1: a[1], ..., asset_name_K: a[K]};
    3. short selling is not permitted;
  • For RL, this implies a continuous action space in the form of a (K+1)-dimensional vector.
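
The two action formats described above can be sketched in a few lines. This is an illustrative example only: `DiscreteAction` and `to_portfolio_action` are hypothetical names, the asset names are made up, and normalising raw scores by their sum is just one simple way to satisfy the constraints a[i] in [0,1] and SUM{a[i]} = 1; it is not claimed to be how BTGym builds actions internally.

```python
from enum import IntEnum


class DiscreteAction(IntEnum):
    """Per-asset discrete action codes as listed in the problem setting."""
    HOLD = 0
    BUY = 1
    SELL = 2
    CLOSE = 3


def to_portfolio_action(raw_scores, cash_name, asset_names):
    """Map K+1 non-negative raw scores to a continuous broker action.

    Returns {cash_name: a[0], asset_name_1: a[1], ...} with every a[i]
    in [0, 1] and sum(a) == 1. Negative scores are rejected because
    short selling is not permitted in this setup.
    """
    names = [cash_name] + list(asset_names)
    if len(raw_scores) != len(names):
        raise ValueError("expected one score for cash plus one per asset")
    if any(s < 0 for s in raw_scores):
        raise ValueError("scores must be non-negative")
    total = sum(raw_scores)
    if total == 0:
        # Degenerate input: keep the whole portfolio in cash.
        weights = [1.0] + [0.0] * len(asset_names)
    else:
        weights = [s / total for s in raw_scores]
    return dict(zip(names, weights))


# K = 2 risky assets, so the action is a 3-dimensional vector.
action = to_portfolio_action([2.0, 1.0, 1.0], "cash", ["EURUSD", "GBPUSD"])
# action == {"cash": 0.5, "EURUSD": 0.25, "GBPUSD": 0.25}
```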

<a name="data"></a> Data selection options for backtest agent training:

Notice: data shaping approach is under development, expect some changes. [7.01.18]

  • random sampling: the historic price change dataset is divided into training, cross-validation and testing subsets. Since agent actions do not influence the market, it is possible to randomly sample a continuous subset of training data for every episode. [Seems to be] the most data-efficient method. Cross-validation and testing are performed later, as usual, on the most "recent" data;
  • sequential sampling: the full dataset is fed sequentially, as if the agent were performing real-time trading, episode by episode. Most reality-like, least data-efficient, a natural remedy for non-stationarity;
  • sliding time-window sampling: a mixture of the above; each episode is sampled randomly from a comparatively short time period, which slides from the furthest to the most recent training data. Should be less prone to overfitting than random sampling.
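
The three options differ only in how the starting row of each episode is chosen. The following is a minimal sketch under stated assumptions: data is addressed by integer row index within the training subset, all function and parameter names are hypothetical, and BTGym's own data classes may implement these schemes differently.

```python
import random


def random_start(n_train, episode_len, rng):
    """Random sampling: any contiguous window inside the training set."""
    return rng.randrange(0, n_train - episode_len + 1)


def sequential_starts(n_train, episode_len):
    """Sequential sampling: back-to-back episodes in time order."""
    return list(range(0, n_train - episode_len + 1, episode_len))


def sliding_window_start(n_train, episode_len, window_len, step, k, rng):
    """Sliding time-window sampling: the k-th episode is drawn at random
    from a short window that moves from the oldest to the most recent data."""
    window_begin = min(k * step, n_train - window_len)
    return rng.randrange(window_begin, window_begin + window_len - episode_len + 1)


rng = random.Random(42)
n_train, episode_len = 1000, 100

start = random_start(n_train, episode_len, rng)
seq = sequential_starts(n_train, episode_len)  # [0, 100, ..., 900]
slide = [
    sliding_window_start(n_train, episode_len, window_len=300, step=50, k=k, rng=rng)
    for k in range(20)
]
```

In this sketch the window stops sliding once it reaches the end of the training subset, so later episodes keep sampling from the most recent window.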

<a name="reference"></a>Documentation and Community


<a name="issues"></a> Known bugs and limitations:

  • requires Matplotlib version 2.0.2;
  • matplotlib backend warning: appears when importing pyplot and using the %matplotlib inline magic before importing btgym. It is recommended to import backtrader and btgym first to ensure proper backend choice;
  • not tested with Python < 3.5;
  • doesn't seem to work correctly under Windows; partially done
  • by default, is configured to accept Forex 1 min. data from www.HistData.com;
  • ~~only random data sampling is implemented;~~
  • ~~no built-in dataset splitting to training/cv/testing subsets;~~ done
  • ~~only one equity/currency pair can be traded~~ done
  • ~~no 'skip-frames' implementation within environment;~~ done
  • ~~no plotting features, except if using pycharm integration observer.~~ ~~Not sure if it is suited for intraday strategies.~~ [partially] done
  • ~~making new environment kills all processes using specified network port. Watch out your jupyter kernels.~~ fixed

<a name="roadmap"></a> TODO's and Road Map:

  • [x] refine logic for parameters applying priority (engine vs strategy vs kwargs vs defaults);
  • [X] API reference;
  • [x] examples;
  • [x] frame-skipping feature;
  • [x] dataset tr/cv/t approach;
  • [x] state rendering;
  • [x] proper rendering for entire episode;
  • [x] tensorboard integration;
  • [x] multiple agents asynchronous operation feature (e.g. for A3C);
  • [x] dedicated data server;
  • [x] multi-modal observation space shape;
  • [x] A3C implementation for BTgym;
  • [x] UNREAL implementation for BTgym;
  • [x] PPO implementation for BTgym;
  • [ ] RL^2 / MAML / DARLA adaptations - IN PROGRESS;
  • [x] learning from demonstrations; - partially done
  • [ ] risk-sensitive agents implementation;
  • [x] sequential and sliding time-window sampling;
  • [x] multiple instruments trading;
  • [x] docker image; - CPU version, Signalprime contribution;
  • [ ] TF serving model serialisation functionality;

<a name="news"></a>News and updates:

  • 10.01.2019:

    • docker CPU version is now available, contributed by Signalprime, (https://github.com/signalprime), see btgym/docker/README.md for details;
  • 9.02.2019:

  • 25.01.2019: updates:

    • lstm_policy class now requires both internal and external observation sub-spaces to be present and allows both to be one-level nested sub-spaces themselves (previously this was only true for external); all declared sub-spaces are encoded by separate convolution encoders;
    • policy deterministic action option is implemented for discrete action spaces and can be utilised by syncro_runner; by default it is enabled for test episodes;