PyBacktesting
Optimizing the Elliott Wave Theory using genetic algorithms to forecast the financial markets.

Hi!
My name is Philippe. The goal of this project is to model the Elliott Wave Theory to forecast the financial markets. Once we have the model and know the parameters, we optimize it using a machine learning technique called a genetic algorithm. Then we test it using walk-forward optimization. The fitness function we use for both optimization and testing is the Sharpe ratio.
The experiment was carried out on the EUR/USD currency pair on an hourly basis. The period was from 2015/10 to 2020/04 (including 2 training and 2 testing periods). The training periods were 18 months each (from 2015-10-15 to 2017-04-15 and from 2018-01-15 to 2019-07-15) and the testing periods were 9 months each (from 2017-04-15 to 2018-01-15 and from 2019-07-15 to 2020-04-15).
The Sharpe ratio was above 3 for each training period (which is excellent). The results were mixed for the testing periods: the Sharpe ratio was 1.63 for the first testing period (which is really good) and -13.99 for the second, with 0 winning trades (which is really bad).
One of the issues with the model during the testing periods is that it generated few trades (11 in the first testing period and 10 in the second). This may be due to an over-optimized model, which caused overfitting. We could also test the same model on different assets and different timeframes.
The library is built so that it is possible to modify the trading strategy by creating modules in the different packages (indicators, optimize, trading_rules). For example, the current rule for trying to enter the market is when a trend is detected (r2 and Mann-Kendall). We could create a new module that tries to enter the market using a mean-reversion strategy like when the market is 2 standard deviations from the average price of the last 100 days (in the trading_rules package).
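For instance, such a mean-reversion entry rule could be sketched as below. This is a hypothetical module, not part of the library; the window length (100 periods) and the 2-standard-deviation threshold come from the example above:

```python
import pandas as pd

def mean_reversion_entry(close: pd.Series, window: int = 100,
                         nb_std: float = 2.0) -> pd.Series:
    """Return +1 (try to buy) when the price is `nb_std` standard deviations
    below the rolling mean, -1 (try to sell) when above, 0 otherwise."""
    mean = close.rolling(window).mean()
    std = close.rolling(window).std()
    signal = pd.Series(0, index=close.index)
    signal[close < mean - nb_std * std] = 1   # oversold -> buy signal
    signal[close > mean + nb_std * std] = -1  # overbought -> sell signal
    return signal
```

A new module following this pattern would live in the `trading_rules` package alongside `r_square_tr.py`.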
To find more details about this project, scroll down
The project structure:
├── EURUSD.csv <- Data
├── LICENSE.txt <- License
├── README.md <- ReadMe document
├── __init__.py
├── charting.py <- Charting module
├── date_manip.py <- Module to manipulate dates
├── entry <- Package that tries to enter the market (with different modules)
│   ├── __init__.py
│   └── entry_fibo.py <- Module that tries to enter the market using the Fibonacci technique
├── exit <- Package that tries to exit the market (with different modules)
│   ├── __init__.py
│   └── exit_fibo.py <- Module that tries to exit the market using the Fibonacci technique
├── indicator.py <- Returns the values of the indicators of our choice
├── indicators <- Package that evaluates the indicators
│   ├── __init__.py
│   └── regression
│       ├── __init__.py
│       ├── linear_regression.py <- Module that evaluates the slope and r_square of a series
│       └── mann_kendall.py <- Module that performs the Mann-Kendall test
├── init_operations.py <- Module that resets the necessary values
├── initialize.py <- Module that declares hyperparameters and parameters to optimize
├── main.py <- Main module that executes the program
├── manip_data.py <- Helper module to manipulate csv files and pandas DataFrames
├── math_op.py <- Support module for mathematical operations
├── optimize <- Package with optimization techniques
│   ├── __init__.py
│   └── genetic_algorithm.py <- Module that uses a genetic algorithm to optimize
├── optimize_.py <- Module that runs the optimization process if desired
├── pnl.py <- Module to assess the trading strategy performance
└── trading_rules <- Package with possible trading rules
    ├── __init__.py
    └── r_square_tr.py <- Module that detects buy and sell signals with r_square and the Mann-Kendall test
To see the list of hyperparameters and parameters to optimize, go to this file
Each .py file has its docstring, so make sure to check it out to understand the details.
To find out more about me
For questions or comments, please feel free to reach out on LinkedIn
Part 1 - DEFINING
---- Defining the problem ----
The goal of this project is to model the Elliott Wave Theory to forecast the financial markets. Once we have the model and know the parameters, we optimize it using a machine learning technique called a genetic algorithm and test it in a different period (walk-forward optimization). The fitness function used for optimization and testing is the Sharpe ratio.
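For reference, the Sharpe ratio used as the fitness function can be computed from per-period returns roughly as follows. This is a sketch that assumes a zero risk-free rate; the annualization factor for hourly data (252 trading days x 24 hours) is an assumption, not a value taken from the library:

```python
import numpy as np

def sharpe_ratio(returns, periods_per_year: int = 252 * 24) -> float:
    """Annualized Sharpe ratio of per-period returns (risk-free rate = 0)."""
    returns = np.asarray(returns, dtype=float)
    std = returns.std(ddof=1)  # sample standard deviation
    if std == 0:
        return 0.0  # avoid division by zero for constant returns
    return float(np.sqrt(periods_per_year) * returns.mean() / std)
```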
There is currently no established technique to model the Elliott Wave Theory, as it is difficult to formalize and its application is highly subjective. To understand the concept of the Elliott Wave Theory, refer to this post.
Since the optimization space of a trading strategy can be complex, genetic algorithms are an efficient machine learning technique to find a good approximation of the optimal solution. They mimic the biological process of evolution.
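As an illustration, a minimal genetic algorithm for real-valued parameters might look like the sketch below (tournament selection, uniform crossover, Gaussian mutation, elitism). This is a generic toy, not the project's `genetic_algorithm.py`:

```python
import random

def genetic_optimize(fitness, bounds, pop_size=30, generations=60,
                     mutation_rate=0.1, elite=2, seed=42):
    """Maximize `fitness` over real-valued parameters.
    `bounds` is a list of (low, high) tuples, one per parameter."""
    rng = random.Random(seed)
    # Random initial population within the bounds
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        next_pop = scored[:elite]                       # elitism: keep the best
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)   # tournament selection
            p2 = max(rng.sample(pop, 3), key=fitness)
            # Uniform crossover: each gene comes from either parent
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Gaussian mutation, clamped to the bounds
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = min(hi, max(lo, child[i] + rng.gauss(0, (hi - lo) * 0.1)))
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)
```

In the real library the fitness function would be the Sharpe ratio of a backtest run with the candidate parameters.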

Part 2 - DISCOVER
---- Loading the data ----
The experiment was carried out on the EUR/USD currency pair on an hourly basis. The period was from 2015/10 to 2020/04 (including 2 training and 2 testing periods). The training periods were 18 months each (from 2015-10-15 to 2017-04-15 and from 2018-01-15 to 2019-07-15) and the testing periods were 9 months each (from 2017-04-15 to 2018-01-15 and from 2019-07-15 to 2020-04-15).
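The training/testing windows above can be generated programmatically. Below is a sketch; `walk_forward_windows` is a hypothetical helper, not part of the library, and the "roll forward by one full train+test block" rule is inferred from the dates quoted above:

```python
import pandas as pd

def walk_forward_windows(start, end, train_months=18, test_months=9):
    """Yield (train_start, train_end, test_start, test_end) tuples,
    advancing by one full train+test block each step."""
    windows = []
    train_start = pd.Timestamp(start)
    end = pd.Timestamp(end)
    while True:
        train_end = train_start + pd.DateOffset(months=train_months)
        test_end = train_end + pd.DateOffset(months=test_months)
        if test_end > end:
            break
        # The test window begins where the training window ends
        windows.append((train_start, train_end, train_end, test_end))
        train_start = test_end
    return windows
```

With `start='2015-10-15'` and `end='2020-04-15'`, this reproduces the two train/test pairs listed above.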
The data source for this experiment was Dukascopy, as it required a lot of data. The program reads the data in csv format. If you want to run an experiment on a different asset and/or timeframe, make sure to load the data in the folder of your choice and change the path with the variable self.directory in initialize.py.
If less data is needed for an experiment, or if the experiment is carried out on daily data, the Alpha Vantage API is a great source of free, quality data (with certain restrictions, like a maximum number of API calls per minute). This is a great article on the Alpha Vantage API.
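As a sketch, loading a csv of hourly prices and slicing it to a training window could look like this. `load_prices` is a hypothetical helper, not the library's own loader; the column names follow the `series_.head()` output shown further down:

```python
import pandas as pd

def load_prices(path, start_date, end_date):
    """Load an OHLC csv, parse the Date column, and slice to [start, end)."""
    series_ = pd.read_csv(path, parse_dates=['Date'])
    mask = (series_['Date'] >= start_date) & (series_['Date'] < end_date)
    return series_.loc[mask].reset_index(drop=True)
```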
Parameters
----------
`self.directory` : str
Where the data for the training and testing periods are located
`self.asset` : str
Name of the file where we get the data
`self.is_fx` : bool
Tell if `self.asset` is forex (the data don't have the same format for forex and stocks because they are
from different providers).
`self.dir_output` : str
Where we store the results
`self.name_out` : str
Name of the results file name (csv)
`self.start_date` : datetime object
Beginning date of training and testing period. The variable is already transformed from a str to a
Datetime object
`self.end_date` : datetime object
Ending date of training and testing period. The variable is already transformed from a str to a
Datetime object
self.directory = '/Users/philippeostiguy/Desktop/Trading/Programmation_python/Trading/'
self.dir_output = '/Users/philippeostiguy/Desktop/Trading/Programmation_python/Trading/results/'
self.name_out = 'results'
self.is_fx = True
self.asset = "EURUSD"
self.start_date = datetime.strptime('2015-10-15', "%Y-%m-%d")
self.end_date = datetime.strptime('2016-02-18', "%Y-%m-%d")
We can examine our data:
series_.head()
| #   | Date                | Open    | High    | Low     | Adj Close |
|-----|---------------------|---------|---------|---------|-----------|
| 96  | 2015-10-15 00:00:00 | 1.14809 | 1.14859 | 1.14785 | 1.14801   |
| 97  | 2015-10-15 01:00:00 | 1.14802 | 1.14876 | 1.14788 | 1.14828   |
| 98  | 2015-10-15 02:00:00 | 1.14831 | 1.14950 | 1.14768 | 1.14803   |
| 99  | 2015-10-15 03:00:00 | 1.14802 | 1.14826 | 1.14254 | 1.14375   |
| 100 | 2015-10-15 04:00:00 | 1.14372 | 1.14596 | 1.14335 | 1.14417   |
And check the length, the value types, and whether there are empty values (there are none):
series_.info()
| # | Column    | Non-Null Count | Dtype          |
|---|-----------|----------------|----------------|
| 0 | Date      | 28176 non-null | datetime64[ns] |
| 1 | Open      | 28176 non-null | float64        |
| 2 | High      | 28176 non-null | float64        |
| 3 | Low       | 28176 non-null | float64        |
| 4 | Adj Close | 28176 non-null | float64        |
---- Cleaning the data ----
In manip_data.py, we drop NaN values, if any (there are none), and remove the rows where the market is closed with series_.drop_duplicates(keep=False, subset=list(dup_col.keys())):
series_ = series_.dropna()  # drop NaN values, if any
if dup_col is not None:
    # If all values in the columns of dup_col are identical, drop those rows
    series_ = series_.drop_duplicates(keep=False, subset=list(dup_col.keys()))
series_ = series_.reset_index(drop=True)
