PyBacktesting
Optimizing the Elliott Wave Theory using genetic algorithms to forecast the financial markets.

Hi!
My name is Philippe. The goal of this project is to model the Elliott Wave Theory to forecast the financial markets. Once we have the model and know the parameters, we optimize it using a machine learning technique called a genetic algorithm. Then we test it using walk-forward optimization. The fitness function we use for both optimization and testing is the Sharpe ratio.
The experiment was carried out on the EUR/USD currency pair on an hourly basis. The period was from 2015/10 to 2020/04 (including 2 training and 2 testing periods). The training periods were 18 months each (from 2015-10-15 to 2017-04-15 and from 2018-01-15 to 2019-07-15) and the testing periods were 9 months each (from 2017-04-15 to 2018-01-15 and from 2019-07-15 to 2020-04-15).
The Sharpe ratio was above 3 for each training period (which is excellent). The results were mixed for the testing periods: the Sharpe ratio was 1.63 for the first testing period (which is really good) and -13.99 for the second, with 0 winning trades (which is really bad).
One of the issues with the model during the testing periods is that it generated few trades (11 in the first testing period and 10 in the second). This may be due to an over-optimized model, which caused overfitting. We could also test the same model on different assets and different timeframes.
The library is built so that it is possible to modify the trading strategy by creating modules in the different packages (indicators, optimize, trading_rules). For example, the current rule for trying to enter the market is when a trend is detected (r2 and Mann-Kendall). We could create a new module that tries to enter the market using a mean-reversion strategy like when the market is 2 standard deviations from the average price of the last 100 days (in the trading_rules package).
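For instance, such a mean-reversion entry rule could be sketched as below. This is a hypothetical module, not part of the library; the window length (100 periods) and the 2-standard-deviation threshold come from the example above:

```python
import pandas as pd

def mean_reversion_entry(close: pd.Series, window: int = 100,
                         nb_std: float = 2.0) -> pd.Series:
    """Return +1 (try to buy) when the price is `nb_std` standard deviations
    below the rolling mean, -1 (try to sell) when above, 0 otherwise."""
    mean = close.rolling(window).mean()
    std = close.rolling(window).std()
    signal = pd.Series(0, index=close.index)
    signal[close < mean - nb_std * std] = 1   # oversold -> buy signal
    signal[close > mean + nb_std * std] = -1  # overbought -> sell signal
    return signal
```

A new module following this pattern would live in the `trading_rules` package alongside `r_square_tr.py`.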
To find more details about this project, scroll down
The project structure:
├── EURUSD.csv <- Data
├── LICENSE.txt <- License
├── README.md <- ReadMe document
├── __init__.py
├── charting.py <- Charting module
├── date_manip.py <- Module to manipulate dates
├── entry <- Package that tries to enter the market (with different modules)
│   ├── __init__.py
│   └── entry_fibo.py <- Module that tries to enter the market using the Fibonacci technique
├── exit <- Package that tries to exit the market (with different modules)
│   ├── __init__.py
│   └── exit_fibo.py <- Module that tries to exit the market using the Fibonacci technique
├── indicator.py <- Returns the values of the indicators of our choice
├── indicators <- Package that evaluates the indicators
│   ├── __init__.py
│   └── regression
│       ├── __init__.py
│       ├── linear_regression.py <- Module that evaluates the slope and r_square of a series
│       └── mann_kendall.py <- Module that performs the Mann-Kendall test
├── init_operations.py <- Module that resets the necessary values
├── initialize.py <- Module that declares hyperparameters and parameters to optimize
├── main.py <- Main module that executes the program
├── manip_data.py <- Helper module to manipulate csv files and pandas DataFrames
├── math_op.py <- Support module for mathematical operations
├── optimize <- Package with optimization techniques
│   ├── __init__.py
│   └── genetic_algorithm.py <- Module that uses a genetic algorithm to optimize
├── optimize_.py <- Module that runs the optimization process if desired
├── pnl.py <- Module to assess the trading strategy performance
└── trading_rules <- Package with possible trading rules
    ├── __init__.py
    └── r_square_tr.py <- Module that detects buy and sell signals with r_square and the Mann-Kendall test
To see the list of hyperparameters and parameters to optimize, go to this file
Each .py file has its docstring, so make sure to check it out to understand the details.
To find out more about me
For questions or comments, please feel free to reach out on LinkedIn
Part 1 - DEFINING
---- Defining the problem ----
The goal of this project is to model the Elliott Wave Theory to forecast the financial markets. Once we have the model and know the parameters, we optimize it using a machine learning technique called a genetic algorithm and test it in a different period (walk-forward optimization). The fitness function used for optimization and testing is the Sharpe ratio.
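For reference, the Sharpe ratio used as the fitness function can be computed from per-period returns roughly as follows. This is a sketch that assumes a zero risk-free rate; the annualization factor for hourly data (252 trading days x 24 hours) is an assumption, not a value taken from the library:

```python
import numpy as np

def sharpe_ratio(returns, periods_per_year: int = 252 * 24) -> float:
    """Annualized Sharpe ratio of per-period returns (risk-free rate = 0)."""
    returns = np.asarray(returns, dtype=float)
    std = returns.std(ddof=1)  # sample standard deviation
    if std == 0:
        return 0.0  # avoid division by zero for constant returns
    return float(np.sqrt(periods_per_year) * returns.mean() / std)
```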
There is currently no established technique to model the Elliott Wave Theory, as it is difficult to formalize and its application is highly subjective. To understand the concept of the Elliott Wave Theory, refer to this post.
Since the optimization space of a trading strategy can be complex, genetic algorithms are an efficient machine learning technique to find a good approximation of the optimal solution. They mimic the biological process of evolution.
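As an illustration, a minimal genetic algorithm for real-valued parameters might look like the sketch below (tournament selection, uniform crossover, Gaussian mutation, elitism). This is a generic toy, not the project's `genetic_algorithm.py`:

```python
import random

def genetic_optimize(fitness, bounds, pop_size=30, generations=60,
                     mutation_rate=0.1, elite=2, seed=42):
    """Maximize `fitness` over real-valued parameters.
    `bounds` is a list of (low, high) tuples, one per parameter."""
    rng = random.Random(seed)
    # Random initial population within the bounds
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        next_pop = scored[:elite]                       # elitism: keep the best
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)   # tournament selection
            p2 = max(rng.sample(pop, 3), key=fitness)
            # Uniform crossover: each gene comes from either parent
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Gaussian mutation, clamped to the bounds
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = min(hi, max(lo, child[i] + rng.gauss(0, (hi - lo) * 0.1)))
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)
```

In the real library the fitness function would be the Sharpe ratio of a backtest run with the candidate parameters.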

Part 2 - DISCOVER
---- Loading the data ----
The experiment was carried out on the EUR/USD currency pair on an hourly basis. The period was from 2015/10 to 2020/04 (including 2 training and 2 testing periods). The training periods were 18 months each (from 2015-10-15 to 2017-04-15 and from 2018-01-15 to 2019-07-15) and the testing periods were 9 months each (from 2017-04-15 to 2018-01-15 and from 2019-07-15 to 2020-04-15).
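The training/testing windows above can be generated programmatically. Below is a sketch; `walk_forward_windows` is a hypothetical helper, not part of the library, and the "roll forward by one full train+test block" rule is inferred from the dates quoted above:

```python
import pandas as pd

def walk_forward_windows(start, end, train_months=18, test_months=9):
    """Yield (train_start, train_end, test_start, test_end) tuples,
    advancing by one full train+test block each step."""
    windows = []
    train_start = pd.Timestamp(start)
    end = pd.Timestamp(end)
    while True:
        train_end = train_start + pd.DateOffset(months=train_months)
        test_end = train_end + pd.DateOffset(months=test_months)
        if test_end > end:
            break
        # The test window begins where the training window ends
        windows.append((train_start, train_end, train_end, test_end))
        train_start = test_end
    return windows
```

With `start='2015-10-15'` and `end='2020-04-15'`, this reproduces the two train/test pairs listed above.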
The data source for this experiment was Dukascopy, as it required a lot of data. The program reads the data in csv format. If you want to run an experiment on a different asset and/or timeframe, make sure to load the data in the folder of your choice and change the path with the variable self.directory in initialize.py.
If less data is needed for an experiment, or if the experiment is carried out on daily data, the Alpha Vantage API is a great source of free, quality data (with certain restrictions, like a maximum number of API calls per minute). This is a great article on the Alpha Vantage API.
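As a sketch, loading a csv of hourly prices and slicing it to a training window could look like this. `load_prices` is a hypothetical helper, not the library's own loader; the column names follow the `series_.head()` output shown further down:

```python
import pandas as pd

def load_prices(path, start_date, end_date):
    """Load an OHLC csv, parse the Date column, and slice to [start, end)."""
    series_ = pd.read_csv(path, parse_dates=['Date'])
    mask = (series_['Date'] >= start_date) & (series_['Date'] < end_date)
    return series_.loc[mask].reset_index(drop=True)
```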
Parameters
----------
`self.directory` : str
Where the data for the training and testing periods are located
`self.asset` : str
Name of the file where we get the data
`self.is_fx` : bool
Tell if `self.asset` is forex (the data don't have the same format for forex and stocks because they are
from different providers).
`self.dir_output` : str
Where we store the results
`self.name_out` : str
Name of the results file name (csv)
`self.start_date` : datetime object
Beginning date of training and testing period. The variable is already transformed from a str to a
Datetime object
`self.end_date` : datetime object
Ending date of training and testing period. The variable is already transformed from a str to a
Datetime object
self.directory = '/Users/philippeostiguy/Desktop/Trading/Programmation_python/Trading/'
self.dir_output = '/Users/philippeostiguy/Desktop/Trading/Programmation_python/Trading/results/'
self.name_out = 'results'
self.is_fx = True
self.asset = "EURUSD"
self.start_date = datetime.strptime('2015-10-15', "%Y-%m-%d")
self.end_date = datetime.strptime('2016-02-18', "%Y-%m-%d")
We can examine our data:
series_.head()
| #   | Date                | Open    | High    | Low     | Adj Close |
|-----|---------------------|---------|---------|---------|-----------|
| 96  | 2015-10-15 00:00:00 | 1.14809 | 1.14859 | 1.14785 | 1.14801   |
| 97  | 2015-10-15 01:00:00 | 1.14802 | 1.14876 | 1.14788 | 1.14828   |
| 98  | 2015-10-15 02:00:00 | 1.14831 | 1.14950 | 1.14768 | 1.14803   |
| 99  | 2015-10-15 03:00:00 | 1.14802 | 1.14826 | 1.14254 | 1.14375   |
| 100 | 2015-10-15 04:00:00 | 1.14372 | 1.14596 | 1.14335 | 1.14417   |
And check the length, the value types, and whether there are empty values (there are none):
series_.info()
| # | Column    | Non-Null Count | Dtype          |
|---|-----------|----------------|----------------|
| 0 | Date      | 28176 non-null | datetime64[ns] |
| 1 | Open      | 28176 non-null | float64        |
| 2 | High      | 28176 non-null | float64        |
| 3 | Low       | 28176 non-null | float64        |
| 4 | Adj Close | 28176 non-null | float64        |
---- Cleaning the data ----
In manip_data.py, we drop NaN values, if any (there are none), and remove the rows where the market is closed with series_.drop_duplicates(keep=False, subset=list(dup_col.keys())):
series_ = series_.dropna()  # drop NaN values, if any
if dup_col is not None:
    # If all values in the columns of dup_col are identical, drop those rows
    series_ = series_.drop_duplicates(keep=False, subset=list(dup_col.keys()))
series_ = series_.reset_index(drop=True)
