# EXACT

The Evolutionary eXploration of Neural Networks Framework -- EXAMM, EXA-GP and EXACT

## Table of Contents
- EXAMM and EXA-GP Overview
- Installation and Setup
- Quickstart
- Managing Datasets
- Running EXAMM and EXA-GP
- Tracking and Managing Evolved Networks
- Using Evolved Neural Networks for Inference
## EXAMM and EXA-GP Overview
EXAMM (Evolutionary eXploration of Augmenting Memory Models) is a neuroevolution (evolutionary neural architecture search) algorithm which automates the design and training of recurrent neural networks (RNNs) for time series forecasting. EXAMM uses a constructive evolutionary process which evolves progressively larger RNNs through a set of mutation and crossover operations. EXAMM is a fine-grained neuroevolution algorithm, operating at the level of individual nodes and edges, which allows it to produce extremely efficient and minimal networks. It utilizes a library of various modern memory cells (LSTM, GRU, MGU, UGRNN, and Delta-RNN) [^examm_memory_cells] and can establish recurrent connections with varying time skips for improved learning and forecasting [^examm_deep_recurrent]. It also uses a Lamarckian weight inheritance strategy, allowing generated networks to re-use the weights of their parents, which reduces the amount of training by backpropagation required [^examm_lamarckian].
[^examm_memory_cells]: Alex Ororbia, AbdElRahman ElSaid, and Travis Desell. Investigating Recurrent Neural Network Memory Structures using Neuro-Evolution. <em>The Genetic and Evolutionary Computation Conference (GECCO 2019).</em> Prague, Czech Republic. July 8-12, 2019.
[^examm_deep_recurrent]: Travis Desell, AbdElRahman ElSaid and Alexander G. Ororbia. An Empirical Exploration of Deep Recurrent Connections Using Neuro-Evolution. The 23rd International Conference on the Applications of Evolutionary Computation (EvoStar: EvoApps 2020). Seville, Spain. April 15-17, 2020. <em>Best paper nominee</em>.
[^examm_lamarckian]: Zimeng Lyu, AbdElRahman ElSaid, Joshua Karns, Mohamed Mkaouer, Travis Desell. An Experimental Study of Weight Initialization and Lamarckian Inheritance on Neuroevolution. The 24th International Conference on the Applications of Evolutionary Computation (EvoStar: EvoApps 2021).
EXAMM has since been extended to the Evolutionary Exploration of Augmenting Genetic Programs (EXA-GP) algorithm, which replaces the memory cells of EXAMM with basic genetic programming (GP) operations (e.g., sum, product, sin, cos, tanh, sigmoid, inverse). EXA-GP has been shown to generate compact genetic programs (multivariate functions) for time series forecasting which can outperform the RNNs evolved by EXAMM while at the same time being more interpretable[^exagp][^exagp_min].
[^exagp]: Jared Murphy, Devroop Kar, Joshua Karns, and Travis Desell. EXA-GP: Unifying Graph-Based Genetic Programming and Neuroevolution for Explainable Time Series Forecasting. Proceedings of the Genetic and Evolutionary Computation Conference Companion. Melbourne, Australia. July 14-18, 2024.
[^exagp_min]: Jared Murphy, Travis Desell. Minimizing the EXA-GP Graph-Based Genetic Programming Algorithm for Interpretable Time Series Forecasting. Proceedings of the Genetic and Evolutionary Computation Conference Companion. Melbourne, Australia. July 14-18, 2024.
Implemented in C++, EXAMM and EXA-GP are designed for efficient CPU-based computation (for the RNNs typically used in time series forecasting, CPUs are generally more performant than GPUs) and offer excellent scalability due to their asynchronous island-based distributed strategy, with repopulation events which prune evolutionary dead ends to improve performance[^examm_islands]. They employ a distributed architecture where worker processes handle RNN training while a main process manages the island populations and orchestrates the overall evolutionary process. This allows for better performance via either multithreaded execution or distributed execution on high performance computing clusters via the Message Passing Interface (MPI).
[^examm_islands]: Zimeng Lyu, Joshua Karns, AbdElRahman ElSaid, Mohamed Mkaouer, Travis Desell. Improving Distributed Neuroevolution Using Island Extinction and Repopulation. The 24th International Conference on the Applications of Evolutionary Computation (EvoStar: EvoApps 2021).
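The island-based, steady-state evolutionary loop described above can be sketched in plain Python. This is an illustrative toy, not EXAMM's actual C++ implementation: the genome representation and fitness function are stand-ins, and in EXAMM the train-and-evaluate step is performed asynchronously by worker processes rather than inline.

```python
import random

random.seed(42)

def train_and_evaluate(genome):
    # Stand-in for a worker training the genome by backpropagation and
    # then measuring its fitness (error) on the validation data.
    return abs(genome["size"] - 7) + random.random() * 0.1

def mutate(genome):
    # Constructive mutation: grow the network by a node or edge.
    return {"size": genome["size"] + random.choice([0, 1])}

# Three islands, each a small population of minimal genomes.
islands = [[{"size": 1, "fitness": float("inf")} for _ in range(4)]
           for _ in range(3)]

for _ in range(200):
    # The main process picks an island and a parent and generates a child;
    # in EXAMM a worker then trains and evaluates it asynchronously.
    island = random.choice(islands)
    child = mutate(random.choice(island))
    child["fitness"] = train_and_evaluate(child)
    # Steady-state insertion: the child replaces the island's worst
    # genome if it has better (lower) fitness.
    worst = max(island, key=lambda g: g["fitness"])
    if child["fitness"] < worst["fitness"]:
        island[island.index(worst)] = child

best = min((g for island in islands for g in island),
           key=lambda g: g["fitness"])
```

Because insertion only ever replaces the worst genome within one island, islands explore independently and only interact through repopulation events (not shown here), which is what makes the strategy tolerant of asynchronous, out-of-order fitness results.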
## Installation and Setup
EXAMM and EXA-GP have been designed to have a fairly minimal set of requirements, and we recommend using either OSX or Linux. For Windows users, we recommend using the Windows Subsystem for Linux (WSL) to run EXAMM or EXA-GP in a Linux environment. EXAMM/EXA-GP use CMake to create a makefile for building (this can potentially also be used to create a Visual Studio project, however we have not tested this).
### OSX and Linux Setup
For OSX we recommend using Homebrew to handle installing packages; for Linux, please use your package manager of choice. Installing all of the required libraries below (or their Linux equivalents) should be sufficient to compile EXAMM/EXA-GP:

```
xcode-select --install
brew install cmake
brew install mysql
brew install open-mpi
brew install libtiff
brew install libpng
brew install clang-format
```
### Cluster Setup
The following is for internal use on RIT's high performance computing cluster; however, if your own computing cluster utilizes Spack you may find this useful.

```
# GCC (9.3)
spack load gcc/lhqcen5

# CMake
spack load cmake/pbddesj

# OpenMPI
spack load openmpi/xcunp5q

# libtiff
spack load libtiff/gnxev37
```
### Building
After the above libraries have been installed and/or loaded, compiling EXAMM/EXA-GP should be as simple as doing the following within your root EXAMM directory:

```
mkdir build
cd build
cmake ..
make
```
## Quickstart
For a quick start with example datasets using basic settings, the following scripts provide examples of running EXAMM on the coal benchmark datasets provided in this repository, using either the multithreaded version or the MPI version. For a deeper dive into EXAMM/EXA-GP's command line arguments, please see the Running EXAMM and EXA-GP section.
### Multithreaded Version

```
# In the root directory:
sh scripts/base_run/coal_mt.sh
```
### MPI Version

```
# In the root directory:
sh scripts/base_run/coal_mpi.sh
```
## Managing Datasets
EXAMM and EXA-GP are designed to use multivariate time series data as training and validation data. When EXAMM or EXA-GP generates a new recurrent neural network (RNN) or genetic program (GP), the RNN or GP is trained for a specified number of backpropagation epochs on the training data, and then its fitness is calculated by evaluating it on the validation data. Simple comma-separated value (CSV) files are used to represent the training and validation data (examples can be found within the datasets subdirectory of the project). The first row of each CSV file should contain the column headers (without a # character), and all columns should contain numerical values as data. For example:
`file1.csv`:

```
a,b,c,d
0.5,0.2,0.1,0.2
0.8,0.1,0.3,0.5
...
0.9,-0.2,0.2,0.6
```

`file2.csv`:

```
a,b,c,d
0.7,-0.2,0.7,0.3
0.6,-0.1,0.5,0.4
...
0.4,0.3,-0.1,0.6
```

`file3.csv`:

```
a,b,c,d
-0.5,0.6,0.5,0.9
-0.8,0.7,-0.3,0.8
...
-0.9,-0.8,-0.3,0.3
```
These three example files can be used for training and evolving the networks (either RNNs or GPs) as well as for validating their results to calculate fitness. Each is a four-column CSV file, with the first column named a, the second column named b, and so on. These column names are used to specify which columns serve as inputs to the evolved networks. The files used for training are specified with the `--training_filenames <str>+` command line option, and the files used for validation are specified with the `--validation_filenames <str>+` command line option. Similarly, `--input_parameter_names <str>+` specifies which columns are used as inputs to the networks, and `--output_parameter_names <str>+` specifies which columns are being predicted (i.e., the outputs of the networks). Note that the same columns can be used for both inputs and outputs.
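As a concrete sketch, a run over the three example files might be launched as follows. The binary name and path (`./build/multithreaded/examm_mt`) are assumptions for illustration; the flags are those documented above.

```shell
# Hypothetical invocation: train on file1 and file2, validate on file3,
# using columns a-d as inputs and predicting column d.
./build/multithreaded/examm_mt \
    --training_filenames file1.csv file2.csv \
    --validation_filenames file3.csv \
    --input_parameter_names a b c d \
    --output_parameter_names d
```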
As the networks evolved are used for time series forecasting, the --time_offset <int> command line option specifies how far in the future (how many rows) the network is predicting. So if --time_offset 5 is specified the values from row 1 would be used to predict the values in row 6, the values in row 2 would be used to predict the values in row 7, and so on. --time_offset can also be set to 0 to predict the input data, which can be useful for evolving auto-encoder like networks.
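The row pairing above can be illustrated in plain Python (an illustration only, not EXAMM code):

```python
# With --time_offset 5, the values in row i are used to predict the
# values in row i + 5.
rows = list(range(1, 11))  # stand-in for the rows of a 10-row CSV file
offset = 5

pairs = [(rows[i], rows[i + offset]) for i in range(len(rows) - offset)]
print(pairs[:2])  # row 1 predicts row 6, row 2 predicts row 7: [(1, 6), (2, 7)]
```

Note that an offset of 5 over 10 rows yields only 5 usable input/target pairs; larger offsets shrink the effective training data.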
EXAMM and EXA-GP currently utilize unbatched stochastic gradient descent to train the evolved networks, so each training file specified is used as a sample, and the samples are randomly shuffled each epoch. We have found, however, that while memory cell recurrent architectures are intended to handle long term time dependencies well, in practice this is not necessarily the case. It is possible to improve performance by dividing the input time series data into smaller sequences[^examm_coal]. The `--train_sequence_length <int>` command line option can be used to specify how many rows to slice each training file into (if a file's rows are not evenly divisible by this number, the last slice will contain the remaining rows of the file).
[^examm_coal]: Zimeng Lyu, Shuchita Patw
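The slicing behavior of `--train_sequence_length` can be illustrated in plain Python (an illustration only, not EXAMM code):

```python
# A 10-row file sliced with --train_sequence_length 4 yields slices of
# 4, 4, and 2 rows: the final slice holds whatever rows remain.
rows = list(range(10))  # stand-in for a 10-row training file
length = 4

slices = [rows[i:i + length] for i in range(0, len(rows), length)]
print([len(s) for s in slices])  # → [4, 4, 2]
```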