Neuroevolution
Simulation of neural network evolution
Example of evolved neural network:

My project is to create neural networks that can evolve like living organisms. This mechanism of evolution is inspired by real-world biology and is heavily focused on biochemistry. Much like real living organisms, my neural networks consist of cells, each with its own genome and proteins. Proteins can express and repress genes, manipulate their own genetic code and other proteins, regulate neural network connections, facilitate gene splicing, and manage the flow of proteins between cells. All of this contributes to a complex gene regulatory network and an indirect encoding mechanism for neural networks, where even a single-letter mutation can cause dramatic changes to a model.
Some cool results of my neural network evolution after a few hundred generations of training on MNIST can be found in Google Drive.
How to evolve your own neural network (a short guide)
If you want to try evolving your own neural networks, you only need a Python interpreter and TensorFlow installed. And the code, of course!
Start with population.py: run the script from a terminal (in my case, zsh on macOS).
python3 path/to/destination/population.py
The default number of neural networks in a population is 10 and the maximum development time is 12 seconds per network, so it will take about 120 seconds to develop all NNs.
<img width="282" alt="Screenshot" src="https://user-images.githubusercontent.com/121340828/219373144-2700b606-f5a1-4d6a-ba15-74376229ea2b.png"> <img width="500" alt="Screenshot1" src="https://user-images.githubusercontent.com/121340828/219373597-32d733af-42a2-4e7f-9aad-6f8c2e036c07.png">
Then each network will train on the MNIST dataset for 3 epochs and be evaluated. The learning process is shown interactively, so you can see the accuracy each model reaches (from 0 to 1).
After every model has been evaluated, the best ones are selected and their genes are recombined. The new population is saved in the boost_perfomnc_gen folder, in the gen_N.json file, where N is the number of your generation.
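The select, recombine and save step can be sketched roughly as follows. This is only an illustration, not the code in population.py: the function names, the single-point crossover, the per-letter mutation, the number of parents and the layout of the JSON file are all assumptions. Only the folder name and the gen_N.json naming come from the text above.

```python
import json
import random
from pathlib import Path


def recombine(genome_a, genome_b, rng):
    """Single-point crossover: a prefix of one parent joined to the suffix of the other."""
    cut = rng.randint(0, min(len(genome_a), len(genome_b)))
    return genome_a[:cut] + genome_b[cut:]


def mutate(genome, rate, rng):
    """Replace each letter by a random A/C/G/T with probability `rate`."""
    return "".join(rng.choice("ACGT") if rng.random() < rate else base for base in genome)


def next_generation(genomes, accuracies, n_parents=2, mutation_rate=0.01, seed=None):
    """Keep the n_parents most accurate genomes and breed a same-sized population."""
    rng = random.Random(seed)
    order = sorted(range(len(genomes)), key=lambda i: accuracies[i], reverse=True)
    parents = [genomes[i] for i in order[:n_parents]]
    children = []
    while len(children) < len(genomes):
        mother, father = rng.sample(parents, 2)
        children.append(mutate(recombine(mother, father, rng), mutation_rate, rng))
    return children


def save_generation(genomes, generation, folder="boost_perfomnc_gen"):
    """Write the population to <folder>/gen_<generation>.json (file layout is assumed)."""
    path = Path(folder) / f"gen_{generation}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(genomes))
    return path
```

Calling next_generation(genomes, accuracies) and then save_generation(children, generation=1) would produce boost_perfomnc_gen/gen_1.json under these assumptions.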
If you would like to see the resulting neural network architecture:
- choose the last gen_N.json file (it represents the last generation of neural network models)
- open test.py
- on the 1st line of code, there will be: generation_file = "default_gen_284.json"
- change default_gen_284.json to gen_N.json
- by default, the 1st neural network in the population is chosen (neural_network_pop_number=0). Choose which exact network in the present generation you want to visualise (by default there are 10 NNs, index numbers 0-9)
- run the script
- the full model architecture will be saved as test_model.png
<img width="350" alt="Screenshot3" src="https://user-images.githubusercontent.com/121340828/219374366-49b94d8a-1327-42b4-bbea-55018217ded0.png">
About
The code for this project consists of three parts:
- Genpiler (a genetic compiler) - the heart of the evolution code, which simulates many known biochemistry processes of living organisms, transforming a sequence of "ACGT" letters (the genetic code) into a mature neural network with complex interconnections, defined matrix operations, activation functions, training parameters and meta parameters.
- Tensorflow_model.py transcribes the resulting neural network into a TensorFlow model.
- Population.py creates a population of neural networks, evaluates them on the MNIST dataset and creates a new generation by taking the best-performing networks, recombining their genomes (through sexual reproduction) and mutating them.
Other interesting results:
<img width="800" alt="Screenshot4" src="https://user-images.githubusercontent.com/121340828/219380757-25f5c0a7-241f-44d9-a3c5-c09e47681569.png"> <img width="1000" alt="Screenshot6" src="https://user-images.githubusercontent.com/121340828/219381971-6e978d77-562a-419a-9896-c38a8114e100.png">
How the genetic compiler works
Cells
Neural networks are composed of 3 basic units: cells, a list of common proteins, and metaparameters. Each cell is a basic unit of the neural network, and it carries out matrix operations in a TensorFlow model. In Python code, a cell is represented as a list containing these items: a genome, a protein dictionary, a cell name, connections, a matrix operation, an activation function, and weights.
- The genome is a sequence of arbitrary A, C, T, and G letter combinations. Over time, lowercase letters (a, c, t, g) may be included, to indicate sequences that are not available for transcription.
- The protein dictionary is a set of proteins, each represented by a sequence of A, C, T, and G letters, as well as a rate parameter. This rate parameter is a number between 1 and 4000, and it simulates the concentration rate of the protein. Some proteins can only be activated when the concentration reaches a certain level.
- The cell name is a specific sequence, in the same form as the protein and genome. It is used to identify specific cells and cell types, so that proteins can work with the exact cell and cell types. For example, a protein can work with all cells that have the sequence "ACTGACTGAC" in their name.
- The connections list shows all the forward connections of the cell.
- The matrix operation is one of the matrix operation types described in the TensorFlow documentation.
- The activation function is likewise one of the activation function types described in the TensorFlow documentation.
- The weights define the weights of the parameters in the TensorFlow model.
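As a rough illustration of the cell list described above, here is a minimal sketch. The item order follows the text, but the actual order and default values in the project code are assumptions, and make_cell and cells_matching are hypothetical helper names.

```python
# Index names for the items of a cell list (order assumed from the description above).
GENOME, PROTEINS, NAME, CONNECTIONS, MATRIX_OP, ACTIVATION, WEIGHTS = range(7)


def make_cell(genome, name):
    """Build a cell as a plain Python list with seven items."""
    return [
        genome,  # genome: A/C/G/T letters; lowercase marks non-transcribable parts
        {},      # protein dictionary: protein sequence -> rate (1 to 4000)
        name,    # cell name, a sequence in the same alphabet
        [],      # forward connections to other cells
        None,    # matrix operation (a TensorFlow operation type)
        None,    # activation function (a TensorFlow activation type)
        None,    # weights
    ]


def cells_matching(cells, pattern):
    """Return the cells whose name contains the given sequence."""
    return [cell for cell in cells if pattern in cell[NAME]]
```

For example, cells_matching(cells, "ACTGACTGAC") selects every cell a protein targeting that name sequence could work with.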
Common Proteins
Common proteins are similar to the proteins found in a single cell, but they play an important role in cell-to-cell communication. These proteins are able to move between cells, allowing them to act as a signaling mechanism or to perform other functions. For example, a protein may exit one cell and enter another cell through the common_proteins dictionary, allowing for communication between the two cells.
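A minimal sketch of this exchange, assuming both a cell's proteins and common_proteins are dictionaries mapping a protein sequence to its rate. The helper names and the choice to add rates together when a protein already exists at the destination are assumptions, not the project's actual behaviour.

```python
def export_protein(cell_proteins, common_proteins, sequence):
    """Move a protein out of a cell into the shared common_proteins dictionary."""
    rate = cell_proteins.pop(sequence)
    common_proteins[sequence] = common_proteins.get(sequence, 0) + rate


def import_protein(cell_proteins, common_proteins, sequence):
    """Move a protein from the shared dictionary into a cell."""
    rate = common_proteins.pop(sequence)
    cell_proteins[sequence] = cell_proteins.get(sequence, 0) + rate


# A protein leaves cell A and later enters cell B.
cell_a = {"AAAATTGCATAACGACGACGGC": 10}
cell_b = {}
common_proteins = {}
export_protein(cell_a, common_proteins, "AAAATTGCATAACGACGACGGC")
import_protein(cell_b, common_proteins, "AAAATTGCATAACGACGACGGC")
```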
Metaparameters:
- time_limit - maximum time for neural network development
- learning_rate = []
- mutation_rate = [None, None, None, None, None, None, None] (doesn’t work!)
Gene transcription and expression
Gene transcription
Before the development process starts, the first and only cell of the NN has a genome (which evolves over generations) and one protein: AAAATTGCATAACGACGACGGC. What does this protein do?

This is a gene transcription protein, and it starts a gene transcription cascade. To better understand its structure, let’s divide the protein into pieces:
AAAATT GC |ATA ACG ACG ACG| GC
The first 6 letters - AAAATT - indicate what type of protein it is. There are 23 types of different proteins, and this is type 1 - gene transcription protein. The sequence GCATAACGACGACGGC encodes how this protein works.
- If there are GTAA or GTCA sequences in the gene, the protein contains multiple “functional centers”: the program will cut the protein into multiple parts (according to how many GTAA or GTCA there are) and act as if these are different proteins. In this way, one protein can perform multiple functions of different protein types - it can express some genes and repress others, for example. If we add GTAA and the same AAAATTGCATAACGACGACGGC one more time, we will have the protein AAAATTGCATAACGACGACGGCGTAAAAAATTGCATAACGACGACGGC. The program will read this as one protein with two active sites and perform the same function twice in a row.
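The cutting step can be sketched with a regular expression. This is an illustration only: split_functional_centers is a hypothetical name, and the real program may treat edge cases (for example overlapping separators) differently.

```python
import re


def split_functional_centers(protein):
    """Cut a protein at every GTAA or GTCA separator and drop empty pieces."""
    return [part for part in re.split(r"GTAA|GTCA", protein) if part]


single = "AAAATTGCATAACGACGACGGC"
double = single + "GTAA" + single
centers = split_functional_centers(double)
```

Here centers holds two copies of the original protein, matching the "two active sites" case from the text.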
The GC part is called an exon cut, as you can see in the example. The pieces of the genome between the GC sites do the actual function, while the GC site itself acts as a separator for the parameters. I will show an example later. ATA ACG ACG ACG is the exon (parameter) of the gene transcription protein, divided into codons, which are three-letter sequences.
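To make the structure concrete, here is a minimal sketch that splits a protein into its 6-letter type prefix, its exons and their codons. The function name is hypothetical, and the plain split on "GC" is a simplifying assumption: it would misfire if "GC" appeared inside an exon, which the real program may handle differently.

```python
def parse_protein(protein):
    """Return (type prefix, list of exons), each exon given as a list of 3-letter codons."""
    protein_type, body = protein[:6], protein[6:]
    # GC sites separate the exons; empty pieces around them are dropped.
    exons = [exon for exon in body.split("GC") if exon]
    codons = [[exon[i:i + 3] for i in range(0, len(exon), 3)] for exon in exons]
    return protein_type, codons


protein_type, exons = parse_protein("AAAATTGCATAACGACGACGGC")
```

For the example protein this gives the type AAAATT and a single exon made of the codons ATA, ACG, ACG, ACG.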
Each protein, though it has a specific name, in this case "gene transcription activation," can do multiple things, for example:
- Express a gene at a specific site (shown later)
- Express such a gene with a specific rate (how much protein to express, usually 1-4000)
- Express such a gene at a controllable random rate (rate = randint(1, N), where N is a number that can be encoded in the exon)
- Pass a cell barrier and diffuse into the common_protein environment
The "gene transcription activation" protein can do all of these things, so each exon (protein parameter) encodes an exact action. The first codon (three-letter sequence) encodes what type of exon it is, and the other codons encode other information. In the example, the first codon, ATA, shows the type of parameter. ATA means that this is an expression site parameter, so the next three codons, ACG ACG ACG, specify the site to which the gene expression protein will bind to express a gene (shown in the example later). A special function, codons_to_nucl, is used to transcribe codons into a sequence in the ACTG alphabet. In our case, the ACG ACG ACG codons encode the sequence CTCTCT. This sequence will be used as a binding site.
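Reading an expression site exon could look roughly like this. It is a sketch, not the project's codons_to_nucl: the full codon table is not given here, so the table below contains only the one mapping that can be inferred from the example (three ACG codons giving CTCTCT, so ACG maps to CT, assuming each codon is translated independently). The names codons_to_nucl_sketch and binding_site are hypothetical.

```python
# Only ACG -> CT can be inferred from the example; every other entry is unknown.
KNOWN_CODONS = {"ACG": "CT"}


def codons_to_nucl_sketch(codons, table=KNOWN_CODONS):
    """Translate codons into nucleotides using a (partial) lookup table."""
    try:
        return "".join(table[codon] for codon in codons)
    except KeyError as err:
        raise ValueError(f"codon {err.args[0]} is not in the partial table") from err


def binding_site(exon_codons):
    """Decode an expression site exon: first codon ATA, the rest encode the site."""
    kind, *params = exon_codons
    if kind != "ATA":
        raise ValueError("not an expression site parameter")
    return codons_to_nucl_sketch(params)


site = binding_site(["ATA", "ACG", "ACG", "ACG"])
```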
Now that we understand how the protein AAAATTGCATAACGACGACGGC is read by the program and performs its function, let’s see how gene expression happens.
Gene expression
Imagine such a piece of genetic code is present in th
