PGAPy: Python Wrapper for PGAPack Parallel Genetic Algorithm Library
====================================================================
.. |--| unicode:: U+2013   .. en dash
.. |epsilon| unicode:: U+03B5   .. epsilon
:Author: Ralf Schlatterbeck rsc@runtux.com
News
----

News July 2025:
Implement new crowding metrics. The original crowding metric in NSGA-II
does not work very well, especially in higher dimensions. The update
implements three new metrics. An example is given in the following
figures. New metrics are used depending on the number of objectives.
More details can be found in the section on population replacement_ in
the PGAPack_ user guide.
|fig1| |fig2|
.. |fig1| image:: examples/crowding-nsga.png
   :width: 45%
.. |fig2| image:: examples/crowding-mnn.png
   :width: 45%
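For reference, the original NSGA-II crowding metric that the new
variants replace can be sketched in a few lines of plain Python. This
is an illustration only, not the PGAPy implementation:

```python
def crowding_distance(points):
    """Classic NSGA-II crowding distance for a front of objective
    vectors.  Boundary points per objective get infinite distance;
    interior points accumulate the normalized gap between their two
    neighbors along each objective axis."""
    n = len(points)
    if n == 0:
        return []
    n_obj = len(points[0])
    dist = [0.0] * n
    for k in range(n_obj):
        # Sort indices by objective k
        order = sorted(range(n), key=lambda i: points[i][k])
        lo = points[order[0]][k]
        hi = points[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float('inf')
        if hi == lo:
            continue  # degenerate objective, no spread
        for j in range(1, n - 1):
            dist[order[j]] += (points[order[j + 1]][k]
                               - points[order[j - 1]][k]) / (hi - lo)
    return dist
```

With three points on a line in 2-D objective space the two extreme
points get infinite distance and the middle one gets the sum of its
normalized neighbor gaps; in higher dimensions this metric degrades,
which is what motivates the new crowding metrics above.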
News May 2025:
- Implement permutation preserving crossover and mutation operators.
- Bug fixes (most notably a feature interaction between the hillclimber released in April and duplicate checking)
- Fix off-by-one error in two new crossover operators (2.7.2)
News April 2025:
- Add an optional hillclimb method that is called for all new
  individuals in the current generation. It can perform a hillclimbing
  heuristic that runs in parallel when a parallel version is in use.
- Add a new constructor parameter ``random_deterministic`` that tells
  the framework to use a random number generator seeded from the rank-0
  random number generator for each individual evaluated. This allows
  the use of random numbers during hillclimbing or evaluation but still
  produces random numbers that are reproducible across runs (with the
  same random seed, of course).
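The idea behind this deterministic seeding can be illustrated in plain
Python. This is an independent sketch of the scheme, not PGAPy code,
and ``per_individual_rngs`` is a hypothetical name: a master RNG seeded
once hands out one derived seed per individual, so evaluation may use
random numbers yet remain reproducible across runs.

```python
import random

def per_individual_rngs(master_seed, n_individuals):
    """Derive one independent RNG per individual from a single master
    seed (as the rank-0 process might).  Re-running with the same
    master seed reproduces exactly the same per-individual streams."""
    master = random.Random(master_seed)
    return [random.Random(master.randrange(2 ** 32))
            for _ in range(n_individuals)]

# Two runs with the same master seed yield identical random streams,
# so a stochastic hillclimber or evaluation stays reproducible.
run_a = [rng.random() for rng in per_individual_rngs(42, 3)]
run_b = [rng.random() for rng in per_individual_rngs(42, 3)]
```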
News Oct 2023:
- Add Differential Evolution for the integer datatype
- Add Negative Assortative Mating option
News 04-2023:

- The last build on PyPI was broken for serial installs: it was missing
  the ``mpi_stub.c`` needed for the serial version. Parallel installs
  were still possible so I didn't notice, sorry!
- Add ``MPI_Abort`` to the wrapper; it is called with::

      pga.MPI_Abort (errcode)

  See the section `Running with MPI`_ below for how this can be used to
  abort the MPI run in case of an exception in the ``evaluate`` method.
News 12-2022: Add regression tests and update to a new upstream version with several bug fixes. It also includes some bug fixes in the wrapper.
News 10-2022: Add user defined datatypes. The examples in
``examples/gp`` use user defined data types to implement genetic
programming (we represent expressions by a tree data structure). This
uses the new serialization API in pgapack to transfer a Python pickle
representation to peer MPI processes. Also incorporate the latest
changes in pgapack which optimize duplicate checking. This is of
interest for large population sizes in the genetic programming
examples. Note that the ``gene_difference`` method has been renamed to
``gene_distance``.
News 08-2022: Epsilon-constrained optimization and a crossover variant that preserves permutations (so with integer genes the gene can represent a permutation).
News 03-2022: Attempt to make this installable on Windows. This involves some workarounds in the code because the Microsoft Visual C compiler does not support certain C99 constructs.
News 01-2022: This version wraps many-objective optimization with NSGA-III (note the additional 'I').
News 12-2021: This version wraps multiple evaluation values from your objective function: Now you can return more than one value to either use it for constraints (that must be fulfilled before the objective is optimized) or for multi-objective optimization with the Nondominated Sorting Genetic Algorithm V.2 (NSGA-II). You can combine both, multi-objective optimization and constraints.
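Multi-objective optimization as in NSGA-II rests on Pareto dominance:
a solution dominates another if it is no worse in every objective and
strictly better in at least one. A minimal pure-Python illustration
(assuming minimization; these helpers are for exposition only and are
not part of the PGAPy API):

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimizing):
    a is <= b in every objective and < b in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def nondominated(points):
    """Return the points not dominated by any other point, i.e. the
    first nondominated front used by NSGA-II style replacement."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]
```

Constraints interact with this ordering in that infeasible individuals
are compared by constraint violation first; only once constraints are
fulfilled does dominance on the objectives decide.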
News: This version wraps the Differential Evolution method (that's quite an old method but is newly implemented in pgapack).
Introduction
------------
PGAPy is a wrapper for PGAPack_, the parallel genetic algorithm library
(see PGAPack Readme), a powerful genetic algorithm library by
D. Levine, Mathematics and Computer Science Division, Argonne National
Laboratory. The library is written in C. PGAPy wraps this library for
use with Python. The original PGAPack library is already quite old but
it is one of the most complete and accurate (and fast, although this is
not my major concern when wrapping it for Python) genetic algorithm
implementations out there with a lot of bells and whistles for
experimentation. It has also shown a remarkably small number of bugs
over the years. It supports parallel execution via the message passing
interface MPI_ in addition to a normal "serial" version. That's why I
wanted to use it in Python, too.
To get started you need the PGAPack_ library. Although it now comes
bundled with PGAPy, to install a parallel version you currently need a
pre-installed PGAPack_ compiled for the MPI library of your choice. See
the Installation_ section for details.
There currently is not much documentation for PGAPy.
You really, absolutely need to read the documentation that comes
with PGAPack_. See the documentation at Read the Docs.
The PGAPack user guide is now shipped together with PGAPy. It is
installed together with some examples in share/pgapy, wherever the
Python installer decides to place this in the final installation
(usually /usr/local/share on Linux).
The original PGAPack library can still be downloaded from the PGAPack
ftp site. It is written in ANSI C but will probably not compile against
a recent version of MPI. It will also not work with recent versions of
PGAPy. Note that this version is not actively maintained. I've started a
PGAPack fork on github_ where I've ported the library to the latest
version of the MPI_ standard and have fixed some minor inconsistencies
in the documentation. I've also implemented some new features, notably
enhancements in selection schemes, a new replacement strategy called
restricted tournament replacement [1]_, [2]_, [4]_ and, more recently,
the differential evolution strategy [5]_, [6]_. In addition this version
now supports multi-objective optimization with NSGA-II [7]_ and
many-objective optimization with NSGA-III [8]_, [9]_. It also supports
the Epsilon Constraint method [10]_.
Note: When using NSGA-III replacement for multi- (or many-) objective
optimization you need to either

- set reference points on the hyperplane intersecting all axes at
  offset 1. These reference points can be obtained with the convenience
  function ``pga.das_dennis``; it creates a regular set of reference
  points using an algorithm originally published by I. Das and
  J. E. Dennis [12]_. These points are then passed as the parameter
  ``reference_points`` to the ``PGA`` constructor. See
  ``examples/dtlz2.py`` for a usage example and the user guide for the
  bibliographic reference. The function gets the dimensionality of the
  objective space (``num_eval`` minus ``num_constraint``) and the
  number of partitions to use.
- Or set reference directions (in the objective space) with the
  ``reference_directions`` parameter, the number of partitions for
  these directions with the ``refdir_partitions`` parameter (see
  ``das_dennis`` above, this uses Das/Dennis points internally), and a
  scale factor with the ``refdir_scale`` parameter.

You can set both; these parameters are not mutually exclusive.
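For illustration, the regular lattice underlying Das/Dennis reference
points can be sketched in pure Python. The real convenience function is
``pga.das_dennis``; the name and signature below are my own, not the
PGAPy API:

```python
def das_dennis_points(n_obj, n_partitions):
    """Regular simplex lattice of Das/Dennis reference points: all
    vectors whose coordinates are multiples of 1/n_partitions and sum
    to exactly 1, i.e. points on the hyperplane intersecting every
    objective axis at offset 1."""
    points = []

    def rec(prefix, left, dims):
        if dims == 1:
            # Last coordinate takes whatever remains of the budget
            points.append(prefix + [left / n_partitions])
            return
        for i in range(left + 1):
            rec(prefix + [i / n_partitions], left - i, dims - 1)

    rec([], n_partitions, n_obj)
    return points
```

The number of points is the binomial coefficient C(H + M - 1, M - 1)
for M objectives and H partitions, e.g. 15 points for 3 objectives and
4 partitions, which grows quickly with the number of objectives.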
I'm mainly testing PGAPy on Linux. I've recently made it run on Windows, too, but I'm not very actively testing on Windows. Let me know if you run it on Windows, successfully or not.
As mentioned above, you can find my PGAPack fork on github, this
repository has the three upstream releases as versions in git and
contains some updates concerning support of newer MPI versions and
documentation updates. I've also included patches by the Debian
maintainer of the package, Dirk Eddelbuettel, in the git repository.
I'm actively maintaining that branch, adding new features and bug-fixes.
To get you started, I've included some very simple examples in
``examples``; e.g., ``one-max.py`` implements the "Maxbit" example
similar to one in the PGAPack documentation_. The examples were inspired
by the book "Genetic Algorithms in Python" but are written from scratch
and don't include any code from the book. The examples illustrate
several points:
- Your class implementing the genetic algorithm needs to inherit from
  ``pga.PGA`` (``pga`` is the PGAPy wrapper module).
- You need to define an evaluation function called ``evaluate`` that
  returns a sequence of numbers indicating the fitness of the gene
  given. It gets the parameters ``p`` and ``pop`` that can be used to
  fetch allele values from the gene using the ``get_allele`` method; for
  more details refer to the PGAPack documentation_. The number of
  evaluations returned by your function is defined with the constructor
  parameter ``num_eval``; the default for this parameter is 1. If your
  evaluation function does not return multiple evaluations (with the
  default setting of ``num_eval``) you can either return a one-element
  sequence or a single return value.
- When using multiple evaluations, these can either be used for
  constraints (the default) or for multi-objective optimization. In the
  latter case, the number of constraints (which by default is one less
  than the number of evaluations set with the parameter ``num_eval``)
  must be set to a number that leaves at least two evaluations for
  objectives. The number of constraints can be set with the parameter
  ``num_constraint``. When using multi-objective optimization, you need
  one of the two replacement types ``PGA_POPREPL_NSGA_II`` or
  ``PGA_POPREPL_NSGA_III``; set this with the ``pop_replace_type``
  parameter.
- You can define additional functions overriding built-in functions of
  the PGAPack library, illustrated by the example of ``print_string``.
  Note that we could call the original ``print_string`` method of our
  ``PGA`` superclass. In the same way you can implement, e.g., your own
  crossover method.
- The constructo
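The evaluation logic itself is ordinary Python. A minimal "one-max"
style fitness function, shown here independent of the ``pga`` module
for illustration:

```python
def one_max(alleles):
    """Count the 1-bits in a binary gene; higher is fitter.  In a real
    ``pga.PGA`` subclass the ``evaluate(self, p, pop)`` method would
    collect the alleles with ``self.get_allele`` and return this count
    (a single value or a one-element sequence)."""
    return sum(1 for allele in alleles if allele)
```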
