Pgapack
Parallel Genetic Algorithm Library originally by David Levine from Argonne National Laboratory
Install / Use
/learn @schlatterbeck/PgapackREADME
.. |--| unicode:: U+2013 .. en dash
.. |examples/c/namefull.c| replace:: examples/c/namefull.c
.. |examples/c/udtstr.c| replace:: examples/c/udtstr.c
.. |examples/fortran/namefull.f| replace:: examples/fortran/namefull.f
.. |examples/fortran/constr.f| replace:: examples/fortran/constr.f
.. |examples/nsgaii/crowdingplot| replace:: examples/nsgaii/crowdingplot
PGAPack +++++++
PGAPack is a general-purpose, data-structure-neutral, parallel genetic algorithm library originally developed by David Levine at Argonne National Laboratory. It has libraries for C and Fortran. There are companion projects:
- PGAPy_, a Python wrapper for PGApack https://github.com/schlatterbeck/pgapy
debian-pgapack_, a project for building debian packages (for all MPI libraries supported by Debian): https://github.com/schlatterbeck/debian-pgapack
Documentation is on Read the Docs_.
Updates
Update July 2025:
Implement new crowding metrics. The original crowding metric in NSGA-II
does not work very well, especially in higher dimensions. The update
implements three new metrics. An example is given in the following
figures. New metrics are used depending on the number of objectives.
More details can be found in the user guide in the section on population replacement_.
|fig1| |fig2|
.. |fig1| image:: docs/crowding-nsga.png :width: 45%
.. |fig2| image:: docs/crowding-mnn.png :width: 45%
Update June 2025:
- Implement Jensen's algorithm for non-dominated sorting. The previous implementation was O(n2) while the new is O(n*log(n)(m-1)) where n is the population size and m is the number of objectives.
- Add maximization examples for NSGA-II
- Add coding conventions, mostly for use with AI assisted coding
- Document new Jensen algorithm
- Better document the "water" problem in
examples/nsgaii - Allow scaling of examples in
examples/nsgaiii: The DTLZ test problems allow scaling the number of objectives, remove hard-coded dimension, also allow to optimize these problems with NSGA-II (not just -III) and allow to measure timing - Replace all calls to
PGAErrorandPGAErrorPrintfwhere the flagPGA_FATALwas used with newPGAFatalPrintf: This allows hints for the compiler that this function doesn't return - Allow to set the old non-dominated sorting algorithm or a combined
algorithm that compares old and new versions with
PGASetSortND
Update May 2025:
- Implement permutation preserving crossover and mutation operators: By default integer genes are initialized with a permutation. But there was only the edge crossover (for TSP problems) that preserves the property that the gene is a permutation. We now have a bunch of new crossover operators that preserve the permuation property. We also add two new mutation operators (there was already the permute mutation operator) that preserve permutations.
- Bug Fix of feature interaction hillclimber and duplicate checking: This would trigger an assertion. There is still an interaction because the hillclimber can create new individuals that are checked for duplicates only with certain replacement strategies. We cannot get rid of this interaction with the current implementation (the idea of the hillclimber is that it runs in parallel on all MPI processors, these do not have access to the population and cannot check for duplicates).
- Bug Fix for RTR, PAIRWISE_BEST, and the two NSGA population replacement schemes: These could call the hillclimber and the evaluation function with empty individuals for the first generation (after generation 0 which is called on the old population). Typically these replacement schemes act on the whole population, the bug manifested only when replacing only part of the population each generation (setting PGASetNumReplaceValue to a value smaller than PGASetPopSize).
- Bug fix of type definitions of IntegerMin and IntegerMax, this is an internal data structure for init ranges and could create problems for architectures where a long int is not the same size as an int. We also refactored the integer permutation initialization to shuffle items in place.
2nd update April 2025:
- Implement new hillclimbing user-function: For the parallel version this is called in parallel on all evaluated individuals
- Implement deterministic random number generator for the parallel case: This is achieved by re-seeding a second random number generator for the evaluation and hillclimbing functions so that we get consistent random numbers in the hillclimber (or in the evaluation function) even when running the parallel version.
update April 2025:
- Fix memory leaks and add valgrind target to Makefile. Note that the memory leaks only involved allocating some memory at the start and never freeing it |--| there was no memory leaked during the run. Thanks to Joachim Comes for providing debug traces and static checks for tracking the memory leaks.
- Add install target and create a .pc file for pkg-config.
- Fix the 'soname' for the shared library, seems like so far pgapack (but not mpi) was linked statically. We now run the tests against the shared lib.
- Document visualization of multi-objective optimization
2nd update Oct 2023:
- Implement Negative Assortative Mating, a mating restriction that tries to mate individuals with large genetic distance.
Oct 2023:
- Add Differential Evolution with integer genes
- Fix feature interaction bug with population replacement RTR or population replacement pairwise best and the no duplicates flag.
Apr 2023:
- Add MPI_Abort to the fake mpi wrapper
- Add missing MPI_Finalize() prototype to fakempi include
3rd update Jan 2023:
- Now the user-guide is converted to sphinx. The old LaTeX version can still be built but will vanish eventually
- You can access the documentation on
Read the Docs_ now.
2nd update Jan 2023:
- Generalize mean hamming distance reporting to mean genetic distance reporting. This now works with all data types not just binary. This uses the already-existing genetic distance user function which is implemented for all builtin data types and can be overridden to use euclidian distance instead of the default manhattan distance for integer and real data types.
- Deprecated PGAHammingDistance in favor of PGAGeneDistance. There is a backward-compatible define for C but not for Fortran. It's doubful anybody has ever used this in custom code (it was mainly used in built-in reporting of the hamming distance of binary strings).
Update Jan 2023:
- Add Sphinx documentation
- Generate manual pages from Sphinx docs, this has a lot of bug-fixes in the manpages (e.g. wrong documentation of return value) and documents some functions that did previously not have a manual page
- Default for PGASetSumConstraintsFlag is now PGA_TRUE, works more reliable, this is only relevant when using one of the NSGA multi objective algorithms
Update Dec 2022:
- Bug fixes discovered during implementation of a regression test for the python wrapper
- Output of string printing and result printing can now be redirected to a file
Update Oct 2022:
- Use hashing for comparing individuals when the NoDuplicates flag is
turned on. Previously the comparison operations were O(n²) in the
population size. Now the effort is linear. The downside is that when
you have a custom comparison function with PGA_USERFUNCTION_DUPLICATE
you also need a hash function with PGA_USERFUNCTION_HASH. This always
applies when you have user-defined datatypes (and want to use the
NoDuplicates flag). Examples for C are in |examples/c/namefull.c|_ and
in |examples/c/udtstr.c|_ (for a user defined datatype) and for Fortran
in |examples/fortran/namefull.f|_ (there are no user-defined datatypes
in Fortran). The good news is that there is a utility function
PGAUtilHashthat you can use when implementing a custom hashing function. - Factor MPI serialization: When serializing an Individual, pgapack needs certain fields to be sent together with the gene, this is now in its own function. This function can also be used for MPI serialization when using user-defined datatypes with MPI: In that case a user function PGA_USERFUNCTION_BUILDDATATYPE has to be written. The new code substantially reduces the boilerplate code for writing a function for building an MPI datatype and will probably need no updates when the pgapack internal information changes. An example is in |examples/c/udtstr.c|_ and in the user guide.
- Add a new serialization API for MPI serialization that can be used instead of PGA_USERFUNCTION_BUILDDATATYPE. This is especially useful when the user-defined datatype is variable length. We send the length of the serialization in a first MPI message before sending the (variable length) individual. Since we're not using multicast, this works fine for transferring variable-size information with MPI. This new API will be used in the companion-project PGAPy_ for user defined datatypes in python.
- Bug-fix in multi-objective optimization: When evaluations are exactly equal the ranking would not correctly compute the dominance relation
- Bug-fix in multi-objective optimization: The crowding metric was not properly initialized resulting sometimes in different optimization paths when compiled with/without optimization (-O2 and -O3 in gcc)
- Fix feature interaction between multi-objective optimization and the NoDuplicates flag: When combining two populations in the multi objective optimization algorithms (NSGA-II and NSGA-III) where both populations contain instances of the same indidivual, duplicates would result.
- The script for plotting the pareto front for 3-dimensional problems
used to be in
examples/nsgaiii/crowdingplot3(this was already a symbolic link in the latest releases) and is now gone, use the-3option for |examples/nsgaii/c
