MACE
MACE - Fast and accurate machine learning interatomic potentials with higher order equivariant message passing.
About MACE
MACE provides fast and accurate machine learning interatomic potentials with higher order equivariant message passing.
This repository contains the MACE reference implementation developed by Ilyes Batatia, Gregor Simm, David Kovacs, and the group of Gabor Csanyi, and friends (see Contributors).
Also available:
- MACE in JAX, currently about 2x faster at evaluation; training is recommended in PyTorch for optimal performance.
- MACE layers for constructing higher order equivariant graph neural networks for arbitrary 3D point clouds.
Documentation
Partial documentation is available at: https://mace-docs.readthedocs.io
Installation
1. Requirements
- Python >= 3.8 (for openMM, use Python = 3.9)
- PyTorch >= 1.12 (training with float64 is not supported with PyTorch 2.1 but is supported with 2.2 and later; PyTorch 2.4.1 is not supported)
Make sure to install PyTorch first. Please refer to the official PyTorch installation instructions and select the appropriate options for your system.
Installation from PyPI
This is the recommended way to install MACE.
pip install --upgrade pip
pip install mace-torch
Note: the identically named package on PyPI has nothing to do with this one.
Installation from source
git clone https://github.com/ACEsuit/mace.git
pip install ./mace
Usage
Training
To train a MACE model, use the mace_run_train script, which should be in the usual place that pip places binaries (or you can explicitly run python3 <path_to_cloned_dir>/mace/cli/run_train.py).
mace_run_train \
--name="MACE_model" \
--train_file="train.xyz" \
--valid_fraction=0.05 \
--test_file="test.xyz" \
--config_type_weights='{"Default":1.0}' \
--E0s='{1:-13.663181292231226, 6:-1029.2809654211628, 7:-1484.1187695035828, 8:-2042.0330099956639}' \
--model="MACE" \
--hidden_irreps='128x0e + 128x1o' \
--r_max=5.0 \
--batch_size=10 \
--max_num_epochs=1500 \
--stage_two \
--start_stage_two=1200 \
--ema \
--ema_decay=0.99 \
--amsgrad \
--restart_latest \
--device=cuda
To give a specific validation set, use the argument --valid_file. To set a larger batch size for evaluating the validation set, specify --valid_batch_size.
To control the model's size, you need to change --hidden_irreps. For most applications, the recommended default model size is --hidden_irreps='256x0e' (meaning 256 invariant messages) or --hidden_irreps='128x0e + 128x1o'. If the model is not accurate enough, you can include higher order features, e.g., 128x0e + 128x1o + 128x2e, or increase the number of channels to 256. It is also possible to specify the model using the --num_channels=128 and --max_L=1 keys.
It is usually preferred to add the isolated atoms to the training set, rather than reading in their energies through the command line like in the example above. To label them in the training set, set config_type=IsolatedAtom in their info fields.
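As a minimal sketch, such an isolated-atom entry in an extended-XYZ training file could be generated like this. Note the energy key name used here ("energy") is an assumption for illustration; it must match the --energy_key option used at training time.

```python
# Sketch: build a one-atom "IsolatedAtom" entry in extended-XYZ format.
# The "energy" key is an assumed default; match it to your --energy_key.
def isolated_atom_xyz(symbol: str, e0: float) -> str:
    lines = [
        "1",  # number of atoms in this configuration
        # Comment line carries the info fields, including config_type
        f'Properties=species:S:1:pos:R:3 config_type=IsolatedAtom '
        f'energy={e0} pbc="F F F"',
        f"{symbol} 0.0 0.0 0.0",  # single atom at the origin
    ]
    return "\n".join(lines) + "\n"

print(isolated_atom_xyz("H", -13.663181292231226))
```

The resulting block can be appended directly to train.xyz alongside the rest of the training configurations.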
When training a model from scratch, if you prefer not to use or do not know the energies of the isolated atoms, you can use the option --E0s="average" which estimates the atomic energies using least squares regression. Note that using fitted E0s corresponds to fitting the deviations of the atomic energies from the average, rather than fitting the atomization energy (which is the case when using isolated-atom E0s), and this will most likely result in less stable potentials for molecular dynamics applications.
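The least-squares estimate behind --E0s="average" can be sketched as follows; this is hypothetical standalone code, not the MACE implementation:

```python
import numpy as np

# Solve  counts @ e0 ≈ total_energies  for per-element atomic energies,
# where counts[i, j] is the number of atoms of element j in configuration i.
def estimate_e0s(counts: np.ndarray, energies: np.ndarray) -> np.ndarray:
    e0, *_ = np.linalg.lstsq(counts, energies, rcond=None)
    return e0

# Toy data: two elements with "true" E0s of -10.0 and -20.0 eV.
counts = np.array([[2.0, 1.0], [1.0, 3.0], [4.0, 0.0]])
true_e0 = np.array([-10.0, -20.0])
energies = counts @ true_e0  # exactly linear, so lstsq recovers true_e0
print(estimate_e0s(counts, energies))  # close to [-10., -20.]
```

Because the fit absorbs the average per-element energy, the model then learns deviations from that average rather than true atomization energies, as noted above.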
When finetuning foundation models, you can use --E0s="estimated", which estimates the atomic reference energies by solving a linear system that optimally corrects the foundation model's predictions on the training data. This approach computes E0 corrections by first running the foundation model on all training configurations, computing the prediction errors (reference energies minus predicted energies), and then solving a least-squares system to find optimal E0 corrections for each element. This is preferable in general over the 'average' option.
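The idea behind --E0s="estimated" can be sketched in the same spirit (again hypothetical standalone code): form the residuals between reference and foundation-model energies, then solve a least-squares system for per-element corrections.

```python
import numpy as np

# Solve  counts @ corrections ≈ (e_ref - e_pred)  for per-element E0
# corrections, where e_pred are foundation-model predictions on the
# training configurations and e_ref are the reference energies.
def estimate_e0_corrections(counts, e_ref, e_pred):
    residuals = np.asarray(e_ref) - np.asarray(e_pred)
    corrections, *_ = np.linalg.lstsq(np.asarray(counts), residuals, rcond=None)
    return corrections

# Toy data: reference energies shifted by +0.5 and -0.3 eV per element.
counts = np.array([[1.0, 2.0], [3.0, 1.0]])
e_pred = np.array([-50.0, -80.0])                 # foundation-model predictions
e_ref = e_pred + counts @ np.array([0.5, -0.3])   # shifted reference data
print(estimate_e0_corrections(counts, e_ref, e_pred))  # close to [0.5, -0.3]
```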
If the keyword --stage_two (previously called swa) is enabled, the energy weight of the loss is increased for the last ~20% of the training epochs (from --start_stage_two epochs). This setting usually helps lower the energy errors.
The precision can be changed using the keyword --default_dtype; the default is float64, but float32 gives a significant speed-up (usually about a factor of 2 in training).
The keywords --batch_size and --max_num_epochs should be adapted to the size of the training set: as the amount of training data grows, the batch size should be increased and the number of epochs decreased. A heuristic for initial settings is to keep the number of gradient updates roughly constant at 200,000, which can be computed as $\text{max-num-epochs}*\frac{\text{num-configs-training}}{\text{batch-size}}$.
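The heuristic above can be turned into a quick back-of-the-envelope calculation (a sketch; the helper function is hypothetical, not part of MACE):

```python
# Choose max_num_epochs so that the total number of gradient updates
# (epochs * configs / batch_size) is roughly constant at 200,000.
def suggested_epochs(num_configs: int, batch_size: int,
                     target_updates: int = 200_000) -> int:
    steps_per_epoch = num_configs / batch_size
    return round(target_updates / steps_per_epoch)

print(suggested_epochs(num_configs=1500, batch_size=10))  # 1333
```

For example, a training set of 1500 configurations at batch size 10 gives 150 updates per epoch, suggesting roughly 1333 epochs.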
The code can handle training sets with heterogeneous labels, for example containing both bulk structures with stress and isolated molecules. In this case, to make the code ignore the stress on molecules, add config_stress_weight = 0.0 to the info fields of your molecular configurations.
By default, a figure displaying the progression of loss and RMSEs during training, along with a scatter plot of the model's inferences on the train, validation, and test sets, will be generated in the results folder at the end of training. This can be disabled using --plot False. To track these metrics throughout training (excluding inference on the test set), you can enable periodic plotting for the train and validation sets by specifying --plot_frequency N, which updates the plots every Nth epoch.
Apple Silicon GPU acceleration
To use Apple Silicon GPU acceleration make sure to install the latest PyTorch version and specify --device=mps.
Multi-GPU training
For multi-GPU training, use the --distributed flag. This will use PyTorch's DistributedDataParallel module to train the model on multiple GPUs. Combine with on-line data loading for large datasets (see below). An example slurm script can be found in mace/scripts/distributed_example.sbatch.
YAML configuration
All or some arguments can be supplied via a YAML file. For example, to train a model using the arguments above, create a YAML file your_configs.yaml with the following content:
name: nacl
seed: 2024
train_file: train.xyz
stage_two: yes
start_stage_two: 1200
max_num_epochs: 1500
device: cpu
test_file: test.xyz
E0s:
  41: -1029.2809654211628
  38: -1484.1187695035828
  8: -2042.0330099956639
config_type_weights:
  Default: 1.0
Then append --config="your_configs.yaml" to the command line. Any argument specified on the command line will overwrite the one in the YAML file.
Evaluation
To evaluate your MACE model on an XYZ file, run the mace_eval_configs script:
mace_eval_configs \
--configs="your_configs.xyz" \
--model="your_model.model" \
--output="./your_output.xyz"
Tutorials
You can run our Colab tutorial to quickly get started with MACE.
We also have more detailed Colab tutorials on:
- Introduction to MACE training and evaluation
- Introduction to MACE active learning and fine-tuning
- MACE theory and code (advanced)
CUDA acceleration with cuEquivariance
MACE supports CUDA acceleration with the cuEquivariance library.
