It's A SNAP!

Scalable Neural network Atomic Potentials (SNAP)

A PyTorch Lightning-based neural network potential (NNP) training wrapper.

Environment

Create a new conda environment as follows:

conda create --name wrap python=3.11
conda activate wrap 
export PYTHONUSERBASE=$CONDA_PREFIX
python -m pip install --user torch torchvision --index-url https://download.pytorch.org/whl/cu124
python -m pip install --user torch_scatter -f https://data.pyg.org/whl/torch-2.4.0+cu124.html
python -m pip install --user lightning
python -m pip install --user torch_geometric
python -m pip install --user torch_ema
python -m pip install --user e3nn 
python -m pip install --user ase pandas h5py prettytable
python -m pip install --user matscipy

Load the environment with conda activate wrap.

Install this repo as follows:

python -m pip install git+https://github.com/pnnl/SNAP.git

Tested Package Versions

python 3.11
pytorch 2.5 (cu12-12.4.127)
torch_scatter 2.1.2
lightning 2.4.0
torch_geometric 2.6.1
e3nn 0.5.1
torch-ema 0.3
numpy 1.25.2

Data Preprocessing

Structures should be saved in .extxyz format, including atomic forces, and placed in the following file structure where DATADIR is the top-level directory.

$DATADIR 
       |_raw
          |_$SAMPLE
                   |_files.extxyz (or files.xyz)
                   |_statistics.json
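The layout above can be checked before preprocessing with a short script. This is a minimal sketch that is not part of SNAP: the helper name check_layout is hypothetical, and it only assumes the directory tree shown above.

```python
from pathlib import Path


def check_layout(datadir):
    """Summarize each $SAMPLE folder found under <datadir>/raw.

    Hypothetical helper (not part of SNAP). Returns a dict mapping the
    sample name to its structure files and whether statistics.json exists.
    """
    raw = Path(datadir) / "raw"
    report = {}
    for sample in sorted(p for p in raw.iterdir() if p.is_dir()):
        # Structure files may use either the .extxyz or the .xyz extension.
        structures = sorted(
            f.name for f in sample.iterdir() if f.suffix in {".extxyz", ".xyz"}
        )
        report[sample.name] = {
            "structures": structures,
            "has_statistics": (sample / "statistics.json").is_file(),
        }
    return report
```

A sample reported with an empty "structures" list has no structure files to preprocess, and one with "has_statistics" set to False has no user-supplied statistics.json.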

We recommend normalizing the total energy with per-atom E0 values computed at the same level of theory as your data. Save these values for all atoms in the statistics.json file in dictionary format as follows: {'atomic_energies': {Z_i: E0_i, ...}, 'atomic_numbers': [Z_i, ...]}. If statistics.json is not present during the preprocessing step, one will be computed for each $SAMPLE folder with the fitting algorithm used in MACE.
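A statistics.json file in this format can be written with the standard json module. This is a minimal sketch under stated assumptions: the helper name write_statistics is hypothetical, the E0 values in the usage comment are placeholders rather than real reference energies, and because JSON object keys are always strings, the atomic numbers are stored as "1", "8", and so on. Whether SNAP's preprocessing converts those keys back to integers is not confirmed here.

```python
import json
from pathlib import Path


def write_statistics(sample_dir, atomic_energies):
    """Write statistics.json for one $SAMPLE folder.

    Hypothetical helper (not part of SNAP).
    atomic_energies maps atomic number Z_i to its E0_i value in eV.
    """
    energies = {int(z): float(e) for z, e in atomic_energies.items()}
    numbers = sorted(energies)
    payload = {
        # JSON object keys must be strings, so Z is serialized as a string.
        "atomic_energies": {str(z): energies[z] for z in numbers},
        "atomic_numbers": numbers,
    }
    path = Path(sample_dir) / "statistics.json"
    path.write_text(json.dumps(payload, indent=2))
    return path


# Usage with placeholder E0 values for H (Z=1) and O (Z=8); replace them
# with energies computed at the same level of theory as your data:
# write_statistics("path/to/raw/my_sample", {1: -1.0, 8: -2.0})
```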

See ASE io for converting simulation output files to .extxyz. Note that MACE-MP-0 expects energies to be in eV and forces to be in eV/Å.
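If your simulation code reports atomic units, the values need to be converted before training. The sketch below assumes energies in Hartree and forces in Hartree/Bohr, which is an assumption about your data and not something SNAP requires. The function names are hypothetical and the constants are the CODATA 2018 values.

```python
# Conversion factors (CODATA 2018).
HARTREE_TO_EV = 27.211386245988
BOHR_TO_ANGSTROM = 0.529177210903


def hartree_to_ev(energy_hartree):
    """Convert a total energy from Hartree to eV."""
    return energy_hartree * HARTREE_TO_EV


def hartree_per_bohr_to_ev_per_angstrom(forces):
    """Convert a list of per-atom force vectors from Hartree/Bohr to eV/Å."""
    factor = HARTREE_TO_EV / BOHR_TO_ANGSTROM
    return [[component * factor for component in atom] for atom in forces]
```

ASE exposes the same constants as ase.units.Hartree and ase.units.Bohr, which can be used instead of the hard-coded values when ASE is already imported.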

Model Training

See train-mace-mp-0.py for an example training script to finetune MACE-MP-0.

The example below shows how to finetune the 'small' MACE-MP-0 model.

srun python train-mace-mp-0.py --savedir ${SAVEDIR} --model 'small' \
    --datadir ${DATADIR} --split-file ${DATADIR}/processed/split.npz \
    --batch-size 16 --max-epochs 500 --min-epochs 25 \
    --train-forces

Training Flags

Additional MACE-MP-0 Flags

Multi-GPU

Two settings in the SLURM submission script determine how many processes run your training: #SBATCH --nodes=X and #SBATCH --ntasks-per-node=Y. These numbers must match the Trainer configuration in the code: Trainer(num_nodes=X, devices=Y). If you change them, update them in BOTH places.
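One way to keep the two places in sync is to read the values from the environment variables that SLURM sets inside a job. This is a minimal sketch and not how the example script works: the helper name trainer_kwargs_from_slurm is hypothetical, and the fallback of 1 for a missing variable is an assumption (SLURM_NTASKS_PER_NODE is only set when --ntasks-per-node is given).

```python
import os


def trainer_kwargs_from_slurm(env=None):
    """Build the num_nodes/devices arguments for Trainer from SLURM variables.

    Hypothetical helper (not part of SNAP). Falls back to 1 when a variable
    is missing, for example when running outside a SLURM job.
    """
    env = os.environ if env is None else env
    return {
        "num_nodes": int(env.get("SLURM_NNODES", 1)),            # --nodes=X
        "devices": int(env.get("SLURM_NTASKS_PER_NODE", 1)),     # --ntasks-per-node=Y
    }


# Usage inside a training script:
# from lightning import Trainer
# trainer = Trainer(**trainer_kwargs_from_slurm())
```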

The example script lets the Trainer detect both num_nodes and devices automatically. With the example script, training on 2 GPUs (nproc_per_node) on 1 node (nnodes) can be launched as follows:

srun python -m torch.distributed.run --nnodes=1 --nproc_per_node=2 train-mace-mp-0.py \
    --datadir ${DATADIR} --split-file ${DATADIR}/processed/split.npz \
    --batch-size 16 --max-epochs 500 --min-epochs 25 \
    --train-forces

References

If you use this code, please cite our associated publication:

@article{bilbrey2025uncertainty,
  title={Uncertainty Quantification for Neural Network Potential Foundation Models},
  author={Bilbrey, Jenna A and Firoz, Jesun S and Lee, Mal-Soon and Choudhury, Sutanay},
  journal={npj Computational Materials},
  volume={11},
  number={109},
  year={2025},
  doi={10.1038/s41524-025-01572-y},
  url={https://www.nature.com/articles/s41524-025-01572-y},
}

Acknowledgements

Initial development of this codebase was supported by the "Transferring exascale computational chemistry to cloud computing environment and emerging hardware technologies (TEC4)" project, which is funded by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, the Division of Chemical Sciences, Geosciences, and Biosciences (under FWP 82037).
