MEGNet
Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals
Deprecation
This repository has been deprecated in favor of a new PyTorch + Deep Graph Library implementation in the matgl repository. It will no longer be updated and is retained purely as a reference implementation in the original TensorFlow.
Table of Contents
- Introduction
- MEGNet Framework
- Installation
- Usage
- Datasets
- Implementation details
- Computing requirements
- Known limitations
- Contributors
- References
<a name="introduction"></a>
Introduction
This repository represents the efforts of the Materials Virtual Lab in developing graph networks for machine learning in materials science. It is a work in progress, and the models we have developed thus far represent our best efforts to date. We welcome anyone to build and test models using our code and data, all of which are publicly available. Comments and suggestions are also welcome (please post on the GitHub Issues page).
A web app using our pre-trained MEGNet models for property prediction in
crystals is available at http://megnet.crystals.ai. For tutorials, please visit notebooks in this repo. We have also established an online simulation tool and a tutorial lecture at nanoHUB (https://nanohub.org/resources/megnet).
Note: A DGL implementation of MEGNet is now available. For users trying to build their own MEGNet models, we highly recommend checking out that version, which may be easier to work with and extend in the future.
<a name="megnet-framework"></a>
MEGNet framework
The MatErials Graph Network (MEGNet) is an implementation of DeepMind's graph networks[1] for universal machine learning in materials science. We have demonstrated its success in achieving very low prediction errors in a broad array of properties in both molecules and crystals (see "Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals"[2]). New releases have included our recent work on multi-fidelity materials property modeling (See "Learning properties of ordered and disordered materials from multi-fidelity data"[3]).
Briefly, Figure 1 shows the sequential update steps of the graph network, whereby bonds, atoms, and global state attributes are updated using information from each other, generating an output graph.
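These sequential updates can be sketched in plain NumPy. This is a toy illustration only, not the actual Keras implementation: the learned update functions (multi-layer perceptrons) are replaced here by simple averaging, and all names and shapes are hypothetical.

```python
import numpy as np

# Toy graph: 3 atoms, 2 bonds, one global state vector, 4 features each.
atom_feats = np.random.rand(3, 4)        # one feature row per atom
bond_feats = np.random.rand(2, 4)        # one feature row per bond
state = np.random.rand(1, 4)             # global state attributes
bond_index = np.array([[0, 1], [1, 2]])  # atom pair connected by each bond

def update(atom_feats, bond_feats, state, bond_index):
    """One sequential pass: update bonds, then atoms, then the global state.
    Each learned MLP of the real model is replaced by a plain mean."""
    # 1. Bond update: combine each bond with its two end atoms and the state.
    new_bonds = np.stack([
        (bond_feats[i] + atom_feats[a] + atom_feats[b] + state[0]) / 4
        for i, (a, b) in enumerate(bond_index)
    ])
    # 2. Atom update: each atom aggregates its incident (updated) bonds.
    new_atoms = atom_feats.copy()
    for i, (a, b) in enumerate(bond_index):
        new_atoms[a] = (atom_feats[a] + new_bonds[i]) / 2
        new_atoms[b] = (atom_feats[b] + new_bonds[i]) / 2
    # 3. Global update: the state aggregates over all atoms and bonds.
    new_state = (state + new_atoms.mean(0) + new_bonds.mean(0)) / 3
    return new_atoms, new_bonds, new_state

a2, b2, s2 = update(atom_feats, bond_feats, state, bond_index)
print(a2.shape, b2.shape, s2.shape)  # shapes are preserved: (3, 4) (2, 4) (1, 4)
```

The key point the sketch illustrates is the information flow: updated bond features feed into the atom update, and both feed into the global state, producing an output graph of the same shape.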

Figure 2 shows the overall schematic of the MEGNet. Each graph network module
is preceded by two multi-layer perceptrons (known as Dense layers in Keras
terminology), constituting a MEGNet block. Multiple MEGNet blocks can be
stacked, allowing for information flow across greater spatial distances. The
number of blocks required depends on the range of interactions necessary to
predict a target property. In the final step, a set2set is used to map the
output to a scalar/vector property.

<a name="installation"></a>
Installation
MEGNet can be installed via pip for the latest stable version:
pip install megnet
For the latest dev version, please clone this repo and install using:
python setup.py develop
<a name="usage"></a>
Usage
Our current implementation supports a variety of use cases for users with different requirements and experience with deep learning. Please also visit the notebooks directory for Jupyter notebooks with more detailed code examples.
Using pre-built models
In our work, we have already built MEGNet models for the QM9 data set and
Materials Project dataset. These models are provided as serialized HDF5+JSON
files. Users who are purely interested in using these models for prediction
can quickly load and use them via the convenient MEGNetModel.from_file
method. These models are available in the mvl_models folder of this repo.
The following models are available:
- QM9 molecule data:
- HOMO: Highest occupied molecular orbital energy
- LUMO: Lowest unoccupied molecular orbital energy
- Gap: energy gap
- ZPVE: zero point vibrational energy
- µ: dipole moment
- α: isotropic polarizability
- <R2>: electronic spatial extent
- U0: internal energy at 0 K
- U: internal energy at 298 K
- H: enthalpy at 298 K
- G: Gibbs free energy at 298 K
- Cv: heat capacity at 298 K
- ω1: highest vibrational frequency.
- Materials Project data:
- Formation energy from the elements
- Band gap
- Log 10 of Bulk Modulus (K)
- Log 10 of Shear Modulus (G)
The MAEs on the various models are given below:
Performance of QM9 MEGNet-Simple models
| Property | Units       | MAE   |
|----------|-------------|-------|
| HOMO     | eV          | 0.043 |
| LUMO     | eV          | 0.044 |
| Gap      | eV          | 0.066 |
| ZPVE     | meV         | 1.43  |
| µ        | Debye       | 0.05  |
| α        | Bohr^3      | 0.081 |
| <R2>     | Bohr^2      | 0.302 |
| U0       | eV          | 0.012 |
| U        | eV          | 0.013 |
| H        | eV          | 0.012 |
| G        | eV          | 0.012 |
| Cv       | cal/(mol·K) | 0.029 |
| ω1       | cm^-1       | 1.18  |
Performance of MP-2018.6.1
| Property | Units      | MAE   |
|----------|------------|-------|
| Ef       | eV/atom    | 0.028 |
| Eg       | eV         | 0.33  |
| K_VRH    | log10(GPa) | 0.050 |
| G_VRH    | log10(GPa) | 0.079 |
Performance of MP-2019.4.1
| Property | Units   | MAE   |
|----------|---------|-------|
| Ef       | eV/atom | 0.026 |
| Efermi   | eV      | 0.288 |
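Since the moduli models are trained on log10 values, their MAEs are reported in log units. A rough back-of-envelope conversion (ours, not from the paper) turns a log10 MAE into a typical relative error:

```python
# An MAE of 0.050 in log10(GPa) corresponds to a typical multiplicative
# error of 10**0.05 ~ 1.12, i.e. predictions within roughly 12% of the
# true bulk modulus on average.
mae_log10_K = 0.050
factor = 10 ** mae_log10_K
print(f"typical relative error: {(factor - 1) * 100:.0f}%")  # ~12%
```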
New models will be added as they are developed in the mvl_models folder. Each folder contains a summary of model details and benchmarks. For the initial models and benchmark comparisons to previous models, please refer to "Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals"[2].
Below is an example of crystal model usage:
from megnet.utils.models import load_model
from pymatgen.core import Structure, Lattice
# load a model in megnet.utils.models.AVAILABLE_MODELS
model = load_model("logK_MP_2018")
# We can construct a structure using pymatgen
structure = Structure(Lattice.cubic(3.167),
                      ['Mo', 'Mo'], [[0, 0, 0], [0.5, 0.5, 0.5]])
# Use the model to predict bulk modulus K. Note that the model is trained on
# log10 K. So a conversion is necessary.
predicted_K = 10 ** model.predict_structure(structure).ravel()[0]
print(f'The predicted K for {structure.composition.reduced_formula} is {predicted_K:.0f} GPa.')
A full example is in notebooks/crystal_example.ipynb.
For molecular models, we have an example in
notebooks/qm9_pretrained.ipynb.
We support prediction directly from a pymatgen molecule object. With a few more
lines of code, the model can predict from a SMILES representation of a molecule,
as shown in the example. It is also straightforward to load an xyz molecule
file with pymatgen and predict the properties using the models. However, users
are generally advised not to use the QM9 molecule models for molecules outside
the QM9 dataset, since the training data coverage is limited.
Below is an example of predicting the HOMO from a SMILES representation:
from megnet.utils.molecule import get_pmg_mol_from_smiles
from megnet.models import MEGNetModel
# The model API is the same for molecules and crystals; you can also use the
# load_model method as in the previous example
model = MEGNetModel.from_file('mvl_models/qm9-2018.6.1/HOMO.hdf5')
# Need to convert SMILES into pymatgen Molecule
mol = get_pmg_mol_from_smiles("C")
model.predict_structure(mol)
Training a new MEGNetModel from structures
For users who wish to build a new model from a set of crystal structures with
corresponding properties, there is a convenient MEGNetModel class for setting
up and training the model. By default, the number of MEGNet blocks is 3 and the
atomic number Z is used as the only node feature (with embedding).
from megnet.models import MEGNetModel
from megnet.data.crystal import CrystalGraph
import numpy as np
nfeat_bond = 10
r_cutoff = 5
gaussian_centers = np.linspace(0, r_cutoff + 1, nfeat_bond)
gaussian_width = 0.5
graph_converter = CrystalGraph(cutoff=r_cutoff)
model = MEGNetModel(graph_converter=graph_converter, centers=gaussian_centers, width=gaussian_width)
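For context, the centers and width above parameterize a Gaussian expansion of bond distances into nfeat_bond features. The standalone sketch below reproduces that expansion (the exact formula is our reading of megnet's GaussianDistance converter, so treat it as an assumption) and shows, as a comment, the subsequent training call on user-supplied data:

```python
import numpy as np

nfeat_bond = 10
r_cutoff = 5
gaussian_centers = np.linspace(0, r_cutoff + 1, nfeat_bond)
gaussian_width = 0.5

def expand(distance):
    """Expand a scalar bond distance into nfeat_bond Gaussian features,
    one per center (assumed form of the converter's basis expansion)."""
    return np.exp(-((distance - gaussian_centers) ** 2) / gaussian_width ** 2)

feats = expand(2.5)
print(feats.shape)  # (10,)

# With the model constructed as above, training takes lists of pymatgen
# Structures and the corresponding target values (supply your own data):
# model.train(structures, targets, epochs=10)
```

Bonds near a given center thus produce a feature vector peaked at that center, giving the network a smooth, fixed-length encoding of interatomic distances.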
