PySIPFENN
Python toolset for Structure-Informed Property and Feature Engineering with Neural Networks. It offers unique advantages through (1) effortless extensibility, (2) optimizations for ordered, dilute, and random atomic configurations, and (3) automated model tuning.
pySIPFENN
Summary
This repository contains a Python toolset for Structure-Informed Property and Feature Engineering with Neural Networks, which implements a number of user-friendly tools for:
- Calculating different vector representations of atomic structures for a number of applications, including supervised learning (e.g., predictive machine learning models) and unsupervised learning (e.g., clustering of atomic structures based on similarity or performing anomaly detection). Notably, it utilizes crystallographic information and some other techniques to make this process very efficient for the vast majority of use cases (see DOI: 10.1016/j.commatsci.2024.113495).
- Efficient deployment of pre-trained ML models (not limited to neural networks) obtained from repositories like Zenodo (including some we trained) or trained locally on the user's machine. The system is very plug-and-play thanks to the Open Neural Network Exchange (ONNX) format, which can be exported from nearly any machine learning framework.
- Tuning pre-trained ML models to new domains, like new chemical compositions, a different ab initio functional, or entirely new properties. Since v0.16, users can take advantage of integration with the OPTIMADE API, which allows one to tune models based on DFT datasets like Materials Project, OQMD, AFLOW, or NIST-JARVIS in just 3 lines of code specifying which provider to use, what to query for, and the hyperparameters for tuning.
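As a rough illustration of the unsupervised use case from the first item above, the sketch below ranks structures by how far their feature vectors sit from the rest of a set. The vectors are made-up toy values, not pySIPFENN output, and `mean_distance_to_others` is a hypothetical helper name. In practice the vectors would come from one of the featurizers and would be much longer.

```python
import math

# Toy feature vectors keyed by structure label. These numbers are invented
# for illustration only; real descriptors would come from a featurizer.
features = {
    "A": [0.10, 0.20, 0.30],
    "B": [0.12, 0.19, 0.31],
    "C": [0.11, 0.22, 0.28],
    "D": [0.90, 0.80, 0.10],  # deliberately far from the others
}

def mean_distance_to_others(label, vectors):
    """Average Euclidean distance from one vector to every other vector."""
    others = [v for k, v in vectors.items() if k != label]
    return sum(math.dist(vectors[label], v) for v in others) / len(others)

# Score every structure; the largest score marks the most anomalous one.
scores = {label: mean_distance_to_others(label, features) for label in features}
most_anomalous = max(scores, key=scores.get)

# Nearest neighbour of "A" as a simple similarity query.
nearest_to_a = min(
    (k for k in features if k != "A"),
    key=lambda k: math.dist(features["A"], features[k]),
)
```

The same distance-based idea extends to clustering, although a real workflow would likely standardize the features first and use a dedicated library.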
The underlying methodology, efficiency optimizations, design choices, and implementation specifics are given in the following publications:
- Adam M. Krajewski, Jonathan W. Siegel, Zi-Kui Liu, Efficient Structure-Informed Featurization and Property Prediction of Ordered, Dilute, and Random Atomic Structures, Computational Materials Science, Volume 247, 2025, 113495, DOI: 10.1016/j.commatsci.2024.113495
- Adam M. Krajewski, Jonathan W. Siegel, Jinchao Xu, Zi-Kui Liu, Extensible Structure-Informed Prediction of Formation Energy with improved accuracy and usability employing neural networks, Computational Materials Science, Volume 208, 2022, 111254, DOI: 10.1016/j.commatsci.2022.111254
A more complete (and verbose) description of capabilities is given in the documentation at pysipfenn.org. You may also consider visiting our Phases Research Lab group website at phaseslab.org.
Recent News:
- (v0.16.0) Three exciting pieces of news! (1) The all-new ModelAdjusters submodule automates tuning and can fetch data directly from the OPTIMADE API; (2) a new manuscript detailing the advantages of our featurization tools has been put on arXiv:2404.02849; and (3) the name of the software was updated to python toolset for Structure-Informed Property and Feature Engineering with Neural Networks to retain the pySIPFENN acronym but better reflect our strengths and development direction.
- (v0.15.0) A new descriptor (feature vector) calculator, KS2022_randomSolutions, has been implemented. It is used for structure-informed featurization of compositions randomly occupying a lattice, spiritually similar to SQS generation, but also taking into account (1) chemical differences between elements and (2) structural effects.
- (v0.14.0) Users can now take advantage of a Prototype Library to obtain common structures from any Calculator instance with c.prototypeLibrary[<name>]['structure']. It can be easily updated or appended with the high-level API or by manually modifying its YAML file.
- (v0.13.0) Model exports (and more!) to PyTorch, CoreML, and ONNX are now effortless thanks to the core.modelExporters module. Please note you need to install pySIPFENN with the dev option (e.g., pip install "pysipfenn[dev]") to use it. See the docs.
- (v0.12.2) Switch to LGPLv3, allowing for integration with proprietary software developed by the CALPHAD community while supporting the development of new pySIPFENN features for all.
- (March 2023 Workshop) We would like to thank all 100 of our amazing attendees for making our workshop, co-organized with the Materials Genome Foundation, a success.
Main Schematic
The figure below is the main schematic of the pySIPFENN framework, detailing the interplay of internal components. The user interface provides a high-level API to process structural data within core.Calculator and pass it to featurization submodules in descriptorDefinitions to obtain a vector representation, which is then passed to models defined in models.json and (typically) run automatically through all available models. All internal data of core.Calculator is accessible directly, enabling rapid customization. An auxiliary high-level API enables advanced users to operate and retrain the models.
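The data flow described above can be sketched with plain Python stand-ins. Nothing below is pySIPFENN code: `DESCRIPTORS`, `MODELS`, `toy_descriptor`, and `run_all_models` are hypothetical names, and the "models" are trivial functions rather than ONNX networks. The sketch only shows the pattern of featurizing each structure once and then running every registered model that expects that descriptor.

```python
from typing import Callable, Dict, List, Sequence

Structure = Dict[str, int]  # stand-in: element symbol -> atom count

def toy_descriptor(structure: Structure) -> List[float]:
    """Hypothetical featurizer: total atom count and number of distinct elements."""
    return [float(sum(structure.values())), float(len(structure))]

# Registry of featurizers, playing the role of the descriptor submodules.
DESCRIPTORS: Dict[str, Callable[[Structure], List[float]]] = {
    "toy": toy_descriptor,
}

# Registry of models, playing the role of a models.json-style listing.
# Each entry names the descriptor it expects and a callable that predicts.
MODELS = {
    "model_sum": {"descriptor": "toy", "predict": lambda x: sum(x)},
    "model_first": {"descriptor": "toy", "predict": lambda x: x[0]},
    "model_other": {"descriptor": "not_available", "predict": lambda x: 0.0},
}

def run_all_models(descriptor: str, structures: Sequence[Structure]):
    """Featurize each structure once, then run every model using that descriptor."""
    featurize = DESCRIPTORS[descriptor]
    vectors = [featurize(s) for s in structures]
    return {
        name: [spec["predict"](v) for v in vectors]
        for name, spec in MODELS.items()
        if spec["descriptor"] == descriptor
    }

results = run_all_models("toy", [{"Fe": 1, "Ni": 3}, {"Al": 2}])
```

Keeping featurization separate from prediction means one vector can feed several models, which is the reason a single call can run through all available models.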
Applications
pySIPFENN is a very flexible tool that can, in principle, be used for the prediction of any property of interest that depends on an atomic configuration with very few modifications. The models shipped by default are trained to predict formation energy because that is what our research group is interested in; however, if one wanted to predict Poisson’s ratio and trained a model based on the same features, adding it would take minutes. Simply add the model in open ONNX format and link it using the models.json file, as described in the documentation.
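A sketch of what registering an extra model could look like is shown below. The field names (`name`, `descriptor`, `URL_ONNX`), the model key, and the URL are assumptions made for illustration, so check the documentation for the actual schema of models.json before relying on them. The example writes to a temporary file rather than to the file shipped with the package.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical registry entry; the key, field names, and URL are placeholders.
new_model = {
    "MyLab_PoissonRatio_NN1": {
        "name": "Example Poisson's ratio model",
        "descriptor": "KS2022",
        "URL_ONNX": "https://example.org/MyLab_PoissonRatio_NN1.onnx",
    }
}

def add_model(models_json: Path, entry: dict) -> dict:
    """Merge `entry` into the JSON registry at `models_json` and save it."""
    registry = json.loads(models_json.read_text()) if models_json.exists() else {}
    registry.update(entry)
    models_json.write_text(json.dumps(registry, indent=2))
    return registry

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "models.json"
    # Start from a stand-in registry with one pre-existing entry.
    path.write_text(json.dumps({"existing_model": {"name": "placeholder"}}))
    registry = add_model(path, new_model)
    reloaded = json.loads(path.read_text())
```

Merging into the existing registry, instead of overwriting it, keeps the default models available alongside the new one.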
Real-World Examples
In our line of work, pySIPFENN and the formation energies it predicts are usually used as a computational engine that generates proto-data for creation of thermodynamic databases (TDBs) using ESPEI (https://espei.org). The TDBs are then used through pycalphad (https://pycalphad.org) to predict phase diagrams and other thermodynamic properties.
Another of its uses in our research is guiding the Density Functional Theory (DFT) calculations as a low-cost screening tool. Their efficient conjunction then drives the experiments leading to discovery of new materials, as presented in these two papers:
