PhononBench
PhononBench is a phonon-based benchmark for large-scale dynamical stability evaluation of AI-generated crystals, featuring 100k+ structures, DFT-level MatterSim phonon calculations, and open-source high-throughput workflows.
PhononBench (API: http://phononbench.cn)
<p align="center"> <img src="./fig-main.png" alt="PhononBench Overview" width="500"> </p>

Summary of Crystal Generation and Dynamical Stability Statistics
| Model                  |  Relaxed | Dynamically Stable | Input Script Success | Unique CIFs | Total Generated |
|------------------------|---------:|-------------------:|---------------------:|------------:|----------------:|
| CrystalFlow-MP20       |    8,533 |              1,428 |                8,852 |       9,952 |          16,000 |
| CrystalFormer-Alex20   |    8,642 |              2,969 |                8,807 |       8,986 |          40,000 |
| CrystalFormer-MP20     |    4,408 |                510 |                4,990 |       5,143 |          20,000 |
| CrystaLLM-MP20         |    1,951 |                 58 |                2,074 |       2,074 |          16,000 |
| DiffCSP-MP20           |    9,163 |              2,488 |                9,959 |      10,000 |          16,000 |
| InvDesFlow-AL-MP20     |    8,000 |              2,176 |                    – |           – |               – |
| InvDesFlow-AL-Alex20   |   22,755 |              8,743 |               24,997 |      25,000 |          30,000 |
| MatterGen-Alex20       |   10,902 |              4,469 |               11,829 |      11,829 |          16,000 |
| MatterGen-MP20         |    9,279 |              2,278 |               10,000 |      10,000 |          16,000 |
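The counts in the table can be turned into per-model stability rates. The sketch below is illustrative only: it assumes the rate of interest is the number of dynamically stable structures divided by the number of relaxed structures, which may differ from the normalization used in the paper. The names `COUNTS`, `stability_rate`, and `RATES` are hypothetical and not part of PhononBench.

```python
# (relaxed, dynamically stable) counts copied from the summary table.
COUNTS = {
    "CrystalFlow-MP20": (8533, 1428),
    "CrystalFormer-Alex20": (8642, 2969),
    "CrystalFormer-MP20": (4408, 510),
    "CrystaLLM-MP20": (1951, 58),
    "DiffCSP-MP20": (9163, 2488),
    "InvDesFlow-AL-MP20": (8000, 2176),
    "InvDesFlow-AL-Alex20": (22755, 8743),
    "MatterGen-Alex20": (10902, 4469),
    "MatterGen-MP20": (9279, 2278),
}


def stability_rate(relaxed: int, stable: int) -> float:
    """Fraction of relaxed structures that are dynamically stable.

    ASSUMPTION: normalizing by the relaxed count is a choice made for this
    sketch; other denominators (e.g. unique CIFs) are equally possible.
    """
    if relaxed <= 0:
        raise ValueError("relaxed count must be positive")
    return stable / relaxed


RATES = {model: stability_rate(*pair) for model, pair in COUNTS.items()}

if __name__ == "__main__":
    # Print models from highest to lowest rate.
    for model, rate in sorted(RATES.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{model:<22} {rate:6.1%}")
```

Dividing by a different column, such as Unique CIFs, only requires changing the first element of each tuple.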
Benchmark Dataset Download and Directory Structure
All data in PhononBench consist of fully relaxed crystal structures generated by different crystal generative models and used for phonon-based dynamical stability evaluation.
The dataset follows a unified directory structure, e.g., /PhononBench/InvDesFlow-AL/relaxed/gpu0_part0/
Each relaxed directory contains a Label.txt file indicating whether each structure is dynamically stable based on phonon calculations. Stability labels are provided consistently for all models included in PhononBench.
The complete dataset is available at: https://zenodo.org/records/18185662
Note
Fully relaxing all generated structures and performing complete phonon calculations, as reported in the paper, can be very time-consuming. To facilitate quick evaluation, we provide the script summarize_relaxation_and_stability.py, which can be used to directly summarize the stability rates of different generative models by simply updating the data paths to the datasets downloaded from Zenodo. The detailed high-throughput workflow described below can also be used for screening purposes and is open for advanced users; however, due to its technical complexity, we recommend using our provided API for most applications.
🌐 API for DFT-Oriented Phonon Calculations
We provide a public web service, PhononBench, for rapid phonon evaluation powered by AI-based phonon models.
🔗 Web interface: http://phononbench.cn
(If you encounter temporary downtime or inaccessibility during the review process, please contact us via the anonymous correspondence channel provided in the submission system. The service may occasionally be unavailable due to peak usage.)
<p align="center"> <img src="./API-PhononBench.png" alt="API PhononBench" width="500"> </p>

Since its public release on January 8, 2026, the PhononBench API has been used by researchers from a broad range of academic and industrial institutions. This usage highlights the relevance of PhononBench as a practical tool for phonon-based dynamical stability assessment.

Installation
PhononBench relies on MatterSim for DFT-level phonon calculations. We therefore strongly recommend installing MatterSim first, following the official environment setup, before using PhononBench.
Prerequisites
- Python ≥ 3.10
- mamba or micromamba (recommended for fast and reliable dependency resolution)
- Linux environment (recommended for large-scale phonon calculations)
We recommend installing MatterSim from source using mamba, as this is the most reliable setup for large-scale phonon calculations and was the environment used in this work.
# clone MatterSim
git clone https://github.com/microsoft/mattersim.git
cd mattersim
# create the environment
mamba env create -f environment.yaml
mamba activate mattersim
# install MatterSim in editable mode
uv pip install -e .
Evaluating Your Own Crystals
PhononBench provides a standardized workflow for evaluating the dynamical stability of crystal structures generated by custom models. To evaluate your own crystal generation model, follow the steps below.
Step 1: Generate Crystal Structures
First, use your crystal generation model to generate a large set of crystal structures and save them in CIF format.
- We recommend generating at least 10,000 crystal structures to ensure that, after duplicate removal and structure relaxation, more than 4,000 valid structures remain for reliable dynamical stability evaluation.
- Each structure should be saved as an individual .cif file.
Example directory structure:
your_model_outputs/
├── structure_00001.cif
├── structure_00002.cif
├── ...
└── structure_10000.cif
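Before moving on, it can be useful to confirm that the output directory actually holds the expected number of CIF files. The following is a minimal sketch using only the standard library; the helper names `list_cif_files` and `has_enough_structures` are hypothetical, and the default threshold simply mirrors the 10,000-structure recommendation above.

```python
from pathlib import Path


def list_cif_files(output_dir: str) -> list[Path]:
    """Return the .cif files directly inside output_dir, sorted by name."""
    return sorted(p for p in Path(output_dir).glob("*.cif") if p.is_file())


def has_enough_structures(cif_files: list[Path], minimum: int = 10_000) -> bool:
    """Check the file count against the recommended minimum sample size."""
    return len(cif_files) >= minimum


if __name__ == "__main__":
    files = list_cif_files("your_model_outputs")
    print(f"found {len(files)} CIF files")
    if not has_enough_structures(files):
        print("fewer than 10,000 structures; statistics may be less reliable")
```

This only counts files. It does not parse the CIFs or detect duplicates, which the later steps handle.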
Step 2: Prepare Phonopy Input Files
Next, use the provided script batch_prepare_phonopy_input.py to automatically generate the Phonopy input files required for phonon calculations.
Only two arguments need to be specified:
- --input_dir: directory containing your generated CIF files
- --out: output directory for phonon calculation inputs
The supercell size is controlled by --dim.
Example command:
python batch_prepare_phonopy_input.py \
--input_dir /your_Path/Benchmark/MatterGen-gen/dft_band_gap/1.5/gen-cifs \
--dim 2 2 2 \
--out /your_Path/Benchmark/MatterGen-gen/dft_band_gap/bg_1.5/phonon-calculation-input
This script will:
- Read all CIF files from --input_dir
- Build supercells according to --dim
- Generate compressed Phonopy input files (.yaml.bz2) for each structure
- Save all generated inputs to the directory specified by --out
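When several models or output folders need to be prepared, the command above can be assembled programmatically. The sketch below only builds the argument list from the three flags documented here (--input_dir, --dim, --out); the function name `build_prepare_command` and the example directory names are hypothetical, and the script's behavior for other inputs is not assumed.

```python
import shlex
import sys


def build_prepare_command(input_dir, out_dir, dim=(2, 2, 2),
                          script="batch_prepare_phonopy_input.py"):
    """Build the argv list for one batch_prepare_phonopy_input.py call."""
    # The supercell dimension is three positive integers, e.g. 2 2 2.
    if len(dim) != 3 or any(int(n) < 1 for n in dim):
        raise ValueError("dim must be three positive integers")
    return [
        sys.executable, script,
        "--input_dir", str(input_dir),
        "--dim", *(str(int(n)) for n in dim),
        "--out", str(out_dir),
    ]


if __name__ == "__main__":
    cmd = build_prepare_command("your_model_outputs", "phonon-calculation-input")
    # Print the command for inspection; to execute it, pass the list to
    # subprocess.run(cmd, check=True) from the directory holding the script.
    print(shlex.join(cmd))
```

Building a list rather than a shell string avoids quoting problems when paths contain spaces.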
Step 3: Run Phonon Calculations with Multi-GPU Parallelization
After preparing the Phonopy input files, an additional utility repository is required to run large-scale phonon calculations.
First, clone the required repository:
git clone https://github.com/hyllios/utils.git
Then, copy the two phonon calculation scripts provided by PhononBench into the benchmark_ph directory of the cloned repository:
- phonon_multi_gpu_run.py
- submit_jobs.sh
Place both scripts under the following path:
utils/benchmark_ph/
After completing these steps, you can proceed with the multi-GPU phonon calculations as described in the next section.
PhononBench provides a multi-GPU parallel execution script (submit_jobs.sh) that runs large-scale MatterSim phonon calculations in parallel across the available GPUs.
Users should modify the script according to their GPU configuration and directory structure.
Key Arguments
The following paths must be set according to your local setup:
- --ref: Directory containing the Phonopy input files (.yaml.bz2) generated in Step 2.
- --dest: Output directory for phonon calculation results.
- --relaxedDest: Directory for saving relaxed crystal structures obtained during phonon calculations.
After configuring the paths and GPU settings, make the script executable and run it with:
chmod +x submit_jobs.sh
bash submit_jobs.sh
Notes
- The number of GPUs is controlled by phys_gpus and logic_gpus.
- Each GPU is further divided into multiple sub-jobs via subparts_per_gpu to improve utilization.
- The script distributes phonon calculations evenly across GPUs based on gpu_index and subpart_index.
- Log files are written separately for each GPU and sub-job to facilitate monitoring and debugging.
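To make the partitioning idea concrete, here is one way an even split by gpu_index and subpart_index could work. This is a sketch under stated assumptions, not the logic of phonon_multi_gpu_run.py or submit_jobs.sh: it assumes a round-robin split over a sorted file list, and `select_shard` and `n_gpus` are hypothetical names.

```python
def select_shard(items, gpu_index, subpart_index, n_gpus, subparts_per_gpu):
    """Return the inputs assigned to one (gpu_index, subpart_index) worker.

    ASSUMPTION: round-robin assignment over the sorted inputs; the real
    scripts may partition the work differently.
    """
    if not 0 <= gpu_index < n_gpus:
        raise ValueError("gpu_index out of range")
    if not 0 <= subpart_index < subparts_per_gpu:
        raise ValueError("subpart_index out of range")
    n_workers = n_gpus * subparts_per_gpu
    # Flatten the (GPU, sub-job) pair into a single worker id.
    worker_id = gpu_index * subparts_per_gpu + subpart_index
    # Sorting first makes every worker see the same ordering.
    return sorted(items)[worker_id::n_workers]


if __name__ == "__main__":
    inputs = [f"structure_{i:05d}.yaml.bz2" for i in range(1, 11)]
    for gpu in range(2):
        for sub in range(2):
            shard = select_shard(inputs, gpu, sub, n_gpus=2, subparts_per_gpu=2)
            print(f"gpu{gpu}_part{sub}: {len(shard)} inputs")
```

With this scheme, shard sizes differ by at most one and no input is processed twice, which is the property an even distribution needs.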
Fallback: Single-CIF Phonon Run
If the multi-GPU workflow is unavailable, or if you only need a quick validation, PhononBench provides a standalone script that runs a complete phonon calculation from a single CIF file. Copy phonon_from_cif_single.py into the utils/benchmark_ph/ directory of the cloned repository, and run:
python phonon_from_cif_single.py --cif /src/Si.cif --dim 2 2 2 --model mattersim-v1 --out /test
This script performs structure relaxation and phonon calculations using MatterSim and writes all results to the specified output directory. Successful execution confirms that the environment and phonon workflow are correctly configured before scaling to multi-GPU runs.
Step 4: Summarize Relaxation and Dynamical Stability Statistics
The script summarize_relaxation_and_stability.py automatically scans all model directories in PhononBench and summarizes relaxation success and phonon-based dynamical stability statistics for each crystal generative model.
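For readers who want to see the shape of such a summary, the sketch below aggregates Label.txt files under one model directory. It is not the provided script. The format of Label.txt is not documented here, so the code ASSUMES one structure per line with the label as the last whitespace-separated field, and ASSUMES the tokens in `STABLE_TOKENS` mark a stable structure; adjust both after inspecting the real files. The `relaxed/<part>/Label.txt` layout follows the example path given in the dataset section.

```python
from collections import Counter
from pathlib import Path

# ASSUMPTION: hypothetical tokens that mark a dynamically stable structure.
STABLE_TOKENS = {"1", "stable", "true"}


def count_labels(label_file) -> Counter:
    """Count total and stable entries in one Label.txt (assumed format)."""
    counts = Counter()
    for line in Path(label_file).read_text().splitlines():
        fields = line.split()
        if len(fields) < 2:  # skip blank or malformed lines
            continue
        counts["total"] += 1
        if fields[-1].lower() in STABLE_TOKENS:
            counts["stable"] += 1
    return counts


def summarize_model(model_dir) -> Counter:
    """Sum the counts over every relaxed/<part>/Label.txt of one model."""
    totals = Counter()
    for label_file in sorted(Path(model_dir).glob("relaxed/*/Label.txt")):
        totals += count_labels(label_file)
    return totals


if __name__ == "__main__":
    totals = summarize_model("/PhononBench/InvDesFlow-AL")
    if totals["total"]:
        print(f"stable: {totals['stable']} / {totals['total']}")
    else:
        print("no labelled structures found")
```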
Citation
If you use PhononBench in your research, please cite the following paper:
@misc{han2025phononbench,
title={PhononBench: A Large-Scale Phonon-Based Benchmark for Dynamical Stability in Crystal Generation},
author={Xiao-Qi Han and Peng-Jie Guo and Ze-Feng Gao and Zhong-Yi Lu},
year={2025},
eprint={2512.21227},
archivePrefix={arXiv},
primaryClass={cond-mat.mtrl-sci},
url={https://arxiv.org/abs/2512.21227},
}
