PhononBench

PhononBench is a phonon-based benchmark for large-scale dynamical stability evaluation of AI-generated crystals, featuring 100k+ structures, DFT-level MatterSim phonon calculations, and open-source high-throughput workflows.

PhononBench (API: http://phononbench.cn)

<p align="center"> <img src="./fig-main.png" alt="PhononBench Overview" width="500"> </p>

Summary of Crystal Generation and Dynamical Stability Statistics

| Model | Relaxed | Dynamically Stable | Input Script Success | Unique CIFs | Total Generated |
|------------------------|---------:|-------------------:|---------------------:|------------:|----------------:|
| CrystalFlow-MP20 | 8,533 | 1,428 | 8,852 | 9,952 | 16,000 |
| CrystalFormer-Alex20 | 8,642 | 2,969 | 8,807 | 8,986 | 40,000 |
| CrystalFormer-MP20 | 4,408 | 510 | 4,990 | 5,143 | 20,000 |
| CrystaLLM-MP20 | 1,951 | 58 | 2,074 | 2,074 | 16,000 |
| DiffCSP-MP20 | 9,163 | 2,488 | 9,959 | 10,000 | 16,000 |
| InvDesFlow-AL-MP20 | 8,000 | 2,176 | – | – | – |
| InvDesFlow-AL-Alex20 | 22,755 | 8,743 | 24,997 | 25,000 | 30,000 |
| MatterGen-Alex20 | 10,902 | 4,469 | 11,829 | 11,829 | 16,000 |
| MatterGen-MP20 | 9,279 | 2,278 | 10,000 | 10,000 | 16,000 |
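The counts in the table are easier to compare once they are turned into rates. The sketch below assumes the stability rate is defined as dynamically stable structures divided by relaxed structures; that definition and the helper name `stability_rate` are illustrative choices, not taken from the PhononBench scripts.

```python
# Counts copied from the summary table: model -> (relaxed, dynamically stable).
COUNTS = {
    "CrystalFlow-MP20": (8533, 1428),
    "CrystalFormer-Alex20": (8642, 2969),
    "CrystalFormer-MP20": (4408, 510),
    "CrystaLLM-MP20": (1951, 58),
    "DiffCSP-MP20": (9163, 2488),
    "InvDesFlow-AL-MP20": (8000, 2176),
    "InvDesFlow-AL-Alex20": (22755, 8743),
    "MatterGen-Alex20": (10902, 4469),
    "MatterGen-MP20": (9279, 2278),
}


def stability_rate(model: str) -> float:
    """Stable / relaxed for one model (assumed definition, see lead-in)."""
    relaxed, stable = COUNTS[model]
    return stable / relaxed


# Print the models from highest to lowest rate.
for name in sorted(COUNTS, key=stability_rate, reverse=True):
    print(f"{name:<22} {stability_rate(name):6.1%}")
```

Under this definition MatterGen-Alex20 has the highest rate and CrystaLLM-MP20 the lowest.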

Benchmark Dataset Download and Directory Structure

All data in PhononBench consist of fully relaxed crystal structures generated by different crystal generative models and used for phonon-based dynamical stability evaluation.

The dataset follows a unified directory structure, e.g., /PhononBench/InvDesFlow-AL/relaxed/gpu0_part0/

Each relaxed directory contains a Label.txt file indicating whether each structure is dynamically stable based on phonon calculations. Stability labels are provided consistently for all models included in PhononBench.
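After downloading the dataset, it can help to confirm that every model directory actually carries its label file. The sketch below only locates the Label.txt files and groups them by the first directory below the dataset root; it does not parse them, because the file format is not described here. The function name `find_label_files` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def find_label_files(root: str) -> dict[str, list[Path]]:
    """Group every Label.txt found under root by its top-level model directory."""
    root_path = Path(root)
    by_model: dict[str, list[Path]] = defaultdict(list)
    for label in sorted(root_path.rglob("Label.txt")):
        # First path component below the root, e.g. "InvDesFlow-AL".
        model = label.relative_to(root_path).parts[0]
        by_model[model].append(label)
    return dict(by_model)
```

Calling `find_label_files("/PhononBench")` on the layout shown above would be expected to return one entry per model, each listing the Label.txt paths found below it.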

The complete dataset is available at: https://zenodo.org/records/18185662

Note

Fully relaxing all generated structures and performing complete phonon calculations, as reported in the paper, can be very time-consuming. To facilitate quick evaluation, we provide the script summarize_relaxation_and_stability.py, which can be used to directly summarize the stability rates of different generative models by simply updating the data paths to the datasets downloaded from Zenodo. The detailed high-throughput workflow described below can also be used for screening purposes and is open for advanced users; however, due to its technical complexity, we recommend using our provided API for most applications.

🌐 API for DFT-Oriented Phonon Calculations

We provide a public web service, PhononBench, for rapid phonon evaluation powered by AI-based phonon models.

🔗 Web interface: http://phononbench.cn

(The service may occasionally be unavailable due to peak usage. If you encounter downtime or cannot reach it during the review process, please contact us via the anonymous correspondence channel provided in the submission system.)

<p align="center"> <img src="./API-PhononBench.png" alt="API PhononBench" width="500"> </p>

Since its public release on January 8, 2026, the PhononBench API has been used by researchers from a broad range of academic and industrial institutions. This usage highlights the relevance of PhononBench as a practical tool for phonon-based dynamical stability assessment.

Installation

PhononBench relies on MatterSim for DFT-level phonon calculations. We therefore strongly recommend installing MatterSim first, following the official environment setup, before using PhononBench.

Prerequisites

Python ≥ 3.10
mamba or micromamba (recommended for fast and reliable dependency resolution)
Linux environment (recommended for large-scale phonon calculations)

We recommend installing MatterSim from source using mamba, as this is the most reliable setup for large-scale phonon calculations and was the environment used in this work.

# clone MatterSim
git clone https://github.com/microsoft/mattersim.git
cd mattersim

# create the environment
mamba env create -f environment.yaml
mamba activate mattersim

# install MatterSim in editable mode
uv pip install -e .

Evaluating Your Own Crystals

PhononBench provides a standardized workflow for evaluating the dynamical stability of crystal structures generated by custom models. To evaluate your own crystal generation model, follow the steps below.

Step 1: Generate Crystal Structures

First, use your crystal generation model to generate a large set of crystal structures and save them in CIF format.

We recommend generating at least 10,000 crystal structures to ensure that, after duplicate removal and structure relaxation, more than 4,000 valid structures remain for reliable dynamical stability evaluation.
Each structure should be saved as an individual .cif file.

Example directory structure:

your_model_outputs/
├── structure_00001.cif
├── structure_00002.cif
├── ...
└── structure_10000.cif
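Before moving on, it is worth checking that the output directory holds enough structures. The sketch below counts the .cif files in a flat directory like the one shown above and compares the count with the 10,000 recommended in this step; the function names are hypothetical and the check is not part of PhononBench.

```python
from pathlib import Path

RECOMMENDED_MINIMUM = 10_000  # recommendation stated in Step 1


def count_cif_files(directory: str) -> int:
    """Count files ending in .cif directly inside directory (non-recursive)."""
    return sum(
        1
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() == ".cif"
    )


def check_generated_set(directory: str, minimum: int = RECOMMENDED_MINIMUM) -> bool:
    """Report whether the directory meets the recommended number of CIF files."""
    found = count_cif_files(directory)
    if found < minimum:
        print(f"{found} CIF files found; at least {minimum} are recommended.")
        return False
    print(f"{found} CIF files found.")
    return True
```

For example, `check_generated_set("your_model_outputs")` returns False and prints a warning when fewer than 10,000 files are present.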

Step 2: Prepare Phonopy Input Files

Next, use the provided script batch_prepare_phonopy_input.py to automatically generate the Phonopy input files required for phonon calculations.

Only two arguments need to be specified:

--input_dir: directory containing your generated CIF files
--out: output directory for phonon calculation inputs

The supercell size is controlled by --dim.

Example command:

python batch_prepare_phonopy_input.py \
    --input_dir /your_Path/Benchmark/MatterGen-gen/dft_band_gap/1.5/gen-cifs \
    --dim 2 2 2 \
    --out /your_Path/Benchmark/MatterGen-gen/dft_band_gap/bg_1.5/phonon-calculation-input

This script will:

Read all CIF files from --input_dir
Build supercells according to --dim
Generate compressed Phonopy input files (.yaml.bz2) for each structure
Save all generated inputs to the directory specified by --out
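Since the inputs are bzip2-compressed YAML files, a quick readability check can catch truncated or corrupt files before any GPU time is spent. The sketch below uses only the Python standard library and assumes nothing about the YAML contents; `unreadable_inputs` is a hypothetical helper, not a PhononBench function.

```python
import bz2
from pathlib import Path


def unreadable_inputs(out_dir: str) -> list[Path]:
    """Return .yaml.bz2 files under out_dir that are empty or fail to decompress."""
    bad: list[Path] = []
    for path in sorted(Path(out_dir).rglob("*.yaml.bz2")):
        try:
            with bz2.open(path, "rt", encoding="utf-8") as handle:
                # Reading one character forces the decompressor to start.
                if not handle.read(1):
                    bad.append(path)
        except (OSError, EOFError, UnicodeDecodeError):
            # Corrupt stream, truncated file, or non-text payload.
            bad.append(path)
    return bad
```

An empty list from `unreadable_inputs("/path/to/phonon-calculation-input")` suggests every generated input can at least be opened and decompressed.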

Step 3: Run Phonon Calculations with Multi-GPU Parallelization

After preparing the Phonopy input files, an additional utility repository is required to run large-scale phonon calculations.

First, clone the required repository:

git clone https://github.com/hyllios/utils.git

Then, copy the two phonon calculation scripts provided by PhononBench into the benchmark_ph directory of the cloned repository:

phonon_multi_gpu_run.py
submit_jobs.sh

Place both scripts under the following path:

utils/benchmark_ph/

After completing these steps, you can run the multi-GPU phonon calculations.

PhononBench provides a multi-GPU parallel execution script (submit_jobs.sh) that runs large-scale phonon calculations with MatterSim across the available GPUs.

Users should modify the script according to their GPU configuration and directory structure.

Key Arguments

The following paths must be set according to your local setup:

--ref: directory containing the Phonopy input files (.yaml.bz2) generated in Step 2
--dest: output directory for phonon calculation results
--relaxedDest: directory for saving relaxed crystal structures obtained during phonon calculations

After configuring the paths and GPU settings, make the script executable and run it with:

chmod +x submit_jobs.sh
bash submit_jobs.sh

Notes

The number of GPUs is controlled by phys_gpus and logic_gpus.
Each GPU is further divided into multiple sub-jobs via subparts_per_gpu to improve utilization.
The script distributes phonon calculations evenly across GPUs based on gpu_index and subpart_index.
Log files are written separately for each GPU and sub-job to facilitate monitoring and debugging.
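To make the even distribution concrete, the sketch below shows one common way to split a file list across GPU sub-jobs: flatten (gpu_index, subpart_index) into a single job id and take every N-th file. This is an assumed round-robin scheme for illustration; the actual logic in phonon_multi_gpu_run.py may differ, and `n_gpus` is a hypothetical parameter standing in for the GPU count.

```python
def shard_for_job(items, n_gpus, subparts_per_gpu, gpu_index, subpart_index):
    """Return the slice of items assigned to one (GPU, sub-job) pair.

    Assumed scheme: round-robin over a sorted list, so shard sizes
    differ by at most one item.
    """
    if not (0 <= gpu_index < n_gpus and 0 <= subpart_index < subparts_per_gpu):
        raise ValueError("gpu_index or subpart_index out of range")
    total_jobs = n_gpus * subparts_per_gpu
    job_id = gpu_index * subparts_per_gpu + subpart_index
    # Sorting gives every job the same view of the file order.
    return sorted(items)[job_id::total_jobs]


# Example: 10 inputs split over 2 GPUs with 2 sub-jobs each.
inputs = [f"s{i:03d}.yaml.bz2" for i in range(10)]
for gpu in range(2):
    for sub in range(2):
        print(gpu, sub, shard_for_job(inputs, 2, 2, gpu, sub))
```

Because each job derives its slice from its own indices, no coordination between processes is needed; every input is processed exactly once as long as all jobs see the same file list.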

Fallback: Single-CIF Phonon Run

If the multi-GPU workflow is not available, or if you only need a quick validation, PhononBench provides a standalone script that runs a complete phonon calculation from a single CIF file. Copy phonon_from_cif_single.py into the utils/benchmark_ph/ directory of the cloned repository and run:

python phonon_from_cif_single.py --cif /src/Si.cif --dim 2 2 2 --model mattersim-v1 --out /test

This script performs structure relaxation and phonon calculations using MatterSim and writes all results to the specified output directory. Successful execution confirms that the environment and phonon workflow are correctly configured before scaling to multi-GPU runs.

Step 4: Summarize Relaxation and Dynamical Stability Statistics

The script summarize_relaxation_and_stability.py automatically scans all model directories in PhononBench and summarizes relaxation success and phonon-based dynamical stability statistics for each crystal generative model.
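For readers unfamiliar with the label being summarized: a structure is conventionally called dynamically stable when its phonon spectrum has no imaginary modes, which phonon codes usually report as negative frequencies. The sketch below illustrates that criterion only. The exact threshold PhononBench applies is not stated in this README, so `tolerance_thz` is an assumed value used to absorb small numerical noise, and the function name is hypothetical.

```python
def is_dynamically_stable(frequencies_thz, tolerance_thz=0.1):
    """Return True if no phonon frequency lies below -tolerance_thz.

    Imaginary modes are represented as negative frequencies (common
    convention). The default tolerance is an assumption, not the
    PhononBench setting.
    """
    frequencies = list(frequencies_thz)
    if not frequencies:
        raise ValueError("no frequencies given")
    return min(frequencies) >= -tolerance_thz


print(is_dynamically_stable([0.0, 1.2, 3.4]))   # no negative modes
print(is_dynamically_stable([-1.5, 0.2, 2.0]))  # one clearly imaginary mode
```

The stability statistics in the summary table are counts of structures that pass a check of this kind, whatever the precise threshold used.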

Citation

If you use PhononBench in your research, please cite the following paper:

@misc{han2025phononbench,
      title={PhononBench: A Large-Scale Phonon-Based Benchmark for Dynamical Stability in Crystal Generation},
      author={Xiao-Qi Han and Peng-Jie Guo and Ze-Feng Gao and Zhong-Yi Lu},
      year={2025},
      eprint={2512.21227},
      archivePrefix={arXiv},
      primaryClass={cond-mat.mtrl-sci},
      url={https://arxiv.org/abs/2512.21227}, 
}
