FSCTEP
A multi-operator equivariant framework for high-performance machine learning force fields, supporting external-field embedding and physical-tensor prediction.
FusedSCEquiTensorPot
FusedSCEquiTensorPot is an E(3)-equivariant neural potential for atomistic modeling with multiple tensor-product backends, explicit external-field conditioning, physical-tensor supervision, multi-fidelity training, and direct LAMMPS deployment.
Overview
- Backends: spherical, channelwise spherical (`spherical-save-cue`), partial Cartesian, sparse Cartesian, ICTD, and strict-parity `pure-cartesian-ictd-o3`.
- Field-aware learning: electric field (`1o`), magnetic field (`1e`), and rank-aware tensor inputs can be embedded into equivariant message passing.
- Physical tensor targets: charge, dipole, polarizability, quadrupole, BEC, and magnetic moment.
- Multi-fidelity: graph-level fidelity conditioning, delta-learning (`delta-baseline`), per-fidelity weighting, and per-fidelity metrics.
- Deployment: export to `core.pt`, run through `USER-MFFTORCH` or ML-IAP, and support runtime field / fidelity control in LAMMPS.
What This Repository Focuses On
- Multiple Cartesian and spherical equivariant trunks under one training/evaluation stack.
- External-field-aware tensor learning, including explicit parity-sensitive O(3) modeling.
- End-to-end workflow from preprocessing and training to `core.pt` export and LAMMPS runtime.
- Research-oriented extensions such as long-range prototypes, active learning, NEB, phonons, and thermal transport.
Dataset notes and conversion examples (rMD17, ANI-1x, QM7-X, SPICE, generic HDF5) are in USAGE.md (Chinese) and USAGE_EN.md (English).
Installation
pip install -e .
For a reproducible Linux CUDA setup with pinned PyTorch/cuEquivariance/PyG wheels:
bash scripts/install_pt271_cu128.sh
pip install -e .
Optional extras:
- `pip install -e ".[cue]"` for `spherical-save-cue`
- `pip install -e ".[pyg]"` for faster PyG scatter / neighbor ops
- `pip install -e ".[al]"` for SOAP-based active learning diversity
- `pip install -e ".[thermal]"` for thermal transport (`phono3py`, `scipy`)
Quick Start
1. Preprocess Data
mff-preprocess --input-file data.xyz --output-dir data
To skip neighbor-list preprocessing (for a quick sanity check):
mff-preprocess --input-file data.xyz --output-dir data --skip-h5
If your extxyz uses custom field names, you can override them explicitly:
mff-preprocess \
--input-file custom.extxyz \
--output-dir data \
--energy-key REF_energy \
--force-key REF_force \
--species-key elem \
--coord-key coords \
--atomic-number-key atomic_number
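To illustrate what the overridden key names above refer to, here is a hedged sketch that writes a single-frame extxyz file using those names with plain file I/O. The `Properties` column schema shown is an assumption about how your reader declares per-atom fields; adapt it to your data.

```python
# Write a minimal single-frame extxyz file using the custom key names
# passed to mff-preprocess above (REF_energy, REF_force, elem, coords).
# The Properties schema here is illustrative, not prescribed by the package.
def write_frame(path, symbols, positions, forces, energy):
    with open(path, "w") as f:
        f.write(f"{len(symbols)}\n")
        # Comment line: frame-level keys plus the per-atom column schema.
        f.write(
            f"REF_energy={energy:.6f} "
            "Properties=elem:S:1:coords:R:3:REF_force:R:3\n"
        )
        for s, p, fr in zip(symbols, positions, forces):
            f.write(f"{s} {p[0]:.6f} {p[1]:.6f} {p[2]:.6f} "
                    f"{fr[0]:.6f} {fr[1]:.6f} {fr[2]:.6f}\n")

write_frame("custom.extxyz",
            ["O", "H", "H"],
            [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)],
            [(0.0, 0.0, 0.0)] * 3,
            -2081.25)
```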
2. Train
Minimal training:
mff --train --data-dir data --epochs 1000 --batch-size 8 --device cuda
Recommended backbone examples:
# Memory-efficient ICTD
mff --train --data-dir data --device cuda --tensor-product-mode pure-cartesian-ictd
# Full-parity O(3) ICTD
mff --train --data-dir data --device cuda --tensor-product-mode pure-cartesian-ictd-o3
# Sparse Cartesian
mff --train --data-dir data --device cuda --tensor-product-mode pure-cartesian-sparse
Field-aware training examples:
# Electric field + dipole/polarizability
mff --train --data-dir data --tensor-product-mode pure-cartesian-sparse \
--external-tensor-rank 1 --external-field-file data/efield.npy \
--physical-tensors dipole,polarizability \
--dipole-file data/dipole.npy --polarizability-file data/pol.npy \
--physical-tensor-weights "dipole:2.0,polarizability:1.0"
# Magnetic field (1e) + magnetic moment
mff --train --data-dir data --tensor-product-mode pure-cartesian-ictd-o3 \
--external-tensor-rank 1 --external-tensor-irrep 1e \
--o3-irrep-preset auto \
--o3-active-irreps '0e,1e,2e' \
--external-field-file data/bfield.npy \
--physical-tensors magnetic_moment \
--magnetic-moment-file data/magnetic_moment.npy
# Simultaneous electric field (1o) + magnetic field (1e)
mff --train --data-dir data --tensor-product-mode pure-cartesian-ictd-o3 \
--external-tensor-rank 1 --external-tensor-irrep 1o \
--external-field-file data/efield.npy \
--magnetic-field-file data/bfield.npy \
--o3-irrep-preset auto \
--o3-active-irreps '0e,1e,1o,2e'
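The field files referenced above (`data/efield.npy`, `data/bfield.npy`) are plain NumPy arrays. A minimal sketch of generating them, assuming one Cartesian 3-vector per structure, i.e. shape `(n_frames, 3)`; check the loader for the exact shape your dataset expects:

```python
import os
import numpy as np

# Sketch: build per-structure external-field arrays for
# --external-field-file and --magnetic-field-file. Assumed shape:
# one 3-vector per frame, (n_frames, 3).
os.makedirs("data", exist_ok=True)
n_frames = 100

# Uniform electric field along z for every frame.
efield = np.zeros((n_frames, 3))
efield[:, 2] = 0.01
np.save("data/efield.npy", efield)

# Uniform magnetic field along x.
bfield = np.zeros((n_frames, 3))
bfield[:, 0] = 0.005
np.save("data/bfield.npy", bfield)
```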
Optional ZBL short-range repulsion:
mff --train --data-dir data --tensor-product-mode pure-cartesian-ictd \
--zbl-enabled \
--zbl-inner-cutoff 0.6 \
--zbl-outer-cutoff 1.2 \
--zbl-exponent 0.23 \
--zbl-energy-scale 1.0
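The ZBL universal screening function itself is standard physics; the sketch below computes the bare ZBL pair energy in eV and Å. How the flags above splice it into the learned potential (inner/outer cutoffs, energy scale) is internal to the package and not reproduced here.

```python
import math

# Standard ZBL universal screening function and unscreened pair energy.
# The 0.23 exponent in the screening length matches --zbl-exponent above.
COEFFS = [(0.18175, 3.19980), (0.50986, 0.94229),
          (0.28022, 0.40290), (0.02817, 0.20162)]

def zbl_energy(z1, z2, r, exponent=0.23):
    """ZBL repulsive pair energy in eV for separation r in Angstrom."""
    a = 0.46850 / (z1**exponent + z2**exponent)   # screening length (Å)
    phi = sum(c * math.exp(-d * r / a) for c, d in COEFFS)
    return 14.399645 * z1 * z2 / r * phi          # Coulomb constant in eV·Å
```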
Multi-Fidelity
Supported modes:
- `spherical-save-cue`
- `pure-cartesian-ictd`
- `pure-cartesian-ictd-o3`
- `pure-cartesian-sparse`
- `pure-cartesian-sparse-save`
Conditioning-only multi-fidelity:
mff --train \
--data-dir data \
--tensor-product-mode pure-cartesian-ictd \
--num-fidelity-levels 2 \
--fidelity-id-file data/train_fidelity_id.npy \
--fidelity-loss-weights '0:1.0,1:3.0'
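The fidelity-id file is, as far as the CLI above suggests, just one integer label per training structure saved as a NumPy array (the `mff --merge-multifidelity` tool can also generate it for you). A hedged sketch:

```python
import os
import numpy as np

# One integer fidelity label per training structure (assumed layout):
# here the first 800 frames are low-fidelity level 0 and the last 200
# are high-fidelity level 1.
os.makedirs("data", exist_ok=True)
fid = np.concatenate([np.zeros(800, dtype=np.int64),
                      np.ones(200, dtype=np.int64)])
np.save("data/train_fidelity_id.npy", fid)
```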
Delta-learning multi-fidelity:
mff --train \
--data-dir data \
--tensor-product-mode pure-cartesian-ictd-o3 \
--num-fidelity-levels 2 \
--multi-fidelity-mode delta-baseline \
--fidelity-id-file data/train_fidelity_id.npy \
--fidelity-loss-weights '0:1.0,1:3.0' \
--delta-regularization-weight 1e-4
Merge multiple processed HDF5 files into one multi-fidelity dataset:
mff --merge-multifidelity \
--inputs data/processed_pbe.h5 data/processed_hse.h5 \
--fidelity-ids 0 1 \
--output-h5 data/processed_train_mf.h5 \
--output-fidelity-npy data/train_fidelity_id.npy
Train with LES-style long-range (mesh_fft, recommended first-stage settings):
# 3D periodic reciprocal long-range
mff-train --data-dir data --tensor-product-mode pure-cartesian-ictd \
--long-range-mode reciprocal-spectral-v1 \
--long-range-reciprocal-backend mesh_fft \
--long-range-boundary periodic \
--long-range-mesh-size 16 \
--long-range-green-mode poisson \
--long-range-energy-partition potential \
--long-range-assignment cic
# Slab reciprocal long-range: x/y periodic + z vacuum padding
mff-train --data-dir data --tensor-product-mode pure-cartesian-ictd \
--long-range-mode reciprocal-spectral-v1 \
--long-range-reciprocal-backend mesh_fft \
--long-range-boundary slab \
--long-range-mesh-size 16 \
--long-range-slab-padding-factor 2 \
--long-range-green-mode poisson \
--long-range-energy-partition potential \
--long-range-assignment cic
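Conceptually, the `mesh_fft` backend with `--long-range-green-mode poisson` assigns charges to a mesh and solves Poisson's equation in reciprocal space. The sketch below shows the periodic 3D Poisson solve in isolation (Gaussian units, cubic box); it is a pedagogical stand-in, not the package's implementation, and omits the CIC assignment and slab padding.

```python
import numpy as np

def poisson_fft(rho, box):
    """Solve ∇²φ = -4π ρ on a periodic cubic mesh via FFT (Gaussian units)."""
    n = rho.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=box / n)   # mesh wavevectors
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    rho_k = np.fft.fftn(rho)
    phi_k = np.zeros_like(rho_k)
    mask = k2 > 0                  # drop k=0 mode (neutralizing background)
    phi_k[mask] = 4 * np.pi * rho_k[mask] / k2[mask]
    return np.real(np.fft.ifftn(phi_k))
```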
Notes:
- Supported training architectures: `pure-cartesian-ictd`, `spherical-save-cue`
- Recommended first use: keep `--long-range-green-mode poisson`
- ASE active learning now supports the same `periodic`/`slab` boundary semantics for the Python calculator path
By default, the dynamic loss weights `a`/`b` (which evolve during training) are clamped to [1, 1000]. You can override the range:
mff-train --data-dir data --a 10.0 --b 100.0 --update-param 750 --weight-a-growth 1.05 --weight-b-decay 0.98 --a-max 1000 --b-min 1 --b-max 1000
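Reading the flags above, the schedule appears to be a clamped multiplicative update: every `--update-param` steps, `a` is multiplied by its growth factor and `b` by its decay factor, then both are clamped to their ranges. The exact semantics inside the trainer may differ; this is an assumed sketch:

```python
def update_weights(a, b, step, update_every=750,
                   a_growth=1.05, b_decay=0.98,
                   a_min=1.0, a_max=1000.0, b_min=1.0, b_max=1000.0):
    # Assumed clamped multiplicative schedule: every `update_every`
    # steps, grow a and shrink b, then clamp each into its range.
    if step > 0 and step % update_every == 0:
        a = min(max(a * a_growth, a_min), a_max)
        b = min(max(b * b_decay, b_min), b_max)
    return a, b
```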
Optional: override baseline atomic energies (E0):
# from CSV (Atom,E0)
mff-train --data-dir data --atomic-energy-file data/fitted_E0.csv
# or directly from CLI
mff-train --data-dir data --atomic-energy-keys 1 6 7 8 --atomic-energy-values -430.53 -821.03 -1488.19 -2044.35
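Per-element baselines like those in `fitted_E0.csv` are typically obtained by a least-squares fit of total energies against element counts. A hedged sketch of that fit (the package's own fitting procedure may differ):

```python
import numpy as np

# Fit per-element baseline energies E0 by least squares:
#   E_total(structure) ≈ Σ_z  n_z(structure) · E0_z
def fit_e0(atomic_numbers_per_frame, energies):
    species = sorted({z for frame in atomic_numbers_per_frame for z in frame})
    # Composition matrix: one row per frame, one column per species.
    counts = np.array([[frame.count(z) for z in species]
                       for frame in atomic_numbers_per_frame], dtype=float)
    e0, *_ = np.linalg.lstsq(counts, np.asarray(energies, dtype=float),
                             rcond=None)
    return dict(zip(species, e0))
```

The resulting `{atomic_number: E0}` mapping can be written out as the `Atom,E0` CSV expected by `--atomic-energy-file`.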
3. Evaluation
Evaluate a trained model. The recommended default is to let `mff-evaluate` restore model-structure hyperparameters and `tensor_product_mode` from the checkpoint automatically:
mff-evaluate --checkpoint combined_model.pth --test-prefix test --output-prefix test --use-h5
If you explicitly pass conflicting structure arguments such as `--tensor-product-mode`, `--embedding-dim`, `--output-size`, or `--invariant-channels`, the CLI takes precedence over the checkpoint. For new checkpoints, `mff-evaluate` can also restore `atomic_energy_keys`/`atomic_energy_values` directly from the checkpoint; older checkpoints still fall back to the local `fitted_E0.csv` behavior. Only pass these arguments when you intentionally want to override the checkpoint configuration.
Outputs include:
- `test_loss.csv`
- `test_energy.csv`
- `test_force.csv`
Optional: use `--compile e3trans` to accelerate evaluation with `torch.compile`.
For molecular dynamics simulation:
mff-evaluate --checkpoint combined_model.pth --md-sim
For NEB (Nudged Elastic Band) calculations:
mff-evaluate --checkpoint combined_model.pth --neb
For phonon spectrum (Hessian, vibrational frequencies):
mff-evaluate --checkpoint combined_model.pth --phonon --phonon-input structure.xyz
Optional: stress training (PBC with stress/virial in XYZ):
mff-train --data-dir data -c 0.1 --input-file pbc_with_stress.xyz
4. Active Learning (Optional) 🔄
Grow your training set automatically where the potential is under-sampled: one CLI runs the full train → explore → select → label (DFT) → merge loop. Works on a single machine (PySCF, VASP, …) or on HPC (SLURM, one job per structure).
# Local: PySCF, 8 parallel workers
mff-active-learn --explore-type ase --explore-mode md --label-type pyscf \
--pyscf-method b3lyp --pyscf-basis 6-31g* \
--label-n-workers 8 --md-steps 500 --n-iterations 5
# HPC: SLURM, one job per structure
mff-active-learn --explore-type ase --label-type slurm \
--slurm-template dft_job.sh --slurm-partition cpu \
--slurm-nodes 1 --slurm-ntasks 32 --slurm-time 04:00:00
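The "select" step of the loop typically keeps structures where an ensemble of models disagrees on the forces. A hedged sketch of committee-based selection; the package's actual criterion, names, and thresholds may differ:

```python
import numpy as np

# Assumed select step: keep explored structures whose committee force
# disagreement (RMS of the per-component std over models) exceeds a
# threshold. `committee_forces` has shape (n_models, n_structures,
# n_atoms, 3); the threshold value here is illustrative.
def select_uncertain(committee_forces, threshold=0.15):
    std = committee_forces.std(axis=0)                    # per atom, per xyz
    per_structure = np.sqrt((std**2).mean(axis=(1, 2)))   # RMS per structure
    return np.where(per_structure > threshold)[0]
```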
📖 Full CLI & options: USAGE.md (Chinese) · USAGE_EN.md (English) · ACTIVE_LEARNING.md (backends, multi-stage, FAQ)
