FSCTEP
A multi-operator equivariant framework for high-performance machine learning force fields, supporting external-field embedding and physical-tensor prediction.
FusedSCEquiTensorPot
FusedSCEquiTensorPot is an E(3)-equivariant neural potential for atomistic modeling with multiple tensor-product backends, explicit external-field conditioning, physical-tensor supervision, multi-fidelity training, and direct LAMMPS deployment.
Overview
- Backends: spherical, channelwise spherical (`spherical-save-cue`), partial Cartesian, sparse Cartesian, ICTD, and strict-parity `pure-cartesian-ictd-o3`.
- Field-aware learning: electric field (`1o`), magnetic field (`1e`), and rank-aware tensor inputs can be embedded into equivariant message passing.
- Physical tensor targets: charge, dipole, polarizability, quadrupole, BEC, and magnetic moment.
- Multi-fidelity: graph-level fidelity conditioning, delta-learning (`delta-baseline`), per-fidelity weighting, and per-fidelity metrics.
- Deployment: export to `core.pt`, run through `USER-MFFTORCH` or ML-IAP, and support runtime field / fidelity control in LAMMPS.
What This Repository Focuses On
- Multiple Cartesian and spherical equivariant trunks under one training/evaluation stack.
- External-field-aware tensor learning, including explicit parity-sensitive O(3) modeling.
- End-to-end workflow from preprocessing and training to `core.pt` export and LAMMPS runtime.
- Research-oriented extensions such as long-range prototypes, active learning, NEB, phonons, and thermal transport.
Dataset notes and conversion examples (rMD17, ANI-1x, QM7-X, SPICE, generic HDF5) are in USAGE.md (Chinese) and USAGE_EN.md (English).
Installation
pip install -e .
For a reproducible Linux CUDA setup with pinned PyTorch/cuEquivariance/PyG wheels:
bash scripts/install_pt271_cu128.sh
pip install -e .
Optional extras:
- `pip install -e ".[cue]"` for `spherical-save-cue`
- `pip install -e ".[pyg]"` for faster PyG scatter / neighbor ops
- `pip install -e ".[al]"` for SOAP-based active learning diversity
- `pip install -e ".[thermal]"` for thermal transport (`phono3py`, `scipy`)
Quick Start
1. Preprocess Data
mff-preprocess --input-file data.xyz --output-dir data
To skip neighbor-list preprocessing (for a quick sanity check):
mff-preprocess --input-file data.xyz --output-dir data --skip-h5
If your extxyz uses custom field names, you can override them explicitly:
mff-preprocess \
--input-file custom.extxyz \
--output-dir data \
--energy-key REF_energy \
--force-key REF_force \
--species-key elem \
--coord-key coords \
--atomic-number-key atomic_number
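To illustrate what the overridden key names above refer to, here is a hedged sketch that writes a single-frame extxyz file using those names with plain file I/O. The `Properties` column schema shown is an assumption about how your reader declares per-atom fields; adapt it to your data.

```python
# Write a minimal single-frame extxyz file using the custom key names
# passed to mff-preprocess above (REF_energy, REF_force, elem, coords).
# The Properties schema here is illustrative, not prescribed by the package.
def write_frame(path, symbols, positions, forces, energy):
    with open(path, "w") as f:
        f.write(f"{len(symbols)}\n")
        # Comment line: frame-level keys plus the per-atom column schema.
        f.write(
            f"REF_energy={energy:.6f} "
            "Properties=elem:S:1:coords:R:3:REF_force:R:3\n"
        )
        for s, p, fr in zip(symbols, positions, forces):
            f.write(f"{s} {p[0]:.6f} {p[1]:.6f} {p[2]:.6f} "
                    f"{fr[0]:.6f} {fr[1]:.6f} {fr[2]:.6f}\n")

write_frame("custom.extxyz",
            ["O", "H", "H"],
            [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)],
            [(0.0, 0.0, 0.0)] * 3,
            -2081.25)
```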
2. Train
Minimal training:
mff --train --data-dir data --epochs 1000 --batch-size 8 --device cuda
Recommended backbone examples:
# Memory-efficient ICTD
mff --train --data-dir data --device cuda --tensor-product-mode pure-cartesian-ictd
# Full-parity O(3) ICTD
mff --train --data-dir data --device cuda --tensor-product-mode pure-cartesian-ictd-o3
# Sparse Cartesian
mff --train --data-dir data --device cuda --tensor-product-mode pure-cartesian-sparse
Field-aware training examples:
# Electric field + dipole/polarizability
mff --train --data-dir data --tensor-product-mode pure-cartesian-sparse \
--external-tensor-rank 1 --external-field-file data/efield.npy \
--physical-tensors dipole,polarizability \
--dipole-file data/dipole.npy --polarizability-file data/pol.npy \
--physical-tensor-weights "dipole:2.0,polarizability:1.0"
# Magnetic field (1e) + magnetic moment
mff --train --data-dir data --tensor-product-mode pure-cartesian-ictd-o3 \
--external-tensor-rank 1 --external-tensor-irrep 1e \
--o3-irrep-preset auto \
--o3-active-irreps '0e,1e,2e' \
--external-field-file data/bfield.npy \
--physical-tensors magnetic_moment \
--magnetic-moment-file data/magnetic_moment.npy
# Simultaneous electric field (1o) + magnetic field (1e)
mff --train --data-dir data --tensor-product-mode pure-cartesian-ictd-o3 \
--external-tensor-rank 1 --external-tensor-irrep 1o \
--external-field-file data/efield.npy \
--magnetic-field-file data/bfield.npy \
--o3-irrep-preset auto \
--o3-active-irreps '0e,1e,1o,2e'
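The field files referenced above (`data/efield.npy`, `data/bfield.npy`) are plain NumPy arrays. A minimal sketch of generating them, assuming one Cartesian 3-vector per structure, i.e. shape `(n_frames, 3)`; check the loader for the exact shape your dataset expects:

```python
import os
import numpy as np

# Sketch: build per-structure external-field arrays for
# --external-field-file and --magnetic-field-file. Assumed shape:
# one 3-vector per frame, (n_frames, 3).
os.makedirs("data", exist_ok=True)
n_frames = 100

# Uniform electric field along z for every frame.
efield = np.zeros((n_frames, 3))
efield[:, 2] = 0.01
np.save("data/efield.npy", efield)

# Uniform magnetic field along x.
bfield = np.zeros((n_frames, 3))
bfield[:, 0] = 0.005
np.save("data/bfield.npy", bfield)
```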
Optional ZBL short-range repulsion:
mff --train --data-dir data --tensor-product-mode pure-cartesian-ictd \
--zbl-enabled \
--zbl-inner-cutoff 0.6 \
--zbl-outer-cutoff 1.2 \
--zbl-exponent 0.23 \
--zbl-energy-scale 1.0
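The ZBL universal screening function itself is standard physics; the sketch below computes the bare ZBL pair energy in eV and Å. How the flags above splice it into the learned potential (inner/outer cutoffs, energy scale) is internal to the package and not reproduced here.

```python
import math

# Standard ZBL universal screening function and unscreened pair energy.
# The 0.23 exponent in the screening length matches --zbl-exponent above.
COEFFS = [(0.18175, 3.19980), (0.50986, 0.94229),
          (0.28022, 0.40290), (0.02817, 0.20162)]

def zbl_energy(z1, z2, r, exponent=0.23):
    """ZBL repulsive pair energy in eV for separation r in Angstrom."""
    a = 0.46850 / (z1**exponent + z2**exponent)   # screening length (Å)
    phi = sum(c * math.exp(-d * r / a) for c, d in COEFFS)
    return 14.399645 * z1 * z2 / r * phi          # Coulomb constant in eV·Å
```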
Multi-Fidelity
Supported modes:
- `spherical-save-cue`
- `pure-cartesian-ictd`
- `pure-cartesian-ictd-o3`
- `pure-cartesian-sparse`
- `pure-cartesian-sparse-save`
Conditioning-only multi-fidelity:
mff --train \
--data-dir data \
--tensor-product-mode pure-cartesian-ictd \
--num-fidelity-levels 2 \
--fidelity-id-file data/train_fidelity_id.npy \
--fidelity-loss-weights '0:1.0,1:3.0'
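The fidelity-id file is, as far as the CLI above suggests, just one integer label per training structure saved as a NumPy array (the `mff --merge-multifidelity` tool can also generate it for you). A hedged sketch:

```python
import os
import numpy as np

# One integer fidelity label per training structure (assumed layout):
# here the first 800 frames are low-fidelity level 0 and the last 200
# are high-fidelity level 1.
os.makedirs("data", exist_ok=True)
fid = np.concatenate([np.zeros(800, dtype=np.int64),
                      np.ones(200, dtype=np.int64)])
np.save("data/train_fidelity_id.npy", fid)
```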
Delta-learning multi-fidelity:
mff --train \
--data-dir data \
--tensor-product-mode pure-cartesian-ictd-o3 \
--num-fidelity-levels 2 \
--multi-fidelity-mode delta-baseline \
--fidelity-id-file data/train_fidelity_id.npy \
--fidelity-loss-weights '0:1.0,1:3.0' \
--delta-regularization-weight 1e-4
Merge multiple processed HDF5 files into one multi-fidelity dataset:
mff --merge-multifidelity \
--inputs data/processed_pbe.h5 data/processed_hse.h5 \
--fidelity-ids 0 1 \
--output-h5 data/processed_train_mf.h5 \
--output-fidelity-npy data/train_fidelity_id.npy
Train with LES-style long-range (mesh_fft, recommended first-stage settings):
# 3D periodic reciprocal long-range
mff-train --data-dir data --tensor-product-mode pure-cartesian-ictd \
--long-range-mode reciprocal-spectral-v1 \
--long-range-reciprocal-backend mesh_fft \
--long-range-boundary periodic \
--long-range-mesh-size 16 \
--long-range-green-mode poisson \
--long-range-energy-partition potential \
--long-range-assignment cic
# Slab reciprocal long-range: x/y periodic + z vacuum padding
mff-train --data-dir data --tensor-product-mode pure-cartesian-ictd \
--long-range-mode reciprocal-spectral-v1 \
--long-range-reciprocal-backend mesh_fft \
--long-range-boundary slab \
--long-range-mesh-size 16 \
--long-range-slab-padding-factor 2 \
--long-range-green-mode poisson \
--long-range-energy-partition potential \
--long-range-assignment cic
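Conceptually, the `mesh_fft` backend with `--long-range-green-mode poisson` assigns charges to a mesh and solves Poisson's equation in reciprocal space. The sketch below shows the periodic 3D Poisson solve in isolation (Gaussian units, cubic box); it is a pedagogical stand-in, not the package's implementation, and omits the CIC assignment and slab padding.

```python
import numpy as np

def poisson_fft(rho, box):
    """Solve ∇²φ = -4π ρ on a periodic cubic mesh via FFT (Gaussian units)."""
    n = rho.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=box / n)   # mesh wavevectors
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    rho_k = np.fft.fftn(rho)
    phi_k = np.zeros_like(rho_k)
    mask = k2 > 0                  # drop k=0 mode (neutralizing background)
    phi_k[mask] = 4 * np.pi * rho_k[mask] / k2[mask]
    return np.real(np.fft.ifftn(phi_k))
```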
Notes:
- Supported training architectures: `pure-cartesian-ictd`, `spherical-save-cue`
- Recommended first use: keep `--long-range-green-mode poisson`
- ASE active learning now supports the same `periodic`/`slab` boundary semantics for the Python calculator path
By default, the dynamic loss weights `a`/`b` (which evolve during training) are clamped to [1, 1000]. You can override the range:
mff-train --data-dir data --a 10.0 --b 100.0 --update-param 750 --weight-a-growth 1.05 --weight-b-decay 0.98 --a-max 1000 --b-min 1 --b-max 1000
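Reading the flags above, the schedule appears to be a clamped multiplicative update: every `--update-param` steps, `a` is multiplied by its growth factor and `b` by its decay factor, then both are clamped to their ranges. The exact semantics inside the trainer may differ; this is an assumed sketch:

```python
def update_weights(a, b, step, update_every=750,
                   a_growth=1.05, b_decay=0.98,
                   a_min=1.0, a_max=1000.0, b_min=1.0, b_max=1000.0):
    # Assumed clamped multiplicative schedule: every `update_every`
    # steps, grow a and shrink b, then clamp each into its range.
    if step > 0 and step % update_every == 0:
        a = min(max(a * a_growth, a_min), a_max)
        b = min(max(b * b_decay, b_min), b_max)
    return a, b
```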
Optional: override baseline atomic energies (E0):
# from CSV (Atom,E0)
mff-train --data-dir data --atomic-energy-file data/fitted_E0.csv
# or directly from CLI
mff-train --data-dir data --atomic-energy-keys 1 6 7 8 --atomic-energy-values -430.53 -821.03 -1488.19 -2044.35
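Per-element baselines like those in `fitted_E0.csv` are typically obtained by a least-squares fit of total energies against element counts. A hedged sketch of that fit (the package's own fitting procedure may differ):

```python
import numpy as np

# Fit per-element baseline energies E0 by least squares:
#   E_total(structure) ≈ Σ_z  n_z(structure) · E0_z
def fit_e0(atomic_numbers_per_frame, energies):
    species = sorted({z for frame in atomic_numbers_per_frame for z in frame})
    # Composition matrix: one row per frame, one column per species.
    counts = np.array([[frame.count(z) for z in species]
                       for frame in atomic_numbers_per_frame], dtype=float)
    e0, *_ = np.linalg.lstsq(counts, np.asarray(energies, dtype=float),
                             rcond=None)
    return dict(zip(species, e0))
```

The resulting `{atomic_number: E0}` mapping can be written out as the `Atom,E0` CSV expected by `--atomic-energy-file`.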
3. Evaluation
Evaluate a trained model. The recommended default is to let `mff-evaluate` restore model-structure hyperparameters and `tensor_product_mode` from the checkpoint automatically:
mff-evaluate --checkpoint combined_model.pth --test-prefix test --output-prefix test --use-h5
If you explicitly pass conflicting structure arguments such as `--tensor-product-mode`, `--embedding-dim`, `--output-size`, or `--invariant-channels`, the CLI takes precedence over the checkpoint. For new checkpoints, `mff-evaluate` can also restore `atomic_energy_keys`/`atomic_energy_values` directly from the checkpoint; older checkpoints still fall back to the local `fitted_E0.csv` behavior. Only pass these arguments when you intentionally want to override the checkpoint configuration.
Outputs include:
- `test_loss.csv`
- `test_energy.csv`
- `test_force.csv`
Optional: use `--compile e3trans` to accelerate evaluation with `torch.compile`.
For molecular dynamics simulation:
mff-evaluate --checkpoint combined_model.pth --md-sim
For NEB (Nudged Elastic Band) calculations:
mff-evaluate --checkpoint combined_model.pth --neb
For phonon spectrum (Hessian, vibrational frequencies):
mff-evaluate --checkpoint combined_model.pth --phonon --phonon-input structure.xyz
Optional: stress training (PBC with stress/virial in XYZ):
mff-train --data-dir data -c 0.1 --input-file pbc_with_stress.xyz
4. Active Learning (Optional) 🔄
Grow your training set automatically where the potential is under-sampled: one CLI runs the full train → explore → select → label (DFT) → merge loop. Works on a single machine (PySCF, VASP, …) or on HPC (SLURM, one job per structure).
# Local: PySCF, 8 parallel workers
mff-active-learn --explore-type ase --explore-mode md --label-type pyscf \
--pyscf-method b3lyp --pyscf-basis 6-31g* \
--label-n-workers 8 --md-steps 500 --n-iterations 5
# HPC: SLURM, one job per structure
mff-active-learn --explore-type ase --label-type slurm \
--slurm-template dft_job.sh --slurm-partition cpu \
--slurm-nodes 1 --slurm-ntasks 32 --slurm-time 04:00:00
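The "select" step of the loop typically keeps structures where an ensemble of models disagrees on the forces. A hedged sketch of committee-based selection; the package's actual criterion, names, and thresholds may differ:

```python
import numpy as np

# Assumed select step: keep explored structures whose committee force
# disagreement (RMS of the per-component std over models) exceeds a
# threshold. `committee_forces` has shape (n_models, n_structures,
# n_atoms, 3); the threshold value here is illustrative.
def select_uncertain(committee_forces, threshold=0.15):
    std = committee_forces.std(axis=0)                    # per atom, per xyz
    per_structure = np.sqrt((std**2).mean(axis=(1, 2)))   # RMS per structure
    return np.where(per_structure > threshold)[0]
```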
📖 Full CLI & options: USAGE.md (Chinese) · USAGE_EN.md (English) · ACTIVE_LEARNING.md (backends, multi-stage, FAQ)
