WaterNES
Workflows to calculate relative free energies using non-equilibrium switching for buried water molecules
Install / Use
/learn @MobleyLab/WaterNESREADME
waterNES
⚠️ Development Status
main→ stable, reproducible workflows used in older publicationssw_devel→ active development, new features, ongoing changes published in a recent paper
If you are using this code for research or reproduction, use main.
If you want the latest updates:
git checkout sw_devel
This repository serves as the main repository for the following two open access publications:
-
Advancing Binding Affinity Calculations: A Non-Equilibrium Simulations Approach for Calculation of Relative Binding Free Energies in Systems with Trapped Waters; Swapnil Wagle, Christopher Bayly and David Mobley; ChemRxiv, 2025, DOI: 10.26434/chemrxiv-2025-xtjl3 (https://chemrxiv.org/engage/chemrxiv/article-details/67d3ea816dde43c908c21ba4)
-
Leveraging a Separation of States Method for Relative Binding Free Energy Calculations in Systems with Trapped Waters; Swapnil Wagle, Pascal T. Merz, Yunhui Ge, Christopher I. Bayly and David L. Mobley; J. Chem. Theory Comput. 2024, 20, 11013−11031 (https://pubs.acs.org/doi/10.1021/acs.jctc.4c01145)
This repository contains workflows to:
-
calculate relative free energies (RBFEs) between ligand pairs with different numbers of trapped water molecules and implementing a non-equilibrium switching (NES)-based workflow, using thermodynamic cycle

-
calculate RBFEs between ligands that bind to target protein with different numbers of trapped water molecules. The RBFE is calculated using the following thermodynamic cycle

-
calculate absolute binding free energies (ABFEs) of trapped waters in proteins/protein-ligand complexes using the following thermodynamic cycle

The purple cycle includes additional restraints for the protein, and is preferred when it is likely that the binding site might collapse once the water is removed. In all other cases, the simpler blue cycle can be used.
Cycle branches
- A, the target branch: We will find its value by summing all other branches.
- B, water restraints correction for interacting water: Calculated from equilibrium simulations of its end points (stages 1 and 2, simulations of the protein interacting with the unrestrained and restrained water, respectively).
- C, water coupling / decoupling: Calculated using multiple NES simulations from each end point to the other (stages 2 and 3, simulations of the protein with and without interactions with the restrained water). The starting structures are obtained from long equilibrium simulations at the end points.
- D, water restraints correction for non-interacting water: This value can be calculated analytically, no simulations required.
- E, transfer of non-interacting water to bulk water: This branch does not involve any free energy change, $\Delta G_{E} = 0$.
- F, turn on interactions of water molecule in bulk water: This process is only depending on the water model, so it can be simulated once and reused for every system which uses the same water model. This is currently not part of the workflow.
Stages
- Stage 1: This simulates the protein complex with an unrestrained, fully interacting water molecule. Simulations include minimization, equilibration in NVT and NPT, production simulation in NPT. During the production run, the free energy with respect to the water restraint (which is turned off) is calculated.
- Stage 2: This simulates the protein complex with a restrained, fully interacting water molecule. Simulations include minimization, equilibration in NVT and NPT, production simulation in NPT. During the production run, the free energy with respect to the water restraint (which is turned on) is calculated. Characteristic structures from the production simulation are used to start NES simulations (transition stage 2 to 3).
- Stage 3: This simulates the protein complex with a restrained, non-interacting water molecule. Simulations include minimization, equilibration in NVT and NPT, production simulation in NPT. Characteristic structures from the production simulation are used to start NES simulations (transition stage 3 to 2).
- Stages 4 and 5: Coming soon.
Systems
Currently, this repo contains input files for five proteins, each in combination with one to three different ligands. The protein-ligand complexes are named after their PDB ID. They include
- HIV-1 protease: 1HPX, 1EC0, 1EBW
- Trypsin: 1AZ8, 1C5T, 1GI1
- Scytalone dehydratase[1]: 3STD, 4STD, 7STD
- Factor Xa: 1EZQ, 1LPG, 1F0S
- BPTI: 5PTI
[1]: 3STD has a different water site than 4STD and 7STD
Usage
Run simulations
All simulations can be run using the bash file
scripts/run_simulations.sh. There is an integrated documentation
which can be called by running the script with the -h flag:
$ bash scripts/run_simulations.sh -h
Run Water NES simulations.
USAGE: run_simulations.sh [-h] -d dir -t dir [-c crd] -x gmx [-o opts] -s N -p {min,eqNVT,eqNPT,prod,NES} [-n N]
Options:
-h Print this help and exit
-d dir The base working directory for the stage
-t top The topology directory including topology files
-c crd The input coordinates
Mandatory for the minimization phase, ignored otherwise
-x gmx The GROMACS executable to use
-o opts The run options, e.g. "-nt 16 -ntomp 4", optional
-s N The stage to run, one of:
1: Stage with fully interacting, non-restrained water
2: Stage with fully interacting, restrained water
3: Stage with non-interacting, restrained water
-p {min,eqNVT,eqNPT,prod,NES}
The simulation phase to run, one or more of: min, eqNVT, eqNPT, prod, NES
Separate different phases by spaces, e.g. -p "min eqNVT eqNPT"
Note: The NES phase is only available for stages 2 and 3
-n N The run number for the NES phase
Mandatory for the NES phase, ignored otherwise
In the above USAGE string, flags in brackets ([]) are optional,
options in braces ({}) denote possible values, and all other flags
are mandatory.
The directory (-d) is the base directory. All simulations input and
output is going to be written in subdirectories of this base
directory. The only requirement for this directory is full write
access. All files within the directory might be overwritten.
The topology directory (-t) needs to contain topology files for the
system to be simulated. Required files are the main topology named as
system.top, as well as files containing $\lambda$-dependent and
$\lambda$-independent position restraints for the water, named
waterRestraintLambdaIndependent.top and
waterRestraintLambdaDependent.top, respectively.
The coordinate file (-c) needs to contain starting coordinates for
the system matching with the topology found in the topology directory.
This is only needed for the minimization phase.
The GROMACS executable to use (-x) can either be a path to an
executable, or simply gmx if it is in the path.
The run options (-o) are only used for the call to gmx mdrun, and
allow to set parallelization details or other run time options
accepted by gmx mdrun.
The stage (-s) denotes one of the stages of the thermodynamic cycle
as defined above.
The simulation phase (-p) defines the simulation to be run, one of
min (minimization of the input configuration), eqNVT
(thermalization at constant volume), eqNPT (equilibration at
constant temperature and pressure), prod (production simulation in
NPT), and NES (non-equilibrium transition simulations, only for
stages 2 and 3). Each phase builds on the previous phase. Several
phases can be combined. Note that the NES phase requires that the
starting structures were prepared from the result of the prod phase,
so it cannot be directly chained.
For the NES phase, the run number needs to be given with the flag
-n. Each run starts from a different configuration, extracted from
the production simulation with the scripts/prepare_nes_structures.sh
script. The -n flag is ignored for all other phases.
Prepare configurations for NES runs
NES simulations need to be run from multiple starting structures
representative of the ensemble of configurations in the end states. To
achieve this, we extract configurations from the long NPT production
simulations in stages 2 and 3. This operation can be performed using
the scripts/prepare_nes_structures.sh script. Again, a help function
describes the functionality:
Prepare structures for NES simulations
USAGE: prepare_nes_structures.sh [-h] -d dir -x gmx -n num
Options:
-h Print this help and exit
-d dir The base working directory of the stage
This directory is expected to have a subdirectory prod/
with the results of the production simulation of that stage
-x gmx The GROMACS executable to use
-n num The number of structures to prepare
In the above USAGE string, flags in brackets ([]) are optional,
while all other flags are mandatory.
The directory (-d) is the base directory of the stage in which all
simulation input and output was written, and should be the same
directory that was used for the -d flag of the run_simulations.sh
script. Specifically, the script is looking for the mdp file and the
trajectory file in the prod/ folder under this base directory.
The GROMACS executable to use (-x) can either be a path to an
executable, or simply gmx if it is in the path.
Finally, -n denotes the number of structures to prepare. This
defines how many structures are extracted from the production
simulation for further use.
Analysis
Edge B
Analysis of the equilibrium endpoint free energy differe
Related Skills
node-connect
352.2kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
111.1kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
352.2kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
352.2kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
