SubPEx
SubPEx (Sub-Pocket Explorer) is a tool to enhance ensemble (multiple-receptor/relaxed-complex) virtual screening. It uses weighted ensemble path sampling and molecular dynamics simulations to accelerate binding-pocket sampling.
What is it?
Subpocket explorer (SubPEx) is a tool that uses weighted ensemble (WE) path sampling, as implemented in WESTPA, to maximize pocket conformational search. SubPEx's goal is to produce a diverse ensemble of protein conformations for use in ensemble docking.
As with any WE implementation, SubPEx uses a progress coordinate to focus computational power on sampling phase space. The available progress coordinates are:
- composite RMSD (a linear combination of backbone and pocket heavy-atom RMSD)
- pocket heavy-atom RMSD
- backbone RMSD
- Jaccard distance of pocket volumes (jd)
We highly recommend the composite RMSD progress coordinate. The other coordinates are not officially supported and may perform poorly.
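To make two of these definitions concrete, here is a minimal Python sketch. It is not SubPEx's own implementation. It assumes each pocket is represented as a set of grid points and that the two RMSD values have already been computed. The function names and the default weights are hypothetical, since the weights SubPEx uses are not stated here.

```python
def jaccard_distance(points_a, points_b):
    """Jaccard distance between two point sets: 1 - |intersection| / |union|."""
    a, b = set(points_a), set(points_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen for this sketch: two empty pockets are identical
    return 1.0 - len(a & b) / len(union)


def composite_rmsd(backbone_rmsd, pocket_rmsd, w_backbone=1.0, w_pocket=1.0):
    """Linear combination of backbone and pocket heavy-atom RMSD (hypothetical weights)."""
    return w_backbone * backbone_rmsd + w_pocket * pocket_rmsd


# Two toy pockets that share one of three distinct grid points.
reference = {(0, 0, 0), (1, 0, 0)}
current = {(1, 0, 0), (2, 0, 0)}
print(jaccard_distance(reference, current))
print(composite_rmsd(1.5, 2.0))
```

A Jaccard distance of 0 means the two pockets occupy identical grid points, and 1 means they share none.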
Installation
The first step is to download, clone, or copy the repository.
git clone git@github.com:durrantlab/subpex.git
The repository includes lock files to build the same environment we use for development. Simply run this command (after ensuring that make is installed):
make environment
To activate the new conda environment, run:
conda activate subpex-dev
Some users may wish to create their own environments or to use an existing WESTPA environment. If so, install the following packages so SubPEx can calculate the progress coordinate:
- loguru
- mdanalysis
- westpa
- numpy
- scipy
- scikit-learn
- jinja2
- pydantic
- pydantic-settings
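If you build your own environment, a quick check like the following sketch can confirm that each package is importable. The helper name missing_packages is hypothetical and not part of SubPEx. Note that some packages are imported under a different name than the one used to install them (for example, scikit-learn is imported as sklearn).

```python
import importlib.util

# Package name (as installed) -> module name (as imported).
REQUIRED = {
    "loguru": "loguru",
    "mdanalysis": "MDAnalysis",
    "westpa": "westpa",
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "jinja2": "jinja2",
    "pydantic": "pydantic",
    "pydantic-settings": "pydantic_settings",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [pkg for pkg, module in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```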
Usage
TODO: Needs to be updated to new version
Users should take advantage of our autobuilder (wizard.py) to set up their
SubPEx simulations. In some cases, however, users may wish to manually set up
their simulations by editing key SubPEx/WESTPA files. This approach is not
officially supported, but we provide the instructions below for
advanced/adventurous users.
Link your preliminary, equilibrated simulation
- SubPEx assumes you have already run preliminary simulations to equilibrate your system. Soft link or copy your preliminary, equilibrated trajectories and necessary restart files to the ./reference/ directory. Name each file mol followed by the appropriate extension. (Note that ./reference/ already contains the namd.md.conf and amber.prod_npt.in template files, which SubPEx uses to interface with the NAMD and AMBER MD engines, respectively.)
  - If using NAMD, soft link the .dcd file of the final equilibration run. NAMD requires other files to restart simulations, so be sure to soft link the .xsc, .coor, and .inpcrd files as well. Remember the .prmtop file too.
    ln -s /file/path/to/simulation/my_namd_file.dcd /WEST/ROOT/reference/mol.dcd
    ln -s /file/path/to/simulation/my_namd_file.xsc /WEST/ROOT/reference/mol.xsc
    ln -s /file/path/to/simulation/my_namd_file.coor /WEST/ROOT/reference/mol.coor
    ln -s /file/path/to/simulation/my_namd_file.inpcrd /WEST/ROOT/reference/mol.inpcrd
    ln -s /file/path/to/simulation/my_namd_file.prmtop /WEST/ROOT/reference/mol.prmtop
  - If using Amber, the file type that works with the SubPEx algorithm is .nc. You need to soft link the .rst file of the final equilibration run as well. Remember the .prmtop file too.
    ln -s /file/path/to/trajectory/my_amber_file.nc /WEST/ROOT/reference/mol.nc
    ln -s /file/path/to/trajectory/my_amber_file.rst /WEST/ROOT/reference/mol.rst
    ln -s /file/path/to/trajectory/my_amber_file.prmtop /WEST/ROOT/reference/mol.prmtop
- Extract the last frame of the preliminary, equilibrated trajectory as a pdb file with your preferred molecular analysis program (e.g., VMD). Soft link that to the ./reference/ directory as well, and name the link last_frame.pdb.
    ln -s /file/path/to/last/frame/my_last_frame.pdb /WEST/ROOT/reference/last_frame.pdb
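The soft links can also be created from a short script, which makes it easier to catch a missing input file before starting a run. The following is only a sketch: the helper link_reference_files is hypothetical and not part of SubPEx, and the paths in the usage comment are placeholders.

```python
from pathlib import Path


def link_reference_files(source_stem, reference_dir, extensions):
    """Symlink <source_stem>.<ext> to <reference_dir>/mol.<ext> for each extension."""
    reference_dir = Path(reference_dir)
    reference_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for ext in extensions:
        source = Path(f"{source_stem}.{ext}").resolve()
        if not source.exists():
            raise FileNotFoundError(f"Missing input file: {source}")
        link = reference_dir / f"mol.{ext}"
        if link.is_symlink():
            link.unlink()  # replace a stale link from an earlier attempt
        elif link.exists():
            # Never delete a real file (e.g. a copied trajectory) silently.
            raise FileExistsError(f"Refusing to overwrite regular file: {link}")
        link.symlink_to(source)
        links.append(link)
    return links


# Example usage (placeholder paths):
# link_reference_files("/file/path/to/simulation/my_namd_file",
#                      "/WEST/ROOT/reference",
#                      ["dcd", "xsc", "coor", "inpcrd", "prmtop"])
```

For Amber, the extension list would instead be nc, rst, and prmtop.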
Edit the west.cfg file
- Edit the following parameters in the west.cfg file:
  - the directory portion of the path variables, though the basename itself should not change. NOTE: Be sure to use full (not relative) paths.
    - reference: the PDB file that will be used in EVERY SINGLE progress-coordinate calculation (the last frame of the preliminary, equilibrated simulation mentioned above).
    - selection_file: path to a text file that will contain the pocket selection string (MDAnalysis selection notation). This file will be automatically generated in a subsequent step, but specify its future path here.
    - reference_fop: path to an xyz file that will contain the field of points needed to calculate the jd progress coordinate. This file is also useful for visualizing the selected pocket. It will be automatically generated in a subsequent step.
    - west_home: home directory of the SubPEx run. You'll most likely want to use the same directory that contains the west.cfg file itself.
    - topology: topology file needed for the MD simulations (likely the same topology file used in the preliminary, equilibrated simulations).
  - the progress coordinate (pcoord) to use.
    - composite: composite RMSD (recommended)
    - prmsd: pocket heavy-atom RMSD (not officially supported)
    - bb: backbone RMSD (not officially supported)
    - jd: Jaccard distance (not officially supported)
  - the auxiliary data (auxdata) to calculate and save.
    - composite: composite RMSD
    - prmsd: pocket heavy-atom RMSD
    - pvol: pocket volume (requires jd too)
    - bb: backbone RMSD
    - rog: radius of gyration of the pocket (requires jd too)
    - jd: Jaccard distance
  - make sure that the WESTPA progress coordinate and auxdata match the SubPEx ones (these sections are both found in the west.cfg file).
    - The WESTPA progress coordinate is specified at west -> data -> datasets, subpex -> pcoord, and in adaptive_binning/adaptive.py
    - The WESTPA auxiliary data is at west -> executable -> datasets
    - The SubPEx progress coordinate is at subpex -> pcoord
    - The SubPEx auxiliary data is at subpex -> auxdata
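The consistency rules above can be checked mechanically before launching a run. The sketch below is hypothetical and not part of SubPEx. It takes plain lists of names that you have read out of west.cfg yourself, so it makes no assumption about the exact layout of the file. It also reads "requires jd too" as meaning that jd must be listed under auxdata, which is an interpretation rather than a documented rule.

```python
PCOORD_CHOICES = {"composite", "prmsd", "bb", "jd"}
AUXDATA_CHOICES = {"composite", "prmsd", "pvol", "bb", "rog", "jd"}
NEEDS_JD = {"pvol", "rog"}  # assumption: these need jd listed in auxdata as well


def check_subpex_settings(pcoord, auxdata, westpa_datasets):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    for name in pcoord:
        if name not in PCOORD_CHOICES:
            problems.append(f"unknown pcoord: {name}")
    for name in auxdata:
        if name not in AUXDATA_CHOICES:
            problems.append(f"unknown auxdata: {name}")
    for name in sorted(NEEDS_JD & set(auxdata)):
        if "jd" not in auxdata:
            problems.append(f"auxdata {name} requires jd")
    # Names present on only one side (WESTPA datasets vs. SubPEx auxdata).
    mismatch = set(westpa_datasets) ^ set(auxdata)
    if mismatch:
        problems.append("WESTPA datasets and SubPEx auxdata differ: "
                        + ", ".join(sorted(mismatch)))
    return problems


print(check_subpex_settings(["composite"], ["jd", "pvol"], ["pvol", "jd"]))
```

Here pcoord and auxdata stand for the entries under subpex -> pcoord and subpex -> auxdata, and westpa_datasets for the dataset names under west -> executable -> datasets.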
Define the pocket to sample
- You must define the location of the binding pocket you wish to sample. Find
the coordinates of the pocket center and radius using the extracted last
frame.
- Visual inspection is often useful at this step. You might first create a PDB file with a CA dummy atom. Load that together with the extracted last frame into your preferred visualization software (ChimeraX, PyMOL, VMD, etc.). Then manually move the dummy atom to the pocket center and measure its location. Similarly, use the dummy atom to determine the radius from that center required to encompass the pocket of interest.
- Return to the west.cfg file and edit the following parameters:
  - center: the pocket center
  - radius: the pocket radius
  - resolution: the distance between adjacent pocket-filling grid points (especially important if using the jd progress coordinate)
- Run python westpa_scripts/get_reference_fop.py west.cfg. This script will generate the files specified by the selection_file and reference_fop parameters in the west.cfg file.
- Visually inspect the pocket field of points (fop) and/or the selection string (MDAnalysis selection syntax).
  - Ensure that the points in the fop (reference_fop) file entirely fill the pocket of interest.
  - Ensure that the residues (selection_file) truly line the pocket of interest.
  - Note that the popular molecular visualization program VMD can load xyz files and select residues.
- After visual inspection, adjust the west.cfg file (center, radius, and resolution parameters) and re-run the westpa_scripts/get_reference_fop.py script. Continue to recalculate the pocket as needed to fine-tune your pocket.
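To build intuition for how center, radius, and resolution interact, the sketch below fills a sphere with regularly spaced grid points and formats them as XYZ text. It is an illustration only and not the code in westpa_scripts/get_reference_fop.py, which may do more (for example, accounting for the protein atoms). The function names, the example values, and the use of C as a placeholder element are assumptions.

```python
import math


def field_of_points(center, radius, resolution):
    """Grid points spaced `resolution` apart that lie within `radius` of `center`."""
    # Small epsilon guards against floating-point error in the division.
    steps = int(math.floor(radius / resolution + 1e-9))
    cx, cy, cz = center
    points = []
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            for k in range(-steps, steps + 1):
                dx, dy, dz = i * resolution, j * resolution, k * resolution
                if dx * dx + dy * dy + dz * dz <= radius * radius + 1e-9:
                    points.append((cx + dx, cy + dy, cz + dz))
    return points


def to_xyz(points, element="C", comment="field of points"):
    """Format points as XYZ text: atom count, comment line, then one atom per line."""
    lines = [str(len(points)), comment]
    lines += [f"{element} {x:.3f} {y:.3f} {z:.3f}" for x, y, z in points]
    return "\n".join(lines) + "\n"


# Hypothetical pocket: halving the resolution gives roughly eight times as many points.
fop = field_of_points(center=(10.0, 5.0, -2.0), radius=6.5, resolution=0.5)
print(len(fop), "points")
```

Writing the output of to_xyz to a file gives something VMD can load alongside last_frame.pdb for a quick visual check.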
Set up the progress coordinate calculations
- Update the variables in the adaptive_binning/adaptive.py file to indicate the number of walkers per bin
