Pdbremix
analyse PDB files, run molecular-dynamics & analyse trajectories
title: pdbremix documentation
Requires Python 2 (not Python 3).
pdbremix
pdbremix is a library to analyze protein structures and protein simulations
The library consists of:
- tools to analyze and view PDB structures
- tools to run MD simulations and analyze MD trajectories
- python interface to analyze PDB structures
- python interface for MD simulations and MD trajectories
An interactive version of this readme.md is here.
Installation
Download from the github repo:
Or browse the repo:
And then install:
> python setup.py install
From here, you can access unit tests and example scripts.
There are many wonderful tools in structural biology that have less-than-stellar interfaces. pdbremix wraps these tools to make them easier to use.
All the commands are stored in the bin directory.
On Linux, I suggest adding the bin path to the PATH variable before using the following commands.
To check which tools can be accessed from the path:
> checkpdbremix
Use the -o flag to get the binary config file to override (with exotic flags):
> vi `checkpdbremix -o`
On Windows, change directory to bin and run:
> python checkpdbremix
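As a rough illustration of what checking the path involves, here is a minimal sketch that looks up executable names on PATH. It is not how checkpdbremix is implemented; the function names `find_on_path` and `check_tools` are hypothetical and chosen only for this example.

```python
import os


def find_on_path(name, path=None):
    """Return the full path of executable `name` found on PATH, else None.

    Hypothetical helper for illustration; not part of pdbremix.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        # Accept only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def check_tools(names, path=None):
    """Map each executable name to its location on PATH (or None)."""
    return dict((name, find_on_path(name, path)) for name in names)
```

Calling `check_tools` with a list of executable names returns a dictionary where missing tools map to `None`, which is enough to report which wrappers will work. The executable names of the external packages on your system are something you would need to supply yourself.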
Tools to analyze PDB structures
pdbremix is a library to analyze PDB structures and MD trajectories. As such, it provides a platform to build command-line tools for PDB files as well as to carry out useful pre-processing of PDB files for external tools.
For all tools, detailed help is available with the -h flag, and many of the scripts work with pypy for significant speed-ups.
Tools in Pure Python
Some of the tools can be used straight out of the box:
- pdbfetch: fetches PDB files from the RCSB website
- pdbheader: displays summary of PDB files
- pdbseq: displays sequences in a PDB
- pdbchain: extracts chains from a PDB
- pdbcheck: checks for common defects in a PDB
- pdbstrip: cleans up PDB for MD simulations
The following tools implement standard algorithms:
- pdbvol: calculates volume of a PDB
- pdbasa: calculates accessible surface-area of a PDB
- pdbrmsd: calculates RMSD between PDB files
For these tools, you get an impressive speed-up if you use pypy:
> pdbfetch 1be9
> pdbstrip 1be9.pdb
> pypy `which pdbvol` 1be9.pdb
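To give a flavour of the kind of pure-Python numerical loop that benefits from pypy, here is a minimal sketch of an RMSD calculation between two equally sized coordinate lists. It assumes the coordinates are already superimposed and paired atom by atom; it is a stand-in for illustration, not the code that pdbrmsd runs, and the function name `rmsd` is hypothetical.

```python
import math


def rmsd(coords1, coords2):
    """Root-mean-square deviation between two paired lists of (x, y, z).

    Illustration only: no superposition is performed here.
    """
    if len(coords1) != len(coords2) or len(coords1) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(coords1, coords2):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(coords1))
```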
Wrappers around External Tools
The following tools wrap external tools to solve some very common (and painful) use-cases in PDB analysis.
- pdbshow displays PDB structures in PYMOL with extras. PYMOL is a powerful viewer, but its defaults leave a little to be desired. pdbshow runs PYMOL with useful defaults and added functionality:
  - By default, shows colored chains, ribbons, and sidechains as sticks.
  - Define initial viewing frame by a center-residue and a top-residue. Structure is rotated to place the center-residue above the center-of-mass in the middle, and the top-residue above the center-residue.
  - Color by B-factor using a red-white scale, with limits.
  - Solvent molecules can be removed, specifically for MD frames that contain too many waters, which will choke PYMOL.
  - Worm mode to show B-factor by variable width.
- pdboverlay displays homologous PDB files using MAFFT, THESEUS and PYMOL. One of the most beautiful results of structural biology is the structural alignment of homologous proteins. pdboverlay performs this complex process in one easy step starting from PDB structures:
  - Write fasta sequences from PDB.
  - Align sequences with MAFFT to find homologous regions.
  - Structurally align homologous regions with THESEUS.
  - Display structurally-aligned PDBs using special PYMOL script.
- pdbinsert fills gaps in PDB with MODELLER. Gaps in PDB structures cause terrible problems in MD simulations. The standard tool to patch gaps is MODELLER, which requires a ton of boilerplate. pdbinsert does all the dirty work with MODELLER in one fell stroke.
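The first step of the pdboverlay pipeline, writing fasta sequences from a PDB, can be sketched in a few lines. This is a minimal illustration that reads the fixed-column ATOM records of the PDB format and takes one residue per CA atom; the function name `pdb_to_fasta` and the record naming scheme are assumptions for this example, not the code pdboverlay uses.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def pdb_to_fasta(pdb_lines, name="protein"):
    """Return FASTA text with one record per chain (hypothetical helper)."""
    sequences = {}   # chain id -> list of one-letter codes
    chain_order = []
    seen = set()     # residues already counted (skips alternate locations)
    for line in pdb_lines:
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain = line[21]
        residue_key = (chain, line[22:27])  # residue number + insertion code
        if residue_key in seen:
            continue
        seen.add(residue_key)
        if chain not in sequences:
            sequences[chain] = []
            chain_order.append(chain)
        # Unknown or nonstandard residues are written as X.
        sequences[chain].append(THREE_TO_ONE.get(line[17:20], "X"))
    records = [">%s_%s\n%s" % (name, c, "".join(sequences[c])) for c in chain_order]
    return "\n".join(records)
```

The resulting text could then be handed to an aligner such as MAFFT; how pdboverlay passes it along is not shown here.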
Tools to run MD Simulation
pdbremix provides a simplified cross-package interface to run a useful subset of molecular-dynamics simulations. Of course, this is not a replacement for the full functionality of these packages.
For beginners, it is particularly useful to see how a simulation is set-up from a PDB file to a trajectory, as the shell scripts and log files of all intermediate steps are saved to file. It is easier to modify a working process than to generate one from scratch.
Preparing Simulations from PDB
First let's grab a PDB file from the website:
> pdbfetch 1be9
Then we can do some standard cleanup so that the structure exists in a unique single conformation:
> pdbstrip 1be9.pdb
This next tool interrogates the structure for features that may affect MD simulation, highlighting steric clashes, chain-breaks (missing amino acids), disulfide bonds, incomplete and nonstandard amino acids:
> pdbcheck 1be9.pdb
Then we generate a topology file from the PDB file:
> pdb2sim 1be9.pdb sim AMBER11-GBSA
This will detect multiple chains, disulfide-bonds, fit hydrogen atoms to AMBER, and guess polar residue charged states. Masses, charges and bond spring parameters are generated from the AMBER99 force-field. pdb2sim will write a set of restart files with a common basename sim:
- sim.top: the topology file
- sim.crd: the coordinates file
The current choice of force-fields:
- AMBER11-GBSA
- AMBER11
- NAMD2.8
- GROMACS4.5
For AMBER11-GBSA, pdb2sim builds a topology file for implicit solvent. For the other choices, explicit solvent is used, where pdb2sim creates a box with 10 Å padding, and fills the box with waters and counterions.
Positional constraints
Positional constraints are very important in setting up MD simulations. pdbremix simplifies the application of positional restraints by using the B-factor column of PDB files to denote positional constraints, which is what NAMD does.
To generate a PDB file for positional restraints from a set of restart files:
> sim2pdb -b sim sim.restraint.pdb
which will generate a PDB file where all backbone atoms have been selected. You can directly edit the B-factors in the PDB file. Another option -a is for all protein atoms:
> sim2pdb -a sim sim.restraint.pdb
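To make the B-factor convention concrete, here is a minimal sketch that flags atoms in a list of PDB lines by rewriting the B-factor field (columns 61-66 of the PDB format). The values 1.00 for restrained and 0.00 for free atoms are assumptions for this example, as is the function name `mark_restraints`; check the files written by sim2pdb for the values pdbremix actually uses.

```python
BACKBONE = frozenset(["N", "CA", "C", "O"])


def mark_restraints(pdb_lines, atom_names=BACKBONE, on=1.0, off=0.0):
    """Return PDB lines with the B-factor set to `on` for selected atoms.

    Hypothetical helper: only ATOM records can be selected, so that
    waters and ligands (HETATM) are never restrained. Pass
    atom_names=None to select every ATOM record.
    """
    result = []
    for line in pdb_lines:
        line = line.rstrip("\n")
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()
            selected = line.startswith("ATOM") and (
                atom_names is None or name in atom_names)
            # Pad short lines so the B-factor field (index 60-65) exists.
            line = line.ljust(66)
            line = line[:60] + "%6.2f" % (on if selected else off) + line[66:]
        result.append(line)
    return result
```

With the default arguments this mirrors the idea of the -b option (backbone atoms), and `atom_names=None` mirrors the idea of -a (all protein atoms). Editing the B-factors by hand afterwards works the same way: only that column changes.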
Running simulations
pdbremix provides several tools to run MD simulations, where the chosen package is detected by the extensions of the restart files.
For all packages, a robust set of simulation parameters is used, including a 1 fs time-step, and no bond-constraints on protein atoms. In explicit solvent, periodic boundary conditions are applied with PME electrostatics.
The output restart files and trajectories are written to a common basename, and an optional -r flag loads positional restraints:
- Minimize your structure from sim restart files to min, using restraints defined in sim.restraint.pdb:
  > simmin -r sim.restraint.pdb sim min
- MD simulation with a Langevin thermometer at 300K for 5000 fs:
  > simtemp -r restraint.pdb min temp 300 5000
- For constant energy for 5000 fs:
  > simconst -r restraint.pdb min const 5000
This allows you to run equilibration protocols from the command line. For instance, a pre-equilibration at 300K: initially 10 ps of heating the solvent with the backbone restrained, followed by 10 ps of the whole system:
> sim2pdb -b sim restraint.pdb
> simmin -r restraint.pdb sim min
> simtemp -r restraint.pdb min heat1 300 10000
> simtemp heat1 heat2 300 10000
Trajectory analysis
pdbremix provides a tool to calculate RMSD and kinetic energy for trajectories, convenience tools for viewing trajectories in viewers, and some translation tools. To use these tools, the trajectory files must have the following naming structure:
- AMBER:
  - md.top
  - md.trj
  - md.vel.trj
- GROMACS:
  - md.top (and md.*itp)
  - md.gro
  - md.trr
- NAMD:
  - md.psf
  - md.dcd
  - md.vel.dcd
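The naming structure above is enough to guess the package from a basename. Here is a minimal sketch of such a check, assuming the velocity files (md.vel.trj, md.vel.dcd) are optional; the function name `guess_package` is hypothetical and this is not necessarily how pdbremix performs its own detection.

```python
import os

# Files that must exist next to the basename for each package.
# Treating velocity files as optional is an assumption of this sketch.
REQUIRED_EXTENSIONS = [
    ("AMBER", (".top", ".trj")),
    ("GROMACS", (".top", ".gro", ".trr")),
    ("NAMD", (".psf", ".dcd")),
]


def guess_package(basename, exists=os.path.isfile):
    """Return 'AMBER', 'GROMACS', 'NAMD', or None for a trajectory basename.

    `exists` is injectable so the logic can be tried without real files.
    """
    for package, extensions in REQUIRED_EXTENSIONS:
        if all(exists(basename + ext) for ext in extensions):
            return package
    return None
```

Because AMBER and GROMACS both use a .top file, the trajectory extension (.trj versus .trr) is what separates them in this sketch.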
These are trajectory analysis tools:
- trajstep: displays basic parameters of a trajectory
- trajvar: calculates energy and RMSD of trajectory
As opening trajectories in standard viewers is a pain, use these tools to open them:
- trajvmd: display trajectory in VMD *recommended*
- trajchim: display trajectory in CHIMERA
- trajpym: display trajectory in PYMOL *AMBER only*
And some package specific tools:
- traj2amb: converts NAMD/GROMACS to AMBER trajectories *without* solvent
- grotrim: trim GROMACS .trr trajectory files
Python interface to PDB structures
An important part of pdbremix is the design of a light API to interact with PDB structures. The data structures are designed to be easy to use with idiomatic Python to do things such as select atoms.
Other packages sometimes include a domain-specific language for atom selection, but ultimately this limits the ability for those libraries to interact with the Python ecosystem such as scipy, pandas, or numpy.
Vector geometry library
As in any structural biology library, pdbremix provides a full-featured vector geometry library v3:
from pdbremix import v3
v3 was designed to be function-based, which allows the library to switch between a pure Python version and a numpy-dependent version.
If you want just the python version:
import pdbremix.v3array as v3
Or the numpy version:
import pdbremix.v3numpy as v3
Vectors are created and copied by the vector function:
v = v3.vector() # the zero vector
z = v3.vector(1,2,3)
w = v3.vector(z) # a copy
Vectors are represented as arrays as they are subclassed from Python arrays or numpy arrays, and components are accessed as:
print v[0], v[1], v[2]
All vector functions return by value, with the one exception of set_vector, which changes components in place:
v3.set_vector(v, 2, 2, 2)
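The difference between returning by value and changing in place can be sketched with plain Python lists. The functions below are simplified stand-ins that borrow the names vector, set_vector and scale from this section; they are not the v3 implementation, and the behaviour of scale is assumed from its name.

```python
def vector(*args):
    """Create a zero vector, a vector from 3 components, or a copy."""
    if len(args) == 0:
        return [0.0, 0.0, 0.0]
    if len(args) == 1:
        return [float(c) for c in args[0]]   # copy of another vector
    if len(args) == 3:
        return [float(c) for c in args]
    raise TypeError("vector() takes 0, 1 or 3 arguments")


def set_vector(v, x, y, z):
    """Overwrite the components of v in place; returns nothing."""
    v[0], v[1], v[2] = float(x), float(y), float(z)


def scale(v, s):
    """Return a new vector s*v; the argument v is left untouched."""
    return [s * c for c in v]


z = vector(1, 2, 3)
w = vector(z)            # independent copy of z
set_vector(w, 2, 2, 2)   # changes w only
doubled = scale(z, 2)    # z itself is unchanged
```

Keeping every operation a plain function of its arguments, rather than a method on a vector class, is what lets the same calling code run against either a list-based or a numpy-based backend.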
Here is a set of common vector operations:
mag(v)
scale(v, s)
dot(v1, v2)
cross(v1, v2)
norm(v)
parallel(v, axis)
pe
