DDGScan

DDGScan: an integrated parallel workflow for the in silico point mutation scan of protein

I am testing this repo with a range of input structures. If you encounter any failure, please open an issue.

The GUI plugin for FoldX

The GUI only works with FoldX.

Installation

To use DDGScan, make sure the FoldX executable has been added to your environment. Rosetta is required for cartesian_ddg calculations in slow mode and for the ddg_monomer row1 protocol in fast mode (note: an MPI build is necessary, otherwise the relax step will be skipped). ABACUS is an excellent software option for protein design and provides a statistical energy function.

Structures downloaded from RCSB may contain errors that directly affect energy calculations; one common issue is breaks in chains. To address this, we have implemented a loop closure module that uses modeller, a reliable program with a long history, as a backend.

Due to their licenses, we cannot redistribute these programs. openmm, on the other hand, is open source. The ABACUS2 database is now available at https://zenodo.org/record/4533424. The necessary module is not included in the Zenodo version, so you may use the online server at https://biocomp.ustc.edu.cn/servers/abacus-design.php to run ABACUS2.
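Before installing, it can help to confirm which external programs are already reachable. The sketch below uses only the Python standard library. The executable names in it are assumptions, since they vary with the FoldX release and the Rosetta build; replace them with the names used on your system.

```python
import shutil

# Assumed executable names; adjust to your FoldX release and Rosetta build.
ASSUMED_EXECUTABLES = ["foldx", "cartesian_ddg.mpi.linuxgccrelease"]


def find_missing_tools(names):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools(ASSUMED_EXECUTABLES)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed executables were found on PATH.")
```

An empty result only shows that the names resolve on PATH; it does not verify licenses or versions.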

Install DDGScan:

To ensure that there are no possible conflicts, it is recommended that you create a new conda environment. Additionally, using the mamba package manager will result in faster installation times. To create a new conda environment for DDGScan, you can use the following commands:

conda create -n ddgscan python=3.9
conda activate ddgscan

Once the new environment is activated, you can install mamba and other required packages using the following commands:

conda install -c conda-forge mamba
mamba install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
mamba install -c conda-forge openmm pdbfixer

Next, you can clone the DDGScan repository and install the required Python packages using the following commands:

git clone https://github.com/JinyuanSun/DDGScan.git
cd DDGScan
pip install pandas numpy joblib seaborn matplotlib venn logomaker mdtraj bio scikit-learn
python setup.py install

Finally, you can create a cache directory for DDGScan and copy some necessary data files using the following command:

mkdir -p ~/.cache/ddgscan && cp utils/data/nn/* ~/.cache/ddgscan/

To ensure that DDGScan is installed properly and working correctly, you can run the following command and confirm that the help message is displayed:

DDGScan -h
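The same checks can be scripted. The following is a minimal sketch, not part of DDGScan itself: it tests that the cache directory created above contains files and that `DDGScan -h` exits successfully. The helper names `cache_has_files` and `cli_responds` are hypothetical.

```python
import subprocess
from pathlib import Path


def cache_has_files(cache_dir):
    """True if cache_dir exists and contains at least one entry."""
    path = Path(cache_dir).expanduser()
    return path.is_dir() and any(path.iterdir())


def cli_responds(command):
    """True if the command can be started and exits with status 0."""
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except OSError:  # for example, the executable is not on PATH
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("cache populated:", cache_has_files("~/.cache/ddgscan"))
    print("DDGScan -h works:", cli_responds(["DDGScan", "-h"]))
```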

FoldX:

Register and download the executable.

Rosetta:

Follow the Rosetta documentation.
I recommend exporting ROSETTADB before running grape-fast.py by appending the following line to ~/.bashrc:

export ROSETTADB="/path/to/rosetta/database"
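To confirm that the variable is visible to Python processes, a small check such as the one below may be useful. It is a sketch under the assumption that ROSETTADB should point to an existing directory; the function name `rosetta_db_status` is hypothetical.

```python
import os


def rosetta_db_status(env=None):
    """Classify ROSETTADB as 'unset', 'missing' (not a directory) or 'ok'."""
    env = os.environ if env is None else env
    value = env.get("ROSETTADB")
    if not value:
        return "unset"
    return "ok" if os.path.isdir(value) else "missing"


if __name__ == "__main__":
    print("ROSETTADB:", rosetta_db_status())
```

Remember that changes to ~/.bashrc only take effect in new shells, or after running `source ~/.bashrc`.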

ABACUS1/2:

Email the authors to request the source code.

Modeller:

Get the Modeller license key at https://salilab.org/modeller/registration.html

export KEY_MODELLER=<your_key>
conda config --add channels salilab
conda install modeller

Usage

Grape phase I

There are many options available for DDGScan users, particularly for those who know what they want. Here is a quick walk-through of some important options:

  • pdb and chain are positional arguments that must be set, depending on the input PDB file you want to analyze.
  • The -E flag selects the engines to run and must match the software you have installed on your system.
  • It is strongly recommended that users set the -seq flag to provide sequence information for the input PDB file.
  • For best performance, it is highly recommended to add the -MD flag and use -P CUDA if a powerful GPU is available (e.g., better than an RTX2060). This will be much faster than using a 48-core CPU.
  • If the -fill flag is used, the input structure is repaired automatically with modeller, using the SEQRES records of the native PDB file downloaded from RCSB. The model with the lowest molpdf energy is used for further analysis.
<p align="center"> <img width="80%" src="./img/workflow.png" alt="Workflow of DDGScan"> </p>
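When deciding between `-P CUDA` and `-P CPU`, one option is to ask OpenMM which platforms it has loaded. The sketch below is an illustration, not DDGScan's own logic: it returns "CUDA" only if OpenMM lists a CUDA platform and falls back to "CPU" otherwise, including when OpenMM is not importable. A listed platform does not guarantee a usable GPU, so treat the result as a hint.

```python
def pick_platform():
    """Return 'CUDA' if OpenMM lists a CUDA platform, otherwise 'CPU'."""
    try:
        import openmm  # OpenMM >= 7.6; older releases use `simtk.openmm`
    except ImportError:
        return "CPU"
    names = {
        openmm.Platform.getPlatform(i).getName()
        for i in range(openmm.Platform.getNumPlatforms())
    }
    return "CUDA" if "CUDA" in names else "CPU"


if __name__ == "__main__":
    print("Suggested value for -P:", pick_platform())
```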
usage: DDGScan grape_phaseI [-h] [-fill] [-seq SEQUENCE] [-T THREADS] [-fc FOLDX_CUTOFF] [-rc ROSETTA_CUTOFF] [-ac ABACUS_CUTOFF] [-a2c ABACUS2_CUTOFF] [-nstruct RELAX_NUMBER] [-nruns NUMOFRUNS]
                            [-E {abacus,foldx,rosetta,abacus2,abacus2_nn} [{abacus,foldx,rosetta,abacus2,abacus2_nn} ...]] [-M {run,rerun,analysis,test}] [-S {fast,slow}] [-MD] [-P {CUDA,CPU}] [-fix_mm]
                            pdb chain

positional arguments:
  pdb                   Input PDB
  chain                 Input PDB Chain to do in silico DMS

optional arguments:
  -h, --help            show this help message and exit
  -fill, --fill_break_in_pdb
                        Use modeller to fill missing residues in your pdb file. Use this option with caution!
  -seq SEQUENCE, --sequence SEQUENCE
                        The exact sequence of protein you want to design. All mutation will be named according to this sequence.
  -T THREADS, --threads THREADS
                        Number of threads to run FoldX, Rosetta
  -fc FOLDX_CUTOFF, --foldx_cutoff FOLDX_CUTOFF
                        Cutoff of FoldX ddg(kcal/mol)
  -rc ROSETTA_CUTOFF, --rosetta_cutoff ROSETTA_CUTOFF
                        Cutoff of Rosetta ddg(R.E.U.)
  -ac ABACUS_CUTOFF, --abacus_cutoff ABACUS_CUTOFF
                        Cutoff of ABACUS SEF(A.E.U.)
  -a2c ABACUS2_CUTOFF, --abacus2_cutoff ABACUS2_CUTOFF
                        Cutoff of ABACUS2 SEF(A.E.U.)
  -nstruct RELAX_NUMBER, --relax_number RELAX_NUMBER
                        Number of how many relaxed structure
  -nruns NUMOFRUNS, --numofruns NUMOFRUNS
                        Number of runs in FoldX BuildModel
  -E {abacus,foldx,rosetta,abacus2,abacus2_nn} [{abacus,foldx,rosetta,abacus2,abacus2_nn} ...], --engine {abacus,foldx,rosetta,abacus2,abacus2_nn} [{abacus,foldx,rosetta,abacus2,abacus2_nn} ...]
  -M {run,rerun,analysis,test}, --mode {run,rerun,analysis,test}
                        Run, Rerun or analysis
  -S {fast,slow}, --preset {fast,slow}
                        Fast or Slow
  -MD, --molecular_dynamics
                        Run 1ns molecular dynamics simulations for each mutation using openmm.
  -P {CUDA,CPU}, --platform {CUDA,CPU}
                        CUDA or CPU
  -fix_mm, --fix_mainchain_missing
                        fixing missing backbone bone using pdbfixer
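As an illustration of how these options combine, the sketch below assembles a `grape_phaseI` command line from the flags listed above without running it. The file name `input.pdb` and the helper `build_grape_phase1_cmd` are hypothetical. The positional arguments are placed before `-E` because `-E` accepts several values and would otherwise absorb them.

```python
import shlex


def build_grape_phase1_cmd(pdb, chain, engines, threads=None, preset=None,
                           md=False, platform=None):
    """Build (but do not run) a `DDGScan grape_phaseI` argument list."""
    cmd = ["DDGScan", "grape_phaseI", pdb, chain, "-E", *engines]
    if threads is not None:
        cmd += ["-T", str(threads)]
    if preset is not None:
        cmd += ["-S", preset]  # 'fast' or 'slow'
    if md:
        cmd.append("-MD")
    if platform is not None:
        cmd += ["-P", platform]  # 'CUDA' or 'CPU'
    return cmd


if __name__ == "__main__":
    cmd = build_grape_phase1_cmd("input.pdb", "A", ["foldx"], threads=8,
                                 preset="fast", md=True, platform="CUDA")
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run` once the required engines are installed.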

List distribute

usage: DDGScan list_distribute [-h] [-msaddg] [-fill] [-fix_mm] [-T THREADS] [-nstruct RELAX_NUMBER] [-nruns NUMOFRUNS]
                               [-E {foldx,rosetta,abacus2,rosetta_fast,abacus2_nn} [{foldx,rosetta,abacus2,rosetta_fast,abacus2_nn} ...]] [-repair] [-relax] [-MD] [-P {CUDA,CPU}]
                               pdb mutation_list_file

positional arguments:
  pdb                   Input PDB
  mutation_list_file    Mutation list file, see README for details

optional arguments:
  -h, --help            show this help message and exit
  -msaddg, --output_of_MSAddg
                        The format of MSAddg *.scan.txt, and there may be mismatch between your pdb and sequence
  -fill, --fill_break_in_pdb
                        Use modeller to fill missing residues in your pdb file. Use this option with caution!
  -fix_mm, --fix_mainchain_missing
                        fixing missing backbone bone using pdbfixer
  -T THREADS, --threads THREADS
                        Number of threads to run FoldX, Rosetta or ABACUS2
  -nstruct RELAX_NUMBER, --relax_number RELAX_NUMBER
                        Number of how many relaxed structure
  -nruns NUMOFRUNS, --numofruns NUMOFRUNS
                        Number of runs in FoldX BuildModel
  -E {foldx,rosetta,abacus2,rosetta_fast,abacus2_nn} [{foldx,rosetta,abacus2,rosetta_fast,abacus2_nn} ...], --engine {foldx,rosetta,abacus2,rosetta_fast,abacus2_nn} [{foldx,rosetta,abacus2,rosetta_fast,abacus2_nn} ...]
  -repair, --foldx_repair
                        Run Repair before ddG calculation
  -relax, --rosetta_relax
                        Run relax before ddG calculation
  -MD, --molecular_dynamics
                        Run 1ns molecular dynamics simulations for each mutation using openmm.
  -P {CUDA,CPU}, --platform {CUDA,CPU}
                        CUDA or CPU
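Note that the engine choices differ between the two subcommands: per the usage messages above, `grape_phaseI` accepts `abacus` but not `rosetta_fast`, while `list_distribute` accepts `rosetta_fast` but not `abacus`. A small pre-flight check, sketched here with hypothetical names, can catch a mismatch before a long run is started.

```python
# Engine choices copied from the two usage messages above.
GRAPE_PHASE1_ENGINES = {"abacus", "foldx", "rosetta", "abacus2", "abacus2_nn"}
LIST_DISTRIBUTE_ENGINES = {"foldx", "rosetta", "abacus2", "rosetta_fast",
                           "abacus2_nn"}


def unsupported_engines(requested, allowed):
    """Return the requested engine names that the subcommand does not accept."""
    return sorted(set(requested) - allowed)


if __name__ == "__main__":
    bad = unsupported_engines(["foldx", "abacus"], LIST_DISTRIBUTE_ENGINES)
    print("Not accepted by list_distribute:", bad)
```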

Analysis and plot

usage: DDGScan analysis_and_plot [-h] [--residue_position RESIDUE_POSITION]
                                 [--plot_type {all,venn,residue_bar,heatmap,position_avg_boxplot,variance_lineplot,kde_plot,residue_logo} [{all,venn,residue_bar,heatmap,position_avg_boxplot,variance_lineplot,kde_plot,residue_logo} ...]]
        
