DETANGO

Deconvolving mutation effects on protein stability and function with disentangled protein language models

<h1 align="center">DETANGO</h1>

💼 Environment

First, clone the GitHub repository.

git clone https://github.com/luo-group/DETANGO.git
cd DETANGO

Then, set up the Python environment for DETANGO. We use PyTorch 2.2.0 and torchvision 0.17.0; install the builds that match your CUDA version (or the CPU-only builds) by following the instructions on the official PyTorch website (https://pytorch.org/). The commands below install the CUDA 12.1 builds.

# Tested on Ubuntu 24.04
conda create -n detango python=3.9.12
conda activate detango

pip install -r requirements.txt

pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu121
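
As an optional sanity check after installation, the following minimal Python sketch compares the installed package versions with the ones pinned above. The helper name `check_versions` and the `EXPECTED` table are illustrative and not part of the DETANGO repository. The sketch assumes that wheels from the PyTorch index may report a local suffix such as `+cu121`, which it strips before comparing.

```python
from importlib import metadata

# Versions pinned in the installation command above.
EXPECTED = {"torch": "2.2.0", "torchvision": "0.17.0", "torchaudio": "2.2.0"}


def check_versions(expected):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Drop a local build suffix such as "+cu121" before comparing.
        base = installed.split("+")[0] if installed else None
        report[name] = (installed, base == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions(EXPECTED).items():
        print(f"{name}: installed={installed} matches_pinned={ok}")
```

A package that is not installed is reported as `None` rather than raising, so the check can be run in any environment.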

🛠️ Data preprocessing

To train a DETANGO model, we first compute the ESM-1v-predicted mutational effects and $\Delta\Delta G$ values (or use experimentally derived abundance values) for all single mutations, and collect the ESM-1v embeddings of the protein's residues. Please install FoldX locally and set the path to its executable in the bash script. Using P62993 as an example, you can run scripts/data_preprocessing_uniprot.sh, which contains the code shown below.

#!/bin/bash

protein="P62993"
cuda_device=1
stability_col="foldx"
cpus=60 # number of cpus to use for foldx
foldx_path=scripts/foldx_20251231 # revise this path to your local foldx executable

# create data directory
mkdir -p data
mkdir -p data/$protein
mkdir -p data/$protein/intermediates

# download sequence and structure data
curl -o data/$protein/wt.fasta https://rest.uniprot.org/uniprotkb/$protein.fasta
curl -o data/$protein/AF-$protein-F1-model_v6.pdb https://alphafold.ebi.ac.uk/files/AF-$protein-F1-model_v6.pdb

# generate collection of single mutants, compute evolutionary plausibility scores, and collect esm embeddings
python detango/initialize_esm_data.py --protein $protein --cuda $cuda_device

# generate stability scores
if [ "$stability_col" == "foldx" ]; then
    mkdir -p data/$protein/intermediates/foldx
    python detango/initialize_foldx_data.py --protein $protein --cpus $cpus --foldx-path $foldx_path
fi

Note: FoldX takes approximately 40 minutes on 60 CPUs to calculate the $\Delta\Delta G$ for all single mutations.
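
To make "all single mutations" concrete, here is a minimal Python sketch that enumerates every single amino-acid substitution of a wild-type sequence. It is an illustration only and not the code in detango/initialize_esm_data.py; the function name `enumerate_single_mutants` and the `M1A`-style labels (wild-type residue, 1-based position, mutant residue) are assumptions made for this example.

```python
# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def enumerate_single_mutants(sequence_wt):
    """List every single substitution as '<wt><1-based position><mutant>'."""
    mutants = []
    for position, wt_residue in enumerate(sequence_wt, start=1):
        if wt_residue not in AMINO_ACIDS:
            raise ValueError(f"non-standard residue {wt_residue!r} at position {position}")
        for residue in AMINO_ACIDS:
            if residue != wt_residue:  # skip the wild-type residue itself
                mutants.append(f"{wt_residue}{position}{residue}")
    return mutants


if __name__ == "__main__":
    toy = "MEA"  # toy sequence, not a real protein
    variants = enumerate_single_mutants(toy)
    print(len(variants), variants[:3])
```

A sequence of length L yields 19 × L variants, which is why the FoldX step scales with protein length and benefits from many CPUs.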

💻 Model training and inference

After collecting the required data, you can train DETANGO models. Using P62993 as an example, you can run scripts/model_training.sh, which contains the code shown below. Be sure to modify the protein identifier, wild-type sequence, and stability column as appropriate for your dataset.

#!/bin/bash

protein="P62993"
sequence_wt="MEAIAKYDFKATADDELSFKRGDILKVLNEECDQNWYKAELNGKDGFIPKNYIEMKPHPWFFGKIPRAKAEEMLSKQRHDGAFLIRESESAPGDFSLSVKFGNDVQHFKVLRDGAGKYFLWVVKFNSLNELVDYHRSTSVSRNQQIFLRDIEQVPQQPTYVQALFDFDPQEDGELGFRRGDFIHVMDNSDPNWWKGACHGQTGMFPRNYVTPVNRNV"
stability_col="foldx"

for sample_seed in 0 1 2 3 4 5 6 7 8 9
do
    python detango/train_inference.py --protein $protein --sequence_wt $sequence_wt --stability_col $stability_col --sample_seed $sample_seed
done

python detango/merge_results.py --protein $protein --sequence_wt $sequence_wt --stability_col $stability_col
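
The script above hard-codes the wild-type sequence. As a hedged alternative, the sketch below reads the sequence from FASTA text (for example the contents of data/$protein/wt.fasta downloaded during preprocessing) and assembles the same command lines as the shell loop. The helpers `parse_fasta_sequence` and `build_commands` are hypothetical names introduced for this example; only the script paths and flags are taken from the script above.

```python
def parse_fasta_sequence(fasta_text):
    """Return the sequence of the first FASTA record, with line breaks removed."""
    chunks = []
    seen_header = False
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header:
                break  # stop at the start of a second record
            seen_header = True
            continue
        chunks.append(line)
    return "".join(chunks)


def build_commands(protein, sequence_wt, stability_col, seeds=range(10)):
    """Build one training command per seed, followed by the merge command."""
    common = [
        "--protein", protein,
        "--sequence_wt", sequence_wt,
        "--stability_col", stability_col,
    ]
    commands = [
        ["python", "detango/train_inference.py", *common, "--sample_seed", str(seed)]
        for seed in seeds
    ]
    commands.append(["python", "detango/merge_results.py", *common])
    return commands


if __name__ == "__main__":
    demo_fasta = ">toy record (not a real protein)\nMEAIAK\nYDFK\n"
    for command in build_commands("P62993", parse_fasta_sequence(demo_fasta), "foldx"):
        print(" ".join(command))
```

Each returned list can be passed to `subprocess.run(command, check=True)` from the repository root, so that a failed seed stops the run before the merge step.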

🖊️ Citation

Ding, Kerr, et al. "Deconvolving mutation effects on protein stability and function with disentangled protein language models." bioRxiv (2026): 2026-02.

@article{ding2026deconvolving,
  title={Deconvolving mutation effects on protein stability and function with disentangled protein language models},
  author={Ding, Kerr and Li, Ziang and Tu, Tony and Luo, Jiaqi and Luo, Yunan},
  journal={bioRxiv},
  pages={2026--02},
  year={2026},
  publisher={Cold Spring Harbor Laboratory}
}

📬 Contact

Please submit GitHub issues or contact Kerr Ding (kerrding[at]gatech[dot]edu) and Yunan Luo (yunan[at]gatech[dot]edu) for any questions related to the source code.
