Proteus
PyTorch implementation of the ICML 2024 paper "Proteus: Exploring Protein Structure Generation for Enhanced Designability and Efficiency".
<a href="https://openreview.net/pdf?id=IckJCzsGVS"><img src="https://img.shields.io/badge/Paper-ICML%202024-green" style="max-width: 100%;"></a> <a href="https://www.biorxiv.org/content/10.1101/2024.02.10.579791v2.full.pdf"><img src="https://img.shields.io/badge/Preprint-Biorxiv%202024-blue" style="max-width: 100%;"></a>
Overview
Proteus is a deep diffusion network that generates protein backbones with enhanced designability and efficiency. Unlike RFdiffusion, which relies on the large pre-trained structure prediction network RoseTTAFold, Proteus uses graph-based triangle methods and a multi-track interaction network, achieving state-of-the-art performance without pre-training. Inference is 4x to 10x faster than FrameDiff and RFdiffusion. The model's capabilities have been validated through comprehensive in silico evaluations and experimental characterization, demonstrating its potential to advance the field of protein design.
<img width="1023" alt="image" src="https://github.com/Wangchentong/Proteus/assets/59241275/9cd5d387-66c9-4f71-9fa8-6a27cd77a25b">
Install
We recommend miniconda (or anaconda). Run the following to create a conda environment with the necessary dependencies. Use mamba if available for faster installation.
# install
conda env create -f se3.yml
# optional: use mamba instead for faster environment installation
conda install mamba
mamba env create -f se3.yml
# activate environment
conda activate Proteus
# install this repo as a local package
pip install -e .
Inference
The checkpoint is available at ./weights/paper_weights.pt
monomer inference (command used in the paper)
The first run may be a little slow because the ESMFold checkpoint has to be downloaded.
weight_path=./weights/paper_weights.pt
python ./experiments/inference_se3_diffusion.py \
inference.output_dir=inference_outputs/monomer/ \
inference.weights_path=$weight_path \
inference.diffusion.samples.samples_lengths=[100,200,300,400,600,800] \
inference.diffusion.samples.samples_per_length=100 \
inference.diffusion.num_t=100
# config below is optional
# To disable esmfold prediction and mpnn design, add extra config
inference.mpnn.enable=False inference.esmfold.enable=False
# To disable esmfold prediction add extra config
inference.esmfold.enable=False
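When sweeping over several settings, it can help to assemble the override list programmatically. The sketch below only builds the argument list using the option names shown in the command above; the helper name `build_inference_command` and its parameters are illustrative and not part of the repository.

```python
import shlex


def build_inference_command(output_dir, weights_path, lengths,
                            samples_per_length=100, num_t=100,
                            run_mpnn=True, run_esmfold=True):
    """Assemble the argument list for the monomer inference command.

    Hypothetical helper: it only mirrors the key=value overrides
    shown in this README and does not inspect the repository config.
    """
    lengths_str = "[" + ",".join(str(n) for n in lengths) + "]"
    cmd = [
        "python", "./experiments/inference_se3_diffusion.py",
        f"inference.output_dir={output_dir}",
        f"inference.weights_path={weights_path}",
        f"inference.diffusion.samples.samples_lengths={lengths_str}",
        f"inference.diffusion.samples.samples_per_length={samples_per_length}",
        f"inference.diffusion.num_t={num_t}",
    ]
    if not run_mpnn:
        # The README disables ESMFold together with MPNN.
        cmd += ["inference.mpnn.enable=False", "inference.esmfold.enable=False"]
    elif not run_esmfold:
        cmd.append("inference.esmfold.enable=False")
    return cmd


cmd = build_inference_command(
    "inference_outputs/monomer/", "./weights/paper_weights.pt",
    [100, 200], run_esmfold=False,
)
# Print a shell-quoted version; pass `cmd` to subprocess.run(cmd, check=True) to launch.
print(shlex.join(cmd))
```

Building a list rather than a single string avoids shell quoting problems with the bracketed length list.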
A self_consistency.csv file is written to inference_outputs/monomer/${timestamp}/self_consistency.csv. It reports the evaluation metrics, such as DSSP and sc-RMSD.
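As a rough illustration of how the self-consistency table might be summarized, the sketch below computes the fraction of scaffolds whose best sc-RMSD falls under a threshold. The column names `sample_path` and `rmsd` and the 2.0 Å cutoff are assumptions for this example (the cutoff is a commonly used designability criterion); check the header of your own self_consistency.csv and adjust them.

```python
import csv
import io


def designable_fraction(csv_file, sample_col="sample_path",
                        rmsd_col="rmsd", threshold=2.0):
    """Fraction of scaffolds whose lowest sc-RMSD is below `threshold`.

    `sample_col` and `rmsd_col` are assumed column names, not
    confirmed against the actual CSV written by the inference script.
    """
    best = {}
    for row in csv.DictReader(csv_file):
        rmsd = float(row[rmsd_col])
        key = row[sample_col]
        # Keep the best (lowest) RMSD over all designed sequences per scaffold.
        if key not in best or rmsd < best[key]:
            best[key] = rmsd
    if not best:
        return 0.0
    return sum(r < threshold for r in best.values()) / len(best)


# Small in-memory table standing in for a real self_consistency.csv.
demo = io.StringIO(
    "sample_path,rmsd\n"
    "100_0_sample.pdb,1.2\n"
    "100_0_sample.pdb,3.4\n"
    "100_1_sample.pdb,2.5\n"
)
print(designable_fraction(demo))  # 0.5
```

For a real run, open the file with `open(path, newline="")` and pass the handle in place of `demo`.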
oligomer inference
baseline_weight_path=./weights/paper_weights.pt
python ./experiments/inference_se3_diffusion.py \
inference.output_dir=inference_outputs/oligomer/ \
inference.weights_path=$baseline_weight_path \
inference.diffusion.samples.contigs='60-80//60-80' \
inference.diffusion.samples.samples_per_length=100 \
inference.diffusion.num_t=100
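The contig string above appears to describe two chains, each with a length between 60 and 80 residues. The sketch below shows one plausible reading, assuming `//` separates chains and `a-b` is an inclusive length range; it is not the repository's parser, and `sample_chain_lengths` is a hypothetical name.

```python
import random


def sample_chain_lengths(contigs, rng=None):
    """Draw one length per chain from a contig string such as '60-80//60-80'.

    Assumed syntax: chains separated by '//', each either a fixed
    length ('70') or an inclusive range ('60-80').
    """
    rng = rng or random.Random()
    lengths = []
    for chain in contigs.split("//"):
        low_text, _, high_text = chain.partition("-")
        low = int(low_text)
        high = int(high_text) if high_text else low
        if low > high:
            raise ValueError(f"invalid range in contig: {chain!r}")
        lengths.append(rng.randint(low, high))  # randint is inclusive on both ends
    return lengths


print(sample_chain_lengths("60-80//60-80", random.Random(0)))
```

Passing a seeded `random.Random` keeps the sampled lengths reproducible across runs.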
The inference output is organized as follows:
inference_outputs
└── 12D_02M_2023Y_20h_46m_13s             # date and time of inference
    ├── mpnn.fasta                        # mpnn designed sequences
    ├── self_consistency.csv              # self-consistency analysis: rmsd and tmscore between scaffold and esmfold prediction, mpnn score of sequence, scaffold path, esmf path, etc.
    ├── diffusion                         # dir containing scaffolds generated by proteus
    │   ├── 100_1_sample.pdb
    │   ├── 100_2_sample.pdb              # {length}_{sample_id}_sample.pdb
    │   └── ...
    ├── trajctory                         # dir containing trajectory pdbs, exists when inference.diffusion.option.save_trajactory=True
    │   ├── 100_1_bb_traj.pdb
    │   ├── 100_2_bb_traj.pdb             # {length}_{sample_id}_bb_traj.pdb
    │   └── ...
    ├── movie                             # dir containing trajectory movies, exists when inference.diffusion.option.plot.switch_on=True
    │   ├── 100_1_rigid_movie.gif         # movie of protein rigid at time t
    │   ├── 100_1_rigid_0_movie.gif       # movie of predicted protein rigid at time 0 from time t
    │   └── ...
    ├── mpnn                              # dir containing full atom proteins designed by mpnn, exists when pyrosetta is installed and inference.mpnn.dump=True
    │   ├── 100_0_sample_mpnn_0.pdb
    │   ├── 100_0_sample_mpnn_1.pdb       # {length}_{sample_id}_sample_mpnn_{sequence_id}.pdb
    │   └── ...
    └── esmf                              # dir containing esmf predicted structures
        ├── 100_0_sample_esmf_0.pdb
        ├── 100_0_sample_esmf_1.pdb       # {length}_{sample_id}_sample_esmf_{sequence_id}.pdb
        └── ...
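The naming pattern {length}_{sample_id}_sample.pdb makes it straightforward to group scaffolds by length. The following is a minimal sketch that parses names of that form; it assumes monomer-style filenames exactly as listed above, and `parse_sample_name` is an illustrative helper rather than part of the repository.

```python
import re
from collections import Counter

# Matches names such as "100_1_sample.pdb" in the diffusion directory.
SAMPLE_RE = re.compile(r"^(?P<length>\d+)_(?P<sample_id>\d+)_sample\.pdb$")


def parse_sample_name(filename):
    """Return (length, sample_id) for a scaffold filename, or None if it does not match."""
    match = SAMPLE_RE.match(filename)
    if match is None:
        return None
    return int(match["length"]), int(match["sample_id"])


names = [
    "100_1_sample.pdb",
    "100_2_sample.pdb",
    "200_1_sample.pdb",
    "100_1_bb_traj.pdb",  # trajectory file, ignored by the pattern
]
parsed = [p for p in map(parse_sample_name, names) if p is not None]
counts = Counter(length for length, _ in parsed)
print(dict(counts))  # {100: 2, 200: 1}
```

In practice the `names` list could come from `os.listdir` on the diffusion directory of a given run.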
Code Structure
The local triangle attention is implemented below:
https://github.com/Wangchentong/Proteus/blob/71f9eaf336a41c2f2145ed8914bf7e72762bc72f/model/ipa_pytorch.py#L245
License
LICENSE: MIT
Citation
If you use our work, please cite:
@article{wang2024proteus,
title={Proteus: exploring protein structure generation for enhanced designability and efficiency},
author={Wang, Chentong and Qu, Yannan and Peng, Zhangzhi and Wang, Yukai and Zhu, Hongli and Chen, Dachuan and Cao, Longxing},
journal={bioRxiv},
pages={2024--02},
year={2024},
publisher={Cold Spring Harbor Laboratory}
}
Appreciation
Proteus is built upon the following codebases, please give them a star if you enjoy Proteus :)
