CBGBench
Official code repository of < CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph >
Install / Use
/learn @EDAPINENUT/CBGBenchREADME
CBGBench
<p align="center"> <img src="docs/cbg_logo.png" width="400" class="center" alt="CBGBench Logo"/> <br/> </p><a href="https://pytorch.org/get-started/locally/"><img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-ee4c2c?logo=pytorch&logoColor=white"></a>
<a href="https://img.shields.io/github/stars/EDAPINENUT/CBGBench" alt="arXiv">
<img src="https://img.shields.io/github/stars/EDAPINENUT/CBGBench" /></a>
CBGBench: Complex Binding Graph Benchmark is a benchmark for generative target-aware molecule design.
Paper [DiffBP] | Paper [D3FG] | Benchmark
[!NOTE]
- Feb. 2025: Accepted by ICLR2025 as spotlight presentation. Congrats! Some of pretrained checkpoints of the edge-cutting methods have been released, pls download from Google Drive for further usage.
- Dec. 2024: We have uploaded the molecules generated by the SOTA Methods for de novo design (LiGAN, Pocket2Mol, TargetDiff, D3FG, DecomptDiff, VoxBind, and MolCraft) for evaluation. For most of them, the generated number is 100/pockets, while for some 200 (2 times) in order to check if it is necessary to report multi-seeds' results. If it is useful for your research, pls download from the Google Drive and cite the paper.
- Oct. 2024: CBGBench paper v3 has just integrated the state-of-the-art MolCraft and VoxBind into evaluation, and the integration of their codes is in process.
- Sep. 2024: CBGBench has joined the DeepModeling community, a community devoted to AI for science, as a sandbox-level project. Learn more about DeepModeling
Introduction
This is the official code repository of the paper 'CBGBench: Fill in the Blank of Protein-Molecule Binding Graph', which aims to unify target-aware molecule design with single code implementation. Until now, we have included 7 methods as shown below: | Model | Paper link | Github | |------------|---------------------------------------------|-------------------------------------------| | Pocket2Mol | https://arxiv.org/abs/2205.07249 | https://github.com/pengxingang/Pocket2Mol | | GraphBP | https://arxiv.org/abs/2204.09410 | https://github.com/divelab/GraphBP | | DiffSBDD | https://arxiv.org/abs/2210.13695 | https://github.com/arneschneuing/DiffSBDD | | DiffBP | https://arxiv.org/abs/2211.11214 | Here is the official implementation. | | TargetDiff | https://arxiv.org/abs/2303.03543 | https://github.com/guanjq/targetdiff | | FLAG | https://openreview.net/forum?id=Rq13idF0F73 | https://github.com/zaixizhang/FLAG | | D3FG | https://arxiv.org/abs/2306.13769 | Here is the official implementation. |
These models are initially established for de novo molecule generation, and we extend more tasks including linker design, fragment growing, scaffold hopping, and side chain decoration.
Installation
Create environment with basic packages.
conda env create -f environment.yml
conda activate cbgbench
Install pytorch and torch_geometric
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia
conda install pyg pytorch-scatter pytorch-cluster -c pyg
Install tools for chemistry
# install rdkit, efgs, obabel, etc.
pip install --use-pep517 EFGs
pip install biopython
pip install lxml
conda install rdkit openbabel tensorboard tqdm pyyaml easydict python-lmdb -c conda-forge
# install plip
mkdir tools
cd tools
git clone https://github.com/pharmai/plip.git
cd plip
python setup.py install
alias plip='python plip/plip/plipcmd.py'
cd ..
### Note that if there is an error in setup.py, it can be ignored as long as openbabel is installed.
# install docking tools
conda install -c conda-forge numpy swig boost-cpp sphinx sphinx_rtd_theme
python -m pip install git+https://github.com/Valdes-Tresanco-MS/AutoDockTools_py3
pip install meeko==0.1.dev3 scipy pdb2pqr vina
# If you are unable to install vina, you can try: conda install vina
# If you encounter the following error:
# ImportError: libtiff.so.5: cannot open shared object file: No such file or directory
# Please try the following steps to resolve it:
pip uninstall pillow
pip install pillow
Prepare Dataset
CrossDocked2020
1. Download processed datasets
(i) Download the processed dataset from Google Drive. For each dataset, it corresponds to different methods and tasks. Please refer to the Table below to find the processed data that you need to use. Note that to train D3FG, it is a two-stage model that firstly generates functional groups and then generates linker atoms, so we named the two-stage model 'D3FG_fg' and 'D3FG_linker', and the training data is different for them.
Table: The data file name corresponding to different methods and tasks. | model | task | file name | |-------------|------------|---------------------------------------------------------------------------| | Pocket2Mol | de novo | data/pl/crossdocked_v1.1_rmsd1.0_pocket10_processed_fullatom.lmdb | | GraphBP | de novo | data/pl/crossdocked_v1.1_rmsd1.0_pocket10_processed_fullatom.lmdb | | DiffSBDD | de novo | data/pl/crossdocked_v1.1_rmsd1.0_pocket10_processed_fullatom.lmdb | | DiffBP | de novo | data/pl/crossdocked_v1.1_rmsd1.0_pocket10_processed_fullatom.lmdb | | TargetDiff | de novo | data/pl/crossdocked_v1.1_rmsd1.0_pocket10_processed_fullatom.lmdb | | FLAG | de novo | data/pl_arfg/crossdocked_v1.1_rmsd1.0_pocket10_processed_arfuncgroup.lmdb | | D3FG | de novo | data/pl_fg/crossdocked_v1.1_rmsd1.0_pocket10_processed_funcgroup.lmdb | | Pocket2Mol | fragment | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_frag.lmdb | | GraphBP | fragment | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_frag.lmdb | | DiffSBDD | fragment | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_frag.lmdb | | DiffBP | fragment | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_frag.lmdb | | TargetDiff | fragment | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_frag.lmdb | | Pocket2Mol | linker | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_linker.lmdb | | GraphBP | linker | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_linker.lmdb | | DiffSBDD | linker | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_linker.lmdb | | DiffBP | linker | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_linker.lmdb | | TargetDiff | linker | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_linker.lmdb | | Pocket2Mol | scaffold | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_scaffold.lmdb | | GraphBP | scaffold | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_scaffold.lmdb | | DiffSBDD | scaffold | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_scaffold.lmdb | | DiffBP | scaffold | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_scaffold.lmdb | | TargetDiff | scaffold | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_scaffold.lmdb | | Pocket2Mol | side chain | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_sidechain.lmdb | | GraphBP | side chain | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_sidechain.lmdb | | DiffSBDD | side chain | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_sidechain.lmdb | | DiffBP | side chain | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_sidechain.lmdb | | TargetDiff | side chain | data/pl_decomp/crossdocked_v1.1_rmsd1.0_pocket10_processed_sidechain.lmdb |
(ii) Copy the datasets to ./data dir, in which the complete data file directory will look like
- CGBBench
- data
- pl
- crossdocked_name2id.pt
- crossdocked_v1.1_rmsd1.0_pocket10_processed_fullatom.lmdb
- crossdocked_v1.1_rmsd1.0_pocket10_processed_linker.lmdb
- pl_arfg
- crossdocked_name2id_arfuncgroup.pt
- crossdocked_v1.1_rmsd1.0_pocket10_processed_arfuncgroup.lmdb
- pl_decomp
- crossdocked_name2id_frag.pt
- crossdocked_name2id_linker.pt
- crossdocked_name2id_scaffold.pt
- crossdocked_name2id_sidechain.pt
- crossdocked_v1.1_rmsd1.0_pocket10_processed_frag.lmdb
- crossdocked_v1.1_rmsd1.0_pocket10_processed_linker.lmdb
- crossdocked_v1.1_rmsd1.0_pocket10_processed_scaffold.lmdb
- crossdocked_v1.1_rmsd1.0_pocket10_processed_sidechain.lmdb
- pl_fg
- crossdocked_name2id_funcgroup.pt
- crossdocked_v1.1_rmsd1.0_pocket10_processed_funcgroup.lmdb
2. Download raw data and prepare from scratch
(i) Download crossdocked_v1.1_rmsd1.0.tar.gz from TargetDiff Drive, and copy it to ./raw_data/ with
mkdir raw_data
tar -xzvf crossdocked_v1.1_rmsd1.0.tar.gz ./raw_data
(ii) Run the following:
python ./scripts/extract_pockets.py --source raw_data/crossdocked_v1.1_rmsd1.0 --dest raw_data/crossdocked_v1.1_rmsd1.0_pocket10
(iii) In training, the Dataset will be prepared for each task. Pleas
Related Skills
clearshot
Structured screenshot analysis for UI implementation and critique. Analyzes every UI screenshot with a 5×5 spatial grid, full element inventory, and design system extraction — facts and taste together, every time. Escalates to full implementation blueprint when building. Trigger on any digital interface image file (png, jpg, gif, webp — websites, apps, dashboards, mockups, wireframes) or commands like 'analyse this screenshot,' 'rebuild this,' 'match this design,' 'clone this.' Skip for non-UI images (photos, memes, charts) unless the user explicitly wants to build a UI from them. Does NOT trigger on HTML source code, CSS, SVGs, or any code pasted as text.
openpencil
2.0kThe world's first open-source AI-native vector design tool and the first to feature concurrent Agent Teams. Design-as-Code. Turn prompts into UI directly on the live canvas. A modern alternative to Pencil.
HappyColorBlend
HappyColorBlendVibe Project Guidelines Project Overview HappyColorBlendVibe is a Figma plugin for color palette generation with advanced tint/shade blending capabilities. It allows designers to
Flyaro-waffle-app
Waffle Delight - Full Stack MERN Application Rules & Documentation Project Overview A comprehensive waffle delivery application built with MERN stack featuring premium UI/UX, admin management, a
