READRetro: Natural Product Biosynthesis Planning with Retrieval-Augmented Dual-View Retrosynthesis

This is the official code repository for the paper READRetro: Natural Product Biosynthesis Planning with Retrieval-Augmented Dual-View Retrosynthesis (bioRxiv, 2023). We also provide a web version for ease of use.

Data

Download the READRetro_data folder from Zenodo. The code and demonstrations in this repository require it.

The directory structure of READRetro_data is as follows:

READRetro_data
    ├── data.sh
    ├── data
    │   ├── model_train_data
    │   └── multistep_data
    ├── model
    │   ├── bionavi
    │   ├── g2s
    │   │   └── saved_models
    │   ├── megan
    │   └── retroformer
    │       └── saved_models
    ├── result
    └── scripts

Place READRetro_data into the READRetro directory (i.e., READRetro/READRetro_data) and run sh data.sh in READRetro_data to set up the data.

After running data.sh, verify that the data is in the expected locations under READRetro:

READRetro/retroformer/saved_models should match READRetro_data/model/retroformer/saved_models.
READRetro/g2s/saved_models should match READRetro_data/model/g2s/saved_models.
READRetro/data should match READRetro_data/data/multistep_data.
READRetro/result should match READRetro_data/result.
READRetro/scripts should match READRetro_data/scripts.
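
The checks listed above can be automated. The following is a minimal sketch, not part of the repository: the helper names missing_entries and check_layout are hypothetical, and it assumes the script is given the READRetro root directory. Because it is not stated here whether data.sh copies or moves the folders, the sketch only compares contents when the source folder still exists and otherwise just confirms that the destination is present.

```python
from filecmp import dircmp
from pathlib import Path

# (destination under READRetro, source under READRetro_data), from the list above.
EXPECTED_PAIRS = [
    ("retroformer/saved_models", "READRetro_data/model/retroformer/saved_models"),
    ("g2s/saved_models", "READRetro_data/model/g2s/saved_models"),
    ("data", "READRetro_data/data/multistep_data"),
    ("result", "READRetro_data/result"),
    ("scripts", "READRetro_data/scripts"),
]


def missing_entries(source, dest):
    """List entries found under `source` but not under `dest`, recursively."""
    missing = []
    stack = [dircmp(str(source), str(dest))]
    while stack:
        current = stack.pop()
        missing.extend(str(Path(current.left) / name) for name in current.left_only)
        stack.extend(current.subdirs.values())
    return missing


def check_layout(repo_root):
    """Map each destination folder to a list of problems (empty list means ok)."""
    repo_root = Path(repo_root)
    report = {}
    for dest_rel, source_rel in EXPECTED_PAIRS:
        dest, source = repo_root / dest_rel, repo_root / source_rel
        if not dest.is_dir():
            report[dest_rel] = ["destination directory not found"]
        elif source.is_dir():
            report[dest_rel] = missing_entries(source, dest)
        else:
            # Source was moved or deleted: nothing left to compare against.
            report[dest_rel] = []
    return report


if __name__ == "__main__":
    for folder, problems in check_layout(".").items():
        print(folder, "ok" if not problems else problems)
```

Run it from the READRetro directory. Any folder reported with a non-empty list is missing entries that exist in READRetro_data. The comparison is by file and directory name only, not by file content.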

The directories READRetro_data/model/bionavi, READRetro_data/model/megan, and READRetro_data/data/model_train_data are needed to reproduce the values reported in the manuscript.

Installation

Run the following commands to install the dependencies:

conda create -n readretro python=3.8
conda activate readretro
conda install pytorch==1.12.0 cudatoolkit=11.3 -c pytorch
pip install easydict pandas tqdm numpy==1.22 OpenNMT-py==2.3.0 networkx==2.5
conda install -c conda-forge rdkit=2019.09

Alternatively, you can install the readretro package through pip:

conda create -n readretro python=3.8 -y
conda activate readretro
pip install readretro==1.2.0
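
After installing, it can help to confirm that the pinned versions are the ones actually present in the environment. The sketch below is an illustration, not part of the repository: release_tuple and check_pins are hypothetical helpers, the pins are copied from the commands above, and the comparison is deliberately loose (it ignores local suffixes such as +cu113 and trailing zeros). RDKit is left out because a conda-installed RDKit may not expose pip metadata.

```python
from importlib import metadata

# Pins copied from the install commands above (pip distribution names).
PINNED = {
    "torch": "1.12.0",
    "numpy": "1.22",
    "OpenNMT-py": "2.3.0",
    "networkx": "2.5",
}


def release_tuple(version):
    """Reduce a version string to its numeric components, e.g. '1.22.0' -> (1, 22)."""
    public = version.split("+", 1)[0]  # drop local suffix such as '+cu113'
    parts = [int(p) for p in public.split(".") if p.isdigit()]
    while parts and parts[-1] == 0:  # treat '1.22' and '1.22.0' as equal
        parts.pop()
    return tuple(parts)


def check_pins(pins):
    """Return a per-package status: 'ok', 'not installed', or a mismatch note."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        if release_tuple(found) == release_tuple(wanted):
            report[name] = "ok"
        else:
            report[name] = "found %s, expected %s" % (found, wanted)
    return report


if __name__ == "__main__":
    for name, status in check_pins(PINNED).items():
        print("%s: %s" % (name, status))
```

Run it inside the activated readretro environment. A package reported as not installed or mismatched is a likely first suspect if later steps fail.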

Model Preparation

We provide the trained models through Zenodo. Alternatively, you can use your own models trained with the official code for Graph2SMILES (https://github.com/coleygroup/Graph2SMILES) and Retroformer (https://github.com/yuewan2/Retroformer). See demo.ipynb for more detailed instructions.

Single-step Planning and Evaluation

Run the following commands to evaluate the single-step performance of the models:

CUDA_VISIBLE_DEVICES=${gpu_id} python eval_single.py                    # ensemble
CUDA_VISIBLE_DEVICES=${gpu_id} python eval_single.py -m retroformer     # Retroformer
CUDA_VISIBLE_DEVICES=${gpu_id} python eval_single.py -m g2s -s 200      # Graph2SMILES
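
To run all three evaluations in one go, the commands above can be driven from a small Python wrapper. This is a sketch under assumptions: build_eval_command and run_all are hypothetical names, the flag sets are copied verbatim from the commands above, and the wrapper is expected to be started from the READRetro directory so that eval_single.py resolves. It defaults to a dry run that only prints the commands.

```python
import os
import subprocess
import sys

# Flag sets copied from the three commands above.
EVAL_VARIANTS = {
    "ensemble": [],
    "retroformer": ["-m", "retroformer"],
    "g2s": ["-m", "g2s", "-s", "200"],
}


def build_eval_command(variant):
    """Return the argument list for one eval_single.py invocation."""
    return [sys.executable, "eval_single.py", *EVAL_VARIANTS[variant]]


def run_all(gpu_id, dry_run=True):
    """Print (dry run) or execute the three evaluations on the given GPU."""
    # CUDA_VISIBLE_DEVICES is passed through the environment, as in the shell form.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    commands = [build_eval_command(variant) for variant in EVAL_VARIANTS]
    for command in commands:
        if dry_run:
            print("CUDA_VISIBLE_DEVICES=%s %s" % (gpu_id, " ".join(command)))
        else:
            subprocess.run(command, env=env, check=True)
    return commands


if __name__ == "__main__":
    run_all(gpu_id=0, dry_run=True)
```

Set dry_run=False to execute the evaluations. With check=True, the wrapper stops at the first evaluation that exits with a non-zero status.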

Multi-step Planning

Run the following command to plan paths of multiple products using multiprocessing:

CUDA_VISIBLE_DEVICES=${gpu_id} python run_mp.py
# e.g., CUDA_VISIBLE_DEVICES=0 python run_mp.py

Other hyperparameters are described in run_mp.py and can be modified there. Lower num_threads if you run out of GPU memory.

Run the following command to plan the retrosynthesis path of your own molecule:

CUDA_VISIBLE_DEVICES=${gpu_id} python run.py ${product}
# e.g., CUDA_VISIBLE_DEVICES=0 python run.py 'O=C1C=C2C=CC(O)CC2O1'

If you installed the readretro package through pip, use the run_readretro command instead:

run_readretro -rc ${retroformer_ckpt} -gc ${g2s_ckpt} ${product}
# e.g., run_readretro -rc retroformer/saved_models/biochem.pt -gc g2s/saved_models/biochem.pt 'O=C1C=C2C=CC(O)CC2O1'
# You can replace the checkpoints with your own trained Retroformer and Graph2SMILES checkpoints.
# If you do, also set the corresponding vocab file as an option.

You can modify other hyperparameters described in run.py.
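
SMILES strings often contain characters such as parentheses that the shell interprets, which is why the examples above quote the product. When planning for several molecules, the command lines can be generated with proper quoting. The sketch below is illustrative only: build_run_commands is a hypothetical helper, the checkpoint paths are the ones from the example above, and the second SMILES (CCO, ethanol) is just a placeholder input.

```python
import shlex


def build_run_commands(products, retroformer_ckpt, g2s_ckpt):
    """Return one shell-safe run_readretro command line per product SMILES."""
    commands = []
    for smiles in products:
        argv = ["run_readretro", "-rc", retroformer_ckpt, "-gc", g2s_ckpt, smiles]
        # shlex.join quotes only the arguments that need it (Python 3.8+).
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    lines = build_run_commands(
        ["O=C1C=C2C=CC(O)CC2O1", "CCO"],
        retroformer_ckpt="retroformer/saved_models/biochem.pt",
        g2s_ckpt="g2s/saved_models/biochem.pt",
    )
    for line in lines:
        print(line)
```

The printed lines can be pasted into a shell script. The first product is wrapped in single quotes because of its parentheses, while CCO needs no quoting.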

Multi-step Evaluation

Run the following command to evaluate the planned paths of the test molecules:

python eval.py ${save_file}
# e.g., python eval.py result/debug.txt

Demo

Use the provided demo.ipynb to reproduce the figures and tables presented in the paper or to train your own models.
