READRetro
Official code repository for the paper READRetro: Natural Product Biosynthesis Planning with Retrieval-Augmented Dual-View Retrosynthesis
This is the official code repository for the paper READRetro: Natural Product Biosynthesis Planning with Retrieval-Augmented Dual-View Retrosynthesis (bioRxiv, 2023).
We also provide a web version for ease of use.
Data
Download the necessary data folder READRetro_data from Zenodo to ensure proper execution of the code and demonstrations in this repository.
The directory structure of READRetro_data is as follows:
READRetro_data
├── data.sh
├── data
│   ├── model_train_data
│   └── multistep_data
├── model
│   ├── bionavi
│   ├── g2s
│   │   └── saved_models
│   ├── megan
│   └── retroformer
│       └── saved_models
├── result
└── scripts
Place READRetro_data into the READRetro directory (i.e., READRetro/READRetro_data) and run sh data.sh in READRetro_data to set up the data.
Ensure the data is correctly located in READRetro. Verify the following:
READRetro/retroformer/saved_models should match READRetro_data/model/retroformer/saved_models.
READRetro/g2s/saved_models should match READRetro_data/model/g2s/saved_models.
READRetro/data should match READRetro_data/data/multistep_data.
READRetro/result should match READRetro_data/result.
READRetro/scripts should match READRetro_data/scripts.
The directories READRetro_data/model/bionavi, READRetro_data/model/megan, and READRetro_data/data/model_train_data are required for reproducing the values in the manuscript.
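The checklist above can be automated with a small script. The following is a minimal sketch, not part of the repository: it assumes that data.sh leaves the source folders inside READRetro_data in place (copying or linking rather than moving them), and it compares file and directory names only, not file contents. The names EXPECTED_PAIRS, same_tree, and check_layout are hypothetical.

```python
import filecmp
from pathlib import Path

# Pairs taken from the checklist above:
# (location inside READRetro, source inside READRetro/READRetro_data).
EXPECTED_PAIRS = [
    ("retroformer/saved_models", "READRetro_data/model/retroformer/saved_models"),
    ("g2s/saved_models", "READRetro_data/model/g2s/saved_models"),
    ("data", "READRetro_data/data/multistep_data"),
    ("result", "READRetro_data/result"),
    ("scripts", "READRetro_data/scripts"),
]


def same_tree(left: Path, right: Path) -> bool:
    """Return True if both directories exist and hold the same entry names, recursively."""
    if not (left.is_dir() and right.is_dir()):
        return False
    cmp = filecmp.dircmp(left, right, ignore=[])
    # Any entry present on one side only, or with mismatched types, is a mismatch.
    if cmp.left_only or cmp.right_only or cmp.common_funny:
        return False
    return all(same_tree(left / name, right / name) for name in cmp.common_dirs)


def check_layout(repo_root: Path) -> dict:
    """Map each expected location to True (matches its source) or False."""
    return {dst: same_tree(repo_root / dst, repo_root / src) for dst, src in EXPECTED_PAIRS}


if __name__ == "__main__":
    # Assumes the script is run from the READRetro directory.
    for location, ok in check_layout(Path(".")).items():
        print(f"{'OK     ' if ok else 'MISSING'} {location}")
```

A name-only comparison keeps the check fast on large checkpoint files; a content check would need filecmp.cmp with shallow=False on each common file.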
Installation
Run the following commands to install the dependencies:
conda create -n readretro python=3.8
conda activate readretro
conda install pytorch==1.12.0 cudatoolkit=11.3 -c pytorch
pip install easydict pandas tqdm numpy==1.22 OpenNMT-py==2.3.0 networkx==2.5
conda install -c conda-forge rdkit=2019.09
Alternatively, you can install the readretro package through pip:
conda create -n readretro python=3.8 -y
conda activate readretro
pip install readretro==1.2.0
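After installation, it can help to confirm that the environment holds the pinned versions listed above. The sketch below is an illustration under stated assumptions: it uses the standard-library importlib.metadata (available from Python 3.8), it assumes the conda pytorch package registers itself under the distribution name torch, and it skips rdkit because older conda builds may not expose distribution metadata. The helper names normalize, matches_pin, and report are hypothetical, and the comparison is a simplified stand-in for the full version-matching rules used by pip.

```python
from importlib import metadata

# Versions copied from the install commands above.
PINNED = {"torch": "1.12.0", "numpy": "1.22", "OpenNMT-py": "2.3.0", "networkx": "2.5"}


def normalize(version: str) -> tuple:
    """Drop a local build tag (e.g. +cu113) and trailing zero components."""
    parts = version.split("+")[0].split(".")
    while len(parts) > 1 and parts[-1] == "0":
        parts.pop()
    return tuple(parts)


def matches_pin(installed: str, pinned: str) -> bool:
    """True if the installed version equals the pin up to zero padding (1.22 vs 1.22.0)."""
    return normalize(installed) == normalize(pinned)


def report(pins=PINNED, lookup=metadata.version) -> dict:
    """Map each package to (installed version or None, whether it matches the pin)."""
    rows = {}
    for name, pinned in pins.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        rows[name] = (installed, installed is not None and matches_pin(installed, pinned))
    return rows


if __name__ == "__main__":
    for name, (installed, ok) in report().items():
        print(f"{name}: installed={installed} pinned={PINNED[name]} match={ok}")
```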
Model Preparation
We provide the trained models through Zenodo.
You can also use your own models trained with the official code (https://github.com/coleygroup/Graph2SMILES and https://github.com/yuewan2/Retroformer).
More detailed instructions can be found in demo.ipynb.
Single-step Planning and Evaluation
Run the following commands to evaluate the single-step performance of the models:
CUDA_VISIBLE_DEVICES=${gpu_id} python eval_single.py # ensemble
CUDA_VISIBLE_DEVICES=${gpu_id} python eval_single.py -m retroformer # Retroformer
CUDA_VISIBLE_DEVICES=${gpu_id} python eval_single.py -m g2s -s 200 # Graph2SMILES
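To run the three evaluations in one go, the commands above can be wrapped in a short driver. This is a sketch only: the flags are copied from the commands above, while the CONFIGS table, build_command, and run_all are hypothetical names, and the script is assumed to be started from the READRetro directory inside the readretro environment. It defaults to a dry run that prints the commands without executing them.

```python
import os
import subprocess
import sys

# Flags copied from the commands above; the keys are display labels only.
CONFIGS = {
    "ensemble": [],
    "retroformer": ["-m", "retroformer"],
    "g2s": ["-m", "g2s", "-s", "200"],
}


def build_command(extra_args):
    """Build the argument list for one eval_single.py run using the current interpreter."""
    return [sys.executable, "eval_single.py", *extra_args]


def run_all(gpu_id: str, dry_run: bool = True):
    """Print each command and, unless dry_run is set, execute it on the chosen GPU."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu_id)
    commands = []
    for label, extra_args in CONFIGS.items():
        cmd = build_command(extra_args)
        commands.append(cmd)
        print(f"[{label}] CUDA_VISIBLE_DEVICES={gpu_id} {' '.join(cmd)}")
        if not dry_run:
            # check=True stops the loop at the first failing evaluation.
            subprocess.run(cmd, env=env, check=True)
    return commands


if __name__ == "__main__":
    run_all("0", dry_run=True)
```

Passing the GPU index through the env argument keeps CUDA_VISIBLE_DEVICES scoped to the child process instead of changing it for the whole session.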
Multi-step Planning
Run the following command to plan paths of multiple products using multiprocessing:
CUDA_VISIBLE_DEVICES=${gpu_id} python run_mp.py
# e.g., CUDA_VISIBLE_DEVICES=0 python run_mp.py
You can modify other hyperparameters described in run_mp.py.
Lower num_threads if you run out of GPU capacity.
Run the following command to plan the retrosynthesis path of your own molecule:
CUDA_VISIBLE_DEVICES=${gpu_id} python run.py ${product}
# e.g., CUDA_VISIBLE_DEVICES=0 python run.py 'O=C1C=C2C=CC(O)CC2O1'
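The product in the example is wrapped in single quotes because SMILES strings often contain characters such as parentheses that the shell would otherwise interpret. When generating such commands from a script, the quoting can be delegated to the standard library. The sketch below assumes nothing about run.py beyond the positional product argument shown above; shell_command and argv_command are hypothetical helper names.

```python
import shlex
import sys


def shell_command(product: str, gpu_id: int = 0) -> str:
    """Return a shell-safe command line for planning one product."""
    return f"CUDA_VISIBLE_DEVICES={gpu_id} python run.py {shlex.quote(product)}"


def argv_command(product: str) -> list:
    """Return an argument list for subprocess.run, where no shell quoting is needed."""
    return [sys.executable, "run.py", product]


if __name__ == "__main__":
    print(shell_command("O=C1C=C2C=CC(O)CC2O1"))
    # prints: CUDA_VISIBLE_DEVICES=0 python run.py 'O=C1C=C2C=CC(O)CC2O1'
```

The list form avoids the shell entirely, which is the safer choice when the SMILES strings come from a file or from user input.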
If you installed the package through pip, use the run_readretro command instead:
run_readretro -rc ${retroformer_ckpt} -gc ${g2s_ckpt} ${product}
# e.g., run_readretro -rc retroformer/saved_models/biochem.pt -gc g2s/saved_models/biochem.pt 'O=C1C=C2C=CC(O)CC2O1'
# you can replace the checkpoints with your own trained checkpoints of retroformer and g2s
# you should set the corresponding vocab file as an option if you replace the checkpoints
You can modify other hyperparameters described in run.py.
Multi-step Evaluation
Run the following command to evaluate the planned paths of the test molecules:
python eval.py ${save_file}
# e.g., python eval.py result/debug.txt
Demo
You can reproduce the figures and tables presented in the paper, or train your own models, using the provided demo.ipynb.
