SemaTyP
The source code and data of our paper: "SemaTyP: a knowledge graph based literature mining method for drug discovery"
SemaTyP: a knowledge graph based literature mining method for drug discovery
This is the source code and data for the task of drug discovery as described in our paper: "SemaTyP: a knowledge graph based literature mining method for drug discovery"
Requirements
- scikit-learn
- numpy
- tqdm
Data
To use the code, you need to provide the following data:
- Therapeutic Target Database (TTD): You do not need to download it yourself; the complete TTD 2016 version is already included in <./data/TTD>.
- SemMedDB: You need to download it from here with password: 1234 to obtain the whole knowledge graph. After downloading the "predications.txt" file, replace <./data/SemedDB/predications.txt> with the newly downloaded file.
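Before building the experimental data, it can help to confirm that the full file has actually replaced the bundled sample. The following is a minimal sketch, not part of the repository: it assumes one relation per line (the file layout is not stated here, so this is an assumption), and `count_relations` is a hypothetical helper name.

```python
from pathlib import Path


def count_relations(path):
    """Count non-empty lines, assuming one relation per line (an assumption)."""
    with Path(path).open("r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())


if __name__ == "__main__":
    # Path as written in this README; adjust it if your checkout differs.
    predications = Path("./data/SemedDB/predications.txt")
    if predications.exists():
        print(f"{predications}: {count_relations(predications)} relations")
    else:
        print(f"{predications} not found; download it first")
```

A count close to 100 would suggest that the small sample file is still in place rather than the full download.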
Run the code
Install the environment.
pip install -r requirements.txt
Construct training and test data.
python experimental_data.py
Train and test the model.
python main.py
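The three steps above can also be chained from a single script so that a failure in one step stops the rest. This is only a sketch under the assumption that it is run from the repository root; `STEPS` and `run_steps` are hypothetical names, and the script names are the ones listed above.

```python
import subprocess
import sys

# Commands in the order listed above, run with the current interpreter.
STEPS = [
    [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
    [sys.executable, "experimental_data.py"],
    [sys.executable, "main.py"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order; stop at the first non-zero exit status."""
    for command in steps:
        # check=True makes subprocess.run raise CalledProcessError on failure.
        runner(command, check=True)
```

Calling `run_steps(STEPS)` executes the full pipeline. The `runner` parameter exists only so the sequence can be inspected or dry-run without launching any process.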
Illustration of feature selection
<div align=center><img width="800" height="300" src="https://github.com/ShengtianSang/SemaTyP/blob/main/figures/Illustration_of_Feature_selection.jpg"/></div> <p align="center"> An illustration of the features constructed in our work. </p>
File declaration
data/SemmedDB: contains all relations extracted from SemMedDB, which are used to construct the Knowledge Graph in our experiment. The full "predications.txt" contains 39,133,975 relations; only a small sample "predications.txt" file containing 100 relations is kept here. The full "predications.txt" file can be downloaded as described in the Data section.
data/TTD: contains the drug, target and disease relations retrieved from the Therapeutic Target Database.
experimental_data.py: constructs the drug-target-disease associations from TTD and the Knowledge Graph.
knowledge_graph.py: constructs the Knowledge Graph used in our experiment.
data_loader.py: used to load training and test data.
main.py: used to train and test the models.
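To make the roles of knowledge_graph.py and experimental_data.py more concrete, the sketch below builds a small directed graph from subject-predicate-object triples and enumerates short paths from a drug to a disease. It is an illustration only and not the repository's implementation: the function names, the in-memory triple format, and the placeholder entities and predicates are all hypothetical.

```python
from collections import defaultdict


def build_graph(triples):
    """Map each subject to a list of (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def find_paths(graph, start, goal, max_edges=2):
    """Return all simple paths from start to goal using at most max_edges edges.

    A path is a list alternating entities and predicates:
    [entity, predicate, entity, ...].
    """
    paths = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            paths.append(path)
            continue
        if (len(path) - 1) // 2 >= max_edges:
            continue  # edge budget used up
        for predicate, neighbor in graph.get(node, []):
            if neighbor not in path[::2]:  # entities sit at even positions
                stack.append((neighbor, path + [predicate, neighbor]))
    return paths


# Placeholder triples, for illustration only (not taken from the real data).
toy_triples = [
    ("drug_A", "INTERACTS_WITH", "target_B"),
    ("target_B", "ASSOCIATED_WITH", "disease_C"),
    ("drug_A", "TREATS", "disease_C"),
]
toy_graph = build_graph(toy_triples)
for path in find_paths(toy_graph, "drug_A", "disease_C"):
    print(" -> ".join(path))
```

Bounding the number of edges keeps the search tractable on a graph with tens of millions of relations; the actual path length and features used in the experiments are defined by the repository code and the paper, not by this sketch.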
Cite
Please cite our paper if you use this code in your own work:
@article{sang2018sematyp,
title={SemaTyP: a knowledge graph based literature mining method for drug discovery},
author={Sang, Shengtian and Yang, Zhihao and Wang, Lei and Liu, Xiaoxia and Lin, Hongfei and Wang, Jian},
journal={BMC bioinformatics},
volume={19},
number={1},
pages={1--11},
year={2018},
publisher={Springer}
}
