
PADME: A Deep Learning-Based Framework for Drug-Target Interaction Prediction

This repository contains the source code for my Master's thesis research on predicting drug-target interaction using deep neural networks. The name PADME stands for "Protein And Drug Molecule interaction prEdiction", and also happens to be the name of the heroine of the Star Wars prequel trilogy. The paper can be found here: https://arxiv.org/abs/1807.09741

PADME currently depends on a version of the DeepChem Python package released in November 2017. Making it compatible with the current version of DeepChem will require major modifications, which I plan to make once the first version of the paper is finished. The dcCustom folder is a package that inherits some classes from DeepChem. Some of the implementations are customized, hence the name dcCustom, short for "Customized version of DeepChem".

The Python script driver.py at the top level calls functions in dcCustom to execute the program. A Linux system is assumed. The .sh files, whose names all start with the word drive, call driver.py and specify the options passed to the program, such as the dataset to analyze, the model to use, and whether to perform cross-validation.

Like DeepChem, PADME cannot parallelize a task across multiple GPUs, so running one process per GPU is the most efficient choice. Otherwise, the extra GPUs would have their memory fully occupied without doing any useful work, while a single GPU does all the computation. For this reason, CUDA_VISIBLE_DEVICES is set in each .sh file, so that multiple GPUs can be used with each one running its own process. To run the program, type the path to the corresponding shell script on the Linux command line.
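The one-process-per-GPU idea can also be expressed from Python. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes the shell scripts do not set CUDA_VISIBLE_DEVICES themselves (the scripts in this repository do, and a value set inside a script overrides the one passed in from outside).

```python
import os
import subprocess


def assign_scripts_to_gpus(scripts, gpu_ids):
    """Pair each script with its own GPU id; refuse to share a GPU."""
    if len(scripts) > len(gpu_ids):
        raise ValueError("need at least one GPU per script")
    return list(zip(scripts, gpu_ids))


def gpu_env(gpu_id, base_env=None):
    """Return a copy of the environment that exposes only one GPU."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def launch(scripts, gpu_ids):
    """Start every script as a separate process, one visible GPU each."""
    procs = []
    for script, gpu_id in assign_scripts_to_gpus(scripts, gpu_ids):
        # Each child process sees only the GPU assigned to it.
        procs.append(subprocess.Popen(["bash", script], env=gpu_env(gpu_id)))
    return procs
```

For example, `launch(["drive_a.sh", "drive_b.sh"], gpu_ids=[0, 1])` would start two processes, each seeing a single GPU; the script names here are placeholders for whichever drive*.sh files you want to run. Call `wait()` on each returned process to block until it finishes.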

The protein descriptor used is PSC (Protein Sequence Composition). The descriptors are stored as files in the respective dataset folders, such as /full_toxcast. You can specify the path to the protein descriptor files in the .sh scripts.
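This README does not describe the layout of the descriptor files. As a rough illustration of what a composition-based protein descriptor looks like, the sketch below computes only the simplest kind of composition: the fraction of each of the 20 standard amino acids in a sequence. It is an assumption-laden example with a hypothetical function name, and it does not reproduce the PSC files shipped in the dataset folders.

```python
from collections import Counter

# The 20 standard amino acids, as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence."""
    seq = sequence.upper()
    # Non-standard characters (e.g. X or gaps) are ignored.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]
```

The result is a fixed-length vector whose entries sum to 1, regardless of the sequence length, which is what makes composition features convenient as input to a neural network.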

Currently it works for graphconvreg, weave_regression, tf_regression, and mpnn. The classification models, such as weave and graphconv, still need updates before they work correctly.

DeepChem must be installed before PADME can work correctly, and DeepChem in turn requires TensorFlow.
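A quick way to confirm that both dependencies are importable before running anything is sketched below. The helper name is hypothetical; it only checks that the packages can be found, not that their versions match the ones PADME was developed against.

```python
from importlib.util import find_spec


def missing_packages(names=("tensorflow", "deepchem")):
    """Return the packages from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]
```

An empty list from `missing_packages()` means both packages were found in the current Python environment.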

Other folders, such as oldCode and phase1, are not related to PADME; they belong to the first phase of my project and can be ignored.

Built with

Python - Processing data and constructing the deep learning models

Author

simonfqy (Qingyuan Feng)
