DeepDrugDomain

DeepDrugDomain: A versatile Python toolkit for streamlined preprocessing and accurate prediction of drug-target interactions and binding affinities, leveraging deep learning for advancing computational drug discovery.

DeepDrugDomain is a comprehensive Python toolkit aimed at simplifying and accelerating the process of drug-target interaction (DTI) and drug-target affinity (DTA) prediction using deep learning. With a flexible preprocessing pipeline and modular design, DeepDrugDomain supports innovative research and development in computational drug discovery.

Features

DeepDrugDomain provides the following core capabilities for researchers in computational drug discovery:

Extensive Preprocessing Capabilities

Comprehensive Preparation Tools: Streamline your data preparation with our extensive suite of preprocessing tools.
Support for Diverse Data: Cater to a wide array of data formats prevalent in drug discovery, ensuring compatibility and ease of integration.

Modular Design for Flexibility

Customizable Components: Adapt the toolkit to meet your research needs with highly customizable components.
Simplified Model Creation: Our modular design principle makes model creation and experimentation a straightforward process, saving time and reducing complexity.

Stateful Evaluation Metrics

Consistent Performance Tracking: Integrated metrics provide a consistent framework for tracking the performance of models.
Reproducibility and Accuracy: These metrics are integral in ensuring the reproducibility of results and the accuracy of predictions.
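
To illustrate what "stateful" means here, the following is a minimal sketch of a metric that accumulates counts across batches and reports one value per epoch. The class name `StatefulAccuracy` and its `update`/`compute`/`reset` methods are hypothetical and are not taken from DeepDrugDomain's API; only the idea of a decision threshold mirrors the `threshold=0.5` argument used in the Quick Start.

```python
from dataclasses import dataclass


@dataclass
class StatefulAccuracy:
    """Hypothetical stateful metric: accumulates counts across batches."""
    threshold: float = 0.5
    correct: int = 0
    total: int = 0

    def update(self, probabilities, labels):
        # Binarize each predicted probability at the threshold, then count matches.
        for p, y in zip(probabilities, labels):
            self.correct += int((p >= self.threshold) == bool(y))
            self.total += 1

    def compute(self):
        # Accuracy over everything seen since the last reset.
        return self.correct / self.total if self.total else 0.0

    def reset(self):
        self.correct = 0
        self.total = 0


metric = StatefulAccuracy(threshold=0.5)
metric.update([0.9, 0.2, 0.7], [1, 0, 0])  # batch 1: 2 of 3 correct
metric.update([0.4], [0])                  # batch 2: 1 of 1 correct
epoch_accuracy = metric.compute()          # 3 correct out of 4 samples
metric.reset()                             # start the next epoch from zero
```

Accumulating raw counts rather than averaging per-batch scores keeps the result independent of batch size, which is one reason stateful metrics help with reproducibility.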

Custom Activation Functions

Integration of Novel Functions: Introduce and integrate custom activation functions with ease to enhance your models.
Boost to Model Adaptability: This feature allows models to be more adaptable and effective in handling complex drug discovery tasks.

Comprehensive Task Support

Support for Core Tasks: DeepDrugDomain comes with built-in support for key tasks such as drug-target interaction (DTI) and drug-target affinity (DTA).
Tailored for Drug Discovery: The toolkit is crafted to meet the unique challenges faced in drug discovery, providing tailored support that drives innovation and progress.

Facilitation of Model Augmentation

Decorator Design: Augment models seamlessly with new inputs, enhancing the toolkit's utility and application scope.
Accuracy Improvement: Adding an input takes a single line of code, which can improve the accuracy of an existing model without rewriting it.
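
The decorator idea can be sketched in plain Python as follows. This is an illustration of the pattern only, under stated assumptions: `add_input`, `ToyModel`, and the `fingerprint` keyword are hypothetical names, and DeepDrugDomain's own decorators may work differently.

```python
def add_input(name, encoder):
    """Hypothetical class decorator: feed one extra encoded input to a model."""
    def decorate(model_cls):
        original_forward = model_cls.forward

        def forward(self, features, **extra):
            # Encode the extra input and append it to the base features
            # before delegating to the undecorated forward pass.
            augmented = list(features) + list(encoder(extra[name]))
            return original_forward(self, augmented)

        model_cls.forward = forward
        return model_cls
    return decorate


@add_input("fingerprint", encoder=lambda bits: [float(b) for b in bits])
class ToyModel:
    def forward(self, features):
        # Stand-in for a network: just sum the feature vector.
        return sum(features)


# The decorated model now accepts the extra keyword input.
score = ToyModel().forward([0.5, 1.5], fingerprint="101")
```

The single `@add_input(...)` line is the only change to the model definition, which is the property the feature description refers to.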

Benchmarking

Built-in Benchmarks: Leverage the pre-implemented benchmark models to gauge performance and validate outcomes.
Customizability: Tailor the architecture of the implemented models to meet specific research requirements.

Expandability

Continuous Development: Designed with the future in mind, DeepDrugDomain encourages and facilitates continuous expansion and incorporation of new features.
Custom Instantiation: Choose to instantiate components in their default configuration or customize them for a more tailored experience.
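
The Quick Start creates components by string key (for example `ModelFactory.create("attentionsitedti")`). A registry-based factory is one common way to support both default and customized instantiation; the sketch below shows that pattern in isolation. `ComponentFactory`, `ToyEncoder`, and their parameters are hypothetical and do not describe the package's internals.

```python
class ComponentFactory:
    """Hypothetical registry mapping string keys to component classes."""
    _registry = {}

    @classmethod
    def register(cls, key):
        def decorator(component_cls):
            cls._registry[key] = component_cls
            return component_cls
        return decorator

    @classmethod
    def create(cls, key, **overrides):
        if key not in cls._registry:
            raise KeyError(f"unknown component: {key!r}")
        # Keyword overrides replace only the defaults the caller names.
        return cls._registry[key](**overrides)


@ComponentFactory.register("toy_encoder")
class ToyEncoder:
    def __init__(self, hidden_dim=128, dropout=0.1):
        self.hidden_dim = hidden_dim
        self.dropout = dropout


default_encoder = ComponentFactory.create("toy_encoder")
custom_encoder = ComponentFactory.create("toy_encoder", hidden_dim=256)
```

New components plug in by registering under a new key, so existing calling code does not need to change.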

Ease of Use

Simplified Drug Discovery: Remove the complexity from drug discovery tasks. DeepDrugDomain comes with a comprehensive suite of tools for easy preprocessing of any generic data.
User-Friendly Model Training: Whether you're defining new models or utilizing pre-implemented ones, the process is straightforward and user-friendly, requiring minimal setup.

By integrating these advanced features, DeepDrugDomain stands out as a toolkit that not only meets the current demands of drug discovery but also adapts to its future challenges and opportunities.

Installation

For now, you can set up the following environment for both usage and development:

conda create --name deepdrugdomain python=3.11
conda activate deepdrugdomain
pip install dgl -f https://data.dgl.ai/wheels/repo.html
conda install -c conda-forge rdkit
pip install git+https://github.com/yazdanimehdi/deepdrugdomain.git

Quick Start

import torch
from torch.utils.data import DataLoader

import deepdrugdomain as ddd
# NOTE: the two factory import paths below are assumptions; adjust them to
# match the layout of the installed package.
from deepdrugdomain.models.factory import ModelFactory
from deepdrugdomain.optimizers.factory import OptimizerFactory

# Fixed seed so the random split is reproducible (the value is arbitrary)
seed = 42

# Use the GPU if available, else the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Build the model and the preprocessing pipeline it expects
model = ModelFactory.create("attentionsitedti")
preprocesses = ddd.data.PreprocessingList(model.default_preprocess(
    "SMILES", "pdb_id", "Label"))

# Load the "human" dataset and split it into train/validation/test subsets
dataset = ddd.data.DatasetFactory.create(
    "human", file_paths="data/human/", preprocesses=preprocesses)
datasets = dataset(split_method="random_split",
                    frac=[0.8, 0.1, 0.1], seed=seed, sample=0.1)

# The model supplies its own collate function for batching
collate_fn = model.collate

data_loader_train = DataLoader(
    datasets[0], batch_size=64, shuffle=True, num_workers=0, pin_memory=True, drop_last=True, collate_fn=collate_fn)
data_loader_val = DataLoader(datasets[1], drop_last=False, batch_size=32,
                                num_workers=4, pin_memory=False, collate_fn=collate_fn)
data_loader_test = DataLoader(datasets[2], drop_last=False, batch_size=32,
                                num_workers=4, pin_memory=False, collate_fn=collate_fn)

# Binary cross-entropy loss for interaction labels, optimized with Adam
criterion = torch.nn.BCELoss()
optimizer = OptimizerFactory.create(
    "adam", model.parameters(), lr=1e-3, weight_decay=0.0)
scheduler = None
model.to(device)

# Track accuracy during training; report the full metric set at evaluation
train_evaluator = ddd.metrics.Evaluator(["accuracy_score"], threshold=0.5)
test_evaluator = ddd.metrics.Evaluator(
    ["accuracy_score", "f1_score", "auc", "precision_score", "recall_score"], threshold=0.5)
epochs = 3000
accum_iter = 1

# Baseline validation metrics before any training
print(model.evaluate(data_loader_val, device,
        criterion, evaluator=test_evaluator))
for epoch in range(epochs):
    print(f"Epoch {epoch}:")
    model.train_one_epoch(data_loader_train, device, criterion,
                            optimizer, num_epochs=200, scheduler=scheduler, evaluator=train_evaluator, grad_accum_steps=accum_iter)
    print(model.evaluate(data_loader_val, device,
                            criterion, evaluator=test_evaluator))

# Final metrics on the held-out test split
print(model.evaluate(data_loader_test, device,
                        criterion, evaluator=test_evaluator))
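
The Quick Start passes `split_method="random_split"`, `frac=[0.8, 0.1, 0.1]`, and a `seed` when splitting the dataset. As a rough sketch of what a seeded fractional split usually does, the standalone function below shuffles indices with a fixed seed and cuts them by fraction. It is an assumption about the general technique, not the package's implementation, and the function name is hypothetical.

```python
import random


def random_split(items, frac, seed):
    """Shuffle indices with a fixed seed, then cut them into len(frac) parts."""
    if abs(sum(frac) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)

    splits, start = [], 0
    for i, f in enumerate(frac):
        # The last split takes the remainder so no item is lost to rounding.
        end = len(items) if i == len(frac) - 1 else start + int(f * len(items))
        splits.append([items[j] for j in indices[start:end]])
        start = end
    return splits


train, val, test = random_split(list(range(100)), frac=[0.8, 0.1, 0.1], seed=42)
```

Reusing the same seed reproduces the same three subsets, which is what makes results comparable across runs.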

Examples

The example folder contains a collection of scripts and notebooks demonstrating various capabilities of DeepDrugDomain. Below is an overview of what each example covers:

Training Different Models

attentionsitedti.ipynb: A Jupyter notebook that briefly walks through training AttentionSiteDTI with a custom configuration and a modified model architecture.

Other Functionalities

Supported Preprocessings

The following table lists the preprocessing methods supported by the package, detailing the data conversion, settings options, and the models that use them:

Ligand Preprocessing Methods

| Method | Converts From | Converts To | Settings Options | Used in Models |
|---|---|---|---|---|
| smiles_to_encoding | SMILES | Encoding Tensor | one_hot: bool, embedding_dim: Optional[int], max_sequence_length: Optional[int], replacement_dict: Dict[str, str], token_regex: Optional[str], from_set: Optional[Dict[str, int]] | DrugVQA, AttentionDTA |
| smile_to_graph | SMILES | Graph | node_featurizer: Callable, edge_featurizer: Optional[Callable], consider_hydrogen: bool, fragment: bool, hops: int | AMMVF, AttentionSiteDTI, FragXsiteDTI, CSDTI |
| smile_to_fingerprint | SMILES | Fingerprint | method: str (refer to the Supported Fingerprinting Methods table for detailed settings) | AMMVF |

For detailed information on fingerprinting methods, please see the Supported Fingerprinting Methods section.
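
To make the `smiles_to_encoding` settings more concrete, here is a minimal sketch of how options such as `token_regex`, `replacement_dict`, `max_sequence_length`, and `one_hot` could interact. The regex, the vocabulary, and both function names are assumptions chosen for illustration; the package's actual tokenizer and defaults may differ.

```python
import re

# Assumed tokenizer pattern: bracket atoms, two-letter halogens, then any single character.
TOKEN_REGEX = r"\[[^\]]+\]|Br|Cl|."


def smiles_to_indices(smiles, vocab, max_sequence_length, replacement_dict=None):
    """Tokenize a SMILES string and map tokens to integer ids, padded with 0."""
    for old, new in (replacement_dict or {}).items():
        smiles = smiles.replace(old, new)
    tokens = re.findall(TOKEN_REGEX, smiles)
    # Unknown tokens also map to 0 here; truncate, then right-pad to a fixed length.
    ids = [vocab.get(tok, 0) for tok in tokens][:max_sequence_length]
    return ids + [0] * (max_sequence_length - len(ids))


def one_hot(ids, vocab_size):
    """Expand each id into a one-hot row of width vocab_size."""
    return [[1 if i == idx else 0 for i in range(vocab_size)] for idx in ids]


# Hypothetical vocabulary; id 0 is reserved for padding and unknown tokens.
vocab = {"C": 1, "O": 2, "N": 3, "Cl": 4, "(": 5, ")": 6, "=": 7}
ids = smiles_to_indices("CC(=O)Cl", vocab, max_sequence_length=10)
matrix = one_hot(ids, vocab_size=len(vocab) + 1)
```

Matching `Cl` before the single-character fallback keeps chlorine as one token instead of splitting it into `C` and `l`, which is the usual reason such encoders expose a configurable token regex.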

Supported Fingerprinting Methods

| Method Name | Description | Settings Options |
|---|---|---|
| RDKit | Converts SMILES to RDKit fingerprints, capturing molecular structure information. | radius: Optional[int], nBits: Optional[int] |
| Morgan | Generates circular fingerprints, representing the environment of each atom in a molecule. | radius: Optional[int], nBits: Optional[int] |
| Daylight | Traditional method to encode molecular features, focusing on specific substructure patterns. | nBits: Optional[int] |
| ErG | Extended reduced graph-based approach, emphasizing molecular topology. | nBits: Optional[int], atom_dict: Optional[AtomDictType], bond_dict: Optional[BondDictType] |
| RDKit2D | Two-dimensional variant of RDKit, detailing planar molecular structures. | nBits: Optional[int], atom_dict: Optional[AtomDictType], bond_dict: Optional[BondDictType] |
