17 skills found
MorganCThomas / MolScore: An automated scoring function to facilitate and standardize the evaluation of goal-directed generative models for de novo molecular design
pulimeng / EToxPred: A simple tool to predict the general toxicity and calculate the synthetic accessibility (SA) score of small molecules.
dbkgroup / Prop Gen: A novel hybrid method for generating molecules with desired property scores.
prescient-design / Funcmol: Score-based 3D molecule generation with neural fields (NeurIPS 2024)
filipsPL / Annapurna: AnnapuRNA, a scoring function for predicting RNA-small molecule interactions.
forlilab / Bottchscore: Calculate the Böttcher score of small molecules (doi.org/10.1021/acs.jcim.5b00723)
ncbi / Pubchem Align3d: A generic C++ library for rapidly aligning two small molecules in 3D space, with shape (and optionally color) Tanimoto scoring. It includes a sample RDKit application that reads in any two SDF-format chemicals and aligns them.
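For readers unfamiliar with shape Tanimoto scoring, here is a minimal sketch of the standard overlap-based formula, T = O_AB / (O_AA + O_BB - O_AB). The function name `shape_tanimoto` and its arguments are hypothetical and are not part of the library's API; the overlap volumes are assumed to have been computed elsewhere.

```python
def shape_tanimoto(overlap_ab: float, self_overlap_a: float, self_overlap_b: float) -> float:
    """Tanimoto score from overlap volumes (hypothetical helper, not the library's API).

    overlap_ab     -- overlap volume of molecule A with molecule B after alignment
    self_overlap_a -- overlap volume of A with itself
    self_overlap_b -- overlap volume of B with itself
    """
    denominator = self_overlap_a + self_overlap_b - overlap_ab
    if denominator <= 0:
        raise ValueError("overlap volumes must give a positive denominator")
    return overlap_ab / denominator


# Identical shapes overlap completely, so the score is 1.0.
print(shape_tanimoto(10.0, 10.0, 10.0))
# Partial overlap gives a score between 0 and 1.
print(shape_tanimoto(5.0, 10.0, 10.0))
```

The score is 1 when the two shapes coincide and falls toward 0 as the overlap shrinks.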
DivyaKarade / Deep Learning Classification Based Model For Screening Compounds With HERG Inhibitory Activity: A deep learning classification model for screening pharmaceutical compounds for hERG inhibitory activity (cardiotoxicity), used to screen the CAS antiviral database for compounds with cardiotoxicity potential. The data are derived from "Drug Discovery Hackathon 2020: PS ID: DDT2-13" (https://innovateindia.mygov.in/ddh2020/problem-statements/). Further project details are available at https://youtu.be/7tqaPmYQmCM. Note: the problem is solved with a deep learning classification model instead of the linear discriminant analysis model named in the problem statement. Details of the project: In silico prediction of cardiotoxicity with high sensitivity and specificity for potential drug molecules would be of immense value, so building classification-based machine learning models that predict cardiotoxicity efficiently is critical. A data set of diverse pharmaceutical compounds with hERG channel inhibitory activity (blocker/non-blocker) is provided, with SMILES notations for all compounds. The compounds were divided into a training set and a test set in a 70:30 ratio. Simple, reproducible, and easily transferable classification models were developed from the training-set compounds using 2D descriptors and validated on the test-set compounds.
Model quality:
Training set: ROC AUC 0.977280; classification accuracy 0.986058; precision 0.993124; sensitivity/recall 0.990235; F1 score 0.991677; confusion matrix [[892, 33], [47, 4766]].
Test set: ROC AUC 0.649767; classification accuracy 0.813670; precision 0.883061; sensitivity/recall 0.895122; F1 score 0.889050; confusion matrix [[165, 243], [215, 1835]].
The best model was also used to classify CAS antiviral database compounds for hERG channel inhibitory activity, and a list of compounds with cardiotoxicity potential was generated as a .csv file.
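The threshold-based metrics in this entry can be recomputed directly from the two confusion matrices. The sketch below assumes the matrices follow the [[TN, FP], [FN, TP]] layout (rows are actual classes, columns are predicted classes, positive class last); that layout is an assumption, chosen because it reproduces the reported training-set accuracy, precision, and recall. The helper name `binary_metrics` is hypothetical. ROC AUC cannot be recovered from a confusion matrix, so it is not computed here.

```python
def binary_metrics(cm):
    """Accuracy, precision, recall, and F1 from a 2x2 confusion matrix.

    Assumed layout: cm = [[TN, FP], [FN, TP]] (rows = actual, columns = predicted).
    """
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Confusion matrices as listed in the entry above.
train = binary_metrics([[892, 33], [47, 4766]])
test = binary_metrics([[165, 243], [215, 1835]])

for name, metrics in (("train", train), ("test", test)):
    print(name, {key: round(value, 6) for key, value in metrics.items()})
```

Recomputing from the matrices is a quick consistency check on figures copied by hand into a description.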
mizanur-rahman / Vaccine Degradation Prediction: In this competition, we tried to predict degradation rates at each base of an RNA molecule, training on a subset of an Eterna dataset comprising over 3000 RNA molecules (which span a panoply of sequences and structures) and their degradation rates at each position. We will then score your models on a second generation of RNA sequences that have just been devised by Eterna players for COVID-19 mRNA vaccines. These final test sequences are currently being synthesized and experimentally characterized at Stanford University in parallel to your modeling efforts.
pwolle / MolGrad: Score-Based Generative Model for Molecules
Luckygyana / Corona Drug Discovery: This is a solution for the Possible Drugs for Covid-19. Binding scores of leading existing drugs (HIV inhibitors) are around -10 to -11, and around -13 for the drug Remdesivir, which recently entered clinical testing. The more negative the binding score, the better the drug. The goal is to create a novel small molecule that can bind to the coronavirus, using deep learning techniques for molecule generation and PyRx to evaluate binding affinities. By combining a generative RNN model with techniques and principles from transfer learning and genetic algorithms, I was able to create several small-molecule candidates that achieved binding scores approaching -18.
ASethi04 / Generation Of Novel Drug Molecules With Specific Protein Targets Through A Graph Network And Custom: Abstract. Creating novel drug molecules is a time-consuming and expensive process. Current methods require manually synthesizing thousands of molecules to develop a single viable lead candidate. In silico prediction of drug–target interactions (DTI) is necessary for the development of new drugs. In this research, I developed a novel artificial intelligence model capable of predicting DTIs based on drug chemical structure data. Using 445 drugs in the DrugBank database, I created a graph network to represent the compound chemical structures and predicted DTIs based on structural similarity. Inferring over the graph with a modified nearest-neighbors method to predict a new drug’s protein interactions achieved an area under the receiver operating characteristic curve of 0.93. Furthermore, I developed a generative machine learning model to create novel drug molecules with specific DTI profiles. Using the same DrugBank data, I created a custom Conditional Variational Autoencoder (CVAE) to encode the string representation of drug compound structures and their associated DTIs. The DTI profiles are incorporated as conditions in the encoder and decoder of the model, allowing generation of novel drug molecules with specified DTIs. As proof of concept, I show that the CVAE can generate similar (but not identical) molecules that are still chemically valid. Novel drugs generated by the CVAE achieved an average compound similarity score of 0.70 relative to their corresponding molecules in the test set. This study advances the possibility of low-cost and efficient drug development by proposing an in silico method for targeted lead candidate molecule generation.
FabianKruger / Molecule Optimization Agent: LLM-driven molecular optimization agent built with LangGraph. Iteratively proposes and refines molecules using configurable oracles and objectives, with support for custom scoring functions and optimization targets.
yassinerabhi / A New Molecule Effective Against SARS CoV 2: We combined a recurrent neural network generative model with transfer learning methods and active-learning-based algorithms to design novel small molecules capable of effectively inhibiting the 3CL protease in human cells. We then analyzed these small molecules to find the correct binding site that matches the structure of the 3CL protease of our target virus, along with the other analyses performed in this study. Based on these screening results, some molecules achieved a good binding score close to -18 kcal/mol, and we consider them good potential candidates for further synthesis and testing against SARS-CoV-2.
Klab-Bioinfo-Tools / GLM Score: A set of empirical scoring functions for predicting receptor-ligand binding affinities (in pKd units): protein-DNA/RNA, protein-small molecule, and protein-protein complexes. Each scoring function was built using generalized linear models (GLMs) applied to specific, curated data sets.
Adam-maz / CNS MPO Calculator: Here I provide some code that enables users to calculate the CNS MPO score for their molecules, based on the SMILES and known pKa values.
lakerrenhu / Cancer Inhibitor Prediction: The main purpose of this project is to develop a reliable prediction model that can predict whether a given molecule is a CDK2 inhibitor. A range of machine learning methods is applied to learn the cancer-inhibitor prediction model: elastic net, support vector classifier (SVC), K-nearest neighbors (KNN), Decision Tree (DT), Random Forest (RF), Huber Regressor (HR), Lasso, Lasso plus cross-validation (Lasso CV), least-angle regression (LAR), Bayesian Ridge (BR), stochastic gradient descent classifier (SGDC), Ridge regression (RR), Logistic Regression (LR), Orthogonal Matching Pursuit (OMP), Multilayer Perceptron (MLP), Qlattice, and a convolutional neural network (CNN). Model performance is evaluated on the test dataset and reported as precision, recall, F1 score, accuracy, and ROC.