PECAN
Help file for running the scripts to learn and evaluate graph convolution networks for epitope and paratope prediction
******************************************************************************
Author : Srivamshi Pittala (srivamshi.pittala.gr@dartmouth.edu)
Advisor : Prof. Chris Bailey-Kellogg (cbk@cs.dartmouth.edu)
Project : PECAN: Paratope and Epitope Prediction with graph Convolution Attention Network
Description : Help file for running the scripts to learn and evaluate
graph convolution networks for epitope and paratope prediction
Cite : doi: https://doi.org/10.1101/658054
******************************************************************************
This code was developed using the code provided for Fout et al., "Protein Interface Prediction using Graph Convolutional Networks", published in NeurIPS, 2017.
Copyright (C) <2019> <Srivamshi Pittala>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see https://www.gnu.org/licenses/.
#*********************** Input Files (pickle format) #***********************
Can be downloaded from Zenodo here: https://zenodo.org/record/3885236.
- Paratope prediction: results_create_files_for_paratope/
- Epitope prediction: results_create_files_for_epitope/
The data format is the following:
- list of dictionaries, each corresponding to a protein
    - Each dictionary has the following entries (keys):
        - complex_code: str
        - l_vertex: numpy matrix of residues in primary protein (dimensions: num_residues x 63)
            - Indices [0,20] represent one-hot encoding of the residue. Last entry for unresolved amino acid '*'.
            - Indices [21,40] represent PSSM entry for the
            - Indices [41,42] represent solvent accessibility
            - Indices [43,62] represent neighborhood composition
        - l_edge: numpy matrix of edges, currently empty as we don't consider edge features (dimensions: num_residues x 25 x 2)
        - l_hood_indices: numpy matrix specifying spatial neighborhood for graph convolution (dimensions: num_residues x 25)
        - label: numpy matrix with labels {1: interface, -1: non-interface} (dimensions: num_residues x 2)
        - r_vertex: same as l_vertex, but for secondary protein
        - r_edge: same as l_edge, but for secondary protein
        - r_hood_indices: same as l_hood_indices, but for secondary protein
        - label_r: same as label, but for secondary protein
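The layout described above can be checked with a short script before training. The following is only a sketch: it assumes the index ranges are inclusive (so the four blocks add up to 63 columns), the helper names (load_complexes, split_vertex_features, check_entry, make_fake_entry) are hypothetical and not part of this repository, and it runs on a synthetic entry written to a temporary file rather than on the Zenodo data.

```python
import os
import pickle
import tempfile

import numpy as np

# Inclusive index ranges of the 63 vertex features, as listed above.
FEATURE_BLOCKS = {
    "one_hot": (0, 20),
    "pssm": (21, 40),
    "solvent_accessibility": (41, 42),
    "neighborhood_composition": (43, 62),
}


def load_complexes(path):
    """Load the list of per-complex dictionaries from a pickle file."""
    with open(path, "rb") as handle:
        # Assumption: files written under Python 2 may need
        # pickle.load(handle, encoding="latin1") when read under Python 3.
        return pickle.load(handle)


def split_vertex_features(vertex):
    """Slice a (num_residues x 63) matrix into its named feature blocks."""
    return {name: vertex[:, lo:hi + 1] for name, (lo, hi) in FEATURE_BLOCKS.items()}


def check_entry(entry):
    """Raise AssertionError if an entry does not match the documented shapes."""
    for side, label_key in (("l", "label"), ("r", "label_r")):
        vertex = np.asarray(entry[side + "_vertex"])
        hood = np.asarray(entry[side + "_hood_indices"])
        labels = np.asarray(entry[label_key])
        n = vertex.shape[0]
        assert vertex.shape == (n, 63)
        assert hood.shape[:2] == (n, 25)
        assert labels.shape == (n, 2)


def make_fake_entry(n_left=4, n_right=6):
    """Build a synthetic entry with the documented keys (zeros, not real data)."""
    entry = {"complex_code": "FAKE"}
    for side, label_key, n in (("l", "label", n_left), ("r", "label_r", n_right)):
        entry[side + "_vertex"] = np.zeros((n, 63))
        entry[side + "_edge"] = np.zeros((n, 25, 2))
        entry[side + "_hood_indices"] = np.zeros((n, 25), dtype=int)
        entry[label_key] = np.zeros((n, 2), dtype=int)
    return entry


# Round trip through a temporary pickle file, then validate and slice.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.pkl")
with open(demo_path, "wb") as handle:
    pickle.dump([make_fake_entry()], handle)

complexes = load_complexes(demo_path)
for entry in complexes:
    check_entry(entry)
blocks = split_vertex_features(complexes[0]["l_vertex"])
print({name: block.shape for name, block in blocks.items()})
```

To try it on the real files, point load_complexes at a pickle file inside results_create_files_for_paratope/ or results_create_files_for_epitope/.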
#----------------------- Software (version) #-----------------------
- Python 2.7.15
- PyYAML 3.13
- Numpy 1.14.5
- Pandas 0.23.4
- Scikit-learn 0.19.1
- tensorflow 1.10.0 (gpu mode)
- cudatoolkit 8.0.61
- cudnn 5
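Since the versions above are fairly old, it may help to compare them against the active environment before running anything. This is a minimal sketch, not part of the repository: the function names are hypothetical, it relies on each package exposing a __version__ attribute, and it does not check cudatoolkit or cudnn.

```python
import importlib
import sys

# Versions pinned in the list above, keyed by import name.
PINNED = {
    "yaml": "3.13",        # PyYAML
    "numpy": "1.14.5",
    "pandas": "0.23.4",
    "sklearn": "0.19.1",   # Scikit-learn
    "tensorflow": "1.10.0",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def compare_versions(pinned=PINNED):
    """Map each module name to a (pinned, installed, matches) tuple."""
    report = {}
    for name, wanted in pinned.items():
        found = installed_version(name)
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    print("python       pinned 2.7.15   installed %d.%d.%d" % sys.version_info[:3])
    for name, (wanted, found, ok) in sorted(compare_versions().items()):
        status = "ok" if ok else "differs"
        print("%-12s pinned %-8s installed %-10s %s" % (name, wanted, found, status))
```

A "differs" line is only a hint; whether the scripts still run with a different version has to be tested.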
#*********************** Training base network on proteins #***********************
- cd GCN_protein_base
- Train "No convolution" network: python sample_experiment_base.py
- Train "Convolution 1-layer" network: python sample_experiment_conv1.py
- Train "Convolution 1-layer + Attention" network: python sample_experiment_attn1.py
- Train "Convolution 2-layer" network: python sample_experiment_conv2.py
- Train "Convolution 2-layer + Attention" network: python sample_experiment_attn2.py
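The five training commands above can also be run one after another from a small driver. The sketch below is an assumption about how one might do that, not a script shipped with the code: build_commands and run_all are hypothetical names, and the default is a dry run that only prints what would be executed.

```python
import subprocess

# Script names taken from the list above, in the order they are listed.
EXPERIMENT_SCRIPTS = [
    ("No convolution", "sample_experiment_base.py"),
    ("Convolution 1-layer", "sample_experiment_conv1.py"),
    ("Convolution 1-layer + Attention", "sample_experiment_attn1.py"),
    ("Convolution 2-layer", "sample_experiment_conv2.py"),
    ("Convolution 2-layer + Attention", "sample_experiment_attn2.py"),
]


def build_commands(python_executable="python"):
    """Return (label, argv) pairs, one per experiment script."""
    return [(label, [python_executable, script]) for label, script in EXPERIMENT_SCRIPTS]


def run_all(directory, dry_run=True):
    """Run every experiment inside `directory`, stopping at the first failure."""
    for label, argv in build_commands():
        print("[%s] %s (cwd=%s)" % (label, " ".join(argv), directory))
        if not dry_run:
            # check_call raises CalledProcessError on a non-zero exit code.
            subprocess.check_call(argv, cwd=directory)


# Dry run: prints the five commands without starting any training.
run_all("GCN_protein_base", dry_run=True)
```

Calling run_all with dry_run=False starts the actual training; the same driver could be pointed at GCN_task or GCN_xTransfer once configuration.py has been edited.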
#*********************** For Epitope prediction #***********************
#******** Task-specific learning #********
- cd GCN_task
- Change 'mode_experiment', 'data_directory' and 'output_directory' in configuration.py for epitope data
- Train "No convolution" network: python sample_experiment_base.py
- Train "Convolution 1-layer" network: python sample_experiment_conv1.py
- Train "Convolution 1-layer + Attention" network: python sample_experiment_attn1.py
- Train "Convolution 2-layer" network: python sample_experiment_conv2.py
- Train "Convolution 2-layer + Attention" network: python sample_experiment_attn2.py
#******** Transfer learning (Make sure the base network on proteins was trained successfully) #********
- cd GCN_xTransfer
- Change 'mode_experiment', 'data_directory' and 'output_directory' in configuration.py for epitope data
- Train "No convolution" network: python sample_experiment_base.py
- Train "Convolution 1-layer" network: python sample_experiment_conv1.py
- Train "Convolution 1-layer + Attention" network: python sample_experiment_attn1.py
- Train "Convolution 2-layer" network: python sample_experiment_conv2.py
- Train "Convolution 2-layer + Attention" network: python sample_experiment_attn2.py
#*********************** For Paratope prediction #***********************
#******** Task-specific learning #********
- cd GCN_task
- Change 'mode_experiment', 'data_directory' and 'output_directory' in configuration.py for paratope data
- Train "No convolution" network: python sample_experiment_base.py
- Train "Convolution 1-layer" network: python sample_experiment_conv1.py
- Train "Convolution 1-layer + Attention" network: python sample_experiment_attn1.py
- Train "Convolution 2-layer" network: python sample_experiment_conv2.py
- Train "Convolution 2-layer + Attention" network: python sample_experiment_attn2.py
#******** Transfer learning (Make sure the base network on proteins was trained successfully) #********
- cd GCN_xTransfer
- Change 'mode_experiment', 'data_directory' and 'output_directory' in configuration.py for paratope data
- Train "No convolution" network: python sample_experiment_base.py
- Train "Convolution 1-layer" network: python sample_experiment_conv1.py
- Train "Convolution 1-layer + Attention" network: python sample_experiment_attn1.py
- Train "Convolution 2-layer" network: python sample_experiment_conv2.py
- Train "Convolution 2-layer + Attention" network: python sample_experiment_attn2.py
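The epitope and paratope runs differ only in the three settings changed in configuration.py. The sketch below shows one way to keep the two sets of values side by side. Only the setting names and the two data directory names come from this file; the mode strings, the data_root and output_root defaults, and the settings_for helper are placeholders, and the actual layout of configuration.py may differ.

```python
import os

# Data directory names come from the "Input Files" section; everything else
# (mode strings, data root, output layout) is a placeholder for illustration.
TASK_DATA_DIRS = {
    "epitope": "results_create_files_for_epitope",
    "paratope": "results_create_files_for_paratope",
}


def settings_for(task, data_root="data", output_root="results"):
    """Return hypothetical values for the three settings of one task."""
    if task not in TASK_DATA_DIRS:
        raise ValueError("unknown task: %r" % (task,))
    return {
        "mode_experiment": task,  # placeholder value, check configuration.py
        "data_directory": os.path.join(data_root, TASK_DATA_DIRS[task]),
        "output_directory": os.path.join(output_root, task),
    }


for task in sorted(TASK_DATA_DIRS):
    print(task, settings_for(task))
```

Using a separate output directory per task keeps the epitope and paratope results from overwriting each other.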
