MPEPE
Introduction
MPEPE is a deep-learning-based prediction method for improving protein expression in E. coli. This repository provides MPEPE's codonc4 models (MPEPE/MODELS-1027/*.h5), example input sequences (MPEPE/Example/*.fa), and example prediction results (MPEPE/Example/2021_11_5_Pred1027/*.res).
Note[1]:
- "codonc3" represents: Synonymous codon number;
- "aac3" represents: The specific amino acid;
- "codonc4" represents: Specific nucleotide combination.
System requirements
- Python 2.7
- tensorflow 1.15.0
- keras 2.1.5
- theano 1.0.5
Quick start: installing the required programs
- Install Python 2.7 (from Anaconda https://www.anaconda.com/)
- pip install tensorflow==1.15.0 (in the Python 2.7 environment)
- pip install keras==2.1.5
- pip install theano==1.0.5
- git clone https://github.com/BRITian/MPEPE
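After installation, it can help to confirm that the environment matches the pinned versions listed under the system requirements. The following is a minimal sketch, not part of MPEPE: the helper names `compare_versions` and `detect_installed` are hypothetical, and the check assumes each package exposes a `__version__` attribute.

```python
import importlib

# Pinned versions copied from the system requirements list.
REQUIRED = {"tensorflow": "1.15.0", "keras": "2.1.5", "theano": "1.0.5"}


def compare_versions(installed, required=REQUIRED):
    """Return {package: (found, wanted)} for every package that does not match.

    `installed` maps a package name to its version string, or to None
    (or is missing the key) when the package is not installed.
    """
    problems = {}
    for name, wanted in sorted(required.items()):
        found = installed.get(name)
        if found != wanted:
            problems[name] = (found, wanted)
    return problems


def detect_installed(names):
    """Import each package and read its __version__ (None if unavailable)."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            found[name] = getattr(module, "__version__", None)
        except Exception:  # a broken install can raise more than ImportError
            found[name] = None
    return found
```

Calling `compare_versions(detect_installed(REQUIRED))` inside the Python 2.7 environment returns an empty dictionary when all three packages match.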
Predicting the soluble expression of sequences in E. coli
Put the model folder (MPEPE/MODELS-1027/), the prediction script (MPEPE_pred.codonc4.py), and the nucleic acid sequence file to be predicted (FILE_NAME.fa, or a FASTA file with any extension) in the same directory, then activate the Python 2.7 environment and run:
python MPEPE_pred.codonc4.py FILE_NAME.fa
The prediction result of the final model is written to "Year_Month_Day_Pred1027/Pred_codonc4_FILE_NAME.res".
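Because the input must be a FASTA file of nucleic acid sequences, a quick sanity check before running the prediction can catch malformed input. The sketch below is an illustration only and is not part of MPEPE. The helper names `read_fasta` and `find_problems` are hypothetical, and the two checks (only A/C/G/T characters, length divisible by 3) are assumptions about what a coding sequence should look like, not documented requirements of MPEPE_pred.codonc4.py.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a list of (id, sequence) pairs."""
    records = []
    seq_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                records.append((seq_id, "".join(chunks)))
            # Use the first whitespace-delimited token of the header as the ID.
            header = line[1:].split()
            seq_id, chunks = (header[0] if header else ""), []
        elif seq_id is not None:
            chunks.append(line.upper())
    if seq_id is not None:
        records.append((seq_id, "".join(chunks)))
    return records


def find_problems(records):
    """Return (id, message) pairs for sequences that fail the assumed checks."""
    problems = []
    for seq_id, seq in records:
        bad = sorted(set(seq) - set("ACGT"))
        if bad:
            problems.append((seq_id, "non-ACGT characters: " + "".join(bad)))
        elif len(seq) % 3 != 0:
            problems.append((seq_id, "length %d is not a multiple of 3" % len(seq)))
    return problems
```

For a file on disk, `find_problems(read_fasta(open("FILE_NAME.fa")))` returns an empty list when every sequence passes both checks.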
Result analysis
Apart from the comment ("#") rows, the file has three columns. The first column is the ID of the predicted sequence, the second column is the average (AVE) of the high-expression probabilities predicted by the 10 models, and the third column is the standard deviation (STD) of those 10 predicted probabilities:
# === Predicted the probability of highly expressed proteins === # (comment row)
# id AVE(High_expression) STD(High_expression) # (comment row)
AaR97-5 0.8260 0.0706
AbR19-5 0.7625 0.1067
AbR28-5 0.7391 0.1390
ZR348-5 0.7630 0.0840
ZR319-5 0.8499 0.0674
ZR310-5 0.8511 0.0575
As the results above for the example (high_expression_seq.fa) show, the larger the AVE value in the second column, the higher the predicted probability that the sequence is highly expressed in E. coli.
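To compare many candidate sequences, the result file can be parsed and sorted by AVE. The following is a minimal sketch that assumes the columns are separated by whitespace, as in the example above. The function names `parse_res` and `rank_by_ave` are hypothetical and not part of MPEPE.

```python
def parse_res(lines):
    """Parse .res lines into (id, ave, std) tuples, skipping '#' comment rows."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()  # assumption: whitespace-delimited columns
        rows.append((fields[0], float(fields[1]), float(fields[2])))
    return rows


def rank_by_ave(rows):
    """Sort rows from highest to lowest average high-expression probability."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


# The example result rows shown above.
EXAMPLE = """\
# === Predicted the probability of highly expressed proteins ===
# id AVE(High_expression) STD(High_expression)
AaR97-5 0.8260 0.0706
AbR19-5 0.7625 0.1067
AbR28-5 0.7391 0.1390
ZR348-5 0.7630 0.0840
ZR319-5 0.8499 0.0674
ZR310-5 0.8511 0.0575
"""

ranked = rank_by_ave(parse_res(EXAMPLE.splitlines()))
for seq_id, ave, std in ranked:
    print("%s\t%.4f\t%.4f" % (seq_id, ave, std))
```

With the example data, ZR310-5 (AVE 0.8511) is ranked first and AbR28-5 (AVE 0.7391) last.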
Model building
Place the coding folder (MPEPE/coding-1027) and the Python file (MPEPE/keras_train_cnn_lstm_v6-1027.py) in the dataset folder (MPEPE/Dataset) in the same working directory, then activate the Python 2.7 environment and build the models you need.
If you want to build codonc3 models, run:
python keras_train_cnn_lstm_v6-1027.py codonc3
If you want to build aac3 models, run:
python keras_train_cnn_lstm_v6-1027.py aac3
If you want to build codonc4 models, run:
python keras_train_cnn_lstm_v6-1027.py codonc4
Eventually, a Log folder (LOGs-1027) and a Models folder (MODELs-1027) will be created in the working directory.
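If all three model types are needed, the three commands above can be run in sequence from a small driver script. This is a sketch under assumptions rather than part of MPEPE: the names `training_commands` and `run_all` are hypothetical, and it assumes `python` on the PATH resolves to the Python 2.7 environment and that the script is launched from the working directory described above.

```python
import subprocess

SCRIPT = "keras_train_cnn_lstm_v6-1027.py"
ENCODINGS = ("codonc3", "aac3", "codonc4")


def training_commands(python="python", encodings=ENCODINGS):
    """Build one command line (as an argument list) per encoding."""
    return [[python, SCRIPT, encoding] for encoding in encodings]


def run_all(commands, runner=subprocess.check_call):
    """Run each command in order; check_call raises if a run exits non-zero."""
    for command in commands:
        runner(command)
```

Calling `run_all(training_commands())` trains the models one encoding at a time and stops at the first failure.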
References
[1] Ding Z, Guan F, Xu G, Wang Y, Yan Y, Wu N, et al. MPEPE, a predictive approach to improve protein expression in E. coli based on deep learning.