Maggie
Bioinformatic approach to identify functional transcription factor binding motifs
MAGGIE
MAGGIE provides a framework for identifying DNA sequence motifs that mediate transcription factor binding and function. By leveraging measurements and genetic variation information from different genotypes (human individuals, animal strains, or alleles), MAGGIE associates mutations of DNA sequence motifs with various types of epigenomic features, including but not limited to transcription factor binding, open chromatin, histone modification, and stimulus response of regulatory elements.
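To make the idea of a "motif mutation" concrete, here is a minimal sketch of scoring one motif on two alleles of the same site and taking the difference. It is an illustration only, not MAGGIE's implementation: the motif matrix `PFM`, the uniform `BACKGROUND` frequency, the log-odds scoring, and the function names are all assumptions made for this example.

```python
import math

# Hypothetical 4-position motif given as per-position base probabilities.
# These numbers are invented for illustration and do not come from MAGGIE.
PFM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # uniform background base frequency (assumption)


def window_score(window):
    """Log-odds score of one motif-width window against the background."""
    return sum(math.log2(PFM[i][base] / BACKGROUND) for i, base in enumerate(window))


def best_motif_score(seq):
    """Highest log-odds score over all motif-width windows of the sequence."""
    width = len(PFM)
    return max(window_score(seq[i:i + width]) for i in range(len(seq) - width + 1))


def score_difference(positive_seq, negative_seq):
    """Motif score of the positive allele minus that of the negative allele."""
    return best_motif_score(positive_seq) - best_motif_score(negative_seq)


# The two alleles differ by one base inside the motif (G -> C).
diff = score_difference("TTACGTTT", "TTACCTTT")
print(round(diff, 3))
```

A positive difference means the allele in the positive set carries the stronger motif match. Identical alleles give a difference of zero.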
Here is the overview of the method:
<p align="center"> <img src="https://github.com/zeyang-shen/maggie/blob/master/image/method.png" width="900" height="280"> </p>
Installation
First, clone the GitHub repository and go into the "maggie" folder:
git clone https://github.com/zeyang-shen/maggie.git
cd maggie
Next, configure an environment where MAGGIE can work. Anaconda is required for environment setup and package installation (https://www.anaconda.com/download/#macos).
After installing Anaconda, run the following command to automatically create an environment named "maggie" with the required dependencies:
conda env create --file environment.yml
After setting up the environment, activate it:
conda activate maggie
Now you are ready to run your own analysis under the "maggie" folder!
Quick Usage
All the executable scripts are stored in the bin/ directory. Here is the usage of MAGGIE on a toy example of CTCF allele-specific binding sites stored in FASTA files.
Let's first go into the cloned folder:
cd maggie
Then you can run the script for FASTA inputs as below:
python ./bin/maggie_fasta_input.py \
./data/AlleleSpecificBinding/CTCF_binding_alleles.fa \
./data/AlleleSpecificBinding/CTCF_nonbinding_alleles.fa \
-o ./data/AlleleSpecificBinding/maggie_output/ \
-p 8
After the job is done, open the "mergedSignificant.html" file at "data/AlleleSpecificBinding/maggie_output/" with your web browser and take a look at the significant motifs.
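For scripted or repeated runs, the same command can be assembled and launched from Python. This is a minimal sketch that only reuses the script path, input files, and the `-o` and `-p` options shown in the example above. The helper names `build_maggie_command` and `run_maggie` are hypothetical and are not part of MAGGIE.

```python
import subprocess
import sys


def build_maggie_command(positive_fasta, negative_fasta, output_dir, p=8):
    """Assemble the argument list for the FASTA-input script.

    `p` is the value passed to the -p option, as in the README example.
    """
    return [
        sys.executable,                  # interpreter of the active environment
        "./bin/maggie_fasta_input.py",
        positive_fasta,
        negative_fasta,
        "-o", output_dir,
        "-p", str(p),
    ]


def run_maggie(command, repo_dir="."):
    """Run the command from the cloned folder; raises CalledProcessError on failure."""
    return subprocess.run(command, cwd=repo_dir, check=True)


command = build_maggie_command(
    "./data/AlleleSpecificBinding/CTCF_binding_alleles.fa",
    "./data/AlleleSpecificBinding/CTCF_nonbinding_alleles.fa",
    "./data/AlleleSpecificBinding/maggie_output/",
)
print(" ".join(command))
# Uncomment to launch the job from inside the cloned "maggie" folder
# with the "maggie" environment active:
# run_maggie(command)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.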
Alternatively, you can add the bin/ directory to your PATH in order to execute those scripts from anywhere:
export PATH=/path/to/your/cloned/maggie/bin:$PATH
Then you can run the previous script by calling maggie_fasta_input.py directly.
Go to our tutorials for usage of MAGGIE in other cases.
Example output
MAGGIE will display significant motifs in HTML format. Here is an example for CTCF allele-specific binding sites:
<p align="center"> <img src="https://github.com/zeyang-shen/maggie/blob/master/image/html_example.png" width="900" height="200"> </p>
Header: total number of samples
Column 1: ranking based on absolute value of -log10(p-value)
Column 2: merged motifs based on a high correlation among changes of their motif scores
Column 3: PWM logo for the motif with lowest p-value
Column 4: signed -log10(p-value) and 90% confidence interval
Column 5: # and percentage of sequences with motif mutations
Column 6: # sequences with higher motif scores in the positive set and its fraction of Column 5
Column 7: # sequences with higher motif scores in the negative set and its fraction of Column 5
Column 8: median value of non-zero motif score differences
Column 9: mean value of non-zero motif score differences
Column 10: distribution of non-zero motif score differences
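The count and summary columns (5 to 9) can be read as simple statistics over per-sequence motif score differences. The sketch below shows one way to compute them, assuming each difference is the motif score in the positive set minus the score in the negative set, and that a sequence counts as having a motif mutation when its difference is non-zero. Both are assumptions for illustration, the function and key names are hypothetical, and this is not MAGGIE's own code.

```python
from statistics import mean, median


def summarize_score_differences(differences):
    """Summarize per-sequence motif score differences (positive minus negative)."""
    total = len(differences)
    nonzero = [d for d in differences if d != 0]
    n_mutated = len(nonzero)
    higher_in_positive = sum(1 for d in nonzero if d > 0)
    higher_in_negative = sum(1 for d in nonzero if d < 0)
    return {
        # Column 5: number and percentage of sequences with motif mutations
        "n_mutated": n_mutated,
        "pct_mutated": 100.0 * n_mutated / total if total else 0.0,
        # Column 6: higher score in the positive set, as a fraction of Column 5
        "n_higher_positive": higher_in_positive,
        "frac_higher_positive": higher_in_positive / n_mutated if n_mutated else 0.0,
        # Column 7: higher score in the negative set, as a fraction of Column 5
        "n_higher_negative": higher_in_negative,
        "frac_higher_negative": higher_in_negative / n_mutated if n_mutated else 0.0,
        # Columns 8 and 9: median and mean of the non-zero differences
        "median_nonzero": median(nonzero) if nonzero else None,
        "mean_nonzero": mean(nonzero) if nonzero else None,
    }


# Made-up differences for eight sequences; four of them are unchanged (0.0).
example = [2.5, 0.0, 1.0, -0.5, 0.0, 3.0, 0.0, 0.0]
summary = summarize_score_differences(example)
print(summary)
```

Column 4 (the signed -log10(p-value)) depends on a statistical test that is not described here, so it is left out of the sketch.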
Documentation
Please go to our wiki page for more detailed usage of MAGGIE.
Citation
If you use our findings, the software, or the NF-κB ChIP-seq data at GEO:GSE144070, please cite
Contact
If you encounter a problem when using the software, you can
License
This project is licensed under the GNU GPL v3.
Contributors
MAGGIE was developed primarily by Zeyang Shen, with contributions and suggestions by Marten Hoeksema and Zhengyu Ouyang. Supervision for the project was provided by Christopher Glass and Christopher Benner.
