MAGE
Analysis of gene expression and splicing diversity in a subset of samples from the 1000 Genomes Project, including eQTL and sQTL discovery and annotation.
MAGE: Multi-ancestry Analysis of Gene Expression
⚠️ 2026-03-27 IMPORTANT NOTICE - SAMPLE SWAP ⚠️
A sample swap has been detected:
- Library SRR19762530 is labeled as being derived from HG00237; it is actually derived from NA11919.
  - This library was NOT used for downstream QTL mapping. The HG00237 library used for QTL mapping was SRR19762247.
- Library SRR19762653 is labeled as being derived from NA11919; it is actually derived from HG00237.
  - This library WAS used for downstream QTL mapping.
UPDATED STATUS 2026-04-03
- Two new runs have been added to BioProject PRJNA851328 to correct this sample swap:
  - SRR37907959 derives from HG00237
  - SRR37907958 derives from NA11919
- These runs replace the incorrectly labeled SRR19762530 and SRR19762653 runs.
- Note: SRR19762530 and SRR19762653 were relabeled on SRA before being replaced, so SRR19762530 now appears on SRA as being derived from NA11919.
- The Zenodo repository has NOT been updated yet. Please continue to use these data with caution.
Thank you Dr. Steven M. Heaton for making us aware of this issue.
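For anyone working from a local copy of the run metadata, the corrections above can be applied with a small lookup table. The following is only a minimal sketch: the names `TRUE_SAMPLE` and `correct_labels`, and the `run` and `sample` keys, are hypothetical and are not part of the MAGE repository or its metadata files. Adjust them to match the table you actually use.

```python
# Run-to-sample assignments taken from the notice above.
TRUE_SAMPLE = {
    "SRR19762530": "NA11919",  # originally labeled HG00237
    "SRR19762653": "HG00237",  # originally labeled NA11919
    "SRR37907959": "HG00237",  # replacement run
    "SRR37907958": "NA11919",  # replacement run
}


def correct_labels(rows, run_key="run", sample_key="sample"):
    """Return copies of `rows` with sample labels fixed for affected runs.

    `rows` is a list of dicts, one per sequencing run. The key names are
    assumptions; change them to match your own metadata columns.
    """
    fixed = []
    for row in rows:
        new_row = dict(row)  # copy so the input is left untouched
        run = new_row[run_key]
        if run in TRUE_SAMPLE:
            new_row[sample_key] = TRUE_SAMPLE[run]
        fixed.append(new_row)
    return fixed


# Example with the original (swapped) labels plus one unaffected run.
rows = [
    {"run": "SRR19762530", "sample": "HG00237"},
    {"run": "SRR19762653", "sample": "NA11919"},
    {"run": "SRR19762247", "sample": "HG00237"},
]
corrected = correct_labels(rows)
```

Because the table maps each run to its true sample rather than swapping labels, applying it to metadata that was downloaded after the SRA relabeling leaves the rows unchanged.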
MAGE comprises RNA-seq data from lymphoblastoid cell lines derived from 731 individuals from the 1000 Genomes Project (1KGP), representing 26 globally distributed populations across five continental groups. These data offer a large, geographically diverse, open-access resource to facilitate studies of the distribution, genetic underpinnings, and evolution of variation in human transcriptomes. They include data from several ancestry groups that were poorly represented in previous studies.
Data Access
Raw reads
Newly generated RNA sequencing data for the 731 individuals (779 total libraries) is available on the Sequence Read Archive (Accession: PRJNA851328).
Processed data
Processed gene expression matrices and QTL mapping results (as well as a host of other downstream data) are currently available on Zenodo (MAGEv1.0 Zenodo link) as well as Dropbox (MAGEv1.0 Dropbox link).
Briefly, this repo contains the following data:
- Sample metadata and sequencing metrics
- Gene expression and splicing matrices used for e/sQTL mapping and analyses of global trends of expression/splicing diversity
- cis-e/sQTL mapping results, including aFC estimates for cis-eQTLs
- Functional annotations of cis-e/sQTLs
- Results of colocalization analysis between MAGE e/sQTLs and complex trait GWAS from the PAGE study
- Results of analyses of global trends of expression/splicing diversity
- Jointly-generated top genotype PCs for samples in MAGE and other resources with paired WGS/RNA-seq data (Geuvadis, GTEx, AFGR)
READMEs are provided for all data in the repo.
If you are having trouble accessing these data, please feel free to contact us to explore other options (e.g., Globus).
Variant calls
The high-coverage variant calls used for QTL mapping were previously generated by the New York Genome Center (NYGC) and are available through the 1KGP FTP site.
Code
Code used for data processing and downstream analyses is made available in the analysis_pipeline/ directory, along with READMEs describing how each script is run.
Code used to produce major figures/panels in the manuscript is made available in the figure_generation/ directory.
The MAGE manuscript
For more information about the MAGE resource as well as analyses performed using this resource, please see our paper:
Sources of gene expression variation in a globally diverse human cohort
Dylan J. Taylor, Surya B. Chhetri, Michael G. Tassia, Arjun Biddanda, Stephanie M. Yan, Genevieve L. Wojcik, Alexis Battle, Rajiv C. McCoy
Citing MAGE
If you use MAGE data in your own work, please cite the paper linked above.
