G2P
G2P: A Genome-Wide-Association-Study Simulation Tool for Genotype Simulation, Phenotype Simulation, and Power Evaluation

<p align="center"> <a href="https://raw.githubusercontent.com/XiaoleiLiuBio/G2P/master/results/G2P_logo.png"> <img src="results/G2P_logo.png" height="250px" width="450px"> </a> </p>

For a richer set of simulation functions, see our newly developed package SIMER, which targets simulation in life science and breeding.
Authors:
You Tang and Xiaolei Liu
Contact:
[Xiaolei Liu](mailto:xiaoleiliu@mail.hzau.edu.cn)
Contents
<!-- TOC updateOnSave:false -->

- Installation
- Data Preparation
- Genotype Simulation
- Phenotype Simulation
- Population Structure
- Quality Control
- GWAS
- Method Evaluation
- FAQ and Hints
Installation
Environment Setup
JDK 1.8 must be installed and its environment variables configured before using G2P (http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html)
Windows
GUI
Download all files from https://github.com/XiaoleiLiuBio/G2P/tree/master/gG2P_win_64 and double click the .jar file
Pipeline
Download all files from https://github.com/XiaoleiLiuBio/G2P/tree/master/kG2P_win_64
Mac
GUI
Download all files from https://github.com/XiaoleiLiuBio/G2P/tree/master/gG2P_mac and double click the .jar file
Pipeline
Download all files from https://github.com/XiaoleiLiuBio/G2P/tree/master/kG2P_mac
permission setting
$ chmod 777 gemma oldplink plink
Linux
GUI
Download all files from https://github.com/XiaoleiLiuBio/G2P/tree/master/gG2P_linux_x86_64
and run
$ java -jar gG2P.jar
Pipeline
Download all files from https://github.com/XiaoleiLiuBio/G2P/tree/master/kG2P_linux_x86_64
permission setting
$ chmod 777 gemma oldplink plink
Data Preparation
All input files should share the same file-name prefix (for example, AG.ped, AG.map, and AG.pop)
ped
For details, see http://zzz.bwh.harvard.edu/plink/data.shtml#ped

| Family ID | Individual ID | Father ID | Mother ID | Sex | Trait | marker 1 | marker 2 | marker 3 | marker 4 | marker 5 | marker 6 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 1 | 33-16 | 0 | 0 | 0 | 2 | 0 0 | A A | A A | A G | A G | A G |
| 1 | 38-11 | 0 | 0 | 0 | 2 | 0 0 | A G | A G | A A | A G | A G |
| 1 | 4226 | 0 | 0 | 0 | 2 | 0 0 | A G | A A | A A | A G | A G |
| 1 | 4722 | 0 | 0 | 0 | 2 | 0 0 | A G | A G | A A | A G | A G |
| 1 | A188 | 0 | 0 | 0 | 2 | 0 0 | A A | A A | A A | A G | A G |
| 1 | A214N | 0 | 0 | 0 | 2 | 0 0 | A G | A A | A G | A A | A G |
| 1 | A239 | 0 | 0 | 0 | 2 | 0 0 | A A | A A | A G | A G | A A |

| Family ID | Individual ID | Father ID | Mother ID | Sex | Trait | marker 1 | marker 2 | marker 3 | marker 4 | marker 5 | marker 6 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 1 | 33-16 | 0 | 0 | 0 | 2 | 0 0 | 1 1 | 1 1 | 1 3 | 1 3 | 1 3 |
| 1 | 38-11 | 0 | 0 | 0 | 2 | 0 0 | 1 3 | 1 3 | 1 1 | 1 3 | 1 3 |
| 1 | 4226 | 0 | 0 | 0 | 2 | 0 0 | 1 3 | 1 1 | 1 1 | 1 3 | 1 3 |
| 1 | 4722 | 0 | 0 | 0 | 2 | 0 0 | 1 3 | 1 3 | 1 1 | 1 3 | 1 3 |
| 1 | A188 | 0 | 0 | 0 | 2 | 0 0 | 1 1 | 1 1 | 1 1 | 1 3 | 1 3 |
| 1 | A214N | 0 | 0 | 0 | 2 | 0 0 | 1 3 | 1 1 | 1 3 | 1 1 | 1 3 |
| 1 | A239 | 0 | 0 | 0 | 2 | 0 0 | 1 1 | 1 1 | 1 3 | 1 3 | 1 1 |
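
To make the column layout concrete, here is a minimal Python sketch that splits one whitespace-delimited ped line into the six leading fields and the per-marker allele pairs. The function name `parse_ped_line` and the dictionary keys are illustrative and not part of G2P; the sample line is the first row of the letter-coded example.

```python
def parse_ped_line(line):
    """Split one ped line into the six leading fields and allele pairs."""
    fields = line.split()
    # Six leading columns, then two alleles per marker.
    if len(fields) < 6 or (len(fields) - 6) % 2 != 0:
        raise ValueError("expected 6 leading fields plus two alleles per marker")
    alleles = fields[6:]
    genotypes = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
    return {
        "family_id": fields[0],
        "individual_id": fields[1],
        "father_id": fields[2],
        "mother_id": fields[3],
        "sex": fields[4],
        "trait": fields[5],
        "genotypes": genotypes,  # ("0", "0") is PLINK's default missing genotype
    }


record = parse_ped_line("1 33-16 0 0 0 2 0 0 A A A A A G A G A G")
print(record["individual_id"], record["genotypes"])
```

The same function works for the number-coded variant, because alleles are kept as strings.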
map
For details, see http://zzz.bwh.harvard.edu/plink/data.shtml#map

| Chromosome ID | Marker ID | Genetic Distance | Physical Distance |
| :---: | :---: | :---: | :---: |
| 1 | PZB00859.1 | 0 | 157104 |
| 1 | PZA01271.1 | 0 | 1947984 |
| 1 | PZA03613.2 | 0 | 2914066 |
| 1 | PZA03613.1 | 0 | 2914171 |
| 1 | PZA03614.2 | 0 | 2915078 |
| 1 | PZA03614.1 | 0 | 2915242 |
| 1 | PZA00258.3 | 0 | 2973508 |
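
Because the ped and map files describe the same markers, a quick consistency check before running a simulation can save time. The sketch below parses map lines and compares the marker count with a ped line; it assumes whitespace-delimited files, and the helper names are hypothetical rather than part of G2P.

```python
def parse_map_line(line):
    """Parse one four-column map line."""
    chromosome, marker_id, genetic_distance, position = line.split()
    return {
        "chromosome": chromosome,
        "marker_id": marker_id,
        "genetic_distance": float(genetic_distance),
        "position": int(position),
    }


def ped_marker_count(ped_line):
    """Number of markers implied by a ped line (two alleles per marker)."""
    return (len(ped_line.split()) - 6) // 2


map_lines = [
    "1 PZB00859.1 0 157104",
    "1 PZA01271.1 0 1947984",
    "1 PZA03613.2 0 2914066",
]
markers = [parse_map_line(line) for line in map_lines]

# A ped line truncated to three markers, purely for illustration.
ped_line = "1 33-16 0 0 0 2 0 0 A A A A"
consistent = ped_marker_count(ped_line) == len(markers)
print(consistent)
```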
pop
New samples are generated from the samples within each sub-population

| Sample ID | sub-Population ID |
| :---: | :---: |
| 33-16 | 1 |
| 38_11 | 1 |
| 4226 | 1 |
| 4722 | 2 |
| A188 | 2 |
| A214N | 2 |
| A239 | 2 |
| A272 | 2 |
| A441-5 | 2 |
| A554 | 3 |
| A556 | 3 |
| A6 | 3 |
| A619 | 3 |
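
As an illustration of how the pop file partitions the samples, the following sketch groups sample IDs by sub-population. It assumes a two-column, whitespace-delimited file without a header row; `group_by_subpopulation` is a hypothetical helper, not a G2P function.

```python
from collections import defaultdict


def group_by_subpopulation(lines):
    """Map each sub-population ID to the list of its sample IDs."""
    groups = defaultdict(list)
    for line in lines:
        sample_id, subpop_id = line.split()
        groups[subpop_id].append(sample_id)
    return dict(groups)


pop_lines = [
    "33-16 1",
    "38_11 1",
    "4226 1",
    "4722 2",
    "A188 2",
    "A554 3",
]
groups = group_by_subpopulation(pop_lines)
for subpop_id, samples in groups.items():
    print(subpop_id, samples)
```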
qtn
Each column lists the simulated QTNs for one phenotype

| Phenotype 1 | Phenotype 2 | Phenotype 3 | Phenotype 4 | Phenotype 5 |
| :---: | :---: | :---: | :---: | :---: |
| 66 | 67 | 80 | 83 | 90 |
| 9 | 15 | 52 | 59 | 135 |
| 90 | 96 | 143 | 147 | 174 |
| 3 | 3 | 15 | 58 | 89 |
| 89 | 118 | 185 | 203 | 212 |
| 69 | 72 | 72 | 84 | 110 |
| 46 | 59 | 125 | 204 | 207 |
| 14 | 15 | 19 | 29 | 39 |
| 9 | 23 | 65 | 111 | 131 |
| 19 | 52 | 74 | 179 | 194 |
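
Following the column-per-phenotype layout stated above, a qtn file can be read into one list per phenotype. This is only a sketch: it assumes whitespace-delimited integers with no header row, and the interpretation of each value (presumably a marker index) is an assumption that the text here does not confirm.

```python
def read_qtn_columns(text):
    """Return one list of QTN values per phenotype (one per column)."""
    rows = [[int(value) for value in line.split()]
            for line in text.strip().splitlines()]
    # Transpose rows into columns so that each list belongs to one phenotype.
    return [list(column) for column in zip(*rows)]


qtn_text = """66 67 80 83 90
9 15 52 59 135
90 96 143 147 174"""

qtns = read_qtn_columns(qtn_text)
print("Phenotype 1:", qtns[0])
```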
Genotype Simulation
Single Population _ GUI
<p align="center"> <a href="https://raw.githubusercontent.com/XiaoleiLiuBio/G2P/master/results/Single Population.png"> <img src="results/Single Population.png" height="400px" width="460px"> </a> </p>

Ped: ped file
Map: map file
Path for output Ped/Map: path for output ped and map file
Block: Yes or No; if "Yes", the whole genome is divided into blocks, which are shuffled to generate new samples
Number of SNPs in each block: number of SNPs per block
Mutation rate: the frequency of new mutations
Imputation: if TRUE, the major allele is used to impute missing values
Population size: simulated sample size
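
The Imputation option can be pictured with a small sketch: for one marker, count the observed alleles across samples and fill missing alleles with the most frequent one. This is an illustration of major-allele imputation in general, not G2P's own code; the missing code `"0"` follows the ped examples, and ties are resolved by whichever allele is counted first.

```python
from collections import Counter

MISSING = "0"  # missing-allele code, as in the ped examples


def impute_major_allele(genotypes):
    """Fill missing alleles of ONE marker with its most frequent allele.

    genotypes: list of (allele1, allele2) tuples, one per sample.
    """
    counts = Counter(a for pair in genotypes for a in pair if a != MISSING)
    if not counts:
        # No observed allele at this marker: leave it unchanged.
        return list(genotypes)
    major = counts.most_common(1)[0][0]
    return [tuple(major if a == MISSING else a for a in pair)
            for pair in genotypes]


marker = [("A", "A"), ("A", "G"), ("0", "0"), ("A", "G")]
imputed = impute_major_allele(marker)
print(imputed)
```

Here `A` is observed four times and `G` twice, so the missing genotype becomes `A A`.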
Single Population _ Pipeline
Windows
java -jar kG2P.jar --ped D:\data\AG.ped --map D:\data\AG.map --outgen D:\data\output --rn 100 --block 4 --impute
java -jar kG2P.jar --ped D:\data\AG.ped --map D:\data\AG.map --outgen D:\data\output --rn 100 --block 4 --mutation 0.0001 --impute
Linux/Mac
java -jar kG2P.jar --ped /root/data/AG.ped --map /root/data/AG.map --outgen /root/data/output --rn 100 --block 4 --impute
java -jar kG2P.jar --ped /root/data/AG.ped --map /root/data/AG.map --outgen /root/data/output --rn 100 --mutation 0.0001
java -jar kG2P.jar --ped D:\data\AG.ped --map D:\data\AG.map --outgen D:\data\output --rn 100 --block 4
java -jar kG2P.jar --ped D:\data\AG.ped --map D:\data\AG.map --outgen D:\data\output --rn 100 --impute
java -jar kG2P.jar --ped D:\data\AG.ped --map D:\data\AG.map --outgen D:\data\output --rn 100
java -jar kG2P.jar --ped D:\data\AG.ped --map D:\data\AG.map --outgen D:\data\output --rn 100 --mutation 0.0001
jar: the executable jar file (kG2P.jar)
ped: ped file
map: map file
outgen: output path
block: number of SNPs in each block
rn: simulated sample size
impute: if --impute is added, the major allele is used to impute missing values
mutation: the frequency of new mutations
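
When many runs with different settings are needed, the command line can be assembled in a script. The sketch below builds the argument list from the options listed above; `build_kg2p_command` is a hypothetical helper, and only flags that appear in this README are used.

```python
def build_kg2p_command(ped, map_file, outgen, rn,
                       block=None, mutation=None, impute=False,
                       jar="kG2P.jar"):
    """Assemble a single-population kG2P command as an argument list."""
    cmd = ["java", "-jar", jar,
           "--ped", ped, "--map", map_file,
           "--outgen", outgen, "--rn", str(rn)]
    if block is not None:
        cmd += ["--block", str(block)]
    if mutation is not None:
        cmd += ["--mutation", str(mutation)]
    if impute:
        cmd.append("--impute")  # flag without a value
    return cmd


cmd = build_kg2p_command("/root/data/AG.ped", "/root/data/AG.map",
                         "/root/data/output", rn=100, block=4, impute=True)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains the jar file.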
Multi Populations _ GUI
<p align="center"> <a href="https://raw.githubusercontent.com/XiaoleiLiuBio/G2P/master/results/Multi Populations.png"> <img src="results/Multi Populations.png" height="400px" width="460px"> </a> </p>

Ped: ped file
Map: map file
Pop: pop file
Path for output Ped/Map: path for output ped and map file
Block: Yes or No; if "Yes", the whole genome is divided into blocks, which are shuffled to generate new samples
Number of SNPs in each block: Number of SNPs in each block
Mutation rate: the frequency of new mutations
Migration rate: the ratio of immigrants (or emigrants) for each group
Genetic drift: the change in the frequency of an existing gene variant (allele) in a population due to random sampling of organisms
Imputation: if TRUE, the major allele is used to impute missing values
Sample size of each population: sample size of each newly simulated population
Population size: number or vector, simulated sample size
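
The Block option combined with sub-populations can be sketched as follows. The exact shuffling procedure used by G2P is not described here, so this shows one plausible reading only: each block of a new sample is copied from a randomly chosen existing sample of the same sub-population. All function names and the toy genotypes are hypothetical.

```python
import random


def simulate_sample(parents, block_size, rng):
    """Build one new genotype by copying each block from a random parent."""
    n_markers = len(parents[0])
    new_sample = []
    for start in range(0, n_markers, block_size):
        donor = rng.choice(parents)
        new_sample.extend(donor[start:start + block_size])
    return new_sample


def simulate_population(genotypes_by_subpop, sizes, block_size, seed=0):
    """Simulate sizes[subpop] new samples within each sub-population."""
    rng = random.Random(seed)
    return {
        subpop: [simulate_sample(parents, block_size, rng)
                 for _ in range(sizes[subpop])]
        for subpop, parents in genotypes_by_subpop.items()
    }


# Toy data: two sub-populations, two samples each, four markers per sample.
genotypes_by_subpop = {
    "1": [["AA", "AA", "AG", "AG"], ["AG", "AG", "AA", "AG"]],
    "2": [["AA", "AG", "AA", "AA"], ["AG", "AA", "AG", "AG"]],
}
simulated = simulate_population(genotypes_by_subpop, {"1": 3, "2": 2},
                                block_size=2)
for subpop, samples in simulated.items():
    print(subpop, samples)
```

Because donors are drawn only from the same sub-population, marker combinations inside a block are preserved and sub-population differences are not mixed; mutation, migration, and drift are left out of this sketch.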
Multi Populations _ Pipeline
Windows
java -jar kG2P.jar --ped D:\data\AG.ped --map D:\data\AG.map --pop D:\data\AG.pop --outgen D:\data\output --block 4 --rn 100
java -jar kG2P.jar --ped D:\data\AG.ped --map D:\data\AG.map --pop D:\data\AG.pop --outgen D:\data\output --block 4 --rn 100 --mutation 0.0001 --mig 0.1 --genetic 0.001
Linux/Mac
java -jar kG2P.jar --ped /root/data/AG.ped --map /root/data/AG.map --pop /root/data/AG.pop --outgen /root/data/output --impute --block 4 --rn 100
java -jar kG2P.jar --ped /root/data/AG.ped --map /root/data/AG.map --pop /root/data/AG.pop --outgen /root/data/output --rn 100 --mutation 0.0001 --mig 0.1 --genetic 0.001
java -jar kG2P.jar --ped /root/data/AG.ped --map /root/data/AG.map --pop /root/data/AG.pop --outgen /root/data/output --rn 100 --genetic 0.001
java -jar kG2P.jar --ped D:\data\AG.ped --map D:\data\AG.
