simplePHENOTYPES

Installation
Load Sample Data set
Single Trait
Multiple Traits: Pleiotropy Architecture
Multiple Traits: Partial Pleiotropy Architecture
Multiple Traits: Spurious Pleiotropy Architecture
Multiple Traits: Partial Pleiotropy Architecture with other useful parameters
Using Multiple Marker Data Files
Contact

This short tutorial presents some of the possible genetic settings one could simulate, but it certainly does not explore all the possibilities. For more information on specific input parameters, please check the help documentation (?create_phenotypes).

Installation

Installing simplePHENOTYPES also installs the following R packages:

From Bioconductor:
- SNPRelate
- gdsfmt
From CRAN:
- mvtnorm
- lqmm
- data.table

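# Enable the CRAN and Bioconductor repositories so that all dependencies can be found
setRepositories(ind = 1:2)
# Requires the devtools package: install.packages("devtools")
devtools::install_github("samuelbfernandes/simplePHENOTYPES", build_vignettes = TRUE)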

Load Sample Data set

Note that the data set used in all vignettes is already in numeric format. In addition to the numeric format, simplePHENOTYPES’ parameter geno_obj also takes an R object in HapMap format as input. Other input options are VCF, GDS, and Plink bed/ped. These last formats should be loaded from file with geno_file or geno_path.

library(simplePHENOTYPES)
data("SNP55K_maize282_maf04")
SNP55K_maize282_maf04[1:8, 1:10]

Single Trait

The simplest option is to simulate univariate traits. In the example below, we simulate ten single-trait experiments with a heritability of 0.7. The simulated trait is controlled by one large-effect QTN (big_add_QTN_effect = 0.9) and two small-effect QTNs. The additive effects of these two small-effect QTNs follow a geometric series starting at 0.2: the effect size of the first is 0.2, and the effect size of the second is 0.2^2 = 0.04. Results are saved in a temporary directory (home_dir = tempdir()). See the help file (?create_phenotypes) for the default values of all other parameters.

create_phenotypes(
  geno_obj = SNP55K_maize282_maf04,
  add_QTN_num = 3,
  add_effect = 0.2,
  big_add_QTN_effect = 0.9,
  rep = 10,
  h2 = 0.7,
  model = "A",
  home_dir = tempdir())
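
As a quick sanity check on the effect sizes described above, the geometric series can be reproduced in base R. This is only an illustrative sketch: the helper variables n_small_qtn, small_effects, and all_effects are made up for this example and are not part of simplePHENOTYPES.

```r
# Illustration only: reproduce the additive effect sizes of the single-trait example.
big_add_QTN_effect <- 0.9   # effect of the large-effect QTN
add_effect <- 0.2           # base of the geometric series
n_small_qtn <- 2            # add_QTN_num (3) minus the large-effect QTN

# The i-th small-effect QTN gets an effect of add_effect^i
small_effects <- add_effect^seq_len(n_small_qtn)

# Large-effect QTN first, followed by the two small-effect QTNs
all_effects <- c(big_add_QTN_effect, small_effects)
print(all_effects)
```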

Multiple Traits: Pleiotropy Architecture

simplePHENOTYPES provides three multi-trait simulation scenarios: pleiotropy, partial pleiotropy, and spurious pleiotropy. In this example, we simulate three (ntraits = 3) pleiotropic (architecture = "pleiotropic") traits controlled by three additive and four dominance QTNs. The effect size of the largest-effect additive QTN is 0.3 for all traits (big_add_QTN_effect = c(0.3, 0.3, 0.3)), while the base additive and dominance effect sizes are 0.04, 0.2, and 0.1 for trait_1, trait_2, and trait_3, respectively. Heritability is 0.2 for trait_1 and 0.4 for the other two traits. Each replicate is recorded in a different file (output_format = "multi-file") in a folder named “Results_Pleiotropic”. In this setting, we do not specify the correlation between traits; instead, the observed (realized) correlation results from the different allelic effects assigned to each trait. The same QTNs are used to generate phenotypes in all ten replicates (vary_QTN = FALSE, the default); alternatively, we could select different QTNs in each replicate using vary_QTN = TRUE. As mentioned above, the first additive QTN of each trait gets the effect provided by big_add_QTN_effect; all other QTNs get effect sizes derived from add_effect and dom_effect. The vectors add_effect and dom_effect contain one base allelic effect per trait, and a geometric series (the default) is used to generate the allelic effects of the remaining two additive QTNs (add_QTN_num = 3, minus the large-effect QTN) and of all four dominance QTNs (dom_QTN_num = 4). All results are saved to file, and a data.frame with all phenotypes is assigned to an object called “test1” (to_r = TRUE).

 test1 <-  create_phenotypes(
    geno_obj = SNP55K_maize282_maf04,
    add_QTN_num = 3,
    dom_QTN_num = 4,
    big_add_QTN_effect = c(0.3, 0.3, 0.3),
    h2 = c(0.2, 0.4, 0.4),
    add_effect = c(0.04,0.2,0.1),
    dom_effect = c(0.04,0.2,0.1),
    ntraits = 3,
    rep = 10,
    vary_QTN = FALSE,
    output_format = "multi-file",
    architecture = "pleiotropic",
    output_dir = "Results_Pleiotropic",
    to_r = TRUE,
    seed = 10,
    model = "AD",
    sim_method = "geometric",
    home_dir = tempdir()
  )
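
To make the role of h2 more concrete, the sketch below shows how a target heritability translates into a residual variance under the usual definition h2 = Var(g) / (Var(g) + Var(e)). This is an assumption-laden illustration in base R, not a description of the package internals: the genetic values are random stand-ins, and the names genetic, residual_var, and expected_h2 are hypothetical.

```r
# Illustration only: relate a target heritability to the residual variance.
set.seed(10)
n <- 282                                  # number of individuals (hypothetical)
genetic <- rnorm(n, sd = 0.3)             # stand-in for one trait's genetic values
h2 <- 0.2                                 # target heritability, as used for trait_1

# Solve h2 = Var(g) / (Var(g) + Var(e)) for Var(e)
residual_var <- var(genetic) * (1 - h2) / h2

# Draw residuals with that variance and build the phenotype
residual <- rnorm(n, sd = sqrt(residual_var))
phenotype <- genetic + residual

# Heritability implied by the chosen residual variance
expected_h2 <- var(genetic) / (var(genetic) + residual_var)
print(expected_h2)
```

A lower h2 therefore means a larger residual variance relative to the genetic variance, which is why trait_1 (h2 = 0.2) is expected to be noisier than the two traits simulated with h2 = 0.4.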

Optionally, we may input a list of allelic effects (sim_method = "custom"). In the example below, geometric series are assigned through the lists custom_geometric_a and custom_geometric_d, which should generate the same simulated data as the previous example (all.equal(test1, test2)). Notice that since big_add_QTN_effect is non-NULL, we only need to provide effects for two of the three simulated additive QTNs. On the other hand, all four dominance QTNs must have an effect assigned in the custom_geometric_d list. Importantly, the allelic effects are assigned to each trait based on the order in which they appear in the list and not based on the names, i.e., ‘trait_1’, ‘trait_2’, and ‘trait_3’.

 custom_geometric_a <- list(trait_1 = c(0.04, 0.0016),
                         trait_2 = c(0.2, 0.04),
                         trait_3 = c(0.1, 0.01))
 custom_geometric_d <- list(trait_1 = c(0.04, 0.0016, 6.4e-05, 2.56e-06),
                         trait_2 = c(0.2, 0.04, 0.008, 0.0016),
                         trait_3 = c(0.1, 0.01, 0.001, 1e-04))

 test2 <-  create_phenotypes(
   geno_obj = SNP55K_maize282_maf04,
   add_QTN_num = 3,
   dom_QTN_num = 4,
   big_add_QTN_effect = c(0.3, 0.3, 0.3),
   h2 = c(0.2,0.4, 0.4),
   add_effect = custom_geometric_a,
   dom_effect = custom_geometric_d,
   ntraits = 3,
   rep = 10,
   vary_QTN = FALSE,
   output_format = "multi-file",
   architecture = "pleiotropic",
   output_dir = "Results_Pleiotropic",
   to_r = TRUE,
   sim_method = "custom",
   seed = 10,
   model = "AD",
   home_dir = tempdir()
 )
 
 all.equal(test1, test2)
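
The custom lists above do not have to be typed by hand. As a minimal sketch in base R, they can be rebuilt from the per-trait base effects; geometric_effects() is a hypothetical helper written for this illustration and is not a simplePHENOTYPES function.

```r
# Illustration only: rebuild the custom effect lists from the per-trait base effects.
base_effect <- c(trait_1 = 0.04, trait_2 = 0.2, trait_3 = 0.1)

# Hypothetical helper: for each trait, return base^1, base^2, ..., base^n
geometric_effects <- function(base, n) {
  lapply(base, function(b) b^seq_len(n))
}

# Two additive QTNs per trait (the third one is covered by big_add_QTN_effect)
rebuilt_a <- geometric_effects(base_effect, 2)
# Four dominance QTNs per trait
rebuilt_d <- geometric_effects(base_effect, 4)

str(rebuilt_a)
str(rebuilt_d)
```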

Multiple Traits: Partial Pleiotropy Architecture

In this example, we simulate 20 replicates of three partially pleiotropic traits (architecture = "partially"), which are respectively controlled by seven, 13, and four additive QTNs. All of these QTNs have additive effects that follow a geometric series, where the effect size of the ith QTN is add_effect^i. For instance, trait_2 is controlled by three pleiotropic additive QTNs and ten trait-specific additive QTNs; consequently, the first pleiotropic additive QTN has an additive effect of 0.33, and the 13th additive QTN (the last trait-specific one) has an effect of 0.33^13. The correlation among traits is set to the values in the cor_matrix object. All 20 replicates of these three simulated traits are saved in one file, in long format and with an additional column named “Rep”. Results are saved in a directory called “Results_Partially”. In this example, the genotype file is also saved in numeric format.

cor_matrix <- matrix(c(   1, 0.3, -0.9,
                        0.3,   1,  -0.5,
                       -0.9, -0.5,    1 ), 3)

sim_results <- create_phenotypes(
  geno_obj = SNP55K_maize282_maf04,
  ntraits = 3,
  pleio_a = 3,
  pleio_e = 2,
  same_add_dom_QTN = TRUE,
  degree_of_dom = 0.5,
  trait_spec_a_QTN_num = c(4, 10, 1),
  trait_spec_e_QTN_num = c(3, 2, 5),
  h2 = c(0.2, 0.4, 0.8),
  add_effect = c(0.5, 0.33, 0.2),
  epi_effect = c(0.3, 0.3, 0.3),
  epi_interaction = 2,
  cor = cor_matrix,
  rep = 20,
  output_dir = "Results_Partially",
  output_format = "long",
  architecture = "partially",
  out_geno = "numeric",
  to_r = TRUE,
  model = "AE",
  home_dir = tempdir()
)
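
Before running a simulation like the one above, it can help to verify the numbers described in the text. The base R sketch below is only an illustration: it recomputes the number of additive QTNs per trait, the geometric series for trait_2, and checks that the correlation matrix is a valid one (symmetric, unit diagonal, positive definite). The names n_add_qtn, effects_trait_2, and is_valid are hypothetical.

```r
# Illustration only: check the inputs of the partial pleiotropy example.
pleio_a <- 3
trait_spec_a_QTN_num <- c(4, 10, 1)
add_effect <- c(0.5, 0.33, 0.2)

# Additive QTNs per trait = pleiotropic QTNs + trait-specific QTNs
n_add_qtn <- pleio_a + trait_spec_a_QTN_num
print(n_add_qtn)

# Geometric series for trait_2: the i-th QTN has an effect of 0.33^i
effects_trait_2 <- add_effect[2]^seq_len(n_add_qtn[2])
print(effects_trait_2[c(1, 13)])

cor_matrix <- matrix(c(   1,  0.3, -0.9,
                        0.3,    1, -0.5,
                       -0.9, -0.5,    1), 3)

# A usable correlation matrix is symmetric, has ones on the diagonal,
# and has strictly positive eigenvalues (positive definite)
eigenvalues <- eigen(cor_matrix, symmetric = TRUE, only.values = TRUE)$values
is_valid <- isSymmetric(cor_matrix) &&
  all(diag(cor_matrix) == 1) &&
  all(eigenvalues > 0)
print(is_valid)
```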

Multiple Traits: Spurious Pleiotropy Architecture

Another implemented architecture is spurious pleiotropy. In this case, we have two options: direct or indirect LD (selected with type_of_ld; here, type_of_ld = "indirect"). In the example below, we simulate a case of indirect LD with five replicates of two traits controlled by three additive QTNs each. For each QTN, a marker is first selected (the intermediate marker), and then two separate markers (one upstream and another downstream) are picked to be the QTNs for each of the two traits. This QTN selection is based on an r^2 threshold of at most 0.8 (ld_max = 0.8) with the intermediate marker. The three QTNs have additive effects that follow a geometric series, where the effect size of the ith QTN is 0.02^i for one trait and 0.05^i for the other trait. The starting seed number is 200, and the output phenotypes are saved in one file, in a “wide” format with each replicate of the two traits added as additional columns.
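
The indirect-LD selection described above can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's actual selection procedure: the genotypes are simulated toy data coded as -1/0/1, the intermediate marker is fixed at position 11, LD is measured as the squared Pearson correlation, and pick_nearest() is a hypothetical helper that takes the closest marker on each side whose r^2 with the intermediate marker is at most ld_max.

```r
# Illustration only: pick one upstream and one downstream QTN around an intermediate marker.
set.seed(200)
n_ind <- 100
n_mark <- 21
ld_max <- 0.8

# Toy genotypes with decaying LD: each marker copies its left neighbour,
# then roughly 15% of the genotypes are redrawn at random
geno <- matrix(NA_real_, n_ind, n_mark)
geno[, 1] <- sample(c(-1, 0, 1), n_ind, replace = TRUE)
for (j in 2:n_mark) {
  geno[, j] <- geno[, j - 1]
  redraw <- runif(n_ind) < 0.15
  geno[redraw, j] <- sample(c(-1, 0, 1), sum(redraw), replace = TRUE)
}

mid <- 11                                    # intermediate marker (assumed position)
r2 <- as.vector(cor(geno[, mid], geno))^2    # r^2 of every marker with the intermediate one

# Hypothetical helper: nearest candidate marker with r^2 <= ld_max
pick_nearest <- function(candidates) {
  ok <- candidates[r2[candidates] <= ld_max]
  ok[which.min(abs(ok - mid))]
}

upstream <- pick_nearest(seq_len(mid - 1))   # QTN for one trait
downstream <- pick_nearest((mid + 1):n_mark) # QTN for the other trait

# Geometric series of additive effects for the three QTNs of each trait
effects_trait_1 <- 0.02^(1:3)
effects_trait_2 <- 0.05^(1:3)

print(c(upstream = upstream, intermediate = mid, downstream = downstream))
print(round(r2[c(upstream, downstream)], 3))
```

Because the two selected QTNs are each in LD with the same intermediate marker but are assigned to different traits, a marker in the region can appear associated with both traits even though no QTN is shared, which is what makes the pleiotropy spurious.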
