BGWAS
R package to perform Bayesian Genome-Wide Association Studies
bGWAS <img src="inst/Figures/logo.png" align="right" height=180/>
<!--- # https://github.com/GuangchuangYu/hexSticker library(hexSticker) imgurl <- "inst/Figures/PriorEstimation.jpg" sticker(imgurl, package="bGWAS", p_size=8, p_color="#B4CE4E", h_fill="white", h_color="#A7E4F8", s_x=1, s_y=.8, s_width=.75, filename="inst/Figures/logo.png", dpi=2000) --->
:arrow_right: ESHG poster is available here.
:information_source: bGWAS has been updated to version 1.0.3.
This update should solve the compatibility issues that arose with more
recent R versions, and it does not affect the analysis results. Note that
you might need to update some packages to be able to continue using
bGWAS.
:warning: 28/10/2019: The variance of the prior effects has been
modified. If you used a previous version of the package, please re-run
your analysis using this new version to get more accurate results.
Check the NEWS to learn more about what has been modified!
:warning: If you downloaded the Z-Matrix files before 20/08/2019, they
are now obsolete and you will not be able to use them with the newest
version of the package.
Note: some prior GWASs have been removed; you can find more details
here.
Overview
bGWAS is an R package to perform a Bayesian GWAS (Genome-Wide
Association Study), using summary statistics from a conventional GWAS as
input. The aim of the approach is to increase power by leveraging
information from related traits and by comparing the observed Z-scores
from the focal phenotype (provided as input) to prior effects. These
prior effects are directly estimated from publicly available GWASs
(currently, a set of 38 studies, last update 20-08-2019 - hereinafter
referred to as “prior GWASs” or “risk factors”). Only prior GWASs having
a significant causal effect on the focal phenotype, identified using a
multivariable Mendelian Randomization (MR) approach, are used to
calculate the prior effects. Causal effects are estimated masking the
focal chromosome to ensure independence, and the prior effects are
estimated as described in the figure below.
<img src="inst/Figures/PriorEstimation.jpg" align="center" height=300/>
Observed and prior effects are compared using Bayes Factors. Significance is assessed by calculating the probability of observing a value larger than the observed BF (P-value) given the prior distribution. This is done by decomposing the analytical form of the BFs and using an approximation for most BFs to make the computation faster. Prior, posterior and direct effects, alongside BFs and p-values are returned. Note that prior, posterior and direct effects are estimated on the Z-score scale, but are automatically rescaled to beta scale if possible.
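The relation between the Z-score scale and the beta scale mentioned above can be illustrated with a short base R sketch. It only assumes the standard definition Z = beta / SE; the data frame, its column names and its values are hypothetical, and the exact rescaling performed inside bGWAS is not described here.

```r
# Hypothetical summary statistics (illustrative names and values,
# not the column names required by bGWAS)
gwas_example <- data.frame(
  rsid = c("rs1", "rs2", "rs3"),
  beta = c(0.10, -0.04, 0.00),  # effect sizes
  se   = c(0.02,  0.02, 0.01),  # standard errors
  stringsAsFactors = FALSE
)

# Z-score scale: effect size divided by its standard error
gwas_example$z <- gwas_example$beta / gwas_example$se

# Back to the beta scale: only possible when standard errors are available
gwas_example$beta_rescaled <- gwas_example$z * gwas_example$se

print(gwas_example)
```

This is why effects can only be reported on the beta scale when standard errors are known: with Z-statistics alone, the scaling factor is missing.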
The principal functions available are:
- bGWAS(): main function that calculates prior effects from prior GWASs, compares them to observed Z-scores and returns an object of class bGWAS
- list_priorGWASs(): directly returns information about the prior GWASs that can be used to calculate prior effects
- select_priorGWASs(): allows a quick selection of prior GWASs (to include/exclude specific studies when calculating prior effects)
- extract_results_bGWAS(): returns results (prior, posterior and direct estimate / standard-error + p-value from BF for SNPs) from an object of class bGWAS
- manhattan_plot_bGWAS(): creates a Manhattan Plot from an object of class bGWAS
- extract_MRcoeffs_bGWAS(): returns multivariable MR coefficients (1 estimate using all chromosomes + 22 estimates with 1 chromosome masked) from an object of class bGWAS
- coefficients_plot_bGWAS(): creates a Coefficients Plot (causal effect of each prior GWAS on the focal phenotype) from an object of class bGWAS
- heatmap_bGWAS(): creates a heatmap to represent, for each significant SNP, the contribution of each prior GWAS to the estimated prior effect from an object of class bGWAS
All the functions available and more details about their usage can be found in the manual.
Installation
You can install the current version of bGWAS with:
# Directly install the package from github
# install.packages("remotes")
remotes::install_github("n-mounier/bGWAS")
library(bGWAS)
<!--- Note: using remotes instead of devtools leads to re-build the package
and apparently, it may be a problem with R 3.4 and macOS,
see https://stackoverflow.com/questions/43595457/alternate-compiler-for-installing-r-packages-clang-error-unsupported-option/43943631#43943631 --->
Usage
To run the analysis with bGWAS, two inputs are needed:
1. The GWAS results to be tested
Can be a regular (space/tab/comma-separated) file or a gzipped file
(.gz) or a data.frame. Must contain the following columns, which can
have alternative names:
If you want the prior/posterior/corrected effects to be rescaled, please make sure to provide effect sizes and standard errors instead of (or in addition to) Z-statistics.
2. Prior GWASs - Z-Matrix files
These files should be downloaded separately and stored in ~/ZMatrices
or in the folder specified when launching the analysis. These files
contain the Z-scores for all prior GWASs:
You can download these files using this link or by following the instructions below. Please note that your input GWAS will be merged with the Z-Matrix files (using rsid and alleles to align effects), and that the results reported will use the chr:pos information from the Z-Matrix files (GRCh37, since UK10K data has been used to impute the prior GWASs).
- On UNIX/MACOSX, from a terminal:
wget https://drive.switch.ch/index.php/s/jvSwoIxRgCKUSI8/download -O ZMatrices.tar.gz
tar xzvf ZMatrices.tar.gz
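Where wget and tar are not available, the same two steps can be sketched from an R session with base R only. The helper name download_zmatrices and its arguments are hypothetical and not part of bGWAS; the URL is the one used in the commands above. This is a minimal sketch: it has not been checked against the actual archive layout, and the files still need to end up in ~/ZMatrices or in the folder passed to the analysis.

```r
# Hypothetical helper (not part of bGWAS): download and unpack the Z-Matrix
# archive using base R only.
download_zmatrices <- function(
    dest_dir = ".",
    url = "https://drive.switch.ch/index.php/s/jvSwoIxRgCKUSI8/download") {
  dest_dir <- path.expand(dest_dir)
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  archive <- file.path(dest_dir, "ZMatrices.tar.gz")

  # The archive is large: raise the download timeout (in seconds) for this
  # call and restore the previous value on exit
  old_opts <- options(timeout = max(3600, getOption("timeout")))
  on.exit(options(old_opts), add = TRUE)

  # mode = "wb" keeps the binary archive intact on Windows
  download.file(url, destfile = archive, mode = "wb")

  # Equivalent of `tar xzvf ZMatrices.tar.gz`
  untar(archive, exdir = dest_dir)
  invisible(dest_dir)
}

# Usage (not run here, since it triggers a large download):
# download_zmatrices(dest_dir = "~")
```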
<!--- - On WINDOWS, from a terminal:
``` bash
...
``` --->
<font color="grey"><small> If you want to use your own set of prior GWASs, please have a look here to see how you can modify the files.
<!---We focused on including prior GWASs that do not come from UKBB, assuming that the focal phenotype results are more likely to be obtained from UKBB. Sample overlap between the focal phenotype and the prior GWASs is not accounted for by our method, so we did not include any UKBB results in the prior GWASs. ---></small></font>
Study Selection
Before running your analysis, you can select the prior GWASs you want to
include. You can use the function list_priorGWASs() to get some
information about the prior GWASs available.
You should remove traits that by definition are not independent from
your trait of interest. For example, before analysing BMI results, make
sure to exclude “Height” from the prior GWASs used. You can use the
function select_priorGWASs() to automatically exclude/include some
traits or some files. You should also check for sample overlap, and
remove prior GWASs that come from the same consortium as your data. If
there are individuals in common between your conventional GWAS and prior
GWASs, it might induce some bias.
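The sample-overlap check described above can be sketched with plain data frame subsetting. The table below is a mock limited to a few rows and columns (ID, Name, Trait, Consortium) visible in the example output of this README; the object names and the choice of GIANT as the overlapping consortium are hypothetical.

```r
# Mock of the table returned by list_priorGWASs(), restricted to columns
# and rows shown in this README's example output
studies_example <- data.frame(
  ID = c(1, 23, 35),
  Name = c("Body Mass Index (GIANT)",
           "Heart Rate (HRgene)",
           "Smoking - cigarettes per day (TAG)"),
  Trait = c("Body Mass Index", "Heart Rate", "Smoking"),
  Consortium = c("GIANT", "HRgene", "TAG"),
  stringsAsFactors = FALSE
)

# Hypothetical: the focal GWAS shares individuals with GIANT
overlapping_consortia <- c("GIANT")

# Keep only the studies coming from other consortia
keep <- !(studies_example$Consortium %in% overlapping_consortia)
kept_ids <- studies_example$ID[keep]

print(studies_example[keep, c("ID", "Name", "Consortium")])
```

The resulting IDs can then be compared with the IDs returned by select_priorGWASs() before launching the analysis.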
# Obtain the list of prior GWASs
AllStudies = list_priorGWASs()
# Select only the ones for specific traits
# select_priorGWASs will return the IDs of the files that are kept
MyStudies = select_priorGWASs(include_traits=c("Heart Rate", "Body Mass Index", "Smoking"))
# Match these IDs against the ones in the list of prior GWASs
AllStudies[AllStudies$ID %in% MyStudies, ]
## # A tibble: 6 × 10
## File
## <chr>
## 1 All_ancestries_SNP_gwas_mc_merge_nogc.tbl.uniq.gz
## 2 META_STAGE1_GWASHR_SUMSTATS.txt
## 3 tag.cpd.tbl.gz
## 4 tag.evrsmk.tbl.gz
## 5 tag.former.tbl.gz
## 6 tag.logonset.tbl.gz
## Name ID Trait Consortium
## <chr> <dbl> <chr> <chr>
## 1 Body Mass Index (GIANT) 1 Body Mass Index GIANT
## 2 Heart Rate (HRgene) 23 Heart Rate HRgene
## 3 Smoking - cigarettes per day (TAG) 35 Smoking TAG
## 4 Smoking - ever smoked (TAG)
