GRETTA
GRETTA (Genetic inteRaction and EssenTiality neTwork mApper): An R package for mapping genetic interaction and essentiality networks

Introduction
Genetic inteRaction and EssenTiality neTwork mApper (GRETTA) is an R package that leverages data generated by the Cancer Dependency Map (DepMap) project to perform in silico genetic knockout screens and map essentiality networks. A manuscript describing this tool is available in Bioinformatics (Takemon, Y. and Marra, MA., 2023).
The DepMap data used in this tutorial is version 22Q2. This version, along with all other versions provided in this repository, was downloaded through the DepMap data portal and is distributed and used under the terms and conditions of the CC Attribution 4.0 license.
Maintainer
This repository is maintained by Yuka Takemon, research associate in Dr. Marco Marra’s laboratory at Canada’s Michael Smith Genome Sciences Centre.
Citations
When using GRETTA, please cite the manuscript describing GRETTA: Yuka Takemon, Marco A Marra, GRETTA: an R package for mapping in silico genetic interaction and essentiality networks, Bioinformatics, Volume 39, Issue 6, June 2023, btad381, https://doi.org/10.1093/bioinformatics/btad381
Please also cite the DepMap project and the appropriate data version found on https://depmap.org/portal/: Tsherniak A, Vazquez F, Montgomery PG, Weir BA, Kryukov G, Cowley GS, Gill S, Harrington WF, Pantel S, Krill-Burger JM, Meyers RM, Ali L, Goodale A, Lee Y, Jiang G, Hsiao J, Gerath WFJ, Howell S, Merkel E, Ghandi M, Garraway LA, Root DE, Golub TR, Boehm JS, Hahn WC. Defining a Cancer Dependency Map. Cell. 2017 Jul 27;170(3):564-576.
Questions
Please check the FAQ section for additional information. If you cannot find your answer there, or if you have a request, please submit an issue.
Requirements
- GRETTA supports R versions >= 4.2.0.
- 12 GB of space to store one DepMap data set, plus an additional 11 GB of temporary space for the .tar.gz archive prior to extraction.
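The R version requirement can be checked from an R session before installing. The following is a minimal sketch; the variable names `min_r_version` and `r_ok` are illustrative and not part of GRETTA.

```r
# Minimum R version taken from the requirement listed above.
min_r_version <- "4.2.0"

# getRversion() returns a numeric_version object, so the comparison is
# done component by component rather than as plain text.
r_ok <- getRversion() >= min_r_version

if (r_ok) {
  message("R ", as.character(getRversion()),
          " meets the minimum version ", min_r_version, ".")
} else {
  warning("R ", as.character(getRversion()),
          " is older than the required version ", min_r_version, ".")
}
```

Disk space is not checked here, since base R has no portable function for it; a tool such as `df -h` in a terminal can be used instead.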
Installation
Warning: The new version of dbplyr (v2.4.0) is currently incompatible with another library used in GRETTA. If you encounter an error message like the one below, please install the previous working version, also shown below.
Error message:
Error in `collect()`:
! Failed to collect lazy table.
Caused by error in `db_collect()`:
! Arguments in `...` must be used.
✖ Problematic argument:
• ..1 = Inf
ℹ Did you misspell an argument name?
Solution:
install.packages("devtools")
devtools::install_version("dbplyr", version = "2.3.4")
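To see whether this applies to your setup, you can compare the installed dbplyr version against the working version named above. This is a minimal sketch using only base R and `utils`; the variable name `known_good` is illustrative. It also assumes that any version newer than 2.3.4 may be affected, whereas the warning above only names v2.4.0.

```r
# Last dbplyr version reported to work with GRETTA (from the warning above).
known_good <- package_version("2.3.4")

if (requireNamespace("dbplyr", quietly = TRUE)) {
  installed <- utils::packageVersion("dbplyr")
  if (installed > known_good) {
    # Assumption: versions newer than 2.3.4 may trigger the collect() error.
    message("dbplyr ", as.character(installed), " is newer than ",
            as.character(known_good),
            "; if collect() fails, install the older version.")
  } else {
    message("dbplyr ", as.character(installed),
            " is not newer than the known working version.")
  }
} else {
  message("dbplyr is not installed yet.")
}
```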
You can install the GRETTA package from GitHub with:
install.packages(c("devtools", "dplyr","forcats","ggplot2"))
devtools::install_github("ytakemon/GRETTA")
DepMap 22Q2 data and the data documentation files can be downloaded and extracted directly in a terminal using the following bash code (not in R/RStudio). For other DepMap data versions, please refer to the FAQ section.
# Make a new directory/folder called GRETTA_project and go into directory
mkdir GRETTA_project
cd GRETTA_project
# Download data from the web
wget https://www.bcgsc.ca/downloads/ytakemon/GRETTA/22Q2/GRETTA_DepMap_22Q2_data.tar.gz
# Extract data and data documentation
tar -zxvf GRETTA_DepMap_22Q2_data.tar.gz
A singularity container has also been provided and instructions can be found here.
Additional DepMap versions
In this example we use DepMap’s 2022 data release (22Q2). However, we
also provide previous data released in 2020 (v20Q1) and 2021 (v21Q4),
which are available at
https://www.bcgsc.ca/downloads/ytakemon/GRETTA/. We hope to
make new data sets available as they are released by DepMap.
Workflows
Genetic interaction mapping
- Install GRETTA and download accompanying data.
- Select mutant cell lines that carry mutations in the gene of interest and control cell lines.
  - Optional specifications can be used to select cell lines based on disease type, disease subtype, or amino acid change.
- Determine differential expression between mutant and control cell line groups (optional but recommended).
- Perform in silico genetic screen.
- Visualize results.
Co-essential network mapping
- Install GRETTA and download accompanying data.
- Run correlation coefficient analysis.
  - Optional specifications can be used to perform the analysis on cell lines of specific disease type(s).
- Calculate inflection points of the negative/positive curve to determine a threshold.
- Apply threshold.
- Visualize results.
Example: Identifying ARID1A genetic interactions
ARID1A encodes a member of the chromatin remodeling SWItch/Sucrose
Non-Fermentable (SWI/SNF) complex and is a frequently mutated gene in
cancer. It is known that ARID1A and its homolog, ARID1B, are
synthetic lethal to one another: the dual loss of both genes in a cell
is lethal, whereas the loss of either gene alone is not
(Helming et al., 2014).
This example will demonstrate how we can identify synthetic lethal
interactors of ARID1A using GRETTA and predict this known
interaction.
For this example you will need to load the following libraries. If
they are not installed yet, use install.packages() (e.g.
install.packages("dplyr")).
# Load library
library(tidyverse)
#> ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
#> ✔ dplyr 1.1.4 ✔ readr 2.1.5
#> ✔ forcats 1.0.0 ✔ stringr 1.5.1
#> ✔ ggplot2 4.0.0 ✔ tibble 3.2.1
#> ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
#> ✔ purrr 1.0.2
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(GRETTA)
#>
#> _______ .______ _______ .___________.___________. ___
#> / _____|| _ \ | ____|| | | / \
#> | | __ | |_) | | |__ `---| |----`---| |----` / ^ \
#> | | |_ | | / | __| | | | | / /_\ \
#> | |__| | | |\ \----.| |____ | | | | / _____ \
#> \______| | _| `._____||_______| |__| |__| /__/ \__\
#>
#> Welcome to GRETTA! The version loaded is: 4.0.0
#> The latest DepMap dataset accompanying this package is v24Q3.
#> Please refer to our tutorial on GitHub for loading DepMap data and details: https://github.com/ytakemon/GRETTA
Download example data
A small data set has been created for this tutorial and can be downloaded using the following code.
path <- getwd()
download_example_data(path)
#> Warning in dir.create(paste0(path, "/GRETTA_example")):
#> '/projects/marralab/ytakemon_prj/DepMap/GRETTA/GRETTA_example' already exists
#> Warning in dir.create(paste0(path, "/GRETTA_example_output")):
#> '/projects/marralab/ytakemon_prj/DepMap/GRETTA/GRETTA_example_output' already
#> exists
#> Data saved to: /projects/marralab/ytakemon_prj/DepMap/GRETTA/GRETTA_example/
Then, assign variables that point to where the .rda files are stored
and where result files should go.
gretta_data_dir <- paste0(path,"/GRETTA_example/")
gretta_output_dir <- paste0(path,"/GRETTA_example_output/")
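Before continuing, it can help to confirm that both directories exist and that the data directory contains .rda files. The sketch below uses only base R; the names `dirs`, `dir_found`, and `rda_files` are illustrative and not part of GRETTA.

```r
# Same paths as defined above, repeated so this block runs on its own.
path <- getwd()
gretta_data_dir <- paste0(path, "/GRETTA_example/")
gretta_output_dir <- paste0(path, "/GRETTA_example_output/")

# Strip trailing slashes before checking, since some platforms do not
# accept a trailing separator when testing whether a directory exists.
dirs <- sub("/+$", "", c(data = gretta_data_dir, output = gretta_output_dir))
dir_found <- dir.exists(dirs)
names(dir_found) <- names(dirs)
print(dir_found)

# List the .rda files in the data directory; the result is an empty
# character vector if the directory is missing or contains none.
rda_files <- list.files(dirs[["data"]], pattern = "\\.rda$", ignore.case = TRUE)
print(rda_files)
```

If either entry of `dir_found` is FALSE, or `rda_files` is empty, rerun download_example_data(path) before moving on.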
Exploring cell lines
One way to explore the cell lines available in DepMap is through
their portal. GRETTA also provides some simple built-in methods
that give users a way to glimpse
the data, using the series of list_available functions:
list_mutations(), list_cancer_types(), and list_cancer_subtypes().
Current DepMap data used by default is version 22Q2, which contains
whole-genome sequencing or whole-exome sequencing annotations for 1771
cancer cell lines (1406 cell line
