SomaDataIO
The SomaDataIO package loads and exports 'SomaScan' data via the 'SomaLogic Operating Co., Inc.' proprietary data file, called an ADAT ('*.adat'). The package also exports auxiliary functions for manipulating, wrangling, and extracting relevant information from an ADAT object once in memory.
The SomaDataIO R package loads and exports ‘SomaScan’ data via the
SomaLogic Operating Co., Inc. structured text file called an ADAT
(*.adat). The package also exports auxiliary functions for
manipulating, wrangling, and extracting relevant information from an
ADAT object once in memory. Basic familiarity with the R environment is
assumed, as is the ability to install contributed packages from the
Comprehensive R Archive Network (CRAN).
If you run into any issues with SomaDataIO, full documentation of the
most recent release can be found at our website of articles and
workflows (https://somalogic.github.io/SomaDataIO/). If the issue
persists, we encourage you to consult the issues page and, if
appropriate, submit an issue and/or feature request.
Usage
The SomaDataIO package is licensed under the MIT license and is
intended for research use only (“RUO”) purposes. The code contained
herein may not be used for diagnostic, clinical, therapeutic, or other
commercial purposes.
Installation
The easiest way to install SomaDataIO is to install directly from
CRAN:
install.packages("SomaDataIO")
Alternatively, install from GitHub:
remotes::install_github("SomaLogic/SomaDataIO")
which installs the most current “development” version from the
repository HEAD. To install the most recent release, use:
remotes::install_github("SomaLogic/SomaDataIO@*release")
To install a specific tagged release, use:
remotes::install_github("SomaLogic/SomaDataIO@v5.3.0")
Package Dependencies
The SomaDataIO package was intentionally developed with a limited
number of dependencies from CRAN. This makes the package more robust to
design changes in external software, but it also limits the feature set
the package can offer. With this in mind, SomaDataIO aims to strike a
balance between long(er)-term stability and a limited set of features.
Below are the package dependencies (see also the DESCRIPTION file):
Biobase
The Biobase package is suggested, being required by only two
functions, pivotExpressionSet() and adat2eSet(). Biobase must be
installed separately from Bioconductor by entering the following from
the R Console:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("Biobase", version = remotes::bioc_version())
Information about Bioconductor can be found here: https://bioconductor.org/install/
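Because Biobase is only a suggested dependency, scripts may want to check for it before calling either of the two functions that need it. The following is a minimal sketch of that pattern, using the bundled example_data object; the guard and the variable name `eset` are illustrative choices, not something prescribed by the package.

```r
library(SomaDataIO)

# Biobase is only suggested, so guard the conversion (illustrative pattern)
if (requireNamespace("Biobase", quietly = TRUE)) {
  # convert the bundled `example_data` object to an ExpressionSet
  eset <- adat2eSet(example_data)
  print(class(eset))
} else {
  message("Biobase is not installed; skipping the ExpressionSet conversion.")
}
```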
Loading
Upon successful installation, load SomaDataIO as normal:
library(SomaDataIO)
For an index of available commands:
library(help = SomaDataIO)
Objects and Data
The SomaDataIO package comes with five (5) objects available to users
to run canned examples (or analyses). They can be accessed once
SomaDataIO has been attached via library(). They are:
- example_data: the original ‘SomaScan’ file (example_data.adat) can be downloaded directly via:
  wget https://raw.githubusercontent.com/SomaLogic/SomaLogic-Data/main/example_data.adat
  - within SomaDataIO it has been replaced by an abbreviated, light-weight version containing only the first 10 samples:
    dir(system.file("extdata", package = "SomaDataIO"), full.names = TRUE)
- ex_analytes: the analyte (feature) variables in example_data
- ex_anno_tbl: the annotations table associated with example_data
- ex_target_names: a mapping object for analyte -> target
- ex_clin_data: a table containing variables SampleId, smoking_status and alcohol_use to demonstrate merging clinical sample annotation information to a soma_adat object
See also ?SomaScanObjects
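As a quick orientation, the bundled objects can be inspected directly once the package is attached. The sketch below also illustrates the kind of merge that ex_clin_data is meant to demonstrate; it assumes the dplyr package is installed and that SampleId is the shared key, and the name `merged` is purely illustrative.

```r
library(SomaDataIO)

# the bundled objects are available once the package is attached
is.soma_adat(example_data)   # check the class of the example ADAT
dim(example_data)            # samples x (clinical + analyte) columns
head(ex_analytes)            # first few analyte (feature) names
head(ex_clin_data)           # SampleId, smoking_status, alcohol_use

# sketch of merging clinical annotations by the shared `SampleId` key
# (assumes the dplyr package is installed)
merged <- dplyr::left_join(example_data, ex_clin_data, by = "SampleId")
is.soma_adat(merged)
```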
Main (I/O) Features
- Loading data (Import)
  - parse and import a *.adat text file into an R session as a soma_adat object.
- Wrangling data (manipulation)
  - subset, reorder, and list various fields of a soma_adat object.
  - ?SeqId analyte (feature) matching.
  - dplyr and tidyr verb S3 methods for the soma_adat class.
  - ?rownames helpers that do not break soma_adat attributes.
  - please see the article Loading and Wrangling ‘SomaScan’
- Exporting data (Output)
  - write out a soma_adat object as a *.adat text file.
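The three feature groups above can be sketched as a single import, wrangle, export round trip. This is a minimal illustration using the abbreviated example file shipped with the package; the temporary output path and the variable names (`adat`, `adat2`, `out_path`) are assumptions made for the example.

```r
library(SomaDataIO)

# import: path to the abbreviated example ADAT shipped with the package
adat_path <- system.file("extdata", "example_data10.adat",
                         package = "SomaDataIO", mustWork = TRUE)
adat <- read_adat(adat_path)

# wrangle: list analyte (feature) and clinical (meta) column names
analytes <- getAnalytes(adat)
meta     <- getMeta(adat)
length(analytes)
length(meta)

# export: write to a temporary *.adat file, then read it back
out_path <- tempfile(fileext = ".adat")
write_adat(adat, file = out_path)
adat2 <- read_adat(out_path)
identical(dim(adat), dim(adat2))
```

Writing to a tempfile() keeps the example from touching your working directory; in practice `out_path` would point to wherever the exported file should live.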
Loading an ADAT
Loading an ADAT text file is simple using read_adat():
# Note: This `system.file()` command returns the file path to the
# `example_data10.adat` file shipped with the `SomaDataIO` package
adat_path <- system.file("extdata", "example_data10.adat",
                         package = "SomaDataIO", mustWork = TRUE)
adat_path
#> [1] "/Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/SomaDataIO/extdata/example_data10.adat"
# `adat_path` should be the full path and file name of the *.adat file to
# be loaded into the R workspace from your local file system
# (e.g. adat_path = "PATH_TO_ADAT/my_adat.adat")
my_adat <- read_adat(file = adat_path)
# test object class
is.soma_adat(my_adat)
#> [1] TRUE
# S3 print method (forwards -> tibble)
my_adat
#> ══ SomaScan Data ═══════════════════════════════════════════════════════════════
#> SomaScan version V4 (5k)
#> Signal Space 5k
#> Attributes intact ✓
#> Rows 10
#> Columns 5318
#> Clinical Data 34
#> Features 5284
#> ── Column Meta ─────────────────────────────────────────────────────────────────
#> ℹ SeqId, SeqIdVersion, SomaId, TargetFullName, Target, UniProt, EntrezGeneID,
#> ℹ EntrezGeneSymbol, Organism, Units, Type, Dilution, PlateScale_Reference,
#> ℹ CalReference, Cal_Example_Adat_Set001, ColCheck,
#> ℹ CalQcRatio_Example_Adat_Set001_170255, QcReference_170255,
#> ℹ Cal_Example_Adat_Set002, CalQcRatio_Example_Adat_Set002_170255, Dilution2
#> ── Tibble ──────────────────────────────────────────────────────────────────────
#> # A tibble: 10 × 5,319
#> row_names PlateId PlateRunDate ScannerID PlatePosition SlideId Subarray
#> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
#> 1 258495800012_3 Example… 2020-06-18 SG152144… H9 2.58e11 3
#> 2 258495800004_7 Example… 2020-06-18 SG152144… H8 2.58e11 7
#> 3 258495800010_8 Example… 2020-06-18 SG152144… H7 2.58e11 8
#> 4 258495800003_4 Example… 2020-06-18 SG152144… H6 2.58e11 4
#> 5 258495800009_4 Example… 2020-06-18 SG152144… H5 2.58e11 4
#> 6 258495800012_8 Example… 2020-06-18 SG152144… H4 2.58e11 8
#> 7 258495800001_3 Example… 2020-06-18 SG152144… H3 2.58e11 3
#> 8 258495800004_8 Example… 2020-06-18 SG152144… H2 2.58e11 8
#> 9 258495800001_8 Example… 2020-06-18 SG152144… H12 2.58e11 8
#> 10 258495800004_3 Example… 2020-06-18 SG152144… H11 2.58e11 3
#> # ℹ 5,312 more variables: SampleId <chr>, SampleType <chr>,
#> # PercentDilution <int>, SampleMatrix <chr>, Barcode <lgl>, Barcode2d <chr>,
#> # SampleName <lgl>, SampleNotes <lgl>, AliquotingNotes <lgl>,
#> # SampleDescription <chr>, …
#> ════════════════════════════════════════════════════════════════════════════════
Please see the article Loading and Wrangling SomaScan for more details and options.
Wrangling
The soma_adat class comes with numerous class-specific S3 methods for
the most popular dplyr and
