
SplineOmics

The R package SplineOmics finds the significant features (hits) in time-series -omics data by using splines and limma for hypothesis testing. It then clusters the hits by spline shape and presents all results in summary HTML reports.

The graphical abstract below shows the full workflow streamlined by SplineOmics:

Graphical Abstract of SplineOmics Workflow


📘 Introduction
🔧 Installation
- 🐳 Docker Container
▶️ Usage
📦 Dependencies
📚 Further Reading
❓ Getting Help
💬 Feedback
📜 License
🎓 Citation
🌟 Contributors
🙏 Acknowledgements
⭐ Liking this project

📘 Introduction

Welcome to SplineOmics, an R package that streamlines the analysis of -omics time-series data and automatically generates HTML reports of the results.

Is the SplineOmics package of use for me?

If you have -omics data over time, the package helps you run limma with splines, cluster the hits, run over-representation analysis (ORA), and view the result plots in HTML reports. Any time-series data that is a valid input to the limma package is also a valid input to the SplineOmics package (such as transcriptomics, proteomics, phosphoproteomics, metabolomics, glycan fractional abundances, etc.).

What do I need precisely?

Data: A data matrix where each row is a feature (e.g., protein, metabolite, etc.) and each column is a sample taken at a specific time. The data must contain no NA values, the features should be approximately normally distributed, and the samples should be independent of each other.
Meta: A table with metadata on the columns/samples of the data matrix (e.g., batch, time point, etc.).
Annotation (optional): A table with identifiers for the rows/features of the data matrix (e.g., gene and protein names).
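
The three inputs described above can be sketched with toy objects in base R. This is only an illustration of the expected shapes: all object names, column names (`Sample`, `Time`, `Batch`, `Feature`, `Gene`), and values are assumptions made for this example and are not prescribed by the package.

```r
# Toy inputs; every name and value below is an illustrative assumption.
set.seed(1)
time_points <- rep(c(0, 1, 2, 4, 8), each = 3)  # 5 time points, 3 replicates each

# Data: features in rows, samples in columns.
data <- matrix(
  rnorm(4 * length(time_points)),
  nrow = 4,
  dimnames = list(
    paste0("feature_", 1:4),
    paste0("sample_", seq_along(time_points))
  )
)

# Meta: one row per sample, in the same order as the columns of `data`.
meta <- data.frame(
  Sample = colnames(data),
  Time   = time_points,
  Batch  = rep(c("A", "B", "C"), times = 5),
  stringsAsFactors = FALSE
)

# Annotation (optional): one row per feature.
annotation <- data.frame(
  Feature = rownames(data),
  Gene    = c("G1", "G2", "G3", "G4"),
  stringsAsFactors = FALSE
)

# Basic consistency checks matching the requirements listed above.
stopifnot(
  is.numeric(data),
  !anyNA(data),                            # no NA values allowed
  ncol(data) == nrow(meta),                # one meta row per sample
  identical(colnames(data), meta$Sample),  # same sample order
  nrow(data) == nrow(annotation)           # one annotation row per feature
)
```

Running checks like these before starting an analysis catches the most common mismatch, a meta table whose rows are not in the same order as the columns of the data matrix.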

Capabilities

With SplineOmics, you can:

Automatically perform exploratory data analysis:

The explore_data() function generates an HTML report containing various plots, such as density, PCA, and correlation heatmap plots (example report).

Perform limma spline analysis:

Use the run_limma_splines() function to perform the limma analysis with splines once the optimal hyperparameters are identified (example report).

Find jumps and drops in the timecourse:

Use the find_pvc() function to detect sudden jumps and drops in the timecourse (example report).

Cluster significant features:

Cluster the significant features (hits) identified in the spline analysis with the cluster_hits() function (example report).

Run ORA with clustered hits:

Perform over-representation analysis (ORA) using the clustered hits with the run_ora() function (example report).
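
To give an idea of what "limma with splines" means, the sketch below fits a spline model directly with the limma and splines packages on simulated data. It is not the SplineOmics API and makes no claim about how run_limma_splines() is implemented; the simulated matrix, the choice of `df = 3`, and all object names are assumptions for illustration only.

```r
library(splines)  # ships with R
library(limma)    # Bioconductor

set.seed(42)
time <- rep(c(0, 2, 4, 8, 12, 24), each = 3)  # 6 time points, 3 replicates

# Simulated log-scale matrix: 50 features x 18 samples (toy data).
data <- matrix(
  rnorm(50 * length(time)),
  nrow = 50,
  dimnames = list(paste0("feature_", 1:50), NULL)
)
# Give the first feature a smooth time trend on top of the noise.
data[1, ] <- data[1, ] + 3 * sin(time / 24 * pi)

# Natural cubic spline basis for time; df = 3 is an arbitrary choice here.
X <- ns(time, df = 3)
design <- model.matrix(~ X)  # intercept + 3 spline coefficients

fit <- lmFit(data, design)   # one linear model per feature
fit <- eBayes(fit)           # empirical Bayes moderation of the variances

# Moderated F-test over all spline coefficients:
# does the feature change over time at all?
results <- topTable(fit, coef = 2:4, number = Inf)
head(results)
```

Testing all spline coefficients jointly, rather than one at a time, is what makes the test sensitive to any shape of change over time; the individual coefficients of a spline basis have no direct interpretation on their own.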

🔧 Installation

Follow the steps below to install the SplineOmics package from the GitHub repository into your R environment.

Note: Read the terminal messages of each installation step carefully. Installations can fail due to missing dependencies, which you must then resolve with additional commands that are not necessarily listed here.

Prerequisites

Ensure R is installed on your system. If not, download and install it from CRAN.
RStudio is recommended for a more user-friendly experience with R. Download and install RStudio from posit.co.

Installation Steps

Note for Windows Users:

During the installation on Windows, you might see a message indicating that Rtools is not installed, which is typically required for compiling R packages from source. However, for this installation, Rtools is not necessary, and you can safely ignore this message.

Open RStudio or your R console in a new or existing project folder.
Create a virtual environment with renv

renv::init()

Install BiocManager for Bioconductor dependencies (if not already installed)

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")

Install required Bioconductor packages

BiocManager::install(
  c("ComplexHeatmap", "limma", "variancePartition")
  # force = TRUE   # when encountering issues
)

Install remotes for GitHub package installation

install.packages("remotes")

Install the SplineOmics package from GitHub and all its non-Bioconductor dependencies, using remotes

remotes::install_github(
  "csbg/SplineOmics",   # GitHub repository
  ref = "<tag>",        # Version to install, e.g. v0.4.2 
  dependencies = TRUE,  # Install all dependencies
  upgrade = "always"    # Always upgrade dependencies
  # force = TRUE        # when encountering issues
)

Verify the installation of the SplineOmics package:

pkg <- "SplineOmics"
ver <- "0.4.2"

status <- tryCatch({
  library(pkg, character.only = TRUE)
  packageVersion(pkg) == ver
}, error = function(e) FALSE)

if (status) {
  message(sprintf("%s version %s is installed correctly.", pkg, ver))
} else {
  message(sprintf("%s version %s is NOT installed correctly.", pkg, ver))
}

📌 Note on documentation:
The website only contains the documentation for the most recent SplineOmics version. To get the documentation of any currently installed version, run:

help(package="SplineOmics")

Troubleshooting

If you encounter errors related to dependencies or package versions during installation, try updating your R and RStudio to the latest versions and repeat the installation steps.

For issues specifically related to the SplineOmics package, check the Issues section of the GitHub repository for similar problems, or open a new issue there.

🐳 Docker Container

Alternatively, you can run your analysis in a Docker container. The underlying Docker image encapsulates the SplineOmics package together with the necessary environment and dependencies. This ensures higher levels of reproducibility because the analysis is carried out in a consistent environment, independent of the operating system and its custom configurations.

Please note that you must have the Docker Engine installed on your machine. For instructions on how to install it, consult the official Docker Engine installation guide.

More information about Docker containers can be found on the official Docker page.

For instructions on downloading the image of the SplineOmics package and running the container, please refer to the Docker instructions.

Troubleshooting

If you face “permission denied” issues on Linux distributions, check this vignette.

▶️ Usage

🎓 Tutorial

This tutorial covers a real CHO cell time-series proteomics example from start to end.

📋 Details

A detailed description of all arguments and outputs of all the functions in the package (exported and internal functions) can be found here.

Designing a `limma` design formula

A quick guide on how to write a limma design formula can be found here.

An explanation of the three different limma results is here.
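
As a rough illustration of how a design formula turns into a design matrix, the following sketch uses only base R (`model.matrix()`) and the splines package. The column names `Time`, `Condition`, and `Batch`, the spline basis `X`, and the formula itself are assumptions chosen for this toy example; the linked guide remains the reference for the formulas SplineOmics expects.

```r
library(splines)  # ships with R

# Toy meta table: 2 conditions x 2 batches x 4 time points (hypothetical names).
meta <- data.frame(
  Time      = rep(c(0, 2, 4, 8), times = 4),
  Condition = factor(rep(c("control", "treated"), each = 8)),
  Batch     = factor(rep(rep(c("b1", "b2"), each = 4), times = 2))
)

# Spline basis for time; df = 2 is an arbitrary choice for this small example.
X <- ns(meta$Time, df = 2)

# Condition-specific spline curves plus an additive batch effect.
design <- model.matrix(~ 1 + Condition * X + Batch, data = meta)

colnames(design)      # which coefficients the model estimates
qr(design)$rank       # should equal ncol(design) for an estimable model
```

With R's default treatment contrasts, the main-effect spline columns describe the time course of the reference condition, the condition column describes an offset between conditions, and the interaction columns describe how the shape of the time course differs between conditions.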

🧬 RNA-seq and Glycan Data

RNA-seq data

Transcriptomics data must be preprocessed for limma. You need to provide an appropriate object, such as a voom object, in the rna_seq_data argument of the SplineOmics object (see documentation). Along with this, the normalized matrix (e.g., the $E slot of the voom object) must be passed to the data argument. This allows flexibility in preprocessing; you can use any method you prefer as long as the final object and matrix are compatible with limma. One way to preprocess your RNA-seq data is by using the preprocess_rna_seq_data() function included in the SplineOmics package (see [documentation](https://csbg.github.io/SplineOmics/reference/preprocess_r
