Fantaxtic - Nested Bar Plots for Phyloseq Data

fantaxtic

fantaxtic contains a set of functions to identify and visualize the most abundant taxa in phyloseq objects. It allows users to identify top taxa using any metric and any grouping, and plot the (relative) abundances of the top taxa using a nested bar plot visualisation. In the nested bar plot, colours or fills signify a top taxonomic rank (e.g. Phylum), and a gradient of shades and tints signifies levels at a nested taxonomic rank (e.g. Species). It is particularly useful to present an overview of microbiome sequencing, amplicon sequencing or metabarcoding data.

Note that fantaxtic is essentially a wrapper around ggnested, with some accessory functions to identify top taxa and to ensure that the plot is useful. The output is therefore a ggplot2 object and can be manipulated as such.

Keywords: nested bar plot, phyloseq, taxonomy, most abundant taxa, multiple levels, shades, tints, gradient, 16S, ITS, 18S, microbiome, amplicon sequencing, metabarcoding

Installation

if(!"devtools" %in% installed.packages()){
  install.packages("devtools")
}
devtools::install_github("gmteunisse/fantaxtic")

Basic usage

The workflow consists of two parts:

1. Identify top taxa using either top_taxa or nested_top_taxa
2. Visualise the top taxa using plot_nested_bar

For basic usage, only a few lines of R code are required. To identify and plot the top 10 most abundant ASVs by their mean relative abundance, using Phylum as the top rank and Species as the nested rank, run:

library("fantaxtic")
library("phyloseq")
library("tidyverse")
library("magrittr")
library("ggnested")
library("knitr")
library("gridExtra")

data(GlobalPatterns)
top_asv <- top_taxa(GlobalPatterns, n_taxa = 10)
plot_nested_bar(ps_obj = top_asv$ps_obj,
                top_level = "Phylum",
                nested_level = "Species")

To identify and plot the top 3 most abundant Phyla, and the top 3 most abundant species within those Phyla, run:

top_nested <- nested_top_taxa(GlobalPatterns,
                              top_tax_level = "Phylum",
                              nested_tax_level = "Species",
                              n_top_taxa = 3, 
                              n_nested_taxa = 3)
plot_nested_bar(ps_obj = top_nested$ps_obj,
                top_level = "Phylum",
                nested_level = "Species")
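
Because the output is a ggplot2 object, ordinary ggplot2 layers can be added to it. The following is a minimal sketch, assuming fantaxtic, phyloseq and ggplot2 are installed; the title text and the rotated axis labels are illustrative choices, not package defaults.

```r
library(fantaxtic)
library(phyloseq)
library(ggplot2)

data(GlobalPatterns)
top_asv <- top_taxa(GlobalPatterns, n_taxa = 10)

# Build the nested bar plot as in the basic example above
p <- plot_nested_bar(ps_obj = top_asv$ps_obj,
                     top_level = "Phylum",
                     nested_level = "Species")

# Add a title and rotate the sample labels using standard ggplot2 calls
p_custom <- p +
  labs(title = "Top 10 ASVs in GlobalPatterns") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
p_custom
```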

`top_taxa`

This function identifies the top n taxa by some metric (e.g. mean, median, variance, etc.) in a phyloseq object. It outputs a table with the top taxa, as well as a phyloseq object in which all other taxa have been merged into a single taxon.

Taxonomic rank

By default, top_taxa runs the analysis at the ASV level. If a tax_level is specified (e.g. Species), it first agglomerates the taxa in the phyloseq object at that rank and then runs the analysis. Note that taxonomic agglomeration assumes that taxa with the same name at all ranks are identical, and this includes taxa with missing annotations (NA). By default, top_taxa does not consider taxa with an NA annotation at tax_level, but they can be included by setting include_na_taxa = TRUE.

top_species <- top_taxa(GlobalPatterns,
                        n_taxa = 10, 
                        tax_level = "Species")
top_species$top_taxa %>%
  mutate(abundance = round(abundance, 3)) %>%
  kable(format = "markdown")

| tax_rank | taxid | abundance | Kingdom | Phylum | Class | Order | Family | Genus | Species |
|---------:|:-------|----------:|:---------|:---------------|:--------------------|:------------------|:-------------------|:-----------------|:----------------------------|
| 4 | 326977 | 0.010 | Bacteria | Actinobacteria | Actinobacteria | Bifidobacteriales | Bifidobacteriaceae | Bifidobacterium | Bifidobacteriumadolescentis |
| 9 | 9514 | 0.005 | Bacteria | Proteobacteria | Gammaproteobacteria | Pasteurellales | Pasteurellaceae | Actinobacillus | Actinobacillusporcinus |
| 1 | 94166 | 0.014 | Bacteria | Proteobacteria | Gammaproteobacteria | Pasteurellales | Pasteurellaceae | Haemophilus | Haemophilusparainfluenzae |
| 8 | 469778 | 0.005 | Bacteria | Bacteroidetes | Bacteroidia | Bacteroidales | Bacteroidaceae | Bacteroides | Bacteroidescoprophilus |
| 6 | 471122 | 0.006 | Bacteria | Bacteroidetes | Bacteroidia | Bacteroidales | Prevotellaceae | Prevotella | Prevotellamelaninogenica |
| 10 | 248140 | 0.005 | Bacteria | Bacteroidetes | Bacteroidia | Bacteroidales | Bacteroidaceae | Bacteroides | Bacteroidescaccae |
| 7 | 470973 | 0.005 | Bacteria | Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | Ruminococcus | Ruminococcustorques |
| 3 | 171551 | 0.011 | Bacteria | Firmicutes | Clostridia | Clostridiales | Ruminococcaceae | Faecalibacterium | Faecalibacteriumprausnitzii |
| 2 | 98605 | 0.013 | Bacteria | Firmicutes | Bacilli | Lactobacillales | Streptococcaceae | Streptococcus | Streptococcussanguinis |
| 5 | 114821 | 0.009 | Bacteria | Firmicutes | Clostridia | Clostridiales | Veillonellaceae | Veillonella | Veillonellaparvula |
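
The agglomeration assumption described above can be illustrated with a toy example in base R. This is only a sketch of the idea, not the package's implementation: the taxonomy, the counts and the `key` construction are all hypothetical. ASVs whose names match at every rank, including a shared NA, end up in the same agglomerated taxon.

```r
# Hypothetical taxonomy and counts for four ASVs
tax <- data.frame(
  Phylum  = c("Firmicutes", "Firmicutes", "Firmicutes", "Firmicutes"),
  Genus   = c("Streptococcus", "Streptococcus", "Veillonella", "Veillonella"),
  Species = c("sanguinis", "sanguinis", NA, NA),
  stringsAsFactors = FALSE
)
counts <- c(asv1 = 10, asv2 = 5, asv3 = 7, asv4 = 3)

# One key per ASV built from all ranks; paste() turns NA into the string "NA",
# so ASVs with identical names and the same missing rank share a key
key <- do.call(paste, c(tax, sep = ";"))

# Sum the counts of all ASVs that share a key
agglomerated <- tapply(counts, key, sum)
agglomerated
```

Here the two Veillonella ASVs without a Species annotation are merged into one taxon, which is exactly the behaviour to keep in mind when NA taxa are included.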

Grouping

Furthermore, if one or more grouping factors are specified in grouping, top_taxa calculates the top n taxa using the samples in each group, rather than using all samples in the phyloseq object. This makes it possible, for example, to identify the top taxa in each sample or in each treatment group.

top_grouped <- top_taxa(GlobalPatterns,
                        n_taxa = 1,
                        grouping = "SampleType")
top_grouped$top_taxa %>%
  mutate(abundance = round(abundance, 3)) %>%
  kable(format = "markdown")

| SampleType | tax_rank | taxid | abundance | Kingdom | Phylum | Class | Order | Family | Genus | Species |
|:-------------------|---------:|:-------|----------:|:---------|:---------------|:----------------------|:------------------|:-------------------|:---------------------|:-----------------------|
| Freshwater (creek) | 1 | 549656 | 0.464 | Bacteria | Cyanobacteria | Chloroplast | Stramenopiles | NA | NA | NA |
| Freshwater | 1 | 279599 | 0.216 | Bacteria | Cyanobacteria | Nostocophycideae | Nostocales | Nostocaceae | Dolichospermum | NA |
| Ocean | 1 | 557211 | 0.071 | Bacteria | Cyanobacteria | Synechococcophycideae | Synechococcales | Synechococcaceae | Prochlorococcus | NA |
| Tongue | 1 | 360229 | 0.145 | Bacteria | Proteobacteria | Betaproteobacteria | Neisseriales | Neisseriaceae | Neisseria | NA |
| Mock | 1 | 550960 | 0.117 | Bacteria | Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | Providencia | NA |
| Sediment (estuary) | 1 | 319044 | 0.080 | Bacteria | Proteobacteria | Deltaproteobacteria | Desulfobacterales | Desulfobulbaceae | NA | NA |
| Feces | 1 | 331820 | 0.137 | Bacteria | Bacteroidetes | Bacteroidia | Bacteroidales | Bacteroidaceae | Bacteroides | NA |
| Soil | 1 | 36155 | 0.013 | Bacteria | Acidobacteria | Solibacteres | Solibacterales | Solibacteraceae | CandidatusSolibacter | NA |
| Skin | 1 | 98605 | 0.103 | Bacteria | Firmicutes | Bacilli | Lactobacillales | Streptococcaceae | Streptococcus | Streptococcussanguinis |
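
To make the effect of grouping concrete, the sketch below ranks taxa within each group on a small made-up matrix, using base R only. It is an illustration of the idea under stated assumptions, not how top_taxa is implemented: `rel_abund`, `group` and the taxon names are hypothetical.

```r
# Hypothetical relative abundances: rows are taxa, columns are samples
rel_abund <- matrix(
  c(0.6, 0.5, 0.1, 0.2,
    0.3, 0.3, 0.2, 0.1,
    0.1, 0.2, 0.7, 0.7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxon_a", "taxon_b", "taxon_c"),
                  c("s1", "s2", "s3", "s4"))
)

# Hypothetical grouping factor: which group each sample belongs to
group <- c(s1 = "Feces", s2 = "Feces", s3 = "Soil", s4 = "Soil")

# For each group, average every taxon over that group's samples only,
# then keep the name of the taxon with the highest mean
top_per_group <- sapply(split(colnames(rel_abund), group), function(samples) {
  group_means <- rowMeans(rel_abund[, samples, drop = FALSE])
  names(which.max(group_means))
})
top_per_group
```

Ranking over all four samples at once would instead favour taxon_c everywhere, which is why the grouped and ungrouped results can differ.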

Ranking metric

Lastly, any metric can be used to rank taxa by specifying a function through FUN. The mean is used by default, but depending on your analysis, you might want to use the median, variance, maximum or any other function that takes as input a numeric vector and outputs a single number.

top_max <- top_taxa(GlobalPatterns,
                    n_taxa = 10,
                    FUN = max)
top_max$top_taxa %>%
  mutate(abundance = round(abundance, 3)) %>%
  kable(format = "markdown")
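
The choice of metric can change which taxa come out on top. The base R sketch below shows this on two made-up taxa; `abund` and the helper `rank_by` are hypothetical and exist only for illustration. Any function with the same shape (numeric vector in, single number out) could be passed to FUN in the same way as max above.

```r
# Hypothetical relative abundances: one steady taxon, one that blooms once
abund <- rbind(
  steady = c(0.20, 0.20, 0.20, 0.20),
  bloom  = c(0.00, 0.00, 0.00, 0.90)
)

# Hypothetical helper: summarise each taxon (row) with FUN, then return
# the taxon names ordered from highest to lowest score
rank_by <- function(mat, FUN) {
  scores <- apply(mat, 1, FUN)
  names(sort(scores, decreasing = TRUE))
}

by_mean   <- rank_by(abund, mean)    # the single bloom lifts the mean
by_median <- rank_by(abund, median)  # the median ignores the single bloom
by_mean
by_median
```

The mean puts the blooming taxon first, while the median puts the steady taxon first, so the metric should match the question being asked of the data.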
