SkillAgentSearch skills...

Mvgam

{mvgam} R 📦 to fit Dynamic Bayesian Generalized Additive Models for multivariate modeling and forecasting

Install / Use

/learn @nicholasjclark/Mvgam

README

<!-- README.md is generated from README.Rmd. Please edit that file -->

<img src="man/figures/mvgam_logo.png" width = 120 alt="mvgam R package logo"/><img src="https://raw.githubusercontent.com/stan-dev/logos/master/logo_tm.png" align="right" width=120 alt="Stan Logo"/>

mvgam

MultiVariate (Dynamic) Generalized Additive Models

R-CMD-check Coverage
status Documentation Methods in Ecology &
Evolution CRAN
Version CRAN
Downloads

The mvgam 📦 fits Bayesian Dynamic Generalized Additive Models (DGAMs) that can include highly flexible nonlinear predictor effects, latent variables and multivariate time series models. The package does this by relying on functionalities from the impressive <a href="https://paulbuerkner.com/brms/" target="_blank"><code>brms</code></a> and <a href="https://cran.r-project.org/package=mgcv" target="_blank"><code>mgcv</code></a> packages. Parameters are estimated using the probabilistic programming language Stan, giving users access to the most advanced Bayesian inference algorithms available. This allows mvgam to fit a very wide range of models, including:

  • <a href="https://nicholasjclark.github.io/mvgam/articles/trend_formulas.html" target="_blank">Multivariate State-Space Time Series Models</a>
  • <a href="https://nicholasjclark.github.io/mvgam/reference/RW.html#ref-examples" target="_blank">Continuous-Time Autoregressive Time Series Models</a>
  • <a href="https://nicholasjclark.github.io/mvgam/articles/shared_states.html" target="_blank">Shared Signal Time Series Models</a>
  • <a href="https://nicholasjclark.github.io/mvgam/reference/lv_correlations.html" target="_blank">Dynamic Factor Models</a>
  • <a href="https://nicholasjclark.github.io/mvgam/articles/nmixtures.html" target="_blank">Hierarchical N-mixture Models</a>
  • <a href="https://www.youtube.com/watch?v=2POK_FVwCHk" target="_blank">Hierarchical Generalized Additive Models</a>
  • <a href="https://nicholasjclark.github.io/mvgam/reference/jsdgam.html" target="_blank">Joint Species Distribution Models</a>

Installation

You can install the stable package version from CRAN using: install.packages('mvgam'), or install the latest development version using: devtools::install_github("nicholasjclark/mvgam"). You will also need a working version of Stan installed (along with either rstan and/or cmdstanr). Please refer to installation links for Stan with rstan <a href="https://mc-stan.org/users/interfaces/rstan" target="_blank">here</a>, or for Stan with cmdstandr <a href="https://mc-stan.org/cmdstanr/" target="_blank">here</a>.

Cheatsheet

mvgam usage
cheatsheet

A simple example

We can explore the package’s primary functions using one of it’s built-in datasets. Use plot_mvgam_series() to inspect features for time series from <a href="https://portal.weecology.org/" target="_blank">the Portal Project</a>, which represent counts of baited captures for four desert rodent species over time (see ?portal_data for more details about the dataset).

data(portal_data)
plot_mvgam_series(
  data = portal_data, 
  y = 'captures',
  series = 'all'
)
<img src="man/figures/README-unnamed-chunk-4-1.png" alt="Visualizing multivariate time series in R using mvgam" width="100%" />
plot_mvgam_series(
  data = portal_data, 
  y = 'captures',
  series = 1
)
<img src="man/figures/README-unnamed-chunk-4-2.png" alt="Visualizing multivariate time series in R using mvgam" width="100%" />
plot_mvgam_series(
  data = portal_data, 
  y = 'captures',
  series = 4
)
<img src="man/figures/README-unnamed-chunk-4-3.png" alt="Visualizing multivariate time series in R using mvgam" width="100%" />

These plots show that the time series are count responses, with missing data, many zeroes, seasonality and temporal autocorrelation all present. These features make time series analysis and forecasting very difficult using conventional software. But mvgam shines in these tasks.

For most forecasting exercises, we’ll want to split the data into training and testing folds:

data_train <- portal_data %>%
  dplyr::filter(time <= 60)
data_test <- portal_data %>%
  dplyr::filter(time > 60 &
                  time <= 65)

Formulate an mvgam model; this model fits a State-Space GAM in which each species has its own intercept, linear association with ndvi_ma12 and potentially nonlinear association with mintemp. These effects are estimated jointly with a full time series model for the temporal dynamics (in this case a Vector Autoregressive process). We assume the outcome follows a Poisson distribution and will condition the model in Stan using MCMC sampling with Cmdstan:

mod <- mvgam(
  # Observation model is empty as we don't have any
  # covariates that impact observation error
  formula = captures ~ 0,
  
  # Process model contains varying intercepts, 
  # varying slopes of ndvi_ma12 and varying smooths 
  # of mintemp for each series. 
  # Temporal dynamics are modelled with a Vector 
  # Autoregression (VAR(1))
  trend_formula = ~ 
    trend +
    s(trend, bs = 're', by = ndvi_ma12) +
    s(mintemp, bs = 'bs', by = trend) - 1,
  trend_model = VAR(cor = TRUE),
  
  # Obvservations are conditionally Poisson
  family = poisson(),

  # Condition on the training data
  data = data_train,
  backend = 'cmdstanr'
)

Using print() returns a quick summary of the object:

mod
#> GAM observation formula:
#> captures ~ 1
#> 
#> GAM process formula:
#> ~trend + s(trend, bs = "re", by = ndvi_ma12) + s(mintemp, bs = "bs", 
#>     by = trend) - 1
#> 
#> Family:
#> poisson
#> 
#> Link function:
#> log
#> 
#> Trend model:
#> VAR(cor = TRUE)
#> 
#> 
#> N latent factors:
#> 4 
#> 
#> N series:
#> 4 
#> 
#> N timepoints:
#> 60 
#> 
#> Status:
#> Fitted using Stan 
#> 4 chains, each with iter = 2000; warmup = 1500; thin = 1 
#> Total post-warmup draws = 2000

Split Rhat and Effective Sample Size diagnostics show good convergence of model estimates

mcmc_plot(mod, 
          type = 'rhat_hist')
#> `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
<img src="man/figures/README-unnamed-chunk-9-1.png" alt="Rhats of parameters estimated with Stan in mvgam" width="100%" />
mcmc_plot(mod, 
          type = 'neff_hist')
#> `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
<img src="man/figures/README-unnamed-chunk-10-1.png" alt="Effective sample sizes of parameters estimated with Stan in mvgam" width="100%" />

Use conditional_effects() for a quick visualisation of the main terms in model formulae

conditional_effects(mod, 
                    type = 'link')

<img src="man/figures/README-unnamed-chunk-11-1.png" alt="Plotting GAM effects in mvgam and R" width="100%" /><img src="man/figures/README-unnamed-chunk-11-2.png" alt="Plotting GAM effects in mvgam and R" width="100%" /><img src="man/figures/README-unnamed-chunk-11-3.png" alt="Plotting GAM effects in mvgam and R" width="100%" />

If you have the gratia package installed, it can also be used to plot partial effects of smooths

require(gratia)
draw(mod, 
     trend_effects = TRUE)
<img src="man/figures/README-unnamed-chunk-12-1.png" alt="Plotting GAM smooth functions in mvgam using gratia" width="100%" />

Or design more targeted plots using plot_predictions() from the marginaleffects package

plot_predictions(
  mod,
  condition = c('ndvi_ma12',
                'series',
                'series'),
  type = 'link'
)
<img src="man/figures/README-unnamed-chunk-13-1.png" alt="Using marginaleffects and mvgam to plot GAM smooth functions in R" width="100%" />
plot_predictions(
  mod,
  condition = c('mintemp',
                'series',
                'series'),
  type = 'link'
)
<img src="man/figures/README-unnamed-chunk-14-1.png" alt="Using marginaleffects and mvgam to plot GAM smooth functions in R" width="100%" />

We can also view the model’s posterior predictions for the entire series (testing and training). Forecasts can be scored using a range of proper scoring rules. See ?score.mvgam_forecast for more details

fcs <- forecast(mod, 
                newdata = data_test)
plot(fcs, series = 1) +
  plot(fcs, series = 2) +
  plot(fcs, series = 3) +
  plot(fcs, series = 4)
#> Out of sample DRPS:
#> 8.451467
#> Out of sample DRPS:
#> 5.168817
#> Out of sample DRPS:
#> 8.52922325
#> Out of sample DRPS:
#> 3.60317975
<img src="man/figures/README-unnamed-chunk-15-1.png" alt="Plotting forecast distributions using mvgam in R" width="100%" />

For Vector Autoregressions fit in mvgam, we can inspect <a href="https

View on GitHub
GitHub Stars179
CategoryProduct
Updated16d ago
Forks17

Languages

R

Security Score

85/100

Audited on Mar 12, 2026

No findings