Cvms

R Package: Cross-validate one or multiple gaussian or binomial regression models at once. Perform repeated cross-validation. Returns results in a tibble for easy comparison, reporting and further analysis.

cvms

Cross-Validation for Model Selection
Authors: Ludvig R. Olsen (r-pkgs@ludvigolsen.dk), Hugh Benjamin Zachariae
License: MIT
Started: October 2016

Overview

R package for model evaluation and comparison.

Cross-validate one or multiple regression or classification models with relevant evaluation metrics in a tidy format.
Validate the best model on a test set and compare it to a baseline evaluation.
Perform hyperparameter tuning with grid search.
Evaluate predictions from an external model.
Extract the observations that were the most challenging to predict.
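As a rough illustration of evaluating predictions from an external model, the sketch below passes a small, made-up data frame of targets and predicted probabilities to evaluate(). The column names and values are invented for this example; the default cutoff of 0.5 is assumed.

```r
library(cvms)

# Made-up targets and predicted probabilities (hypothetical values,
# standing in for the output of some external binary classifier)
preds <- data.frame(
  target = c(0, 0, 1, 1, 0, 1),
  probability = c(0.2, 0.6, 0.7, 0.9, 0.1, 0.4)
)

# Evaluate the probabilities as a binary classification task
ev <- evaluate(
  data = preds,
  target_col = "target",
  prediction_cols = "probability",
  type = "binomial"
)

# The result is a one-row tibble with one column per metric
ev[["Balanced Accuracy"]]
```

Because the result is a tibble, evaluations of several models can be stacked with dplyr::bind_rows() for comparison.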

Currently supports regression ('gaussian'), binary classification ('binomial'), and (some functions only) multiclass classification ('multinomial'). Many of the functions allow parallelization, e.g. through the doParallel package.
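A minimal sketch of how parallelization might be set up, assuming the doParallel package is installed: a backend is registered first, and the cross-validation call is then asked to run in parallel. The number of cores, the seed, and the two formulas are arbitrary choices for this illustration.

```r
library(cvms)
library(groupdata2)
library(doParallel)

# Register a parallel backend (2 cores is an arbitrary choice here)
registerDoParallel(cores = 2)

# Create folds in the bundled participant.scores dataset
set.seed(1)
folded <- fold(
  data = participant.scores, k = 4,
  cat_col = "diagnosis",
  id_col = "participant"
)

# Cross-validate two models, distributing the work over the backend
cv_parallel <- cross_validate(
  data = folded,
  formulas = c("score ~ diagnosis", "score ~ age"),
  fold_cols = ".folds",
  family = "gaussian",
  parallel = TRUE
)
```

With only two small models the overhead of parallelization likely outweighs the gain; it tends to pay off with many formulas or repeated cross-validation.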

NEW: Our no-code application for plotting confusion matrices with plot_confusion_matrix() is now available on Hugging Face Spaces.

Main functions

| Function | Description |
|:---|:---|
| cross_validate() | Cross-validate linear models with lm()/lmer()/glm()/glmer() |
| cross_validate_fn() | Cross-validate a custom model function |
| validate() | Validate linear models with lm()/lmer()/glm()/glmer() |
| validate_fn() | Validate a custom model function |
| evaluate() | Evaluate predictions with a large set of metrics |
| baseline(), baseline_gaussian(), baseline_binomial(), baseline_multinomial() | Perform baseline evaluations of a dataset |

Evaluation utilities

| Function | Description |
|:---|:---|
| confusion_matrix() | Create a confusion matrix from predictions and targets |
| evaluate_residuals() | Evaluate residuals from a regression task |
| most_challenging() | Find the observations that were the most challenging to predict |
| summarize_metrics() | Summarize numeric columns with a set of descriptors |
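As a small sketch of confusion_matrix(), the example below uses two made-up vectors of binary targets and predictions. The values are invented, and the regular accuracy metric is switched on explicitly through the metrics argument.

```r
library(cvms)

# Made-up true classes and predicted classes (hypothetical values)
targets <- c(0, 1, 0, 1, 1, 0, 1, 0)
predictions <- c(0, 1, 1, 1, 0, 0, 1, 0)

# Build the confusion matrix and explicitly enable the accuracy metric
cm <- confusion_matrix(
  targets = targets,
  predictions = predictions,
  metrics = list("Accuracy" = TRUE)
)

# The tidy confusion matrix is stored in a list column
cm[["Confusion Matrix"]][[1]]
```

The tidy confusion matrix extracted on the last line is the kind of input that plot_confusion_matrix() works with.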

Formula utilities

| Function | Description |
|:---|:---|
| combine_predictors() | Generate model formulas from a list of predictors |
| reconstruct_formulas() | Extract formulas from output tibble |
| simplify_formula() | Remove inline functions and more from a formula object |
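To illustrate combine_predictors(), the sketch below generates candidate formulas for the score column of participant.scores. The choice of fixed effects and the random intercept term are assumptions made for this example.

```r
library(cvms)

# Generate model formulas from two fixed effects and one random effect
formulas <- combine_predictors(
  dependent = "score",
  fixed_effects = c("diagnosis", "age"),
  random_effects = "(1|participant)"
)

# A character vector of formulas, ready for the formulas argument
# of cross_validate()
formulas
```

The number of generated formulas grows quickly with the number of fixed effects, so it may be worth limiting the candidate predictors before cross-validating all of them.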

Plotting utilities

| Function | Description |
|:---|:---|
| plot_confusion_matrix() | Plot a confusion matrix (see also our no-code application) |
| plot_metric_density() | Create a density plot for a metric column |
| font() | Set font settings for plotting functions (currently only plot_confusion_matrix()) |
| sum_tile_settings() | Set settings for sum tiles in plot_confusion_matrix() |

Custom functions

| Function | Description |
|:---|:---|
| model_functions() | Example model functions for cross_validate_fn() |
| predict_functions() | Example predict functions for cross_validate_fn() |
| preprocess_functions() | Example preprocess functions for cross_validate_fn() |
| update_hyperparameters() | Manage hyperparameters in custom model functions |
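A hedged sketch of what a custom model function and predict function for cross_validate_fn() could look like, here wrapping a plain lm(). The names lm_model_fn and lm_predict_fn are hypothetical, and the argument lists follow what the package documentation describes for such functions; check the help pages before relying on them.

```r
library(cvms)
library(groupdata2)

# Create folds in the bundled participant.scores dataset
set.seed(1)
folded <- fold(
  data = participant.scores, k = 4,
  cat_col = "diagnosis",
  id_col = "participant"
)

# Hypothetical model function: fits a linear model on the training folds
lm_model_fn <- function(train_data, formula, hyperparameters) {
  lm(formula = formula, data = train_data)
}

# Hypothetical predict function: returns predictions for the test fold
lm_predict_fn <- function(test_data, model, formula,
                          hyperparameters, train_data) {
  stats::predict(object = model, newdata = test_data, type = "response")
}

# Cross-validate the custom functions on a regression task
cv_fn <- cross_validate_fn(
  data = folded,
  formulas = "score ~ diagnosis",
  type = "gaussian",
  model_fn = lm_model_fn,
  predict_fn = lm_predict_fn,
  fold_cols = ".folds"
)
```

The example functions returned by model_functions() and predict_functions() follow the same pattern and can serve as starting points.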

Other utilities

| Function | Description |
|:---|:---|
| select_metrics() | Select the metric columns from the output |
| select_definitions() | Select the model-defining columns from the output |
| gaussian_metrics(), binomial_metrics(), multinomial_metrics() | Create list of metrics for the common metrics argument |
| multiclass_probability_tibble() | Generate a multiclass probability tibble |

Datasets

| Name | Description |
|:---|:---|
| participant.scores | Made-up experiment data with 10 participants and two diagnoses |
| wines | A list of wine varieties in an approximately Zipfian distribution |
| musicians | Made-up data on 60 musicians in 4 groups for multiclass classification |
| predicted.musicians | Predictions by 3 classifiers of the 4 classes in the musicians dataset |
| precomputed.formulas | Fixed effect combinations for model formulas with/without two- and three-way interactions |
| compatible.formula.terms | 162,660 pairs of compatible terms for building model formulas with up to 15 fixed effects |


Important News

Check NEWS.md for the full list of changes.

Version 1.2.0 contained multiple breaking changes. Please see NEWS.md. (18th of October 2020)

Installation

CRAN:

install.packages("cvms")

Development version:

install.packages("devtools")

devtools::install_github("LudvigOlsen/groupdata2")

devtools::install_github("LudvigOlsen/cvms")

Vignettes

cvms contains a number of vignettes with relevant use cases and descriptions:

vignette(package = "cvms") # for an overview

Examples

Attach packages

library(cvms)
library(groupdata2)   # fold() partition()
library(knitr)        # kable()
library(dplyr)        # %>% arrange()

Load data

The dataset participant.scores comes with cvms:

data <- participant.scores

Fold data

Create a grouping factor for subsetting of folds using groupdata2::fold(). Order the dataset by the folds:

# Set seed for reproducibility
set.seed(7)

# Fold data 
data <- fold(
  data = data, k = 4,
  cat_col = 'diagnosis',
  id_col = 'participant') %>% 
  arrange(.folds)

# Show first 15 rows of data
data %>% head(15) %>% kable()

| participant | age | diagnosis | score | session | .folds |
|:------------|----:|----------:|------:|--------:|:-------|
| 9 | 34 | 0 | 33 | 1 | 1 |
| 9 | 34 | 0 | 53 | 2 | 1 |
| 9 | 34 | 0 | 66 | 3 | 1 |
| 8 | 21 | 1 | 16 | 1 | 1 |
| 8 | 21 | 1 | 32 | 2 | 1 |
| 8 | 21 | 1 | 44 | 3 | 1 |
| 2 | 23 | 0 | 24 | 1 | 2 |
| 2 | 23 | 0 | 40 | 2 | 2 |
| 2 | 23 | 0 | 67 | 3 | 2 |
| 1 | 20 | 1 | 10 | 1 | 2 |
| 1 | 20 | 1 | 24 | 2 | 2 |
| 1 | 20 | 1 | 45 | 3 | 2 |
| 6 | 31 | 1 | 14 | 1 | 2 |
| 6 | 31 | 1 | 25 | 2 | 2 |
| 6 | 31 | 1 | 30 | 3 | 2 |
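Passing id_col = 'participant' keeps all rows from the same participant in the same fold, which matters here because each participant has several sessions. A quick way to check this, sketched with dplyr (the object names folded and folds_per_participant are just for this example):

```r
library(cvms)
library(groupdata2)
library(dplyr)

# Recreate the folds on the bundled dataset
set.seed(7)
folded <- fold(
  data = participant.scores, k = 4,
  cat_col = "diagnosis",
  id_col = "participant"
)

# Count how many distinct folds each participant appears in
folds_per_participant <- folded %>%
  ungroup() %>%
  group_by(participant) %>%
  summarise(n_folds = n_distinct(.folds))

# Every participant should appear in exactly one fold
all(folds_per_participant$n_folds == 1)
```

If a participant's sessions were spread across folds, the model would be tested on people it had already seen during training, which tends to give overly optimistic results.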

Cross-validate a single model

Gaussian

CV1 <- cross_validate(
  data = data,
  formulas = "score ~ diagnosis",
  fold_cols = '.folds',
  family = 'gaussian',
  REML = FALSE
)

# Show results
CV1
#> # A tibble: 1 × 21
#>   Fixed  RMSE   MAE `NRMSE(IQR)`  RRSE   RAE RMSLE   AIC  AICc   BIC Predictions
#>   <chr> <dbl> <dbl>        <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <list>     
#> 1 diag…  16.4  13.8
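For readers unfamiliar with the first two metric columns, the base R sketch below computes RMSE and MAE by hand for a handful of made-up values. It only illustrates the standard definitions; it does not reproduce how cvms aggregates the metrics across folds.

```r
# Made-up targets and predictions (hypothetical values)
targets <- c(10, 24, 45, 24)
predictions <- c(14, 20, 40, 30)

# Residuals: differences between observed and predicted values
residuals <- targets - predictions

# Root mean square error: penalizes large errors more heavily
rmse <- sqrt(mean(residuals^2))

# Mean absolute error: the average size of the errors
mae <- mean(abs(residuals))

c(RMSE = rmse, MAE = mae)
```

Both metrics are on the scale of the dependent variable, so the values in the output above are expressed in score points.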

