correctR <img src="man/figures/correctR.png" align="right" width="120" />
Corrected test statistics for comparing machine learning models on correlated samples
Installation
You can install the stable version of correctR from CRAN:
install.packages("correctR")
You can install the development version of correctR from GitHub:
devtools::install_github("hendersontrent/correctR")
General purpose
Often in machine learning, we want to compare the performance of
different models to determine if one statistically outperforms another.
However, the methods used (e.g., data resampling, $k$-fold
cross-validation) to obtain these performance metrics (e.g.,
classification accuracy) violate the assumptions of traditional
statistical tests such as a $t$-test. These methods exist either to aid
the generalisability of findings (they produce multiple values for each
model instead of just one, which allows error to be quantified) or to
optimise model hyperparameters. This makes them invaluable, but
incompatible with traditional tests: Dietterich (1998) found that the
standard $t$-test underestimates the variance in this setting, which
inflates the Type I error rate. correctR is a lightweight package that implements a
small number of corrected test statistics for cases when samples are not
independent (and therefore are correlated), such as in the case of
resampling, $k$-fold cross-validation, and repeated $k$-fold
cross-validation. These corrections were all originally proposed by
Nadeau and Bengio
(2003).
Currently, only cases where two models are to be compared are supported.
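To make the idea concrete, here is a minimal base R sketch of the corrected resampled $t$-test from Nadeau and Bengio (2003), in which the usual $1/n$ variance factor is replaced by $1/n + n_2/n_1$ (with $n$ resamples, $n_1$ training observations and $n_2$ test observations). The function name `corrected_resampled_t`, its arguments, and the simulated accuracy values are hypothetical and used for illustration only; this is not the package's own interface or implementation.

```r
# Illustrative sketch only: NOT the correctR API.
# x, y : performance values (e.g., accuracy) for two models over the same n resamples
# n1   : number of observations in each training set
# n2   : number of observations in each test set
corrected_resampled_t <- function(x, y, n1, n2) {
  stopifnot(length(x) == length(y), length(x) > 1)
  d <- x - y            # paired differences, one per resample
  n <- length(d)
  # (1 / n + n2 / n1) replaces the 1 / n of the standard paired t-test,
  # inflating the variance to account for overlapping training sets.
  # Assumes var(d) > 0; identical differences would give a division by zero.
  statistic <- mean(d) / sqrt((1 / n + n2 / n1) * stats::var(d))
  p_value <- 2 * stats::pt(-abs(statistic), df = n - 1)  # two-tailed
  list(statistic = statistic, df = n - 1, p_value = p_value)
}

# Simulated accuracies (made-up values) for two models over 30 resamples
set.seed(123)
x <- stats::rnorm(30, mean = 0.80, sd = 0.02)
y <- stats::rnorm(30, mean = 0.78, sd = 0.02)

corrected <- corrected_resampled_t(x, y, n1 = 80, n2 = 20)
standard  <- stats::t.test(x, y, paired = TRUE)
```

Because the correction factor is always larger than $1/n$, the corrected statistic is smaller in magnitude than the standard paired $t$ statistic on the same data, so the test is more conservative.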
