Rethinking
Statistical Rethinking course and book package
Install / Use
/learn @rmcelreath/RethinkingREADME
rethinking
This R package accompanies a course and book on Bayesian data analysis: McElreath 2020. Statistical Rethinking, 2nd edition, CRC Press. If you are using it with the first edition of the book, please see the notes at the bottom of this file.
It contains tools for conducting both quick quadratic approximation of the posterior distribution as well as Hamiltonian Monte Carlo (through RStan or cmdstanr - mc-stan.org). Many packages do this. The signature difference of this package is that it forces the user to specify the model as a list of explicit distributional assumptions. This is more tedious than typical formula-based tools, but it is also much more flexible and powerful and---most important---useful for teaching and learning. When students have to write out every detail of the model, they actually learn the model.
For example, a simple Gaussian model could be specified with this list of formulas:
f <- alist(
y ~ dnorm( mu , sigma ),
mu ~ dnorm( 0 , 10 ),
sigma ~ dexp( 1 )
)
The first formula in the list is the probability of the outcome (likelihood); the second is the prior for mu; the third is the prior for sigma.
Installation
There are three steps. (1) Install the C++ toolchain, (2) install cmdstanr, (3) install rethinking. Details follow.
First, install the C++ toolchain. Go to https://mc-stan.org/docs/cmdstan-guide/cmdstan-installation.html#cpp-toolchain and follow the instructions for your platform.
Second, install the cmdstanr package. Visit https://mc-stan.org/cmdstanr/. The first time you install cmdstanr, you will also need compile the libraries with cmdstanr::install_cmdstan(). All this of this bother is worth it. You just have to do it once. If you don't want to use MCMC, you don't have to complete this step.
Third, you can install rethinking from within R using:
install.packages(c("coda","mvtnorm","devtools","loo","dagitty","shape"))
devtools::install_github("rmcelreath/rethinking")
Note that the rethinking package is not on CRAN, just on github. The rethinking package is never going to be on CRAN. So if you get an error about rethinking not being available for your version of R, it is because you tried to install from CRAN. Use the github code above.
rethinking slim - no MCMC
If you just want to work through the first half of the course, without bothering with MCMC and Stan installs, you can install the 'slim' version of the rethinking package. Do this:
install.packages(c("coda","mvtnorm","devtools","loo","dagitty"))
devtools::install_github("rmcelreath/rethinking@slim")
The quap function and related helper functions should still work, and you'll be able to work through Chapter 8 before you need to install the full version with Stan.
Quadratic Approximation with quap
Almost any ordinary generalized linear model can be specified with quap. To use quadratic approximation:
library(rethinking)
f <- alist(
y ~ dnorm( mu , sigma ),
mu ~ dnorm( 0 , 10 ),
sigma ~ dexp( 1 )
)
fit <- quap(
f ,
data=list(y=c(-1,1)) ,
start=list(mu=0,sigma=1)
)
The object fit holds the result. For a summary of marginal posterior distributions, use summary(fit) or precis(fit):
mean sd 5.5% 94.5%
mu 0.00 0.59 -0.95 0.95
sigma 0.84 0.33 0.31 1.36
It also supports vectorized parameters, which is convenient for categories. See examples ?quap.
In the first edition of the textbook, this function was called map. It can still be used with that alias. It was renamed, because the name map was misleading. This function produces quadratic approximations of the posterior distribution, not just maximum a posteriori (MAP) estimates.
Hamiltonian Monte Carlo with ulam (and map2stan)
The same formula list can be compiled into a Stan (mc-stan.org) model using one of two tools: ulam or map2stan. For simple models, they are identical. ulam is the newer tool that allows for much more flexibility, including explicit variable types and custom distributions. map2stan is the original tool from the first edition of the package and textbook. Going forward, new features will be added to ulam.
ulam is named after Stanisław Ulam, who was one of the parents of the Monte Carlo method and is the namesake of the Stan project as well. It is pronounced something like [OO-lahm], not like [YOU-lamm].
Both tools take the same kind of input as quap:
fit_stan <- ulam( f , data=list(y=c(-1,1)) )
The chain runs automatically, provided rstan is installed. Chain diagnostics are displayed in the precis(fit_stan) output:
mean sd 5.5% 94.5% n_eff Rhat
sigma 1.45 0.72 0.67 2.84 145 1
mu 0.12 1.04 -1.46 1.59 163 1
For ulam models, plot displays the same information as precis and traceplot displays the chains.
extract.samples returns samples in a list. extract.prior samples from the prior and returns the samples in a list as well.
The stanfit object itself is in the @stanfit slot. Anything you'd do with a Stan model can be done with that slot directly.
The Stan code can be accessed by using stancode(fit_stan):
data{
real y[2];
}
parameters{
real<lower=0> sigma;
real mu;
}
model{
sigma ~ exponential( 1 );
mu ~ normal( 0 , 10 );
y ~ normal( mu , sigma );
}
Note that ulam doesn't care about R distribution names. You can instead use Stan-style names:
fit_stan <- ulam(
alist(
y ~ normal( mu , sigma ),
mu ~ normal( 0 , 10 ),
sigma ~ exponential( 1 )
), data=list(y=c(-1,1)) )
Posterior prediction
All quap, ulam, and map2stan objects can be post-processed to produce posterior predictive distributions.
link is used to compute values of any linear models over samples from the posterior distribution.
sim is used to simulate posterior predictive distributions, simulating outcomes over samples from the posterior distribution of parameters. sim can also be used to simulate prior predictives.
See ?link and ?sim for details.
postcheck automatically computes posterior predictive (retrodictive?) checks. It merely uses link and sim.
Multilevel model formulas
While quap is limited to fixed effects models for the most part, ulam can specify multilevel models, even quite complex ones. For example, a simple varying intercepts model looks like:
# prep data
data( UCBadmit )
UCBadmit$male <- as.integer(UCBadmit$applicant.gender=="male")
UCBadmit$dept <- rep( 1:6 , each=2 )
UCBadmit$applicant.gender <- NULL
# varying intercepts model
m_glmm1 <- ulam(
alist(
admit ~ binomial(applications,p),
logit(p) <- a[dept] + b*male,
a[dept] ~ normal( abar , sigma ),
abar ~ normal( 0 , 4 ),
sigma ~ half_normal(0,1),
b ~ normal(0,1)
), data=UCBadmit )
The analogous varying slopes model is:
m_glmm2 <- ulam(
alist(
admit ~ binomial(applications,p),
logit(p) <- a[dept] + b[dept]*male,
c( a , b )[dept] ~ multi_normal( c(abar,bbar) , Rho , sigma ),
abar ~ normal( 0 , 4 ),
bbar ~ normal(0,1),
sigma ~ half_normal(0,1),
Rho ~ lkjcorr(2)
),
data=UCBadmit )
Another way to express the varying slopes model is with a vector of varying effects. This is made possible by using an explicit vector declaration inside the formula:
m_glmm3 <- ulam(
alist(
admit ~ binomial(applications,p),
logit(p) <- v[dept,1] + v[dept,2]*male,
vector[2]:v[dept] ~ multi_normal( c(abar,bbar) , Rho , sigma ),
abar ~ normal( 0 , 4 ),
bbar ~ normal(0,1),
sigma ~ half_normal(0,1),
Rho ~ lkjcorr(2)
),
data=UCBadmit )
That vector[2]:v[dept] means "declare a vector of length two for each unique dept". To access the elements of these vectors, the linear model uses multiple indexes inside the brackets: [dept,1].
This strategy can be taken one step further and the means can be declared as a vector as well:
m_glmm4 <- ulam(
alist(
admit ~ binomial(applications,p),
logit(p) <- v[dept,1] + v[dept,2]*male,
vector[2]:v[dept] ~ multi_normal( v_mu , Rho , sigma ),
vector[2]:v_mu ~ normal(0,1),
sigma[1] ~ half_normal(0,1),
sigma[2] ~ half_normal(0,2),
Rho ~ lkjcorr(2)
),
data=UCBadmit )
And a completely non-centered parameterization can be coded directly as well:
m_glmm5 <- ulam(
alist(
admit ~ binomial(applications,p),
logit(p) <- v_mu[1] + v[dept,1] + (v_mu[2] + v[dept,2])*male,
matrix[dept,2]: v <- t(diag_pre_multiply( sigma , L_Rho ) * z),
matrix[2,dept]: z ~ normal( 0 , 1 ),
vector[2]: v_mu[[1]] ~ normal(0,4),
vector[2]: v_mu[[2]] ~ normal(0,1),
vector[2]: sigma ~ half_normal(0,1),
cholesky_factor_corr[2]: L_Rho ~ lkj_corr_cholesky( 2 )
),
data=UCBadmit )
In the above, the varying effects matrix v is constructed from a matrix of z-scores z and a covariance structure contained in sigma and a Cholesky factor L_Rho. Note the double-bracket notation v_mu[[1]] allowing distinct priors for each index of a vector.
log-likelihood calculations for WAIC and LOOCV
ulam can optionally return pointwise log-likelihood values. These are needed for computing WAIC and PSIS-LOO. The log_lik argument toggles this on:
m_glmm1 <- ulam(
alist(
admit ~ binomial(applications,p),
logit(p) <- a[dept] + b*male,
a[dept] ~ normal( abar , sigma ),
abar ~ normal( 0 , 4 ),
sigma ~ half_normal(0,1),
b ~ normal(0,1)
), data=UCBadmit , log_lik=TRUE )
WAIC(m_glmm1)
The additional code has been added to the generated quantities block of the Stan model (see this with stancode(m_glmm1)):
generated quantities{
Related Skills
node-connect
341.0kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
84.4kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
341.0kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
commit-push-pr
84.4kCommit, push, and open a PR
