ProbitSUN

Conjugate Bayes for Probit Regression via Unified Skew-Normal Random Variables

This repository is associated with the article Durante (2019). Conjugate Bayes for Probit Regression via Unified Skew-Normal Distributions. The key contribution of this paper is outlined below.

In Bayesian probit regression with Gaussian priors for the coefficients, the posterior is available in closed form and belongs to the class of unified skew-normal random variables. The same holds more generally when the prior is itself a unified skew-normal.
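
To see where this structure comes from, the short derivation below rewrites the probit likelihood as a single multivariate Gaussian cdf. The notation is an assumption made here for illustration (binary responses y_i, covariate vectors x_i stacked in the n × p matrix X, prior mean ξ and covariance Ω) and may differ from the symbols used in the paper.

```latex
% Assumed notation: y_i \in \{0,1\}, x_i \in \mathbb{R}^p, prior \beta \sim N_p(\xi, \Omega),
% and D = \mathrm{diag}(2y_1 - 1, \dots, 2y_n - 1)\, X.
\begin{aligned}
p(y \mid \beta)
  &= \prod_{i=1}^{n} \Phi(x_i^\top \beta)^{y_i}\,\{1 - \Phi(x_i^\top \beta)\}^{1 - y_i}
   = \prod_{i=1}^{n} \Phi\{(2y_i - 1)\, x_i^\top \beta\}
   = \Phi_n(D\beta;\, I_n), \\
p(\beta \mid y)
  &\propto \phi_p(\beta - \xi;\, \Omega)\, \Phi_n(D\beta;\, I_n).
\end{aligned}
```

The second equality uses 1 − Φ(t) = Φ(−t). The posterior kernel is a Gaussian density multiplied by a Gaussian cdf evaluated at a linear function of β, which is the form that characterizes a unified skew-normal density.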

This repository provides code and tutorials to implement the inference methods associated with this result. The focus is on two illustrative applications. One case study is outlined in Section 3 of the paper, whereas the other is meant to provide further insights. More information can be found below.

  • genes_tutorial.md. This tutorial is discussed in Section 3 of the paper and focuses on a large p, small n genomic study available at Cancer SAGE. The goal is to compare Algorithm 1 of the paper, which provides independent and identically distributed samples from the unified skew-normal posterior, with state-of-the-art Markov chain Monte Carlo (MCMC) competitors. These include the data augmentation Gibbs sampler of Albert and Chib (1993) (R package bayesm), the Hamiltonian No-U-Turn Sampler of Hoffman and Gelman (2014) (R package rstan) and the adaptive Metropolis-Hastings algorithm of Haario et al. (2001) (R package LaplacesDemon). This last algorithm is also tuned via expectation-propagation estimates obtained from the R package EPGLM (version 1.1.2), which must be downloaded from the CRAN Archive and installed locally.

  • voice_tutorial.md. This tutorial applies the posterior inference algorithms discussed above to a dataset with lower p and larger n. Specifically, the illustrative application refers to a voice rehabilitation study available at the UCI Machine Learning Repository. As discussed in the article, when p decreases and n increases, the MCMC methods in bayesm, rstan and LaplacesDemon are expected to perform progressively better, whereas Algorithm 1 may become more demanding in computational time. This behavior is partially observed in this tutorial, although Algorithm 1 is still competitive.

In genes_tutorial.md and voice_tutorial.md, the inference performance based on sampling from the posterior is also compared with the exact methods proposed in Section 2.3 of the paper.
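
As a rough illustration of the two routes compared in the tutorials (i.i.d. sampling versus exact expressions), the sketch below treats the simplest possible case: a single coefficient and a single binary observation (p = n = 1), where the unified skew-normal posterior reduces to an extended skew-normal. It is written in Python with the standard library only, not in the R used by the repository, and it is not the repository's implementation of Algorithm 1. The function names, the scalar parameterization (`delta`, `gamma`) and the inverse-cdf draw of the truncated normal are assumptions made for this example.

```python
import math
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard Gaussian: pdf, cdf, inv_cdf


def sun_parameters_scalar(y, x, xi, omega):
    """Hypothetical helper: posterior skewness and truncation parameters for
    one binary response y with covariate x and prior beta ~ N(xi, omega^2)."""
    d = (2 * y - 1) * x                      # signed covariate
    s = math.sqrt(d * d * omega * omega + 1.0)
    delta = omega * d / s                    # always satisfies |delta| < 1
    gamma = d * xi / s
    return delta, gamma


def sample_posterior_scalar(y, x, xi, omega, n_draws, seed=0):
    """Draw i.i.d. posterior samples as beta = xi + omega * (v0 + delta * v1),
    with v0 ~ N(0, 1 - delta^2) and v1 a standard normal truncated to v1 > -gamma."""
    rng = random.Random(seed)
    delta, gamma = sun_parameters_scalar(y, x, xi, omega)
    lower_cdf = STD_NORMAL.cdf(-gamma)       # probability mass cut off by the truncation
    draws = []
    for _ in range(n_draws):
        v0 = rng.gauss(0.0, math.sqrt(1.0 - delta * delta))
        # Inverse-cdf draw of the truncated normal; clamp to keep inv_cdf in (0, 1).
        u = lower_cdf + (1.0 - lower_cdf) * rng.random()
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        v1 = STD_NORMAL.inv_cdf(u)
        draws.append(xi + omega * (v0 + delta * v1))
    return draws


def posterior_mean_scalar(y, x, xi, omega):
    """Closed-form posterior mean in the scalar case (extended skew-normal mean)."""
    delta, gamma = sun_parameters_scalar(y, x, xi, omega)
    return xi + omega * delta * STD_NORMAL.pdf(gamma) / STD_NORMAL.cdf(gamma)


if __name__ == "__main__":
    draws = sample_posterior_scalar(y=1, x=1.5, xi=0.0, omega=1.0, n_draws=20000, seed=1)
    # The Monte Carlo average should be close to the closed-form mean.
    print(fmean(draws), posterior_mean_scalar(1, 1.5, 0.0, 1.0))
```

In the general case the truncated normal component is n-dimensional, which is why the cost of i.i.d. sampling is expected to grow with n, while the Gaussian component has dimension p.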

All the analyses are performed on a MacBook Pro (OS X El Capitan, version 10.11.6), using R version 3.4.1.

IMPORTANT: Although a seed is set at the beginning of each sampling scheme, the final output reported in the tables and figures of genes_tutorial.md and voice_tutorial.md may vary slightly depending on which versions of the R packages (especially bayesm, rstan and LaplacesDemon) are used to run the code. This is due to possible internal changes to certain functions between package versions. However, these variations are negligible in magnitude and do not affect the final conclusions.
