Bayesian active learning with EPIG data acquisition
This repo contains code for two papers:
Prediction-oriented Bayesian active learning (AISTATS 2023)
Freddie Bickford Smith*, Andreas Kirsch*, Sebastian Farquhar, Yarin Gal, Adam Foster, Tom Rainforth
Information-theoretic approaches to active learning have traditionally focused on maximising the information gathered about the model parameters, most commonly by optimising the BALD score. We highlight that this can be suboptimal from the perspective of predictive performance. For example, BALD lacks a notion of an input distribution and so is prone to prioritise data of limited relevance. To address this we propose the expected predictive information gain (EPIG), an acquisition function that measures information gain in the space of predictions rather than parameters. We find that using EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models, and thus provides an appealing drop-in replacement.
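As a rough illustration of the idea (not the repo's actual implementation), EPIG for a classifier can be estimated from K Monte Carlo samples of the model parameters: score each pool input by the mutual information between its prediction and the prediction at a target input drawn from the input distribution, averaged over target inputs. A minimal NumPy sketch, where probs_pool and probs_targ are hypothetical arrays of per-sample predictive probabilities:

```python
import numpy as np

def epig_scores(probs_pool, probs_targ, eps=1e-12):
    """Monte Carlo estimate of EPIG for classification.

    probs_pool: [K, N_pool, C] predictive probs under K parameter samples
    probs_targ: [K, N_targ, C] predictive probs at target inputs
    Returns a [N_pool] array: higher means labelling that input is expected
    to yield more information about predictions at the target inputs.
    """
    K = probs_pool.shape[0]
    # Joint predictive p(y, y_*) = (1/K) sum_k p_k(y | x) p_k(y_* | x_*)
    joint = np.einsum("kpc,ktd->ptcd", probs_pool, probs_targ) / K
    # Marginal predictives p(y | x) and p(y_* | x_*)
    marg_pool = probs_pool.mean(axis=0)  # [N_pool, C]
    marg_targ = probs_targ.mean(axis=0)  # [N_targ, C]
    indep = marg_pool[:, None, :, None] * marg_targ[None, :, None, :]
    # Mutual information between y and y_*, averaged over target inputs
    mi = np.sum(joint * (np.log(joint + eps) - np.log(indep + eps)), axis=(-2, -1))
    return mi.mean(axis=1)
```

With a single parameter sample the joint factorises into the product of marginals and every score is zero, matching the intuition that a model with no parameter uncertainty gains no predictive information from new labels.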
Making better use of unlabelled data in Bayesian active learning (AISTATS 2024)
Freddie Bickford Smith, Adam Foster, Tom Rainforth
Fully supervised models are predominant in Bayesian active learning. We argue that their neglect of the information present in unlabelled data harms not just predictive performance but also decisions about what data to acquire. Our proposed solution is a simple framework for semi-supervised Bayesian active learning. We find it produces better-performing models than either conventional Bayesian active learning or semi-supervised learning with randomly acquired data. It is also easier to scale up than the conventional approach. As well as supporting a shift towards semi-supervised models, our findings highlight the importance of studying models and acquisition methods in conjunction.
Getting set up
Clone the repo and move into it:
git clone https://github.com/fbickfordsmith/epig.git && cd epig
Create an environment using Mamba (or Conda, replacing mamba with conda below) and activate it:
mamba env create --file environment_cuda.yaml && mamba activate epig
Running active learning
Run active learning with the default config:
python main.py
See jobs/ for the commands used to run the active-learning experiments in the papers.
Each of the semi-supervised models we use in the AISTATS 2024 paper comprises an encoder and a prediction head.
Because we use fixed, deterministic encoders, we can compute the encoders' embeddings of all our inputs once up front and save them to disk.
These embeddings just need to be moved into data/ within this repo, and can be obtained from msn-embeddings, simclr-embeddings, ssl-embeddings and vae-embeddings.
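Since the encoders are deterministic, embedding the dataset is a one-off batched pass. A minimal sketch of that workflow (the function, file name and encoder here are hypothetical, not the repo's actual interface):

```python
import numpy as np

def precompute_embeddings(encoder, inputs, batch_size=256):
    """Run a fixed, deterministic encoder over all inputs in batches
    and stack the results into one array of embeddings."""
    chunks = [
        encoder(inputs[i : i + batch_size])
        for i in range(0, len(inputs), batch_size)
    ]
    return np.concatenate(chunks, axis=0)

# Hypothetical usage: embed once, save into data/ for the prediction head
# embeddings = precompute_embeddings(encoder, train_inputs)
# np.save("data/train_embeddings.npy", embeddings)
```

During active learning the prediction head then operates on these cached embeddings, so the encoder never needs to be run again.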
Getting in touch
Contact Freddie if you have any questions about this research or encounter any problems using the code. This repo is a partial release of a bigger internal repo, and it's possible that errors were introduced when preparing this repo for release.
Citing this work
@article{bickfordsmith2023prediction,
author = {{Bickford Smith}, Freddie and Kirsch, Andreas and Farquhar, Sebastian and Gal, Yarin and Foster, Adam and Rainforth, Tom},
year = {2023},
title = {Prediction-oriented {Bayesian} active learning},
journal = {International Conference on Artificial Intelligence and Statistics},
}
@article{bickfordsmith2024making,
author = {{Bickford Smith}, Freddie and Foster, Adam and Rainforth, Tom},
year = {2024},
title = {Making better use of unlabelled data in {Bayesian} active learning},
journal = {International Conference on Artificial Intelligence and Statistics},
}
Contributors
Andreas Kirsch wrote the original versions of the BALD and EPIG functions in this repo, along with the dropout layers, and advised on the code in general. Adam Foster and Joost van Amersfoort advised on the Gaussian-process implementation. Jannik Kossen provided a repo template and advised on the code in general.
Credit for the unsupervised encoders we use in our semi-supervised models goes to the authors of disentangling-vae, lightly, msn and solo-learn, as well as the designers of the pretraining methods used.
