# Learning De-biased Representations with Biased Representations (ICML 2020)

Official PyTorch implementation of ReBias | Paper
Hyojin Bahng<sup>1</sup>, Sanghyuk Chun<sup>2</sup>, Sangdoo Yun<sup>2</sup>, Jaegul Choo<sup>3</sup>, Seong Joon Oh<sup>2</sup>
<sup>1</sup> <sub>Korea University</sub>
<sup>2</sup> <sub>Clova AI Research, NAVER Corp.</sub>
<sup>3</sup> <sub>KAIST</sub>
Many machine learning algorithms are trained and evaluated by splitting data from a single source into training and test sets. While such focus on in-distribution learning scenarios has led to interesting advancement, it has not been able to tell if models are relying on dataset biases as shortcuts for successful prediction (e.g., using snow cues for recognising snowmobiles), resulting in biased models that fail to generalise when the bias shifts to a different class. The cross-bias generalisation problem has been addressed by de-biasing training data through augmentation or re-sampling, which are often prohibitive due to the data collection cost (e.g., collecting images of a snowmobile on a desert) and the difficulty of quantifying or expressing biases in the first place. In this work, we propose a novel framework to train a de-biased representation by encouraging it to be different from a set of representations that are biased by design. This tactic is feasible in many scenarios where it is much easier to define a set of biased representations than to define and quantify bias. We demonstrate the efficacy of our method across a variety of synthetic and real-world biases; our experiments show that the method discourages models from taking bias shortcuts, resulting in improved generalisation.
<a href="http://www.youtube.com/watch?feature=player_embedded&v=lkjMxZDGubA" target="_blank"><img src="http://img.youtube.com/vi/lkjMxZDGubA/0.jpg" alt="VIDEO" width="700" border="10" /></a>
## Updates
- 26 Jun, 2020: Initial upload.
## Summary of code contributions

The repository contains implementations of our method (ReBias) as well as the prior de-biasing methods compared in the paper. Specifically, we provide code for:
- ReBias (ours): Hilbert-Schmidt Independence Criterion (HSIC)-based minimax optimisation. See `criterions/hsic.py`.
- Vanilla and biased architectures. See `models/mnist_models.py`, `models/imagenet_models.py`, and `models/action_models/ResNet3D.py`.
- Learned Mixin ([1] Clark et al., 2019): `criterions/comparison_methods.py`
- RUBi ([2] Cadene et al., 2019): `criterions/comparison_methods.py`

We support training and evaluation of the above methods on three diverse datasets and tasks; see `trainer.py` and `evaluator.py` for the unified framework. Supported datasets and tasks are:
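The HSIC term at the heart of ReBias can be illustrated with a minimal estimator sketch. The actual `criterions/hsic.py` uses a PyTorch implementation operating on minibatch features; the NumPy version below (a biased empirical HSIC estimate) is for intuition only:

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """RBF kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic_biased(K, L):
    """Biased empirical HSIC estimate: tr(K H L H) / (n - 1)^2."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

In the minimax game, f minimises its classification loss plus a weighted HSIC between its features and those of the biased g, while g is trained adversarially to maximise that dependence.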
- Biased MNIST (Section 4.2): `main_biased_mnist.py` and `datasets/colour_mnist.py`
- ImageNet (Section 4.3): `main_imagenet.py`, `datasets/imagenet.py`, and `make_clusters.py`
- Action recognition (Section 4.4): `main_action.py` and `datasets/kinetics.py`
In this implementation, we set Adam as the default optimiser for reproducibility. However, we strongly recommend the AdamP optimiser [3] (enabled with `--optim AdamP`) for future research. We refer interested users to the official AdamP repository.
## Installation

### MNIST and ImageNet experiments

Our implementation is tested with Python 3.7 and the following libraries:
```
fire
munch
torch==1.1.0
torchvision==0.2.2.post3
adamp
```
Install the dependencies with:

```
pip install -r requirements.txt
```
### Action recognition experiments

For the action recognition task, we build the baselines upon the official implementation of SlowFast.

NOTE: We do not handle issues arising from the action recognition experiments.

Please follow the official SlowFast installation instructions: https://github.com/facebookresearch/SlowFast/blob/master/INSTALL.md
## Dataset preparation

### Biased MNIST

Biased MNIST is a colour-biased version of the original MNIST. `datasets/colour_mnist.py` downloads the original MNIST and applies the colour biases itself; no extra preparation is needed on the user side.
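As a rough sketch of the biasing scheme (the actual class, colour palette, and helper names in `datasets/colour_mnist.py` differ; `assign_colour` and `PALETTE` below are hypothetical), each digit is tinted with the colour assigned to its label with probability equal to the train correlation, and with a random other colour otherwise:

```python
import random

# a hypothetical 10-colour palette: one RGB colour per digit class
PALETTE = [(230, 25, 75), (60, 180, 75), (255, 225, 25), (0, 130, 200),
           (245, 130, 48), (145, 30, 180), (70, 240, 240), (240, 50, 230),
           (210, 245, 60), (250, 190, 212)]

def assign_colour(label, correlation, rng=random):
    """Pick the label's bias colour w.p. `correlation`, else a random other colour."""
    if rng.random() < correlation:
        return PALETTE[label]
    return rng.choice([c for i, c in enumerate(PALETTE) if i != label])

def colourise(image, colour):
    """Tint a greyscale image (2D list of 0..255 values) into an RGB image."""
    return [[tuple(v * c // 255 for c in colour) for v in row] for row in image]
```

With `--train_correlation 0.999`, nearly every training digit carries its class colour, so colour alone almost suffices for classification; an unbiased test set draws colours uniformly across classes.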
### ImageNet

We do not provide detailed instructions for collecting the ImageNet (ILSVRC2015) dataset. Please follow the usual practice.
### ImageNet-A and ImageNet-C

To further measure the generalisability of de-biasing methods, we also evaluate on ImageNet-A ([4] Hendrycks et al., 2019) and ImageNet-C ([5] Hendrycks et al., 2019). The datasets are available at https://github.com/hendrycks/natural-adv-examples and https://github.com/hendrycks/robustness, respectively.

NOTE: The ImageNet-C evaluator is implemented separately from this codebase and is not included here. Please refer to [5] for details.
### Kinetics and Mimetics

We use two datasets for action recognition: Kinetics and Mimetics ([6] Weinzaepfel et al., 2019). They are available at:
- Kinetics: https://github.com/facebookresearch/SlowFast/blob/master/slowfast/datasets/DATASET.md
- Mimetics: https://europe.naverlabs.com/research/computer-vision/mimetics/

NOTE: We do not handle issues arising from the action recognition experiments.
## How to run

### Biased MNIST results in Table 1

Main experiments for Biased MNIST are configured in `main_biased_mnist.py`. Note that the main paper reports the average over three runs; the standard deviations are reported in the appendix.

NOTE: We do not provide a HEX [7] implementation because it differs significantly from the other baselines: it requires no biased model but instead contains a pre-defined handcrafted feature extractor named NGLCM. Thus, instead of providing HEX under the unified framework, we implemented it separately. Please refer to the official HEX implementation for details.
#### ReBias (ours)

For better results with AdamP:

```
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.999 --optim AdamP
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.997 --optim AdamP
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.995 --optim AdamP
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.99 --optim AdamP
```
For the original numbers:

```
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.999
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.997
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.995
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.99
```
#### Vanilla & Biased

Setting `f_lambda_outer` and `g_lambda_inner` to 0 trains f and g independently, without minimax optimisation.

```
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.999 \
    --f_lambda_outer 0 \
    --g_lambda_inner 0
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.997 \
    --f_lambda_outer 0 \
    --g_lambda_inner 0
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.995 \
    --f_lambda_outer 0 \
    --g_lambda_inner 0
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.99 \
    --f_lambda_outer 0 \
    --g_lambda_inner 0
```
#### Learned Mixin

In our experiments, we first pretrain the g network for Learned Mixin and then optimise f using the fixed g. Hence, `n_g_pretrain_epochs` and `n_g_update` are set to 5 and 0, respectively.
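Schematically, Learned Mixin combines the main model's logits with the frozen biased model's log-probabilities, scaled by a learned non-negative coefficient. The NumPy sketch below (with a fixed scalar `g_scale` standing in for the learned gate) only illustrates the combination rule; see `criterions/comparison_methods.py` for the actual loss:

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log-softmax over a 1D logit vector."""
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def learned_mixin_loss(f_logits, g_logits, label, g_scale):
    """Cross-entropy on f's logits shifted by the scaled biased log-probabilities."""
    combined = f_logits + g_scale * log_softmax(g_logits)
    return -log_softmax(combined)[label]
```

With `g_scale = 0` this reduces to plain cross-entropy on f; training the gate learns how much of the biased model's prediction to explain away.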
```
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.999 \
    --outer_criterion LearnedMixin \
    --g_lambda_inner 0 \
    --n_g_pretrain_epochs 5 \
    --n_g_update 0
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.997 \
    --outer_criterion LearnedMixin \
    --g_lambda_inner 0 \
    --n_g_pretrain_epochs 5 \
    --n_g_update 0
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.995 \
    --outer_criterion LearnedMixin \
    --g_lambda_inner 0 \
    --n_g_pretrain_epochs 5 \
    --n_g_update 0
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.99 \
    --outer_criterion LearnedMixin \
    --g_lambda_inner 0 \
    --n_g_pretrain_epochs 5 \
    --n_g_update 0
```
#### RUBi

RUBi updates f and g simultaneously but separately. We set `g_lambda_inner` to 0 so that the g network is updated using only the classification loss.
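RUBi's combination rule gates the main model's logits with a sigmoid of the biased model's logits. A minimal NumPy sketch of the idea follows (the real implementation in `criterions/comparison_methods.py` also keeps a separate classification loss for g):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_softmax(z):
    """Numerically stable log-softmax over a 1D logit vector."""
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def rubi_loss(f_logits, g_logits, label):
    """Cross-entropy on f's logits gated elementwise by sigmoid(g's logits)."""
    masked = f_logits * sigmoid(g_logits)
    return -log_softmax(masked)[label]
```

The gate lets examples the biased model already explains contribute less gradient to f, discouraging f from learning the same shortcut.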
```
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.999 \
    --outer_criterion RUBi \
    --g_lambda_inner 0
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.997 \
    --outer_criterion RUBi \
    --g_lambda_inner 0
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.995 \
    --outer_criterion RUBi \
    --g_lambda_inner 0
python main_biased_mnist.py --root /path/to/your/dataset \
    --train_correlation 0.99 \
    --outer_criterion RUBi \
    --g_lambda_inner 0
```
### ImageNet results in Table 2

| Model | Biased (Standard acc) | Unbiased (Texture clustering) | ImageNet-A [4] |
|-----------------------|-----------------------|-------------------------------|----------------|