CSVAL
[MIDL 2023] Official Implementation of "Making Your First Choice: To Address Cold Start Problem in Vision Active Learning"
Cold Start Problem in Vision Active Learning

Active learning promises to improve annotation efficiency by iteratively selecting the most important data to be annotated first. However, we uncover a striking contradiction to this promise: active learning fails to select data as efficiently as random selection at the first few choices. We identify this as the cold start problem in vision active learning. We seek to address the cold start problem by exploiting the three advantages of contrastive learning: (1) no annotation is required; (2) label diversity is ensured by pseudo-labels to mitigate bias; (3) typical data is determined by contrastive features to reduce outliers.
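The selection idea above can be sketched in a few lines: cluster unlabeled contrastive features into pseudo-label groups, then pick the most typical sample (closest to its cluster centroid) from each group, which covers label diversity while avoiding outliers. This is a minimal illustrative sketch, not the repository's actual code; the function name and all parameters are hypothetical.

```python
# Hypothetical sketch of cold-start query selection: k-means pseudo-labels
# on contrastive features, then the most centroid-typical samples per cluster.
import numpy as np

def select_initial_query(features, num_clusters, per_cluster=1, seed=0):
    """Return indices of typical samples spread across pseudo-label clusters."""
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    # L2-normalize features, as is common for contrastive embeddings.
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    # Plain Lloyd's k-means with random initialization.
    centroids = feats[rng.choice(n, num_clusters, replace=False)]
    for _ in range(20):
        dists = np.linalg.norm(feats[:, None] - centroids[None], axis=2)
        assign = dists.argmin(axis=1)
        for k in range(num_clusters):
            members = feats[assign == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    # Final assignment against the converged centroids.
    dists = np.linalg.norm(feats[:, None] - centroids[None], axis=2)
    assign = dists.argmin(axis=1)
    # Typicality: smallest distance to the cluster's own centroid.
    selected = []
    for k in range(num_clusters):
        idx = np.where(assign == k)[0]
        if len(idx) == 0:
            continue
        order = idx[np.argsort(dists[idx, k])]
        selected.extend(order[:per_cluster].tolist())
    return selected
```

With `num_clusters` set near the (unknown) number of classes, the returned indices form a diverse, outlier-free initial query; the paper's pipeline obtains the features from contrastive pretraining rather than the random vectors used here for illustration.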
Paper
This repository provides the official implementation of the following paper:
<b>Making Your First Choice: To Address Cold Start Problem in Vision Active Learning</b> <br/> Liangyu Chen<sup>1</sup>, Yutong Bai<sup>2</sup>, Siyu Huang<sup>3</sup>, Yongyi Lu<sup>2</sup>, Bihan Wen<sup>1</sup>, Alan L. Yuille<sup>2</sup>, and Zongwei Zhou<sup>2</sup> <br/> <sup>1 </sup>Nanyang Technological University, <sup>2 </sup>Johns Hopkins University, <sup>3 </sup>Harvard University <br/> <i>Medical Imaging with Deep Learning (MIDL), 2023</i> <br/> <i>NeurIPS Workshop on Human in the Loop Learning, 2022</i> <br/> paper | code | poster
If you find this repo useful, please consider citing our paper:
@article{chen2022making,
title={Making Your First Choice: To Address Cold Start Problem in Vision Active Learning},
author={Chen, Liangyu and Bai, Yutong and Huang, Siyu and Lu, Yongyi and Wen, Bihan and Yuille, Alan L and Zhou, Zongwei},
journal={arXiv preprint arXiv:2210.02442},
year={2022}
}
Installation
The selection code is built on top of open-mmlab/mmselfsup. Please see the mmselfsup installation guide.
Dataset preparation
All datasets are downloaded automatically by this repo.
MedMNIST can also be downloaded at MedMNIST v2.
CIFAR-10-LT is generated in this repo with a fixed seed.
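For reference, a long-tailed split like CIFAR-10-LT is typically generated by shrinking each class exponentially from head to tail under a fixed seed, so the subset is reproducible. The sketch below follows that common exponential-imbalance convention; the function name and the repo's exact parameters are assumptions, not the repository's actual code.

```python
# Illustrative long-tail subsampling: class 0 keeps its full size and each
# later class shrinks exponentially down to size / imbalance_factor.
import numpy as np

def long_tail_indices(labels, imbalance_factor=100, seed=0):
    """Return indices of a reproducible long-tailed subset of `labels`."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    n_max = max((labels == c).sum() for c in classes)
    keep = []
    for i, c in enumerate(classes):
        # Exponential decay from n_max down to n_max / imbalance_factor.
        n_keep = int(n_max * imbalance_factor ** (-i / (len(classes) - 1)))
        idx = np.where(labels == c)[0]
        keep.extend(rng.choice(idx, n_keep, replace=False).tolist())
    return sorted(keep)
```

With `imbalance_factor=100` on balanced CIFAR-10 (5,000 images per class), the head class keeps all 5,000 images and the tail class keeps 50.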
Pretrain
⚠️ Note: the current implementation is based on MMSelfSup, which is no longer maintained. An updated version of the pretraining and selection code, based on MMPreTrain, is available in v2. This repository is kept as a minimal, compatible implementation : )
Pretrain on all MedMNIST datasets
cd selection
bash tools/medmnist_pretrain.sh
Pretrain on CIFAR-10-LT
bash tools/cifar_pretrain.sh
Select initial query
Select initial queries on all MedMNIST datasets
bash tools/medmnist_postprocess.sh
Select initial queries on CIFAR-10-LT
bash tools/cifar_postprocess.sh
Train with selected initial query
cd training
cd medmnist_active_selection # for training with actively selected initial queries
# cd medmnist_random_selection # for training with randomly selected initial queries
# cd medmnist_uniform_active_selection # for training with class-balanced actively selected initial queries
# cd medmnist_uniform_random_selection # for training with class-balanced randomly selected initial queries
for mnist in bloodmnist breastmnist dermamnist octmnist organamnist organcmnist organsmnist pathmnist pneumoniamnist retinamnist tissuemnist; do bash run.sh $mnist; done