# libcll: Complementary Label Learning Benchmark
<img src="docs/libcll-cover.png" alt="libcll" style="zoom:25%;" />

libcll is a Python library designed to simplify complementary-label learning (CLL) for researchers tackling real-world challenges. The package implements a wide range of popular CLL strategies, including CPE, the state-of-the-art algorithm as of 2023. Additionally, it includes unique datasets like CLCIFAR and ACLCIFAR, which feature complementary labels collected from human annotators and vision-language model (VLM) annotators. To foster extensibility, libcll provides a unified interface for integrating additional strategies, datasets, and models, making it a versatile tool for advancing CLL research. For more details, refer to the associated technical report on arXiv.
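In CLL, each training example is paired with a complementary label: a class the example does *not* belong to. As a quick illustration (plain NumPy, independent of libcll's API), uniform complementary labels can be generated from ordinary labels like this:

```python
import numpy as np

def uniform_complementary_labels(labels, num_classes, rng=None):
    """Sample one complementary label per example, uniformly from the
    num_classes - 1 classes that are NOT the true label."""
    rng = np.random.default_rng(rng)
    labels = np.asarray(labels)
    # Offsets in [1, num_classes - 1] shift the label to a different class.
    offsets = rng.integers(1, num_classes, size=labels.shape)
    return (labels + offsets) % num_classes

true_labels = np.array([0, 3, 7, 9])
cl = uniform_complementary_labels(true_labels, num_classes=10, rng=0)
assert np.all(cl != true_labels)  # never equal to the true label
```

The learner only ever sees `cl`, not `true_labels`; CLL strategies differ in how they recover an ordinary classifier from such weak supervision.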
## Installation

- Python version >= 3.8, <= 3.12
- PyTorch version >= 1.11, <= 2.0
- PyTorch Lightning version >= 2.0

To install `libcll` and develop locally:

```bash
git clone git@github.com:ntucllab/libcll.git
cd libcll
pip install -e .
```
## Running

### Supported Strategies
| Strategy | Type | Description |
| --- | --- | --- |
| PC | None | Pairwise-Comparison Loss |
| SCL | NL, EXP | Surrogate Complementary Loss with the negative log loss (NL) or the exponential loss (EXP) |
| URE | NN, GA, TNN, TGA | Unbiased Risk Estimator, optionally with gradient ascent (GA) and/or an empirical transition matrix (T) |
| FWD | None | Forward Correction |
| DM | None | Discriminative Models with Weighted Loss |
| CPE | I, F, T | Complementary Probability Estimates with different transition matrices (I, F, T) |
| MCL | MAE, EXP, LOG | Multiple Complementary Label learning with different error functions (MAE, EXP, LOG) |
| OP | None | Order-Preserving Loss |
| SCARCE | None | Selected-Completely-At-Random Complementary-label learning |
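For instance, the SCL-NL strategy used throughout the quick start below minimizes -log(1 - p(ȳ|x)), pushing probability mass away from the complementary class ȳ. A minimal NumPy sketch of that loss (illustrative only, not libcll's internal implementation):

```python
import numpy as np

def scl_nl_loss(logits, comp_labels):
    """SCL-NL: mean of -log(1 - p_cl), where p_cl is the softmax
    probability the model assigns to the complementary label."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    p_cl = probs[np.arange(len(comp_labels)), comp_labels]
    return -np.log(1.0 - p_cl + 1e-12).mean()

logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
comp = np.array([1, 2])  # a class each example does NOT belong to
loss = scl_nl_loss(logits, comp)
```

The loss is small when the model already assigns little probability to the complementary class, which is exactly the behavior the supervision signal supports.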
### Supported Datasets
| Dataset | Number of Classes | Input Size | Description |
| --- | --- | --- | --- |
| MNIST | 10 | 28 x 28 | Grayscale images of handwritten digits (0 to 9). |
| FMNIST | 10 | 28 x 28 | Grayscale images of fashion items. |
| KMNIST | 10 | 28 x 28 | Grayscale images of cursive Japanese ("Kuzushiji") characters. |
| Yeast | 10 | 8 | Features of different localization sites of protein. |
| Texture | 11 | 40 | Features of different textures. |
| Dermatology | 6 | 130 | Clinical attributes of different diseases. |
| Control | 6 | 60 | Features of synthetically generated control charts. |
| CIFAR10 | 10 | 3 x 32 x 32 | Colored images of different objects. |
| CIFAR20 | 20 | 3 x 32 x 32 | Colored images of different objects. |
| Micro ImageNet10 | 10 | 3 x 64 x 64 | Images of 10 classes designed for computer vision research. |
| Micro ImageNet20 | 20 | 3 x 64 x 64 | Images of 20 classes designed for computer vision research. |
| CLCIFAR10 | 10 | 3 x 32 x 32 | Colored images of distinct objects with complementary labels annotated by humans. |
| CLCIFAR20 | 20 | 3 x 32 x 32 | Colored images of distinct objects with complementary labels annotated by humans. |
| CLMicro ImageNet10 | 10 | 3 x 64 x 64 | Images of 10 classes with complementary labels annotated by humans. |
| CLMicro ImageNet20 | 20 | 3 x 64 x 64 | Images of 20 classes with complementary labels annotated by humans. |
| ACLCIFAR10 | 10 | 3 x 32 x 32 | Colored images of distinct objects with complementary labels annotated by vision-language models. |
| ACLCIFAR20 | 20 | 3 x 32 x 32 | Colored images of distinct objects with complementary labels annotated by vision-language models. |
| ACLMicro ImageNet10 | 10 | 3 x 64 x 64 | Images of 10 classes with complementary labels annotated by vision-language models. |
| ACLMicro ImageNet20 | 20 | 3 x 64 x 64 | Images of 20 classes with complementary labels annotated by vision-language models. |
### Quick Start: Complementary Label Learning on MNIST
To reproduce training results with the SCL-NL method on MNIST for each distribution:
#### Uniform Distribution

```bash
python scripts/train.py \
    --do_train \
    --do_predict \
    --strategy SCL \
    --type NL \
    --model MLP \
    --dataset MNIST \
    --lr 1e-4 \
    --batch_size 256 \
    --valid_type Accuracy
```
#### Biased Distribution (Weak Deviation)

```bash
python scripts/train.py \
    --do_train \
    --do_predict \
    --strategy SCL \
    --type NL \
    --model MLP \
    --dataset MNIST \
    --lr 1e-4 \
    --batch_size 256 \
    --valid_type Accuracy \
    --transition_matrix weak
```
#### Biased Distribution (Strong Deviation)

```bash
python scripts/train.py \
    --do_train \
    --do_predict \
    --strategy SCL \
    --type NL \
    --model MLP \
    --dataset MNIST \
    --lr 1e-4 \
    --batch_size 256 \
    --valid_type Accuracy \
    --transition_matrix strong
```
#### Noisy Distribution

```bash
python scripts/train.py \
    --do_train \
    --do_predict \
    --strategy SCL \
    --type NL \
    --model MLP \
    --dataset MNIST \
    --lr 1e-4 \
    --batch_size 256 \
    --valid_type Accuracy \
    --transition_matrix noisy \
    --noise 0.1
```
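Conceptually, `--transition_matrix` selects the matrix T with T[y][ȳ] = P(ȳ | y) that governs how complementary labels are generated. A sketch of the uniform case and one plausible reading of the noisy case (the exact matrices and the precise semantics of `--noise` are defined by the repository's scripts; the noisy variant below is an assumption for illustration):

```python
import numpy as np

def uniform_transition(num_classes):
    """Uniform CLL: every non-true class is equally likely
    to be drawn as the complementary label."""
    T = np.full((num_classes, num_classes), 1.0 / (num_classes - 1))
    np.fill_diagonal(T, 0.0)
    return T

def noisy_transition(num_classes, noise=0.1):
    """Illustrative noisy CLL: with probability `noise`, the
    'complementary' label is in fact the true label (assumed reading)."""
    T = (1.0 - noise) * uniform_transition(num_classes)
    np.fill_diagonal(T, noise)
    return T

T = noisy_transition(10, noise=0.1)
assert np.allclose(T.sum(axis=1), 1.0)  # each row is a distribution over labels
```

Strategies such as FWD and CPE consume a transition matrix of exactly this shape to correct the training loss.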
#### Multiple Complementary Label Learning

```bash
python scripts/train.py \
    --do_train \
    --do_predict \
    --strategy SCL \
    --type NL \
    --model MLP \
    --dataset MNIST \
    --lr 1e-4 \
    --batch_size 256 \
    --valid_type Accuracy \
    --num_cl 3
```
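Here `--num_cl 3` attaches three complementary labels to each example. Sampling k distinct complementary labels can be sketched as follows (plain NumPy, not libcll's data loader):

```python
import numpy as np

def sample_multiple_cl(label, num_classes, k, rng=None):
    """Pick k distinct classes, none of which equals the true label."""
    rng = np.random.default_rng(rng)
    candidates = [c for c in range(num_classes) if c != label]
    return rng.choice(candidates, size=k, replace=False)

cl_set = sample_multiple_cl(label=4, num_classes=10, k=3, rng=0)
assert len(set(cl_set.tolist())) == 3 and 4 not in cl_set
```

With more complementary labels per example, each instance rules out more classes, so strategies such as MCL can learn from a strictly more informative signal.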
### Run all the settings in the survey paper

The following scripts reproduce the results for one strategy presented in the survey paper. They include a grid search over learning rates from {1e-3, 5e-4, 1e-4, 5e-5, 1e-5}, followed by training with the best learning rate using four different random seeds.

```bash
./scripts/uniform.sh <strategy> <type>
./scripts/biased.sh <strategy> <type>
./scripts/noisy.sh <strategy> <type>
./scripts/multi.sh <strategy> <type>
./scripts/multi_hard.sh <strategy> <type>
```
For example:

```bash
./scripts/uniform.sh SCL NL
./scripts/biased.sh SCL NL
./scripts/noisy.sh SCL NL
./scripts/multi.sh SCL NL
./scripts/multi_hard.sh SCL NL
```
## Documentation

The documentation for the latest release is available on Read the Docs. Feedback, questions, and suggestions are highly encouraged. Contributions to improve the documentation are warmly welcomed and greatly appreciated!
## Citing
If you find this package useful, please cite both the original works associated with each strategy and the following:
```bibtex
@techreport{libcll2024,
  author      = {Nai-Xuan Ye and Tan-Ha Mai and Hsiu-Hsuan Wang and Wei-I Lin and Hsuan-Tien Lin},
  title       = {libcll: an Extendable Python Toolkit for Complementary-Label Learning},
  institution = {National Taiwan University},
  url         = {https://github.com/ntucllab/libcll},
  note        = {available as arXiv preprint \url{https://arxiv.org/abs/2411.12276}},
  month       = nov,
  year        = 2024
}
```
## Acknowledgment
We would like to express our gratitude to the following repositories for sharing their code, which greatly facilitated the development of libcll:
