Wrench

[NeurIPS 2021] WRENCH: Weak supeRvision bENCHmark

🔧 New

1/25/23

Add Hyper label model, please find more details in our paper.

4/20/22

Add WS explainer, please find more details in our paper.

4/20/22

We have updated the setup.py to make installation more flexible.

Please use pip install ws-benchmark==1.1.2rc0 to install the latest version. We strongly suggest creating a new environment in which to install wrench. We will bring better compatibility in the next stable release. If you have any problems with installation, please let us know.

Known incompatibilities:

tensorflow==2.8.0, albumentations==0.1.12
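
The two pins listed above can be checked in an existing environment before installing. The following is a minimal sketch that uses only the standard-library `importlib.metadata`; the helper names `find_incompatible` and `_installed_version` are hypothetical and are not part of Wrench.

```python
from importlib import metadata

# Pins copied from the "Known incompatibilities" note above.
KNOWN_INCOMPATIBLE = {"tensorflow": "2.8.0", "albumentations": "0.1.12"}


def _installed_version(name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_incompatible(installed_version=_installed_version):
    """Return {package: version} for installed packages matching a known-bad pin.

    `installed_version` is injectable so the check can be exercised without
    touching the real environment (hypothetical helper, not a Wrench API).
    """
    clashes = {}
    for name, bad_version in KNOWN_INCOMPATIBLE.items():
        version = installed_version(name)
        if version == bad_version:
            clashes[name] = version
    return clashes


if __name__ == "__main__":
    print(find_incompatible() or "no known incompatibilities found")
```

An empty result only means that neither of the two listed pins is present; it does not guarantee that every other dependency is compatible.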

3/18/22

Wrench is now available as the ws-benchmark package; use pip install ws-benchmark for a quick install.

2/13/22

Add script to generate LFs for any tabular dataset as well as 5 new tabular datasets, namely, mushroom, spambase, PhishingWebsites, Bioresponse, and bank-marketing.

11/04/21

(beta) Add parallel_fit for torch models to support PyTorch DistributedDataParallel (example)

10/15/21

A bunch of new methods: WeaSEL, ImplyLoss, ASTRA, MeanTeacher, Meta-Weight-Net, Learning-to-Reweight
Support image classification (dataset class / torchvision backbone) as well as DomainNet/Animals-with-Attributes2 datasets (check out the datasets folder)

🔧 What is it?

Wrench is a benchmark platform containing diverse weak supervision tasks. It also provides a common and easy framework for development and evaluation of your own weak supervision models within the benchmark.

For more information, check out our publications:

WRENCH: A Comprehensive Benchmark for Weak Supervision (NeurIPS 2021)
A Survey on Programmatic Weak Supervision

If you find this repository helpful, feel free to cite our publication:

@inproceedings{
zhang2021wrench,
title={{WRENCH}: A Comprehensive Benchmark for Weak Supervision},
author={Jieyu Zhang and Yue Yu and Yinghao Li and Yujing Wang and Yaming Yang and Mao Yang and Alexander Ratner},
booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year={2021},
url={https://openreview.net/forum?id=Q9SKS5k8io}
}

🔧 What is weak supervision?

Weak Supervision is a paradigm for automated training data creation without manual annotations.

For a brief overview, please check out this blog.

For more context, please check out this survey.

To track recent advances in weak supervision, please follow this repo.
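
To make the idea concrete, here is a minimal, generic sketch of programmatic weak supervision: a few hand-written labeling functions (LFs) vote on each example, and a majority vote turns those votes into a training label. The labeling functions, the label constants, and the spam-comment examples are all hypothetical illustrations; this sketch does not use the Wrench API and is not one of the label models benchmarked in Wrench.

```python
from collections import Counter

ABSTAIN = -1  # an LF returns this when it has no opinion
HAM, SPAM = 0, 1


# Hypothetical labeling functions: each one encodes a noisy heuristic.
def lf_contains_link(text):
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_asks_to_subscribe(text):
    return SPAM if "subscribe" in text.lower() else ABSTAIN


def lf_short_comment(text):
    return HAM if len(text.split()) < 5 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_asks_to_subscribe, lf_short_comment]


def apply_lfs(texts):
    """Build the weak-label matrix: one row per example, one column per LF."""
    return [[lf(text) for lf in LABELING_FUNCTIONS] for text in texts]


def majority_vote(votes):
    """Aggregate one row of LF votes; abstain when no LF fires or on a tie."""
    counts = Counter(vote for vote in votes if vote != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    best_count = ranked[0][1]
    winners = [label for label, count in ranked if count == best_count]
    return winners[0] if len(winners) == 1 else ABSTAIN


if __name__ == "__main__":
    comments = [
        "Subscribe to my channel http://example.com now please",
        "great song",
    ]
    for row in apply_lfs(comments):
        print(row, "->", majority_vote(row))
```

Majority voting is the simplest possible aggregation; more sophisticated label models instead estimate how accurate each labeling function is and weight its votes accordingly.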

🔧 Installation

[1] Install anaconda: Instructions here: https://www.anaconda.com/download/

[2] Clone the repository:

git clone https://github.com/JieyuZ2/wrench.git
cd wrench

[3] Create virtual environment:

conda env create -f environment.yml
source activate wrench

If this does not work, or you want to use only a subset of Wrench's modules, check out this wiki page.

[4] Download datasets:

from huggingface_hub import snapshot_download
path = "path to local dir"
snapshot_download(repo_id="jieyuz2/WRENCH", repo_type="dataset", local_dir=path)

🔧 Available Datasets

Note that some datasets may have more training examples than reported in the README/paper because we include the dev set, whose indices can be found in labeled_id.json if it exists.

Documentation of the dataset format and usage can be found on this wiki page.
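
As a rough illustration of the note about labeled_id.json, the sketch below separates the dev-set examples from the rest of the training split. It rests on assumptions that should be checked against the wiki page: that the training split is stored as a JSON object keyed by example id in a file named train.json, and that labeled_id.json holds a JSON list of those ids. The function name `split_train_by_labeled_ids` is hypothetical and not part of Wrench.

```python
import json
from pathlib import Path


def split_train_by_labeled_ids(dataset_dir):
    """Split the training examples into (weakly labeled, dev) dictionaries.

    Assumptions (not confirmed by the README): train.json maps example ids
    to records, and labeled_id.json is a JSON list of ids in the dev set.
    """
    dataset_dir = Path(dataset_dir)
    with open(dataset_dir / "train.json") as f:
        train = json.load(f)

    labeled_path = dataset_dir / "labeled_id.json"
    if not labeled_path.exists():
        # No dev indices shipped with this dataset: everything is training data.
        return train, {}

    with open(labeled_path) as f:
        # JSON object keys are strings, so normalize the ids to strings too.
        labeled_ids = {str(i) for i in json.load(f)}

    dev = {k: v for k, v in train.items() if k in labeled_ids}
    weak = {k: v for k, v in train.items() if k not in labeled_ids}
    return weak, dev
```

Calling `split_train_by_labeled_ids("path to local dir/<dataset name>")` would then return the two subsets, where the directory layout is again an assumption.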

classification:

| Name | Task | # class | # LF | # train | # validation | # test | data source | LF source |
|:--------|:---------|:------|:------|:------|:-------------|:-------|:------|:------|
| Census | income classification | 2 | 83 | 10083 | 5561 | 16281 | link | link |
| Youtube | spam classification | 2 | 10 | 1586 | 120 | 250 | link | link |
| SMS | spam classification | 2 | 73 | 4571 | 500 | 500 | link | link |
| IMDB | sentiment classification | 2 | 8 | 20000 | 2500 | 2500 | link | link |
| Yelp | sentiment classification | 2 | 8 | 30400 | 3800 | 3800 | link | link |
| AGNews | topic classification | 4 | 9 | 96000 | 12000 | 12000 | link | link |
| TREC | question classification | 6 | 68 | 4965 | 500 | 500 | link | link |
| Spouse | relation classification | 2 | 9 | 22254 | 2801 | 2701 | link | link |
| SemEval | relation classification | 9 | 164 | 1749 | 178 | 600 | link | link |
| CDR | bio relation classification | 2 | 33 | 8430 | 920 | 4673 | link | link |
| Chemprot | chemical relation classification | 10 | 26 | 12861 | 1607 | 1607 | link | link |
| Commercial | video frame classification | 2 | 4 | 64130 | 9479 | 7496 | link | link |
| Tennis Rally | video frame classification | 2 | 6 | 6959 | 746 | 1098 | link | link |
| Basketball | video frame classification | 2 | 4 | 17970 | 1064 | 1222 | link | link |
| DomainNet | image classification | - | - | - | - | - | link | link |

sequence tagging:

| Name | # class | # LF | # train | # validation | # test | data source | LF source |
|:--------|:---------|:------|:------|:------|:------|:------|:------|
| CoNLL-03 | 4 | 16 | 14041 | 3250 | 3453 | link | link |
| WikiGold | 4 | 16 | 1355 | 169 | 170 | link | link |
| OntoNotes 5.0 | 18 | 17 | 115812 | 5000 | 22897 | link | link |
| BC5CDR | 2 | 9 | 500 | 500 | 500 | link | link |
| NCBI-Disease | 1 | 5 | 592 | 99 | 99 | link | link |
| Laptop-Review | 1 | 3 | 2436 | 609 | 800 | link | link |
| MIT-Restaurant | 8 | 16 | 7159 | 500 | 1521 | link | link |
| MIT-Movies | 12 | 7 | 9241 | 500 | 2441 | link | link |

The detailed documentation is coming soon.

🔧 Available Models

If you find that any of the implementations is wrong or problematic, don't hesitate to raise an issue or open a pull request. We really appreciate it!

TODO-list: check this out!

classification:
