MBEM
Learning From Noisy Singly-labeled Data
This repository provides an implementation of the MBEM algorithm proposed in the paper Learning From Noisy Singly-labeled Data, published at ICLR 2018.
Model Bootstrapped Expectation Maximization (MBEM) is an algorithm for training a deep learning model on noisy labels collected from crowdsourcing platforms such as Amazon Mechanical Turk. MBEM outperforms the classical crowdsourcing baseline, majority vote. This repo provides code to run MBEM on the CIFAR-10 dataset. Given the true labels, we synthetically generate noisy labels using the hammer-spammer distribution over worker qualities explained in the paper. When the total annotation budget is fixed, we can either collect "1" noisy label for each of "n" training examples, or collect "r" noisy labels for each of "n/r" training examples.
We show empirically that the former is better: when the total annotation budget is fixed, collect "1" noisy label per example for as many training examples as possible. We train a ResNet classifier for CIFAR-10, using the ResNet implementation in MXNet.
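The hammer-spammer model above can be sketched as follows. This is an illustrative sketch, not the repo's code: the function names, the hammer fraction `gamma`, and the sizes are assumptions chosen for the example.

```python
import numpy as np

def hammer_spammer_confusions(m, k, gamma, rng):
    """Return m confusion matrices of shape (k, k). With probability gamma a
    worker is a "hammer" (identity matrix: always labels correctly); otherwise
    a "spammer" (uniform rows: labels at random regardless of the true class)."""
    mats = np.empty((m, k, k))
    for w in range(m):
        mats[w] = np.eye(k) if rng.random() < gamma else np.full((k, k), 1.0 / k)
    return mats

def draw_noisy_labels(true_labels, workers, confusions, rng):
    """Worker workers[i] labels example i by sampling from the row of its
    confusion matrix indexed by example i's true class."""
    k = confusions.shape[-1]
    noisy = np.empty_like(true_labels)
    for i, (y, w) in enumerate(zip(true_labels, workers)):
        noisy[i] = rng.choice(k, p=confusions[w, y])
    return noisy

rng = np.random.default_rng(0)
k, m, n = 10, 100, 1000                  # classes, workers, examples (illustrative)
conf = hammer_spammer_confusions(m, k, gamma=0.3, rng=rng)
y_true = rng.integers(0, k, size=n)
assigned = rng.integers(0, m, size=n)    # one worker per example: redundancy 1
y_noisy = draw_noisy_labels(y_true, assigned, conf, rng)
```

With a hammer fraction of 0.3 and 10 classes, roughly 37% of the noisy labels match the true labels, which is the regime where aggregation algorithms matter.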
To run the code, call "python MBEM.py". The code requires Python 2, Apache MXNet, NumPy, and SciPy. If a GPU is available, change line 34 in MBEM.py from gpus = None to gpus = '0'.
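At a high level, MBEM alternates between training the model on current estimates of the true labels and re-estimating label posteriors from the workers' confusion matrices. A minimal sketch of the posterior step, assuming workers are conditionally independent given the true class (naive-Bayes aggregation); the function name and smoothing constant are illustrative, not the repo's API:

```python
import numpy as np

def label_posterior(responses, confusions, prior):
    """Posterior over the k true classes for one example, given its worker
    responses as (worker_id, noisy_label) pairs. Workers are treated as
    conditionally independent given the true class."""
    log_post = np.log(prior)
    for w, z in responses:
        # confusions[w, c, z] = P(worker w reports z | true class is c)
        log_post = log_post + np.log(confusions[w, :, z] + 1e-12)
    log_post -= log_post.max()           # subtract max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# Toy example: a reliable worker (90% correct) outvotes a spammer on 3 classes.
k = 3
prior = np.full(k, 1.0 / k)
reliable = np.full((k, k), 0.05) + np.eye(k) * 0.85   # 0.9 on the diagonal
spammer = np.full((k, k), 1.0 / k)
confusions = np.stack([reliable, spammer])
post = label_posterior([(0, 2), (1, 0)], confusions, prior)
```

The reliable worker's vote for class 2 dominates, so the posterior concentrates on class 2 even though the spammer voted for class 0.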
Numerical Results on the ImageNet Dataset
The ImageNet-1K dataset contains 1.2M training examples and 50K validation examples. We divide the validation set into two parts: 10K for validation and 40K for test. Each example belongs to one of 1000 possible classes. We implement our algorithms using a ResNet-18 that achieves top-1 accuracy of 69.5% and top-5 accuracy of 89% on ground truth labels. We use m=1000 simulated workers. Although in general a worker can mislabel an example as any of the 1000 possible classes, our simulated workers mislabel an example as only one of 10 possible classes. This captures the intuition that even with a large number of classes, perhaps only a small number are easily confused for each other. Therefore, each worker's confusion matrix is of size 10 x 10. Note that without this assumption, there is little hope of estimating a 1000 x 1000 confusion matrix for each worker by collecting only approximately 1200 noisy labels from that worker. For the rest of the settings, please refer to the Learning From Noisy Singly-labeled Data paper. In the figure below, we fix the total annotation budget at 1.2M labels and vary redundancy from 1 to 9. When redundancy is 9, we have only (1.2/9)M training examples, each labeled by 9 workers. MBEM outperforms the baselines, achieving its minimum generalization error when many training examples are singly annotated.
The x-axis shows the redundancy, the number of labels collected per example; the y-axis shows the generalization error for the three algorithms. Solid lines represent top-5 error; dashed lines represent top-1 error.
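The per-worker confusion matrices used in this aggregation are estimated from the model's own predictions: each worker's observed labels are soft-counted against the model's posterior over true classes. A rough sketch of that estimation step, with illustrative names and a smoothing constant that are assumptions, not the repo's code:

```python
import numpy as np

def estimate_confusions(model_probs, noisy_labels, worker_ids, m, k, eps=1e-6):
    """Soft-count estimate of each worker's k x k confusion matrix.
    model_probs[i] is the model's posterior over the k true classes for
    example i; noisy_labels[i] is the label that worker worker_ids[i] gave it."""
    counts = np.full((m, k, k), eps)   # eps smooths unseen (class, label) pairs
    for p, z, w in zip(model_probs, noisy_labels, worker_ids):
        counts[w, :, z] += p           # credit row c in proportion to P(true = c)
    return counts / counts.sum(axis=2, keepdims=True)

# Toy check: one worker correctly labels two examples the model is sure are class 0.
probs = np.array([[1.0, 0.0], [1.0, 0.0]])
conf = estimate_confusions(probs, np.array([0, 0]), np.array([0, 0]), m=1, k=2)
```

Rows of the true class the worker was seen on converge toward its actual behavior, while unobserved rows stay near uniform, which is why restricting ImageNet workers to 10 x 10 matrices makes estimation feasible from roughly 1200 labels per worker.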
