Multi-digit MNIST for Few-shot Learning
<p align="center"> <img src="asset/examples/multimnist_example.png" width="1024"/> </p>

Cite this repository
@misc{mulitdigitmnist,
author = {Sun, Shao-Hua},
title = {Multi-digit MNIST for Few-shot Learning},
year = {2019},
journal = {GitHub repository},
url = {https://github.com/shaohua0116/MultiDigitMNIST},
}
Papers that use this dataset:
- MetaSDF: Meta-learning Signed Distance Functions (NeurIPS 2020): Paper, Project page, Code
- Regularizing Deep Multi-Task Networks using Orthogonal Gradients: Paper
- GMAIR: Unsupervised Object Detection Based on Spatial Attention and Gaussian Mixture: Paper
- Data-free meta learning via knowledge distillation from multiple teachers: Thesis
Description
Multi-digit MNIST generator creates datasets consisting of handwritten digit images from MNIST for few-shot image classification and meta-learning. It simply samples images from the MNIST dataset and puts digits together to create images with multiple digits. It also creates training/validation/testing splits (64/16/20 classes for DoubleMNIST and 640/160/200 for TripleMNIST).
You can generate customized datasets by following the commands provided in Usage to change the number of images in each class, the image size, etc. You can also download pre-generated datasets from Datasets.
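The core idea of the generator is simple: sample one MNIST image per digit of the class label and paste them side by side on a blank canvas. The sketch below illustrates that idea under stated assumptions (28x28 MNIST crops, uint8 grayscale); it is not the repository's `generator.py`, and `combine_digits` is an illustrative name:

```python
import numpy as np

def combine_digits(digit_images, image_size=(64, 64)):
    """Paste 28x28 MNIST digit crops side by side onto a blank canvas.

    Illustrative sketch only; generator.py in this repository is the
    actual implementation and may place digits differently.
    """
    height, width = image_size
    num_digit = len(digit_images)
    # The README notes the width must be larger than num_digit * mnist_width (28).
    assert width >= num_digit * 28, "image width too small for all digits"
    canvas = np.zeros((height, width), dtype=np.uint8)
    slot = width // num_digit    # horizontal slot per digit
    top = (height - 28) // 2     # vertically center the digits
    for i, digit in enumerate(digit_images):
        left = i * slot + (slot - 28) // 2  # center each digit in its slot
        canvas[top:top + 28, left:left + 28] = digit
    return canvas
```

For example, combining the MNIST crops of a "3" and a "9" would yield one 64x64 image belonging to class "39".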
This repository benchmarks the performance of MAML (Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks) using datasets created via the generation script in a variety of settings.
Some examples of images from the datasets are as follows.
- Double MNIST Datasets (100 classes: `00` to `99`)

| Class | 10 | 48 | 59 | 62 | 73 |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Image | | | | | |
- Triple MNIST Datasets (1000 classes: `000` to `999`)

| Class | 039 | 146 | 258 | 512 | 874 |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Image | | | | | |
Prerequisites
Usage
Generate a DoubleMNIST dataset with 1k images for each class:

```bash
python generator.py --num_image_per_class 1000 --multimnist_path ./dataset/double_mnist --num_digit 2 --image_size 64 64
```

Generate a TripleMNIST dataset with 1k images for each class:

```bash
python generator.py --num_image_per_class 1000 --multimnist_path ./dataset/triple_mnist --num_digit 3 --image_size 84 84
```
Arguments
- `--mnist_path`: the path to the MNIST dataset (downloaded if not found)
- `--multimnist_path`: the path to the output multi-digit MNIST dataset
- `--num_digit`: how many digits in an image
- `--train_val_test_ratio`: how many classes to use for train, val, and test
- `--image_size`: the size of the images; note that the width needs to be larger than `num_digit` * `mnist_width`
- `--num_image_per_class`: how many images for each class
- `--random_seed`: numpy random seed
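To make the role of `--train_val_test_ratio` and `--random_seed` concrete, here is a hedged sketch of how the class labels could be shuffled and split; the function name and exact logic are illustrative assumptions, not the repository's actual code:

```python
import numpy as np

def split_classes(num_digit=2, train_val_test_ratio=(64, 16, 20), random_seed=123):
    """Shuffle all digit-combination classes and split them by ratio.

    Sketch of what --train_val_test_ratio could do; the real split
    logic lives in generator.py and may differ.
    """
    num_classes = 10 ** num_digit  # e.g. 100 classes for 2 digits
    # Zero-padded class labels: '00', '01', ..., '99'.
    labels = [str(i).zfill(num_digit) for i in range(num_classes)]
    rng = np.random.RandomState(random_seed)
    rng.shuffle(labels)
    total = sum(train_val_test_ratio)
    n_train = num_classes * train_val_test_ratio[0] // total
    n_val = num_classes * train_val_test_ratio[1] // total
    train = labels[:n_train]
    val = labels[n_train:n_train + n_val]
    test = labels[n_train + n_val:]
    return train, val, test
```

With the defaults above, a 2-digit dataset yields 64 train, 16 val, and 20 test classes, matching the splits reported in this README.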
Datasets
You can download the generated datasets:

| Dataset | Image size | Train/Val/Test classes | # of images per class | File size | Link |
| :---: | :---: | :---: | :---: | :---: | :---: |
| DoubleMNIST | (64, 64) | 64, 16, 20 | 1000 | 69 MB | Google Drive |
| TripleMNIST | (84, 84) | 640, 160, 200 | 1000 | 883 MB | Google Drive |
Benchmark
The table below reports MAML's few-shot classification accuracy on the generated datasets, with Omniglot results for comparison.

| Dataset/Setup | 5-way 1-shot | 5-way 5-shot | 20-way 1-shot | 20-way 5-shot |
| :---: | :---: | :---: | :---: | :---: |
| Double MNIST | 97.046% | in progress | 85.461% | in progress |
| Triple MNIST | 98.813% | in progress | 96.251% | in progress |
| Omniglot | 98.7% | 99.9% | 95.8% | 98.9% |
Hyperparameters
- slow learning rate: 1e-3
- fast learning rate: 0.4
- number of gradient steps: 1
- meta batch size: 12
- number of conv layers: 4
- iterations: 100k
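As a rough illustration of how these hyperparameters interact, the toy sketch below performs one first-order MAML-style meta-update on a scalar linear model. The real benchmark uses a 4-layer conv net and full second-order gradients may differ; every name here is illustrative and not code from this repository:

```python
import numpy as np

# Hyperparameters listed above (applied to a toy scalar model).
FAST_LR = 0.4    # fast (inner-loop) learning rate
SLOW_LR = 1e-3   # slow (meta) learning rate
NUM_STEPS = 1    # number of inner gradient steps

def loss_and_grad(w, x, y):
    # Squared error for a scalar linear model: pred = w * x.
    pred = w * x
    loss = np.mean((pred - y) ** 2)
    grad = np.mean(2.0 * (pred - y) * x)
    return loss, grad

def maml_meta_step(w, tasks):
    """One meta-update over a batch of (support, query) tasks.

    First-order approximation: the query gradient is taken at the
    adapted weights without differentiating through the inner loop.
    """
    meta_grad = 0.0
    for (xs, ys), (xq, yq) in tasks:
        fast_w = w
        for _ in range(NUM_STEPS):
            _, g = loss_and_grad(fast_w, xs, ys)
            fast_w = fast_w - FAST_LR * g       # fast adaptation on support set
        _, gq = loss_and_grad(fast_w, xq, yq)   # evaluate on the query set
        meta_grad += gq
    return w - SLOW_LR * meta_grad / len(tasks)
```

In the benchmark, each "task" is an N-way classification episode drawn from the class splits, and 100k such meta-updates are performed with a meta batch size of 12.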
Training
<p align="center"> <img src="asset/training.png" width="800"/> </p>

*The training runs have not fully converged; updated results will be reported once they finish.*
Author

Shao-Hua Sun
