Doraemon

A powerful baseline for image classification, face recognition, and image retrieval with PyTorch

<div align="center">DORAEMON: Deep Object Recognition And Embedding Model Of Networks</div>

<p align="center"> <img src="./misc/doraemon.jpg"> </p> <p align="center"> <img src="https://img.shields.io/badge/doraemon-0.0.4a0-brightgreen.svg"> <img src="https://img.shields.io/badge/python-3.10-blue.svg"> <img src="https://img.shields.io/badge/pytorch-2.0+-orange.svg"> <img src="https://img.shields.io/badge/torchmetrics-0.11.4-green.svg"> <img src="https://img.shields.io/badge/timm-0.9.16-red.svg"> <img src="https://img.shields.io/badge/opencv-4.7.0-lightgrey.svg"> <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg"></a> </p>

🚀 Quick Start

<summary><b>Installation Guide</b></summary>
# Create and activate environment
python -m venv doraemon
source doraemon/bin/activate

# Install Doraemon
pip install doraemon-torch

# For development, install in editable mode instead (run from the root of a cloned repository)
pip install -e .
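
To confirm that the install landed in the active environment, one option is to query the package metadata from Python. This is a minimal sketch that relies only on the standard library and on the distribution name `doraemon-torch` from the pip command above; the helper name `installed_version` is illustrative and not part of Doraemon.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "doraemon-torch" is the distribution name used in the pip command above.
    found = installed_version("doraemon-torch")
    print(found or "doraemon-torch is not installed in this environment")
```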

📢 What's New

  • 🎁 2025.11.07: Doraemon paper released; you are welcome to <a href='#citation'>cite our paper</a> if you find the project useful for your research or development.
  • 🎁 2025.03.16: Doraemon v0.1.0 released
  • 🎁 2024.10.01: Content-Based Image Retrieval (CBIR): A product dataset collected from Kaggle and TianChi, with a complete pipeline for training, end-to-end validation, and visualization. See ImageRetrieval.md
  • 🎁 2024.04.01: Face Recognition: Based on a cleaned MS-Celeb-1M-v1c with over 70,000 IDs and 3.6 million images, validated with LFW. Includes loss functions like ArcFace, CircleLoss, and MagFace.
  • 🎁 2023.06.01: Image Classification (IC): Uses the Oxford-IIIT Pet dataset as the example. Supports different learning rates for different layers, hard example mining, multi-label and single-label training, bad case analysis, GradCAM visualization, automatic labeling to aid semi-supervised training, and category-specific data augmentation. Refer to ImageClassification.md
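
For readers new to the margin-based losses named in the face recognition entry above, the core of ArcFace can be sketched in a few lines: the target-class angle gets an additive margin before the logits are scaled. The snippet below is a plain-Python illustration, not Doraemon's implementation; the function name `arcface_logits` and the defaults `scale=64.0` and `margin=0.5` are assumptions taken from commonly used settings, and it omits the handling that production code needs when the angle plus margin exceeds π.

```python
import math
from typing import List


def arcface_logits(cosines: List[float], target: int,
                   scale: float = 64.0, margin: float = 0.5) -> List[float]:
    """Apply an additive angular margin to the target-class cosine.

    cosines[j] is cos(theta_j) between the L2-normalised embedding and the
    L2-normalised weight vector of class j.
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # keep acos in its domain despite rounding
        if j == target:
            # Penalise the true class: cos(theta + m) instead of cos(theta).
            c = math.cos(math.acos(c) + margin)
        logits.append(scale * c)
    return logits


if __name__ == "__main__":
    # The target logit (index 0) shrinks; the other logits are only scaled.
    print(arcface_logits([0.8, 0.3, -0.1], target=0))
```

The resulting logits would then go through an ordinary softmax cross-entropy, so the margin makes the true class harder to satisfy and pushes embeddings of the same identity closer together.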

✨ Highlights

  • Optimization Algorithms: Various optimization techniques to enhance model training efficiency, including SGD, Adam, and SAM (Sharpness-Aware Minimization).

  • Data Augmentation: A variety of data augmentation techniques to improve model robustness, such as CutOut, Color-Jitter, and Copy-Paste.

  • Regularization: Techniques to prevent overfitting and improve model generalization, including Label Smoothing, OHEM, Focal Loss, and Mixup.

  • Visualization: Integrated visualization tool to understand model decision-making, featuring GradCAM.

  • Personalized Data Augmentation: Apply exclusive data augmentation to specific classes with Class-Specific Augmentation.

  • Personalized Hyperparameter Tuning: Apply different learning rates to specific layers using Layer-Specific Learning Rates.
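
As an illustration of the layer-specific learning rates listed above, the usual PyTorch pattern is to hand the optimizer a list of parameter groups, each with its own `lr`. The sketch below builds such groups from name prefixes. It is not Doraemon's configuration format: the helper `build_param_groups` and the prefixes shown are hypothetical, and the code avoids importing PyTorch so that it runs anywhere.

```python
from typing import Any, Dict, Iterable, List, Tuple


def build_param_groups(named_params: Iterable[Tuple[str, Any]],
                       lr_by_prefix: Dict[str, float],
                       default_lr: float) -> List[Dict[str, Any]]:
    """Group parameters by name prefix and attach a learning rate to each group."""
    # Check longer prefixes first so "backbone.layer4" wins over "backbone".
    prefixes = sorted(lr_by_prefix, key=len, reverse=True)
    buckets: Dict[float, List[Any]] = {}
    for name, param in named_params:
        lr = default_lr
        for prefix in prefixes:
            if name.startswith(prefix):
                lr = lr_by_prefix[prefix]
                break
        buckets.setdefault(lr, []).append(param)
    # Each dict has the shape torch.optim optimizers accept as a param group.
    return [{"params": params, "lr": lr} for lr, params in buckets.items()]


if __name__ == "__main__":
    # Strings stand in for tensors; with a real model, pass model.named_parameters().
    demo = [("backbone.conv1.weight", "w1"),
            ("backbone.layer4.weight", "w2"),
            ("head.fc.weight", "w3")]
    print(build_param_groups(demo, {"backbone": 1e-4, "backbone.layer4": 1e-3}, 1e-2))
```

With a real model, the returned list can be passed directly as the first argument of an optimizer such as `torch.optim.SGD(groups, lr=default_lr)`; each group's own `lr` overrides the optimizer-level default.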

🚀 Deployment API

Doraemon offers two deployment options:

  • Local API Inference: Deploy models with just a single weight file (*.pt) - one command setup for high-performance local inference
  • Seamless Hugging Face Integration: Deploy to the Hugging Face ecosystem with full support for:
    • AutoModel.from_pretrained()
    • AutoProcessor.from_pretrained()
    • All other standard Hugging Face APIs

For detailed deployment instructions and ready-to-use examples, see our Deployment Guide.

📚 Tutorials

For detailed guidance on specific tasks, please refer to the following resources:

  • Image Classification: If you are working on image classification tasks, please refer to Doc: Image Classification.

  • Image Retrieval: For image retrieval tasks, please refer to Doc: Image Retrieval.

  • Face Recognition: Stay tuned.

📊 Datasets

Doraemon integrates the following datasets, allowing users to quickly start training:

🧩 Supported Models

Doraemon now supports 1000+ models through integration with timm:

  • All models from timm.list_models(pretrained=True)
  • Including CLIP, SigLIP, DeiT, BEiT, MAE, EVA, DINO and more
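
To browse the model names referred to above, `timm.list_models` accepts a wildcard pattern alongside `pretrained=True`. The sketch below wraps that call; the helper name `find_pretrained` is illustrative, and the import is guarded so the snippet still runs (returning an empty list) where timm is not installed. How Doraemon maps these names into its own configs is not shown here.

```python
from typing import List

try:
    import timm  # optional here; the badges above list timm 0.9.16
except ImportError:
    timm = None


def find_pretrained(pattern: str) -> List[str]:
    """List timm model names with pretrained weights that match a wildcard pattern."""
    if timm is None:
        return []  # timm is not installed in this environment
    return timm.list_models(pattern, pretrained=True)


if __name__ == "__main__":
    # Print a few CLIP-style checkpoints, if any are registered.
    for name in find_pretrained("*clip*")[:5]:
        print(name)
```

Any returned name can be passed to `timm.create_model(name, pretrained=True)` to instantiate the backbone with its weights.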

Model Performance Benchmarks can help you select the most suitable model by comparing:

  • Inference speed
  • Training efficiency
  • Accuracy across different datasets
  • Parameter count vs performance trade-offs
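
Published benchmarks are measured on someone else's hardware, so a quick local check of inference speed can be useful when comparing candidates. The helper below is a generic timing sketch using only the standard library; `median_latency_ms` is a hypothetical name, and the callable it times is whatever forward pass you want to measure.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> float:
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed (caches, lazy initialisation)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


if __name__ == "__main__":
    # Stand-in workload; replace with a closure that runs your model once.
    print(f"{median_latency_ms(lambda: sum(range(100_000))):.3f} ms")
```

When timing a PyTorch model on a GPU, the closure should call `torch.cuda.synchronize()` after the forward pass, since CUDA kernels run asynchronously. The parameter count for the size-versus-accuracy comparison is `sum(p.numel() for p in model.parameters())`.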

For detailed benchmark results, see @huggingface/pytorch-image-models#1933

Citation

<span id='citation'/> If you find **Doraemon** useful for your research or development, please cite the following <a href="https://arxiv.org/abs/2511.04394" target="_blank">paper</a>:
@misc{du2025visual,
      title={DORAEMON: A Unified Library for Visual Object Modeling and Representation Learning at Scale}, 
      author={Ke Du and Yimin Peng and Chao Gao and Fan Zhou and Siqiao Xue},
      year={2025},
      journal={arXiv preprint arXiv:2511.04394},
      url={https://arxiv.org/abs/2511.04394}, 
}