
<!--- SPDX-License-Identifier: Apache-2.0 -->

Deprecation Notice: We sincerely thank the community for participating in the ONNX Model Zoo effort. As the machine learning ecosystem has evolved, much of the novel model sharing has successfully transitioned to Hugging Face, which maintains a vibrant and healthy state. We are preserving the ONNX Model Zoo repository for historical purposes only. Please note that models will no longer be available for LFS download starting July 1st, 2025. You can still get access to the models that were originally available on this repository by going to https://huggingface.co/onnxmodelzoo.

ONNX Model Zoo

Introduction

Welcome to the ONNX Model Zoo! The Open Neural Network Exchange (ONNX) is an open standard format created to represent machine learning models. Supported by a robust community of partners, ONNX defines a common set of operators and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.

This repository is a curated collection of pre-trained, state-of-the-art models in the ONNX format. These models are sourced from prominent open-source repositories and have been contributed by a diverse group of community members. Our aim is to facilitate the spread and usage of machine learning models among a wider audience of developers, researchers, and enthusiasts.

To handle ONNX model files, which can be large, we use Git LFS (Large File Storage).
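As an illustration of working with Git LFS selectively, the sketch below wraps two standard commands from Python: cloning with `GIT_LFS_SKIP_SMUDGE=1` so that large files stay as small pointer files, and `git lfs pull --include=... --exclude=""` to fetch only the files that match a pattern. The function names, `repo_url`, `dest`, and the example path are hypothetical and not part of this repository. Per the deprecation notice above, LFS downloads from this repository are no longer served after July 1st, 2025, so treat this as a sketch of the workflow rather than a working download recipe.

```python
import os
import subprocess


def clone_without_lfs_files(repo_url, dest):
    """Clone a repository but leave LFS-tracked files as pointer files."""
    # GIT_LFS_SKIP_SMUDGE=1 tells Git LFS not to download file contents on checkout.
    env = dict(os.environ, GIT_LFS_SKIP_SMUDGE="1")
    subprocess.run(["git", "clone", repo_url, dest], check=True, env=env)


def lfs_pull_command(include_pattern):
    """Build a `git lfs pull` command that fetches only matching files."""
    # An empty --exclude overrides any exclude filter set in the Git config.
    return ["git", "lfs", "pull", f"--include={include_pattern}", "--exclude="]


def pull_model(repo_dir, include_pattern):
    """Download the LFS content for one model inside an existing clone."""
    subprocess.run(lfs_pull_command(include_pattern), check=True, cwd=repo_dir)


# Hypothetical usage (placeholder URL and path, not real locations):
#   clone_without_lfs_files("<repository URL>", "models")
#   pull_model("models", "path/to/model.onnx")
```

Building the command as a list rather than a shell string avoids quoting problems when a model path contains spaces.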

Models

We are expanding the ONNX Model Zoo with additional models from the categories listed below. Because these new models are still being rigorously validated for accuracy, refer to the Validated Models section for models that have already passed validation. The new categories are:

  • Computer Vision
  • Natural Language Processing (NLP)
  • Generative AI
  • Graph Machine Learning

These models are sourced from prominent open-source repositories such as timm, torchvision, torch_hub, and transformers, and exported into the ONNX format using the open-source TurnkeyML toolchain.

Validated Models

Vision

Language

Other

Read the Usage section below for more details on the file formats in the ONNX Model Zoo (.onnx, .pb, .npz), downloading multiple ONNX models through the Git LFS command line, and starter Python code for validating your ONNX model using test data.
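The following is a minimal sketch of what such a validation could look like; it is not the starter code from the Usage section. It assumes that test data is stored as serialized `TensorProto` files named `input_<i>.pb` and `output_<i>.pb` inside one directory (an assumed layout, so check the model's own folder), that the files are ordered the same way as the model's graph inputs and outputs, and that `onnx`, `numpy`, and `onnxruntime` are installed. The function names and the tolerance values are illustrative choices.

```python
import glob
import os


def list_test_tensors(test_data_dir, prefix):
    """Return files such as input_0.pb, input_1.pb, ... sorted by index."""
    paths = glob.glob(os.path.join(test_data_dir, f"{prefix}_*.pb"))
    # Sort numerically so that input_10.pb comes after input_2.pb.
    return sorted(paths, key=lambda p: int(os.path.basename(p)[len(prefix) + 1:-3]))


def load_tensor(path):
    """Parse one serialized TensorProto (.pb) file into a NumPy array."""
    # Third-party imports are kept local so the helper above works without them.
    import onnx
    from onnx import numpy_helper

    tensor = onnx.TensorProto()
    with open(path, "rb") as f:
        tensor.ParseFromString(f.read())
    return numpy_helper.to_array(tensor)


def validate(model_path, test_data_dir, rtol=1e-3, atol=1e-5):
    """Run the model on the stored inputs and compare with the stored outputs."""
    import numpy as np
    import onnxruntime as ort

    inputs = [load_tensor(p) for p in list_test_tensors(test_data_dir, "input")]
    expected = [load_tensor(p) for p in list_test_tensors(test_data_dir, "output")]

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    # Assumption: the i-th input file corresponds to the i-th graph input.
    feed = {meta.name: value for meta, value in zip(session.get_inputs(), inputs)}
    actual = session.run(None, feed)

    for got, want in zip(actual, expected):
        # Raises AssertionError if any element differs beyond the tolerances.
        np.testing.assert_allclose(got, want, rtol=rtol, atol=atol)
```

A call such as `validate("model.onnx", "test_data_set_0")` (both names are placeholders) returns silently when every output matches within the tolerances.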

INT8 models are generated by Intel® Neural Compressor. Intel® Neural Compressor is an open-source Python library that supports automatic accuracy-driven tuning strategies to help users quickly find the best quantized model. It implements dynamic and static quantization for ONNX models and can represent quantized ONNX models in both operator-oriented and tensor-oriented (QDQ) formats. Users can perform quantization through a web-based UI service or Python code. Read the Introduction for more details.
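To make the idea behind INT8 and QDQ (quantize/dequantize) representations concrete, here is a small sketch of generic affine 8-bit quantization in plain Python. It does not use or reproduce the Intel® Neural Compressor API, whose calibration and tuning are far more involved; the function names and the unsigned 0 to 255 range are assumptions chosen for illustration.

```python
def quantize_params(x_min, x_max, qmin=0, qmax=255):
    """Compute scale and zero point for asymmetric (affine) quantization."""
    # Widen the range to include 0.0 so that zero is represented exactly.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate all-zero range; any positive scale works
    zero_point = int(round(qmin - x_min / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Map floats to integers: q = clip(round(x / scale) + zero_point)."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to approximate floats: x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in q_values]
```

With a range of [-64.0, 63.5], the scale is 0.5 and the zero point is 128, so 0.3 quantizes to 129 and dequantizes back to 0.5. Values outside the range, such as 100.0, saturate at 255.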

Image Classification <a name="image_classification"/>

This collection of models takes images as input and classifies the major objects in them into 1000 object categories such as keyboard, mouse, pencil, and many animals.
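As a rough sketch of how a classifier's output vector is usually interpreted, and of what the "Top-5 error" figures in the table below measure, the code converts raw scores to probabilities and counts a prediction as wrong only when the true label is absent from the k highest-scoring classes. The function names are illustrative, and a real model would produce 1000 scores per image rather than the handful used in the usage note.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(scores, k=5):
    """Return the indices of the k highest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def top_k_error(batch_scores, labels, k=5):
    """Fraction of samples whose true label is not among the top-k predictions."""
    misses = sum(
        1 for scores, label in zip(batch_scores, labels)
        if label not in top_k(scores, k)
    )
    return misses / len(labels)
```

For the scores `[0.1, 0.9, 0.3, 0.7, 0.5, 0.2, 0.8]`, `top_k(scores, 3)` returns `[1, 6, 3]`. Evaluating that same vector against the labels 1 and 0 with `k=3` gives an error of 0.5, because class 0 is not among the top three.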

|Model Class |Reference |Description |Hugging Face Spaces|
|-|-|-|-|
|<b>MobileNet</b>|Sandler et al.|Lightweight deep neural network best suited for mobile and embedded vision applications. <br>Top-5 error from paper - ~10%| |
|<b>ResNet</b>|He et al.|A CNN model (up to 152 layers). Uses shortcut connections to achieve higher accuracy when classifying images. <br> Top-5 error from paper - ~3.6%| Hugging Face Spaces |
|<b>SqueezeNet</b>|Iandola et al.|A lightweight CNN model providing AlexNet-level accuracy with 50x fewer parameters. <br>Top-5 error from paper - ~20%| Hugging Face Spaces |
|<b>VGG</b>|Simonyan et al.|Deep CNN model (up to 19 layers). Similar to AlexNet but uses multiple filters with smaller kernel sizes, which provide more accuracy when classifying images. <br>Top-5 error from paper - ~8%| Hugging Face Spaces |
|<b>AlexNet</b>|Krizhevsky et al.|A deep CNN model (up to 8 layers) where the input is an image and the output is a vector of 1000 numbers. <br> Top-5 error from paper - ~15%| Hugging Face Spaces |
|<b>GoogleNet</b>|Szegedy et al.|Deep CNN model (up to 22 layers). Comparatively smaller and faster than VGG and more accurate in detailing than AlexNet. <br> Top-5 error from paper - ~6.7%| Hugging Face Spaces |
|<b>CaffeNet</b>|Krizhevsky et al.|Deep CNN variation of AlexNet for image classification in Caffe, where the max pooling precedes the local response normalization (LRN) so that the LRN takes less compute and memory.| Hugging Face Spaces |
|<b>RCNN_ILSVRC13</b>|Girshick et al.|Pure Caffe implementation of R-CNN for image classification. This model uses localization of regions to classify and extract features from images.| |
|<b>DenseNet-121</b>|Huang et al.|Model in which every layer is connected to every other layer and passes on its own features, providing strong gradient flow and more diversified features.| Hugging Face Spaces |
|<b>Inception_V1</b>|Szegedy et al.|This model is the same as GoogLeNet, implemented through Caffe2. It has improved utilization of the computing resources inside the network and helps with the vanishing gradient problem. <br> Top-5 error from paper - ~6.7%| Hugging Face Spaces |
|<b>Inception_V2</b>|Szegedy et al.|Deep CNN model for image classification, an adaptation of Inception v1 with batch normalization. This model has reduced computational cost and improved image resolution compared to Inception v1. <br> Top-5 error from paper - ~4.82%| |
|<b>ShuffleNet_V1</b>|Zhang et al.|Extremely computation-efficient CNN model designed specifically for mobile devices. This model greatly reduces the computational cost and provides a ~13x speedup over AlexNet on ARM-based mobile devices. Compared to MobileNet, ShuffleNet achieves superior performance by a significant margin due to its efficient structure. <br> Top-1 error from paper - ~32.6%| |
|<b>ShuffleNet_V2</b>|Zhang et al.|Extremely computation-efficient CNN model designed specifically for mobile devices. This network architecture design considers direct metrics such as speed instead of indirect metrics such as FLOPs. <br> Top-1 error from paper - ~30.6%| |
|<b>ZFNet-512</b>|Zeiler et al.|Deep CNN model (up to 8 layers) that increases the number of features the network can detect, which helps to pick out image features at a finer level of resolution. <br> Top-5 error from paper - ~14.3%| Hugging Face Spaces |
|<b>EfficientNet-Lite4</b>|Tan et al.|CNN model with an order of magnitude fewer computations and parameters, while still achieving state-of-the-art
