<div align="center">  <h2><a href="https://arxiv.org/abs/2211.03295">MogaNet: Multi-order Gated Aggregation Network (ICLR 2024)</a></h2>

Siyuan Li<sup>*,1,2</sup>, Zedong Wang<sup>*,1</sup>, Zicheng Liu<sup>1,2</sup>, Chen Tan<sup>1,2</sup>, Haitao Lin<sup>1,2</sup>, Di Wu<sup>1,2</sup>, Zhiyuan Chen<sup>1</sup>, Jiangbin Zheng<sup>1,2</sup>, Stan Z. Li<sup>†,1</sup>

<sup>1</sup>Westlake University, <sup>2</sup>Zhejiang University

</div> <p align="center"> <a href="https://arxiv.org/abs/2211.03295" alt="arXiv"> <img src="https://img.shields.io/badge/arXiv-2211.03295-b31b1b.svg?style=flat" /></a> <a href="https://github.com/Westlake-AI/MogaNet/blob/main/LICENSE" alt="license"> <img src="https://img.shields.io/badge/license-Apache--2.0-%23B7A800" /></a> <a href="https://colab.research.google.com/github/Westlake-AI/MogaNet/blob/main/demo.ipynb" alt="Colab"> <img src="https://colab.research.google.com/assets/colab-badge.svg" /></a> <a href="https://huggingface.co/MogaNet" alt="Huggingface"> <img src="https://img.shields.io/badge/huggingface-MogaNet-blueviolet" /></a> </p> <p align="center"> <img src="https://user-images.githubusercontent.com/44519745/202308950-00708e25-9ac7-48f0-af12-224d927ac1ae.jpg" width=100% height=100% class="center"> </p>

We propose MogaNet, a new family of efficient ConvNets designed through the lens of multi-order game-theoretic interaction, to pursue informative context mining with a favorable complexity-performance trade-off. It scales well and achieves results competitive with state-of-the-art models while using parameters more efficiently on ImageNet and a range of standard vision benchmarks, including COCO object detection, ADE20K semantic segmentation, 2D and 3D human pose estimation, and video prediction.

This repository contains the PyTorch implementation of MogaNet (ICLR 2024).

<details> <summary>Table of Contents</summary> <ol> <li><a href="#catalog">Catalog</a></li> <li><a href="#image-classification">Image Classification</a></li> <li><a href="#license">License</a></li> <li><a href="#acknowledgement">Acknowledgement</a></li> <li><a href="#citation">Citation</a></li> </ol> </details>

Catalog

We plan to release implementations of MogaNet in a few months. Please watch this repository for the latest release. Currently, this repo is a reimplementation based on our official implementation in OpenMixup, and we are still cleaning up the experimental results and code. Models are released on GitHub / Baidu Cloud / Hugging Face.

- [x] ImageNet-1K Training and Validation Code with timm [code] [models] [Hugging Face 🤗]
- [x] ImageNet-1K Training and Validation Code in OpenMixup / MMPretrain (TODO)
- [x] Downstream Transfer to Object Detection and Instance Segmentation on COCO [code] [models] [demo]
- [x] Downstream Transfer to Semantic Segmentation on ADE20K [code] [models] [demo]
- [x] Downstream Transfer to 2D Human Pose Estimation on COCO [code] (baselines supported) [models] [demo]
- [x] Downstream Transfer to 3D Human Pose Estimation (baselines supported) [code] [models]
- [x] Downstream Transfer to Video Prediction on MMNIST Variants [code] (baselines supported)
- [x] Image Classification on Google Colab and Notebook Demo [demo]

Image Classification

1. Installation

Please check INSTALL.md for installation instructions.

2. Training and Validation

See TRAINING.md for ImageNet-1K training and validation instructions, or refer to our OpenMixup implementations. We have released models pre-trained with OpenMixup in moganet-in1k-weights. We have also reproduced the ImageNet results with this repo and released args.yaml / summary.csv / model.pth.tar in moganet-in1k-weights. The parameters of a trained model can be extracted with code.
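
As a rough illustration of that extraction step, the sketch below pulls the weights out of a timm-style training checkpoint. It assumes the checkpoint is a dictionary that stores weights under a `state_dict` (or `state_dict_ema`) key next to optimizer state and other bookkeeping. The helper names `extract_state_dict` and `convert_file` are hypothetical, and the repository's own extraction code may work differently.

```python
from collections import OrderedDict


def extract_state_dict(checkpoint, use_ema=False):
    """Return only the model weights from a training checkpoint dict.

    Assumption: weights live under 'state_dict' (or 'state_dict_ema'),
    alongside optimizer state and other training bookkeeping.
    """
    key = "state_dict"
    if use_ema and "state_dict_ema" in checkpoint:
        key = "state_dict_ema"
    # Fall back to treating the whole object as a plain state dict.
    weights = checkpoint.get(key, checkpoint)
    cleaned = OrderedDict()
    for name, tensor in weights.items():
        # Drop the 'module.' prefix added by DataParallel-style wrappers.
        if name.startswith("module."):
            name = name[len("module."):]
        cleaned[name] = tensor
    return cleaned


def convert_file(src_path, dst_path, use_ema=False):
    """Load a checkpoint file and save a weights-only copy (needs PyTorch)."""
    import torch  # imported lazily so the helper above stays dependency-free

    # Recent PyTorch versions may additionally need weights_only=False for
    # checkpoints that store non-tensor objects such as training arguments.
    checkpoint = torch.load(src_path, map_location="cpu")
    torch.save(extract_state_dict(checkpoint, use_ema), dst_path)
```

For example, `convert_file("model.pth.tar", "moganet_weights.pth")` would write a weights-only file; the output file name is a placeholder.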

Here is a notebook demo that walks through the steps of running image-classification inference with MogaNet.
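
The last step of such an inference run is turning the model's raw class scores into ranked predictions. The following is a minimal, framework-free sketch of that step, not the notebook's actual code; `softmax`, `top_k_predictions`, and the toy logits are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_predictions(logits, k=5):
    """Return the k most likely (class_index, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


if __name__ == "__main__":
    # Toy scores for a 6-class problem; an ImageNet-1K model emits 1000.
    toy_logits = [0.1, 2.3, -1.0, 4.2, 0.0, 1.7]
    for index, prob in top_k_predictions(toy_logits, k=3):
        print(f"class {index}: {prob:.3f}")
```

In a real pipeline the class indices would then be mapped to ImageNet label names.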

3. ImageNet-1K Trained Models

| Model | Resolution | Params (M) | FLOPs (G) | Top-1 / Top-5 (%) | Script | Download |
|---|:---:|:---:|:---:|:---:|:---:|:---:|
| MogaNet-XT | 224x224 | 2.97 | 0.80 | 76.5 / 93.4 | args / script | model / log |
| MogaNet-XT | 256x256 | 2.97 | 1.04 | 77.2 / 93.8 | args / script | model / log |
| MogaNet-T | 224x224 | 5.20 | 1.10 | 79.0 / 94.6 | args / script | model / log |
| MogaNet-T | 256x256 | 5.20 | 1.44 | 79.6 / 94.9 | args / script | model / log |
| MogaNet-T* | 256x256 | 5.20 | 1.44 | 80.0 / 95.0 | config / script | model / log |
| MogaNet-S | 224x224 | 25.3 | 4.97 | 83.4 / 96.9 | args / script | model / log |
| MogaNet-B | 224x224 | 43.9 | 9.93 | 84.3 / 97.0 | args / script | model / log |
| MogaNet-L | 224x224 | 82.5 | 15.9 | 84.7 / 97.1 | args / script | model / [log](https://github.com/Westlake-AI/MogaNet/releases/download/moganet-in1k-weights/
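
The two resolutions listed for MogaNet-XT and MogaNet-T also show how compute grows with input size. As a rule of thumb (an approximation, not a statement about how the table was measured), the FLOPs of a convolutional network scale with the number of input pixels. The sketch below checks that rule against the table; `scale_flops` is a hypothetical helper.

```python
def scale_flops(flops_g, from_res, to_res):
    """Estimate FLOPs at a new square resolution, assuming cost is
    proportional to the number of input pixels."""
    return flops_g * (to_res / from_res) ** 2


if __name__ == "__main__":
    # 224x224 FLOPs (G) taken from the table above.
    for name, flops_224 in [("MogaNet-XT", 0.80), ("MogaNet-T", 1.10)]:
        estimate = scale_flops(flops_224, 224, 256)
        print(f"{name}: ~{estimate:.2f} G at 256x256")
```

Both estimates (about 1.04 G and 1.44 G) agree with the 256x256 rows in the table.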
