ImageNetModel

Official ImageNet Model repository

Introduction

This repository contains the official implementation of the following papers:

<a href="https://github.com/JDAI-CV/CoTNet/blob/master/README.md">CoTNet</a>: Contextual Transformer Networks for Visual Recognition, TPAMI 2022
 <img src="images/CoTNet_framework.jpg" width="800"/>
<a href="classification">Wave-ViT</a>: Unifying Wavelet and Transformers for Visual Representation Learning, ECCV 2022
 <img src="images/WaveVit_framework.jpg" width="800"/>
<a href="classification">Dual-ViT</a>: Dual Vision Transformer, arXiv preprint arXiv:2207.04976, 2022
 <img src="images/DualVit_framework.jpg" width="800"/>

Getting Started

For image classification, see the classification directory.
For object detection and instance segmentation, see the object_detection directory.
For semantic segmentation, see the semantic_segmentation directory.

Citation

CoTNet

@article{cotnet2022,
  title={Contextual transformer networks for visual recognition},
  author={Li, Yehao and Yao, Ting and Pan, Yingwei and Mei, Tao},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2022},
  publisher={IEEE}
}

Wave-ViT

@inproceedings{wavevit2022,
    title     = {Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning},
    author    = {Yao, Ting and Pan, Yingwei and Li, Yehao and Ngo, Chong-Wah and Mei, Tao},
    booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
    year      = {2022},
}

Dual-ViT

@article{dualvit2022,
  title={Dual Vision Transformer},
  author={Yao, Ting and Li, Yehao and Pan, Yingwei and Wang, Yu and Zhang, Xiao-Ping and Mei, Tao},
  journal={arXiv preprint arXiv:2207.04976},
  year={2022}
}

Acknowledgements

Thanks to the contributions of timm, pvt, and volo.
