ImageNetModel
Official ImageNet Model repository
Introduction
This repository contains the official implementations of the following papers:
- <a href="https://github.com/JDAI-CV/CoTNet/blob/master/README.md">CoTNet</a> Contextual transformer networks for visual recognition, TPAMI 2022
<p align="center"> <img src="images/CoTNet_framework.jpg" width="800"/> </p>

- <a href="classification">Wave-ViT</a> Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning, ECCV 2022
<p align="center"> <img src="images/WaveVit_framework.jpg" width="800"/> </p>

- <a href="classification">Dual-ViT</a> Dual Vision Transformer
<p align="center"> <img src="images/DualVit_framework.jpg" width="800"/> </p>
Getting Started
- For Image Classification, please see classification.
- For Object Detection and Instance Segmentation, please see object_detection.
- For Semantic Segmentation, please see semantic_segmentation.
Citation
CoTNet
@article{cotnet2022,
title={Contextual transformer networks for visual recognition},
author={Li, Yehao and Yao, Ting and Pan, Yingwei and Mei, Tao},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2022},
publisher={IEEE}
}
Wave-ViT
@inproceedings{wavevit2022,
title = {Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning},
author = {Yao, Ting and Pan, Yingwei and Li, Yehao and Ngo, Chong-Wah and Mei, Tao},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2022},
}
Dual-ViT
@article{dualvit2022,
title={Dual Vision Transformer},
author={Yao, Ting and Li, Yehao and Pan, Yingwei and Wang, Yu and Zhang, Xiao-Ping and Mei, Tao},
journal={arXiv preprint arXiv:2207.04976},
year={2022}
}
Acknowledgements
