ViTAE

The official PyTorch implementation of ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias


<h1 align="left">ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias <a href="https://arxiv.org/pdf/2106.03348.pdf"><img src="https://img.shields.io/badge/arXiv-Paper-<COLOR>.svg" ></a> </h1>

By Yufei Xu*, Qiming Zhang*, Jing Zhang, and Dacheng Tao. Accepted at NeurIPS 2021.

The code and pretrained models (ViTAE and ViTAEv2) have been moved to Link. Please try it and have fun!

We also provide code for applying ViTAE to the following tasks:

- Detection
- Segmentation
- Pose Estimation
- Matting
- Remote Sensing

Citing ViTAE and ViTAEv2

@article{xu2021vitae,
  title={ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias},
  author={Xu, Yufei and Zhang, Qiming and Zhang, Jing and Tao, Dacheng},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}
@article{zhang2022vitaev2,
  title={ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond},
  author={Zhang, Qiming and Xu, Yufei and Zhang, Jing and Tao, Dacheng},
  journal={arXiv preprint arXiv:2202.10108},
  year={2022}
}
