SPViT
[TPAMI 2024] This is the official repository for our paper: "Pruning Self-attentions into Convolutional Layers in Single Path".
This is the official repository for our paper: Pruning Self-attentions into Convolutional Layers in Single Path by Haoyu He, Jianfei Cai, Jing Liu, Zizheng Pan, Jing Zhang, Dacheng Tao and Bohan Zhuang.
<h3><strong><i>🚀 News</i></strong></h3>

- [2023-12-29]: Accepted by TPAMI!
- [2023-06-09]: Update distillation configurations and pre-trained checkpoints.
- [2021-12-04]: Release pre-trained models.
- [2021-11-25]: Release code.
Introduction:
To reduce the massive computational cost of ViTs and add a convolutional inductive bias, our SPViT prunes pre-trained ViT models into accurate and compact hybrid models by pruning self-attentions into convolutional layers. Thanks to the proposed weight-sharing scheme between self-attention and convolutional layers, which casts the search problem as finding which subset of parameters to use, SPViT significantly reduces the search cost.
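As a rough intuition for why a convolution can reuse a subset of self-attention parameters, the toy sketch below checks a 1D case: if each head attends to exactly one fixed relative position, the summed head outputs equal a convolution whose kernel taps are built from that head's value and output projections. This is only an illustration under simplifying assumptions (hard one-hot attention instead of softmax, zero padding at the borders, 1D tokens), not the code in this repository, and every name in it (`attention_path`, `conv_path`, `w_v`, `w_o`, `offsets`) is hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def one_hot_attention(n, offset):
    """n x n map where token i attends only to token i + offset.

    Rows whose target falls outside the sequence stay all-zero,
    which plays the role of zero padding (an assumption of this toy).
    """
    return [[1.0 if j == i + offset else 0.0 for j in range(n)] for i in range(n)]

def attention_path(x, w_v, w_o, offsets):
    """Sum over heads of A_h @ (X @ W_v[h]) @ W_o[h]."""
    n, d = len(x), len(x[0])
    out = [[0.0] * d for _ in range(n)]
    for h, offset in enumerate(offsets):
        attn = one_hot_attention(n, offset)
        head = matmul(matmul(attn, matmul(x, w_v[h])), w_o[h])
        out = [[a + b for a, b in zip(ro, rh)] for ro, rh in zip(out, head)]
    return out

def conv_path(x, w_v, w_o, offsets):
    """1D convolution whose tap for each offset is W_v[h] @ W_o[h]."""
    n, d = len(x), len(x[0])
    kernel = [matmul(w_v[h], w_o[h]) for h in range(len(offsets))]
    out = [[0.0] * d for _ in range(n)]
    for i in range(n):
        for k, offset in enumerate(offsets):
            j = i + offset
            if 0 <= j < n:  # skip out-of-range taps (zero padding)
                tap = matmul([x[j]], kernel[k])[0]
                out[i] = [a + b for a, b in zip(out[i], tap)]
    return out

rng = random.Random(0)
n_tokens, dim, head_dim = 6, 4, 2
offsets = [-1, 0, 1]  # one head per kernel position of a width-3 convolution
x = rand_matrix(n_tokens, dim, rng)
w_v = [rand_matrix(dim, head_dim, rng) for _ in offsets]
w_o = [rand_matrix(head_dim, dim, rng) for _ in offsets]

attn_out = attention_path(x, w_v, w_o, offsets)
conv_out = conv_path(x, w_v, w_o, offsets)
max_diff = max(abs(a - b) for ra, rb in zip(attn_out, conv_out) for a, b in zip(ra, rb))
print("paths agree:", max_diff < 1e-9)
```

Both paths use the same `w_v` and `w_o`, so in this toy setting choosing the convolution amounts to choosing which parameters and attention positions stay active. How SPViT actually shares and selects weights is described in the paper and in the per-model READMEs.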
Experimental results:
We provide experimental results and pre-trained models for SPViT:
| Name | Acc@1 | Acc@5 | # parameters | FLOPs | Model |
| :------------ | :---: | :---: | ------------ | ----- | ----- |
| SPViT-DeiT-Ti | 70.7 | 90.3 | 4.9M | 1.0G | Model |
| SPViT-DeiT-Ti* | 73.2 | 91.4 | 4.9M | 1.0G | Model |
| SPViT-DeiT-S | 78.3 | 94.3 | 16.4M | 3.3G | Model |
| SPViT-DeiT-S* | 80.3 | 95.1 | 16.4M | 3.3G | Model |
| SPViT-DeiT-B | 81.5 | 95.7 | 46.2M | 8.3G | Model |
| SPViT-DeiT-B* | 82.4 | 96.1 | 46.2M | 8.3G | Model |

| Name | Acc@1 | Acc@5 | # parameters | FLOPs | Model |
| :------------ | :---: | :---: | ------------ | ----- | ----- |
| SPViT-Swin-Ti | 80.1 | 94.9 | 26.3M | 3.3G | Model |
| SPViT-Swin-Ti* | 81.0 | 95.3 | 26.3M | 3.3G | Model |
| SPViT-Swin-S | 82.4 | 96.0 | 39.2M | 6.1G | Model |
| SPViT-Swin-S* | 83.0 | 96.4 | 39.2M | 6.1G | Model |

\* indicates knowledge distillation.
Getting started:
In this repository, we provide code for pruning two representative ViT models.
- SPViT-DeiT that prunes DeiT. Please see SPViT_DeiT/README.md for details.
- SPViT-Swin that prunes Swin. Please see SPViT_Swin/README.md for details.
If you find our paper useful, please consider citing:
@article{he2024Pruning,
title={Pruning Self-attentions into Convolutional Layers in Single Path},
author={He, Haoyu and Liu, Jing and Pan, Zizheng and Cai, Jianfei and Zhang, Jing and Tao, Dacheng and Zhuang, Bohan},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2024},
publisher={IEEE}
}
