[CVPR 2025] MDP: Multidimensional Vision Model Pruning with Latency Constraint
🎯 MDP: Multidimensional Vision Model Pruning with Latency Constraint
This repository contains the official implementation of the MDP method introduced in our CVPR 2025 paper:
MDP: Multidimensional Vision Model Pruning with Latency Constraint
Xinglong Sun, Barath Lakshmanan, Maying Shen, Shiyi Lan, Jingde Chen, Jose M. Alvarez
📄 License
Please check the LICENSE file. HALP may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please contact researchinquiries@nvidia.com.
📢 News
- [2025/09] Release license obtained. ResNet50 and ablation study code are now available; remaining code will be cleaned up and released soon.
- [2025/06] I presented MDP in a CVPR 2025 tutorial on Full-Stack, GPU-based Acceleration of Deep Learning and Foundation Models. You can watch the tutorial video here!
📝 Introduction
Current structural pruning methods face two significant limitations:
- They often limit pruning to finer-grained levels like channels, making aggressive parameter reduction challenging
- They focus heavily on parameter and FLOP reduction, with existing latency-aware methods frequently relying on simplistic, suboptimal linear models that fail to generalize well to transformers
In this paper, we address both limitations by introducing Multi-Dimensional Pruning (MDP), a novel paradigm that:
- Jointly optimizes across various pruning granularities (channels, query, key, heads, embeddings, and blocks)
- Employs advanced latency modeling to accurately capture latency variations
- Reformulates pruning as a Mixed-Integer Nonlinear Program (MINLP)
- Supports both CNNs and transformers
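As a rough illustration of the formulation above, the toy sketch below picks how many channels to keep in each layer and whether to keep a whole block, maximizing total importance under a latency budget read from a nonlinear lookup table. This is not the repository's solver: the importance scores, the latency table, and the function name `solve` are hypothetical, and brute-force enumeration stands in for a real MINLP solver.

```python
from itertools import product

# Hypothetical toy data: two layers that live inside one removable block.
# IMPORTANCE[l] lists per-channel importance scores, sorted from most to least important.
IMPORTANCE = [
    [5.0, 3.0, 2.0, 1.0],
    [4.0, 2.5, 1.5, 0.5],
]

# Hypothetical latency lookup table: LATENCY[l][k] is the latency (ms) of layer l
# when k channels are kept. The staircase shape is deliberately nonlinear.
LATENCY = [
    [0.0, 1.0, 1.0, 2.0, 2.0],
    [0.0, 1.5, 1.5, 1.5, 3.0],
]


def solve(budget_ms):
    """Enumerate all (block, channel-count) choices and return the best feasible one."""
    best = None
    ranges = [range(len(scores) + 1) for scores in IMPORTANCE]
    for keep_block in (0, 1):
        for counts in product(*ranges):
            # Coupling between granularities: a removed block keeps no channels,
            # and a kept block needs at least one channel in every layer.
            if keep_block == 0 and any(counts):
                continue
            if keep_block == 1 and not all(counts):
                continue
            latency = sum(LATENCY[l][k] for l, k in enumerate(counts))
            if latency > budget_ms:
                continue
            score = sum(sum(IMPORTANCE[l][:k]) for l, k in enumerate(counts))
            if best is None or score > best["score"]:
                best = {
                    "score": score,
                    "keep_block": keep_block,
                    "channels": counts,
                    "latency_ms": latency,
                }
    return best


if __name__ == "__main__":
    print(solve(3.5))
    print(solve(2.0))
```

With this made-up data, a 3.5 ms budget keeps the block with 4 and 3 channels, while a 2.0 ms budget is too tight for any kept configuration, so the search removes the whole block. The point of the sketch is only that channel-level and block-level decisions are made jointly against a table-based latency constraint rather than a linear one.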
🎨 Framework
<div align="center"> <img src="Figs/framework.png" width="100%"> <p><i>Overview of our MDP method</i></p> </div>
📊 Results
Our extensive experiments demonstrate MDP's superior performance:
ImageNet Classification
- ResNet50: 28% speed increase with +1.4 Top-1 accuracy improvement over prior art
- DeiT-Base: 37% additional acceleration with +0.7 Top-1 accuracy improvement over prior art
3D Object Detection
- Higher speed (×1.18) and mAP (0.451 vs. 0.449) compared to the dense baseline
🚀 Installation
Please check the README in the folder for the task you want to run.
💻 Usage
Please check the README in the folder for the task you want to run.
🙏 Acknowledgements
Some of the infrastructure, data loading, and foundational code is adapted from the HALP and NViT works. We sincerely thank the authors of these works for their contributions.
📚 Citation
If you find this repository useful for your research, please cite our paper:
@misc{sun2025mdpmultidimensionalvisionmodel,
title={MDP: Multidimensional Vision Model Pruning with Latency Constraint},
author={Xinglong Sun and Barath Lakshmanan and Maying Shen and Shiyi Lan and Jingde Chen and Jose M. Alvarez},
year={2025},
eprint={2504.02168},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2504.02168}
}
