Tencent / TNN: a uniform deep learning inference framework for mobile, desktop, and server, developed by Tencent Youtu Lab and Guangying Lab. TNN is distinguished by several outstanding features, including cross-platform capability, high performance, model compression, and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, and draws on the extensibility and high performance of existing open-source efforts. TNN has been deployed in multiple Tencent apps, such as Mobile QQ, Weishi, and Pitu. Contributions are welcome; collaborate with us to make TNN a better framework.
VainF / Torch Pruning: [CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, vision foundation models, etc.
NVIDIA / Model Optimizer: A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.
666DZY666 / Micronet: a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa, "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b)/ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op adaptation (upsample), dynamic shape.
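The 8-bit PTQ step in entries like micronet's boils down to mapping float weights onto the int8 range with a calibrated scale. A minimal sketch of symmetric per-tensor quantization, in plain Python with illustrative names (not micronet's actual API):

```python
def quantize_int8(weights):
    """Symmetric post-training quantization: map floats to int8 with one scale."""
    peak = max(abs(w) for w in weights)
    scale = peak / 127.0 if peak else 1.0  # guard against an all-zero tensor
    # Round to the nearest integer and clamp to the int8 range.
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

q, s = quantize_int8([0.5, -1.27, 0.02, 1.0])
w_hat = dequantize(q, s)  # close to the original weights, within one scale step
```

Real toolchains (e.g. TensorRT's PTQ calibration) pick the scale from activation statistics over a calibration set rather than the raw maximum, but the quantize/dequantize arithmetic is the same.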
Lam1360 / YOLOv3 Model Pruning: model pruning (network slimming) for YOLOv3 on the Oxford Hand dataset.
tensorflow / Model Optimization: A toolkit to optimize ML models for deployment with Keras and TensorFlow, including quantization and pruning.
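The pruning offered by toolkits like this one is typically magnitude-based: zero out the smallest-magnitude weights until a target sparsity is reached. A minimal sketch of that criterion in plain Python (illustrative only, not the toolkit's API):

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    `sparsity` is the target fraction of weights to remove (0.0 to 1.0).
    """
    k = int(len(weights) * sparsity)  # how many weights to zero out
    if k == 0:
        return list(weights)
    # The k-th smallest magnitude becomes the pruning threshold.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

pruned = prune_by_magnitude([0.1, -0.9, 0.05, 0.7, -0.2, 0.3], sparsity=0.5)
# → [0.0, -0.9, 0.0, 0.7, 0.0, 0.3]: the three smallest magnitudes are gone
```

In practice the toolkit applies this schedule gradually during training (raising sparsity step by step) so the network can recover accuracy between pruning steps; ties at the threshold can also prune slightly more than `k` weights in this simplified version.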
horseee / LLM Pruner: [NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
princeton-nlp / LLM Shearing: [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
BenWhetton / Keras Surgeon: Pruning and other network surgery for trained Keras models.
airaria / TextPruner: A PyTorch-based model pruning toolkit for pre-trained language models.
NVlabs / Minitron: A family of compressed models obtained via pruning and knowledge distillation.
datawhalechina / Awesome Compression: A beginner-friendly tutorial on model compression; PDF download at https://github.com/datawhalechina/awesome-compression/releases
czg1225 / SlimSAM: [NeurIPS 2024] SlimSAM: 0.1% Data Makes Segment Anything Slim
mehtadushy / SelecSLS Pytorch: Reference ImageNet implementation of the SelecSLS CNN architecture proposed in the SIGGRAPH 2020 paper "XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera". The repository also includes code for pruning the model based on implicit sparsity emerging from adaptive gradient descent methods, as detailed in the CVPR 2019 paper "On Implicit Filter Level Sparsity in Convolutional Neural Networks".
arcee-ai / PruneMe: Automated identification of redundant layer blocks for pruning in large language models.
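Redundant-layer-block identification of this kind usually measures how little a block of transformer layers changes the hidden representation: if the output of a block stays nearly parallel (high cosine similarity) to its input, the block is a pruning candidate. A hypothetical sketch of that criterion in plain Python, with toy 2-D hidden states standing in for real per-token activations:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def most_redundant_block(hidden_states, block_size):
    """Return the start index of the layer block that changes the
    representation least, i.e. the best candidate for removal.

    `hidden_states[i]` is the hidden vector after layer i.
    """
    scores = [
        cosine_similarity(hidden_states[i], hidden_states[i + block_size])
        for i in range(len(hidden_states) - block_size)
    ]
    return max(range(len(scores)), key=scores.__getitem__)

states = [[1.0, 0.0], [0.9, 0.1], [0.9, 0.11], [0.0, 1.0]]
start = most_redundant_block(states, block_size=1)  # → 1: layer 1→2 barely moves
```

A real pipeline averages this score over many tokens and a calibration dataset before deciding which block to drop, and typically fine-tunes the shortened model afterwards.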
VainF / Diff Pruning: [NeurIPS 2023] Structural Pruning for Diffusion Models
princeton-nlp / CoFiPruning: [ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
NVlabs / MDP: [CVPR 2025] MDP: Multidimensional Vision Model Pruning with Latency Constraint
marcoancona / TorchPruner: On-the-fly structured pruning for PyTorch models. This library implements several attribution metrics and structured pruning utilities for neural networks in PyTorch.
shekkizh / TensorflowProjects: Deep learning projects using TensorFlow.