46 skills found · Page 1 of 2
Lightning-AI / Lit Llama: Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.
intel / Neural Compressor: SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) and sparsity; leading model compression techniques for PyTorch, TensorFlow, and ONNX Runtime.
666DZY666 / Micronet: A model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b: DoReFa, "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b)/ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group-convolution channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op adaptation (upsample), dynamic shape.
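The QAT and PTQ schemes named in the entry above share one core operation: mapping float values onto a small integer grid and back. A minimal sketch of symmetric INT8 fake quantization with PTQ-style max calibration (NumPy, hypothetical data; not Micronet's actual code):

```python
import numpy as np

def int8_fake_quantize(x, scale):
    """Quantize to int8 and immediately dequantize — the 'fake quant'
    op inserted during QAT or applied after PTQ calibration."""
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q.astype(np.float32) * scale

np.random.seed(0)

# PTQ-style calibration: derive the scale from the max absolute value
# observed in a calibration batch (synthetic data, for illustration).
calib = np.random.randn(1024).astype(np.float32)
scale = np.abs(calib).max() / 127.0

x = np.array([0.5, -0.25, 0.125], dtype=np.float32)
xq = int8_fake_quantize(x, scale)
print(np.abs(x - xq).max())  # for in-range values, error is at most scale/2
```

In QAT the same op is applied inside the forward pass (with a straight-through estimator for the backward pass); in PTQ it is applied once after calibration.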
DerryHub / BEVFormer Tensorrt: BEVFormer inference on TensorRT, including INT8 quantization and custom TensorRT plugins (float/half/half2/int8).
BUG1989 / Caffe Int8 Convert Tools: Generates a quantization parameter file for ncnn-framework INT8 inference.
PINTO0309 / Tflite2tensorflow: Generates saved_model, tfjs, tf-trt, EdgeTPU, CoreML, quantized tflite, ONNX, OpenVINO, Myriad Inference Engine blob, and .pb files from .tflite. Supports building environments with Docker, with direct access to the host PC's GUI and camera to verify operation. NVIDIA GPU (dGPU) and Intel iHD GPU (iGPU) support. Also supports dequantizing INT8-quantized models.
TNTWEN / OpenVINO YOLOV4: An implementation of YOLOv4, YOLOv4-relu, YOLOv4-tiny, YOLOv4-tiny-3l, Scaled-YOLOv4, and INT8 quantization in OpenVINO 2021.3.
willard-yuan / Cvt: CVT, a computer vision toolkit.
jundaf2 / INT8 Flash Attention FMHA Quantization: No description available.
NJU-Jet / SR Mobile Quantization: Winning solution of the Mobile AI challenge (CVPRW 2021).
GiorgosXou / NeuralNetworks: A header-only neural network library for microcontrollers, with partial bare-metal and native-OS support.
clovaai / Frostnet: "FrostNet: Towards Quantization-Aware Network Architecture Search".
jahongir7174 / YOLOv8 Qat: Quantization-aware training for YOLOv8.
TianzhongSong / Tensorflow Quantization Test: TensorFlow quantization (float32 → int8) inference test.
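A float32 → int8 inference test of this kind boils down to running the integer matmul and comparing it against the float reference. A self-contained sketch assuming symmetric per-tensor scales (synthetic data; not tied to the repo's code):

```python
import numpy as np

def quantize(x, scale):
    """Symmetric per-tensor quantization to int8."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

np.random.seed(0)
a = np.random.randn(4, 8).astype(np.float32)   # activations
w = np.random.randn(8, 3).astype(np.float32)   # weights

sa = np.abs(a).max() / 127.0
sw = np.abs(w).max() / 127.0

# Integer matmul accumulates in int32; the combined scale sa*sw
# maps the int32 accumulator back to float.
acc = quantize(a, sa).astype(np.int32) @ quantize(w, sw).astype(np.int32)
approx = acc.astype(np.float32) * (sa * sw)

ref = a @ w
print(np.abs(ref - approx).max())  # quantization error of the int8 path
```

Per-channel weight scales (one `sw` per output column) would tighten the error further; the per-tensor version above is the simplest form of the test.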
xuanandsix / Tensorrt Int8 Quantization Pipline: A simple pipeline for INT8 quantization based on TensorRT.
caslabai / Yolov3tiny Tensorflow Int8 Quantized: A yolov3_tiny implementation in TensorFlow for INT8 quantization (TFLite).
BoumedineBillal / Esp32 P4 Vehicle Classifier: Production-ready vehicle classification on the ESP32-P4 with MobileNetV2 INT8 quantization. Three optimized variants spanning 70-459 ms latency. Hardware-validated, ready-to-flash projects included.
Howell-Yang / Onnx2trt: A record and summary of common problems, and their solutions, encountered when deploying on-device models, in the hope of helping others.
GiorgosXou / ATTiny85 MNIST RNN EEPROM: An ATtiny85 Arduino example running an RNN MNIST model from the (internal) 512-byte EEPROM with ~95% accuracy.
xigh / Herbert Rs: A local LLM inference engine written from scratch in Rust, with hand-written AVX-512 assembly kernels and Metal and Vulkan compute shaders. Supports Qwen3, Mistral3, ... with Q4/INT8/BF16 quantization.