4 skills found
amirzandieh / QJL - QJL: 1-Bit Quantized JL transform for KV Cache Quantization with Zero Overhead
RecursiveIntell / Turbo Quant - Rust implementation of TurboQuant, PolarQuant, and QJL: zero-overhead vector quantization for semantic search and KV cache compression (ICLR 2026)
mindtro / Semafold - Vector compression with TurboQuant codecs for embeddings, retrieval, and KV-cache. 10x compression, pure NumPy core, with optional GPU acceleration via PyTorch (CUDA/MPS) or MLX (Metal).
javalikescript / Qjls Dist - No description available