12 skills found
flashinfer-ai / Flashinfer - FlashInfer: Kernel Library for LLM Serving
flashinfer-ai / Flashinfer Bench - Building the Virtuous Cycle for AI-driven LLM Systems
flashinfer-ai / Flashinfer Bench Starter Kit - FlashInfer Bench @ MLSys 2026: Building AI agents to write high-performance GPU kernels
Bruce-Lee-LY / Decoding Attention - Decoding Attention is specially optimized for MHA, MQA, GQA, and MLA using CUDA cores for the decoding stage of LLM inference.
sgl-project / Whl - SGLang Kernel Wheel Index
Triang-jyed-driung / Rapid Sampling - Fast LLM sampling kernels (3-7x faster than FlashInfer!) in CUDA.
tomasruizt / Flashinfer Competition Codebase - No description available
opensource4you / Flashinfer Contest Playground - A stable environment for os4y members who want to participate in the FlashInfer Contest
flashinfer-ai / Flashinfer Nightly - FlashInfer Nightly
shadowpa0327 / FlashInfer Gym - No description available
caoshiyi / Flashinfer Bench Ksearch - No description available
vanshnawander / Flashinfer Mlsys - No description available