76 skills found
tpill90 / Steam Lancache Prefill: CLI tool to automatically prime a Lancache with Steam games
guqiong96 / Lvllm: LvLLM is a NUMA extension of vLLM that makes full use of CPU and memory resources, reduces GPU memory requirements, and features an efficient GPU-parallel and NUMA-parallel architecture, supporting hybrid inference for large MoE models.
sindresorhus / New Github Issue Url: Generate a URL for opening a new GitHub issue with prefilled title, body, and other fields
scrya-com / Rotorquant: KV cache compression via block-diagonal rotation. Beats TurboQuant: better PPL (6.91 vs 7.07), 28% faster decode, 5.3x faster prefill, 44x fewer params. Drop-in llama.cpp integration.
grofers / Legend: Legend builds and publishes Grafana dashboards for your services, prefilled with metrics and alerts.
ByteDance-Seed / FlexPrefill: Code for the paper "FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference" [ICLR 2025 Oral]
psmarter / Mini Infer: LLM inference engine from scratch, with paged KV cache, continuous batching, chunked prefill, prefix caching, speculative decoding, CUDA graph, tensor parallelism, MoE expert parallelism, and OpenAI-compatible serving
awslabs / Aws Cloudformation Template Builder: Contains cfn-skeleton, a command line tool and Go library that consumes the published CloudFormation specification and generates skeleton CloudFormation templates, with the mandatory and optional parameters of the chosen resource types prefilled with placeholder values.
infinigence / Semi PD: A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation.
sindresorhus / New Github Release Url: Generate a URL for opening a new GitHub release with prefilled tag, body, and other fields
tpill90 / Battlenet Lancache Prefill: CLI tool to automatically prefill a Lancache with Battle.Net games
Naoray / Laravel Factory Prefill: Prefills factories with faker method suggestions to increase productivity
tpill90 / Epic Lancache Prefill: CLI tool to automatically prime a Lancache with Epic Launcher games
slwang-ustc / Nano Vllm V1: Nano vLLM with vLLM v1's request scheduling strategy and chunked prefill
yuanchuan / Codepen Prefill: Create a new pen from local HTML/JS/CSS files with ease
siyan-zhao / Prepacking: Source code for "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS 2025]
Jingyu6 / Speculative Prefill: No description available
simonschiller / Prefiller: Prefiller is a Gradle plugin that generates pre-filled Room databases at compile time.
guqiong96 / Lsglang: Lsglang is an extension of SGLang that fully utilizes CPU and GPU computing resources with an efficient GPU-parallel and NUMA-parallel architecture, suitable for hybrid inference with MoE models.
qhfan / FlashPrefill: Implementation of "FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling"