HamzaElshafie / Gpt Oss 20B: A PyTorch implementation of the GPT-OSS-20B architecture. All components are coded from scratch: RoPE with YaRN, RMSNorm, SwiGLU with clamping and residual connection, Mixture-of-Experts (MoE), Self-Attention with learned sinks, banded attention, GQA, and KV-cache.
fkodom / Grouped Query Attention Pytorch: (Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints" (https://arxiv.org/pdf/2305.13245.pdf)
bknyaz / Sgg: Train Scene Graph Generation for Visual Genome and GQA in PyTorch >= 1.2 with improved zero and few-shot generalization [BMVC 2020, ICCV 2021]
ExplainableML / Czsl: PyTorch CZSL framework containing GQA, the open-world setting, and the CGE and CompCos methods.
doc-doc / NExT GQA: Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
WissingChen / CRA GQA: The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding" (CVPR 2025 Highlight)
preacher-1 / MLA Tutorial: From MHA, MQA, and GQA to MLA, by 苏剑林 (Su Jianlin), with code
Bruce-Lee-LY / Decoding Attention: Decoding Attention is specially optimized for MHA, MQA, GQA and MLA, using CUDA cores for the decoding stage of LLM inference.
gqa-ood / GQA OOD: GQA-OOD is a new dataset and benchmark for the evaluation of VQA models in OOD (out-of-distribution) settings.
Jack-ctrl6 / GPT KVcache GQA: No description available
ronilp / Mac Network Pytorch Gqa: Memory, Attention and Composition (MAC) Network for CLEVR/GQA implemented in PyTorch
adapter-hub / XGQA: No description available
fotoetienne / Gqai: Turn any GraphQL endpoint into a set of MCP tools
PingchengDong / GQA LUT: The official implementation of the DAC 2024 paper GQA-LUT
g763007297 / GQAnimation: Several common animations, such as sunny, rainy, cloudy, thunderstorm, QR code scanning, and pulse
Octavian-ai / Gqa Node Properties: Recalling node properties from a knowledge graph
ronghanghu / Gqa Single Hop Baseline: A simple but well-performing "single-hop" visual attention model for the GQA dataset
kyegomez / Attn Res: A clean, single-file PyTorch implementation of Attention Residuals (Kimi Team, MoonshotAI, 2026), integrated with Grouped Query Attention (GQA), SwiGLU feed-forward networks, and Rotary Position Embeddings (RoPE).
zhousheng97 / ViTXT GQA: [IEEE TMM'25] Scene-Text Grounding for Text-Based Video Question Answering
kyegomez / MGQA: An open-source implementation of multi-grouped query attention from the paper "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints"