AwesomeMLSys

An ML Systems Onboarding list

ML Systems Onboarding Reading List

This is a reading list of papers, videos, and repos that I personally found useful while ramping up on ML Systems, and that I wish more people would sit down and study carefully during their work hours. If you're looking for more recommendations, go through the citations of the papers below and enjoy!

Conferences where MLSys papers get published

Attention Mechanism

Performance Optimizations

Quantization

Long context length

Sparsity

  • VENOM: Vectorized N:M format for sparse tensor cores when the hardware only supports 2:4
  • MegaBlocks: Efficient Sparse Training with Mixture-of-Experts
  • ReLU Strikes Back: Really enjoyed this paper as an example of doing model surgery for more efficient inference
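
If the N:M notation in the first entry is unfamiliar, here is a minimal sketch of what a 2:4 pattern means: in every group of 4 consecutive weights, at most 2 are nonzero. This is only an illustration of the pattern using magnitude pruning, not the format or method from any of the papers above; `prune_n_of_m` is a hypothetical helper name.

```python
def prune_n_of_m(row, n=2, m=4):
    """Keep the n largest-magnitude values in each group of m; zero the rest.

    Hypothetical helper for illustration only (plain magnitude pruning).
    """
    if len(row) % m != 0:
        raise ValueError("row length must be a multiple of m")
    pruned = []
    for start in range(0, len(row), m):
        group = row[start:start + m]
        # Indices of the n largest-magnitude entries in this group.
        keep = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)[:n]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


weights = [0.1, -0.9, 0.4, 0.05, 1.2, -0.3, 0.0, 0.7]
print(prune_n_of_m(weights))
# → [0.0, -0.9, 0.4, 0.0, 1.2, 0.0, 0.0, 0.7]
```

Changing `n` and `m` gives other N:M patterns; the 2:4 case is the one the first entry describes as hardware-supported.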

Distributed

Speculative decoding

Linear attention

  • Flash linear attention: Efficient implementations of state-of-the-art linear attention models (and their papers)