DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Latest News
- [2025/12] DeepSpeed Core API updates: PyTorch-style backward and low-precision master states
- [2025/10] We hosted the Ray x DeepSpeed Meetup at Anyscale. We shared our most recent work on SuperOffload, ZenFlow, Muon Optimizer Support, Arctic Long Sequence Training and DeepCompile. Please find the meetup slides here.
- [2025/10] SuperOffload: Unleashing the Power of Large-Scale LLM Training on Superchips
- [2025/10] Study of ZenFlow and ZeRO offload performance with DeepSpeed CPU core binding
- [2025/08] ZenFlow: Stall-Free Offloading Engine for LLM Training
- [2025/06] DeepNVMe: Affordable I/O scaling for Deep Learning Applications
Extreme Speed and Scale for DL Training
DeepSpeed enabled the world's most powerful language models at the time of this writing, such as MT-530B and BLOOM. It offers a confluence of system innovations that have made large-scale DL training effective and efficient, greatly improved ease of use, and redefined the DL training landscape in terms of the scale that is possible. These innovations include ZeRO, ZeRO-Infinity, 3D-Parallelism, Ulysses Sequence Parallelism, and DeepSpeed-MoE, among others.
DeepSpeed Adoption
DeepSpeed was an important part of Microsoft’s AI at Scale initiative to enable next-generation AI capabilities at scale; more information is available here.
DeepSpeed has been used to train many different large-scale models. Below is a list of several examples that we are aware of (if you'd like to include your model, please submit a PR):
- Megatron-Turing NLG (530B)
- Jurassic-1 (178B)
- BLOOM (176B)
- GLM (130B)
- xTrimoPGLM (100B)
- YaLM (100B)
- GPT-NeoX (20B)
- AlexaTM (20B)
- Turing NLG (17B)
- METRO-LM (5.4B)
DeepSpeed has been integrated with several popular open-source DL frameworks, such as:
| | Documentation |
| ---------------------------------------------------------------------------------------------- | -------------------------------------------- |
| <img src="docs/assets/images/transformers-light.png#gh-light-mode-only" width="250px"><img src="docs/assets/images/transformers-dark.png#gh-dark-mode-only" width="250px"> | Transformers with DeepSpeed |
| <img src="docs/assets/images/accelerate-light.png#gh-light-mode-only" width="250px"><img src="docs/assets/images/accelerate-dark.png#gh-dark-mode-only" width="250px"> | Accelerate with DeepSpeed |
| <img src="docs/assets/images/lightning-light.svg#gh-light-mode-only" width="200px"><img src="docs/assets/images/lightning-dark.svg#gh-dark-mode-only" width="200px"> | Lightning with DeepSpeed |
| <img src="docs/assets/images/mosaicml.svg" width="200px"> | MosaicML with DeepSpeed |
| <img src="docs/assets/images/determined.svg" width="225px"> | Determined with DeepSpeed |
| <img src="https://user-images.githubusercontent.com/58739961/187154444-fce76639-ac8d-429b-9354-c6fac64b7ef8.jpg" width=150> | MMEngine with DeepSpeed |
Build Pipeline Status
| Description | Status |
| ----------- | ------ |
| NVIDIA | |
| AMD | |
| CPU | |
| Intel Gaudi | |
| Intel XPU | |
| Integrations | |
| Misc | |
