slime
slime is an LLM post-training framework for RL scaling, providing two core capabilities:
- High-Performance Training: Supports efficient training in various modes by connecting Megatron with SGLang;
- Flexible Data Generation: Enables arbitrary training data generation workflows through custom data generation interfaces and server-based engines.
slime is the RL framework behind GLM-4.5 and GLM-4.6. Apart from models from Z.ai, we also support the following models:
- Qwen3 series (Qwen3Next, Qwen3MoE, Qwen3), Qwen2.5 series;
- DeepSeek V3 series (DeepSeek V3, V3.1, DeepSeek R1);
- Llama 3.
Blogs
- Our vision: slime: An SGLang-Native Post-Training Framework for RL Scaling.
- Our ideas on agentic training: Agent-Oriented Design: An Asynchronous and Decoupled Framework for Agentic RL
- v0.1.0 release note: v0.1.0: Redefining High-Performance RL Training Frameworks
Table of Contents
- Architecture Overview
- Quick Start
- Projects Built upon slime
- Arguments Walkthrough
- Developer Guide
- FAQ & Acknowledgements
Architecture Overview

Module Descriptions:
- training (Megatron): Responsible for the main training process, reads data from the Data Buffer, and synchronizes parameters to the rollout module after training.
- rollout (SGLang + router): Generates new data (including rewards/verifier outputs) and stores it in the Data Buffer.
- data buffer: A bridge module that manages prompt initialization, custom data, and rollout generation methods.
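The interaction between the three modules can be pictured as a loop: rollout fills the buffer, training drains it, and the updated parameters flow back to rollout. The following is a toy sketch of that loop only. The class and method names (`DataBuffer`, `Rollout`, `Trainer`, `run_loop`) are hypothetical and are not slime's API; the real modules wrap Megatron and SGLang.

```python
from collections import deque


class DataBuffer:
    """Hypothetical stand-in for the data buffer: holds prompts and samples."""

    def __init__(self, prompts):
        self.prompts = list(prompts)
        self.samples = deque()

    def get_prompts(self, n):
        # A real buffer may shuffle or stream; here we just take the first n.
        return self.prompts[:n]

    def add_samples(self, samples):
        self.samples.extend(samples)

    def pop_batch(self):
        batch = list(self.samples)
        self.samples.clear()
        return batch


class Rollout:
    """Hypothetical stand-in for the rollout module (SGLang + router)."""

    def __init__(self):
        self.weights_version = 0

    def generate(self, prompts):
        # Fake generation plus a fake reward, tagged with the weights used.
        return [
            {"prompt": p, "response": f"response to {p}",
             "reward": float(len(p) % 2), "weights_version": self.weights_version}
            for p in prompts
        ]

    def update_weights(self, version):
        self.weights_version = version


class Trainer:
    """Hypothetical stand-in for the training module (Megatron)."""

    def __init__(self):
        self.version = 0

    def train_step(self, batch):
        # Placeholder for an optimizer step; only bumps a version counter.
        self.version += 1
        return self.version


def run_loop(num_iterations, prompts, prompts_per_step=2):
    buffer, rollout, trainer = DataBuffer(prompts), Rollout(), Trainer()
    history = []
    for _ in range(num_iterations):
        # 1. rollout generates data and stores it in the buffer
        buffer.add_samples(rollout.generate(buffer.get_prompts(prompts_per_step)))
        # 2. training reads data from the buffer
        batch = buffer.pop_batch()
        new_version = trainer.train_step(batch)
        # 3. training synchronizes parameters to rollout
        rollout.update_weights(new_version)
        history.append((len(batch), rollout.weights_version))
    return history
```

Calling `run_loop(3, ["a", "bb", "ccc"])` runs three generate/train/sync rounds and returns the batch size and rollout weights version after each round.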
Quick Start
For a comprehensive quick start guide covering environment setup, data preparation, training startup, and key code analysis, please refer to the Quick Start guide.
We also provide examples for some use cases not covered in the quick start guide; please check examples.
Projects Built upon slime
slime has powered several novel research projects and production systems. Here are some notable examples:
⚡ TritonForge: Agentic RL Training Framework for Kernel Generation
TritonForge leverages slime's SFT & RL capabilities to train LLMs that automatically generate optimized GPU kernels. By using a two-stage training approach—supervised fine-tuning followed by reinforcement learning with multi-turn compilation feedback—TritonForge achieves remarkable results in converting PyTorch operations into high-performance Triton kernels.
🚀 APRIL: Accelerating RL Training with Active Partial Rollouts
APRIL introduces a system-level optimization that seamlessly integrates with slime to accelerate the rollout generation phase in RL training. By intelligently over-provisioning requests and actively managing partial completions, APRIL addresses the long-tail generation bottleneck that typically consumes over 90% of RL training time.
These projects showcase slime's versatility—from training code-generation models to optimizing RL training systems—making it a powerful foundation for both research and production deployments.
Arguments Walkthrough
Arguments in slime are divided into three categories:
- Megatron arguments: slime reads all arguments set in Megatron via PYTHONPATH. You can configure Megatron by passing arguments like --tensor-model-parallel-size 2.
- SGLang arguments: All arguments for the installed SGLang are supported. These arguments must be prefixed with --sglang-. For example, --mem-fraction-static should be passed as --sglang-mem-fraction-static.
- slime-specific arguments: Please refer to: slime/utils/arguments.py
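The prefix rule for SGLang arguments is mechanical, so a launch script can apply it automatically. Below is a minimal sketch, assuming you keep Megatron and SGLang options in separate lists. The helpers `to_slime_sglang_flag` and `build_args` are hypothetical and not part of slime, and the value `0.8` is only an illustrative setting.

```python
def to_slime_sglang_flag(flag: str) -> str:
    """Rewrite an SGLang flag such as --mem-fraction-static to its slime form."""
    if not flag.startswith("--"):
        raise ValueError(f"expected a long option starting with '--', got {flag!r}")
    if flag.startswith("--sglang-"):
        return flag  # already prefixed, leave unchanged
    return "--sglang-" + flag[2:]


def build_args(megatron_args, sglang_args):
    """Merge Megatron args (passed through as-is) with prefixed SGLang args."""
    merged = list(megatron_args)
    for token in sglang_args:
        # Only option names are rewritten; option values are copied verbatim.
        merged.append(to_slime_sglang_flag(token) if token.startswith("--") else token)
    return merged


args = build_args(
    ["--tensor-model-parallel-size", "2"],   # Megatron argument from the text
    ["--mem-fraction-static", "0.8"],        # SGLang argument, value assumed
)
print(" ".join(args))
```

Megatron arguments pass through untouched, while each SGLang option name gains the `--sglang-` prefix.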
For complete usage instructions, please refer to the Usage Documentation.
Developer Guide
- Contributions are welcome! If you have suggestions for new features, performance tuning, or feedback on user experience, feel free to submit an Issue or PR 😊
- Use pre-commit to ensure code style consistency for your commits:
apt install pre-commit -y
pre-commit install
# run pre-commit to ensure code style consistency
pre-commit run --all-files --show-diff-on-failure --color=always
- For debugging tips, please refer to the Debugging Guide
FAQ & Acknowledgements
- For frequently asked questions, please see the Q&A
- Special thanks to the following projects & communities: SGLang, Megatron‑LM, mbridge, OpenRLHF, veRL, Pai-Megatron-Patch and others.
- To cite slime, please use:
@misc{slime_github,
author = {Zilin Zhu and Chengxing Xie and Xin Lv and slime Contributors},
title = {slime: An LLM post-training framework for RL Scaling},
year = {2025},
howpublished = {\url{https://github.com/THUDM/slime}},
note = {GitHub repository. Corresponding author: Xin Lv},
urldate = {2025-06-19}
}
