
ROLL

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models


<div align="center"> <img src="assets/roll.jpeg" width="40%" alt="ROLL Logo">

ROLL: Reinforcement Learning Optimization for Large-Scale Learning

<h4>🚀 An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models 🚀</h4> <p> <a href="https://github.com/alibaba/ROLL/blob/main/LICENSE"> <img src="https://img.shields.io/badge/license-Apache%202.0-blue.svg" alt="License"> </a> <a href="https://github.com/alibaba/ROLL/issues"> <img src="https://img.shields.io/github/issues/alibaba/ROLL" alt="GitHub issues"> </a> <a href="https://github.com/alibaba/ROLL/stargazers"> <img src="https://img.shields.io/github/stars/alibaba/ROLL?style=social" alt="Repo stars"> </a> <a href="https://arxiv.org/abs/2506.06122"><img src="https://img.shields.io/static/v1?label=arXiv&message=Paper&color=red"></a> <!-- Organization homepage: click to go to https://github.com/alibaba --> <a href="./assets/roll_wechat.png" target="_blank"> <img src="https://img.shields.io/badge/WeChat-green?logo=wechat" alt="WeChat QR"> </a> <a href="https://deepwiki.com/alibaba/ROLL" target="_blank"> <img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"> </a> <a href="./assets/future_lab.png" target="_blank"> <img src="https://img.shields.io/twitter/follow/FutureLab2025?style=social" alt="X QR"> </a> </p> </div>

ROLL is an efficient and user-friendly RL library for Large Language Models (LLMs), designed to run on large-scale GPU resources. It significantly enhances LLM performance in key areas such as human preference alignment, complex reasoning, and multi-turn agentic interaction.

Leveraging a multi-role distributed architecture with Ray for flexible resource allocation and heterogeneous task scheduling, ROLL integrates cutting-edge technologies like Megatron-Core, SGLang and vLLM to accelerate model training and inference.
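
To make the idea of per-role resource allocation concrete, here is a minimal, standard-library-only sketch. It is not ROLL's or Ray's API: the `Role` class, the `allocate` helper, and the role names are hypothetical, and the first-come-first-served placement is an assumption chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Hypothetical description of one worker role in an RL pipeline."""
    name: str      # illustrative label, e.g. a trainer or a rollout worker
    num_gpus: int  # number of GPUs this role requests


def allocate(roles, total_gpus):
    """Give each role a contiguous block of GPU ids, first come first served.

    This is a toy placement policy, not the scheduling used by ROLL or Ray.
    """
    placement = {}
    next_free = 0
    for role in roles:
        if next_free + role.num_gpus > total_gpus:
            raise ValueError(f"not enough GPUs for role {role.name!r}")
        placement[role.name] = list(range(next_free, next_free + role.num_gpus))
        next_free += role.num_gpus
    return placement


# Hypothetical role names and sizes: training, rollout inference, and reward.
roles = [Role("actor_train", 4), Role("actor_infer", 2), Role("reward", 2)]
placement = allocate(roles, total_gpus=8)
print(placement)
```

In a real deployment the requested resources would be handed to the cluster scheduler rather than assigned by hand. The sketch only shows why heterogeneous roles benefit from declaring their own resource needs.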


📢 News

| 📣 Updates |
|:---|
| [03/06/2026] 🎉 We support Qwen3.5 Dense and MoE series models and [on-policy distill](docs_roll/i18n/zh-Hans/docusaurus-plugin-content-docs/current/User Guides/Pipeline/on_policy_distill_pipeline_start.md). Give them a try! |
| [02/03/2026] 🎉 We released the FSDP2 strategy, Megatron with LoRA, GPU partial overlapping, Qwen3-Omni support, and other features. For more details, please refer to the release notes. Give them a try! |
| [01/01/2026] 🎉 Our report "Let It Flow: Agentic Crafting on Rock and Roll" is released, introducing the ALE ecosystem and ROME, an open-source agentic model with the novel IPA algorithm. |
| [11/08/2025] 🎉 ROCK: Reinforcement Open Construction Kit is released. Explore the new capabilities! |
| [10/23/2025] 🎉 Our papers are released; see "Asymmetric Proximal Policy Optimization: mini-critics boost LLM reasoning" and "Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization". |
| [10/14/2025] 🎉 Our paper is released; see "Part II: ROLL Flash -- Accelerating RLVR and Agentic Training with Asynchrony". |
| [09/28/2025] 🎉 Ascend NPU is supported; see the usage guide. |
| [09/25/2025] 🎉 Our paper is released; see "RollPacker: Mitigating Long-Tail Rollouts for Fast, Synchronous RL Post-Training". |
| [09/24/2025] 🎉 Added support for the Wan2_2 Reward FL pipeline. Explore the new capabilities! |
| [09/23/2025] 🎉 ROLL aligns with the GEM environment definition, providing agentic Tool Use training capabilities; see the ToolUse docs. |
| [09/16/2025] 🎉 Qwen3-Next model training is supported; refer to the configuration. |
| [09/04/2025] 🎉 ROLL supports vLLM dynamic FP8 rollout and remove_padding for acceleration. |
| [08/28/2025] 🎉 ROLL supports the SFT pipeline; refer to the configuration. |
| [08/13/2025] 🎉 ROLL supports AMD GPUs with an out-of-the-box Docker image, a Dockerfile, and dedicated YAML files under the examples/ directory. Please refer to Installation. |
| [08/11/2025] 🎉 Our paper is released; see "Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning". |
| [08/10/2025] 🎉 Agentic RL supports stepwise learning, such as GiGPO; Distill supports VLM. Explore the new capabilities! |
| [08/06/2025] 🎉 The ROLL PPT is now available; see Slides. |
| [07/31/2025] 🎉 Refactored the agentic RL design and added support for agentic RL async training. Explore the new capabilities! |
| [07/31/2025] 🎉 Added support for DistillPipeline/DpoPipeline, LoRA, and GSPO. |
| [06/25/2025] 🎉 Added support for thread env for env scaling and for the Qwen2.5 VL agentic pipeline. |
| [06/13/2025] 🎉 Added support for the Qwen2.5 VL RLVR pipeline and upgraded mcore to version 0.12. |
