# AgentJet

Cutting-edge platform for LLM agent tuning: RL tuning with flexibility, reliability, speed, multi-agent optimization, and real-time community benchmarking.
<div align="center">
  <a href="https://modelscope.github.io/AgentJet" target="_blank">
    <img width="500" alt="AgentJet" src="docs/agentjet.jpg"/>
  </a>
</div>

AgentJet (AJet) is a cutting-edge, user-friendly agent RL training framework designed to optimize agents and agentic workflows (supporting any agent built with the OpenAI SDK, AgentScope, LangChain, or raw HTTP requests) by fine-tuning LLM weights to enhance model performance.
AgentJet (AJet) has fully distributed swarm training capability: run `ajet-swarm start` on your GPU server(s) and then train agents from your laptop(s)! Simply provide your agent workflow, training dataset, and reward function, and AgentJet is ready to go!
## ✈️ News
- 2026.3.26 Upgraded the verl backend to 0.7.1 to support more models and increase training speed! All benchmarks verified.
- 2026.3.19 Support for the latest Qwen3.5 models is in progress.
- 2026.3.12 Tuning Original OpenClaw Agent without Editing Any Agent Code. EN Blog / ZH Blog.
- 2026.3.09 Non-shared-parameter Multiagent Training. EN Blog / ZH Blog.
- 2026.2.20 Introducing AgentJet Swarm. ZH Blog / EN Blog.
## ✈️ Fast Introduction

### Classic Mode
Let's begin with the simplest example: a math agent with a tool call. This is a simple, centralized training example.

- Please check out the installation guide to set up the training environment.
- Tune your first model using the minimum example:

```bash
ajet --conf ./tutorial/example_math_agent/math_agent.yaml --backbone='verl'
```
### Swarm Mode

Let's begin with the simplest AgentJet Swarm example: also a math agent. In this case, you can use any GPU-less laptop to train the model remotely.

- Start the swarm server and begin swarm overwatch: `ajet-swarm start` and `ajet-swarm overwatch`. (Alternative: if you are a fan of Docker, use our prebuilt Docker image here without setting up dependencies.)
- From your laptop (or the swarm server's localhost), run this simple script to begin training:

```bash
AJET_SWARM_URL="http://swarm-server-ip:10086" python ./tutorial/example_math_swarm/math.py
```
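Under the hood, the training script talks to the swarm server over its OpenAI-compatible HTTP interface (see Core Concepts below). A minimal sketch of constructing such a request, assuming a standard `/v1/chat/completions` route and a placeholder model name (both are illustrative, not AgentJet defaults):

```python
import json
import os

def build_chat_request(prompt: str, model: str = "qwen3-8b") -> dict:
    """Build an OpenAI-style chat-completions payload.

    The swarm server accepts OpenAI-like requests; the model name
    here is a placeholder, not an AgentJet default.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a math agent."},
            {"role": "user", "content": prompt},
        ],
    }

# The target URL comes from the same env var used by the tutorial script;
# the /v1/chat/completions path is an assumption about the route layout.
url = os.environ.get("AJET_SWARM_URL", "http://localhost:10086") + "/v1/chat/completions"
payload = json.dumps(build_chat_request("What is 17 * 24?"))
print(url)
```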
## ✈️ Features
We aim to build an easy-to-learn Agent tuner that unlocks more possibilities for agent developers:
- Easy and Friendly. AgentJet helps you tune models behind your agent workflows easily, optimizing your agents for top performance with minimal effort.
- Rich Tutorial Library. AgentJet provides a rich library of examples as tutorials.
- Swarm Training. This unique feature of AgentJet opens many possibilities: deploying distributed & self-healing rollout workers, non-shared-parameter multi-agent training, multi-runtime & multi-task cocktail training. And just like Tinker, you can use AgentJet Swarm to train models even on GPU-less laptop(s).
- Efficient and Scalable. AgentJet uses verl as the default backbone (`--backbone=verl`). However, we also support trinity as an alternative backbone, accelerating your tuning process via fully asynchronous RFT.
- Flexible and Fast. AgentJet supports multi-agent workflows and adopts a context-merging technique, accelerating training by 1.5x to 10x when the workflow involves multi-turn (or multi-agent) conversations.
- Reliability and Reproducibility. Our team tracks framework performance across multiple tasks, major git versions, and training backbones (under construction; still gathering data, coming soon).
For advanced researchers, AgentJet also provides high-resolution logging and debugging solutions:
- High-Resolution Logging: AgentJet allows users to save and inspect token-level rollout details, recording token IDs, token loss masks, and even token logprobs to facilitate workflow development and agent diagnostics.
- Fast Debugging: AgentJet also provides the `--backbone=debug` option for the best debugging experience, shortening your wait after code changes from minutes to seconds and enabling breakpoint debugging in IDEs.
## ✈️ Quick Start

### Installation
- Click here to read the installation guide.
### Example Library
Explore our rich library of examples to kickstart your journey:
- 🔢 Training a math agent that can write python code.
- 📱 Creating an AppWorld agent using AgentScope and training it.
- 🐺 Developing Werewolves RPG agents and training them.
- 👩🏻‍⚕️ Learning to ask questions like a doctor.
- 🎴 Writing a countdown game using AgentScope and solving it.
- 🚶 Solving a frozen lake walking puzzle using AgentJet.
Explore our automated benchmarking system at https://benchmark.agentjet.top/:

<div align="center">
  <img width="600" alt="image" src="https://serve.gptacademic.cn/publish/shared/Image/benchmark.gif"/>
</div>

## ✈️ Core Concepts
AgentJet makes agent fine-tuning straightforward by separating the developer interface from the internal execution logic.
<div align="center">
  <img width="480" alt="image" src="https://img.alicdn.com/imgextra/i2/O1CN01PdCJym1jqr1jWGMZ4_!!6000000004600-0-tps-2013-870.jpg"/>
</div>

### 1. The User-Centric Interface
To optimize an agent, you provide three core inputs:
- Trainable Workflow: Define your agent logic by inheriting the Workflow class, supporting both simple agent setups and advanced multi-agent collaborations.
- Task Reader: Load training tasks from JSONL files, HuggingFace datasets, interactive environments, or auto-generate them from documents.
- Task Judger: Evaluates agent outputs and assigns rewards to guide training.
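To make this division of labor concrete, here is a minimal, self-contained sketch of the three inputs. The `Workflow` base class, method names, and JSONL schema below are illustrative stand-ins, not AgentJet's actual API:

```python
import json
from io import StringIO

# Stand-in base class: AgentJet's real Workflow interface may differ.
class Workflow:
    def run(self, task: dict) -> str:
        raise NotImplementedError

# 1. Trainable Workflow: the agent logic lives in `run`.
class MathWorkflow(Workflow):
    def run(self, task: dict) -> str:
        # A real workflow would call the tuned LLM here; we evaluate the
        # expression directly so the sketch stays self-contained.
        return str(eval(task["question"]))

# 2. Task Reader: load training tasks from a JSONL source.
def read_tasks(jsonl_stream):
    return [json.loads(line) for line in jsonl_stream if line.strip()]

# 3. Task Judger: score an output against the reference answer.
def judge(task: dict, output: str) -> float:
    return 1.0 if output == task["answer"] else 0.0

tasks = read_tasks(StringIO('{"question": "2 + 3", "answer": "5"}\n'))
wf = MathWorkflow()
rewards = [judge(t, wf.run(t)) for t in tasks]
print(rewards)  # → [1.0]
```

The reward values produced by the judger are what the RL backbone optimizes against.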
### 2. Internal System Architecture
The internal system orchestrates several specialized modules to handle the complexities of RL training and agent interactions.
- Launcher: Manages background service processes (Ray, vLLM) and routes the backbone.
- Task Reader: Handles data ingestion, augmentation, and filtering.
- Task Rollout: Bridges LLM engines and manages the Gym environment lifecycle.
- Task Runner: Executes the Agent workflow and calculates rewards.
- Model Tuner: Forwards inference requests from the workflow to the LLM engine.
- Context Tracker: Monitors LLM calls and automatically merges shared-history timelines to improve training efficiency by 1.5x to 10x.
- Swarm Server: A data interchange center that accepts OpenAI-like requests and engine instructions, activated only in AgentJet Swarm mode.
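The merging idea behind the Context Tracker can be illustrated with a toy sketch (not AgentJet's internals): successive LLM calls in a multi-turn rollout usually share their history as a common prefix, so they can be folded into a single timeline instead of storing every prefix separately:

```python
def is_prefix(shorter, longer):
    return len(shorter) <= len(longer) and longer[: len(shorter)] == shorter

def merge_timelines(calls):
    """Greedily merge each call's message list into a timeline it extends."""
    timelines = []
    for messages in sorted(calls, key=len):
        for i, timeline in enumerate(timelines):
            if is_prefix(timeline, messages):
                timelines[i] = messages  # extend the shared timeline
                break
        else:
            timelines.append(messages)  # no shared history found
    return timelines

# Two LLM calls from the same rollout: call 2 extends call 1's history.
turn1 = [{"role": "user", "content": "solve x+1=3"}]
turn2 = turn1 + [{"role": "assistant", "content": "x=2"},
                 {"role": "user", "content": "verify it"}]

merged = merge_timelines([turn1, turn2])
stored_before = sum(len(c) for c in [turn1, turn2])  # 4 messages
stored_after = sum(len(t) for t in merged)           # 3 messages
print(len(merged), stored_before, stored_after)      # → 1 4 3
```

The savings grow with conversation depth, which is consistent with the larger speedups reported for long multi-turn workflows.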
### 3. Swarm Architecture

When swarm training mode is enabled, an additional component is activated:

- Swarm Data Interchange Server: Maintains an HTTP service that listens for swarm instructions and OpenAI-compatible requests, and establishes a high-speed ZeroMQ communication channel to coordinate the other modules.
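The interchange pattern can be sketched with standard-library primitives (a `queue.Queue` stands in for the real ZeroMQ channel, and the routing logic is purely illustrative): an HTTP-facing front desk enqueues incoming messages, and a coordinator thread dispatches them by kind, much as the real server routes engine instructions separately from OpenAI-compatible requests.

```python
import queue
import threading

channel = queue.Queue()  # stand-in for the ZeroMQ channel
results = []

def coordinator():
    while True:
        msg = channel.get()
        if msg is None:  # sentinel: shut down the coordinator
            break
        # Route by message kind, mirroring the split between
        # OpenAI-compatible requests and engine instructions.
        if msg["kind"] == "chat":
            results.append(f"handled chat: {msg['body']}")
        else:
            results.append(f"handled instruction: {msg['body']}")

t = threading.Thread(target=coordinator)
t.start()
channel.put({"kind": "chat", "body": "2+2?"})
channel.put({"kind": "instruction", "body": "sync-weights"})
channel.put(None)
t.join()
print(results)
```

Because a single consumer drains a FIFO queue, messages are handled in arrival order, which is the property the real channel needs to keep rollout and engine state consistent.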
## ✈️ Navigation
- Tutorials: From Installation to Tuning your first agent — the essential path for beginners.
- Core Components: Define your Trainable Workflow, Task Reader, and Task Judger.
