# AgentJet

Cutting-edge platform for LLM agent tuning: RL tuning with flexibility, reliability, speed, multi-agent optimization, and real-time community benchmarking.
<div align="center">
  <a href="https://modelscope.github.io/AgentJet" target="_blank">
    <img width="500" alt="AgentJet" src="docs/agentjet.jpg"/>
  </a>
</div>

AgentJet (AJet) is a cutting-edge, user-friendly agent RL training framework designed to optimize agents and agentic workflows (supporting any agent built with the OpenAI SDK, AgentScope, LangChain, or raw HTTP requests) by fine-tuning LLM weights to enhance model performance.
AgentJet (AJet) has fully distributed swarm training capability: run `ajet-swarm start` on your GPU server(s) and then train agents from your laptop(s)! Simply provide your agent workflow, training dataset, and reward function, and AgentJet is ready to go!
## ✈️ News
- 2026.3.26 Upgraded the verl backend to 0.7.1 to support more models and increase training speed! All benchmarks verified.
- 2026.3.19 Support for the latest Qwen3.5 models is in progress.
- 2026.3.12 Tuning Original OpenClaw Agent without Editing Any Agent Code. EN Blog / ZH Blog.
- 2026.3.09 Non-shared-parameter Multiagent Training. EN Blog / ZH Blog.
- 2026.2.20 Introducing AgentJet Swarm. ZH Blog / EN Blog.
## ✈️ Fast Introduction

### Classic Mode
Let's begin with the simplest example: a math agent with a tool call. This is a simple, centralized training example.

- Please check out the installation guide to set up the training environment.
- Tune your first model using the minimum example:

```bash
ajet --conf ./tutorial/example_math_agent/math_agent.yaml --backbone='verl'
```
### Swarm Mode

Let's begin with the simplest AgentJet Swarm example: also a math agent. In this case, you can use any GPU-less laptop to train the model remotely.

- Start the swarm server and begin swarm overwatch: `ajet-swarm start` and `ajet-swarm overwatch`. (Alternative: if you are a fan of Docker, use our prebuilt Docker image here without setting up dependencies.)
- From your laptop (or the swarm server's localhost), run this simple script to begin training:

```bash
AJET_SWARM_URL="http://swarm-server-ip:10086" python ./tutorial/example_math_swarm/math.py
```
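Under the hood, the training script talks to the swarm server over its OpenAI-compatible HTTP interface (see Core Concepts below). A minimal sketch of constructing such a request, assuming a standard `/v1/chat/completions` route and a placeholder model name (both are illustrative, not AgentJet defaults):

```python
import json
import os

def build_chat_request(prompt: str, model: str = "qwen3-8b") -> dict:
    """Build an OpenAI-style chat-completions payload.

    The swarm server accepts OpenAI-like requests; the model name
    here is a placeholder, not an AgentJet default.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a math agent."},
            {"role": "user", "content": prompt},
        ],
    }

# The target URL comes from the same env var used by the tutorial script;
# the /v1/chat/completions path is an assumption about the route layout.
url = os.environ.get("AJET_SWARM_URL", "http://localhost:10086") + "/v1/chat/completions"
payload = json.dumps(build_chat_request("What is 17 * 24?"))
print(url)
```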
## ✈️ Features
We aim to build an easy-to-learn Agent tuner that unlocks more possibilities for agent developers:
- Easy and Friendly. AgentJet helps you tune models behind your agent workflows easily, optimizing your agents for top performance with minimal effort.
- Rich Tutorial Library. AgentJet provides a rich library of examples as tutorials.
- Swarm Training. This unique feature of AgentJet opens many possibilities: deploying distributed & self-healing rollout workers, non-shared-parameter multi-agent training, multi-runtime & multi-task cocktail training. And just like Tinker, you can use AgentJet Swarm to train models even on GPU-less laptop(s).
- Efficient and Scalable. AgentJet uses verl as the default backbone (`--backbone=verl`). However, we also support trinity as an alternative backbone, accelerating your tuning process via fully asynchronous RFT.
- Flexible and Fast. AgentJet supports multi-agent workflows and adopts a context-merging technique, accelerating training by 1.5x to 10x when the workflow involves multi-turn (or multi-agent) conversations.
- Reliability and Reproducibility. Our team tracks framework performance across multiple tasks, major git versions, and training backbones (under construction; still gathering data, coming soon).
For advanced researchers, AgentJet also provides high-resolution logging and debugging solutions:
- High-Resolution Logging: AgentJet allows users to save and inspect token-level rollout details, recording token IDs, token loss masks, and even token logprobs to facilitate workflow development and agent diagnostics.
- Fast Debugging: AgentJet also provides the `--backbone=debug` option for the best debugging experience, shortening your wait after code changes from minutes to seconds and enabling breakpoint debugging in IDEs.
## ✈️ Quick Start

### Installation
- Click here to read the installation guide.
### Example Library
Explore our rich library of examples to kickstart your journey:
- 🔢 Training a math agent that can write python code.
- 📱 Creating an AppWorld agent using AgentScope and training it.
- 🐺 Developing Werewolves RPG agents and training them.
- 👩🏻‍⚕️ Learning to ask questions like a doctor.
- 🎴 Writing a countdown game using AgentScope and solving it.
- 🚶 Solving a frozen lake walking puzzle using AgentJet.
Explore our automated benchmarking system at https://benchmark.agentjet.top/:

<div align="center">
  <img width="600" alt="image" src="https://serve.gptacademic.cn/publish/shared/Image/benchmark.gif"/>
</div>

## ✈️ Core Concepts
AgentJet makes agent fine-tuning straightforward by separating the developer interface from the internal execution logic.
<div align="center">
  <img width="480" alt="image" src="https://img.alicdn.com/imgextra/i2/O1CN01PdCJym1jqr1jWGMZ4_!!6000000004600-0-tps-2013-870.jpg"/>
</div>

### 1. The User-Centric Interface
To optimize an agent, you provide three core inputs:
- Trainable Workflow: Define your agent logic by inheriting the Workflow class, supporting both simple agent setups and advanced multi-agent collaborations.
- Task Reader: Load training tasks from JSONL files, HuggingFace datasets, interactive environments, or auto-generate them from documents.
- Task Judger: Evaluates agent outputs and assigns rewards to guide training.
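To make this division of labor concrete, here is a minimal, self-contained sketch of the three inputs. The `Workflow` base class, method names, and JSONL schema below are illustrative stand-ins, not AgentJet's actual API:

```python
import json
from io import StringIO

# Stand-in base class: AgentJet's real Workflow interface may differ.
class Workflow:
    def run(self, task: dict) -> str:
        raise NotImplementedError

# 1. Trainable Workflow: the agent logic lives in `run`.
class MathWorkflow(Workflow):
    def run(self, task: dict) -> str:
        # A real workflow would call the tuned LLM here; we evaluate the
        # expression directly so the sketch stays self-contained.
        return str(eval(task["question"]))

# 2. Task Reader: load training tasks from a JSONL source.
def read_tasks(jsonl_stream):
    return [json.loads(line) for line in jsonl_stream if line.strip()]

# 3. Task Judger: score an output against the reference answer.
def judge(task: dict, output: str) -> float:
    return 1.0 if output == task["answer"] else 0.0

tasks = read_tasks(StringIO('{"question": "2 + 3", "answer": "5"}\n'))
wf = MathWorkflow()
rewards = [judge(t, wf.run(t)) for t in tasks]
print(rewards)  # → [1.0]
```

The reward values produced by the judger are what the RL backbone optimizes against.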
### 2. Internal System Architecture
The internal system orchestrates several specialized modules to handle the complexities of RL training and agent interactions.
- Launcher: Manages background service processes (Ray, vLLM) and routes the backbone.
- Task Reader: Handles data ingestion, augmentation, and filtering.
- Task Rollout: Bridges LLM engines and manages the Gym environment lifecycle.
- Task Runner: Executes the Agent workflow and calculates rewards.
- Model Tuner: Forwards inference requests from the workflow to the LLM engine.
- Context Tracker: Monitors LLM calls and automatically merges shared-history timelines to improve training efficiency by 1.5x to 10x.
- Swarm Server: A data interchange center that accepts OpenAI-like requests and engine instructions, activated only in AgentJet Swarm mode.
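The merging idea behind the Context Tracker can be illustrated with a toy sketch (not AgentJet's internals): successive LLM calls in a multi-turn rollout usually share their history as a common prefix, so they can be folded into a single timeline instead of storing every prefix separately:

```python
def is_prefix(shorter, longer):
    return len(shorter) <= len(longer) and longer[: len(shorter)] == shorter

def merge_timelines(calls):
    """Greedily merge each call's message list into a timeline it extends."""
    timelines = []
    for messages in sorted(calls, key=len):
        for i, timeline in enumerate(timelines):
            if is_prefix(timeline, messages):
                timelines[i] = messages  # extend the shared timeline
                break
        else:
            timelines.append(messages)  # no shared history found
    return timelines

# Two LLM calls from the same rollout: call 2 extends call 1's history.
turn1 = [{"role": "user", "content": "solve x+1=3"}]
turn2 = turn1 + [{"role": "assistant", "content": "x=2"},
                 {"role": "user", "content": "verify it"}]

merged = merge_timelines([turn1, turn2])
stored_before = sum(len(c) for c in [turn1, turn2])  # 4 messages
stored_after = sum(len(t) for t in merged)           # 3 messages
print(len(merged), stored_before, stored_after)      # → 1 4 3
```

The savings grow with conversation depth, which is consistent with the larger speedups reported for long multi-turn workflows.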
### 3. Swarm Architecture

When swarm training mode is enabled, an additional component is activated:

- Swarm Data Interchange Server: Maintains an HTTP service that listens for swarm instructions and OpenAI-compatible requests, and establishes a high-speed ZeroMQ communication channel to coordinate the other modules.
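The interchange pattern can be sketched with standard-library primitives (a `queue.Queue` stands in for the real ZeroMQ channel, and the routing logic is purely illustrative): an HTTP-facing front desk enqueues incoming messages, and a coordinator thread dispatches them by kind, much as the real server routes engine instructions separately from OpenAI-compatible requests.

```python
import queue
import threading

channel = queue.Queue()  # stand-in for the ZeroMQ channel
results = []

def coordinator():
    while True:
        msg = channel.get()
        if msg is None:  # sentinel: shut down the coordinator
            break
        # Route by message kind, mirroring the split between
        # OpenAI-compatible requests and engine instructions.
        if msg["kind"] == "chat":
            results.append(f"handled chat: {msg['body']}")
        else:
            results.append(f"handled instruction: {msg['body']}")

t = threading.Thread(target=coordinator)
t.start()
channel.put({"kind": "chat", "body": "2+2?"})
channel.put({"kind": "instruction", "body": "sync-weights"})
channel.put(None)
t.join()
print(results)
```

Because a single consumer drains a FIFO queue, messages are handled in arrival order, which is the property the real channel needs to keep rollout and engine state consistent.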
## ✈️ Navigation
- Tutorials: From Installation to Tuning your first agent — the essential path for beginners.
- Core Components: Define your Trainable Workflow, Task Reader, and Task Judger.
