DEPO

[AAAI 2026] Code and Data for Paper "DEPO: Dual-Efficiency Preference Optimization for LLM Agents"

This is the official code and data for the paper "DEPO: Dual-Efficiency Preference Optimization for LLM Agents".

Project Page: Link

1) Configure Paths

Before training, update both of the following:

Dataset registry
```
DEPO/data/dataset_info.json
```
Point each dataset entry to your local files.

Experiment configs
```
DEPO/efficient_agent/*.yaml
```
Edit any fields that contain file paths (output dirs, model checkpoints, etc.).
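
After editing the registry, a quick check can confirm that every entry points at a file that exists. The following is a minimal sketch, not part of the repository. It assumes each entry in dataset_info.json is a JSON object with an optional `file_name` key, resolved relative to the registry's own directory (the convention LLaMA-Factory registries commonly follow). The function name `find_missing_dataset_files` is hypothetical.

```python
import json
from pathlib import Path


def find_missing_dataset_files(registry_path):
    """Return {dataset_name: file_name} for entries whose file is not found.

    Assumption: each registry entry is a JSON object that may carry a
    "file_name" key, resolved relative to the registry's directory
    unless it is already an absolute path.
    """
    registry_path = Path(registry_path)
    with registry_path.open(encoding="utf-8") as handle:
        registry = json.load(handle)

    missing = {}
    for name, entry in registry.items():
        file_name = entry.get("file_name") if isinstance(entry, dict) else None
        if file_name is None:
            continue  # entry does not reference a local file
        candidate = Path(file_name)
        if not candidate.is_absolute():
            candidate = registry_path.parent / candidate
        if not candidate.exists():
            missing[name] = file_name
    return missing
```

Calling `find_missing_dataset_files("DEPO/data/dataset_info.json")` should return an empty dictionary once every local path has been updated.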

2) Install LLaMA-Factory Environment

Create and activate a Python environment that meets LLaMA-Factory's requirements. The repository's requirements.txt lists example dependencies.

3) Train

Kick off training with the provided script:

bash train_depo.sh

Common things to customize:

Which YAML config to load (inside train_depo.sh)
Output directory, logging/checkpoint intervals
LoRA settings, batch size, learning rate
Which datasets (as defined in dataset_info.json) to use
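
The customizations listed above can also be scripted instead of edited by hand. The sketch below rewrites top-level `key: value` lines in a flat YAML file using only the standard library. It assumes the experiment configs are flat (no nesting), and the key names in the usage note (`output_dir`, `learning_rate`, `lora_rank`) are assumptions to be checked against the actual files in efficient_agent/. The helper `override_yaml_scalars` is hypothetical.

```python
import re


def override_yaml_scalars(text, overrides):
    """Replace top-level `key: value` lines in a flat YAML document.

    Keys that are not already present are appended at the end.
    Untouched lines (including comments) keep their order; an inline
    comment on a replaced line is dropped.
    """
    remaining = dict(overrides)
    out = []
    for line in text.splitlines():
        match = re.match(r"^([A-Za-z_][\w.-]*)\s*:", line)
        if match and match.group(1) in remaining:
            key = match.group(1)
            out.append(f"{key}: {remaining.pop(key)}")
        else:
            out.append(line)
    for key, value in remaining.items():
        out.append(f"{key}: {value}")
    return "\n".join(out) + "\n"
```

For example, `override_yaml_scalars(config_text, {"output_dir": "saves/run1", "lora_rank": 8})` returns a copy of the config with those two fields set. Write the result to a new YAML file and point train_depo.sh at it.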

4) Evaluation

For model evaluation, we use the testing data from data/test. All evaluations are conducted within the AgentGym framework, which provides the necessary environment server.

Repo Layout

DEPO/
├─ data/
│  ├─ dataset_info.json         # dataset path registry
│  ├─ kto_data                  # training data
│  └─ test                      # testing data
├─ efficient_agent/
│  └─ *.yaml                    # experiment configs
├─ src/
│  └─ llamafactory/
│     └─ train/
│        └─ kto/
├─ train_depo.sh                # entry script to start training
├─ requirements.txt             # env deps (example)
└─ ......
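
To confirm that a local checkout matches the layout above before training, a small check like the following can help. It is a sketch under the assumption that the entries named in the tree are the ones that matter. The list `EXPECTED_ENTRIES` and the function `missing_entries` are hypothetical names, and the elided entries in the tree are deliberately not checked.

```python
from pathlib import Path

# Entries taken from the layout above; elided items are not included.
EXPECTED_ENTRIES = [
    "data/dataset_info.json",
    "data/kto_data",
    "data/test",
    "efficient_agent",
    "src/llamafactory/train/kto",
    "train_depo.sh",
    "requirements.txt",
]


def missing_entries(repo_root, expected=EXPECTED_ENTRIES):
    """Return the expected relative paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).exists()]
```

Running `missing_entries("DEPO")` from the parent directory of the checkout returns an empty list when everything is in place.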

That's it: edit the paths, install the environment, and run the script. Happy training! 🚀

🖇️ Citation

🤝 Feel free to cite our paper if you find this repository useful for your work.

@misc{chen2025depodualefficiencypreferenceoptimization,
      title={DEPO: Dual-Efficiency Preference Optimization for LLM Agents}, 
      author={Sirui Chen and Mengshi Zhao and Lei Xu and Yuying Zhao and Beier Zhu and Hanwang Zhang and Shengjie Zhao and Chaochao Lu},
      year={2025},
      eprint={2511.15392},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2511.15392}, 
}
