DEPO
[AAAI 2026] Code and Data for Paper "DEPO: Dual-Efficiency Preference Optimization for LLM Agents"
This is the official data and code of the paper: DEPO: Dual-Efficiency Preference Optimization for LLM Agents
Project Page: Link
1) Configure Paths
Before training, update both of the following:
- Dataset registry: DEPO/data/dataset_info.json. Point each dataset entry to your local files.
- Experiment configs: DEPO/efficient_agent/*.yaml. Edit any fields that contain file paths (output dirs, model checkpoints, etc.).
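Before launching training, it can help to confirm that the registry really points at files on your machine. The following is a minimal sketch, not part of the repository: the helper name missing_dataset_files is hypothetical, and it assumes each local entry carries a "file_name" key (as in LLaMA-Factory's dataset_info.json format) whose relative paths resolve against the directory that holds the registry.

```python
import json
from pathlib import Path


def missing_dataset_files(registry_path):
    """Return (dataset_name, file_name) pairs whose local file cannot be found.

    Assumption: local datasets are declared with a "file_name" key; entries
    without one (for example, datasets loaded from a hub) are skipped.
    """
    registry_path = Path(registry_path)
    registry = json.loads(registry_path.read_text(encoding="utf-8"))
    missing = []
    for name, entry in registry.items():
        file_name = entry.get("file_name")
        if file_name is None:
            continue
        path = Path(file_name)
        # Assumption: relative paths are resolved against the registry's directory.
        if not path.is_absolute():
            path = registry_path.parent / path
        if not path.exists():
            missing.append((name, file_name))
    return missing


if __name__ == "__main__":
    registry = Path("DEPO/data/dataset_info.json")
    if registry.exists():
        for name, file_name in missing_dataset_files(registry):
            print(f"{name}: cannot find {file_name}")
```

An empty result means every local entry resolved to an existing file; anything printed is an entry that still needs its path updated.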
2) Install LLaMA-Factory Environment
Create and activate a Python environment that meets LLaMA-Factory's requirements.
3) Train
Kick off training with the provided script:
bash train_depo.sh
Common things to customize:
- Which YAML config to load (inside train_depo.sh)
- Output directory, logging/ckpt intervals
- LoRA settings, batch size, learning rate
- Which datasets (as defined in dataset_info.json) to use
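To review these settings across configs before a run, a small script can print the relevant fields from each YAML file. This is a sketch under stated assumptions: the key names in KEYS_OF_INTEREST are common LLaMA-Factory training arguments, and whether the DEPO configs set all of them is not confirmed here. The parser is deliberately simple and only handles flat, top-level `key: value` lines; the function names are hypothetical.

```python
from pathlib import Path

# Assumption: common LLaMA-Factory argument names; the DEPO configs may use
# only some of these, or additional ones not listed here.
KEYS_OF_INTEREST = (
    "dataset",
    "output_dir",
    "logging_steps",
    "save_steps",
    "lora_rank",
    "per_device_train_batch_size",
    "learning_rate",
)


def read_flat_yaml(path):
    """Parse top-level `key: value` lines into a dict of strings.

    Comments, blank lines, and indented (nested) lines are ignored. Values
    containing a literal '#' are truncated, so this is not a full YAML parser.
    """
    settings = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.split("#", 1)[0].rstrip()
        if not line or line[0].isspace() or ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


def summarize(path, keys=KEYS_OF_INTEREST):
    """Return only the settings of interest that the config actually defines."""
    settings = read_flat_yaml(path)
    return {key: settings[key] for key in keys if key in settings}


if __name__ == "__main__":
    for config in sorted(Path("DEPO/efficient_agent").glob("*.yaml")):
        print(config.name, summarize(config))
```

Values are returned as raw strings, which is enough for a quick visual check; use a real YAML library if you need typed values or nested fields.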
4) Evaluation
For model evaluation, we use the testing data from data/test.
All evaluations are conducted within the AgentGym framework, which provides the necessary environment server.
Repo Layout
DEPO/
├─ data/
│  ├─ dataset_info.json   # dataset path registry
│  ├─ kto_data            # training data
│  └─ test                # testing data
├─ efficient_agent/
│  └─ *.yaml              # experiment configs
├─ src/
│  └─ llamafactory/
│     └─ train/
│        └─ kto/
├─ train_depo.sh          # entry script to start training
├─ requirements.txt       # env deps (example)
└─ ......
That’s it—edit paths, install env, run the script. Happy training! 🚀
🖇️ Citation
🤝 If you find this repository useful for your work, please cite our paper.
@misc{chen2025depodualefficiencypreferenceoptimization,
title={DEPO: Dual-Efficiency Preference Optimization for LLM Agents},
author={Sirui Chen and Mengshi Zhao and Lei Xu and Yuying Zhao and Beier Zhu and Hanwang Zhang and Shengjie Zhao and Chaochao Lu},
year={2025},
eprint={2511.15392},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2511.15392},
}
