# DipLLM
This is the official implementation of the paper "DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy".
<div align="center"> <strong>England (DipLLM)</strong> 🟣 vs <strong>France (Cicero)</strong> 🔵 <img src="demos/france_dipllm_vs_cicero.gif" alt="Demo of DipLLM vs Cicero" style="width: 80%;"> <p><em>Figure: Gameplay demo — <strong>DipLLM</strong> (England) vs. <strong>Cicero</strong> (France)</em></p> </div>

🧠 A fine-tuned LLM agent for high-level strategic planning in Diplomacy, achieving strong performance against top agents like Cicero.
📢 Accepted at ICML 2025 — Paper.
📦 Models — Hugging Face
## 🛠️ Installation Instructions

This project is built on the <a href="https://github.com/facebookresearch/diplomacy_cicero" target="_blank"><u>Cicero</u></a> framework; their repository provides comprehensive installation and usage guidance.
### 🧩 Install Diplomacy Dependencies

```shell
# Clone the repo with submodules:
git clone --recursive git@github.com:facebookresearch/diplomacy_cicero.git diplomacy_cicero
cd diplomacy_cicero

# Apt installs
apt-get install -y wget bzip2 ca-certificates curl git build-essential clang-format-8 cmake autoconf libtool pkg-config libgoogle-glog-dev

# Install conda
wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-4.7.10-Linux-x86_64.sh -O ~/miniconda.sh
/bin/bash ~/miniconda.sh -b

# Create conda env
conda create --yes -n diplomacy_cicero python=3.7
conda activate diplomacy_cicero

# Install pytorch, pybind11
conda install --yes pytorch=1.7.1 torchvision cudatoolkit=11.0 -c pytorch
conda install --yes pybind11

# Install go for boringssl in grpc
# We have some hacky patching code for protobuf that is not guaranteed
# to work on versions other than this.
conda install --yes go protobuf=3.19.1

# Install python requirements
pip install -r requirements.txt

# Local pip installs
pip install -e ./thirdparty/github/fairinternal/postman/nest/
# NOTE: Postman here links against pytorch for tensors; for this to work you may
# need to have installed CUDA 11 separately.
pip install -e ./thirdparty/github/fairinternal/postman/postman/
pip install -e . -vv

# (Optional but recommended) Clean previous builds before compiling,
# especially if you encounter build errors
make clean

# Make
make

# Run unit tests
make test_fast
```
### 🧵 Install DipLLM Fine-tuning Environment

```shell
conda create --name unsloth_env \
    python=3.10 \
    pytorch-cuda=12.1 \
    pytorch cudatoolkit xformers -c pytorch -c nvidia -c xformers \
    -y
conda activate unsloth_env

pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
pip install --no-deps "trl<0.9.0" peft accelerate bitsandbytes
```
## 💻 Project Overview
| Code Directory | Description |
| ---------------------- | --------------------------------------------------------------------- |
| fairdiplomacy/agents | Includes DipLLM and various agents released or unreleased by Meta AI. |
| conf | Configuration files for different experiments and setups. |
| models | Pre-downloaded models used in training and evaluation. |
## 🔧 Fine-tune DipLLM

### ▶️ Run Data Collection

```shell
python run.py --adhoc --cfg conf/c04_exploit/research_20240730_collect_data.prototxt launcher.local.use_local=true
```
### 🧹 Convert to JSON Format

Convert the raw binary game data into an LLM-compatible text format (pass your own paths to the two flags):

```shell
python finetuning/data_process/bin2json.py --input_file_path <input_path> --output_file_path <output_path>
```
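The exact schema emitted by `bin2json.py` is defined by the script itself; purely as an illustration of the "binary game state → text prompt/completion" idea, a fine-tuning record might look like the following (all field names and the prompt format here are hypothetical):

```python
import json

# Hypothetical example of one converted fine-tuning record. The actual
# fields are determined by finetuning/data_process/bin2json.py; this
# only sketches the general shape of an instruction-tuning sample.
record = {
    "instruction": "You are playing England in a game of Diplomacy.",
    "input": "S1901M. Units: F London, A Liverpool, F Edinburgh. ...",
    "output": "F London - English Channel; A Liverpool - Yorkshire; F Edinburgh - North Sea",
}

# One record per line (JSON Lines) round-trips cleanly:
line = json.dumps(record, ensure_ascii=False)
print(line)
```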
### 🎯 Fine-tune LLM

Use DeepSpeed to launch multi-GPU training:

```shell
deepspeed --include localhost:0,1,2,3 finetune.py
```
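The training settings actually used are those in `finetune.py` and this repository's configs; as a hedged sketch only, a typical DeepSpeed config for multi-GPU fine-tuning (ZeRO stage 2 with bf16 — common choices, not confirmed to be what this repo uses) could be written out like this:

```python
import json

# Hypothetical DeepSpeed config sketch; every value below is an
# assumption for illustration, not taken from finetune.py.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "bf16": {"enabled": True},
    "zero_optimization": {"stage": 2},
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```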
## 🤖 Test DipLLM

### 🛰️ Launch DipLLM API Server

```shell
uvicorn make_api:app --host 0.0.0.0 --port 8011
```
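Once the server is up, the evaluation configs below talk to it on port 8011. A minimal client sketch follows; note that the route name (`/act`) and the payload fields are hypothetical — check `make_api.py` for the real endpoint; only the host and port match the uvicorn command above.

```python
import json
from urllib import request

def build_request(game_state: dict, url: str = "http://localhost:8011/act"):
    """Build a JSON POST request for the (hypothetical) /act endpoint."""
    body = json.dumps(game_state).encode("utf-8")
    return request.Request(url, data=body,
                           headers={"Content-Type": "application/json"})

req = build_request({"phase": "S1901M", "power": "ENGLAND"})
# Actually sending it requires the server to be running:
#   response = request.urlopen(req).read()
print(req.get_full_url())
```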
### 🧪 Evaluate Against Cicero

```shell
python h2h_evaluate.py --adhoc --cfg conf/c01_ag_cmp/cmp_8011_cicero_nopress.prototxt
```

### 🤼‍♂️ Evaluate Against DipNet

```shell
python h2h_evaluate.py --adhoc --cfg conf/c01_ag_cmp/cmp_8011_base_strategy_model.prototxt
```
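`h2h_evaluate.py` does its own reporting; for intuition only, here is a sketch of one common way Diplomacy head-to-head results are aggregated — the agent's average share of the board's 34 supply centers across games (this metric and the function below are illustrative assumptions, not the repo's scoring code):

```python
def average_share(center_counts, total_centers=34):
    """Average final supply-center share over a list of games.

    center_counts: the agent's final supply-center count in each game
    (0 means eliminated; 34 would be total board control).
    """
    return sum(c / total_centers for c in center_counts) / len(center_counts)

# e.g. three games ending with 12, 0, and 18 centers:
print(average_share([12, 0, 18]))
```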
## 📝 Citation

If you find our research helpful and would like to reference it in your work, please consider the following citation:

```bibtex
@inproceedings{DBLP:conf/icml/XuCLFZZ25,
  author    = {Kaixuan Xu and
               Jiajun Chai and
               Sicheng Li and
               Yuqian Fu and
               Yuanheng Zhu and
               Dongbin Zhao},
  title     = {DipLLM: Fine-Tuning {LLM} for Strategic Decision-making in Diplomacy},
  booktitle = {Forty-second International Conference on Machine Learning, {ICML}
               2025, Vancouver, BC, Canada, July 13-19, 2025},
  publisher = {OpenReview.net},
  year      = {2025},
}
```
