PAD

[ICLR 2025] Pad: Personalized alignment of llms at decoding-time

Generate Convert Improve

Install / Use

/learn @zjuruizhechen/PAD

About this skill

Quality Score

0/100

README

PAD: Personalized Alignment at Decoding-time

This repository contains the official implementation of the paper PAD: Personalized Alignment at Decoding-time (https://openreview.net/pdf?id=e7AUJpP8bV). This paper presents Personalized Alignment at Decoding-time (PAD), a novel framework designed to align LLM outputs with diverse personalized preferences during the inference phase, eliminating the need for additional training.

Acknowledgement: This repository is built based on https://github.com/deeplearning-wisc/args.

Setup

The following packages, and versions were used:

git clone https://github.com/deeplearning-wisc/args.git

conda create -n PAD python=3.10 -y
conda activate PAD

cd args
pip -r requirements.txt

We recommend the following versions for the main packages:

| Mandatory | Recommend | | ------------ |-----------| | python | 3.10 | | torch | 2.1.0 | | transformers | 4.45.0 | | datasets | 2.16.0 | | accelerate | 0.34.2 | | peft | 0.14.0 | | trl | 0.9.6 |

Training

The training module of PAD is coming soon.

Inference

We have released our Personalized Reward Model at https://huggingface.co/RuizheChen/PAD, which can be directly employed to perform alignment at decoding-time.

The following command can be run to start personalized generation:

cd inference

python collect_model_outs.py \
    --run_percent 100. \
    --config="configs/psoups_llama3SFT_hhh.config" \
    --out_file="results/psoups_llama3SFT_hhh" \
    --llm_gpu="cuda:0" \
    --rm_gpu="cuda:0" \
    --llm="princeton-nlp/Llama-3-Base-8B-SFT" \
    --rm="RuizheChen/PAD" \
    --dataset="psoups" \
    --sys_prompt="[Guidelines] Your task is to generate response by considering the following principle. \
[Principles] harmless and helpfulness and humor \
[Instruction] "

The result of the generation will be stored as a jsonl file with the path results/psoups_llama3SFT_hhh_0.jsonl. We have provided config files for aligning with different dimensions on both datasets in ./configs. For a new alignment task, we recommend to perform grid search with inference/configs/grid_search.config.

Detailed codes for reproducing experiments are provided in inference/decoding.sh.

Evaluations

To prepare for the evaluations, please extract your models output in the following form and save it as results/your_run_name.jsonl. Note that the response should contain both prompt and the output generated by the model.

[
    {
        "prompt": "Are you okay? You look",
        "response": "Are you okay? You look a bit tired or stressed. Anything you'd like to talk about?",
        "method": "greedy"
    },
    {
        "prompt": "...",
        "response": "...",
        "method": "..."
    },
    ...
]

To run the evaluations, execute the following commands:



# helpfulness

python measure_reward.py \
    --out_file="<your_results>.jsonl" \
    --tokenizer="Ray2333/gpt2-large-helpful-reward_model" \
    --rm="Ray2333/gpt2-large-helpful-reward_model" \
    --rm_gpu="cuda:0"

python measure_reward.py \
    --out_file="<your_results>.jsonl" \
    --tokenizer="RLHFlow/ArmoRM-Llama3-8B-v0.1" \
    --rm="RLHFlow/ArmoRM-Llama3-8B-v0.1" \
    --rm_gpu="cuda:0" \
    --dimension=0

# harmless

python measure_reward.py \
    --out_file="<your_results>.jsonl" \
    --tokenizer="Ray2333/gpt2-large-harmless-reward_model" \
    --rm="Ray2333/gpt2-large-harmless-reward_model" \
    --rm_gpu="cuda:0"

python measure_reward.py \
    --out_file="<your_results>.jsonl" \
    --tokenizer="RLHFlow/ArmoRM-Llama3-8B-v0.1" \
    --rm="RLHFlow/ArmoRM-Llama3-8B-v0.1" \
    --rm_gpu="cuda:0" \
    --dimension=10

# humor

python measure_reward.py \
  --out_file="<your_results>.jsonl" \
  --tokenizer="mohameddhiab/humor-no-humor" \
  --rm="mohameddhiab/humor-no-humor" \
  --rm_gpu="cuda:0"

Citation

If you find this repository useful in your research, please consider citing:

@inproceedings{chen2024pad,
  title={Pad: Personalized alignment at decoding-time},
  author={Chen, Ruizhe and Zhang, Xiaotian and Luo, Meng and Chai, Wenhao and Liu, Zuozhu},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2024}
}

Related Skills

node-connect

352.9k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

111.5k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

352.9k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

qqbot-media

352.9k

QQBot 富媒体收发能力。使用 <qqmedia> 标签，系统根据文件扩展名自动识别类型（图片/语音/视频/文件）。