AtSpeed
This is the PyTorch implementation of our paper:
Efficient Inference for Large Language Model-based Generative Recommendation (ICLR 2025)
We also release a Python package, BeamSD, which can accelerate beam search generation in transformers by 1.5× with just one line of code!
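To give intuition for where the speedup comes from, below is a minimal, self-contained sketch of draft-then-verify (speculative) decoding with greedy verification. This is a conceptual illustration only; the function and model names are invented here and are not the BeamSD API, and the paper's method additionally handles beam search rather than greedy decoding.

```python
# Conceptual sketch of speculative (draft-then-verify) decoding.
# All names are illustrative; this is NOT the actual BeamSD API.

def speculative_step(draft_next, target_next, prefix, k=4):
    """Draft model proposes k tokens greedily; target model verifies
    them and keeps the longest agreeing run, then appends its own
    token at the first disagreement (or a bonus token if all agree)."""
    proposal, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_next(ctx)
        proposal.append(t)
        ctx.append(t)
    accepted = list(prefix)
    for t in proposal:
        if target_next(accepted) == t:      # target agrees with the draft
            accepted.append(t)
        else:                               # first mismatch: take target's token
            accepted.append(target_next(accepted))
            break
    else:
        accepted.append(target_next(accepted))  # bonus token after full acceptance
    return accepted

# Toy "models" over integer tokens: the target emits len(ctx) % 5;
# the draft agrees except when the next token would be 3.
target = lambda ctx: len(ctx) % 5
draft = lambda ctx: len(ctx) % 5 if len(ctx) % 5 != 3 else 0

out = speculative_step(draft, target, prefix=[0], k=4)
print(out)  # [0, 1, 2, 3]
```

The key property of this scheme is that the output is identical to decoding with the target model alone; the draft model only lets several target calls be verified in parallel.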
Environment
- Anaconda 3
- Python 3.9.0
- PyTorch 1.13.0
- transformers 4.41.0
Usage
Data
data/
├── beauty
├── games
The data in the folder is already processed and can be used directly. The raw data is from the Amazon product data.
We sort users' historical interactions by their global timestamps, and then split them into training, validation, and testing sets with a ratio of 8:1:1. If you want to apply this splitting method to your own dataset, please refer to the example for the Beauty dataset in data/data_process.ipynb.
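A minimal sketch of this chronological 8:1:1 split for a single user's history (illustrative only; the released notebook data/data_process.ipynb is the authoritative preprocessing):

```python
# Illustrative 8:1:1 chronological split of one user's interactions.

def split_history(interactions):
    """interactions: list of (item_id, timestamp) pairs for one user."""
    items = [i for i, _ in sorted(interactions, key=lambda x: x[1])]
    n = len(items)
    n_train = int(n * 0.8)
    n_valid = int(n * 0.1)
    return (items[:n_train],                       # earliest 80%: train
            items[n_train:n_train + n_valid],      # next 10%: validation
            items[n_train + n_valid:])             # latest 10%: test

history = [(f"item{i}", ts) for i, ts in enumerate(range(10))]
train, valid, test = split_history(history)
print(len(train), len(valid), len(test))  # 8 1 1
```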
For the item identifier, we follow LC-Rec and set the length L = 4, i.e., each generated item is a sequence of 4 tokens.
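For intuition, an LC-Rec-style identifier with L = 4 maps each item to a fixed-length sequence of discrete codebook tokens, so generating one item always produces exactly 4 tokens. The item IDs and token names below are invented for illustration:

```python
# Illustrative only: LC-Rec-style semantic identifiers map each item
# to L = 4 codebook tokens. These codes are made up for the example.
L = 4
item_to_tokens = {
    "B00FRJ6O2S": ["<a_12>", "<b_3>", "<c_47>", "<d_5>"],
    "B00KHGIK9S": ["<a_12>", "<b_8>", "<c_2>", "<d_19>"],
}
for item, toks in item_to_tokens.items():
    assert len(toks) == L  # every generated item is a 4-token sequence
```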
Train
Target Model
First, replace the parameters in code/script/finetune_llama.sh with your own parameters, such as LOG_DIR, OUTPUT_DIR, etc.
LOG_DIR=YOUR_LOG_DIR
OUTPUT_DIR=YOUR_OUTPUT_DIR
BASE_MODEL=YOUR_BASE_MODEL_PATH
Then, run the following command to train the target model.
cd code
bash script/finetune_llama.sh
Draft Model
- Generate Teacher Data
Replace the parameters in code/script/generate_teacher_data.sh with your own parameters, and then run the following command.
cd code
bash script/generate_teacher_data.sh
Then, the data will be generated in ${YOUR_OUTPUT_DIR}/${dataset}/train_teacher_data and ${YOUR_OUTPUT_DIR}/${dataset}/eval_teacher_data, which correspond to the train_data and valid_data parameters in code/script/train.sh.
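For example, with the beauty dataset the two paths would be wired into code/script/train.sh roughly as follows (a sketch using the placeholder names above; adjust to your own setup):

```shell
# Hypothetical wiring; match these to the placeholders in train.sh.
dataset=beauty
train_data=${YOUR_OUTPUT_DIR}/${dataset}/train_teacher_data
valid_data=${YOUR_OUTPUT_DIR}/${dataset}/eval_teacher_data
```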
- Train Draft Model
Replace the parameters in code/script/train.sh with your own parameters, such as LOG_DIR, OUTPUT_DIR, TARGET_MODEL, BASE_MODEL, MODEL_CLASS, etc., and modify accelerate.yaml if necessary.
LOG_DIR=YOUR_LOG_DIR
OUTPUT_DIR=YOUR_OUTPUT_DIR
TARGET_MODEL=YOUR_TARGET_MODEL_PATH
BASE_MODEL=YOUR_BASE_MODEL_PATH
MODEL_CLASS=AtSpeedRModel
Then, run the following command to train the draft model.
cd code
bash script/train.sh
Inference
First, replace the parameters in code/script/inference.sh with your own parameters, such as LOG_DIR, OUTPUT_DIR, DRAFT_MODEL, DRAFT_MODEL_NAME, etc.
LOG_DIR=YOUR_LOG_DIR
OUTPUT_DIR=YOUR_OUTPUT_DIR
DRAFT_MODEL=YOUR_DRAFT_MODEL_PATH
DRAFT_MODEL_NAME=YOUR_DRAFT_MODEL_NAME
Then, run the following command to run inference.
cd code
bash script/inference.sh
Citation
If you find our work useful for your research, please consider citing:
@inproceedings{lin2024efficient,
title={Efficient Inference for Large Language Model-based Generative Recommendation},
author={Lin, Xinyu and Yang, Chaoqun and Wang, Wenjie and Li, Yongqi and Du, Cunxiao and Feng, Fuli and Ng, See-Kiong and Chua, Tat-Seng},
booktitle={ICLR},
year={2025}
}
License
NUS © NExT++