HLLM
HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling
HLLM-Creator: Hierarchical LLM-based Personalized Creative Generation
🔥 Update
- [2025.08.26] HLLM-Creator is released! Find more details in the paper and README!
- [2024.09.20] HLLM code and model weights are released!
Installation
- Install packages via `pip3 install -r requirements.txt`. Some basic packages are shown below:

```
pytorch==2.3.1
deepspeed==0.14.2
transformers==4.51.0
lightning==2.4.0
flash-attn==2.5.9post1
fbgemm-gpu==0.5.0       [optional for HSTU]
sentencepiece==0.2.0    [optional for Baichuan2]
```
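If `requirements.txt` is unavailable, the pinned versions above can be installed directly. A sketch (note the PyPI package for PyTorch is `torch`, and the two optional extras are only needed for HSTU and Baichuan2 respectively):

```shell
# Core dependencies, pinned to the versions listed above.
pip3 install torch==2.3.1 deepspeed==0.14.2 transformers==4.51.0 \
    lightning==2.4.0 flash-attn==2.5.9.post1

# Optional extras:
pip3 install fbgemm-gpu==0.5.0     # only needed for HSTU
pip3 install sentencepiece==0.2.0  # only needed for Baichuan2
```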
- Prepare the PixelRec and Amazon Book Reviews datasets:
  - Download PixelRec interactions and item information from PixelRec and put them into the dataset and information folders.
  - Download Amazon Book Reviews interactions and item information, process them with `process_books.py`, and put them into the dataset and information folders. We also provide interactions and item information of Books after processing.
  - Please note that interactions and item information should be put into two folders as below, where `dataset` corresponds to `data_path` and `information` to `text_path`:

```
├── dataset       # Store Interactions
│   ├── amazon_books.csv
│   ├── Pixel1M.csv
│   ├── Pixel200K.csv
│   └── Pixel8M.csv
└── information   # Store Item Information
    ├── amazon_books.csv
    ├── Pixel1M.csv
    ├── Pixel200K.csv
    └── Pixel8M.csv
```
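As a sanity check before training, the layout above can be created and verified with a short shell sketch (folder and file names are taken from the tree above; drop any split you do not use):

```shell
# Create the two expected folders (data_path and text_path).
mkdir -p dataset information

# Interactions and item information use the same file name in both folders;
# report any CSV that is still missing.
for split in Pixel200K Pixel1M Pixel8M amazon_books; do
    for dir in dataset information; do
        if [ ! -f "$dir/$split.csv" ]; then
            echo "missing: $dir/$split.csv"
        fi
    done
done
```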
- Prepare pre-trained LLM models, such as TinyLlama, Baichuan2.
Training
To train HLLM on PixelRec / Amazon Book Reviews, you can run the following command.
Set `master_addr`, `master_port`, `nproc_per_node`, `nnodes` and `node_rank` in environment variables for multi-node training.
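For example, a hypothetical two-node run with 8 GPUs per node (the address and port below are placeholders, not values from this repo) would set these variables on each node before launching `main.py`:

```shell
# Node 0 of 2 (set node_rank=1 on the second machine).
export master_addr=10.0.0.1   # placeholder: reachable IP of the rank-0 node
export master_port=29500      # placeholder: any free port
export nproc_per_node=8       # GPUs per node
export nnodes=2
export node_rank=0
# Then launch the training command identically on every node, e.g.:
# python3 main.py --config_file overall/LLM_deepspeed.yaml HLLM/HLLM.yaml ...
```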
All hyper-parameters (except the model's config) can be found in `code/REC/utils/argument_list.py` and passed through the CLI. More model hyper-parameters are in `IDNet/*` or `HLLM/*`.
```shell
# Item and User LLM are initialized by specific pretrain_dir.
python3 main.py \
--config_file overall/LLM_deepspeed.yaml HLLM/HLLM.yaml \ # We use deepspeed for training by default.
--loss nce \
--epochs 5 \
--dataset {Pixel200K / Pixel1M / Pixel8M / amazon_books} \
--train_batch_size 16 \
--MAX_TEXT_LENGTH 256 \
--MAX_ITEM_LIST_LENGTH 10 \
--checkpoint_dir saved_path \
--optim_args.learning_rate 1e-4 \
--item_pretrain_dir item_pretrain_dir \ # Set to LLM dir.
--user_pretrain_dir user_pretrain_dir \ # Set to LLM dir.
--text_path text_path \ # Use absolute path to text files.
--text_keys '[\"title\", \"tag\", \"description\"]' # Please remove "tag" for the books dataset.
```
You can use `--gradient_checkpointing True` and `--stage 3` with deepspeed to save memory.
You can also train ID-based models by the following command.
```shell
python3 main.py \
--config_file overall/ID.yaml IDNet/{hstu / sasrec / llama_id}.yaml \
--loss nce \
--epochs 201 \
--dataset {Pixel200K / Pixel1M / Pixel8M / amazon_books} \
--train_batch_size 64 \
--MAX_ITEM_LIST_LENGTH 10 \
--optim_args.learning_rate 1e-4
```
To reproduce our experiments on Pixel8M and Books, run the scripts in the reproduce folder. You should be able to reproduce the following results.
For ID-based models, we follow the hyper-parameters from PixelRec and HSTU.
| Method | Dataset | Negatives | R@10 | R@50 | R@200 | N@10 | N@50 | N@200 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HSTU | Pixel8M | 5632 | 4.83 | 10.30 | 18.28 | 2.75 | 3.94 | 5.13 |
| SASRec | Pixel8M | 5632 | 5.08 | 10.62 | 18.64 | 2.92 | 4.12 | 5.32 |
| HLLM-1B | Pixel8M | 5632 | 6.13 | 12.48 | 21.18 | 3.54 | 4.92 | 6.22 |
| HSTU-large | Books | 512 | 5.00 | 11.29 | 20.13 | 2.78 | 4.14 | 5.47 |
| SASRec | Books | 512 | 5.35 | 11.91 | 21.02 | 2.98 | 4.40 | 5.76 |
| HLLM-1B | Books | 512 | 6.97 | 14.61 | 24.78 | 3.98 | 5.64 | 7.16 |
| HSTU-large | Books | 28672 | 6.50 | 12.22 | 19.93 | 4.04 | 5.28 | 6.44 |
| HLLM-1B | Books | 28672 | 9.28 | 17.34 | 27.22 | 5.65 | 7.41 | 8.89 |
| HLLM-7B | Books | 28672 | 9.39 | 17.65 | 27.59 | 5.69 | 7.50 | 8.99 |
Inference
We provide fine-tuned HLLM models for evaluation. You can download them from the following links or Hugging Face. Remember to put the weights into `checkpoint_dir`.
| Model | Dataset | Weights |
|:---|:---|:---|
| HLLM-1B | Pixel8M | HLLM-1B-Pixel8M |
| HLLM-1B | Books | HLLM-1B-Books-neg512 |
| HLLM-1B | Books | HLLM-1B-Books |
| HLLM-7B | Books | HLLM-7B-Books |
Please ensure compliance with the respective licenses of TinyLlama-1.1B and Baichuan2-7B when using corresponding weights.
Then you can evaluate models with the following command (the same as training, but with `val_only`).
```shell
python3 main.py \
--config_file overall/LLM_deepspeed.yaml HLLM/HLLM.yaml \ # We use deepspeed for training by default.
--loss nce \
--epochs 5 \
--dataset {Pixel200K / Pixel1M / Pixel8M / amazon_books} \
--train_batch_size 16 \
--MAX_TEXT_LENGTH 256 \
--MAX_ITEM_LIST_LENGTH 10 \
--checkpoint_dir saved_path \
--optim_args.learning_rate 1e-4 \
--item_pretrain_dir item_pretrain_dir \ # Set to LLM dir.
--user_pretrain_dir user_pretrain_dir \ # Set to LLM dir.
--text_path text_path \ # Use absolute path to text files.
--text_keys '[\"title\", \"tag\", \"description\"]' \ # Please remove "tag" for the books dataset.
--val_only True # Add this for evaluation
```
Citation
If our work has been of assistance to yours, feel free to give us a star ⭐ or cite us using:
```bibtex
@article{HLLM,
  title={HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling},
  author={Junyi Chen and Lu Chi and Bingyue Peng and Zehuan Yuan},
  journal={arXiv preprint arXiv:2409.12740},
  year={2024}
}

@article{HLLM-Creator,
  title={HLLM-Creator: Hierarchical LLM-based Personalized Creative Generation},
  author={Junyi Chen and Lu Chi and Siliang Xu and Shiwei Ran and Bingyue Peng and Zehuan Yuan},
  journal={arXiv preprint arXiv:2508.18118},
  year={2025}
}
```
Thanks to the excellent code repositories RecBole, VisRec, PixelRec and HSTU! HLLM is released under the Apache License 2.0; some code is modified from HSTU and PixelRec, which are released under the Apache License 2.0 and MIT License, respectively.