Generative Recommendation with Semantic IDs (GRID)
GRID (Generative Recommendation with Semantic IDs) is a state-of-the-art framework for generative recommendation systems using semantic IDs, developed by scientists and engineers at Snap Research. This project implements novel approaches for learning semantic IDs from text embeddings and for generating recommendations with transformer-based generative models.
🚀 Overview
GRID facilitates generative recommendation in three overarching steps:
- Embedding Generation with LLMs: Converting item text into embeddings using any LLM available on Hugging Face.
- Semantic ID Learning: Converting item embeddings into hierarchical semantic IDs using Residual Quantization techniques such as RQ-KMeans, RQ-VAE, and RVQ.
- Generative Recommendations: Using transformer architectures to generate recommendation sequences as semantic ID tokens.
📦 Installation
Prerequisites
- Python 3.10+
- CUDA-compatible GPU (recommended)
Setup Environment
# Clone the repository
git clone https://github.com/snap-research/GRID.git
cd GRID
# Install dependencies
pip install -r requirements.txt
🎯 Quick Start
1. Data Preparation
Prepare your dataset in the expected format:
data/
├── train/ # training sequence of user history
├── validation/ # validation sequence of user history
├── test/ # testing sequence of user history
└── items/ # text of all items in the dataset
We provide the pre-processed Amazon data explored in the P5 paper [4]. The data can be downloaded from this Google Drive link.
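Before launching the later steps, it can help to confirm that a dataset directory follows the layout above. The sketch below is a minimal, hypothetical helper (the name `missing_subdirs` is not part of GRID) and assumes that each entry in the tree is a directory; it does not validate the files inside them.

```python
import tempfile
from pathlib import Path

# Subdirectory names taken from the layout shown above.
EXPECTED_SUBDIRS = ("train", "validation", "test", "items")


def missing_subdirs(data_dir):
    """Return the expected subdirectories that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


# Demo on a throwaway directory that deliberately lacks items/.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("train", "validation", "test"):
        (Path(tmp) / name).mkdir()
    demo_missing = missing_subdirs(tmp)

print(demo_missing)  # ['items']
```

Calling `missing_subdirs("data/amazon_data/beauty")` after downloading the data would return an empty list if all four subdirectories are present.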
2. Embedding Generation with LLMs
Generate item embeddings with an LLM. These embeddings are converted into semantic IDs in step 3.
python -m src.inference experiment=sem_embeds_inference_flat data_dir=data/amazon_data/beauty # available datasets: 'beauty', 'sports', and 'toys'
3. Train and Generate Semantic IDs
Learn semantic ID centroids for embeddings generated in step 2:
# embedding_path: found in the log directory from step 2
# embedding_dim: the model dimension of the LLM used in step 2 (2048 for flan-t5-xl, as in this example)
# num_hierarchies: number of codebooks to train (3 here)
# codebook_width: number of centroid rows in each codebook (256 here)
python -m src.train experiment=rkmeans_train_flat \
data_dir=data/amazon_data/beauty \
embedding_path=<output_path_from_step_2>/merged_predictions_tensor.pt \
embedding_dim=2048 \
num_hierarchies=3 \
codebook_width=256
Generate SIDs:
python -m src.inference experiment=rkmeans_inference_flat \
data_dir=data/amazon_data/beauty \
embedding_path=<output_path_from_step_2>/merged_predictions_tensor.pt \
embedding_dim=2048 \
num_hierarchies=3 \
codebook_width=256 \
ckpt_path=<the_checkpoint_you_just_get_above> # this can be found in the log dir for training SIDs
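To make the roles of `num_hierarchies` and `codebook_width` in the commands above concrete, here is a minimal pure-Python sketch of residual k-means quantization: each level clusters the residuals left over by the previous level, and an item's semantic ID is the tuple of nearest-centroid indices, one per level. It illustrates the general technique under simplifying assumptions (plain Lloyd's k-means, initialization from random data points, tiny toy data). It is not GRID's implementation, and every function name in it is hypothetical.

```python
import random


def nearest(vec, centroids):
    """Index of the centroid closest to vec in squared Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((x - y) ** 2 for x, y in zip(vec, centroids[c])),
    )


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means over a list of equal-length float lists."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        assignment = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def subtract(vec, centroid):
    """Element-wise difference vec - centroid."""
    return [x - y for x, y in zip(vec, centroid)]


def fit_residual_kmeans(embeddings, num_hierarchies, codebook_width):
    """Train one codebook per level on the residuals of the previous level."""
    codebooks, residuals = [], [list(e) for e in embeddings]
    for level in range(num_hierarchies):
        centroids = kmeans(residuals, codebook_width, seed=level)
        codebooks.append(centroids)
        residuals = [subtract(r, centroids[nearest(r, centroids)]) for r in residuals]
    return codebooks


def encode(embedding, codebooks):
    """Semantic ID of one embedding: one centroid index per codebook."""
    sid, residual = [], list(embedding)
    for centroids in codebooks:
        idx = nearest(residual, centroids)
        sid.append(idx)
        residual = subtract(residual, centroids[idx])
    return tuple(sid)


# Toy data: 50 random 4-dimensional "embeddings", 3 codebooks of 4 centroids.
rng = random.Random(42)
toy_embeddings = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(50)]
codebooks = fit_residual_kmeans(toy_embeddings, num_hierarchies=3, codebook_width=4)
toy_sid = encode(toy_embeddings[0], codebooks)
print(toy_sid)
```

With the settings used in this README, the same idea would yield three codebooks of 256 centroids each over 2048-dimensional embeddings, so every item receives a three-digit semantic ID before de-duplication.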
4. Train Generative Recommendation Model with Semantic IDs
Train the recommendation model using the learned semantic IDs:
python -m src.train experiment=tiger_train_flat \
data_dir=data/amazon_data/beauty \
semantic_id_path=<output_path_from_step_3>/pickle/merged_predictions_tensor.pt \
num_hierarchies=4 # Please note that we add 1 for num_hierarchies because in the previous step we appended one additional digit to de-duplicate the semantic IDs we generate.
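The extra digit mentioned in the comment above exists because two items can be quantized to the same semantic ID. The sketch below shows one common way to resolve such collisions, by appending a per-collision counter as a final digit. This README does not specify the exact rule GRID applies, so treat the counter scheme and the helper name `append_dedup_digit` as assumptions made for illustration.

```python
from collections import defaultdict


def append_dedup_digit(semantic_ids):
    """Append one digit so that items sharing a semantic ID become distinct.

    semantic_ids: list of tuples, one per item, in item order.
    Returns a list of tuples that are each one element longer.
    """
    seen = defaultdict(int)  # how many times each semantic ID has appeared so far
    deduplicated = []
    for sid in semantic_ids:
        key = tuple(sid)
        deduplicated.append(key + (seen[key],))
        seen[key] += 1
    return deduplicated


raw_sids = [(1, 2, 3), (1, 2, 3), (4, 5, 6)]
unique_sids = append_dedup_digit(raw_sids)
print(unique_sids)  # [(1, 2, 3, 0), (1, 2, 3, 1), (4, 5, 6, 0)]
```

Three-digit IDs become four-digit IDs, which is why `num_hierarchies` goes from 3 to 4 when training the recommendation model.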
5. Generate Recommendations
Run inference to generate recommendations:
# ckpt_path: found in the log directory from training the GR model
# num_hierarchies: 4 rather than 3, because step 3 appended one additional digit to de-duplicate the generated semantic IDs
python -m src.inference experiment=tiger_inference_flat \
data_dir=data/amazon_data/beauty \
semantic_id_path=<output_path_from_step_3>/pickle/merged_predictions_tensor.pt \
ckpt_path=<the_checkpoint_you_just_get_above> \
num_hierarchies=4
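Because the model emits recommendations as semantic ID tokens, the generated IDs have to be mapped back to items at some point. The sketch below shows one simple way to do that with a dictionary lookup. It assumes the generated candidates arrive as a ranked list of semantic ID tuples, which this README does not confirm, and the helper names are hypothetical rather than part of GRID.

```python
def build_sid_index(item_sids):
    """Map each item's semantic ID tuple to its item index."""
    return {tuple(sid): item_id for item_id, sid in enumerate(item_sids)}


def decode_candidates(candidates, sid_to_item, top_k):
    """Translate ranked semantic ID tuples into item indices.

    Candidates that match no item, or that repeat an item already
    returned, are skipped; rank order is preserved.
    """
    items = []
    for sid in candidates:
        item_id = sid_to_item.get(tuple(sid))
        if item_id is not None and item_id not in items:
            items.append(item_id)
        if len(items) == top_k:
            break
    return items


# Toy catalogue of three items with four-digit semantic IDs.
item_sids = [(1, 2, 3, 0), (1, 2, 3, 1), (4, 5, 6, 0)]
sid_to_item = build_sid_index(item_sids)

# Hypothetical ranked model output, including one ID that matches no item.
generated = [(4, 5, 6, 0), (9, 9, 9, 0), (1, 2, 3, 1), (4, 5, 6, 0)]
print(decode_candidates(generated, sid_to_item, top_k=2))  # [2, 1]
```

The lookup only works when semantic IDs are unique per item, which is what the extra de-duplication digit guarantees.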
Supported Models:
Semantic ID:
- Residual K-means proposed in OneRec [2]
- Residual Vector Quantization
- Residual Quantization with Variational Autoencoder [3]
Generative Recommendation:
- TIGER [1]
📚 Citation
If you use GRID in your research, please cite:
@inproceedings{grid,
title = {Generative Recommendation with Semantic IDs: A Practitioner's Handbook},
author = {Ju, Clark Mingxuan and Collins, Liam and Neves, Leonardo and Kumar, Bhuvesh and Wang, Louis Yufeng and Zhao, Tong and Shah, Neil},
booktitle = {Proceedings of the 34th ACM International Conference on Information and Knowledge Management (CIKM)},
year = {2025}
}
🤝 Acknowledgments
- Built with PyTorch and PyTorch Lightning
- Configuration management by Hydra
- Inspired by recent advances in generative AI and recommendation systems
- Part of this repo is built on top of https://github.com/ashleve/lightning-hydra-template
📞 Contact
For questions and support:
- Create an issue on GitHub
- Contact the development team: Clark Mingxuan Ju (mju@snap.com), Liam Collins (lcollins2@snap.com), Bhuvesh Kumar (bhuvesh@snap.com) and Leonardo Neves (lneves@snap.com).
Bibliography
[1] Rajput, Shashank, et al. "Recommender systems with generative retrieval." Advances in Neural Information Processing Systems 36 (2023): 10299-10315.
[2] Deng, Jiaxin, et al. "OneRec: Unifying retrieve and rank with generative recommender and iterative preference alignment." arXiv preprint arXiv:2502.18965 (2025).
[3] Lee, Doyup, et al. "Autoregressive image generation using residual quantization." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022.
[4] Geng, Shijie, et al. "Recommendation as language processing (RLP): A unified pretrain, personalized prompt & predict paradigm (P5)." Proceedings of the 16th ACM conference on recommender systems. 2022.
