Generative Recommendation with Semantic IDs (GRID)

GRID (Generative Recommendation with Semantic IDs) is a state-of-the-art framework for generative recommendation systems using semantic IDs, developed by a group of scientists and engineers from Snap Research. This project implements novel approaches for learning semantic IDs from text embeddings and for generating recommendations with transformer-based generative models.

🚀 Overview

GRID facilitates generative recommendation in three overarching steps:

Embedding Generation with LLMs: Converting item text into embeddings using any LLM available on Hugging Face.
Semantic ID Learning: Converting item embeddings into hierarchical semantic IDs using residual quantization techniques such as RQ-KMeans, RQ-VAE, and RVQ.
Generative Recommendations: Using transformer architectures to generate recommendation sequences as semantic ID tokens.

📦 Installation

Prerequisites

Python 3.10+
CUDA-compatible GPU (recommended)

Setup Environment

# Clone the repository
git clone https://github.com/snap-research/GRID.git
cd GRID

# Install dependencies
pip install -r requirements.txt

🎯 Quick Start

1. Data Preparation

Prepare your dataset in the expected format:

data/
├── train/       # training sequence of user history 
├── validation/  # validation sequence of user history 
├── test/        # testing sequence of user history 
└── items/       # text of all items in the dataset
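
Before running the later steps, it can help to confirm that a dataset directory matches the layout above. The following is a minimal sketch, not part of GRID: the helper name `missing_subdirs` is hypothetical, and it only checks that the four subdirectories exist, not the format of the files inside them.

```python
from pathlib import Path

# Subdirectory names taken from the layout shown above.
EXPECTED_SUBDIRS = ("train", "validation", "test", "items")


def missing_subdirs(data_dir):
    """Return the expected subdirectories that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Hypothetical path; point this at your own dataset directory.
    missing = missing_subdirs("data")
    if missing:
        print("Missing subdirectories:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```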

We provide the pre-processed Amazon data explored in the P5 paper [4]. The data can be downloaded from this Google Drive link.

2. Embedding Generation with LLMs

Generate item embeddings with an LLM; these embeddings are transformed into semantic IDs in step 3.

python -m src.inference experiment=sem_embeds_inference_flat data_dir=data/amazon_data/beauty # available datasets include 'beauty', 'sports', and 'toys'

3. Train and Generate Semantic IDs

Learn semantic ID centroids for embeddings generated in step 2:

# embedding_path: can be found in the log directory from step 2
# embedding_dim: model dimension of the LLM used in step 2 (2048 for flan-t5-xl, as in this example)
# num_hierarchies: number of codebooks to train (3 here)
# codebook_width: number of centroid rows in each codebook (256 here)
python -m src.train experiment=rkmeans_train_flat \
    data_dir=data/amazon_data/beauty \
    embedding_path=<output_path_from_step_2>/merged_predictions_tensor.pt \
    embedding_dim=2048 \
    num_hierarchies=3 \
    codebook_width=256
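
To make the roles of num_hierarchies and codebook_width concrete, here is a minimal NumPy sketch of residual k-means on toy data. It is an illustration under assumptions, not GRID's implementation: the function names `kmeans` and `residual_kmeans` are hypothetical, the k-means is a plain Lloyd's loop, and the toy sizes are far smaller than the 2048-dimensional embeddings used above.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # empty clusters keep their old centroid
                centroids[j] = members.mean(axis=0)
    # Final assignment against the updated centroids.
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return centroids, dists.argmin(axis=1)


def residual_kmeans(embeddings, num_hierarchies=3, codebook_width=256):
    """Quantise embeddings level by level; each level clusters the residual."""
    residual = embeddings.astype(np.float64)
    codebooks, codes = [], []
    for level in range(num_hierarchies):
        centroids, assign = kmeans(residual, codebook_width, seed=level)
        codebooks.append(centroids)
        codes.append(assign)
        # Subtract the chosen centroid so the next level models what is left.
        residual = residual - centroids[assign]
    # Shape (num_items, num_hierarchies): one code per level for every item.
    return codebooks, np.stack(codes, axis=1)


if __name__ == "__main__":
    # Toy data: 200 items with 16-dimensional embeddings (assumed sizes).
    toy = np.random.default_rng(0).normal(size=(200, 16))
    codebooks, codes = residual_kmeans(toy, num_hierarchies=3, codebook_width=8)
    print(codes[:5])  # each row is one item's 3-level semantic ID
```

Each item ends up with num_hierarchies integers, each in the range [0, codebook_width), and summing the selected centroids across levels gives an approximation of the original embedding.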

Generate SIDs:

python -m src.inference experiment=rkmeans_inference_flat \
    data_dir=data/amazon_data/beauty \
    embedding_path=<output_path_from_step_2>/merged_predictions_tensor.pt \
    embedding_dim=2048 \
    num_hierarchies=3 \
    codebook_width=256 \
    ckpt_path=<the_checkpoint_you_just_get_above> # this can be found in the log dir for training SIDs

4. Train Generative Recommendation Model with Semantic IDs

Train the recommendation model using the learned semantic IDs:

python -m src.train experiment=tiger_train_flat \
    data_dir=data/amazon_data/beauty \
    semantic_id_path=<output_path_from_step_3>/pickle/merged_predictions_tensor.pt \
    num_hierarchies=4 # 3 codebooks plus 1: step 3 appends one additional digit to de-duplicate the generated semantic IDs
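
The extra digit exists because two items can receive identical codes from the three codebooks. The sketch below shows one common way to break such ties, by appending a per-prefix collision counter. The helper `append_dedup_digit` is hypothetical, and the exact scheme GRID uses is not described here, so treat this as an assumption.

```python
from collections import defaultdict


def append_dedup_digit(semantic_ids):
    """Append a collision counter so every item gets a unique ID tuple."""
    seen = defaultdict(int)  # prefix -> number of items already assigned to it
    unique_ids = []
    for sid in semantic_ids:
        prefix = tuple(sid)
        # The first item with this prefix gets 0, the next gets 1, and so on.
        unique_ids.append(prefix + (seen[prefix],))
        seen[prefix] += 1
    return unique_ids


if __name__ == "__main__":
    # Two items collide on (1, 2, 3); the extra digit tells them apart.
    ids = [(1, 2, 3), (1, 2, 3), (4, 5, 6)]
    print(append_dedup_digit(ids))
```

With three codebooks, every resulting ID has four digits, which is why the recommendation model is configured with num_hierarchies=4.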

5. Generate Recommendations

Run inference to generate recommendations:

# ckpt_path: can be found in the log dir for training GR models
# num_hierarchies: 3 codebooks plus the additional de-duplication digit appended in step 3
python -m src.inference experiment=tiger_inference_flat \
    data_dir=data/amazon_data/beauty \
    semantic_id_path=<output_path_from_step_3>/pickle/merged_predictions_tensor.pt \
    ckpt_path=<the_checkpoint_you_just_get_above> \
    num_hierarchies=4
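
Because the model emits semantic ID tokens rather than item identifiers, generated sequences have to be mapped back to items at some point. The following sketch illustrates that lookup in plain Python. The helpers `build_sid_lookup` and `decode_recommendations` are hypothetical, and how GRID performs this step internally is not covered here.

```python
def build_sid_lookup(item_sids):
    """Map each semantic-ID tuple to the index of the item it identifies."""
    return {tuple(sid): item_idx for item_idx, sid in enumerate(item_sids)}


def decode_recommendations(generated, lookup, top_k=10):
    """Keep generated sequences that name a real item, in ranked order."""
    items = []
    for sid in generated:
        item = lookup.get(tuple(sid))
        # Skip sequences that match no item, and skip repeated items.
        if item is not None and item not in items:
            items.append(item)
        if len(items) == top_k:
            break
    return items


if __name__ == "__main__":
    # Toy catalogue of three items with 4-digit semantic IDs (assumed values).
    catalogue = [(1, 2, 3, 0), (1, 2, 3, 1), (4, 5, 6, 0)]
    lookup = build_sid_lookup(catalogue)
    # Ranked model outputs; (9, 9, 9, 0) does not correspond to any item.
    ranked = [(4, 5, 6, 0), (9, 9, 9, 0), (1, 2, 3, 1), (4, 5, 6, 0)]
    print(decode_recommendations(ranked, lookup, top_k=2))
```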

Supported Models:

Semantic ID:

Residual K-means proposed in OneRec [2]
Residual Vector Quantization
Residual Quantization with Variational Autoencoder [3]

Generative Recommendation:

TIGER [1]

📚 Citation

If you use GRID in your research, please cite:

@inproceedings{grid,
  title     = {Generative Recommendation with Semantic IDs: A Practitioner's Handbook},
  author    = {Ju, Clark Mingxuan and Collins, Liam and Neves, Leonardo and Kumar, Bhuvesh and Wang, Louis Yufeng and Zhao, Tong and Shah, Neil},
  booktitle = {Proceedings of the 34th ACM International Conference on Information and Knowledge Management (CIKM)},
  year      = {2025}
}

🤝 Acknowledgments

Built with PyTorch and PyTorch Lightning
Configuration management by Hydra
Inspired by recent advances in generative AI and recommendation systems
Part of this repo is built on top of https://github.com/ashleve/lightning-hydra-template

📞 Contact

For questions and support:

Create an issue on GitHub
Contact the development team: Clark Mingxuan Ju (mju@snap.com), Liam Collins (lcollins2@snap.com), Bhuvesh Kumar (bhuvesh@snap.com) and Leonardo Neves (lneves@snap.com).

Bibliography

[1] Rajput, Shashank, et al. "Recommender systems with generative retrieval." Advances in Neural Information Processing Systems 36 (2023): 10299-10315.

[2] Deng, Jiaxin, et al. "OneRec: Unifying retrieve and rank with generative recommender and iterative preference alignment." arXiv preprint arXiv:2502.18965 (2025).

[3] Lee, Doyup, et al. "Autoregressive image generation using residual quantization." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022.

[4] Geng, Shijie, et al. "Recommendation as language processing (RLP): A unified pretrain, personalized prompt & predict paradigm (P5)." Proceedings of the 16th ACM conference on recommender systems. 2022.
