GenRec

A Model Zoo for Generative Recommendation, featuring RQ-VAE semantic IDs, Transformer-based retrieval, and LLM integration. Built on PyTorch with distributed training support.
Benchmark Results
Evaluation Protocol
Following TIGER, LC-Rec, and OpenOneRec:
- Dataset: Amazon 2014 with 5-core filtering (users and items with < 5 interactions removed)
- Split: Leave-one-out (last item for test, second-to-last for validation, rest for training)
- Ranking: Full ranking over the entire item set (no negative sampling)
- Max sequence length: 50 for all models
- Metrics: Recall@K and NDCG@K (K=5, 10)
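The metrics in this protocol reduce to a simple form under leave-one-out evaluation, since each user has exactly one held-out target item. A minimal sketch (illustrative only, not the repo's `metrics.py`), where `rank` is the target's 1-based position in the full-item-set ranking:

```python
import math

def recall_at_k(rank: int, k: int) -> float:
    # One relevant item per user, so Recall@K is 1 iff it ranks in the top K.
    return 1.0 if rank <= k else 0.0

def ndcg_at_k(rank: int, k: int) -> float:
    # Single relevant item => IDCG = 1 and DCG = 1 / log2(rank + 1).
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0

def evaluate(ranks: list[int], k: int) -> tuple[float, float]:
    # Average the per-user metrics over the whole test set.
    n = len(ranks)
    recall = sum(recall_at_k(r, k) for r in ranks) / n
    ndcg = sum(ndcg_at_k(r, k) for r in ranks) / n
    return recall, ndcg
```

A target ranked first contributes 1.0 to both metrics; one ranked outside the top K contributes 0, which is why NDCG@K is always bounded above by Recall@K.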

Amazon 2014 Beauty
| Methods | R@5 | R@10 | N@5 | N@10 |
|---------|-----|------|-----|------|
| SASRec (CE) | 0.0538 | 0.0851 | 0.0320 | 0.0421 |
| SASRec (BCE) | 0.0258 | 0.0503 | 0.0137 | 0.0216 |
| HSTU | 0.0568 | 0.0859 | 0.0347 | 0.0441 |
| TIGER | 0.0419 | 0.0644 | 0.0282 | 0.0354 |
| LCRec | 0.0481 | 0.0704 | 0.0331 | 0.0403 |
| OneRec-SFT (1.7B) | 0.0578 | 0.0816 | 0.0398 | 0.0475 |
Amazon 2014 Sports
| Methods | R@5 | R@10 | N@5 | N@10 |
|---------|-----|------|-----|------|
| SASRec (CE) | 0.0321 | 0.0495 | 0.0191 | 0.0248 |
| SASRec (BCE) | 0.0156 | 0.0291 | 0.0085 | 0.0128 |
| HSTU | 0.0283 | 0.0439 | 0.0182 | 0.0232 |
| TIGER | 0.0236 | 0.0377 | 0.0150 | 0.0195 |
| LCRec | 0.0238 | 0.0360 | 0.0159 | 0.0198 |
| OneRec-SFT (1.7B) | 0.0299 | 0.0436 | 0.0200 | 0.0244 |
Amazon 2014 Toys
| Methods | R@5 | R@10 | N@5 | N@10 |
|---------|-----|------|-----|------|
| SASRec (CE) | 0.0613 | 0.0922 | 0.0348 | 0.0448 |
| SASRec (BCE) | 0.0353 | 0.0594 | 0.0186 | 0.0264 |
| HSTU | 0.0611 | 0.0914 | 0.0363 | 0.0461 |
| TIGER | 0.0340 | 0.0521 | 0.0214 | 0.0272 |
| LCRec | 0.0433 | 0.0614 | 0.0310 | 0.0368 |
| OneRec-SFT (1.7B) | 0.0545 | 0.0790 | 0.0383 | 0.0462 |
Amazon 2014 Home
| Methods | R@5 | R@10 | N@5 | N@10 |
|---------|-----|------|-----|------|
| SASRec (CE) | 0.0177 | 0.0277 | 0.0106 | 0.0138 |
| SASRec (BCE) | 0.0081 | 0.0143 | 0.0046 | 0.0066 |
| HSTU | 0.0129 | 0.0208 | 0.0084 | 0.0109 |
| TIGER | 0.0145 | 0.0231 | 0.0096 | 0.0123 |
| LCRec | 0.0163 | 0.0234 | 0.0110 | 0.0133 |
| OneRec-SFT (1.7B) | 0.0166 | 0.0246 | 0.0112 | 0.0138 |
Features
- Multiple Models: Implementations of SASRec, HSTU, RQVAE, TIGER, LCRec, COBRA, and NoteLLM
- Multiple Datasets: Amazon 2014 (Beauty, Sports, Toys, Clothing) and Amazon 2023 (32 categories)
- Modular Design: Clean separation of models, data, and training logic
- Flexible Configuration: Gin-config based experiment management
- Easy Extension: Add custom datasets and models with minimal code
- Reproducible: Consistent evaluation metrics (Recall@K, NDCG@K) with W&B logging
Models
| Model | Type | Description |
|-------|------|-------------|
| SASRec | Baseline | Self-Attentive Sequential Recommendation |
| HSTU | Baseline | Hierarchical Sequential Transduction Unit with temporal bias |
| RQVAE | Generative | Residual Quantized VAE for semantic ID generation |
| TIGER | Generative | Generative retrieval with trie-based constrained decoding |
| LCRec | Generative | LLM-based recommendation with collaborative semantics |
| COBRA | Generative | Cascaded sparse-dense representations |
| NoteLLM | Generative | Retrievable LLM for note recommendation (experimental) |
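To make the RQVAE row concrete: a residual quantizer maps an item embedding to a short tuple of codebook indices (its "semantic ID") by repeatedly quantizing whatever residual the previous level left behind. A minimal sketch with toy codebooks, purely illustrative and not the repo's implementation:

```python
def nearest(codebook, vec):
    # Index of the codeword closest to vec (squared L2 distance).
    def dist2(code):
        return sum((c - v) ** 2 for c, v in zip(code, vec))
    return min(range(len(codebook)), key=lambda i: dist2(codebook[i]))

def semantic_id(codebooks, emb):
    # Each level quantizes the residual left by the previous level,
    # yielding one index per level: the item's semantic ID.
    residual = list(emb)
    ids = []
    for cb in codebooks:
        i = nearest(cb, residual)
        ids.append(i)
        residual = [r - c for r, c in zip(residual, cb[i])]
    return tuple(ids)
```

Because later levels encode finer residuals, items with similar embeddings share semantic-ID prefixes, which is what makes prefix-constrained generative retrieval (as in TIGER) possible.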
Installation
From Source (Recommended)
```bash
git clone https://github.com/phonism/genrec.git
cd genrec
pip install -e .
```
Full Installation (with Triton, TorchRec, etc.)
```bash
pip install -e ".[full]"
```
Dependencies Only
```bash
pip install -r requirements.txt
```
Quick Start
Train Baseline Models
```bash
# SASRec on Amazon 2014
python genrec/trainers/sasrec_trainer.py config/sasrec/amazon.gin --split beauty

# HSTU on Amazon 2014
python genrec/trainers/hstu_trainer.py config/hstu/amazon.gin --split beauty

# SASRec on Amazon 2023
python genrec/trainers/sasrec_trainer.py config/sasrec/amazon2023.gin

# HSTU on Amazon 2023
python genrec/trainers/hstu_trainer.py config/hstu/amazon2023.gin
```
Train RQVAE (Semantic ID Generator)
# For TIGER pipeline
python genrec/trainers/rqvae_trainer.py config/tiger/amazon/rqvae.gin --split beauty
# For LCRec pipeline
python genrec/trainers/rqvae_trainer.py config/lcrec/amazon/rqvae.gin --split beauty
# For COBRA pipeline
python genrec/trainers/rqvae_trainer.py config/cobra/amazon/rqvae.gin --split beauty
Train TIGER (Generative Retrieval)
```bash
# Requires a pretrained RQVAE checkpoint
python genrec/trainers/tiger_trainer.py config/tiger/amazon/tiger.gin --split beauty

# On Amazon 2023
python genrec/trainers/tiger_trainer.py config/tiger/amazon2023/tiger.gin
```
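TIGER decodes an item's semantic ID token by token, and the trie constraint restricts each decoding step to tokens that extend a prefix of some real item's ID, so beam search can never emit a code tuple that maps to no item. A minimal prefix-trie mask, illustrative only and not the trainer's actual code:

```python
def build_trie(item_ids):
    # Nested dicts: one level per code position of each semantic ID.
    root = {}
    for ids in item_ids:
        node = root
        for tok in ids:
            node = node.setdefault(tok, {})
    return root

def allowed_tokens(trie, prefix):
    # Tokens that extend `prefix` toward at least one real item;
    # an empty set means the prefix cannot be continued.
    node = trie
    for tok in prefix:
        node = node.get(tok)
        if node is None:
            return set()
    return set(node.keys())
```

At each beam-search step, the decoder would mask the vocabulary down to `allowed_tokens(trie, prefix_so_far)` before taking the softmax over next-token logits.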
Train LCRec (LLM-based)
```bash
# Requires a pretrained RQVAE checkpoint
python genrec/trainers/lcrec_trainer.py config/lcrec/amazon/lcrec.gin --split beauty

# On Amazon 2023
python genrec/trainers/lcrec_trainer.py config/lcrec/amazon2023/lcrec.gin
```
Train COBRA
```bash
# Requires a pretrained RQVAE checkpoint
python genrec/trainers/cobra_trainer.py config/cobra/amazon/cobra.gin --split beauty
```
Configuration
Dataset Selection
```bash
# Amazon 2014 datasets (via --split)
--split beauty    # Beauty
--split sports    # Sports and Outdoors
--split toys      # Toys and Games
--split clothing  # Clothing, Shoes and Jewelry

# Amazon 2023 datasets use dedicated config files
config/sasrec/amazon2023.gin
config/hstu/amazon2023.gin
config/tiger/amazon2023/tiger.gin
config/lcrec/amazon2023/lcrec.gin
```
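Gin files like the ones listed above bind values to configurable function parameters by name. A hypothetical fragment, purely to illustrate the syntax (these parameter names are assumptions, not copied from the repo's configs):

```
# Hypothetical bindings in the style of a trainer config
train.epochs = 100
train.batch_size = 256
train.learning_rate = 1e-3
```

Any binding in a config file can also be overridden from the command line, which is what the `--gin` flag below does.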
Parameter Override
```bash
--gin "param=value"
```
Examples
```bash
# Change epochs and batch size
python genrec/trainers/tiger_trainer.py config/tiger/amazon/tiger.gin \
    --split beauty \
    --gin "train.epochs=200" \
    --gin "train.batch_size=128"

# Custom model path for LCRec
python genrec/trainers/lcrec_trainer.py config/lcrec/amazon/lcrec.gin \
    --split beauty \
    --gin "MODEL_HUB_QWEN3_1_7B='/path/to/model'"
```
Project Structure
```
genrec/
├── genrec/
│   ├── models/                # Model implementations
│   │   ├── sasrec.py          # SASRec
│   │   ├── hstu.py            # HSTU
│   │   ├── rqvae.py           # RQVAE
│   │   ├── tiger.py           # TIGER
│   │   ├── lcrec.py           # LCRec
│   │   ├── cobra.py           # COBRA
│   │   └── notellm.py         # NoteLLM
│   ├── trainers/              # Training scripts
│   │   ├── sasrec_trainer.py
│   │   ├── hstu_trainer.py
│   │   ├── rqvae_trainer.py
│   │   ├── tiger_trainer.py
│   │   ├── lcrec_trainer.py
│   │   ├── cobra_trainer.py
│   │   └── trainer_utils.py
│   ├── modules/               # Reusable components
│   │   ├── transformer.py     # Transformer blocks
│   │   ├── embedding.py       # Embedding layers
│   │   ├── encoder.py         # Encoder modules
│   │   ├── metrics.py         # Recall@K, NDCG@K
│   │   ├── loss.py            # Loss functions
│   │   ├── scheduler.py       # LR schedulers
│   │   ├── kmeans.py          # K-means for RQVAE init
│   │   ├── gumbel.py          # Gumbel softmax
│   │   └── normalize.py       # Normalization layers
│   └── data/                  # Dataset implementations
│       ├── amazon.py          # Amazon 2014 datasets
│       ├── amazon2023.py      # Amazon 2023 datasets (32 categories)
│       ├── amazon_sasrec.py   # SASRec-specific data
│       ├── amazon_hstu.py     # HSTU-specific data
│       ├── amazon_lcrec.py    # LCRec-specific data
│       ├── amazon_cobra.py    # COBRA-specific data
│       └── p5_amazon.py       # P5-format data
├── config/                    # Gin configuration files
│   ├── base.gin               # Base config
│   ├── sasrec/                # SASRec configs
│   ├── hstu/                  # HSTU configs
│   ├── tiger/                 # TIGER configs (amazon/, amazon2023/)
│   ├── lcrec/                 # LCRec configs (amazon/, amazon2023/)
│   └── cobra/                 # COBRA configs
├── scripts/                   # Utility scripts
├── docs/                      # Documentation (English & Chinese)
├── assets/                    # Media assets
└── reference/                 # Reference implementations
```
Documentation
Full documentation is available at https://phonism.github.io/genrec
Contributing
We welcome contributions! Please see our Contributing Guide for details.
Citation
If you find this project useful, please cite:
```bibtex
@software{genrec2025,
  title  = {GenRec: A Model Zoo for Generative Recommendation},
  author = {Qi Lu},
  year   = {2025},
  url    = {https://github.com/phonism/genrec}
}
```
References
- SASRec: Self-Attentive Sequential Recommendation
- HSTU: Hierarchical Sequential Transduction Units
- TIGER: Recommender Systems with Generative Retrieval
- RQ-VAE-Recommender by Edoardo Botta
- LC-Rec: LLM-based Collaborative Recommendation
- COBRA: Cascaded Sparse-Dense Representations
- NoteLLM: A Retrievable LLM for Note Recommendation
License
This project is licensed under the MIT License - see the LICENSE file for details.
