Recommenders
Best Practices on Recommendation Systems
What's New (April, 2025)
We reached 20,000 stars!!
We are happy to announce that we have reached 20,000 stars on GitHub! Thank you for your support and contributions to the Recommenders project. We are excited to continue building and improving this project with your help.
Check out the release Recommenders 1.2.1!
We fixed many dependency-related bugs, improved security, and reviewed the notebooks and the libraries.
Introduction
The objective of Recommenders is to assist researchers, developers and enthusiasts in prototyping, experimenting with and bringing to production a range of classic and state-of-the-art recommendation systems.
Recommenders is a project under the Linux Foundation of AI and Data.
This repository contains examples and best practices for building recommendation systems, provided as Jupyter notebooks. The examples detail our learnings on five key tasks:
- Prepare Data: Preparing and loading data for each recommendation algorithm.
- Model: Building models using various classical and deep learning recommendation algorithms such as Alternating Least Squares (ALS) or eXtreme Deep Factorization Machines (xDeepFM).
- Evaluate: Evaluating algorithms with offline metrics.
- Model Select and Optimize: Tuning and optimizing hyperparameters for recommendation models.
- Operationalize: Operationalizing models in a production environment on Azure.
Several utilities are provided in recommenders to support common tasks such as loading datasets in the format expected by different algorithms, evaluating model outputs, and splitting training/test data. Implementations of several state-of-the-art algorithms are included for self-study and customization in your own applications. See the Recommenders documentation.
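To make the splitting and evaluation tasks concrete, here is a minimal, dependency-free sketch of a ratio-based random split and a precision@k metric. The function names `random_split` and `precision_at_k` and the toy data are illustrative assumptions for this example only; this is not the library's implementation, and the utilities shipped in recommenders may differ in signature and in how they average over users.

```python
import random


def random_split(interactions, ratio=0.75, seed=42):
    """Shuffle (user, item, rating) rows and split them into train/test lists.

    Illustrative helper only; not the splitter shipped with recommenders.
    """
    rows = list(interactions)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * ratio)
    return rows[:cut], rows[cut:]


def precision_at_k(recommended, relevant, k=10):
    """Average over users of (hits among the top-k recommendations) / k.

    Users without any held-out relevant items are skipped.
    """
    scores = []
    for user, items in recommended.items():
        truth = relevant.get(user, set())
        if not truth:
            continue
        hits = sum(1 for item in items[:k] if item in truth)
        scores.append(hits / k)
    return sum(scores) / len(scores) if scores else 0.0


# Toy interaction log: (user, item, rating). Made-up values for illustration.
interactions = [
    ("u1", "i1", 5.0), ("u1", "i2", 3.0), ("u1", "i3", 4.0), ("u1", "i4", 2.0),
    ("u2", "i1", 4.0), ("u2", "i5", 5.0), ("u2", "i6", 1.0), ("u2", "i7", 3.0),
]
train_rows, test_rows = random_split(interactions, ratio=0.75)

# Hypothetical model output (ranked items per user) and held-out ground truth.
recommended = {"u1": ["i1", "i2", "i3"], "u2": ["i4", "i5", "i6"]}
relevant = {"u1": {"i1", "i3"}, "u2": {"i9"}}

score = precision_at_k(recommended, relevant, k=3)
print(f"train={len(train_rows)} test={len(test_rows)} precision@3={score:.3f}")
```

Fixing the seed keeps the split reproducible, which matters when comparing several algorithms on the same held-out data.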
For a more detailed overview of the repository, please see the documents on the wiki page.
For some of the practical scenarios where recommendation systems have been applied, see scenarios.
Getting Started
We recommend uv for environment management (10-100x faster than conda/pip), and VS Code for development. To install the recommenders package and run an example notebook on Linux/WSL:
# 1. Install gcc if it is not installed already. On Ubuntu, this could be done by using the command
# sudo apt install gcc
# 2. Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# 3. Create and activate a new virtual environment
uv venv ~/.venvs/recommenders --python 3.11
source ~/.venvs/recommenders/bin/activate
# 4. Install the core recommenders package. It can run all the CPU notebooks.
uv pip install recommenders
# 5. Create a Jupyter kernel
uv pip install ipykernel
python -m ipykernel install --user --name recommenders --display-name "Python (recommenders)"
# 6. Clone this repo within VSCode or using command line:
git clone https://github.com/recommenders-team/recommenders.git
# 7. Within VSCode:
# a. Open a notebook, e.g., examples/00_quick_start/sar_movielens.ipynb;
# b. Select Jupyter kernel "Python (recommenders)";
# c. Run the notebook.
For more information about setup on other platforms (e.g., Windows and macOS) and different configurations (e.g., GPU, Spark and experimental features), see the Setup Guide.
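After installation, a quick sanity check from inside the new environment can confirm that the package is visible to the interpreter. The sketch below uses only the standard library (`importlib.metadata`); the helper name `installed_version` is a hypothetical name chosen for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("recommenders")
if version is None:
    print("recommenders is not installed in this environment")
else:
    print(f"recommenders {version} is installed")
```

If the check reports that the package is missing, the most likely cause is that the virtual environment was not activated or that the notebook is using a different Jupyter kernel.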
In addition to the core package, several extras are also provided, including:
- [gpu]: Needed for running GPU models.
- [spark]: Needed for running Spark models.
- [dev]: Needed for development for the repo.
- [all]: [gpu]|[spark]|[dev]
- [experimental]: Models that are not thoroughly tested and/or may require additional steps in installation.
Algorithms
The table below lists the recommendation algorithms currently available in the repository. Notebooks are linked under the Example column either as Quick start, showing an easy-to-run example of the algorithm, or as Deep dive, explaining the math and implementation of the algorithm in detail.
| Algorithm | Type | Description | Example |
|-----------|------|-------------|---------|
| Alternating Least Squares (ALS) | Collaborative Filtering | Matrix factorization algorithm for explicit or implicit feedback in large datasets, optimized for scalability and distributed computing capability. It works in the PySpark environment. | Quick start / Deep dive |
| Attentive Asynchronous Singular Value Decomposition (A2SVD) | Collaborative Filtering | Sequential-based algorithm that aims to capture both long and short-term user preferences using attention mechanism. It works in the CPU/GPU environment. | Quick start |
| Cornac/Bayesian Personalized Ranking (BPR) | Collaborative Filtering | Matrix factorization algorithm for predicting item ranking with implicit feedback. It works in the CPU environment. | Deep dive |
| Cornac/Bilateral Variational Autoencoder (BiVAE) | Collaborative Filtering | Generative model for dyadic data (e.g., user-item interactions). It works in the CPU/GPU environment. | Deep dive |
| Convolutional Sequence Embedding Recommendation (Caser) | Collaborative Filtering | Algorithm based on convolutions that aim to capture both user’s general preferences and sequential patterns. It works in the CPU/GPU environment. | Quick start |
| Deep Knowledge-Aware Network (DKN) | Content-Based Filtering | Deep learning algorithm incorporating a knowledge graph and article embeddings for providing news or article recommendations. It works in the CPU/GPU environment. | Quick start / Deep dive |
| Extreme Deep Factorization Machine (xDeepFM) | Collaborative Filtering | Deep learning based algorithm for implicit and explicit feedback with user/item features. It works in the CPU/GPU environment. | Quick start |
| Embedding Dot Bias | Collaborative Filtering | General purpose algorithm with embeddings and biases for users and items. It works in the CPU/GPU environment. | Quick start |
| LightFM/Factorization Machine | Collaborative Filtering | Factorization Machine algorithm for both implicit and explicit feedback. It works in the CPU environment. | Quick start |
| LightGBM/Gradient Boosting Tree | Content-Based Filtering | Gradient Boosting Tree algorithm for fast training and low memory usage in content-based problems. It works in the CPU/GPU/PySpark environments. | Quick start in CPU / Deep dive in PySpark |
| LightGCN | Collaborative Filtering | Deep learning algorithm which simplifies the design of GCN for predicting implicit feedback. It works in the CPU/GPU environment. | Deep dive |
| GeoIMC<sup>*</sup> | Collaborative Filtering | Matrix completion algorithm that takes into account user and item features using Riemannian conjugate gradient optimization and follows a geometric approach. It works in the CPU environment. | Quick start |
| GRU | Collaborative Filtering | Sequential-based algorithm that aims to capture both long and short-term user preferences using recurrent neural networks. It works in the CPU/GPU environment. | Quick start |
| Multinomial VAE | Collaborative Filtering | Generative model for predicting user/item interactions. It works in the CPU/GPU environment. | [Deep dive](examples/02_model_collaborative_filtering/multi_vae_deep_dive.
