RelBench: Relational Deep Learning Benchmark
Website | Position Paper | RelBench v1 Paper | RelBench v2 Paper | Mailing List
News
February 13, 2026: RelBench v2 paper + Temporal Graph Benchmark integration
The RelBench v2 paper is now available as a preprint! Please see the paper on arXiv.
Alongside our paper, we also integrate the Temporal Graph Benchmark (TGB) into RelBench. TGB integration includes translating time-stamped event streams into normalized relational schemas, which enables direct comparison between temporal graph models and relational deep learning models.
January 12, 2026: RelBench v2 is now released!
- Introducing Autocomplete tasks: a new task paradigm for predicting existing columns in the database.
- 4 new databases: SALT, RateBeer, arXiv, and MIMIC-IV.
- 36 new predictive tasks, including 23 Autocomplete tasks across new and existing databases.
- CTU integration: 70+ relational datasets from the CTU repository via ReDeLEx.
- Direct SQL database connectivity via ReDeLEx.
- 4DBInfer integration: 7 relational datasets from the 4DBInfer repository in RelBench format.
- Bug fixes and performance improvements:
  - Optionally include (time-censored) labels as features in the database. (#327)
  - Support NDCG metric for link prediction. (#276)
  - Optimize SentenceTransformer encoding with Torch for 10-20% faster processing than default NumPy encoding. (#261)
  - Enable configuring the RelBench cache directory via an environment variable. (#336)
  - ... and more (see commit history for details)
September 26, 2024: RelBench is accepted to the NeurIPS Datasets and Benchmarks track!
July 3rd, 2024: RelBench v1 is now released!
Overview
Relational Deep Learning is a new approach for end-to-end representation learning on data spread across multiple tables, such as in a relational database (see our position paper). Relational databases are the world's most widely used data management system, and are used for industrial and scientific purposes across many domains. RelBench is a benchmark designed to facilitate efficient, robust, and reproducible research on end-to-end deep learning over relational databases.
RelBench v1 contains 7 realistic, large-scale, and diverse relational databases spanning domains including medical, social networks, e-commerce, and sports. RelBench v2 adds 4 more, for a total of 11 databases. Each database has multiple predictive tasks defined (66 in total), each carefully scoped to be both challenging and of domain-specific importance. RelBench provides full support for data downloading, task specification, and standardized evaluation in an ML-framework-agnostic manner.
Additionally, RelBench provides a first open-source implementation of a Graph Neural Network-based approach to relational deep learning. This implementation uses PyTorch Geometric to load the data as a graph and train GNN models, and PyTorch Frame for modeling tabular data. Finally, there is an open leaderboard for tracking progress.
Key Papers
RelBench: A Benchmark for Deep Learning on Relational Databases
This paper details our approach to designing the RelBench benchmark. It also includes a key user study showing that relational deep learning can produce performant models with a fraction of the manual human effort required by typical data science pipelines. This paper is useful for a detailed understanding of RelBench and our initial benchmarking results. If you just want to quickly familiarize yourself with the data and tasks, the website is a better place to start.
Relational Deep Learning: Graph Representation Learning on Relational Databases

This paper outlines our proposal for how to do end-to-end deep learning on relational databases by combining graph neural networks with deep tabular models. We recommend reading this paper if you want to think about new methods for end-to-end deep learning on relational databases. The paper also includes a section on possible directions for future research, giving a snapshot of the research opportunities in this area.
Design of RelBench
<p align="center"><img src="https://relbench.stanford.edu/img/relbench-fig.png" alt="logo" width="900px" /></p>

RelBench has the following main components:
- 11 databases with a total of 66 tasks, both automatically downloadable for ease of use
- Easy data loading, and graph construction from pkey-fkey links
- Your own model, which can use any deep learning stack since RelBench is framework-agnostic. We provide a first model implementation using PyTorch Geometric and PyTorch Frame.
- Standardized evaluators - all you need to do is produce a list of predictions for test samples, and RelBench computes metrics to ensure standardized evaluation
- A leaderboard you can upload your results to, for tracking SOTA progress.
Installation
You can install RelBench using pip:
```bash
pip install relbench
```
This installs the core RelBench data and task loading functionality.
To additionally use relbench.modeling, which requires PyTorch, PyTorch Geometric and PyTorch Frame, install these dependencies manually or do:
```bash
pip install relbench[full]
```
For the scripts in the examples directory, use:
```bash
pip install relbench[example]
```
Then, to run a script:
```bash
git clone https://github.com/snap-stanford/relbench
cd relbench/examples
python gnn_entity.py --dataset rel-f1 --task driver-position
```
Using External Integrations
Using CTU datasets
To use datasets from the CTU repository, use:
```bash
pip install relbench[ctu]
```
If you use the CTU datasets in your work, please cite ReDeLEx as below:
```bibtex
@misc{peleska2025redelex,
  title={REDELEX: A Framework for Relational Deep Learning Exploration},
  author={Jakub Peleška and Gustav Šír},
  year={2025},
  eprint={2506.22199},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2506.22199},
}
```
Using 4DBInfer datasets
To use datasets from the 4DBInfer repository, use:
```bash
pip install relbench[dbinfer]
```
If you use the 4DBInfer datasets in your work, please cite 4DBInfer as below:
```bibtex
@article{dbinfer,
  title={4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs},
  author={Wang, Minjie and Gan, Quan and Wipf, David and Cai, Zhenkun and Li, Ning and Tang, Jianheng and Zhang, Yanlin and Zhang, Zizhao and Mao, Zunyao and Song, Yakun and Wang, Yanbo and Li, Jiahang and Zhang, Ha
```
