DDUp
An implementation of the SIGMOD23 paper: Detect, Distill and Update: Learned DB Systems Facing Out of Distribution Data
Introduction
The code includes instantiations of three learned database systems: DBEst++, Naru, and TVAE.
To develop DDUp, we used the source code published by the respective works, listed below:
Setup
To increase reproducibility, we have created subdirectories for each dataset within each model. The latest code for each case can be found inside the corresponding directory.
For DBEst++: the train/update procedures are located in MDN.py. The evaluation procedures are located in benchmarking.py.
For Naru: the train/update procedures are located in incremental_train.py. The evaluation procedures are located in eval_model.py.
For TVAE: the train/update procedures are located in tvae_train.py. The evaluation procedures are located in benchmarking.py.
The code has been tested with Python 3.6 and PyTorch 1.9.
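Since the code was only tested against these versions, it can help to check the local environment before training. The snippet below is a minimal sketch and is not part of the repository: the helper names `major_minor` and `matches_tested` are hypothetical, and the comparison looks only at the major and minor version numbers.

```python
import sys

# Versions the README reports as tested.
TESTED_PYTHON = (3, 6)
TESTED_TORCH = (1, 9)


def major_minor(version):
    """Return (major, minor) from a version string such as '1.9.0+cu111'."""
    parts = version.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def matches_tested(version, tested):
    """True if the major.minor part of `version` equals the tested pair."""
    return major_minor(version) == tested


if __name__ == "__main__":
    py = "{}.{}".format(*sys.version_info[:2])
    status = "tested" if matches_tested(py, TESTED_PYTHON) else "untested"
    print("Python {}: {}".format(py, status))

    try:
        import torch  # only needed for the PyTorch check
    except ImportError:
        print("PyTorch is not installed")
    else:
        status = "tested" if matches_tested(torch.__version__, TESTED_TORCH) else "untested"
        print("PyTorch {}: {}".format(torch.__version__, status))
```

An "untested" result does not mean the code will fail, only that the combination differs from the one reported above.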
Datasets
The experiments in the paper use six public datasets. For DBEst++, we used a query template with two columns and added the modified datasets to the related folders. For TVAE, we used a smaller (1m) sample of the DMV dataset, as training TVAE on the full data was too expensive.
Links to some of the datasets:
References
If you find this repository useful in your work, please cite our SIGMOD23 paper:
@article{kurmanji2023detect,
title={Detect, Distill and Update: Learned DB Systems Facing Out of Distribution Data},
author={Kurmanji, Meghdad and Triantafillou, Peter},
journal={Proceedings of the ACM on Management of Data},
volume={1},
number={1},
pages={1--27},
year={2023},
publisher={ACM New York, NY, USA}
}
