Athena

Athena is a learned optimizer enhancer, proposed to enhance the query optimizer of DBMSs. It is consisted of three key designs: (i) Order-centric plan explorer, (ii) Tree-Mamba plan comparator, and (iii) Time-weighted loss function.

Order-centric plan explorer is implemented in Athena_PG.
Tree-Mamba plan comparator is implemented in athena_model.py.
Time-weighted loss function is implemented in train.py as function tailr_loss_with_logits.

To ensure usability and reusability, we provide the following two implementation approaches: (1) Docker and (2) manual step-by-step deployment.

Method 1: Run in Docker

Install Docker and Docker-compose.

For example, for ubuntu, please refer to the following link:

Install Docker using the apt repository

Notice that if you encountered a network problem, supposing that you have a http proxy at http://<ip>:<port>, such as http://127.0.0.1:7890, you can try the following command. Remember to replace the <ip> and <port> in the following texts to your actual value.
```
export HTTP_PROXY=http://<ip>:<port>
export HTTPS_PROXY=$HTTP_PROXY
cat <<EOF | sudo tee /etc/systemd/system/docker.service.d/http-proxy.conf > /dev/null
[Service]
Environment="HTTP_PROXY=$HTTP_PROXY"
Environment="HTTPS_PROXY=$HTTP_PROXY"
Environment="NO_PROXY=localhost,127.0.0.1"
EOF
```
and add two lines after the line - POSTGRES_PASSWORD=postgres in the ./docker-compose.yml
```
   - HTTP_PROXY=http://<ip>:<port>
   - HTTPS_PROXY=http://<ip>:<port>
```
Download datasets
- IMDB
Download the imdb.tgz to ./postgres_service
Generate configuration for PostgreSQL.

Go to PGTune, generate the configuration of your computer, and save it to ./postgres_service/postgresql.conf.
Train our model in Docker
```
docker compose up --build
```
Run other commands

Refer to other commands in section Run. Replace the command property of the athena service with the command you want in docker-compose.yml. And then run docker compose up --build again.

Method 2: Manual deployment

I. Dependencies

- Python dependencies

uv pip install -r requirements.txt

// install mamba in https://github.com/a858438680/TreeMamba.git
git clone https://github.com/a858438680/tree_mamba.git
uv pip install --no-build-isolation ./python

- Postgres extensions

No need to install these extensions here, there is guidelines when installing Athena_PG.

II. Data

Athena support 4 benchmarks: JOB, STATS-CEB, TPC-DS, and DSB.

We have included their workloads into the dir workloads/, where the training and test queries of each benchmark are placed into dir xxx and xxx-sample respectively.

III. Run

Install Athena_PG before the following steps.

1. Prepare plans for model training

# generate plans for JOB queries as example
python -u src/Athena/generate_dataset.py --database imdb --workload JOB

This code employs Athena's plan explorer to enumerate candidate plans for input queries, executes them, and stores the results into dir datasets as the training samples. The dir datasets is orgranizeda as datasets/{database}/{workload}/{plan-generation-method}/.

We also support to generate training samples of Bao, Lero, Kepler. For example, for Bao, check the file.

2. Train the Tree-Mamba plan comparator

# set parameter for torch determistic algorithm
export CUBLAS_WORKSPACE_CONFIG=:4096:8

# train a Tree-Mamba plan comparator for JOB queries as an example
python scripts/run_experiments.py --models athena --workloads JOB --methods athena

This code will train a Tree-Mamba plan comparator based on the collected training samples in the last step and store the model in the dir models/{model}/{workload}/{plan-generation-methods}/{random-seed}/models/. For this example, the dir to store the model is models/athena/JOB/athena/3407/models/.

3. Test the model

This step tests the model based JOB as an example.

3.1 Geneate plans for test queries

python -u src/Athena/generate_dataset.py --database imdb --workload JOB-sample --generate_candidate_plans

The code with the option generate_candidate_plans only generates the plans for queries and does not execute them. The generated plans will be also stored in dir datasets/.

3.2 Test model

python src/Athena/test.py --dataset imdb/JOB-sample/Athena --model models/athena/JOB/athena/3407/models/199.pt

This code will employ the model to select a plan from the candidate plans of each query and store the indexes of selected plans in the dir results/{database}/{workload}/{plan-generation-method}/{model}.json. file results/imdb/JOB-sample/Athena/Athena.json.

3.3 Execute the selected plans in the last step

python -u scripts/test_model.py --database=imdb --workload=JOB-sample --plan_generation_method=Athena --model=Athena --result_path=Athena

This code loads the indexes of the selected plan in the last step, executes them, and stores the execution results into dir records/. This code will directly load the records instead of executing the queries next time.

We have stored model selection and execution records for Athena on JOB-sample, so you can just run 3.3 without 3.1 and 3.2.

Citation

If you find Athena useful, please cite the following paper:

@article{10.1145/3725395,
author = {Li, Runzhong and Li, Qilong and Liu, Haotian and Mao, Rui and Li, Qing and Tang, Bo},
title = {Athena: An Effective Learning-based Framework for Query Optimizer Performance Improvement},
year = {2025},
issue_date = {June 2025},
volume = {3},
number = {3},
doi = {10.1145/3725395},
journal = {Proc. ACM Manag. Data},
month = jun,
articleno = {132},
numpages = {24}
}

Athena

Install / Use

README

Athena

Method 1: Run in Docker

Method 2: Manual deployment

I. Dependencies

- Python dependencies

- Postgres extensions

II. Data

III. Run

1. Prepare plans for model training

2. Train the Tree-Mamba plan comparator

3. Test the model

Citation