PDTrans (SDM 2023)

This repository provides an implementation for PDTrans as described in the paper:

Probabilistic Decomposition Transformer for Time Series Forecasting. Junlong Tong, Liping Xie, Kanjian Zhang. SDM, 2023. [Paper]

Probabilistic Decomposition Transformer

Time series forecasting is crucial for many fields, such as disaster warning, weather prediction, and energy consumption. Transformer-based models are considered to have revolutionized the field of time series. However, the autoregressive form of the Transformer introduces cumulative errors in the inference stage. Furthermore, the complex temporal patterns of time series make it harder for models to mine reliable temporal dependencies. In this paper, we propose the Probabilistic Decomposition Transformer model, which provides a flexible framework for hierarchical and decomposable forecasts. The hierarchical mechanism uses the forecasting results of the Transformer as conditional information for the generative model, performing sequence-level forecasts to approximate the ground truth, which can mitigate the cumulative error of the autoregressive Transformer. In addition, the conditional generative model encodes historical and predictive information into the latent space and reconstructs typical patterns from the latent space, such as seasonality and trend terms. This process provides a flexible framework for separating complex patterns through the interaction of information in the latent space. Extensive experiments on several datasets demonstrate the effectiveness and robustness of the model, indicating that it compares favorably with the state of the art.

Architecture

<img src="./PDTrans.png" height="250" alt="Overall architecture of PDTrans" align="center" />

Figure 1. Overall architecture of PDTrans.

Loss function

$$ \mathcal{L} = \gamma \mathcal{L}_{NLL} + \beta \mathcal{L}_{KL} + \mathcal{L}_{R}. $$

$$ \mathcal{L}_{NLL} = -\sum_{t} \log l\left(Y_{t} \mid \mu_{t}, \sigma_{t}\right) = \sum_{t} \left[ \frac{\left(Y_{t}-\mu_{t}\right)^{2}}{2 \sigma_{t}^{2}} + \log \sigma_{t} \right] + \text{Const}. $$

$$ \mathcal{L}_{KL} = D_{KL}\left(q_{\phi}\left(z \mid Y_{1: t_{0}}, \mu_{t_{0}+1: t_{0}+\tau}\right) \,\|\, p_{\theta}\left(z \mid Y_{1: t_{0}}\right)\right) = -\frac{1}{2} \sum_{t=t_{0}+1}^{t_{0}+\tau}\left(1+\log \sigma_{t}^{2}-\mu_{t}^{2}-\sigma_{t}^{2}\right). $$

$$ \mathcal{L}_{R} = -\sum_{t} \log l'\left(\hat{Y}_{t} \mid \hat{\mu}_{t}, \hat{\sigma}_{t}\right). $$
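As a rough illustration of how the terms above combine, here is a minimal sketch in plain Python. It is not the repository's code: the function names (`gaussian_nll`, `kl_to_standard_normal`, `total_loss`) are hypothetical, the inputs are plain lists rather than PyTorch tensors, and the default weights are placeholders. The KL function follows the closed form as written above, which matches the KL divergence between a diagonal Gaussian and a standard normal; treating the prior that way is an assumption here.

```python
import math


def gaussian_nll(y, mu, sigma):
    """Gaussian negative log-likelihood summed over time steps.

    Implements sum_t (y_t - mu_t)^2 / (2 sigma_t^2) + log sigma_t,
    dropping the additive constant.
    """
    return sum(
        (y_t - mu_t) ** 2 / (2 * sigma_t ** 2) + math.log(sigma_t)
        for y_t, mu_t, sigma_t in zip(y, mu, sigma)
    )


def kl_to_standard_normal(mu, sigma):
    """Closed-form KL term: -1/2 * sum_t (1 + log sigma_t^2 - mu_t^2 - sigma_t^2).

    Assumption: the reference distribution is a standard normal.
    """
    return -0.5 * sum(
        1 + math.log(sigma_t ** 2) - mu_t ** 2 - sigma_t ** 2
        for mu_t, sigma_t in zip(mu, sigma)
    )


def total_loss(nll, kl, recon, gamma=1.0, beta=1.0):
    """Weighted sum gamma * L_NLL + beta * L_KL + L_R.

    The default values of gamma and beta are placeholders, not the paper's settings.
    """
    return gamma * nll + beta * kl + recon
```

The reconstruction term has the same form as the first term, so under this sketch `gaussian_nll` could be reused for it with the reconstructed mean and scale, assuming `l'` is also Gaussian.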

Requirements

Python 3.8
PyTorch 1.8

Data

Electricity dataset: http://archive.ics.uci.edu/dataset/321/electricityloaddiagrams20112014
Traffic dataset: http://archive.ics.uci.edu/dataset/204/pems+sf
Solar dataset: https://www.nrel.gov/grid/solar-power-data.html
M4 dataset: https://www.kaggle.com/datasets/yogesh94/m4-forecasting-competition-dataset
Exchange dataset: https://github.com/laiguokun/multivariate-time-series-data/tree/master/exchange_rate

Usage

Preprocess the data:
```
python prepdata.py
```

Restore the saved model and make predictions:

```
python evaluate.py --dataset='elect' --model-name='output_elect' --restore-file='best'
```

Train the model:

```
python train.py --dataset='elect' --model-name='output_elect'
```

Reproducibility

To make the results easy to reproduce, we provide the experiment script for the electricity dataset. You can reproduce the experiment results by running:
```
bash ./script/PDTrans_elect.sh
```

Citation

If you find this repository useful, please cite our paper.

```
@inproceedings{tong2023probabilistic,
  title={Probabilistic decomposition transformer for time series forecasting},
  author={Tong, Junlong and Xie, Liping and Zhang, Kanjian},
  booktitle={Proceedings of the 2023 SIAM International Conference on Data Mining (SDM)},
  pages={478--486},
  year={2023},
  organization={SIAM}
}
```

Contact

If you have any questions, please contact: jl-tong@sjtu.edu.cn
