Turing Change Point Detection Benchmark
Welcome to the repository for the Turing Change Point Detection Benchmark, a benchmark evaluation of change point detection algorithms developed at The Alan Turing Institute. This benchmark uses the time series from the Turing Change Point Dataset (TCPD).
Useful links:
- Turing Change Point Detection Benchmark
- Turing Change Point Dataset
- An Evaluation of Change Point Detection Algorithms by Gertjan van den Burg and Chris Williams.
- Annotation Tool
If you encounter a problem when using this repository or simply want to ask a
question, please don't hesitate to open an issue on
GitHub or send an
email to gertjanvandenburg at gmail dot com.
Introduction
Change point detection focuses on accurately detecting moments of abrupt change in the behavior of a time series. While many methods for change point detection exist, past research has paid little attention to the evaluation of existing algorithms on real-world data. This work introduces a benchmark study and a dataset (TCPD) that are explicitly designed for the evaluation of change point detection algorithms. We hope that our work becomes a proving ground for the comparison and development of change point detection algorithms that work well in practice.
This repository contains the code necessary to evaluate and analyze a significant number of change point detection algorithms on the TCPD, and serves to reproduce the work in Van den Burg and Williams (2020). Note that work based on either the dataset or this benchmark should cite that paper:
@article{vandenburg2020evaluation,
title={An Evaluation of Change Point Detection Algorithms},
author={{Van den Burg}, G. J. J. and Williams, C. K. I.},
journal={arXiv preprint arXiv:2003.06222},
year={2020}
}
For the experiments we used the abed command line program, which makes it easy to organize and run them. All experiments are defined through the abed_conf.py file; in particular, the hyperparameters and the command line arguments for all methods are specified there. The methods themselves are implemented as command line scripts in the execs directory. The raw results from the experiments are collected in JSON files and placed in the abed_results directory, organized by dataset and method. Finally, we use Make to coordinate our analysis scripts: first we generate summary files using summarize.py, and then use these to generate all the tables and figures in the paper.
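To give a sense of how this fits together, the sketch below shows the general shape of a method entry in such a configuration: a hyperparameter grid plus a command template with placeholders that abed fills in for each task. The method name, flag, and parameter values here are illustrative assumptions, not the actual contents of abed_conf.py; see that file for the real definitions.

```python
# Hedged sketch of a method entry in an abed-style configuration.
# The method name, flag, and values are illustrative; see abed_conf.py
# in this repository for the actual definitions.

METHODS = ["example_method"]

PARAMS = {
    # Hyperparameter grid: one task is generated per combination.
    "example_method": {"penalty": [10, 50, 100]},
}

COMMANDS = {
    # Command template; abed substitutes the {...} placeholders per task.
    "example_method": (
        "python {execdir}/python/example_method.py "
        "-i {datadir}/{dataset}.json -p {penalty}"
    ),
}
```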
Getting Started
This repository contains all the code to generate the results (tables/figures/constants) from the paper, as well as to reproduce the experiments entirely. You can either install the dependencies directly on your machine or use the provided Dockerfile (see below). If you don't use Docker, first clone this repository using:
$ git clone --recurse-submodules https://github.com/alan-turing-institute/TCPDBench
Generating Tables/Figures
The tables and figures from the paper are generated by the scripts in
analysis/scripts, which can be run through the provided Makefile. Working
Python and R installations are needed to reproduce the analysis. For
Python, install the required dependencies by running:
$ pip install -r ./analysis/requirements.txt
For R, we need the argparse and exactRankTests packages, which we can install as follows from the command line:
$ Rscript -e "install.packages(c('argparse', 'exactRankTests'))"
Subsequently we can use make to reproduce the experimental results:
$ make results
The results will be placed in ./analysis/output. Note that to generate the
figures a working LaTeX and latexmk installation is needed.
Reproducing the experiments
To fully reproduce the experiments, some additional steps are needed. Note that the Docker procedure outlined below automates this process somewhat.
First, obtain the Turing Change Point
Dataset and follow the
instructions provided there. Copy the dataset files to a datasets
directory in this repository.
To run all the tasks we use the abed
command line tool. This allows us to define the experiments in a single
configuration file (abed_conf.py) and makes it easy to keep track of which
tasks still need to be run.
Note that this repository contains all the result files, so it is not necessary to redo all the experiments. If you still wish to do so, the instructions are as follows:
1. Move the current result directory out of the way:

   $ mv abed_results old_abed_results

2. Install abed. This requires an existing installation of openmpi, but otherwise should be a matter of running:

   $ pip install 'abed>=0.1.3'

3. Tell abed to rediscover all the tasks that need to be done:

   $ abed reload_tasks

   This will populate the abed_tasks.txt file and will automatically commit the updated file to the Git repository. You can show the number of tasks that need to be completed through:

   $ abed status

4. Initialize the virtual environments for Python and R, which installs all required dependencies:

   $ make venvs

   Note that this will also create an R virtual environment (using RSimpleVenv), which ensures that the exact versions of the packages used in the experiments will be installed. This step can take a little while (:coffee:), but is important to ensure reproducibility.

5. Run abed through mpiexec, as follows:

   $ mpiexec -np 4 abed local

   This will run abed using 4 cores, which can of course be increased or decreased if desired. Note that a minimum of two cores is needed for abed to operate. You may want to run these experiments in parallel on a large number of cores, as the expected runtime is on the order of 21 days on a single core. Once this command starts running the experiments, you will see result files appear in the staging directory.
Running the experiments with Docker
If you would like to use Docker to manage the environment and dependencies, you can do so with the provided Dockerfile. Build the Docker image using:
$ docker build -t alan-turing-institute/tcpdbench github.com/alan-turing-institute/TCPDBench
To ensure that the results created in the Docker container persist to the host, we first need to create a volume (following these instructions):
$ mkdir /path/to/tcpdbench/results # *absolute* path where you want the results
$ docker volume create --driver local \
--opt type=none \
--opt device=/path/to/tcpdbench/results \
--opt o=bind tcpdbench_vol
You can then follow the same procedure as described above to reproduce the experiments, but using the relevant docker commands to run them in the container:
- For reproducing just the tables and figures, use:

  $ docker run -i -t -v tcpdbench_vol:/TCPDBench alan-turing-institute/tcpdbench /bin/bash -c "make results"

- For reproducing all the experiments, use:

  $ docker run -i -t -v tcpdbench_vol:/TCPDBench alan-turing-institute/tcpdbench /bin/bash -c "mv abed_results old_abed_results && mkdir abed_results && abed reload_tasks && abed status && make venvs && mpiexec --allow-run-as-root -np 4 abed local && make results"

  where -np 4 sets the number of cores used for the experiments to four. This can be changed as desired.
Extending the Benchmark
It should be relatively straightforward to extend the benchmark with your own methods and datasets. Remember to cite our paper if you do end up using this work.
Adding a new method
To add a new method to the benchmark, you'll need to write a script in the
execs folder that takes a dataset file as input and computes the change
point locations. Currently the methods are organized by language (R and
python), but you don't necessarily need to follow this structure when adding a
new method. Please do check the existing code for inspiration though, as
adding a new method is probably easiest when following the same structure.
Experiments are managed using the abed command line application. This facilitates running all the methods with all their hyperparameter settings on all datasets.
Note that currently the methods print their results (a JSON object containing the detected change point locations) to standard output, which is how they end up in the JSON result files collected in abed_results.
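As a concrete starting point, below is a minimal sketch of such a script in Python. It follows the general shape of the TCPD dataset files and the JSON results described above, but the argument names, the "series"/"raw" fields, and the output keys are assumptions made for illustration; mirror the existing scripts in execs for the exact conventions.

```python
#!/usr/bin/env python
# Hedged sketch of a new detection script for the execs directory.
# The dataset parsing and the output fields follow the general shape of the
# TCPD JSON files and the benchmark's JSON results, but the exact field
# names are assumptions; check the existing scripts in execs for the real
# conventions.

import argparse
import json
import time


def parse_args():
    parser = argparse.ArgumentParser(description="Example change point detector")
    parser.add_argument("-i", "--input", required=True, help="TCPD dataset file (JSON)")
    parser.add_argument("-p", "--penalty", type=float, default=10.0,
                        help="illustrative hyperparameter")
    return parser.parse_args()


def detect(values, penalty):
    # Placeholder detector: a real method computes change point indices here.
    return []


def main():
    args = parse_args()
    with open(args.input) as fp:
        data = json.load(fp)

    # Assumed layout: each dimension of the series is stored under
    # data["series"]; we use the raw observations of the first dimension.
    values = data["series"][0]["raw"]

    start = time.time()
    locations = detect(values, args.penalty)
    runtime = time.time() - start

    # Print a JSON result to stdout so it can be collected with the
    # other results (output keys are illustrative).
    print(json.dumps({
        "dataset": data.get("name"),
        "parameters": {"penalty": args.penalty},
        "result": {"cplocations": locations, "runtime": runtime},
    }))


if __name__ == "__main__":
    main()
```

Once the script works as a standalone command, register it in abed_conf.py with its hyperparameter grid and command template, as illustrated earlier, so that abed can schedule it across all datasets.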