.. |PyPI Version| image:: https://img.shields.io/pypi/v/stumpy.svg
    :target: https://pypi.org/project/stumpy/
    :alt: PyPI Version
.. |Conda Forge Version| image:: https://anaconda.org/conda-forge/stumpy/badges/version.svg
    :target: https://anaconda.org/conda-forge/stumpy
    :alt: Conda-Forge Version
.. |PyPI Downloads| image:: https://static.pepy.tech/badge/stumpy/month
    :target: https://pepy.tech/project/stumpy
    :alt: PyPI Downloads
.. |License| image:: https://img.shields.io/pypi/l/stumpy.svg
    :target: https://github.com/stumpy-dev/stumpy/blob/main/LICENSE.txt
    :alt: License
.. |Test Status| image:: https://github.com/stumpy-dev/stumpy/workflows/Tests/badge.svg
    :target: https://github.com/stumpy-dev/stumpy/actions?query=workflow%3ATests+branch%3Amain
    :alt: Test Status
.. |Code Coverage| image:: https://img.shields.io/badge/Coverage-100%25-green
    :alt: Code Coverage
.. |RTD Status| image:: https://readthedocs.org/projects/stumpy/badge/?version=latest
    :target: https://stumpy.readthedocs.io/
    :alt: ReadTheDocs Status
.. |Binder| image:: https://mybinder.org/badge_logo.svg
    :target: https://mybinder.org/v2/gh/stumpy-dev/stumpy/main?filepath=notebooks
    :alt: Binder
.. |JOSS| image:: http://joss.theoj.org/papers/10.21105/joss.01504/status.svg
    :target: https://doi.org/10.21105/joss.01504
    :alt: JOSS
.. |DOI| image:: https://zenodo.org/badge/184809315.svg
    :target: https://zenodo.org/badge/latestdoi/184809315
    :alt: DOI
.. |NumFOCUS| image:: https://img.shields.io/badge/NumFOCUS-Affiliated%20Project-orange.svg?style=flat&colorA=E1523D&colorB=007D8A
    :target: https://numfocus.org/sponsored-projects/affiliated-projects
    :alt: NumFOCUS Affiliated Project
.. |Twitter| image:: https://img.shields.io/twitter/follow/stumpy_dev.svg?style=social
    :target: https://twitter.com/stumpy_dev
    :alt: Twitter

.. image:: https://raw.githubusercontent.com/stumpy-dev/stumpy/main/docs/images/stumpy_logo_small.png
    :target: https://github.com/stumpy-dev/stumpy
    :alt: STUMPY Logo

======
STUMPY
======

STUMPY is a powerful and scalable Python library that efficiently computes something called the `matrix profile <https://stumpy.readthedocs.io/en/latest/Tutorial_The_Matrix_Profile.html>`__, which is just an academic way of saying "for every (green) subsequence within your time series, automatically identify its corresponding nearest-neighbor (grey)":

.. image:: https://github.com/stumpy-dev/stumpy/blob/main/docs/images/stumpy_demo.gif?raw=true
    :alt: STUMPY Animated GIF

What's important is that once you've computed your matrix profile (middle panel above), it can be used for a variety of time series data mining tasks such as:

* pattern/motif (approximately repeated subsequences within a longer time series) discovery
* anomaly/novelty (discord) discovery
* shapelet discovery
* semantic segmentation
* streaming (on-line) data
* fast approximate matrix profiles
* time series chains (temporally ordered set of subsequence patterns)
* snippets for summarizing long time series
* pan matrix profiles for selecting the best subsequence window size(s)
* `and more ... <https://www.cs.ucr.edu/~eamonn/100_Time_Series_Data_Mining_Questions__with_Answers.pdf>`__

Whether you are an academic, data scientist, software developer, or time series enthusiast, STUMPY is straightforward to install and our goal is to allow you to get to your time series insights faster. See the `documentation <https://stumpy.readthedocs.io/en/latest/>`__ for more information.
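
To make the "nearest neighbor for every subsequence" idea concrete, here is a minimal brute-force sketch in plain NumPy. It is not how STUMPY computes the matrix profile (STUMPY uses much faster algorithms); the helper names ``znormalize`` and ``naive_matrix_profile`` are hypothetical, and the exclusion zone of ``ceil(m / 4)`` is an assumption made here to skip trivial self-matches:

.. code:: python

    import numpy as np


    def znormalize(x):
        """Return x rescaled to zero mean and unit standard deviation."""
        return (x - x.mean()) / x.std()


    def naive_matrix_profile(T, m):
        """Brute-force matrix profile of T for window size m (illustrative only).

        Returns (P, I): P[i] is the z-normalized Euclidean distance from
        subsequence i to its nearest neighbor and I[i] is that neighbor's index.
        """
        n_subs = len(T) - m + 1
        subs = np.array([znormalize(T[i : i + m]) for i in range(n_subs)])
        excl = int(np.ceil(m / 4))  # assumed exclusion zone around each subsequence
        P = np.full(n_subs, np.inf)
        I = np.full(n_subs, -1)
        for i in range(n_subs):
            # Distance from subsequence i to every subsequence
            d = np.linalg.norm(subs - subs[i], axis=1)
            # Ignore the subsequence itself and its immediate neighbors
            d[max(0, i - excl) : i + excl + 1] = np.inf
            j = int(np.argmin(d))
            P[i], I[i] = d[j], j
        return P, I


    # Synthetic data: random noise with the same sine pattern planted twice
    rng = np.random.default_rng(0)
    T = rng.random(300)
    pattern = np.sin(np.linspace(0, 2 * np.pi, 20))
    T[50:70] = pattern
    T[200:220] = pattern

    P, I = naive_matrix_profile(T, m=20)
    print(I[50], I[200])  # each planted copy should point at the other one

This loop costs roughly O(n² · m) time, which is why a dedicated library is worthwhile for anything beyond toy inputs.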

-----------------
How to use STUMPY
-----------------

Please see our `API documentation <https://stumpy.readthedocs.io/en/latest/api.html>`__ for a complete list of available functions and see our informative `tutorials <https://stumpy.readthedocs.io/en/latest/tutorials.html>`__ for more comprehensive example use cases. Below, you will find code snippets that quickly demonstrate how to use STUMPY.

Typical usage (1-dimensional time series data) with `STUMP <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.stump>`__:

.. code:: python

    import stumpy
    import numpy as np

    if __name__ == "__main__":
        your_time_series = np.random.rand(10000)
        window_size = 50  # Approximately, how many data points might be found in a pattern

        matrix_profile = stumpy.stump(your_time_series, m=window_size)
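
Once a matrix profile is available, the top motif and the top discord can be read off with ``argmin`` and ``argmax``. The sketch below assumes the array layout used elsewhere in this README, with distances in column 0 and nearest-neighbor indices in column 1. The helper ``motif_and_discord`` is hypothetical and the input is a small made-up array standing in for real STUMP output:

.. code:: python

    import numpy as np


    def motif_and_discord(mp):
        """Locate the top motif pair and the top discord in a matrix profile array.

        Assumes column 0 holds distances and column 1 holds nearest-neighbor indices.
        """
        distances = mp[:, 0].astype(np.float64)
        motif_idx = int(np.argmin(distances))    # smallest distance: best-conserved pattern
        neighbor_idx = int(mp[motif_idx, 1])     # where that pattern repeats
        discord_idx = int(np.argmax(distances))  # largest distance: most unusual subsequence
        return motif_idx, neighbor_idx, discord_idx


    # Toy stand-in for STUMP output (made-up values, for illustration only)
    toy_mp = np.array(
        [[2.1, 3], [0.4, 4], [5.7, 0], [2.3, 0], [0.4, 1]],
        dtype=object,
    )
    motif_idx, neighbor_idx, discord_idx = motif_and_discord(toy_mp)

With real output, the call would be ``motif_and_discord(stumpy.stump(your_time_series, m=window_size))``. Non-finite distances, if present, would need to be masked before taking ``argmax``.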

Distributed usage for 1-dimensional time series data with Dask Distributed via `STUMPED <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.stumped>`__:

.. code:: python

    import stumpy
    import numpy as np
    from dask.distributed import Client

    if __name__ == "__main__":
        with Client() as dask_client:
            your_time_series = np.random.rand(10000)
            window_size = 50  # Approximately, how many data points might be found in a pattern

            matrix_profile = stumpy.stumped(dask_client, your_time_series, m=window_size)

GPU usage for 1-dimensional time series data with `GPU-STUMP <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.gpu_stump>`__:

.. code:: python

    import stumpy
    import numpy as np
    from numba import cuda

    if __name__ == "__main__":
        your_time_series = np.random.rand(10000)
        window_size = 50  # Approximately, how many data points might be found in a pattern
        all_gpu_devices = [device.id for device in cuda.list_devices()]  # Get a list of all available GPU devices

        matrix_profile = stumpy.gpu_stump(your_time_series, m=window_size, device_id=all_gpu_devices)

Multi-dimensional time series data with `MSTUMP <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.mstump>`__:

.. code:: python

    import stumpy
    import numpy as np

    if __name__ == "__main__":
        your_time_series = np.random.rand(3, 1000)  # Each row holds one dimension; each column holds one time point
        window_size = 50  # Approximately, how many data points might be found in a pattern

        matrix_profile, matrix_profile_indices = stumpy.mstump(your_time_series, m=window_size)

Distributed multi-dimensional time series data analysis with Dask Distributed `MSTUMPED <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.mstumped>`__:

.. code:: python

    import stumpy
    import numpy as np
    from dask.distributed import Client

    if __name__ == "__main__":
        with Client() as dask_client:
            your_time_series = np.random.rand(3, 1000)  # Each row holds one dimension; each column holds one time point
            window_size = 50  # Approximately, how many data points might be found in a pattern

            matrix_profile, matrix_profile_indices = stumpy.mstumped(dask_client, your_time_series, m=window_size)

Time Series Chains with `Anchored Time Series Chains (ATSC) <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.atsc>`__:

.. code:: python

    import stumpy
    import numpy as np

    if __name__ == "__main__":
        your_time_series = np.random.rand(10000)
        window_size = 50  # Approximately, how many data points might be found in a pattern

        matrix_profile = stumpy.stump(your_time_series, m=window_size)

        left_matrix_profile_index = matrix_profile[:, 2]
        right_matrix_profile_index = matrix_profile[:, 3]
        idx = 10  # Subsequence index at which to anchor the time series chain

        anchored_chain = stumpy.atsc(left_matrix_profile_index, right_matrix_profile_index, idx)

        # All-chain set and the longest unanchored chain
        all_chain_set, longest_unanchored_chain = stumpy.allc(left_matrix_profile_index, right_matrix_profile_index)
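
For intuition about what an anchored chain is, the following sketch walks forward from an anchor for as long as the neighbor links are mutual: the right nearest neighbor of the current subsequence must have the current subsequence as its left nearest neighbor. This is an illustration of that idea under the assumption that ``-1`` marks "no neighbor on that side"; ``follow_chain`` is a hypothetical helper, not ``stumpy.atsc`` itself, and the index arrays are made up:

.. code:: python

    import numpy as np


    def follow_chain(IL, IR, start):
        """Walk forward from `start` while left/right neighbor links are mutual.

        IL[i] / IR[i] are the left / right nearest-neighbor indices of
        subsequence i, with -1 meaning "no neighbor on that side" (assumed).
        """
        chain = [start]
        j = start
        for _ in range(len(IL)):  # a chain can never exceed the number of subsequences
            nxt = IR[j]
            if nxt == -1 or IL[nxt] != j:
                break  # the link is missing or not mutual, so the chain ends here
            chain.append(int(nxt))
            j = nxt
        return chain


    # Made-up left/right index arrays for six subsequences (illustration only)
    IL = np.array([-1, 0, 0, 1, 3, 2])
    IR = np.array([1, 3, 5, 4, -1, -1])
    chain = follow_chain(IL, IR, start=0)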

Semantic Segmentation with `Fast Low-cost Unipotent Semantic Segmentation (FLUSS) <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.fluss>`__:

.. code:: python

    import stumpy
    import numpy as np

    if __name__ == "__main__":
        your_time_series = np.random.rand(10000)
        window_size = 50  # Approximately, how many data points might be found in a pattern

        matrix_profile = stumpy.stump(your_time_series, m=window_size)

        subseq_len = 50
        correct_arc_curve, regime_locations = stumpy.fluss(
            matrix_profile[:, 1],
            L=subseq_len,
            n_regimes=2,
            excl_factor=1,
        )

------------
Dependencies
------------

Supported Python and NumPy versions are determined according to the `NEP 29 deprecation policy <https://numpy.org/neps/nep-0029-deprecation_policy.html>`__.

* `NumPy <http://www.numpy.org/>`__
* `Numba <http://numba.pydata.org/>`__
* `SciPy <https://www.scipy.org/>`__

---------------
Where to get it
---------------

conda:

.. code:: bash

    conda install -c conda-forge stumpy

pip:

.. code:: bash

    python -m pip install stumpy

pixi:

.. code:: bash

    pixi add stumpy

uv:

.. code:: bash

    uv add stumpy

To install stumpy from source, see the instructions in the `documentation <https://stumpy.readthedocs.io/en/latest/install.html>`__.

-------------
Documentation
-------------

In order to fully understand and appreciate the underlying algorithms and applications, it is imperative that you read the original publications_. For a more detailed example of how to use STUMPY, please consult the latest `documentation <https://stumpy.readthedocs.io/en/latest/>`__ or explore our `hands-on tutorials <https://stumpy.readthedocs.io/en/latest/tutorials.html>`__.

-----------
Performance
-----------

We tested the performance of computing the exact matrix profile using the Numba JIT compiled version of the code on r
