Stumpy
STUMPY is a powerful and scalable Python library for modern time series analysis
|PyPI Version| |Conda Forge Version| |PyPI Downloads| |License| |Test Status| |Code Coverage|
|RTD Status| |Binder| |JOSS| |NumFOCUS|
.. |PyPI Version| image:: https://img.shields.io/pypi/v/stumpy.svg
    :target: https://pypi.org/project/stumpy/
    :alt: PyPI Version
.. |Conda Forge Version| image:: https://anaconda.org/conda-forge/stumpy/badges/version.svg
    :target: https://anaconda.org/conda-forge/stumpy
    :alt: Conda-Forge Version
.. |PyPI Downloads| image:: https://static.pepy.tech/badge/stumpy/month
    :target: https://pepy.tech/project/stumpy
    :alt: PyPI Downloads
.. |License| image:: https://img.shields.io/pypi/l/stumpy.svg
    :target: https://github.com/stumpy-dev/stumpy/blob/main/LICENSE.txt
    :alt: License
.. |Test Status| image:: https://github.com/stumpy-dev/stumpy/workflows/Tests/badge.svg
    :target: https://github.com/stumpy-dev/stumpy/actions?query=workflow%3ATests+branch%3Amain
    :alt: Test Status
.. |Code Coverage| image:: https://img.shields.io/badge/Coverage-100%25-green
    :alt: Code Coverage
.. |RTD Status| image:: https://readthedocs.org/projects/stumpy/badge/?version=latest
    :target: https://stumpy.readthedocs.io/
    :alt: ReadTheDocs Status
.. |Binder| image:: https://mybinder.org/badge_logo.svg
    :target: https://mybinder.org/v2/gh/stumpy-dev/stumpy/main?filepath=notebooks
    :alt: Binder
.. |JOSS| image:: http://joss.theoj.org/papers/10.21105/joss.01504/status.svg
    :target: https://doi.org/10.21105/joss.01504
    :alt: JOSS
.. |DOI| image:: https://zenodo.org/badge/184809315.svg
    :target: https://zenodo.org/badge/latestdoi/184809315
    :alt: DOI
.. |NumFOCUS| image:: https://img.shields.io/badge/NumFOCUS-Affiliated%20Project-orange.svg?style=flat&colorA=E1523D&colorB=007D8A
    :target: https://numfocus.org/sponsored-projects/affiliated-projects
    :alt: NumFOCUS Affiliated Project
.. |Twitter| image:: https://img.shields.io/twitter/follow/stumpy_dev.svg?style=social
    :target: https://twitter.com/stumpy_dev
    :alt: Twitter

|

.. image:: https://raw.githubusercontent.com/stumpy-dev/stumpy/main/docs/images/stumpy_logo_small.png
    :target: https://github.com/stumpy-dev/stumpy
    :alt: STUMPY Logo

======
STUMPY
======

STUMPY is a powerful and scalable Python library that efficiently computes something called the `matrix profile <https://stumpy.readthedocs.io/en/latest/Tutorial_The_Matrix_Profile.html>`__, which is just an academic way of saying "for every (green) subsequence within your time series, automatically identify its corresponding nearest-neighbor (grey)":

.. image:: https://github.com/stumpy-dev/stumpy/blob/main/docs/images/stumpy_demo.gif?raw=true
    :alt: STUMPY Animated GIF

What's important is that once you've computed your matrix profile (middle panel above) it can then be used for a variety of time series data mining tasks such as:
- pattern/motif (approximately repeated subsequences within a longer time series) discovery
- anomaly/novelty (discord) discovery
- shapelet discovery
- semantic segmentation
- streaming (on-line) data
- fast approximate matrix profiles
- time series chains (temporally ordered set of subsequence patterns)
- snippets for summarizing long time series
- pan matrix profiles for selecting the best subsequence window size(s)
- `and more ... <https://www.cs.ucr.edu/~eamonn/100_Time_Series_Data_Mining_Questions__with_Answers.pdf>`__

Whether you are an academic, data scientist, software developer, or time series enthusiast, STUMPY is straightforward to install and our goal is to allow you to get to your time series insights faster. See `documentation <https://stumpy.readthedocs.io/en/latest/>`__ for more information.

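
To make the nearest-neighbor definition above concrete, here is a minimal brute-force sketch in plain Python (standard library only). It is not how STUMPY computes the matrix profile: the function names ``znorm`` and ``naive_matrix_profile`` are hypothetical, the handling of constant windows is a simplifying assumption, and the exclusion zone of ``ceil(m / 4)`` is assumed here only to skip trivial matches of a subsequence with its own neighbors. The lowest profile value points at a motif and the highest at a candidate discord.

.. code:: python

    import math
    import random


    def znorm(window):
        """Z-normalize a window (zero mean, unit standard deviation)."""
        mean = sum(window) / len(window)
        std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
        if std == 0:
            return [0.0] * len(window)  # Constant window: simplifying assumption
        return [(x - mean) / std for x in window]


    def naive_matrix_profile(ts, m):
        """Brute-force nearest-neighbor search, for illustration only."""
        n_subseq = len(ts) - m + 1  # Number of length-m subsequences
        windows = [znorm(ts[i:i + m]) for i in range(n_subseq)]
        excl = math.ceil(m / 4)  # Assumed exclusion zone around each subsequence
        profile = [math.inf] * n_subseq
        indices = [-1] * n_subseq
        for i in range(n_subseq):
            for j in range(n_subseq):
                if abs(i - j) <= excl:
                    continue  # Skip trivial matches close to i
                d = math.dist(windows[i], windows[j])
                if d < profile[i]:
                    profile[i], indices[i] = d, j
        return profile, indices


    # Random noise with the same sine-shaped pattern planted at 30 and 130
    random.seed(0)
    ts = [random.random() for _ in range(200)]
    pattern = [5 * math.sin(2 * math.pi * k / 20) for k in range(20)]
    ts[30:50] = pattern
    ts[130:150] = pattern

    profile, indices = naive_matrix_profile(ts, m=20)
    motif_idx = min(range(len(profile)), key=profile.__getitem__)
    discord_idx = max(range(len(profile)), key=profile.__getitem__)
    print(motif_idx, indices[motif_idx])  # Planted pattern and its nearest neighbor
    print(discord_idx)  # Subsequence that is farthest from everything else

The double loop costs roughly ``n * n * m`` operations, which is why this sketch is only practical for short series.
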
How to use STUMPY
-----------------

Please see our `API documentation <https://stumpy.readthedocs.io/en/latest/api.html>`__ for a complete list of available functions and see our informative `tutorials <https://stumpy.readthedocs.io/en/latest/tutorials.html>`__ for more comprehensive example use cases. Below, you will find code snippets that quickly demonstrate how to use STUMPY.

Typical usage (1-dimensional time series data) with `STUMP <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.stump>`__:

.. code:: python

    import stumpy
    import numpy as np

    if __name__ == "__main__":
        your_time_series = np.random.rand(10000)
        window_size = 50  # Approximately, how many data points might be found in a pattern

        matrix_profile = stumpy.stump(your_time_series, m=window_size)

Distributed usage for 1-dimensional time series data with Dask Distributed via `STUMPED <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.stumped>`__:

.. code:: python

    import stumpy
    import numpy as np
    from dask.distributed import Client

    if __name__ == "__main__":
        with Client() as dask_client:
            your_time_series = np.random.rand(10000)
            window_size = 50  # Approximately, how many data points might be found in a pattern

            matrix_profile = stumpy.stumped(dask_client, your_time_series, m=window_size)

GPU usage for 1-dimensional time series data with `GPU-STUMP <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.gpu_stump>`__:

.. code:: python

    import stumpy
    import numpy as np
    from numba import cuda

    if __name__ == "__main__":
        your_time_series = np.random.rand(10000)
        window_size = 50  # Approximately, how many data points might be found in a pattern
        all_gpu_devices = [device.id for device in cuda.list_devices()]  # Get a list of all available GPU devices

        matrix_profile = stumpy.gpu_stump(your_time_series, m=window_size, device_id=all_gpu_devices)

Multi-dimensional time series data with `MSTUMP <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.mstump>`__:

.. code:: python

    import stumpy
    import numpy as np

    if __name__ == "__main__":
        # Each row is one dimension; each column is one time point
        your_time_series = np.random.rand(3, 1000)
        window_size = 50  # Approximately, how many data points might be found in a pattern

        matrix_profile, matrix_profile_indices = stumpy.mstump(your_time_series, m=window_size)

Distributed multi-dimensional time series data analysis with Dask Distributed `MSTUMPED <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.mstumped>`__:

.. code:: python

    import stumpy
    import numpy as np
    from dask.distributed import Client

    if __name__ == "__main__":
        with Client() as dask_client:
            # Each row is one dimension; each column is one time point
            your_time_series = np.random.rand(3, 1000)
            window_size = 50  # Approximately, how many data points might be found in a pattern

            matrix_profile, matrix_profile_indices = stumpy.mstumped(dask_client, your_time_series, m=window_size)

Time Series Chains with `Anchored Time Series Chains (ATSC) <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.atsc>`__:

.. code:: python

    import stumpy
    import numpy as np

    if __name__ == "__main__":
        your_time_series = np.random.rand(10000)
        window_size = 50  # Approximately, how many data points might be found in a pattern

        matrix_profile = stumpy.stump(your_time_series, m=window_size)

        left_matrix_profile_index = matrix_profile[:, 2]
        right_matrix_profile_index = matrix_profile[:, 3]
        idx = 10  # Subsequence index for which to retrieve the anchored time series chain

        anchored_chain = stumpy.atsc(left_matrix_profile_index, right_matrix_profile_index, idx)

        all_chain_set, longest_unanchored_chain = stumpy.allc(left_matrix_profile_index, right_matrix_profile_index)

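
The idea behind an anchored chain can be sketched in plain Python: starting from an anchor, keep stepping to the right nearest neighbor as long as that neighbor's left nearest neighbor points back to the current position. This is an illustrative sketch rather than STUMPY's own code; the function name ``follow_anchored_chain`` and the toy index lists are hypothetical, and ``-1`` is used here to mean "no neighbor on that side".

.. code:: python

    def follow_anchored_chain(left_idx, right_idx, start):
        """Follow right neighbors while each link is confirmed by the left index."""
        chain = [start]
        j = start
        for _ in range(len(left_idx)):
            nxt = right_idx[j]
            if nxt == -1 or left_idx[nxt] != j:
                break  # No right neighbor, or the link is not mutual
            j = nxt
            chain.append(j)
        return chain


    # Hand-made toy indices for six subsequences (hypothetical data)
    left_idx = [-1, 0, -1, 1, 2, 3]
    right_idx = [1, 3, 4, 5, 5, -1]

    print(follow_anchored_chain(left_idx, right_idx, 0))  # [0, 1, 3, 5]
    print(follow_anchored_chain(left_idx, right_idx, 2))  # [2, 4]

The second chain stops at 4 because the left neighbor of subsequence 5 is 3, not 4, so the link from 4 to 5 is not mutual.
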
Semantic Segmentation with `Fast Low-cost Unipotent Semantic Segmentation (FLUSS) <https://stumpy.readthedocs.io/en/latest/api.html#stumpy.fluss>`__:

.. code:: python

    import stumpy
    import numpy as np

    if __name__ == "__main__":
        your_time_series = np.random.rand(10000)
        window_size = 50  # Approximately, how many data points might be found in a pattern

        matrix_profile = stumpy.stump(your_time_series, m=window_size)

        subseq_len = 50
        correct_arc_curve, regime_locations = stumpy.fluss(matrix_profile[:, 1],
                                                           L=subseq_len,
                                                           n_regimes=2,
                                                           excl_factor=1
                                                           )

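
FLUSS works from the matrix profile indices (column 1 above). As a rough intuition, each subsequence draws an "arc" to its nearest neighbor, and positions that few arcs pass over are candidate regime boundaries. The sketch below counts raw arc crossings on hand-made toy indices; it deliberately leaves out the correction against an idealized curve and the edge handling that produce the final curve, and the function name ``arc_counts`` is hypothetical.

.. code:: python

    from itertools import accumulate


    def arc_counts(mp_indices):
        """Count, for each position k, the arcs (lo, hi) with lo <= k < hi."""
        marks = [0] * len(mp_indices)
        for i, j in enumerate(mp_indices):
            lo, hi = min(i, j), max(i, j)
            marks[lo] += 1  # An arc starts here
            marks[hi] -= 1  # The same arc ends here
        return list(accumulate(marks))  # Running total of open arcs


    # Toy indices: positions 0-3 only match each other, as do positions 4-7
    toy_indices = [2, 3, 0, 1, 6, 7, 4, 5]
    print(arc_counts(toy_indices))  # [2, 4, 2, 0, 2, 4, 2, 0]

The count drops to zero at position 3, where the two groups meet. It is also zero at the last position simply because no arc can extend past the end of the series, which is one reason raw counts are not used directly.
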
Dependencies
------------

Supported Python and NumPy versions are determined according to the `NEP 29 deprecation policy <https://numpy.org/neps/nep-0029-deprecation_policy.html>`__.

- `NumPy <http://www.numpy.org/>`__
- `Numba <http://numba.pydata.org/>`__
- `SciPy <https://www.scipy.org/>`__

Where to get it
---------------

conda:

.. code:: bash

    conda install -c conda-forge stumpy

pip:

.. code:: bash

    python -m pip install stumpy

pixi:

.. code:: bash

    pixi add stumpy

uv:

.. code:: bash

    uv add stumpy

To install stumpy from source, see the instructions in the `documentation <https://stumpy.readthedocs.io/en/latest/install.html>`__.

Documentation
-------------

In order to fully understand and appreciate the underlying algorithms and applications, it is imperative that you read the original publications_. For a more detailed example of how to use STUMPY, please consult the latest `documentation <https://stumpy.readthedocs.io/en/latest/>`__ or explore our `hands-on tutorials <https://stumpy.readthedocs.io/en/latest/tutorials.html>`__.

Performance
-----------

We tested the performance of computing the exact matrix profile using the Numba JIT compiled version of the code on r
