Python Deep Outlier/Anomaly Detection (DeepOD)
==============================================
.. image:: https://github.com/xuhongzuo/DeepOD/actions/workflows/testing.yml/badge.svg
   :target: https://github.com/xuhongzuo/DeepOD/actions/workflows/testing.yml
   :alt: testing2

.. image:: https://readthedocs.org/projects/deepod/badge/?version=latest
   :target: https://deepod.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation Status

.. image:: https://app.codacy.com/project/badge/Grade/2c587126aac2441abb917c032189fbe8
   :target: https://app.codacy.com/gh/xuhongzuo/DeepOD/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade
   :alt: codacy

.. image:: https://coveralls.io/repos/github/xuhongzuo/DeepOD/badge.svg?branch=main
   :target: https://coveralls.io/github/xuhongzuo/DeepOD?branch=main
   :alt: coveralls

.. image:: https://static.pepy.tech/personalized-badge/deepod?period=total&units=international_system&left_color=black&right_color=orange&left_text=Downloads
   :target: https://pepy.tech/project/deepod
   :alt: downloads

.. image:: https://img.shields.io/badge/license-BSD2-blue
   :alt: license
DeepOD is an open-source Python library for deep learning-based `Outlier Detection <https://en.wikipedia.org/wiki/Anomaly_detection>`_
and `Anomaly Detection <https://en.wikipedia.org/wiki/Anomaly_detection>`_. DeepOD supports tabular anomaly detection and time-series anomaly detection.

DeepOD includes 27 deep outlier/anomaly detection algorithms (in the unsupervised and weakly-supervised paradigms). More baseline algorithms will be included later.
DeepOD is featured for:
- Unified APIs across various algorithms.
- SOTA models, including the latest reconstruction-, representation-learning-, and self-supervised-based deep learning methods.
- A comprehensive testbed that can directly test different models on benchmark datasets (highly recommended for academic research).
- Versatility across data types, including tabular and time-series data (DeepOD will support other data types like images, graphs, logs, traces, etc. in the future; PRs are welcome :telescope:).
- Diverse network structures that can be plugged into detection models; we now support LSTM, GRU, TCN, Conv, and Transformer for time-series data (PRs are welcome as well :sparkles:).
If you are interested in our project, we are pleased to have your stars and forks :thumbsup: :beers: .
Installation
~~~~~~~~~~~~
The DeepOD framework can be installed via:

.. code-block:: bash

    pip install deepod
Or install the development version (strongly recommended):

.. code-block:: bash

    git clone https://github.com/xuhongzuo/DeepOD.git
    cd DeepOD
    pip install .
Usages
~~~~~~

Directly use detection models in DeepOD:
::::::::::::::::::::::::::::::::::::::::
DeepOD can be used in a few lines of code. This API style is the same as `scikit-learn <https://github.com/scikit-learn/scikit-learn>`_ and `PyOD <https://github.com/yzhao062/pyod>`_.

For tabular anomaly detection:
.. code-block:: python

    # unsupervised methods
    from deepod.models.tabular import DeepSVDD

    clf = DeepSVDD()
    clf.fit(X_train, y=None)
    scores = clf.decision_function(X_test)

    # weakly-supervised methods
    from deepod.models.tabular import DevNet

    clf = DevNet()
    clf.fit(X_train, y=semi_y)  # semi_y uses 1 for known anomalies, and 0 for unlabeled data
    scores = clf.decision_function(X_test)

    # evaluation of tabular anomaly detection
    from deepod.metrics import tabular_metrics

    auc, ap, f1 = tabular_metrics(y_test, scores)
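In the weakly-supervised call above, ``semi_y`` is simply an integer array aligned with ``X_train``. A minimal sketch of building it with NumPy (the training size and the indices of labeled anomalies below are made-up values for illustration):

```python
import numpy as np

n_train = 1000
known_anomaly_idx = [3, 42, 917]  # hypothetical indices of the few labeled anomalies

# 1 marks a known anomaly, 0 marks unlabeled data (which may still contain hidden anomalies)
semi_y = np.zeros(n_train, dtype=int)
semi_y[known_anomaly_idx] = 1
```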
For time-series anomaly detection:
.. code-block:: python

    # time series anomaly detection methods
    from deepod.models.time_series import TimesNet

    clf = TimesNet()
    clf.fit(X_train)
    scores = clf.decision_function(X_test)

    # evaluation of time series anomaly detection
    from deepod.metrics import ts_metrics
    from deepod.metrics import point_adjustment  # execute point adjustment for time series ad

    eval_metrics = ts_metrics(labels, scores)
    adj_eval_metrics = ts_metrics(labels, point_adjustment(labels, scores))
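Point adjustment follows a convention commonly used in time-series anomaly detection evaluation: if any point inside a ground-truth anomaly segment is detected, the whole segment counts as detected. A pure-Python sketch of this idea on binary predictions (the library's ``point_adjustment`` operates on raw scores; this standalone version is only to illustrate the protocol):

```python
def adjust_predictions(labels, preds):
    """Flag an entire true-anomaly segment if any of its points is flagged."""
    preds = list(preds)
    n = len(labels)
    i = 0
    while i < n:
        if labels[i] == 1:
            j = i
            while j < n and labels[j] == 1:  # find the end of this anomaly segment
                j += 1
            if any(preds[k] for k in range(i, j)):  # segment was partially detected
                for k in range(i, j):
                    preds[k] = 1  # credit the whole segment
            i = j
        else:
            i += 1
    return preds

labels = [0, 1, 1, 1, 0, 0, 1, 1]
preds = [0, 0, 1, 0, 0, 0, 0, 0]
adjusted = adjust_predictions(labels, preds)
# the first segment (indices 1-3) is partially hit and becomes fully flagged;
# the completely missed second segment (indices 6-7) stays undetected
```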
Testbed usage:
::::::::::::::
Testbed contains the whole process of testing an anomaly detection model, including data loading, preprocessing, anomaly detection, and evaluation.
Please refer to ``testbed/``:

- ``testbed/testbed_unsupervised_ad.py`` is for testing unsupervised tabular anomaly detection models.
- ``testbed/testbed_unsupervised_tsad.py`` is for testing unsupervised time-series anomaly detection models.
Key arguments:
- ``--input_dir``: name of the folder that contains datasets (.csv, .npy)
- ``--dataset``: "FULL" represents testing all the files within the folder, or a list of dataset names separated by commas (e.g., "10_cover*,20_letter*")
- ``--model``: anomaly detection model name
- ``--runs``: how many times to run the detection model; an average performance with standard deviation values is finally reported
Example:
- Download `ADBench <https://github.com/Minqi824/ADBench/tree/main/adbench/datasets/>`_ datasets.
- Modify the ``dataset_root`` variable to the directory of the datasets. ``input_dir`` is the sub-folder name of ``dataset_root``, e.g., ``Classical`` or ``NLP_by_BERT``.
- Use the following commands in bash:
.. code-block:: bash

    cd DeepOD
    pip install .
    cd testbed
    python testbed_unsupervised_ad.py --model DeepIsolationForest --runs 5 --input_dir ADBench
Implemented Models
~~~~~~~~~~~~~~~~~~
**Tabular Anomaly Detection models:**
.. csv-table::
   :header: "Model", "Venue", "Year", "Type", "Title"
   :widths: 4, 4, 4, 8, 20

   Deep SVDD, ICML, 2018, unsupervised, Deep One-Class Classification [#Ruff2018Deep]_
   REPEN, KDD, 2018, unsupervised, Learning Representations of Ultrahigh-dimensional Data for Random Distance-based Outlier Detection [#Pang2019Repen]_
   RDP, IJCAI, 2020, unsupervised, Unsupervised Representation Learning by Predicting Random Distances [#Wang2020RDP]_
   RCA, IJCAI, 2021, unsupervised, RCA: A Deep Collaborative Autoencoder Approach for Anomaly Detection [#Liu2021RCA]_
   GOAD, ICLR, 2020, unsupervised, Classification-Based Anomaly Detection for General Data [#Bergman2020GOAD]_
   NeuTraL, ICML, 2021, unsupervised, Neural Transformation Learning for Deep Anomaly Detection Beyond Images [#Qiu2021Neutral]_
   ICL, ICLR, 2022, unsupervised, Anomaly Detection for Tabular Data with Internal Contrastive Learning [#Shenkar2022ICL]_
   DIF, TKDE, 2023, unsupervised, Deep Isolation Forest for Anomaly Detection [#Xu2023DIF]_
   SLAD, ICML, 2023, unsupervised, Fascinating Supervisory Signals and Where to Find Them: Deep Anomaly Detection with Scale Learning [#Xu2023SLAD]_
   DevNet, KDD, 2019, weakly-supervised, Deep Anomaly Detection with Deviation Networks [#Pang2019DevNet]_
   PReNet, KDD, 2023, weakly-supervised, Deep Weakly-supervised Anomaly Detection [#Pang2023PreNet]_
   Deep SAD, ICLR, 2020, weakly-supervised, Deep Semi-Supervised Anomaly Detection [#Ruff2020DSAD]_
   FeaWAD, TNNLS, 2021, weakly-supervised, Feature Encoding with AutoEncoders for Weakly-supervised Anomaly Detection [#Zhou2021FeaWAD]_
   RoSAS, IP&M, 2023, weakly-supervised, RoSAS: Deep semi-supervised anomaly detection with contamination-resilient continuous supervision [#Xu2023RoSAS]_
**Time-series Anomaly Detection models:**
.. csv-table::
   :header: "Model", "Venue", "Year", "Type", "Title"
   :widths: 4, 4, 4, 8, 20

   DCdetector, KDD, 2023, unsupervised, DCdetector: Dual Attention Contrastive Representation Learning for Time Series Anomaly Detection [#Yang2023dcdetector]_
   TimesNet, ICLR, 2023, unsupervised, TIMESNET: Temporal 2D-Variation Modeling for General Time Series Analysis [#Wu2023timesnet]_
   AnomalyTransformer, ICLR, 2022, unsupervised, Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy [#Xu2022transformer]_
   NCAD, IJCAI, 2022, unsupervised, Neural Contextual Anomaly Detection for Time Series [#Carmona2022NCAD]_
   TranAD, VLDB, 2022, unsupervised, TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data [#Tuli2022TranAD]_
   COUTA, TKDE, 2024, unsupervised, Calibrated One-class Classification for Unsupervised Time Series Anomaly Detection [#Xu2024COUTA]_
   USAD, KDD, 2020, unsupervised, USAD: UnSupervised Anomaly Detection on Multivariate Time Series
   DIF, TKDE, 2023, unsupervised, Deep Isolation Forest for Anomaly Detection [#Xu2023DIF]_
   TcnED, TNNLS, 2021, unsupervised, An Evaluation of Anomaly Detection and Diagnosis in Multivariate Time Series [#Garg2021Evaluation]_
   Deep SVDD (TS), ICML, 2018, unsupervised, Deep One-Class Classification [#Ruff2018Deep]_
   DevNet (TS), KDD, 2019, weakly-supervised, Deep Anomaly Detection with Deviation Networks [#Pang2019DevNet]_
   PReNet (TS), KDD, 2023, weakly-supervised, Deep Weakly-supervised Anomaly Detection [#Pang2023PreNet]_
   Deep SAD (TS), ICLR, 2020, weakly-supervised, Deep Semi-Supervised Anomaly Detection [#Ruff2020DSAD]_
NOTE:
- For Deep SVDD, DevNet, PReNet, and DeepSAD, we employ network structures that can handle time-series data. These models' classes have a parameter named ``network``; by changing it, you can use different networks.
- We currently support 'TCN', 'GRU', 'LSTM', 'Transformer', 'ConvSeq', and 'DilatedConv'.
Citation
~~~~~~~~~~~~~~~~~
If you use this library in your work, please cite these papers:
Xu, H., Pang, G., Wang, Y., & Wang, Y. (2023). Deep isolation forest for anomaly detection. IEEE Transactions on Knowledge and Data Engineering, 35(12), 12591-12604.
Xu, H., Wang, Y., Jian, S., Liao, Q., Wang, Y., & Pang, G. (2024). Calibrated one-class classification for unsupervised time series anomaly detection. IEEE Transactions on Knowledge and Data Engineering.
You can also use the BibTeX entry below for citation.
.. code-
