MStream
Anomaly detection on time-evolving streams in real time. Detects intrusions (DoS and DDoS attacks), fraud, and fake ratings.
MSᴛʀᴇᴀᴍ
<p> <a href="https://www2021.thewebconf.org/"> <img src="http://img.shields.io/badge/WWW-2021-red.svg"> </a> <a href="https://arxiv.org/pdf/2009.08451.pdf"> <img src="http://img.shields.io/badge/Paper-PDF-brightgreen.svg"> </a> <a href="https://www.comp.nus.edu.sg/~sbhatia/assets/pdf/MStream_slides.pdf"> <img src="http://img.shields.io/badge/Slides-PDF-ff9e18.svg"> </a> <a href="https://youtu.be/HtemjzuKryU"> <img src="http://img.shields.io/badge/Talk-Youtube-ff69b4.svg"> </a> <a href="https://github.com/Stream-AD/MStream/blob/master/LICENSE"> <img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg"> </a> </p>
Implementation of
- MSᴛʀᴇᴀᴍ: Fast Anomaly Detection in Multi-Aspect Streams. Siddharth Bhatia, Arjit Jain, Pan Li, Ritesh Kumar, Bryan Hooi. The Web Conference (formerly WWW), 2021.
MSᴛʀᴇᴀᴍ detects group anomalies from a multi-aspect data stream in constant time and memory. We output an anomaly score for each record. MSᴛʀᴇᴀᴍ builds on top of MIDAS to work in multi-aspect settings such as event-log data and multi-attributed graphs.
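To give some intuition for the "constant time and memory" claim, here is a minimal sketch of hashed counters with temporal decay, scored with the chi-squared-style statistic from the MIDAS paper. It is an illustration under simplifying assumptions, not the code in this repository: it handles a single hashable key per record (no numerical features, no multi-aspect hashing), and the class name `DecayedCountSketch` and its methods are hypothetical. The defaults mirror the `rows`, `buckets`, and `alpha` defaults listed under the command line options.

```python
import random


class DecayedCountSketch:
    """Hypothetical illustration: fixed-size hashed counters with temporal decay."""

    def __init__(self, rows=2, buckets=1024, alpha=0.6, seed=0):
        rng = random.Random(seed)
        self.buckets = buckets
        self.alpha = alpha
        # One random salt per row stands in for an independent hash function.
        self.salts = [rng.getrandbits(32) for _ in range(rows)]
        # Memory is fixed at rows x buckets counters, whatever the stream length.
        self.current = [[0.0] * buckets for _ in range(rows)]
        self.total = [[0.0] * buckets for _ in range(rows)]
        self.last_time = None  # timestamp of the most recent record

    def _index(self, row, key):
        return hash((self.salts[row], key)) % self.buckets

    def score(self, key, t):
        """Update the counters with one record at time t and return its score."""
        if self.last_time is not None and t > self.last_time:
            # New time tick: decay the "recent" counts by alpha.
            for row in self.current:
                for j in range(self.buckets):
                    row[j] *= self.alpha
        self.last_time = t

        a = s = float("inf")
        for r in range(len(self.salts)):
            j = self._index(r, key)
            self.current[r][j] += 1.0
            self.total[r][j] += 1.0
            # Taking the minimum over rows limits the effect of hash collisions.
            a = min(a, self.current[r][j])
            s = min(s, self.total[r][j])

        if t <= 1:
            return 0.0  # no history to compare against yet
        # MIDAS-style statistic: recent count a versus the mean rate s / t.
        return (a - s / t) ** 2 * t ** 2 / (s * (t - 1))
```

A key that suddenly appears many times within one time tick pushes its recent count far above its historical mean rate, so the score rises sharply for the records in that burst.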
Demo
- Run `bash run.sh KDD` to compile the code and run it on the KDD dataset.
- Run `bash run.sh DOS` to compile the code and run it on the DOS dataset.
- Run `bash run.sh UNSW` to compile the code and run it on the UNSW dataset.
MSᴛʀᴇᴀᴍ
- Change directory to the MSᴛʀᴇᴀᴍ folder: `cd mstream`
- Run `make` to compile the code and create the binary.
- Run `./mstream -n numericalfile -c categoricalfile -t timefile`
- Run `make clean` to clean binaries.
Command line options
- `-h --help`: produce help message
- `-n --numerical`: Numerical file name
- `-c --categorical`: Categorical file name
- `-t --time`: Timestamps file name
- `-o --output`: Output file name (default: scores.txt)
- `-r --rows`: Number of Hash Functions (default: 2)
- `-b --buckets`: Number of Buckets (default: 1024)
- `-a --alpha`: Temporal Decay Factor (default: 0.6)
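When running MSᴛʀᴇᴀᴍ from a script, the options above can be assembled programmatically. The following is a minimal sketch, not part of this repository: the helper name `build_mstream_command` and the input file names are hypothetical, and the timestamp flag is taken to be `-t`, as in the usage line `./mstream -n numericalfile -c categoricalfile -t timefile`.

```python
import shlex


def build_mstream_command(numerical, categorical, time, output="scores.txt",
                          rows=2, buckets=1024, alpha=0.6, binary="./mstream"):
    """Return the argument list for one run, using the documented short flags."""
    if rows < 1 or buckets < 1:
        raise ValueError("rows and buckets must be positive integers")
    return [
        binary,
        "-n", numerical,     # numerical features file
        "-c", categorical,   # categorical features file
        "-t", time,          # timestamps file
        "-o", output,        # where the anomaly scores are written
        "-r", str(rows),     # number of hash functions
        "-b", str(buckets),  # number of buckets
        "-a", str(alpha),    # temporal decay factor
    ]


if __name__ == "__main__":
    # Hypothetical file names; replace them with your own input files.
    cmd = build_mstream_command("numerical.txt", "categorical.txt", "time.txt",
                                alpha=0.8)
    print(shlex.join(cmd))
    # To execute it (requires the binary built with `make`):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file paths contain spaces.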
Input file format for MSᴛʀᴇᴀᴍ
MSᴛʀᴇᴀᴍ expects the input multi-aspect record stream to be stored in three files:
- Numerical file: contains comma-separated Numerical Features.
- Categorical file: contains comma-separated Categorical Features.
- Time file: contains Timestamps.
The Numerical and Categorical files contain the corresponding features of each multi-aspect record. Records should be sorted in non-decreasing order of their timestamps, and the column delimiter should be a comma (`,`).
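As an illustration of this layout, the sketch below writes a list of in-memory records to the three files. It is not part of the repository and rests on a few assumptions: each record occupies one line, with the same line position in all three files; feature values contain no commas; and the function name `write_mstream_inputs` and the output file names are hypothetical.

```python
from pathlib import Path


def write_mstream_inputs(records, out_dir):
    """Write (timestamp, numerical_features, categorical_features) records.

    Assumes one record per line, aligned across the three files.
    Returns a dict mapping "numerical", "categorical", "time" to file paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # sorted() is stable, so records with equal timestamps keep their order
    # and the result is in non-decreasing timestamp order.
    ordered = sorted(records, key=lambda record: record[0])

    paths = {name: out_dir / f"{name}.txt"
             for name in ("numerical", "categorical", "time")}
    with open(paths["numerical"], "w") as num_file, \
         open(paths["categorical"], "w") as cat_file, \
         open(paths["time"], "w") as time_file:
        for timestamp, numerical, categorical in ordered:
            num_file.write(",".join(str(v) for v in numerical) + "\n")
            cat_file.write(",".join(str(v) for v in categorical) + "\n")
            time_file.write(f"{timestamp}\n")
    return paths
```

The returned paths can then be passed to `./mstream` via `-n`, `-c`, and `-t`.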
Datasets
Citation
If you use this code for your research, please consider citing our WWW paper.
@inproceedings{bhatia2021mstream,
title={Fast Anomaly Detection in Multi-Aspect Streams},
author={Siddharth Bhatia and Arjit Jain and Pan Li and Ritesh Kumar and Bryan Hooi},
booktitle={The Web Conference (WWW)},
year={2021}
}
