Sherlock
Sherlock is an anomaly detection service built on top of Druid
Sherlock: Anomaly Detector
[![Release Artifacts][Badge-SonatypeReleases]][Link-SonatypeReleases]
[![Snapshot Artifacts][Badge-SonatypeSnapshots]][Link-SonatypeSnapshots]
Introduction to Sherlock
Sherlock is an anomaly detection service built on top of Druid. It leverages EGADS (Extensible Generic Anomaly Detection System) to detect anomalies in time-series data. Users can schedule jobs on an hourly, daily, weekly, or monthly basis, view anomaly reports from Sherlock's interface, or receive them via email.
Components
Detailed Description
Timeseries Generation
Timeseries generation is the first phase of Sherlock's anomaly detection. The user inputs a full Druid JSON query with a metric name and group-by dimensions. Sherlock validates the query, adjusts the time interval and granularity based on the EGADS config, and makes a call to Druid. Druid responds with an array of time-series, which are parsed into EGADS time-series.
Sample Druid Query:
{
  "metric": "metric(metric1/metric2)",
  "aggregations": [
    {
      "filter": {
        "fields": [
          {
            "type": "selector",
            "dimension": "dim1",
            "value": "value1"
          }
        ],
        "type": "or"
      },
      "aggregator": {
        "fieldName": "metric2",
        "type": "longSum",
        "name": "metric2"
      },
      "type": "filtered"
    }
  ],
  "dimension": "groupByDimension",
  "intervals": "2017-09-10T00:00:01+00:00/2017-10-12T00:00:01+00:00",
  "dataSource": "source1",
  "granularity": {
    "timeZone": "UTC",
    "type": "period",
    "period": "P1D"
  },
  "threshold": 50,
  "postAggregations": [
    {
      "fields": [
        {
          "fieldName": "metric1",
          "type": "fieldAccess",
          "name": "metric1"
        }
      ],
      "type": "arithmetic",
      "name": "metric(metric1/metric2)",
      "fn": "/"
    }
  ],
  "queryType": "topN"
}
Sample Druid Response:
[ {
  "timestamp" : "2017-10-11T00:00:00.000Z",
  "result" : [ {
    "groupByDimension" : "dim1",
    "metric(metric1/metric2)" : 8,
    "metric1" : 128,
    "metric2" : 16
  }, {
    "groupByDimension" : "dim2",
    "metric(metric1/metric2)" : 4.5,
    "metric1" : 42,
    "metric2" : 9.33
  } ]
}, {
  "timestamp" : "2017-10-12T00:00:00.000Z",
  "result" : [ {
    "groupByDimension" : "dim1",
    "metric(metric1/metric2)" : 9,
    "metric1" : 180,
    "metric2" : 20
  }, {
    "groupByDimension" : "dim2",
    "metric(metric1/metric2)" : 5.5,
    "metric1" : 95,
    "metric2" : 17.27
  } ]
} ]
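To illustrate how a topN response like the one above can be pivoted into one time-series per group-by value, here is a minimal Python sketch. It is not Sherlock's own parsing code (Sherlock is a Java service), and the function name `topn_to_series` is hypothetical; the sample data is a trimmed copy of the response above.

```python
from collections import defaultdict


def topn_to_series(response, dimension, metric):
    """Pivot a Druid topN response into {dimension value: [(timestamp, value), ...]}."""
    series = defaultdict(list)
    for bucket in response:
        timestamp = bucket["timestamp"]
        for row in bucket["result"]:
            series[row[dimension]].append((timestamp, row[metric]))
    # ISO-8601 timestamps in the same format sort correctly as strings.
    return {key: sorted(points) for key, points in series.items()}


# Trimmed copy of the sample response (only the fields used here).
response = [
    {"timestamp": "2017-10-11T00:00:00.000Z", "result": [
        {"groupByDimension": "dim1", "metric(metric1/metric2)": 8},
        {"groupByDimension": "dim2", "metric(metric1/metric2)": 4.5},
    ]},
    {"timestamp": "2017-10-12T00:00:00.000Z", "result": [
        {"groupByDimension": "dim1", "metric(metric1/metric2)": 9},
        {"groupByDimension": "dim2", "metric(metric1/metric2)": 5.5},
    ]},
]

series = topn_to_series(response, "groupByDimension", "metric(metric1/metric2)")
print(series["dim1"])
```

Each resulting list is one candidate time-series for anomaly detection, keyed by the value of the group-by dimension.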
EGADS Anomaly Detection
Sherlock calls the user-configured EGADS API for each generated time-series, generates anomaly reports from the response, and stores these reports in a database. Users may also elect to receive anomaly reports by email.
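The per-series loop described above can be sketched as follows. This is an illustration only: the detector is a simple z-score stand-in, not EGADS, and the names `zscore_detector` and `build_reports` as well as the report fields are hypothetical.

```python
from statistics import mean, pstdev


def zscore_detector(values, threshold=2.0):
    """Stand-in detector (NOT EGADS): indices whose z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # a constant series has no outliers under this rule
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def build_reports(series_by_dim, detect=zscore_detector):
    """Run the detector on every series and collect one report entry per anomaly."""
    reports = []
    for dim, points in series_by_dim.items():
        values = [value for _, value in points]
        for i in detect(values):
            timestamp, value = points[i]
            reports.append({"dimension": dim, "timestamp": timestamp, "value": value})
    return reports


series_by_dim = {
    "dim1": [("t1", 10), ("t2", 11), ("t3", 10), ("t4", 9), ("t5", 10), ("t6", 50)],
    "dim2": [("t1", 5), ("t2", 5), ("t3", 5)],
}
reports = build_reports(series_by_dim)
print(reports)
```

Passing the detector as a parameter mirrors the idea that the detection model is user-configured; any callable that returns anomalous indices could be substituted.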
Redis Database
Sherlock uses a Redis backend to store job metadata and generated anomaly reports, among other information, and as a persistent job queue. Keys related to reports have a retention policy: hourly job reports are retained for 14 days, and daily, weekly, and monthly job reports are retained for 1 year.
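As a rough illustration of the retention policy, the sketch below maps a job granularity to a time-to-live in seconds. It assumes "1 year" means 365 days, and the granularity names and the function `report_ttl_seconds` are hypothetical; Sherlock's actual key layout and expiry mechanism are not shown here.

```python
from datetime import timedelta

# Retention periods from the text; "1 year" is taken as 365 days (assumption).
RETENTION = {
    "hour": timedelta(days=14),
    "day": timedelta(days=365),
    "week": timedelta(days=365),
    "month": timedelta(days=365),
}


def report_ttl_seconds(granularity):
    """Return the report retention for a job granularity, in seconds."""
    return int(RETENTION[granularity].total_seconds())


print(report_ttl_seconds("hour"))  # 14 days -> 1209600
print(report_ttl_seconds("day"))   # 365 days -> 31536000
```

A value like this could, for example, be supplied to the Redis `EXPIRE` command when a report key is written.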
Sherlock UI
Sherlock's user interface is built with Spark, the Java web framework. The UI enables users to submit instant anomaly analyses, create and launch detection jobs, and view anomalies on a heatmap and on a graph.
Building Sherlock
A Makefile is provided with all build targets.
Building the JAR
make jar
This creates sherlock.jar in the target/ directory.
How to run
Sherlock is run from the command line with configuration arguments.
java -Dlog4j.configuration=file:${path_to_log4j}/log4j.properties \
    -jar ${path_to_jar}/sherlock.jar \
    --version ${VERSION} \
    --project-name ${PROJECT_NAME} \
    --port ${PORT} \
    --enable-email \
    --failure-email ${FAILURE_EMAIL} \
    --from-mail ${FROM_MAIL} \
    --reply-to ${REPLY_TO} \
    --smtp-host ${SMTP_HOST} \
    --interval-minutes ${INTERVAL_MINUTES} \
    --interval-hours ${INTERVAL_HOURS} \
    --interval-days ${INTERVAL_DAYS} \
    --interval-weeks ${INTERVAL_WEEKS} \
    --interval-months ${INTERVAL_MONTHS} \
    --egads-config-filename ${EGADS_CONFIG_FILENAME} \
    --redis-host ${REDIS_HOSTNAME} \
    --redis-port ${REDIS_PORT} \
    --execution-delay ${EXECUTION_DELAY} \
    --timeseries-completeness ${TIMESERIES_COMPLETENESS}
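For scripted launches, the command above can be assembled programmatically. The following Python sketch builds the argument list from a dictionary of options; the helper `sherlock_command`, the directory names, and the Redis host and port values are illustrative assumptions, while the flag names are taken from the command above.

```python
import shlex


def sherlock_command(jar_dir, log4j_dir, options):
    """Build the java invocation as an argument list (suitable for subprocess.run)."""
    cmd = [
        "java",
        f"-Dlog4j.configuration=file:{log4j_dir}/log4j.properties",
        "-jar",
        f"{jar_dir}/sherlock.jar",
    ]
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


# Example values only; adjust them to your deployment.
cmd = sherlock_command(
    "target",
    "conf",
    {"port": 4080, "redis-host": "localhost", "redis-port": 6379},
)
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell quoting problems; `shlex.join` is used only to print it in a copy-pasteable form.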
CLI args usage
| args | required | default | description |
|---------------------------------------|---------------------|---------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------|
| --help | - | false | help |
| --config | - | null | config |
| --version | - | v0.0.0 | version |
| --egads-config-filename | - | provided | egads-config-filename |
| --port | - | 4080 | port |
| --interval-minutes | - | 180 | interval-minutes |
| --interval-hours | - | 672 | interval-hours |
| --interval-days | - | 28 | interval-days |
| --interval-weeks | - | 12 | interval-weeks |
| --interval-months | - | 6 | interval-months |
| --enable-email | - | false | enable-email |
| --from-mail | if email enabled | | from-mail |
| --reply-to | if email enabled | | reply-to |
| --smtp-host | if email enabled | | smtp-host |
| --smtp-port | - | 25 | smtp-port |
| --smtp-user | - | | smtp-user |
| --smtp-password | - | | [smtp-password](#smtp-passwor