Dragnet
event stream analysis
Dragnet is a tool for analyzing event stream data stored in files. There are three main commands:
- scan: scan over raw data to execute a query
- build: scan over raw data to produce an index for quickly answering predefined queries
- query: search indexes to execute a query
The prototypical use case is analyzing request logs from a production service. The workflow for Dragnet looks like this:
- Predefine a bunch of metrics you care about (like total request count, request count by server instance, request type, and so on).
- When you accumulate new logs (e.g., hourly or daily), you build the index.
- Whenever you want the values of those metrics, you query the index. This might be part of a constantly-updating dashboard, a daily report, or a threshold-based alarm.
- If you want to gather new metrics, you can define them and rebuild.
- If you want to run a complex query just once, you can scan the raw data rather than adding the query as a metric.
This project is still a prototype. The commands and library interfaces may change incompatibly at any time!
Getting started
Dragnet currently supports only newline-separated JSON data. Try it on the sample data in ./tests/data. Start by defining a new datasource:
$ dn datasource-add my_logs --path=$PWD/tests/data
$ dn datasource-list -v
DATASOURCE LOCATION
my_logs file://home/dap/dragnet/dragnet/tests/data
dataFormat: "json"
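For reference, "newline-separated JSON" means each line of a data file is one complete JSON object describing a single event. Here's a minimal Python sketch; the field names mirror those used in the examples below, but the exact schema of the sample data is an assumption:

```python
import json

# Two example log lines: each line is a standalone JSON object.
lines = [
    '{"req": {"method": "GET"}, "res": {"statusCode": 200}, "latency": 17}',
    '{"req": {"method": "PUT"}, "res": {"statusCode": 204}, "latency": 3}',
]

# Parsing the stream is just parsing each line independently.
records = [json.loads(line) for line in lines]
assert records[0]["req"]["method"] == "GET"
```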
Now you can scan the data to count the total number of requests:
$ dn scan my_logs
VALUE
2252
You can also break out counts, e.g., by request method:
$ dn scan -b req.method my_logs
REQ.METHOD VALUE
DELETE 582
GET 556
HEAD 551
PUT 563
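Conceptually, a breakdown is a group-by count over a dotted field path. Here's a rough Python equivalent of that idea (a sketch, not Dragnet's implementation):

```python
from collections import Counter
from functools import reduce

def field(record, path):
    # Resolve a dotted path like "req.method" against a nested record.
    return reduce(lambda obj, key: obj[key], path.split("."), record)

records = [
    {"req": {"method": "GET"}},
    {"req": {"method": "GET"}},
    {"req": {"method": "PUT"}},
]

# Count records grouped by the value of the breakdown field.
counts = Counter(field(r, "req.method") for r in records)
assert counts["GET"] == 2
```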
You can break out results by more than one field:
$ dn scan -b req.method,res.statusCode my_logs
REQ.METHOD RES.STATUSCODE VALUE
DELETE 200 75
DELETE 204 87
DELETE 400 94
DELETE 404 85
DELETE 499 83
DELETE 500 79
DELETE 503 79
GET 200 77
GET 204 83
GET 400 84
GET 404 74
GET 499 79
GET 500 73
GET 503 86
HEAD 200 71
HEAD 204 85
HEAD 400 66
HEAD 404 77
HEAD 499 88
HEAD 500 88
HEAD 503 76
PUT 200 80
PUT 204 79
PUT 400 83
PUT 404 88
PUT 499 68
PUT 500 83
PUT 503 82
(This is randomly-generated data, which is why you see some combinations that probably don't make sense, like a 200 from a DELETE.)
You can specify multiple fields separated by commas, as above, or by using "-b" more than once. This example produces the same output as the previous one:
$ dn scan -b req.method -b res.statusCode my_logs
REQ.METHOD RES.STATUSCODE VALUE
DELETE 200 75
DELETE 204 87
DELETE 400 94
DELETE 404 85
DELETE 499 83
DELETE 500 79
DELETE 503 79
GET 200 77
GET 204 83
GET 400 84
GET 404 74
GET 499 79
GET 500 73
GET 503 86
HEAD 200 71
HEAD 204 85
HEAD 400 66
HEAD 404 77
HEAD 499 88
HEAD 500 88
HEAD 503 76
PUT 200 80
PUT 204 79
PUT 400 83
PUT 404 88
PUT 499 68
PUT 500 83
PUT 503 82
The order of breakdowns matters. If we reverse them, we get different output:
$ dn scan -b res.statusCode,req.method my_logs
RES.STATUSCODE REQ.METHOD VALUE
200 DELETE 75
200 GET 77
200 HEAD 71
200 PUT 80
204 DELETE 87
204 GET 83
204 HEAD 85
204 PUT 79
400 DELETE 94
400 GET 84
400 HEAD 66
400 PUT 83
404 DELETE 85
404 GET 74
404 HEAD 77
404 PUT 88
499 DELETE 83
499 GET 79
499 HEAD 88
499 PUT 68
500 DELETE 79
500 GET 73
500 HEAD 88
500 PUT 83
503 DELETE 79
503 GET 86
503 HEAD 76
503 PUT 82
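The counts themselves are the same either way; the breakdown order determines how rows are grouped and sorted. A sketch of that behavior (hypothetical helper, not Dragnet code):

```python
from collections import Counter

records = [
    {"method": "GET", "code": 200},
    {"method": "PUT", "code": 200},
    {"method": "GET", "code": 404},
]

def breakdown(records, fields):
    # Key each record by the breakdown fields, in order,
    # then sort the result rows by that composite key.
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return sorted(counts.items())

# Same counts, different row ordering depending on field order.
assert breakdown(records, ["method", "code"])[0] == (("GET", 200), 1)
assert breakdown(records, ["code", "method"])[0] == ((200, "GET"), 1)
```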
Filters
You can filter records using node-krill filter syntax:
$ dn scan -f '{ "eq": [ "req.method", "GET" ] }' my_logs
VALUE
556
You can combine filters with breakdowns, of course:
$ dn scan -f '{ "eq": [ "req.method", "GET" ] }' -b operation my_logs
OPERATION VALUE
getjoberrors 181
getpublicstorage 176
getstorage 199
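A node-krill filter is a JSON object whose top-level key names a predicate, like "eq". The real library supports more predicates (e.g., "and", "or", "le"); here's a simplified Python evaluator for just the "eq" case, to show the semantics:

```python
import json
from functools import reduce

def field(record, path):
    # Resolve a dotted path like "req.method" against a nested record.
    return reduce(lambda obj, key: obj[key], path.split("."), record)

def matches(predicate, record):
    # Only the "eq" predicate is handled: args are [field_path, expected_value].
    (op, args), = predicate.items()
    if op == "eq":
        path, expected = args
        return field(record, path) == expected
    raise NotImplementedError(op)

pred = json.loads('{ "eq": [ "req.method", "GET" ] }')
assert matches(pred, {"req": {"method": "GET"}})
assert not matches(pred, {"req": {"method": "PUT"}})
```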
Numeric breakdowns
To break down by numeric quantities, it's usually best to aggregate nearby values into buckets. Here's a histogram of the "latency" field from this log:
$ dn scan -b latency[aggr=quantize] my_logs
value ------------- Distribution ------------- count
0 | 0
1 |@@ 113
2 |@@@@@@@@ 449
4 |@@@@@@ 348
8 | 0
16 |@@@@@@@@@@@@ 682
32 | 0
64 |@ 57
128 |@@@ 165
256 | 0
512 | 0
1024 |@@ 136
2048 |@@@@@ 302
4096 | 0
"aggr=quantize" specifies a power-of-two bucketization. You can also do a linear quantization, say with steps of size 200:
$ dn scan -b latency[aggr=lquantize,step=200] my_logs
value ------------- Distribution ------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1814
200 | 0
400 | 0
600 | 0
800 | 0
1000 | 23
1200 |@ 31
1400 |@ 35
1600 | 18
1800 | 24
2000 |@ 34
2200 |@ 35
2400 | 28
2600 |@ 33
2800 | 18
3000 |@ 34
3200 | 27
3400 |@ 34
3600 | 26
3800 | 25
4000 | 13
4200 | 0
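Both bucketing schemes are simple to state: "quantize" assigns a value to the largest power-of-two bucket at or below it, while "lquantize" with step N rounds down to a multiple of N. A sketch of the bucketing rules for non-negative integers (Dragnet's exact edge-case handling, e.g. for negative values, is an assumption here):

```python
def quantize_bucket(value):
    # Power-of-two bucket: 0, 1, 2, 4, 8, ...
    # (the largest power of two that is <= value)
    if value < 1:
        return 0
    return 1 << (value.bit_length() - 1)

def lquantize_bucket(value, step):
    # Linear bucket: round down to the nearest multiple of "step".
    return (value // step) * step

# A latency of 17 lands in the "16" row of the quantize histogram,
# and in the "0" row of an lquantize histogram with step=200.
assert quantize_bucket(17) == 16
assert lquantize_bucket(17, 200) == 0
assert lquantize_bucket(1234, 200) == 1200
```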
These are modeled after DTrace's aggregating actions. You can combine these with filters and other breakdowns:
$ dn scan -f '{ "eq": [ "req.method", "GET" ] }' \
-b req.method,operation,latency[aggr=quantize] my_logs
GET, getjoberrors
value ------------- Distribution ------------- count
0 | 0
1 |@@ 9
2 |@@@@@@@ 32
4 |@@@@@ 24
8 | 0
16 |@@@@@@@@@@@@@@ 63
32 | 0
64 |@ 5
128 |@@@ 13
256 | 0
512 | 0
1024 |@@@
