Wrk2
A constant throughput, correct latency recording variant of wrk
wrk2
an HTTP benchmarking tool based mostly on wrk
wrk2 is wrk modified to produce a constant throughput load, and accurate latency details to the high 9s (i.e. it can produce an accurate 99.9999%'ile when run long enough). In addition to wrk's arguments, wrk2 takes a throughput argument (in total requests per second) via either the --rate or -R parameters (default is 1000).
CRITICAL NOTE: Before going further, I'd like to make it clear that this work is in no way intended to be an attack on or a disparagement of the great work that Will Glozer has done with wrk. I enjoyed working with his code, and I sincerely hope that some of the changes I have made might be considered for inclusion back into wrk. As those of you who are familiar with my latency-related talks and rants know, the latency measurement issues that I focused on fixing with wrk2 are extremely common in load generators and in monitoring code. I do not ascribe any lack of skill or intelligence to people whose creations repeat them. I was once (as recently as 2-3 years ago) just as oblivious to the effects of Coordinated Omission as the rest of the world still is.
wrk2 replaces wrk's individual request sample buffers with HdrHistograms. wrk2 maintains wrk's Lua API, including its presentation of the stats objects (latency and requests). The stats objects are "emulated" using HdrHistograms. E.g. a request for a raw sample value at index i (see latency[i] below) will return the value at the associated percentile (100.0 * i / __len).
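The index-to-percentile mapping described above can be illustrated with a small Python sketch. This is not wrk2's code and it does not use HdrHistogram: a sorted list stands in for the histogram, and the helper names (value_at_percentile, emulated_sample) are made up for the illustration. It only shows how an index i is turned into the percentile 100.0 * i / length before the lookup.

```python
import math

def value_at_percentile(sorted_samples, percentile):
    """Nearest-rank percentile lookup over an ascending list of samples."""
    if not sorted_samples:
        raise ValueError("no samples recorded")
    # Rank of the smallest sample at or above the requested percentile.
    rank = max(1, math.ceil(percentile / 100.0 * len(sorted_samples)))
    return sorted_samples[rank - 1]

def emulated_sample(sorted_samples, i, length):
    """Hypothetical stand-in for latency[i]: map index i to a percentile."""
    return value_at_percentile(sorted_samples, 100.0 * i / length)

# Made-up latencies; a real run would hold these in a histogram instead.
samples = sorted([3, 1, 4, 1, 5, 9, 2, 6, 5, 3])
n = len(samples)
print([emulated_sample(samples, i, n) for i in range(0, n + 1)])
```

Because the lookup goes through a percentile, walking i from 0 to the length yields values in ascending order, ending at the maximum recorded value.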
As a result of using HdrHistograms for full (lossless) recording, constant throughput load generation, and accurate tracking of response latency (from the point in time where a request was supposed to be sent per the "plan" to the time that it actually arrived), wrk2's latency reporting is significantly more accurate (as in "correct") than that of wrk's current (Nov. 2014) execution model.
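To make the difference between the two measurement styles concrete, here is a minimal Python sketch of a single connection that follows a constant-rate plan. It is a simplified model under stated assumptions (one connection, one request in flight at a time, integer milliseconds, invented service times), not wrk2's implementation. For each request it reports the time measured from the actual send and the time measured from the planned send.

```python
def simulate(rate_per_sec, service_ms):
    """Return (from_actual_send_ms, from_planned_send_ms) for each request.

    Assumes a single connection that cannot send the next request until
    the previous response has arrived.
    """
    interval_ms = 1000 // rate_per_sec   # spacing between planned sends
    free_at = 0                          # when the connection can send again
    results = []
    for k, service in enumerate(service_ms):
        planned = k * interval_ms        # when the plan says to send
        sent = max(planned, free_at)     # a stalled connection sends late
        done = sent + service
        results.append((done - sent, done - planned))
        free_at = done
    return results

# Invented scenario: the first response stalls for 350 ms, the rest take 10 ms.
for from_send, from_plan in simulate(10, [350, 10, 10, 10, 10]):
    print(from_send, from_plan)
# prints:
# 350 350
# 10 260
# 10 170
# 10 80
# 10 10
```

Measured from the actual send, only one request looks slow. Measured from the planned send, the requests that were held back behind the stall also show the delay they experienced, which is the kind of effect the planned-time measurement described above is meant to capture.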
It is important to note that in wrk2's current constant-throughput implementation, measured latencies are [only] accurate to a +/- ~1 msec granularity, due to OS sleep time behavior.
wrk2 is currently in experimental/development mode, and may well be merged into wrk in the future if others see fit to adopt its changes.
The remaining part of the README is wrk's, with minor changes to reflect additional parameter and output. There is an important and detailed note at the end about wrk2's latency measurement technique, including a discussion of Coordinated Omission, how wrk2 avoids it, and detailed output that demonstrates it.
wrk2 (as is wrk) is a modern HTTP benchmarking tool capable of generating significant load when run on a single multi-core CPU. It combines a multithreaded design with scalable event notification systems such as epoll and kqueue.
An optional LuaJIT script can perform HTTP request generation, response processing, and custom reporting. Several example scripts are located in scripts/.
Basic Usage
wrk -t2 -c100 -d30s -R2000 http://127.0.0.1:80/index.html
This runs a benchmark for 30 seconds, using 2 threads, keeping 100 HTTP connections open, and a constant throughput of 2000 requests per second (total, across all connections combined).
[It's important to note that wrk2 extends the initial calibration period to 10 seconds (from wrk's 0.5 seconds), so runs shorter than 10-20 seconds may not present useful information.]
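As a quick sanity check on what the command above asks for, the arithmetic can be sketched in Python. The even split of the total rate across connections is an assumption made for this illustration, and per_connection_plan is a made-up helper name, not part of wrk2.

```python
def per_connection_plan(total_rate, connections, duration_s):
    """Derive per-connection pacing from the total requested rate.

    Assumes the total rate is divided evenly across all connections.
    """
    rate_per_conn = total_rate / connections   # requests/sec per connection
    interval_ms = 1000.0 / rate_per_conn       # gap between sends per connection
    expected_total = total_rate * duration_s   # requests over the whole run
    return rate_per_conn, interval_ms, expected_total

# -R2000 -c100 -d30s from the example command
print(per_connection_plan(2000, 100, 30))  # (20.0, 50.0, 60000)
```

Under that assumption each connection sends about one request every 50 ms, and a 30 second run should report roughly 60000 requests in total.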
Output:
Running 30s test @ http://127.0.0.1:80/index.html
2 threads and 100 connections
Thread calibration: mean lat.: 9747 usec, rate sampling interval: 21 msec
Thread calibration: mean lat.: 9631 usec, rate sampling interval: 21 msec
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     6.46ms    1.93ms   12.34ms   67.66%
    Req/Sec     1.05k     1.12k     2.50k    64.84%
60017 requests in 30.01s, 19.81MB read
Requests/sec: 2000.15
Transfer/sec: 676.14KB
However, wrk2 will usually be run with the --latency flag, which provides detailed latency percentile information (in a format that can be easily imported to spreadsheets or gnuplot scripts and plotted per examples provided at http://hdrhistogram.org):
wrk -t2 -c100 -d30s -R2000 --latency http://127.0.0.1:80/index.html
Output:
Running 30s test @ http://127.0.0.1:80/index.html
2 threads and 100 connections
Thread calibration: mean lat.: 10087 usec, rate sampling interval: 22 msec
Thread calibration: mean lat.: 10139 usec, rate sampling interval: 21 msec
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     6.60ms    1.92ms   12.50ms   68.46%
    Req/Sec     1.04k     1.08k     2.50k    72.79%
Latency Distribution (HdrHistogram - Recorded Latency)
50.000% 6.67ms
75.000% 7.78ms
90.000% 9.14ms
99.000% 11.18ms
99.900% 12.30ms
99.990% 12.45ms
99.999% 12.50ms
100.000% 12.50ms
Detailed Percentile spectrum:
Value Percentile TotalCount 1/(1-Percentile)
0.921 0.000000 1 1.00
4.053 0.100000 3951 1.11
4.935 0.200000 7921 1.25
5.627 0.300000 11858 1.43
6.179 0.400000 15803 1.67
6.671 0.500000 19783 2.00
6.867 0.550000 21737 2.22
7.079 0.600000 23733 2.50
7.287 0.650000 25698 2.86
7.519 0.700000 27659 3.33
7.783 0.750000 29644 4.00
7.939 0.775000 30615 4.44
8.103 0.800000 31604 5.00
8.271 0.825000 32597 5.71
8.503 0.850000 33596 6.67
8.839 0.875000 34571 8.00
9.015 0.887500 35070 8.89
9.143 0.900000 35570 10.00
9.335 0.912500 36046 11.43
9.575 0.925000 36545 13.33
9.791 0.937500 37032 16.00
9.903 0.943750 37280 17.78
10.015 0.950000 37543 20.00
10.087 0.956250 37795 22.86
10.167 0.962500 38034 26.67
10.279 0.968750 38268 32.00
10.343 0.971875 38390 35.56
10.439 0.975000 38516 40.00
10.535 0.978125 38636 45.71
10.647 0.981250 38763 53.33
10.775 0.984375 38884 64.00
10.887 0.985938 38951 71.11
11.007 0.987500 39007 80.00
11.135 0.989062 39070 91.43
11.207 0.990625 39135 106.67
11.263 0.992188 39193 128.00
11.303 0.992969 39226 142.22
11.335 0.993750 39255 160.00
11.367 0.994531 39285 182.86
11.399 0.995313 39319 213.33
11.431 0.996094 39346 256.00
11.455 0.996484 39365 284.44
11.471 0.996875 39379 320.00
11.495 0.997266 39395 365.71
11.535 0.997656 39408 426.67
11.663 0.998047 39423 512.00
11.703 0.998242 39431 568.89
11.743 0.998437 39439 640.00
11.807 0.998633 39447 731.43
12.271 0.998828 39454 853.33
12.311 0.999023 39463 1024.00
12.327 0.999121 39467 1137.78
12.343 0.999219 39470 1280.00
12.359 0.999316 39473 1462.86
12.375 0.999414 39478 1706.67
12.391 0.999512 39482 2048.00
12.399 0.999561 39484 2275.56
12.407 0.999609 39486 2560.00
12.415 0.999658 39489 2925.71
12.415 0.999707 39489 3413.33
12.423 0.999756 39491 4096.00
12.431 0.999780 39493 4551.11
12.431 0.999805 39493 5120.00
12.439 0.999829 39495 5851.43
12.439 0.999854 39495 6826.67
12.447 0.999878 39496 8192.00
12.447 0.999890 39496 9102.22
12.455 0.999902 39497 10240.00
12.455 0.999915 39497 11702.86
12.463 0.999927 39498 13653.33
12.463 0.999939 39498 16384.00
12.463 0.999945 39498 18204.44
12.479 0.999951 39499 20480.00
12.479 0.999957 39499 23405.71
12.479 0.999963 39499 27306.67
12.479 0.999969 39499 32768.00
12.479 0.999973 39499 36408.89
12.503 0.999976 39500 40960.00
12.503 1.000000 39500 inf
#[Mean = 6.602, StdDeviation = 1.919]
#[Max = 12.496, Total count = 39500]
#[Buckets = 27, SubBuckets = 2048]
----------------------------------------------------------
60018 requests in 30.00s, 19.81MB read
Requests/sec: 2000.28
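Since the "Detailed Percentile spectrum" rows are plain whitespace-separated columns, they are easy to pull into other tools. The following Python sketch is one possible way to do it, not a tool that ships with wrk2; the function names and the CSV column names are invented for the example. It keeps only the numeric rows, recomputes the last column from the percentile, and emits CSV for a spreadsheet or plotting script.

```python
import csv
import io

def parse_spectrum(text):
    """Extract (value_ms, percentile, total_count) rows from --latency output."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue                  # summary and '#[...]' lines
        try:
            rows.append((float(fields[0]), float(fields[1]), int(fields[2])))
        except ValueError:
            continue                  # the column header row
    return rows

def inverse_tail(percentile):
    """Recompute the 1/(1-Percentile) column; infinite at the maximum."""
    return float("inf") if percentile >= 1.0 else 1.0 / (1.0 - percentile)

def to_csv(rows):
    """Render parsed rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["value_ms", "percentile", "total_count"])
    writer.writerows(rows)
    return out.getvalue()

# A few rows copied from the sample output above.
sample = """\
Value Percentile TotalCount 1/(1-Percentile)
0.921 0.000000 1 1.00
6.671 0.500000 19783 2.00
9.143 0.900000 35570 10.00
12.503 1.000000 39500 inf
#[Mean = 6.602, StdDeviation = 1.919]
"""
rows = parse_spectrum(sample)
print(to_csv(rows))
```

The 1/(1-Percentile) column is what makes the high percentiles readable on a plot: at the 50th percentile it is 2, and it grows without bound as the percentile approaches 100%.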