# ctsTraffic

ctsTraffic is a highly scalable client/server networking tool giving detailed performance and reliability analytics.
## Requirements
- Windows 10 or later: This project uses modern Windows networking and threading APIs (IO Completion Ports,
SetThreadGroupAffinity, and related Winsock features) that require Windows 10+ for full behavior and processor-group support. Building and running on older Windows versions may result in missing API support or degraded behavior.
If you would like to download the latest build rather than pulling down the source code and building it yourself, you can download it from https://github.com/microsoft/ctsTraffic/tree/master/Releases/2.0.3.9 .
## New Visualization Tool!

A great new visualization toolset has been created that can post-process the output files generated by ctsTraffic. When convenient, please take a look at https://github.com/microsoft/Network-Performance-Visualization
## A Practical Guide

ctsTraffic is a tool initially developed just after Windows 7 shipped to accurately measure how our diverse network deployments scale, as well as to assess their network reliability. Since then we have added a large number of options to work within an ever-growing number of deployments. This document reviews the 90% case that most people will likely want to start with.
### Good-Put

ctsTraffic is deliberately designed and implemented to demonstrate various best-practice guidance we (Winsock) have provided app developers for designing efficient and scalable solutions. It has a "pluggable" model where we have authored multiple different IO models -- but the default IO model is what will be most scalable for most network-facing applications.
As our IO models are implemented to model what we want apps and services to build, the resulting performance data is a strong reflection of what one can expect normal apps and services to see in the tested deployment. This throughput measurement of data as seen from the app is commonly referred to as "good-put" (as opposed to "through-put" which is generally measured at the hardware level in raw bits/sec).
### A suggested starting point: measuring Good-Put
The below set of options (using mostly default options) is generally a good starting point when measuring good-put and reliability. These options will have clients maintain 8 TCP connections with the server, sending 1GB of data per connection. Data will be flowing unidirectionally from the client to the server ('upload' scenarios).

These options are also a good starting point to track the reliability of a network deployment. They provide data across multiple reliability pivots:
- Reliability in establishing connections over time
- Reliability in maintaining connections: sending precisely 1GB of data, with both sides agreeing on the number of bytes sent and received
- Reliability in data integrity for all bytes received: every buffer received is validated against a specific bit-pattern that we use to catch data corruption
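The data-integrity pivot can be sketched as follows. Note that the repeating byte ramp below is a hypothetical stand-in chosen purely for illustration; ctsTraffic's actual bit-pattern is internal to the tool.

```python
# Illustrative sketch: validate a received buffer against a known
# deterministic pattern so any corrupted byte is caught on receipt.
# The 0x00..0xFF byte ramp here is a hypothetical pattern, not the
# one ctsTraffic actually uses.

def make_pattern(length: int) -> bytes:
    """Generate 'length' bytes of a deterministic repeating pattern."""
    return bytes(i % 256 for i in range(length))

def validate_buffer(received: bytes, offset: int) -> bool:
    """Check every received byte against the pattern expected at
    'offset' within the overall transfer; any mismatch means corruption."""
    expected = bytes((offset + i) % 256 for i in range(len(received)))
    return received == expected

# A clean buffer validates; a single flipped bit is caught.
good = make_pattern(1024)
assert validate_buffer(good, 0)

corrupted = bytearray(good)
corrupted[100] ^= 0x01  # simulate a single-bit corruption in flight
assert not validate_buffer(bytes(corrupted), 0)
```

Because both sides know the pattern, neither needs to echo data back to detect corruption; validation happens independently on every receive.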
| Server | Client |
| ----------------------------- | -------------------------------------------- |
| ctsTraffic.exe | ctsTraffic.exe |
| -listen:* | -target:<server> |
| -consoleverbosity:1 | -consoleverbosity:1 |
| | -statusfilename:clientstatus.csv |
| | -connectionfilename:clientconnections.csv |
Note: if one needs to measure the other direction (clients receiving data from servers), one should append -pattern:pull to the above commands on both the client and the server.
We found the above default values to generally be an effective balance when measuring Good Put, balancing the number of connections being established to send and receive data with the number of bytes being sent per connection. We found these values scale very well across many scenarios: down to small devices with slower connections and up to reaching 10Gbit deployments. (Note: once one gets to 10Gb we recommend doubling the number of connections and moving to 1TB of data sent; increasing both again at 40Gb).
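The scaling guidance above can be sketched as a small helper that builds a client command line. The flag names (-target, -connections, -transfer, -consoleverbosity) are the ones shown in this document; the interpretation of "increasing both again at 40Gb" as doubling connections to 32 and transfer to 2TB is an assumption.

```python
# Sketch: derive suggested -connections / -transfer values from link
# speed per the guidance above, then build a client command line.
# Assumptions: defaults of 8 connections x 1GB; at >= 10Gbit double
# connections and move to 1TB; at >= 40Gbit double both again.

def recommended_settings(link_gbit: float) -> tuple[int, int]:
    """Return (connections, bytes_per_connection) for a given link speed."""
    connections, transfer = 8, 1024**3           # defaults: 8 x 1GB
    if link_gbit >= 10:
        connections, transfer = 16, 1024**4      # 16 x 1TB
    if link_gbit >= 40:
        connections, transfer = 32, 2 * 1024**4  # doubled again (assumed)
    return connections, transfer

def client_command(target: str, link_gbit: float = 1.0) -> list[str]:
    """Build an argument list using flags shown in this document."""
    connections, transfer = recommended_settings(link_gbit)
    return [
        "ctsTraffic.exe",
        f"-target:{target}",
        "-consoleverbosity:1",
        f"-connections:{connections}",
        f"-transfer:{transfer}",
    ]

print(client_command("server01", link_gbit=10))
```

For the default 1Gbit case this reproduces the 8-connection, 1GB-per-connection starting point from the tables above.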
## Explaining the console output

As a sample run, the below is output from a quick test run over loopback (client and server were both run on the same machine). Note that the -consoleverbosity: flag controls the type and detail of what is output to the console (setting 0 turns off all output).
```
C:\Users\kehor\Desktop\2.0.1.7> ctsTraffic.exe -target:localhost -consoleverbosity:1 -statusfilename:clientstatus.csv -connectionfilename:clientconnections.csv

Configured Settings
Protocol: TCP
Options: InlineIOCP
IO function: Iocp (WSASend/WSARecv using IOCP)
IoPattern: Push <TCP client send/server recv>
PrePostRecvs: 1
PrePostSends: 1
Level of verification: Connections & Data
Port: 4444
Buffer used for each IO request: 65536 [0x10000] bytes
Total transfer per connection: 1073741824 bytes
Connecting out to addresses:
[::1]:4444
127.0.0.1:4444
Binding to local addresses for outgoing connections:
0.0.0.0
::
Connection limit (maximum established connections): 8 [0x8]
Connection throttling rate (maximum pended connection attempts): 1000 [0x3e8]
Total outgoing connections before exit (iterations * concurrent connections) : 0xffffffffffffffff
```
Legend:
* TimeSlice - (seconds) cumulative runtime
* Send & Recv Rates - bytes/sec that were transferred within the TimeSlice period
* In-Flight - count of established connections transmitting IO pattern data
* Completed - cumulative count of successfully completed IO patterns
* Network Errors - cumulative count of failed IO patterns due to Winsock errors
* Data Errors - cumulative count of failed IO patterns due to data errors
| TimeSlice | SendBps | RecvBps | In-Flight | Completed | NetError | DataError |
| --------: | ------: | ------: | --------: | --------: | -------: | --------: |
| 0.001 | 0 | 0 | 8 | 0 | 0 | 0 |
| 5.002 | 2635357062 | 124 | 8 | 8 | 0 | 0 |
| 10.003 | 2519263596 | 171 | 8 | 19 | 0 | 0 |
| 15.001 | 2437002784 | 202 | 8 | 32 | 0 | 0 |
| 20.002 | 2639655364 | 171 | 8 | 43 | 0 | 0 |
| 25.002 | 2557516185 | 218 | 8 | 57 | 0 | 0 |
```
Historic Connection Statistics (all connections over the complete lifetime)
SuccessfulConnections [59] NetworkErrors [0] ProtocolErrors [0]
Total Bytes Recv : 5194
Total Bytes Sent : 67358818304
Total Time : 26357 ms.
```
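The -statusfilename CSV lends itself to simple post-processing. The sketch below averages the send-side rates into Gbit/s; the column names are assumed to mirror the console columns above (TimeSlice, SendBps, RecvBps, ...), so verify them against the header of your actual file.

```python
# Sketch: compute average send-side good-put from the CSV written by
# -statusfilename. Column names (TimeSlice, SendBps, RecvBps) are an
# assumption based on the console output above; check your own file.
import csv
import io

def average_send_gbps(csv_text: str) -> float:
    """Mean SendBps across all time slices, converted to Gbit/s."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rates = [float(r["SendBps"]) for r in rows]
    return (sum(rates) / len(rates)) * 8 / 1e9  # bytes/s -> Gbit/s

# Two slices from the sample run above, as a stand-in for a real file.
sample = """TimeSlice,SendBps,RecvBps
5.002,2635357062,124
10.003,2519263596,171
"""
print(f"{average_send_gbps(sample):.2f} Gbit/s")
```

For real runs, skip or down-weight the first slice, which covers connection establishment before data starts flowing.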
### Configured Settings

The banner under "Configured Settings" shows many of the default options.
- Default is to establish TCP connections using OVERLAPPED I/O with IO
  completion ports managed by the NT threadpool, by default handling
  successful completions inline. All send and receive requests loop
  until an IO request pends, at which point we wait for the completion
  port to notify us of the completion, then continue sending and
  receiving.
- The IO model is configurable: -IO. Inline IO completions are configurable: -inlineCompletions. The protocol is configurable: -protocol.
- Default pattern when sending and receiving is to "Push" data,
directionally sending data from the client to the server.
- The pattern to send and receive is configurable: -pattern.
- Default is to use 64K buffers for every send and receive request,
transferring a total of 1GB of data.
- The buffer size used for each IO request is configurable: -buffer. The total amount of data to transfer over each TCP connection is configurable: -transfer.
- Shows all resolved addresses, which will be used in a round-robin
  fashion as connections are made.
- Target IP addresses are configurable using one or more -Target options, each specifying a name or IP address of a server to connect to.
- Shows that ephemeral binding will be used (binding to the 'any' address
  of all zeros lets the TCP/IP stack choose the best route to each target
  address).
- The local addresses to use for outbound connections are configurable: -bind.
- Default is to keep 8 concurrent connections established and moving
  data. Outgoing connections are throttled by keeping at most 1000
  connection attempts in flight at any one time (well above our 8
  concurrent connections).
- The number of connections to establish is configurable: -connections. The connection throttling limit is configurable (though not recommended): -throttleConnections.
- Default is to indefinitely continue making connections as individual
  connections complete -- maintaining 8 connections at all times.
- The total number of connections is configurable via -connections and -iterations; the total connection count is the product connections * iterations.
- e.g. -connections:100 -iterations:10 will iterate 10 times over 100 connections for a total of 1000 connections, at which point ctsTraffic will gracefully exit.
### -consoleverbosity:1

Setting console verbosity to 1 will output an aggregate status at each time slice. The default time slice is every 5 seconds; the time slice is configurable: -statusUpdate. At every 5 seconds, a line will be output communicating the following aggregate information:
- TimeSlice: the time window in seconds with millisecond precision, starting from when ctsTraffic was launched (since -statusUpdate can set update frequency in milliseconds)
- SendBps: the sent bytes/second [within that specific time slice]
- RecvBps: the received bytes/second
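Because SendBps is a within-slice rate rather than a cumulative counter, multiplying each rate by its slice duration roughly reconstructs the cumulative byte count. The sketch below does this with the figures from the sample run above; the reconstruction is only approximate since the run continued past the last printed slice.

```python
# Sketch: reconstruct cumulative bytes sent from per-slice SendBps
# rates, using the sample run printed above. Approximate by design:
# the run ended at 26.357s, after the last 25.002s slice shown.

slices = [  # (TimeSlice end, SendBps) from the sample table above
    (0.001, 0), (5.002, 2635357062), (10.003, 2519263596),
    (15.001, 2437002784), (20.002, 2639655364), (25.002, 2557516185),
]

total = 0.0
prev_end = 0.0
for end, bps in slices:
    total += bps * (end - prev_end)  # rate * slice duration
    prev_end = end

reported = 67358818304  # "Total Bytes Sent" from the sample run
print(f"reconstructed: {total:.0f} bytes ({total / reported:.1%} of reported)")
```

The gap between the reconstructed and reported totals is exactly the data sent after the final printed slice, which is why the per-slice rates and the historic totals should be read together.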
