# Trixter

Trixter Proxy + `tokio-netem`: chaos engineering tools for Rust networking.
## Project Overview

- **trixter** — a high‑performance, runtime‑tunable TCP chaos proxy — a minimal, blazing‑fast alternative to Toxiproxy written in Rust with Tokio. It lets you inject latency, throttle bandwidth, slice writes (to simulate small MTUs/Nagle‑like behavior), corrupt bytes in flight by injecting random bytes, randomly terminate connections, and hard‑timeout sessions — all controllable per connection via a simple REST API.
- **tokio-netem** — a collection of Tokio `AsyncRead`/`AsyncWrite` adapters (delay, throttle, slice, terminate, shutdown, corrupt data, inject data) that power the Trixter proxy and can be used independently in tests and harnesses.
The remainder of this document dives into the proxy. For the adapter crate’s detailed guide, follow the tokio-netem link above.
# Trixter – Chaos Monkey TCP Proxy

A high‑performance, runtime‑tunable TCP chaos proxy — a minimal, blazing‑fast alternative to Toxiproxy written in Rust with Tokio. It lets you inject latency, throttle bandwidth, slice writes (to simulate small MTUs/Nagle‑like behavior), corrupt bytes in flight by injecting random bytes, randomly terminate connections, and hard‑timeout sessions — all controllable per connection via a simple REST API.
## Why Trixter?
- Zero-friction: one static binary, no external deps.
- Runtime knobs: flip chaos on/off without restarting.
- Per-conn control: target just the flows you want.
- Minimal overhead: adapters are lightweight and composable.
## Features

- Fast path: `tokio::io::copy_bidirectional` on a multi‑thread runtime.
- Runtime control (per active connection):
  - Latency: add/remove delay in ms.
  - Throttle: cap bytes/sec.
  - Slice: split writes into fixed‑size chunks.
  - Corrupt: inject random bytes with a tunable probability.
  - Chaos termination: probability `[0.0..=1.0]` to abort on each read/write.
  - Hard timeout: stop a session after N milliseconds.
- REST API to list connections and change settings on the fly.
- Targeted kill: shut down a single connection with a reason.
- Deterministic chaos: seed the RNG for reproducible scenarios.
- RST on chaos: best-effort TCP resets when a timeout/termination triggers.
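The "deterministic chaos" feature is worth spelling out: a seeded RNG plus a per-operation probability means two runs with the same seed hit the same operations. A minimal Python sketch of that idea (the helper name is hypothetical; Trixter's actual implementation is in Rust):

```python
import random

def chaos_decisions(seed: int, probability: float, ops: int) -> list[bool]:
    """For each read/write operation, flip a biased coin; True means the
    operation gets hit (aborted/corrupted). Illustrative helper only."""
    rng = random.Random(seed)  # fixed seed => identical schedule every run
    return [rng.random() < probability for _ in range(ops)]

run_a = chaos_decisions(seed=42, probability=0.05, ops=1000)
run_b = chaos_decisions(seed=42, probability=0.05, ops=1000)
assert run_a == run_b                                         # same seed, same chaos
assert run_a != chaos_decisions(seed=7, probability=0.05, ops=1000)  # new seed, new chaos
```

This is why `--random-seed 42` in the examples below makes a failure scenario replayable bit-for-bit.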
## Quick start

### 1. Run an upstream echo server (demo)

Use any TCP server. For example:

```sh
nc -lk 127.0.0.1 8181
```

### 2. Run the trixter chaos proxy

With Docker:

```sh
docker run --network host -it --rm ghcr.io/brk0v/trixter \
  --listen 0.0.0.0:8080 \
  --upstream 127.0.0.1:8181 \
  --api 127.0.0.1:8888 \
  --delay-ms 0 \
  --throttle-rate-bytes 0 \
  --slice-size-bytes 0 \
  --corrupt-probability-rate 0.0 \
  --terminate-probability-rate 0.0 \
  --connection-duration-ms 0 \
  --random-seed 42
```

Or build from source:

```sh
cd trixter/trixter
cargo build --release
```

Or install with cargo:

```sh
cargo install trixter
```

And run:

```sh
RUST_LOG=info \
./target/release/trixter \
  --listen 0.0.0.0:8080 \
  --upstream 127.0.0.1:8181 \
  --api 127.0.0.1:8888 \
  --delay-ms 0 \
  --throttle-rate-bytes 0 \
  --slice-size-bytes 0 \
  --corrupt-probability-rate 0.0 \
  --terminate-probability-rate 0.0 \
  --connection-duration-ms 0 \
  --random-seed 42
```
### 3. Test

Now connect your app/CLI to localhost:8080. The proxy forwards to 127.0.0.1:8181.
## REST API
Base URL is the --api address, e.g. http://127.0.0.1:8888.
### Data model

```json
{
  "conn_info": {
    "id": "pN7e3y...",
    "downstream": "127.0.0.1:59024",
    "upstream": "127.0.0.1:8181"
  },
  "delay": { "secs": 2, "nanos": 500000000 },
  "throttle_rate": 10240,
  "slice_size": 512,
  "terminate_probability_rate": 0.05,
  "corrupt_probability_rate": 0.02
}
```
Notes:

- `id` is unique per connection; use it to target a single connection.
- `corrupt_probability_rate` and `terminate_probability_rate` report the current per-operation flip probability (`0.0` when off).
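The `delay` field uses the seconds-plus-nanoseconds encoding visible in the sample above (a serialized Rust `Duration`). A small Python helper (name is hypothetical) converts it back to the milliseconds the `delay` endpoint accepts:

```python
def delay_to_ms(delay: dict) -> float:
    """Convert the API's {secs, nanos} Duration encoding to milliseconds."""
    return delay["secs"] * 1_000 + delay["nanos"] / 1_000_000

# The sample above: 2 s + 500,000,000 ns = 2500 ms.
print(delay_to_ms({"secs": 2, "nanos": 500_000_000}))  # → 2500.0
```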
### Health check

```sh
curl -s http://127.0.0.1:8888/health
```

### List connections

```sh
curl -s http://127.0.0.1:8888/connections | jq
```

### Kill a connection

```sh
ID=$(curl -s http://127.0.0.1:8888/connections | jq -r '.[0].conn_info.id')
curl -i -X POST \
  http://127.0.0.1:8888/connections/$ID/shutdown \
  -H 'Content-Type: application/json' \
  -d '{"reason":"test teardown"}'
```

### Kill all connections

```sh
curl -i -X POST \
  http://127.0.0.1:8888/connections/_all/shutdown \
  -H 'Content-Type: application/json' \
  -d '{"reason":"test teardown"}'
```

### Set latency (ms)

```sh
curl -i -X PATCH \
  http://127.0.0.1:8888/connections/$ID/delay \
  -H 'Content-Type: application/json' \
  -d '{"delay_ms":250}'

# Remove latency
curl -i -X PATCH \
  http://127.0.0.1:8888/connections/$ID/delay \
  -H 'Content-Type: application/json' \
  -d '{"delay_ms":0}'
```

### Throttle bytes/sec

```sh
curl -i -X PATCH \
  http://127.0.0.1:8888/connections/$ID/throttle \
  -H 'Content-Type: application/json' \
  -d '{"rate_bytes":10240}' # 10 KiB/s
```
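A bytes/sec cap amounts to pacing: each write of `n` bytes "costs" `n / rate` seconds before the next one may go out. A sketch of that arithmetic (illustrative only, not Trixter's implementation):

```python
def pacing_schedule(chunks: list[int], rate_bytes: int) -> list[float]:
    """Earliest send time (seconds from start) for each write under a
    bytes/sec cap: a chunk may go out once earlier bytes have drained."""
    t, schedule = 0.0, []
    for nbytes in chunks:
        schedule.append(t)
        t += nbytes / rate_bytes
    return schedule

# Four 4 KiB writes under the 10 KiB/s cap above leave ~0.4 s between sends.
print([round(t, 3) for t in pacing_schedule([4096] * 4, 10240)])
# → [0.0, 0.4, 0.8, 1.2]
```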
### Slice writes (bytes)

```sh
curl -i -X PATCH \
  http://127.0.0.1:8888/connections/$ID/slice \
  -H 'Content-Type: application/json' \
  -d '{"size_bytes":512}'
```

### Randomly terminate reads/writes

```sh
# Set 5% probability per read/write operation
curl -i -X PATCH \
  http://127.0.0.1:8888/connections/$ID/termination \
  -H 'Content-Type: application/json' \
  -d '{"probability_rate":0.05}'
```

### Inject random bytes

```sh
# Corrupt ~1% of operations
curl -i -X PATCH \
  http://127.0.0.1:8888/connections/$ID/corruption \
  -H 'Content-Type: application/json' \
  -d '{"probability_rate":0.01}'

# Remove corruption
curl -i -X PATCH \
  http://127.0.0.1:8888/connections/$ID/corruption \
  -H 'Content-Type: application/json' \
  -d '{"probability_rate":0.0}'
```
### Error responses

- `404 Not Found` — bad connection ID
- `400 Bad Request` — invalid probability (outside `0.0..=1.0`) for termination/corruption
- `500 Internal Server Error` — internal channel/handler error
CLI flags
--listen <ip:port> # e.g. 0.0.0.0:8080
--upstream <ip:port> # e.g. 127.0.0.1:8181
--api <ip:port> # e.g. 127.0.0.1:8888
--delay-ms <ms> # 0 = off (default)
--throttle-rate-bytes <bytes/s> # 0 = unlimited (default)
--slice-size-bytes <bytes> # 0 = off (default)
--terminate-probability-rate <0..1> # 0.0 = off (default)
--corrupt-probability-rate <0..1> # 0.0 = off (default)
--connection-duration-ms <ms> # 0 = unlimited (default)
--random-seed <u64> # seed RNG for deterministic chaos (optional)
All of the above can be changed per connection at runtime via the REST API, except `--connection-duration-ms`, which is a process-wide default applied to new connections. Omit `--random-seed` to draw fresh entropy on every run; set it when you want bit-for-bit reproducibility.
## How it works (architecture)

Each accepted downstream connection spawns a task that:

1. Connects to the upstream target.
2. Wraps both sides with tunable adapters from `tokio-netem`:
   - `DelayedWriter` → optional latency
   - `ThrottledWriter` → bandwidth cap
   - `SlicedWriter` → fixed‑size write chunks
   - `Terminator` → probabilistic aborts
   - `Corrupter` → probabilistic random byte injection
   - `Shutdowner` (downstream only) → out‑of‑band shutdown via a oneshot channel
3. Runs `tokio::io::copy_bidirectional` until EOF/error/timeout.
4. Tracks the live connection in a `DashMap` so the API can query/mutate it.
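Trixter itself is Rust/Tokio, but the overall shape — accept, dial upstream, register in a map, copy both directions until EOF — translates directly. A minimal asyncio sketch of that per-connection task (all names here are illustrative, not Trixter's code, and the adapter stack is omitted):

```python
import asyncio
import uuid

# Mirrors the DashMap registry: id -> metadata an API task could query/mutate.
CONNECTIONS: dict[str, dict] = {}

async def pump(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """One direction of the bidirectional copy: shovel bytes until EOF."""
    try:
        while data := await reader.read(64 * 1024):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()

async def handle(down_reader, down_writer, upstream: tuple) -> None:
    """Per-connection task: dial upstream, register, copy both ways."""
    up_reader, up_writer = await asyncio.open_connection(*upstream)
    conn_id = uuid.uuid4().hex
    CONNECTIONS[conn_id] = {"downstream": down_writer.get_extra_info("peername")}
    try:
        await asyncio.gather(
            pump(down_reader, up_writer),  # client -> upstream
            pump(up_reader, down_writer),  # upstream -> client
        )
    finally:
        del CONNECTIONS[conn_id]
```

In the real proxy, the `tokio-netem` adapters wrap the streams before the copy starts, which is what lets the REST API retune a live connection.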
## Use cases
- Flaky networks: simulate 3G/EDGE/satellite latency and low bandwidth.
- MTU/segmentation bugs: force small write slices to uncover packetization assumptions.
- Resilience drills: randomly kill connections during critical paths.
- Data validation: corrupt bytes to exercise checksums and retry logic.
- Timeout tuning: enforce hard upper‑bounds to validate client retry/backoff logic.
- Canary/E2E tests: target only specific connections and tweak dynamically.
- Load/soak: run for hours with varying chaos settings from CI/scripts.
## Recipes

### Simulate a shaky mobile link

```sh
# Add ~250ms latency and a 64 KiB/s cap to the first active connection
ID=$(curl -s localhost:8888/connections | jq -r '.[0].conn_info.id')
curl -s -X PATCH localhost:8888/connections/$ID/delay \
  -H 'Content-Type: application/json' -d '{"delay_ms":250}'
curl -s -X PATCH localhost:8888/connections/$ID/throttle \
  -H 'Content-Type: application/json' -d '{"rate_bytes":65536}'
```
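To sanity-check what this recipe does to your client timeouts, a quick back-of-envelope (hypothetical helper, simple lower-bound arithmetic only):

```python
def shaky_link_eta(payload_bytes: int, rate_bytes: int, delay_ms: int) -> float:
    """Rough lower bound, in seconds, for pushing one payload through the
    delayed, throttled link configured above."""
    return delay_ms / 1_000 + payload_bytes / rate_bytes

# A 512 KiB response at 64 KiB/s plus 250 ms of injected latency: ~8.25 s.
print(shaky_link_eta(512 * 1024, 65536, 250))  # → 8.25
```

If your client's read timeout is shorter than this bound, the recipe will surface it immediately.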
### Force tiny packets (find buffering bugs)

```sh
curl -s -X PATCH localhost:8888/connections/$ID/slice \
  -H 'Content-Type: application/json' -d '{"size_bytes":256}'
```

### Introduce flakiness (5% of ops abort)

```sh
curl -s -X PATCH localhost:8888/connections/$ID/termination \
  -H 'Content-Type: application/json' -d '{"probability_rate":0.05}'
```

### Add data corruption

```sh
curl -s -X PATCH localhost:8888/connections/$ID/corruption \
  -H 'Content-Type: application/json' -d '{"probability_rate":0.01}'
```

### Timebox a connection to 5s at startup

```sh
./trixter \
  --listen 0.0.0.0:8080 \
  --upstream 127.0.0.1:8181 \
  --api 127.0.0.1:8888 \
  --connection-duration-ms 5000
```

### Kill the slowpoke

```sh
curl -s -X POST localhost:8888/connections/$ID/shutdown \
  -H 'Content-Type: application/json' -d '{"reason":"too slow"}'
```
## Integration: CI & E2E tests

- Spin up the proxy as a
