Namazu
:fish: 鯰: Programmable fuzzy scheduler for testing distributed systems
Namazu: Programmable Fuzzy Scheduler for Testing Distributed Systems
Namazu (formerly named Earthquake) is a programmable fuzzy scheduler for testing real implementations of distributed systems such as ZooKeeper.

Namazu permutes Java function calls, Ethernet packets, filesystem events, and injected faults in various orders to find implementation-level bugs in the distributed system.
Namazu can also control non-determinism of the thread interleaving (by calling sched_setattr(2) with randomized parameters).
Namazu can therefore also be used for testing standalone multi-threaded software.
Basically, Namazu permutes events in a random order, but you can write your own state exploration policy (in Golang) for finding deep bugs efficiently.
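To make the idea of a random-order policy concrete, here is a minimal, self-contained sketch. It does not use Namazu's real policy API: the `Event`, `Policy`, and `RandomPolicy` names are hypothetical and exist only for illustration. It only shows what "releasing pending events in a random order" could look like.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Event is a hypothetical stand-in for an intercepted event
// (a packet, a filesystem call, a function call). Not Namazu's own type.
type Event struct {
	Entity string
	Kind   string
}

// Policy is a hypothetical interface: given the events currently held back,
// return them in the order in which they should be released.
type Policy interface {
	Order(pending []Event) []Event
}

// RandomPolicy releases pending events in a uniformly random order.
type RandomPolicy struct {
	rng *rand.Rand
}

// NewRandomPolicy builds a policy with a fixed seed so a run can be replayed.
func NewRandomPolicy(seed int64) *RandomPolicy {
	return &RandomPolicy{rng: rand.New(rand.NewSource(seed))}
}

// Order returns a shuffled copy of pending; the input slice is left untouched.
func (p *RandomPolicy) Order(pending []Event) []Event {
	out := make([]Event, len(pending))
	copy(out, pending)
	p.rng.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	return out
}

func main() {
	pending := []Event{
		{"zk1", "packet"},
		{"zk2", "packet"},
		{"zk3", "fs-read"},
	}
	var policy Policy = NewRandomPolicy(42)
	for _, ev := range policy.Order(pending) {
		fmt.Printf("release %s from %s\n", ev.Kind, ev.Entity)
	}
}
```

A custom policy would replace the shuffle with its own ordering logic, for example one that prefers orderings it has not tried before.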
Namazu (鯰) means catfish :fish: in Japanese.
Blog: http://osrg.github.io/namazu/
Twitter: @NamazuFuzzTest
Looking for Namazu Swarm (Distributed Parallel CI)?
Namazu Swarm executes multiple CI jobs in parallel across a Docker cluster. Namazu Swarm is developed as a part of Namazu, but it does not depend on Namazu (although you can combine them).
<img src="https://raw.githubusercontent.com/osrg/namazu-swarm/507f1ea51790ebc6d64740e8eb14e009d0353970/docs/img/nmzswarm.png" width="300" />

Namazu Swarm is hosted at osrg/namazu-swarm.
Found and Reproduced Bugs
:new:=Found, :repeat:=Reproduced
Flaky integration tests
Issue|Reproducibility<br>(traditional)|Reproducibility<br>(Namazu)|Note
---|---|---|---
:new: ZOOKEEPER-2212<br>(race)|0%|21.8%|In traditional testing, we could not reproduce the issue in 5,000 runs (60 hours). We newly found the issue and improved its reproducibility using Namazu Ethernet inspector. Note that the reproducibility improvement depends on its configuration (see also #137).<br>Blog article and repro code (Ryu SDN version and Netfilter version) are available.
Flaky xUnit tests (a selection; see also #125)
Issue|Reproducibility<br>(traditional)|Reproducibility<br>(Namazu)|Note
---|---|---|---
:repeat: YARN-4548|11%|82%|Used Namazu process inspector.
:repeat: ZOOKEEPER-2080|14%|62%|Used Namazu Ethernet inspector. Blog article and repro code are available.
:repeat: YARN-4556|2%|44%|Used Namazu process inspector.
:repeat: YARN-5043|12%|30%|Used Namazu process inspector.
:repeat: ZOOKEEPER-2137|2%|16%|Used Namazu process inspector.
:repeat: YARN-4168|1%|8%|Used Namazu process inspector.
:repeat: YARN-1978|0%|4%|Used Namazu process inspector.
:repeat: etcd #5022|0%|3%|Used Namazu process inspector.
We also improved reproducibility of some flaky etcd tests (to be documented).
Others
Issue|Note
---|---
:new: YARN-4301<br>(fault tolerance)|Used Namazu filesystem inspector and Namazu API. Repro code is available.
:new: etcd command line client (etcdctl) #3517<br>(timing specification)|Used Namazu Ethernet inspector. Repro code is available.<br>The issue has been fixed in #3530, and it also led to a hint for #3611.
Talks
- ApacheCon Core North America (May 11-13, 2016, Vancouver) [slide]
- CoreOS Fest (May 9-10, 2016, Berlin) [slide]
- FOSDEM (January 30-31, 2016, Brussels) [slide]
- The poster session of ACM Symposium on Cloud Computing (SoCC) (August 27-29, 2015, Hawaii) [poster]
Talks about Namazu Swarm
- Open Source Summit Japan (May 31-June 2, 2017, Tokyo) [slide]
Getting Started
Installation
The installation process is very simple:
$ sudo apt-get install libzmq3-dev libnetfilter-queue-dev
$ go get github.com/osrg/namazu/nmz
Currently, Namazu is tested with Go 1.6.
You can also download the latest binary from here.
Container Mode
The following command shows how to start Namazu Container, the simplified, Docker-like CLI for Namazu.
$ sudo nmz container run -it --rm -v /foo:/foo ubuntu bash
In Namazu Container, you can run any command that might be flaky. JUnit tests are interesting to try.
nmzc$ git clone something
nmzc$ cd something
nmzc$ for f in $(seq 1 1000);do mvn test; done
You can also specify a config file with the --nmz-autopilot option of nmz container.
A typical configuration file (config.toml) is as follows:
# Policy for observing events and yielding actions
# You can also implement your own policy.
# Default: "random"
explorePolicy = "random"
[explorePolicyParam]
# for Ethernet/Filesystem/Java inspectors, events are non-deterministically delayed.
# minInterval and maxInterval are bounds for the non-deterministic delays
# Default: 0 and 0
minInterval = "80ms"
maxInterval = "3000ms"
# for Ethernet/Filesystem inspectors, you can specify fault-injection probability (0.0-1.0).
# Default: 0.0
faultActionProbability = 0.0
# for Process inspector, you can specify how to schedule processes
# "mild": execute processes with randomly prioritized SCHED_NORMAL/SCHED_BATCH scheduler.
# "extreme": pick up some processes and execute them with SCHED_RR scheduler. others are executed with SCHED_BATCH scheduler.
# "dirichlet": execute processes with SCHED_DEADLINE scheduler. Dirichlet-distribution is used for deciding runtime values.
# Default: "mild"
procPolicy = "extreme"
[container]
# Default: false
enableEthernetInspector = true
ethernetNFQNumber = 42
# Default: true
enableProcInspector = true
procWatchInterval = "1s"
# Default: true (for volumes (`-v /foo:/bar`))
enableFSInspector = true
For other parameters, please refer to config.go and randompolicy.go.
Non-container Mode
Process inspector
$ sudo nmz inspectors proc -pid $TARGET_PID -watch-interval 1s
By default, all the processes and the threads under $TARGET_PID are randomly scheduled.
You can also specify a config file by running with -autopilot config.toml.
You can also set -orchestrator-url (e.g. http://127.0.0.1:10080/api/v3) and -entity-id for distributed execution.
Note that the process inspector may not be effective for reproducing short-running flaky tests, but it is still effective for long-running tests: issue #125.
The guide for reproducing flaky Hadoop tests (please use nmz instead of microearthquake): FOSDEM slide 42.
Filesystem inspector (FUSE)
$ mkdir /tmp/{nmzfs-orig,nmzfs}
$ sudo nmz inspectors fs -original-dir /tmp/nmzfs-orig -mount-point /tmp/nmzfs -autopilot config.toml
$ $TARGET_PROGRAM_WHICH_ACCESSES_TMP_NMZFS
$ sudo fusermount -u /tmp/nmzfs
By default, all the read, mkdir, and rmdir accesses to the files under /tmp/nmzfs are randomly scheduled.
/tmp/nmzfs-orig is just used as the backing storage.
(Note that you have to set explorePolicyParam.minInterval and explorePolicyParam.maxInterval in the config file.)
You can also inject faults (currently just injects -EIO) by set
