This repository has been archived by the owner on Dec 13, 2021. It is now read-only.

Project Rename: Earthquake --> Namazu #142

Merged on Apr 28, 2016 (22 commits)
03ff83b
*: git grep -l osrg/earthquake/earthquake | xargs sed -i -e 's@osrg/e…
AkihiroSuda Apr 27, 2016
bcb5a51
*: git grep -l pyearthquake | xargs sed -i -e 's@pyearthquake@pynmz@g'
AkihiroSuda Apr 27, 2016
1da3326
*: git grep -l net.osrg.earthquake | grep -v "^*.md" | xargs sed -i …
AkihiroSuda Apr 27, 2016
50f9631
*: git grep -l 'osrg/earthquake | xargs sed -i -e s@osrg/earthquake@o…
AkihiroSuda Apr 27, 2016
3013dba
*: fix false s@osrg/earthquake@osrg/namazu@g
AkihiroSuda Apr 27, 2016
8b67770
*: git grep -l Earthquake | grep -v blog/content| xargs sed -i -e s@E…
AkihiroSuda Apr 27, 2016
2d6feed
*: git grep -l earthquake_ | grep -v "^.*log" | grep -v "^.*json" | g…
AkihiroSuda Apr 27, 2016
5cc2d50
*: git grep -l -i Earth | grep -v "^.*log" | grep -v "^.*md" | grep -…
AkihiroSuda Apr 27, 2016
63b418c
*; fix s@earthquake_@namazu_g
AkihiroSuda Apr 27, 2016
50f3fe2
*: fix some things manually
AkihiroSuda Apr 27, 2016
5ccd249
*: git grep -l EQ_ | grep -v blog | xargs sed -i -e s@EQ_@NMZ_@g
AkihiroSuda Apr 27, 2016
3c5b85c
blog: twitter account: @NamazuFuzzTest
AkihiroSuda Apr 27, 2016
154c123
*: rename files
AkihiroSuda Apr 27, 2016
6803c77
doc: fix
AkihiroSuda Apr 27, 2016
d614978
*: git grep -l eqfs | xargs sed -i -e s/eqfs/nmzfs/g
AkihiroSuda Apr 27, 2016
f15f16e
container: rename -eq-config to -nmz-autopilot
AkihiroSuda Apr 27, 2016
53acd2f
inspectors/proc: -root-pid -> -pid
AkihiroSuda Apr 27, 2016
d752d69
*: gofmt
AkihiroSuda Apr 27, 2016
55db6eb
*: git grep -l "namazu --" | xargs sed -i -e 's@namazu --@nmz --@g'
AkihiroSuda Apr 27, 2016
979fb19
*: some fix
AkihiroSuda Apr 27, 2016
7385590
*: fix some namazu->nmz
AkihiroSuda Apr 28, 2016
2aa1421
doc: add logo
AkihiroSuda Apr 28, 2016
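
Most of the commits above are mechanical substitutions of the form `git grep -l PATTERN | xargs sed -i -e s@OLD@NEW@g`. As a rough sketch (not the author's actual script), the function below chains four of the substitutions whose full patterns are visible in the commit messages. The function name `rename_line` is hypothetical, and it works on a single string rather than on the files of a repository.

```shell
# Hypothetical helper: apply a few of the PR's sed substitutions to one line of text.
# The patterns are copied from the commit messages; the helper itself is only an illustration.
rename_line() {
  printf '%s\n' "$1" | sed \
    -e 's@osrg/earthquake@osrg/namazu@g' \
    -e 's@pyearthquake@pynmz@g' \
    -e 's@EQ_@NMZ_@g' \
    -e 's@eqfs@nmzfs@g'
}

rename_line 'docker build -t osrg/earthquake .'   # prints: docker build -t osrg/namazu .
rename_line 'mkdir /tmp/{eqfs-orig,eqfs}'         # prints: mkdir /tmp/{nmzfs-orig,nmzfs}
rename_line 'FREQ_HZ=50'                          # prints: FRNMZ_HZ=50 (unintended match)
```

Plain substring patterns such as `EQ_` also match inside unrelated identifiers, as the last call shows. That is one plausible reason the series contains manual follow-up commits such as `*: fix some things manually`.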
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,4 +1,4 @@
# Earthquake binary
# Namazu binary
bin/

# backups
8 changes: 4 additions & 4 deletions .gitmodules
@@ -1,8 +1,8 @@
[submodule "inspector/c/llvm/clang.git"]
path = inspector/c/llvm/clang.git
[submodule "misc/inspector/c/llvm/clang.git"]
path = misc/inspector/c/llvm/clang.git
url = https://github.com/osrg/clang.git
branch = earthquake
[submodule "inspector/c/llvm/llvm.git"]
path = inspector/c/llvm/llvm.git
[submodule "misc/inspector/c/llvm/llvm.git"]
path = misc/inspector/c/llvm/llvm.git
url = https://github.com/osrg/llvm.git
branch = earthquake
6 changes: 3 additions & 3 deletions .travis.yml
@@ -18,16 +18,16 @@ before_install:
- go get golang.org/x/tools/cmd/cover github.com/axw/gocov/gocov github.com/mattn/goveralls github.com/modocache/gover

script:
# Test the `earthquake` Go package and get the coverage data
# Test the `nmz` Go package and get the coverage data
# https://gist.github.com/rjeczalik/6f01430e8554bf59b88e
- go list -f '{{if len .TestGoFiles}}"go test -race -cover -coverprofile={{.Dir}}/.coverprofile {{.ImportPath}}"{{end}}' ./earthquake/... | xargs -L 1 sh -c
- go list -f '{{if len .TestGoFiles}}"go test -race -cover -coverprofile={{.Dir}}/.coverprofile {{.ImportPath}}"{{end}}' ./nmz/... | xargs -L 1 sh -c
- gover
- goveralls -coverprofile=gover.coverprofile -service=travis-ci
# Test some trivial things
- go vet ./...
- go fmt ./...
# Test the entire Dockerfile
- docker build -t osrg/earthquake .
- docker build -t osrg/namazu .

notifications:
webhooks:
26 changes: 13 additions & 13 deletions Dockerfile
@@ -1,10 +1,10 @@
## Dockerfile for Earthquake
## Available at Docker Hub: osrg/earthquake
## Dockerfile for Namazu
## Available at Docker Hub: osrg/namazu
FROM osrg/dind-ovs-ryu
MAINTAINER Akihiro Suda <suda.akihiro@lab.ntt.co.jp>

RUN apt-get update && apt-get install -y --no-install-recommends \
## Install Earthquake deps
## Install Namazu deps
protobuf-compiler pkg-config libzmq3-dev libnetfilter-queue-dev \
## (Optional) Install Java inspector deps
default-jdk maven \
@@ -14,17 +14,17 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
mongodb \
## (Optional) Install FUSE inspector deps
fuse \
## (Optional) Install pyearthquake deps
## (Optional) Install pynmz deps
python-flask python-scapy python-zmq \
## (Optional) Install pyearthquake nfqhook deps
## (Optional) Install pynmz nfqhook deps
libnetfilter-queue1 python-prctl

## Install Go 1.6
RUN curl https://storage.googleapis.com/golang/go1.6.linux-amd64.tar.gz | tar Cxz /usr/local && mkdir /gopath
ENV PATH /usr/local/go/bin:$PATH
ENV GOPATH /gopath

## (Optional) Install pyearthquake deps
## (Optional) Install pynmz deps
RUN pip install hexdump

## (Optional) Install hookswitch
@@ -37,18 +37,18 @@ RUN chmod +x /usr/local/bin/pipework
## (Optional) Create a user for nfqueue sandbox
RUN useradd -m nfqhooked

## Copy Earthquake to /earthquake
ADD . /earthquake
WORKDIR /earthquake
## Copy Namazu to /namazu
ADD . /namazu
WORKDIR /namazu
RUN ( git submodule init && git submodule update )
ENV PYTHONPATH /earthquake:$PYTHONPATH
ENV PYTHONPATH /namazu:$PYTHONPATH

## Build Earthquake
## Build Namazu
RUN ./build

## Silence dind logs
ENV LOG file

## Start init (does NOT enable DinD/OVS/Ryu by default)
ADD misc/docker/eq-init.py /eq-init.py
CMD ["/eq-init.py"]
ADD misc/docker/nmz-init.py /nmz-init.py
CMD ["/nmz-init.py"]
104 changes: 54 additions & 50 deletions README.md
@@ -1,26 +1,30 @@
# Earthquake: Programmable Fuzzy Scheduler for Testing Distributed Systems
# Namazu: Programmable Fuzzy Scheduler for Testing Distributed Systems

[![Release](http://github-release-version.herokuapp.com/github/osrg/earthquake/release.svg?style=flat)](https://github.com/osrg/earthquake/releases/latest)
[![Join the chat at https://gitter.im/osrg/earthquake](https://img.shields.io/badge/GITTER-join%20chat-green.svg)](https://gitter.im/osrg/earthquake?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
[![GoDoc](https://godoc.org/github.com/osrg/earthquake/earthquake?status.svg)](https://godoc.org/github.com/osrg/earthquake/earthquake)
[![Build Status](https://travis-ci.org/osrg/earthquake.svg?branch=master)](https://travis-ci.org/osrg/earthquake)
[![Coverage Status](https://coveralls.io/repos/github/osrg/earthquake/badge.svg?branch=master)](https://coveralls.io/github/osrg/earthquake?branch=master)
[![Go Report Card](https://goreportcard.com/badge/github.com/osrg/earthquake)](https://goreportcard.com/report/github.com/osrg/earthquake)
[![Release](http://github-release-version.herokuapp.com/github/osrg/namazu/release.svg?style=flat)](https://github.com/osrg/namazu/releases/latest)
[![Join the chat at https://gitter.im/osrg/namazu](https://img.shields.io/badge/GITTER-join%20chat-green.svg)](https://gitter.im/osrg/namazu?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
[![GoDoc](https://godoc.org/github.com/osrg/namazu/nmz?status.svg)](https://godoc.org/github.com/osrg/namazu/nmz)
[![Build Status](https://travis-ci.org/osrg/namazu.svg?branch=master)](https://travis-ci.org/osrg/namazu)
[![Coverage Status](https://coveralls.io/repos/github/osrg/namazu/badge.svg?branch=master)](https://coveralls.io/github/osrg/namazu?branch=master)
[![Go Report Card](https://goreportcard.com/badge/github.com/osrg/namazu)](https://goreportcard.com/report/github.com/osrg/namazu)

Earthquake is a programmable fuzzy scheduler for testing real implementations of distributed system (such as ZooKeeper).
Namazu (formerly named Earthquake) is a programmable fuzzy scheduler for testing real implementations of distributed system such as ZooKeeper.

Blog: [http://osrg.github.io/earthquake/](http://osrg.github.io/earthquake/)
![doc/img/namazu.png](doc/img/namazu.png)

Earthquakes permutes Java function calls, Ethernet packets, Filesystem events, and injected faults in various orders so as to find implementation-level bugs of the distributed system.
Earthquake can also control non-determinism of the thread interleaving (by calling `sched_setattr(2)` with randomized parameters).
So Earthquake can be also used for testing standalone multi-threaded software.
Namazu permutes Java function calls, Ethernet packets, Filesystem events, and injected faults in various orders so as to find implementation-level bugs of the distributed system.
Namazu can also control non-determinism of the thread interleaving (by calling `sched_setattr(2)` with randomized parameters).
So Namazu can be also used for testing standalone multi-threaded software.

Basically, Earthquake permutes events in a random order, but you can write your [own state exploration policy](doc/arch.md) (in Golang) for finding deep bugs efficiently.
Basically, Namazu permutes events in a random order, but you can write your [own state exploration policy](doc/arch.md) (in Golang) for finding deep bugs efficiently.

Blog: [http://osrg.github.io/namazu/](http://osrg.github.io/namazu/)

Twitter: [@NamazuFuzzTest](https://twitter.com/NamazuFuzzTest)

## Found/Reproduced Bugs
* ZooKeeper:
* Found [ZOOKEEPER-2212](https://issues.apache.org/jira/browse/ZOOKEEPER-2212) (race): [blog article](http://osrg.github.io/earthquake/post/zookeeper-2212/) ([repro code](example/zk-found-2212.ryu))
* Reproduced [ZOOKEEPER-2080](https://issues.apache.org/jira/browse/ZOOKEEPER-2080) (race): [blog article](http://osrg.github.io/earthquake/post/zookeeper-2080/) ([repro code](example/zk-repro-2080.nfqhook))
* Found [ZOOKEEPER-2212](https://issues.apache.org/jira/browse/ZOOKEEPER-2212) (race): [blog article](http://osrg.github.io/namazu/post/zookeeper-2212/) ([repro code](example/zk-found-2212.ryu))
* Reproduced [ZOOKEEPER-2080](https://issues.apache.org/jira/browse/ZOOKEEPER-2080) (race): [blog article](http://osrg.github.io/namazu/post/zookeeper-2080/) ([repro code](example/zk-repro-2080.nfqhook))
* etcd:
* Found an etcd command line client (etcdctl) bug [#3517](https://github.com/coreos/etcd/issues/3517) (timing specification), fixed in [#3530](https://github.com/coreos/etcd/pull/3530): ([repro code](example/etcd/3517-reproduce)). The fix also resulted a hint of [#3611](https://github.com/coreos/etcd/pull/3611).
* Reproduced flaky tests {[#4006](https://github.com/coreos/etcd/pull/4006), [#4039](https://github.com/coreos/etcd/issues/4039)} ([repro instruction](http://www.slideshare.net/AkihiroSuda/tackling-nondeterminism-in-hadoop-testing-and-debugging-distributed-systems-with-earthquake-57866497/42))
@@ -32,24 +36,24 @@ Basically, Earthquake permutes events in a random order, but you can write your
The installation process is very simple:

$ sudo apt-get install libzmq3-dev libnetfilter-queue-dev
$ go get github.com/osrg/earthquake/earthquake
$ go get github.com/osrg/namazu/nmz


## Quick Start (Container mode)
The following instruction shows how you can start *Earthquake Container*, the simplified, Docker-like CLI for Earthquake.
The following instruction shows how you can start *Namazu Container*, the simplified, Docker-like CLI for Namazu.

$ sudo earthquake container run -it --rm -v /foo:/foo ubuntu bash
$ sudo nmz container run -it --rm -v /foo:/foo ubuntu bash


In *Earthquake Container*, you can run arbitrary command that might be *flaky*.
In *Namazu Container*, you can run arbitrary command that might be *flaky*.
JUnit tests are interesting to try.

earthquake-container$ git clone something
earthquake-container$ cd something
earthquake-container$ for f in $(seq 1 1000);do mvn test; done
nmzc$ git clone something
nmzc$ cd something
nmzc$ for f in $(seq 1 1000);do mvn test; done


You can also specify a config file (`--eq-config` option for `earthquake container`.)
You can also specify a config file (`--nmz-autopilot` option for `nmz container`.)
A typical configuration file (`config.toml`) is as follows:

```toml
@@ -84,44 +88,44 @@ explorePolicy = "random"
enableProcInspector = true
procWatchInterval = "1s"
```
For other parameters, please refer to [`config.go`](earthquake/util/config/config.go) and [`randompolicy.go`](earthquake/explorepolicy/random/randompolicy.go).
For other parameters, please refer to [`config.go`](nmz/util/config/config.go) and [`randompolicy.go`](nmz/explorepolicy/random/randompolicy.go).


## Quick Start (Non-container mode)

### Process inspector

$ sudo earthquake inspectors proc -root-pid $TARGET_PID -watch-interval 1s
$ sudo nmz inspectors proc -pid $TARGET_PID -watch-interval 1s

By default, all the processes and the threads under `$TARGET_PID` are randomly scheduled.

You can also specify a config file by running with `-autopilot config.toml`.

You can also set `-orchestrator-url` (e.g. `http://127.0.0.1:10080/api/v3`) and `-entity-id` for distributed execution.

Note that the process inspector may be not effective for reproducing short-running flaky tests, but it's still effective for long-running tests: [issue #125](https://github.com/osrg/earthquake/issues/125).
Note that the process inspector may be not effective for reproducing short-running flaky tests, but it's still effective for long-running tests: [issue #125](https://github.com/osrg/namazu/issues/125).


The guide for reproducing flaky Hadoop tests (please use `earthquake` instead of `microearthquake`): [FOSDEM slide 42](http://www.slideshare.net/AkihiroSuda/tackling-nondeterminism-in-hadoop-testing-and-debugging-distributed-systems-with-earthquake-57866497/42).
The guide for reproducing flaky Hadoop tests (please use `nmz` instead of `microearthquake`): [FOSDEM slide 42](http://www.slideshare.net/AkihiroSuda/tackling-nondeterminism-in-hadoop-testing-and-debugging-distributed-systems-with-earthquake-57866497/42).


### Filesystem inspector (FUSE)

$ mkdir /tmp/{eqfs-orig,eqfs}
$ sudo earthquake inspectors fs -original-dir /tmp/eqfs-orig -mount-point /tmp/eqfs
$ $TARGET_PROGRAM_WHICH_ACCESSES_TMP_EQFS
$ sudo fusermount -u /tmp/eqfs
$ mkdir /tmp/{nmzfs-orig,nmzfs}
$ sudo nmz inspectors fs -original-dir /tmp/nmzfs-orig -mount-point /tmp/nmzfs
$ $TARGET_PROGRAM_WHICH_ACCESSES_TMP_NMZFS
$ sudo fusermount -u /tmp/nmzfs

By default, all the `read`, `mkdir`, and `rmdir` accesses to the files under `/tmp/eqfs` are randomly scheduled.
`/tmp/eqfs-orig` is just used as the backing storage.
By default, all the `read`, `mkdir`, and `rmdir` accesses to the files under `/tmp/nmzfs` are randomly scheduled.
`/tmp/nmzfs-orig` is just used as the backing storage.
(Note that you have to set `explorePolicyParam.minInterval` and `explorePolicyParam.maxInterval` in the config file.)

You can also inject faults (currently only `-EIO`) by setting `explorePolicyParam.faultActionProbability` in the config file.

### Ethernet inspector (Linux netfilter_queue)

$ iptables -A OUTPUT -p tcp -m owner --uid-owner $(id -u johndoe) -j NFQUEUE --queue-num 42
$ sudo earthquake inspectors ethernet -nfq-number 42
$ sudo nmz inspectors ethernet -nfq-number 42
$ sudo -u johndoe $TARGET_PROGRAM
$ iptables -D OUTPUT -p tcp -m owner --uid-owner $(id -u johndoe) -j NFQUEUE --queue-num 42

@@ -135,7 +139,7 @@ You have to install [ryu](https://github.com/osrg/ryu) and [hookswitch](https://

$ sudo pip install ryu hookswitch
$ sudo hookswitch-of13 ipc:///tmp/hookswitch-socket --tcp-ports=4242,4243,4244
$ sudo earthquake inspectors ethernet -hookswitch ipc:///tmp/hookswitch-socket
$ sudo nmz inspectors ethernet -hookswitch ipc:///tmp/hookswitch-socket

Please also refer to [doc/how-to-setup-env-full.md](doc/how-to-setup-env-full.md) for this feature.

@@ -151,16 +155,16 @@ Basically please follow these examples: [example/zk-found-2212.ryu](example/zk-f
Prepare `config.toml` for distributed execution.
Example:
```toml
# executed in `earthquake init`
# executed in `nmz init`
init = "init.sh"

# executed in `earthquake run`
# executed in `nmz run`
run = "run.sh"

# executed in `earthquake run` as the test oracle
# executed in `nmz run` as the test oracle
validate = "validate.sh"

# executed in `earthquake run` as the clean-up script
# executed in `nmz run` as the clean-up script
clean = "clean.sh"

# REST port for the communication.
@@ -174,28 +178,28 @@ restPort = 10080
Create `materials` directory, and put `*.sh` into it.

#### Step 3
Run `earthquake init --force config.toml materials /tmp/x`.
Run `nmz init --force config.toml materials /tmp/x`.

This command executes `init.sh` for initializing the workspace `/tmp/x`.
`init.sh` can access the `materials` directory as `${EQ_MATERIALS_DIR}`.
`init.sh` can access the `materials` directory as `${NMZ_MATERIALS_DIR}`.

#### Step 4
Run `for f in $(seq 1 100);do earthquake run /tmp/x; done`.
Run `for f in $(seq 1 100);do nmz run /tmp/x; done`.

This command starts the orchestrator, and executes `run.sh`, `validate.sh`, and `clean.sh` for testing the system (100 times).

`run.sh` should invoke multiple Earthquake inspectors: `earthquake inspectors <proc|fs|ethernet> -entity-id _some_unique_string -orchestrator-url http://127.0.0.1:10080/api/v3`
`run.sh` should invoke multiple Namazu inspectors: `nmz inspectors <proc|fs|ethernet> -entity-id _some_unique_string -orchestrator-url http://127.0.0.1:10080/api/v3`

`*.sh` can access the `/tmp/x/{00000000, 00000001, 00000002, ..., 00000063}` directory as `${EQ_WORKING_DIR}`, which is intended for putting test results and some relevant information. (Note: 0x63==99)
`*.sh` can access the `/tmp/x/{00000000, 00000001, 00000002, ..., 00000063}` directory as `${NMZ_WORKING_DIR}`, which is intended for putting test results and some relevant information. (Note: 0x63==99)
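
The directory listing above suggests that each run gets a working directory named after its 0-based run index, written as eight zero-padded hexadecimal digits. A minimal sketch of that naming, assuming exactly this scheme (the function name `run_dir_name` is hypothetical, and lowercase letters for indices above 9 are an assumption, since the listing only shows digits):

```shell
# Hypothetical helper: print the per-run directory name for a 0-based run index,
# assuming 8 zero-padded hexadecimal digits as in the listing above.
run_dir_name() {
  printf '%08x\n' "$1"
}

run_dir_name 0    # prints: 00000000
run_dir_name 99   # prints: 00000063
```

This is why 100 runs end at `00000063` rather than `00000099`.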

`validate.sh` should exit with zero for successful executions, and with non-zero status for failed executions.

`clean.sh` is an optional clean-up script for each of the execution.
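
As an illustration of the exit-status contract for `validate.sh`, here is a minimal sketch. It assumes that `run.sh` wrote a file named `result.txt` containing the line `OK` into the working directory on success; both the file name and the marker are hypothetical and not part of Namazu.

```shell
# Hypothetical test oracle: succeed only if the run left an "OK" marker behind.
# $1 is the working directory of the run (what the README calls ${NMZ_WORKING_DIR}).
validate() {
  result_file="$1/result.txt"   # assumed file name, written by a hypothetical run.sh
  [ -f "$result_file" ] && grep -q '^OK$' "$result_file"
}

# In a real validate.sh, the last line could be:
#   validate "${NMZ_WORKING_DIR}"
# so that the script exits with zero on success and non-zero on failure.
```

Keeping the check in a function makes it easy to try against a scratch directory before wiring it into `config.toml`.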

#### Step 5
Run `earthquake summary /tmp/x` for summarizing the result.
Run `nmz summary /tmp/x` for summarizing the result.

If you have [JaCoCo](http://eclemma.org/jacoco/) coverage data, you can run `java -jar bin/earthquake-analyzer.jar --classes-path /somewhere/classes /tmp/x` for counting execution patterns as in [FOSDEM slide 18](http://www.slideshare.net/AkihiroSuda/tackling-nondeterminism-in-hadoop-testing-and-debugging-distributed-systems-with-earthquake-57866497/18).
If you have [JaCoCo](http://eclemma.org/jacoco/) coverage data, you can run `java -jar bin/nmz-analyzer.jar --classes-path /somewhere/classes /tmp/x` for counting execution patterns as in [FOSDEM slide 18](http://www.slideshare.net/AkihiroSuda/tackling-nondeterminism-in-hadoop-testing-and-debugging-distributed-systems-with-earthquake-57866497/18).

![doc/img/exec-pattern.png](doc/img/exec-pattern.png)

@@ -207,7 +211,7 @@ If you have [JaCoCo](http://eclemma.org/jacoco/) coverage data, you can run `jav
* The poster session of [ACM Symposium on Cloud Computing (SoCC)](http://acmsocc.github.io/2015/) (August 27-29, 2015, Hawaii)

## How to Contribute
We welcome your contribution to Earthquake.
We welcome your contribution to Namazu.
Please feel free to send your pull requests on github!

## Copyright
@@ -220,7 +224,7 @@ Released under [Apache License 2.0](LICENSE).
## API for your own exploration policy

```go
// implements earthquake/explorepolicy/ExplorePolicy interface
// implements nmz/explorepolicy/ExplorePolicy interface
type MyPolicy struct {
actionCh chan Action
}
@@ -247,7 +251,7 @@ func (p *MyPolicy) QueueEvent(event Event) {
panic(err)
}
// send in a goroutine so as to make the function non-blocking.
// (Note that earthquake/util/queue/TimeBoundedQueue provides
// (Note that nmz/util/queue/TimeBoundedQueue provides
// better semantics and determinism, this is just an example.)
go func() {
fmt.Printf("Action ready: %s\n", action)
@@ -268,5 +272,5 @@ func main(){
Please refer to [example/template](example/template) for further information.

## Known Limitation
After running Earthquake (process inspector with `exploreParam.procPolicyParam="dirichlet"`) many times, `sched_setattr(2)` can fail with `EBUSY`.
After running Namazu (process inspector with `exploreParam.procPolicyParam="dirichlet"`) many times, `sched_setattr(2)` can fail with `EBUSY`.
This seems to be a bug of kernel; We're looking into this.