fuzzcheck 0.9.0

# Fuzzcheck

[![CI](https://github.com/loiclec/fuzzcheck-rs/actions/workflows/cargo.yml/badge.svg)](https://github.com/loiclec/fuzzcheck-rs/actions/workflows/cargo.yml)
[![Docs](https://img.shields.io/docsrs/fuzzcheck?color=blueviolet)](https://docs.rs/fuzzcheck)
[![MIT licensed](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE.txt)
[![crates.io](https://img.shields.io/crates/v/fuzzcheck)](https://crates.io/crates/fuzzcheck)

Fuzzcheck is a modular, structure-aware, and feedback-driven fuzzing engine for Rust 
functions. 

Given a function `test: (T) -> bool`, you can use fuzzcheck to find a value of
type `T` that fails the test or leads to a crash. 

The tool [`fuzzcheck-view`](https://github.com/loiclec/fuzzcheck-view) is available
to visualise the code coverage of each test case, or of all test cases combined, 
generated by fuzzcheck. It is still just a prototype, though.

Follow the [guide at fuzzcheck.neocities.org](https://fuzzcheck.neocities.org) to get 
started or read [the documentation on docs.rs](https://docs.rs/fuzzcheck).

## Setup

Linux or macOS is required. [Windows support is planned but I need help with it](https://github.com/loiclec/fuzzcheck-rs/issues/8).

Rust nightly is also required. You can install it with:
```bash
rustup toolchain install nightly
```

While it is not strictly necessary, installing the `cargo-fuzzcheck` 
executable will make it easier to run fuzzcheck.
```bash
cargo install cargo-fuzzcheck
```

In your `Cargo.toml` file, add `fuzzcheck` as a dev-dependency:
```toml
[dev-dependencies]
fuzzcheck = "0.9"
```

Then, we need a way to serialize values. By default, fuzzcheck uses `serde_json`
for that purpose (but it can be changed). That means our data types should 
implement serde's traits. In `Cargo.toml`, add:
```toml
[dependencies]
serde = { version = "1.0", features = ["derive"] }
```

## Usage

Below is an example of a fuzz test. Note:
1. all code related to fuzzcheck is conditional on `#[cfg(test)]` because we 
don't want to carry the fuzzcheck dependency in normal builds
2. the `#![cfg_attr(test, feature(no_coverage))]` attribute is required by fuzzcheck’s procedural macros
3. the use of `derive(fuzzcheck::DefaultMutator)` makes a custom type fuzzable

```rust
#![cfg_attr(test, feature(no_coverage))]
use serde::{Deserialize, Serialize};

#[cfg_attr(test, derive(fuzzcheck::DefaultMutator))]
#[derive(Clone, Serialize, Deserialize)]
struct SampleStruct<T, U> {
    x: T,
    y: U,
}

#[cfg_attr(test, derive(fuzzcheck::DefaultMutator))]
#[derive(Clone, Serialize, Deserialize)]
enum SampleEnum {
    A(u16),
    B,
    C { x: bool, y: bool },
}

fn should_not_crash(xs: &[SampleStruct<u8, SampleEnum>]) {
    if xs.len() > 3
        && xs[0].x == 100
        && matches!(xs[0].y, SampleEnum::C { x: false, y: true })
        && xs[1].x == 55
        && matches!(xs[1].y, SampleEnum::C { x: true, y: false })
        && xs[2].x == 87
        && matches!(xs[2].y, SampleEnum::C { x: false, y: false })
        && xs[3].x == 24
        && matches!(xs[3].y, SampleEnum::C { x: true, y: true })
    {
        panic!()
    }
}

// fuzz tests reside along your other tests and have the #[test] attribute
#[cfg(test)]
mod tests {
    #[test]
    fn test_function_shouldn_t_crash() {
        let result = fuzzcheck::fuzz_test(super::should_not_crash) // the test function to fuzz
            .default_mutator() // the mutator to generate values of &[SampleStruct<u8, SampleEnum>]
            .serde_serializer() // save the test cases to the file system using serde
            .default_sensor_and_pool() // gather observations using the default sensor (i.e. recording code coverage)
            .arguments_from_cargo_fuzzcheck() // take arguments from the cargo-fuzzcheck command line tool
            .stop_after_first_test_failure(true) // stop the fuzzer as soon as a test failure is found
            .launch();
        assert!(!result.found_test_failure);
    }
}
```

We can now use `cargo-fuzzcheck` to launch the test, using Rust nightly:
```sh
rustup override set nightly
# the argument is the *exact* path to the test function
cargo fuzzcheck tests::test_function_shouldn_t_crash
```

This starts a loop that will stop when a failing test has been found. After about 50ms of fuzz-testing on my machine, 
the following line is printed:
```
Failing test case found. Saving at "fuzz/tests::test_function_shouldn_t_crash/artifacts/59886edc1de2dcc1.json"
```
The file `59886edc1de2dcc1.json` contains the JSON-encoded input that failed the test.

```json
[
  {
    "x": 100,
    "y": {
      "C": {
        "x": false,
        "y": true
      }
    }
  },
  {
    "x": 55,
    "y": {
      "C": {
        "x": true,
        "y": false
      }
    }
  },
  ..
]
```

## Minifying failing test inputs

Fuzzcheck can also be used to *minify* a large input that fails a test.
If the failure is recoverable (i.e. it is not a segfault/stack overflow), and 
the fuzzer is not instructed to stop after the first failure, then the failing
test cases will be minified automatically. Otherwise, you can use the `minify`
command.

Let's say you have a file `crash.json` containing an input that you would like
to minify. Launch `cargo fuzzcheck <exact name of fuzz test>` with the `minify` command
and an `--input-file` option.

```bash
cargo fuzzcheck "tests::test_function_shouldn_t_crash" --command minify --input-file "crash.json"
```

This will repeatedly launch the fuzzer in “minify” mode and save the
artifacts in the folder `artifacts/crash.minified`. The name of each artifact 
will be prefixed with the complexity of its input. For example,
`crash.minified/800--fe958d4f003bd4f5.json` has a complexity of `8.00`.

You can stop the minifying fuzzer at any point and look for the least complex
input in the `crash.minified` folder.
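
Picking the least complex artifact can also be scripted. The sketch below is not part of fuzzcheck: `complexity_of` and `least_complex` are hypothetical helpers, and they assume that the numeric prefix of each file name is the complexity multiplied by 100, as the `800` / `8.00` example above suggests.

```rust
/// Parses the complexity prefix of an artifact file name such as
/// `800--fe958d4f003bd4f5.json`.
/// Assumption: the prefix is the complexity multiplied by 100.
fn complexity_of(file_name: &str) -> Option<f64> {
    // Split "800--fe95….json" into the prefix and the rest of the name.
    let (prefix, _rest) = file_name.split_once("--")?;
    let hundredths: u64 = prefix.parse().ok()?;
    Some(hundredths as f64 / 100.0)
}

/// Returns the file name with the smallest complexity prefix, ignoring
/// names that do not follow the `<complexity>--<hash>` pattern.
fn least_complex<'a>(file_names: &[&'a str]) -> Option<&'a str> {
    file_names
        .iter()
        .copied()
        .filter_map(|name| complexity_of(name).map(|c| (c, name)))
        .min_by(|a, b| a.0.total_cmp(&b.0))
        .map(|(_, name)| name)
}
```

The file names would typically come from listing the `crash.minified` folder, for example with `std::fs::read_dir`.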

## Alternatives

Other crates with the same goal are [`quickcheck`](https://crates.io/crates/quickcheck) 
and [`proptest`](https://crates.io/crates/proptest). Fuzzcheck can be more powerful 
than these because it guides the generation of test cases based on feedback
generated from running the test function. This feedback is most often code coverage,
but can be different.

Another similar crate is [`cargo-fuzz`](https://crates.io/crates/cargo-fuzz), often paired 
with [`arbitrary`](https://crates.io/crates/arbitrary). In this case, 
fuzzcheck has the advantage of being easier to use, more modular, and more 
fundamentally structure-aware, and thus potentially more efficient.

## Previous work on fuzzing engines

As far as I know, evolutionary, coverage-guided fuzzing engines were
popularized by [American Fuzzy Lop (AFL)](http://lcamtuf.coredump.cx/afl/).  
Fuzzcheck is also evolutionary and coverage-guided.

Later on, LLVM released its own fuzzing engine, 
[libFuzzer](https://www.llvm.org/docs/LibFuzzer.html), which is based on the
same ideas as AFL, but it uses Clang’s 
[SanitizerCoverage](https://clang.llvm.org/docs/SanitizerCoverage.html) and is
in-process (it lives in the same process as the program being fuzz-tested).  
Fuzzcheck is also in-process. It uses rustc’s `-Z instrument-coverage` option 
instead of SanitizerCoverage for code coverage instrumentation.

Both AFL and libFuzzer work by manipulating bitstrings (e.g. `1011101011`).
However, many programs work on structured data, and mutations at the
bitstring level may not map to meaningful mutations at the level of the
structured data. This problem can be partially addressed by using a compact
binary encoding such as protobuf and providing custom mutation functions to
libFuzzer that work on the structured data itself. This is a way to perform
“structure-aware fuzzing” ([talk](https://www.youtube.com/watch?v=U60hC16HEDY),
[tutorial](https://github.com/google/fuzzer-test-suite/blob/master/tutorial/structure-aware-fuzzing.md)).

An alternative way to deal with structured data is to use generators, like
QuickCheck’s `Arbitrary` trait, and then to “treat the raw byte buffer input 
provided by the coverage-guided fuzzer as a sequence of random values and
implement a ‘random’ number generator around it” 
([cited blog post by @fitzgen](https://fitzgeraldnick.com/2019/09/04/combining-coverage-guided-and-generation-based-fuzzing.html)). 
The tool `cargo-fuzz` has
[recently](https://fitzgeraldnick.com/2020/01/16/better-support-for-fuzzing-structured-inputs-in-rust.html) 
implemented that approach.
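
To make the idea concrete, here is a minimal sketch of a generator driven by a byte buffer. It is not the API of `arbitrary` or `cargo-fuzz`: `ByteSource`, `Point`, and `generate_points` are hypothetical names used only for illustration.

```rust
/// Hypothetical wrapper that hands out "random" values taken from the
/// bytes provided by a coverage-guided fuzzer.
struct ByteSource<'a> {
    bytes: &'a [u8],
}

impl<'a> ByteSource<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        Self { bytes }
    }

    /// Consumes and returns the next byte, or 0 once the buffer is exhausted.
    fn next_u8(&mut self) -> u8 {
        let bytes = self.bytes;
        match bytes.split_first() {
            Some((&first, rest)) => {
                self.bytes = rest;
                first
            }
            None => 0,
        }
    }

    /// Uses the lowest bit of the next byte as a boolean.
    fn next_bool(&mut self) -> bool {
        (self.next_u8() & 1) == 1
    }
}

#[derive(Debug, PartialEq)]
struct Point {
    x: u8,
    visible: bool,
}

/// Builds a structured value from the byte source: the first byte picks
/// a length between 0 and 3, then each element consumes two more bytes.
fn generate_points(src: &mut ByteSource<'_>) -> Vec<Point> {
    let len = src.next_u8() % 4;
    (0..len)
        .map(|_| Point { x: src.next_u8(), visible: src.next_bool() })
        .collect()
}
```

In this scheme the fuzzer still mutates raw bytes, so a small change to the buffer (for instance to the length byte) can change how every later byte is interpreted.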

Fuzzcheck is also structure-aware, but unlike previous attempts at
structure-aware fuzzing, it doesn't use an intermediary binary encoding such as
protobuf nor does it use Quickcheck-like generators.
Instead, it directly mutates the typed values in-process.
This is better in many ways. First, it is faster because there is no
need to encode and decode inputs at each iteration. Second, the complexity of
the input is given by a user-defined function, which will be more accurate than
counting the bytes of the protobuf encoding.
Finally, and most importantly, the mutations are faster and more meaningful 
than those done on protobuf or `Arbitrary`’s byte buffer-based RNG.
A detail that I particularly like about fuzzcheck, and that is possible only 
because it mutates typed values, is that every mutation is done **in-place**
and is reversible. That means that generating a new test case is super fast, 
and can often even be done with zero allocations.
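
The following sketch illustrates what an in-place, reversible mutation can look like for a `Vec<u8>`. It is a simplified illustration rather than fuzzcheck's actual mutator code: the `UnmutateToken` enum and the `mutate` and `unmutate` functions are assumptions made for this example, and `choice` stands in for a random number source.

```rust
/// Hypothetical token recording how to undo a single mutation.
enum UnmutateToken {
    /// An element was overwritten; holds its index and previous value.
    Replaced { index: usize, old: u8 },
    /// An element was pushed to the end of the vector.
    Pushed,
    /// The last element was removed; holds the removed value.
    Popped { old: u8 },
}

/// Mutates `value` in place and returns a token describing how to undo it.
fn mutate(value: &mut Vec<u8>, choice: u64) -> UnmutateToken {
    match choice % 3 {
        0 if !value.is_empty() => {
            // Overwrite one element; no allocation is needed.
            let index = (choice / 3) as usize % value.len();
            let old = value[index];
            value[index] = old.wrapping_add(1);
            UnmutateToken::Replaced { index, old }
        }
        1 if !value.is_empty() => {
            // Remove the last element; no allocation is needed.
            let old = value.pop().unwrap();
            UnmutateToken::Popped { old }
        }
        _ => {
            // Append an element; allocates only if the capacity is exceeded.
            value.push((choice >> 8) as u8);
            UnmutateToken::Pushed
        }
    }
}

/// Restores `value` to its state before the mutation that produced `token`.
fn unmutate(value: &mut Vec<u8>, token: UnmutateToken) {
    match token {
        UnmutateToken::Replaced { index, old } => value[index] = old,
        UnmutateToken::Pushed => {
            value.pop();
        }
        UnmutateToken::Popped { old } => value.push(old),
    }
}
```

A fuzzing loop built this way can call `mutate`, run the test on the mutated value, and then call `unmutate` to get the original input back without cloning it.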

As I was developing Fuzzcheck for Swift, a few researchers developed Fuzzchick
for Coq ([paper](https://www.cs.umd.edu/~mwh/papers/fuzzchick-draft.pdf)). It 
is a coverage-guided property-based testing tool implemented as an extension to
Quickchick. As far as I know, it is the only other tool with the same philosophy
as fuzzcheck. The similarity between the names `fuzzcheck` and `Fuzzchick` is a 
coincidence.

[LibAFL](https://github.com/AFLplusplus/LibAFL) is another modular fuzzer written
in Rust. It was released relatively recently.