pub trait Arbitrary<'a>: Sized {
    // Required method
    fn arbitrary(u: &mut Unstructured<'a>) -> Result<Self>;

    // Provided methods
    fn arbitrary_take_rest(u: Unstructured<'a>) -> Result<Self> { ... }
    fn size_hint(depth: usize) -> (usize, Option<usize>) { ... }
}
Generate arbitrary structured values from raw, unstructured data.
The Arbitrary trait allows you to generate valid structured values, like HashMaps, or ASTs, or MyTomlConfig, or any other data structure from raw, unstructured bytes provided by a fuzzer.
Deriving Arbitrary
Automatically deriving the Arbitrary trait is the recommended way to implement Arbitrary for your types.
Using the custom derive requires that you enable the "derive" cargo feature in your Cargo.toml:
[dependencies]
arbitrary = { version = "1", features = ["derive"] }
Then, you add the #[derive(Arbitrary)] annotation to your struct or enum type definition:
use arbitrary::Arbitrary;
use std::collections::HashSet;

#[derive(Arbitrary)]
pub struct AddressBook {
    friends: HashSet<Friend>,
}

#[derive(Arbitrary, Hash, Eq, PartialEq)]
pub enum Friend {
    Buddy { name: String },
    Pal { age: usize },
}
Every member of the struct or enum must also implement Arbitrary.
Implementing Arbitrary By Hand
Implementing Arbitrary mostly involves nested calls to the arbitrary implementations of your struct or enum’s members. But sometimes you need some amount of raw data, or you need to generate a variably-sized collection type, or something of that sort. The Unstructured type helps you with these tasks.
use arbitrary::{Arbitrary, Result, Unstructured};

impl<'a, T> Arbitrary<'a> for MyCollection<T>
where
    T: Arbitrary<'a>,
{
    fn arbitrary(u: &mut Unstructured<'a>) -> Result<Self> {
        // Get an iterator of arbitrary `T`s.
        let iter = u.arbitrary_iter::<T>()?;

        // And then create a collection!
        let mut my_collection = MyCollection::new();
        for elem_result in iter {
            let elem = elem_result?;
            my_collection.insert(elem);
        }

        Ok(my_collection)
    }
}
Required Methods
fn arbitrary(u: &mut Unstructured<'a>) -> Result<Self>
Generate an arbitrary value of Self from the given unstructured data.
Calling Arbitrary::arbitrary requires that you have some raw data, perhaps given to you by a fuzzer like AFL or libFuzzer. You wrap this raw data in an Unstructured, and then you can call <MyType as Arbitrary>::arbitrary to construct an arbitrary instance of MyType from that unstructured data.
Implementations may return an error if there is not enough data to construct a full instance of Self, or they may fill out the rest of Self with dummy values. Using dummy values when the underlying data is exhausted can help avoid accidentally “defeating” some of the fuzzer’s mutations to the underlying byte stream that might otherwise lead to interesting runtime behavior or new code coverage if only we had just a few more bytes. However, it also requires that implementations for recursive types (e.g. struct Foo(Option<Box<Foo>>)) avoid infinite recursion when the underlying data is exhausted.
use arbitrary::{Arbitrary, Unstructured};

#[derive(Arbitrary)]
pub struct MyType {
    // ...
}

// Get the raw data from the fuzzer or wherever else.
let raw_data: &[u8] = get_raw_data_from_fuzzer();

// Wrap that raw data in an `Unstructured`.
let mut unstructured = Unstructured::new(raw_data);

// Generate an arbitrary instance of `MyType` and do stuff with it.
if let Ok(value) = MyType::arbitrary(&mut unstructured) {
    do_stuff(value);
}
See also the documentation for Unstructured.
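The trade-off described above for exhausted input can be illustrated without the crate. The following is a minimal, std-only sketch: ByteSource is a hypothetical stand-in for Unstructured (not the crate's implementation), and the "odd byte means recurse" rule is an assumption chosen for illustration. Because the dummy value handed out on exhaustion selects the non-recursive branch, generating the recursive Foo always terminates.

```rust
/// Hypothetical stand-in for `Unstructured`: a cursor over raw bytes that
/// yields the dummy value 0 once the input is exhausted instead of failing.
struct ByteSource<'a> {
    data: &'a [u8],
}

impl<'a> ByteSource<'a> {
    fn new(data: &'a [u8]) -> Self {
        ByteSource { data }
    }

    /// Take the next byte, or 0 when no data is left.
    fn next_byte(&mut self) -> u8 {
        match self.data.split_first() {
            Some((&first, rest)) => {
                self.data = rest;
                first
            }
            None => 0,
        }
    }
}

/// A recursive type, as in `struct Foo(Option<Box<Foo>>)`.
#[derive(Debug, PartialEq)]
struct Foo(Option<Box<Foo>>);

impl Foo {
    /// ASSUMPTION (illustrative rule): an odd byte means "recurse", an even
    /// byte means "stop". The dummy value 0 is even, so exhausted input
    /// always picks the non-recursive branch.
    fn generate(src: &mut ByteSource<'_>) -> Foo {
        if src.next_byte() & 1 == 1 {
            Foo(Some(Box::new(Foo::generate(src))))
        } else {
            Foo(None)
        }
    }

    /// Number of nested boxes.
    fn depth(&self) -> usize {
        match &self.0 {
            Some(inner) => 1 + inner.depth(),
            None => 0,
        }
    }
}

fn main() {
    // Three odd bytes, then the data runs out and zeros take over.
    let mut src = ByteSource::new(&[1, 3, 5]);
    let foo = Foo::generate(&mut src);
    println!("depth = {}", foo.depth()); // depth = 3
}
```

If the dummy value instead mapped to the recursive branch, generate would never return once the data ran out, which is the hazard the paragraph above warns about.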
Provided Methods
fn arbitrary_take_rest(u: Unstructured<'a>) -> Result<Self>
Generate an arbitrary value of Self from the entirety of the given unstructured data.
This is similar to Arbitrary::arbitrary; however, it assumes that it is the last consumer of the given data, and is thus able to consume it all if it needs to. See also the documentation for Unstructured.
fn size_hint(depth: usize) -> (usize, Option<usize>)
Get a size hint for how many bytes out of an Unstructured this type needs to construct itself.
This is useful for determining how many elements we should insert when creating an arbitrary collection.
The return value is similar to Iterator::size_hint: it returns a tuple where the first element is a lower bound on the number of bytes required, and the second element is an optional upper bound.
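For readers unfamiliar with the (lower, Option<upper>) shape, here is a small sketch using only the standard library's Iterator::size_hint. It does not call the arbitrary crate; the helper function names and the SizeHint alias are made up for illustration. The point is only how an exact, a partially known, and a completely unknown bound look in this tuple form.

```rust
use std::iter;

/// The tuple shape shared by `Iterator::size_hint` and `Arbitrary::size_hint`:
/// a lower bound and an optional upper bound.
type SizeHint = (usize, Option<usize>);

/// An exact-size source: both bounds are known and equal.
fn exact_hint() -> SizeHint {
    (0..10).size_hint()
}

/// `filter` may drop any number of items, so the lower bound falls to 0
/// while the upper bound is kept.
fn filtered_hint() -> SizeHint {
    (0..10).filter(|n| n % 2 == 0).size_hint()
}

/// `from_fn` does not override `size_hint`, so it reports the default
/// `(0, None)`: always correct, but not informative.
fn default_hint() -> SizeHint {
    iter::from_fn(|| None::<u8>).size_hint()
}

fn main() {
    println!("{:?}", exact_hint());    // (10, Some(10))
    println!("{:?}", filtered_hint()); // (0, Some(10))
    println!("{:?}", default_hint());  // (0, None)
}
```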
The default implementation returns (0, None), which is correct for any type, but not ultimately that useful. Using #[derive(Arbitrary)] will create a better implementation. If you are writing an Arbitrary implementation by hand, and your type can be part of a dynamically sized collection (such as Vec), you are strongly encouraged to override this default with a better implementation. The size_hint module will help with this task.
Invariant
It must be possible to construct every possible output using only inputs of lengths bounded by these parameters. This applies to both Arbitrary::arbitrary and Arbitrary::arbitrary_take_rest.
This is trivially true for (0, None). To restrict this further, it must be proven that all inputs that are now excluded produced redundant outputs which are still possible to produce using the reduced input space.
The depth Parameter
If you 100% know that the type you are implementing Arbitrary for is not a recursive type, or your implementation is not transitively calling any other size_hint methods, you can ignore the depth parameter. Note that if you are implementing Arbitrary for a generic type, you cannot guarantee the lack of type recursion!
Otherwise, you need to use arbitrary::size_hint::recursion_guard(depth) to prevent potential infinite recursion when calculating size hints for potentially recursive types:
use arbitrary::{Arbitrary, Unstructured, size_hint};

// This can potentially be a recursive type if `L` or `R` contain
// something like `Box<Option<MyEither<L, R>>>`!
enum MyEither<L, R> {
    Left(L),
    Right(R),
}

impl<'a, L, R> Arbitrary<'a> for MyEither<L, R>
where
    L: Arbitrary<'a>,
    R: Arbitrary<'a>,
{
    fn arbitrary(u: &mut Unstructured<'a>) -> arbitrary::Result<Self> {
        // ...
    }

    fn size_hint(depth: usize) -> (usize, Option<usize>) {
        // Protect against potential infinite recursion with
        // `recursion_guard`.
        size_hint::recursion_guard(depth, |depth| {
            // If we aren't too deep, then `recursion_guard` calls
            // this closure, which implements the natural size hint.
            // Don't forget to use the new `depth` in all nested
            // `size_hint` calls! We recommend shadowing the
            // parameter, like what is done here, so that you can't
            // accidentally use the wrong depth.
            size_hint::or(
                <L as Arbitrary>::size_hint(depth),
                <R as Arbitrary>::size_hint(depth),
            )
        })
    }
}
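To see why the guard matters, the following std-only sketch models the idea on a concrete recursive type. Everything here is a stand-in written for illustration: recursion_guard, or, and and are simplified local functions that mimic the roles of the size_hint module's helpers, the MAX_DEPTH value of 20 is an assumption, and the List type with a one-byte tag is hypothetical. It is not the crate's source.

```rust
type SizeHint = (usize, Option<usize>);

/// ASSUMPTION: an illustrative depth limit, not a documented constant.
const MAX_DEPTH: usize = 20;

/// Stand-in guard: past the depth limit, give up with the always-correct
/// `(0, None)`; otherwise run `f` with the depth incremented.
fn recursion_guard(depth: usize, f: impl FnOnce(usize) -> SizeHint) -> SizeHint {
    if depth > MAX_DEPTH {
        (0, None)
    } else {
        f(depth + 1)
    }
}

/// Hint for "one of these": smallest lower bound, largest upper bound
/// (unbounded if either side is unbounded).
fn or(lhs: SizeHint, rhs: SizeHint) -> SizeHint {
    let lower = lhs.0.min(rhs.0);
    let upper = lhs.1.and_then(|l| rhs.1.map(|r| l.max(r)));
    (lower, upper)
}

/// Hint for "both of these": the bounds add up.
fn and(lhs: SizeHint, rhs: SizeHint) -> SizeHint {
    let lower = lhs.0 + rhs.0;
    let upper = lhs.1.and_then(|l| rhs.1.map(|r| l + r));
    (lower, upper)
}

/// Size hint for a hypothetical `enum List { Nil, Cons(u8, Box<List>) }`,
/// assuming one tag byte followed by the chosen variant's fields.
fn list_size_hint(depth: usize) -> SizeHint {
    recursion_guard(depth, |depth| {
        let tag = (1, Some(1));
        let nil = (0, Some(0));
        // The nested call uses the shadowed, incremented `depth`.
        let cons = and((1, Some(1)), list_size_hint(depth));
        and(tag, or(nil, cons))
    })
}

fn main() {
    // At least the tag byte is needed; no finite upper bound exists.
    println!("{:?}", list_size_hint(0)); // (1, None)
}
```

Without the guard, list_size_hint would call itself unconditionally and overflow the stack. With it, the innermost call returns (0, None), which propagates outward as "no upper bound".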