Crate failpoints
A fail point implementation for Rust.
Fail points are code instrumentations that allow errors and other behavior to be injected dynamically at runtime, primarily for testing purposes. Fail points are flexible and can be configured to exhibit a variety of behavior, including panics, early returns, and sleeping. They can be controlled both programmatically and via the environment, and can be triggered conditionally and probabilistically.
This crate is inspired by FreeBSD’s failpoints.
Usage
First, add this to your Cargo.toml:
[dependencies]
failpoints = "0.1"
Now you can import the failpoint! macro from the failpoints crate and use it to inject dynamic failures.
As an example, here’s a simple program that uses a fail point to simulate an I/O panic:
use failpoints::{failpoint, FailScenario};

fn do_fallible_work() {
    failpoint!("read-dir");
    let _dir: Vec<_> = std::fs::read_dir(".").unwrap().collect();
    // ... do some work on the directory ...
}

fn main() {
    let scenario = FailScenario::setup();
    do_fallible_work();
    scenario.teardown();
    println!("done");
}
Here, the program calls unwrap on the result of read_dir, a function that returns a Result. In other words, this particular program expects this call to read_dir to always succeed. In practice it almost always will, which makes the behavior of this program when read_dir fails difficult to test. By instrumenting the program with a fail point we can simulate that failure: the fail point panics just before the call, much as the unwrap would if read_dir returned an error, allowing us to observe the program’s behavior under failure conditions.
When the program is run normally it just prints “done”:
$ cargo run --features failpoints/failpoints
Finished dev [unoptimized + debuginfo] target(s) in 0.01s
Running `target/debug/failpointtest`
done
But now, by setting the FAILPOINTS environment variable, we can see what happens if read_dir fails:
$ FAILPOINTS=read-dir=panic cargo run --features failpoints/failpoints
Finished dev [unoptimized + debuginfo] target(s) in 0.01s
Running `target/debug/failpointtest`
thread 'main' panicked at 'failpoint read-dir panic', /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/fail-0.2.0/src/lib.rs:286:25
note: Run with `RUST_BACKTRACE=1` for a backtrace.
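To make the environment-variable form concrete, here is a minimal standard-library sketch of how a FAILPOINTS-style string could be split into name/action pairs. The parse_failpoints helper is hypothetical and not part of this crate, and the ";" separator between entries is an assumption made for illustration; the crate’s own parser may accept a richer syntax.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of the crate): split a FAILPOINTS-style
/// string such as "read-dir=panic;write=return" into name -> action pairs.
/// The ';' separator between entries is an assumption made for this sketch.
pub fn parse_failpoints(spec: &str) -> Result<HashMap<String, String>, String> {
    let mut map = HashMap::new();
    for entry in spec.split(';').map(str::trim).filter(|e| !e.is_empty()) {
        // Split on the first '=' only, so the action may itself contain '='.
        let (name, action) = entry
            .split_once('=')
            .ok_or_else(|| format!("missing '=' in entry {:?}", entry))?;
        if name.is_empty() {
            return Err(format!("empty fail point name in entry {:?}", entry));
        }
        map.insert(name.to_string(), action.to_string());
    }
    Ok(map)
}
```

Splitting on the first "=" is what lets a value like read-dir=panic, itself assigned to FAILPOINTS with another "=", be read unambiguously as the name read-dir and the action panic.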
Usage in tests
The previous example triggers a fail point by modifying the FAILPOINTS environment variable. In practice, you’ll often want to trigger fail points programmatically, in unit tests.
Fail points are global resources, and Rust tests run in parallel, so tests that exercise fail points generally need to hold a lock to avoid interfering with each other. This is accomplished by FailScenario.
Here’s a basic pattern for writing unit tests with fail points:
use failpoints::{failpoint, FailScenario};

fn do_fallible_work() {
    failpoint!("read-dir");
    let _dir: Vec<_> = std::fs::read_dir(".").unwrap().collect();
    // ... do some work on the directory ...
}

#[test]
#[should_panic]
fn test_fallible_work() {
    let scenario = FailScenario::setup();
    failpoints::cfg("read-dir", "panic").unwrap();
    do_fallible_work();
    scenario.teardown();
}
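The locking idea behind this pattern can be sketched with the standard library alone. The Scenario type, the REGISTRY and SCENARIO_LOCK statics, and the configure and configured_count helpers below are hypothetical stand-ins written for this illustration; they are not the crate’s FailScenario, whose internals may differ.

```rust
use std::sync::{Mutex, MutexGuard};

// Global table of configured fail points, stored as (name, action) pairs.
static REGISTRY: Mutex<Vec<(String, String)>> = Mutex::new(Vec::new());
// Held for the whole lifetime of a scenario so that scenarios never overlap.
static SCENARIO_LOCK: Mutex<()> = Mutex::new(());

fn registry() -> MutexGuard<'static, Vec<(String, String)>> {
    // Recover from poisoning: a test that panicked must not block later tests.
    REGISTRY.lock().unwrap_or_else(|e| e.into_inner())
}

/// Hypothetical scenario guard for this sketch (not the crate's FailScenario).
pub struct Scenario {
    _guard: MutexGuard<'static, ()>,
}

impl Scenario {
    /// Take the global lock, then start from an empty registry.
    pub fn setup() -> Scenario {
        let guard = SCENARIO_LOCK.lock().unwrap_or_else(|e| e.into_inner());
        registry().clear();
        Scenario { _guard: guard }
    }

    /// Consume the scenario; the cleanup itself happens in Drop.
    pub fn teardown(self) {}
}

impl Drop for Scenario {
    fn drop(&mut self) {
        // Runs on teardown and during unwinding, then the lock is released.
        registry().clear();
    }
}

/// Record an action for a named fail point.
pub fn configure(name: &str, action: &str) {
    registry().push((name.to_string(), action.to_string()));
}

/// Number of fail points currently configured.
pub fn configured_count() -> usize {
    registry().len()
}
```

Because the guard is stored inside the scenario value, a second test calling setup blocks until the first scenario is torn down or dropped. Cleanup lives in Drop, so in this sketch a test that panics partway through still leaves an empty registry behind.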
Even if a test does not itself turn on any fail points, code that it runs could trigger a fail point that was configured by another thread. Because of this it is a best practice to put all fail point unit tests into their own binary. Here’s an example of a snippet from Cargo.toml that creates a fail-point-specific test binary:
[[test]]
name = "failpoints"
path = "tests/failpoints/mod.rs"
required-features = ["failpoints/failpoints"]
Early return
The previous examples illustrate injecting panics via fail points, but panics aren’t the only (or even the most common) error pattern in Rust. The more common type of error is propagated by Result return values, and fail points can inject those as well with “early returns”. That is, when configuring a fail point as “return” (as opposed to “panic”), the fail point will immediately return from the function, optionally with a configurable value.
The setup for early return requires a slightly different invocation of the failpoint! macro. To illustrate this, let’s modify the do_fallible_work function we used earlier to return a Result:
use failpoints::{failpoint, FailScenario};
use std::io;

fn do_fallible_work() -> io::Result<()> {
    failpoint!("read-dir");
    let _dir: Vec<_> = std::fs::read_dir(".")?.collect();
    // ... do some work on the directory ...
    Ok(())
}

fn main() -> io::Result<()> {
    let scenario = FailScenario::setup();
    do_fallible_work()?;
    scenario.teardown();
    println!("done");
    Ok(())
}
This example uses more idiomatic Rust error handling, with no unwraps anywhere. Instead it uses ? to propagate errors through Result return values, which is closer to how real Rust code is written.
The “read-dir” fail point, though, is not yet configured to support early return, so if we attempt to configure it to “return”, we’ll see an error like:
$ FAILPOINTS=read-dir=return cargo run --features failpoints/failpoints
Finished dev [unoptimized + debuginfo] target(s) in 0.13s
Running `target/debug/failpointtest`
thread 'main' panicked at 'Return is not supported for the fail point "read-dir"', src/main.rs:7:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.
This error tells us that the “read-dir” fail point is not defined correctly to support early return, and gives us the line number of that fail point. What we’re missing in the fail point definition is code describing how to return an error value, and the way we do this is by passing failpoint! a closure that returns the same type as the enclosing function.
Here’s a variation that does so:
fn do_fallible_work() -> io::Result<()> {
    failpoints::failpoint!("read-dir", |_| {
        Err(io::Error::new(io::ErrorKind::PermissionDenied, "error"))
    });
    let _dir: Vec<_> = std::fs::read_dir(".")?.collect();
    // ... do some work on the directory ...
    Ok(())
}
And now if the “read-dir” fail point is configured to “return” we get a different result:
$ FAILPOINTS=read-dir=return cargo run --features failpoints/failpoints
Compiling failpointtest v0.1.0
Finished dev [unoptimized + debuginfo] target(s) in 2.38s
Running `target/debug/failpointtest`
Error: Custom { kind: PermissionDenied, error: StringError("error") }
This time, do_fallible_work returned the error defined in our closure, which propagated all the way up and out of main.
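To show why the closure must return the same type as the enclosing function, here is a simplified, standard-library-only sketch of how a fail point macro could implement early return. The sketch_failpoint! macro, the ACTIONS table, and the set_action and lookup helpers are hypothetical names invented for this illustration. Unlike the crate’s failpoint!, the closure in this sketch takes no argument.

```rust
use std::io;
use std::sync::Mutex;

// Hypothetical global table of (name, action) pairs for this sketch.
static ACTIONS: Mutex<Vec<(String, String)>> = Mutex::new(Vec::new());

/// Set the action ("panic", "return", ...) for a named fail point.
pub fn set_action(name: &str, action: &str) {
    let mut table = ACTIONS.lock().unwrap();
    table.retain(|(n, _)| n.as_str() != name);
    table.push((name.to_string(), action.to_string()));
}

/// Look up the action currently configured for a name, if any.
pub fn lookup(name: &str) -> Option<String> {
    let table = ACTIONS.lock().unwrap();
    table.iter().find(|(n, _)| n.as_str() == name).map(|(_, a)| a.clone())
}

/// Simplified stand-in for a fail point macro (not the crate's failpoint!).
macro_rules! sketch_failpoint {
    // Without a closure the fail point can panic but cannot return early.
    ($name:expr) => {
        match lookup($name).as_deref() {
            Some("panic") => panic!("failpoint {} panic", $name),
            Some("return") => {
                panic!("Return is not supported for the fail point \"{}\"", $name)
            }
            _ => {}
        }
    };
    // With a closure, "return" leaves the enclosing function with its value.
    ($name:expr, $on_return:expr) => {
        match lookup($name).as_deref() {
            Some("panic") => panic!("failpoint {} panic", $name),
            Some("return") => return ($on_return)(),
            _ => {}
        }
    };
}

pub fn do_fallible_work() -> io::Result<()> {
    sketch_failpoint!("read-dir", || {
        Err(io::Error::new(io::ErrorKind::PermissionDenied, "error"))
    });
    // ... the real work would go here ...
    Ok(())
}
```

Because the macro expands to a return statement inside do_fallible_work, the closure’s value becomes that function’s return value, so the compiler requires the two types to match.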
Advanced usage
That’s the basics of fail points: defining them with failpoint!, configuring them with FAILPOINTS and failpoints::cfg, and configuring them to panic and return early. But that’s not all they can do. To learn more see the documentation for cfg, cfg_callback and failpoint!.
Usage considerations
For most effective fail point usage, keep in mind the following:
- Fail points are disabled by default and can be enabled via the failpoints feature. When failpoints are disabled, no code is generated by the macro.
- Carefully consider complex, concurrent, non-deterministic combinations of fail points. Put test cases exercising fail points into their own test crate.
- Fail points might have the same name, in which case they take the same actions. Be careful about duplicating fail point names, either within a single crate, or across multiple crates.
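The first consideration, that no code is generated when the feature is off, can be illustrated with a feature-gated macro. This is only a sketch of the general technique: gated_failpoint!, failpoints_enabled, and read_config are hypothetical names, and the enabled arm’s direct read of FAILPOINTS is a simplification, not how the crate evaluates fail points.

```rust
/// Hypothetical macro for this sketch; compiled only when the feature is on.
#[cfg(feature = "failpoints")]
macro_rules! gated_failpoint {
    ($name:expr) => {
        // Simplified check: panic only if FAILPOINTS is exactly "<name>=panic".
        if std::env::var("FAILPOINTS")
            .map_or(false, |v| v == format!("{}=panic", $name))
        {
            panic!("failpoint {} panic", $name);
        }
    };
}

/// With the feature off, the macro expands to nothing at the call site.
#[cfg(not(feature = "failpoints"))]
macro_rules! gated_failpoint {
    ($name:expr) => {};
}

/// Reports at compile time whether the feature-gated code was generated.
pub const fn failpoints_enabled() -> bool {
    cfg!(feature = "failpoints")
}

pub fn read_config() -> &'static str {
    gated_failpoint!("read-config");
    "loaded"
}
```

In a build without the feature, read_config contains no fail point code at all, which is why instrumented call sites can stay in production code.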
Macros
Define a fail point (requires the failpoints feature).
Structs
Test scenario with configured fail points.
Functions
Configure the actions for a fail point at runtime.
Configure the actions for a fail point at runtime.
Returns whether code generation for failpoints is enabled.
Get all registered fail points.
Remove a fail point.