Crate refault


A deterministic simulation framework for distributed systems using async.

Refault takes your existing async Rust system and runs one or more instances of it in a deterministic simulation. Determinism lets you reliably reproduce test failures.

SimBuilder::new().run(|| async move {
    // Simulated time advances by exactly the 5 ns that were slept.
    let t1 = Instant::now();
    sleep(Duration::from_nanos(5)).await;
    assert_eq!(t1.elapsed().as_nanos(), 5);
}).unwrap();

§How it works

Refault aims to eliminate sources of non-determinism from your (distributed) system:

  • Random number generation is intercepted to return deterministic PRNG sequences.
  • Timekeeping functions are intercepted to return a simulated time that progresses deterministically.
  • Custom Async Executor:
    • Single threaded deterministic scheduling
    • Support for multiple nodes to simulate distributed systems
    • Fast forward simulated time when all tasks are sleeping
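The fast-forward behavior in the last bullet can be illustrated with a toy virtual clock. This is a minimal sketch, not refault's executor: `VirtualClock`, `sleep`, and `fast_forward` are hypothetical names, and tasks are represented by plain integer ids.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::time::Duration;

/// Toy virtual clock (hypothetical): time only moves when the simulation says so.
pub struct VirtualClock {
    now: Duration,
    /// Pending wake-ups as (deadline, task id). `Reverse` turns the
    /// max-heap into a min-heap, so the earliest deadline pops first.
    timers: BinaryHeap<Reverse<(Duration, u64)>>,
}

impl VirtualClock {
    pub fn new() -> Self {
        Self { now: Duration::ZERO, timers: BinaryHeap::new() }
    }

    /// Current simulated time since the start of the simulation.
    pub fn now(&self) -> Duration {
        self.now
    }

    /// Record that `task` sleeps for `delay` of simulated time.
    pub fn sleep(&mut self, task: u64, delay: Duration) {
        self.timers.push(Reverse((self.now + delay, task)));
    }

    /// Called when no task is runnable: jump straight to the earliest
    /// deadline and return the task to wake. Ties are broken by task id,
    /// so the wake-up order does not depend on insertion order.
    pub fn fast_forward(&mut self) -> Option<u64> {
        let Reverse((deadline, task)) = self.timers.pop()?;
        self.now = deadline;
        Some(task)
    }
}
```

Because the clock jumps instead of waiting, a one-hour sleep costs no more wall-clock time than a five-nanosecond one.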

§Utilities

Refault ships with a few useful building blocks to help you port your system to the refault simulation.

§Simulated Packet network

Net implements a simulated packet network. Network abnormalities can be simulated using a user-provided SendFunction.
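To illustrate the idea of routing every packet through a user-provided function, here is a self-contained sketch. It does not use refault's Net or SendFunction: `ToyNet`, `Verdict`, `SplitMix64`, and `lossy_run` are hypothetical names, and the real signatures may look quite different. Packet loss is driven by a seeded PRNG so that the same seed always drops the same packets.

```rust
use std::collections::VecDeque;

/// What the (hypothetical) send function decides for each packet.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Deliver,
    Drop,
}

/// Small seeded PRNG (SplitMix64) so fault injection is reproducible.
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Toy network: every packet passes through a user-provided closure.
pub struct ToyNet<F> {
    send_fn: F,
    delivered: VecDeque<(u32, u32, Vec<u8>)>,
}

impl<F: FnMut(u32, u32, &[u8]) -> Verdict> ToyNet<F> {
    pub fn new(send_fn: F) -> Self {
        Self { send_fn, delivered: VecDeque::new() }
    }

    /// Ask the send function what to do, and queue the packet if it is delivered.
    pub fn send(&mut self, from: u32, to: u32, payload: &[u8]) -> Verdict {
        let verdict = (self.send_fn)(from, to, payload);
        if verdict == Verdict::Deliver {
            self.delivered.push_back((from, to, payload.to_vec()));
        }
        verdict
    }

    /// Pop the oldest delivered packet as (from, to, payload).
    pub fn recv(&mut self) -> Option<(u32, u32, Vec<u8>)> {
        self.delivered.pop_front()
    }
}

/// Send 100 packets over a lossy link and record the verdict for each one.
pub fn lossy_run(seed: u64) -> Vec<Verdict> {
    let mut rng = SplitMix64(seed);
    // Drop roughly one packet in four, decided by the seeded PRNG.
    let mut net = ToyNet::new(move |_from, _to, _payload: &[u8]| {
        if rng.next_u64() % 4 == 0 { Verdict::Drop } else { Verdict::Deliver }
    });
    (0..100u8).map(|i| net.send(1, 2, &[i])).collect()
}
```

Two calls to `lossy_run` with the same seed return identical verdict sequences, which is the property a deterministic simulation relies on when injecting faults.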

§Send type wrappers

Many refault types are not Send or Sync, because synchronization is not needed in the single-threaded runtime. For compatibility with the Rust async ecosystem, refault provides wrappers in send_bind that make the wrapped value Send and Sync and ensure it is not accidentally moved out of the simulation or between nodes.
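The general technique behind such wrappers can be sketched as follows. This is not the send_bind implementation: `ThreadBound` is a hypothetical type, and it uses the owning thread as a stand-in for "the simulation or node the value belongs to". The wrapper claims Send and Sync, then enforces at runtime that the value is only touched where it was created.

```rust
use std::mem::ManuallyDrop;
use std::thread::{self, ThreadId};

/// Hypothetical wrapper: Send + Sync on the outside, but the inner value
/// may only be accessed on the thread that created it.
pub struct ThreadBound<T> {
    value: ManuallyDrop<T>,
    owner: ThreadId,
}

// SAFETY: every access to `value` (including its drop) is guarded by a
// runtime check that the current thread is the owning thread.
unsafe impl<T> Send for ThreadBound<T> {}
unsafe impl<T> Sync for ThreadBound<T> {}

impl<T> ThreadBound<T> {
    pub fn new(value: T) -> Self {
        Self { value: ManuallyDrop::new(value), owner: thread::current().id() }
    }

    /// Borrow the inner value; panics if called from a foreign thread.
    pub fn get(&self) -> &T {
        assert_eq!(
            thread::current().id(),
            self.owner,
            "ThreadBound value accessed outside its owning thread"
        );
        &self.value
    }
}

impl<T> Drop for ThreadBound<T> {
    fn drop(&mut self) {
        if thread::current().id() == self.owner {
            // SAFETY: we are on the owning thread and `value` is dropped only here.
            unsafe { ManuallyDrop::drop(&mut self.value) }
        }
        // On a foreign thread the value is deliberately leaked rather
        // than dropped where it does not belong.
    }
}
```

With this shape, a non-Send value such as an `Rc` can be stored in a future that the wider async ecosystem requires to be Send, while misuse shows up as a panic instead of silent corruption.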

§Id generation

You can generate unique Ids for use in your simulation code using id.

§Non-Determinism detection

If non-determinism is accidentally introduced, the cause can be very hard to debug. By default, refault runs your simulation multiple times and compares event traces to detect non-determinism. It will tell you at which point the executions diverged. To speed up tests, you may opt out of this via with_determinsim_check.
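The comparison of event traces can be pictured with a small sketch. It assumes a trace is simply a list of comparable events; `first_divergence` and `check_determinism` are hypothetical helpers, not part of refault's API.

```rust
/// Index of the first event at which two traces differ, or `None` if
/// the traces are identical.
pub fn first_divergence<E: PartialEq>(a: &[E], b: &[E]) -> Option<usize> {
    let common = a.len().min(b.len());
    (0..common)
        .find(|&i| a[i] != b[i])
        // One trace being a strict prefix of the other also counts as a divergence.
        .or(if a.len() != b.len() { Some(common) } else { None })
}

/// Run `sim` up to `runs` times and compare every trace against the first.
/// On a mismatch, returns `Err((run, event_index))`.
pub fn check_determinism<E: PartialEq>(
    runs: usize,
    mut sim: impl FnMut() -> Vec<E>,
) -> Result<(), (usize, usize)> {
    let reference = sim();
    for run in 1..runs {
        if let Some(at) = first_divergence(&reference, &sim()) {
            return Err((run, at));
        }
    }
    Ok(())
}
```

Reporting the index of the first differing event narrows the search to whatever happened just before that point, which is usually where the non-deterministic input entered.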

§The Simulator trait

Users can register Simulators. These are notified about various events, such as the creation, stopping, and starting of nodes. They can also be used to hold simulation-scoped global state, for example to simulate an external system that your system interacts with.

§Third Party integration

Refault integrates with various third-party crates to reduce the amount of code you need to change to get your system to run in the simulation:

  • agnostic-lite: An abstraction layer over async runtimes. Refault implements RuntimeLite.
  • tower: Refault comes with an RPC implementation based on the tower abstraction.
  • serde: Refault types implement Serialize and Deserialize where it makes sense.

§Caveats

§Configure getrandom

Many crates, most notably rand, use getrandom to fetch random numbers from the OS. If you are using rand (or getrandom), you should explicitly configure the getrandom backend so that refault can intercept it. You can do so by adding this to your .cargo/config.toml:

[build]
rustflags = ['--cfg', 'getrandom_backend="linux_getrandom"']

Refault may work out of the box on your platform if getrandom chooses this backend by default, but this may break on other platforms or when getrandom updates.

§Avoid Spawning your own threads

Refault uses thread-local storage to manage its state. Many refault functions will panic if invoked from a different thread. Moreover, multithreading has the potential to introduce all sorts of non-determinism. If you have a reasonably large number of tests or are fuzz-testing, you can get plenty of parallelism by running one test per thread. If you have only a small number of tests, running each one on a single thread is probably fast enough.

§Avoid Global variables

The state of global variables at the beginning of a simulation has the potential to introduce non-determinism. Refault cannot ensure that user-defined global variables always start out in the same state when the simulation starts. If you want to keep global state, consider using Simulators.

Refault runs each simulation in a fresh thread, so using thread-local variables might be fine, depending on how you initialize them. Facilities of the standard library such as std::thread_local! should work fine. However, platform-specific mechanisms may perform initialization before refault sets up the simulation context on the thread, so they might bypass the interception of OS facilities.
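The difference between the two kinds of state can be seen with the standard library alone. In this sketch, `run_once` is a hypothetical stand-in for one simulation run on its own thread; it does not involve refault itself.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// Process-wide global: its value survives from one run to the next.
static GLOBAL_COUNTER: AtomicU32 = AtomicU32::new(0);

thread_local! {
    // Initialized lazily on each thread's first access, so every fresh
    // thread starts from 0 again.
    static LOCAL_COUNTER: Cell<u32> = Cell::new(0);
}

/// Stand-in for one simulation run on a fresh thread. Returns the
/// (global, thread-local) counter values after one increment each.
pub fn run_once() -> (u32, u32) {
    thread::spawn(|| {
        let global = GLOBAL_COUNTER.fetch_add(1, Ordering::SeqCst) + 1;
        let local = LOCAL_COUNTER.with(|c| {
            c.set(c.get() + 1);
            c.get()
        });
        (global, local)
    })
    .join()
    .expect("simulation thread panicked")
}
```

Calling `run_once` twice yields the same thread-local value both times, while the global counter keeps growing. A simulation that reads the global therefore sees a different starting state on each run.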

§Cargo features

name           description
serde          Derive Serialize and Deserialize on refault types where applicable.
tower          Provide RPC functionality via Net, based on tower.
agnostic-lite  Provide a RuntimeLite implementation.
emit-tracing   Emit tracing data via tracing. This is fairly noisy but sometimes helpful for debugging.
send-bind      Provide wrappers for ensuring data is not moved inappropriately out of the simulation or between nodes and to make types Send and Sync.

Modules§

executor
Functions for handling tasks.
id
Unique Id generation.
net
A simulated packet network.
send_bind
Prevent Values from accidentally moving between nodes or out of the simulation.
sim_builder
Running simulations.
simulator
Simulation-scoped singletons.
time
Waiting for the passage of simulated time.

Structs§

NodeId
A unique identifier for a node within a simulation.

Functions§

is_in_simulation
Returns true if called from within a simulation.