Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
haybale: Symbolic execution of LLVM IR, written in Rust
haybale is a general-purpose symbolic execution engine written in Rust.
It operates on LLVM IR, which allows it to analyze programs written in C/C++,
Rust, Swift, or any other language which compiles to LLVM IR.
In this way, it may be compared to KLEE, as it has similar goals, except
that haybale is written in Rust and makes some different design decisions.
That said, haybale makes no claim of being at feature parity with KLEE.
Okay, but what is a symbolic execution engine?
A symbolic execution engine is a way of reasoning - rigorously and
mathematically - about the behavior of a function or program.
It can reason about all possible inputs to a function without literally
brute-forcing every single one.
For instance, a symbolic execution engine like haybale can answer questions
like:
- Are there any inputs to (some function) that cause it to return 0? What are they?
- Is it possible for this loop to execute exactly 17 times?
- Can this pointer ever be NULL?
Symbolic execution engines answer these questions by converting each variable in the program or function into a mathematical expression which depends on the function or program inputs. Then they use an SMT solver to answer questions about these expressions, such as the questions listed above.
Getting started
1. Install
haybale is on crates.io, so you can simply
add it as a dependency in your Cargo.toml:
[]
= "0.2.0"
haybale also depends (indirectly) on the LLVM 9 and Boolector libraries, which
must both be available on your system.
See the llvm-sys or boolector-sys READMEs for more details and instructions.
2. Acquire bitcode to analyze
Since haybale operates on LLVM bitcode, you'll need some bitcode to get started.
If the program or function you want to analyze is written in C, you can generate
LLVM bitcode (*.bc files) with clang's -c and -emit-llvm flags:
For debugging purposes, you may also want LLVM text-format (*.ll) files, which
you can generate with clang's -S and -emit-llvm flags:
If the program or function you want to analyze is written in Rust, you can likewise
use rustc's --emit=llvm-bc and --emit=llvm-ir flags.
3. Create a Project
A haybale Project contains all of the code currently being analyzed, which
may be one or more LLVM modules.
To get started, simply create a Project from a single bitcode file:
let project = from_bc_path?;
For more ways to create Projects, including analyzing entire libraries, see
the Project documentation.
4. Use built-in analyses
haybale currently includes two simple built-in analyses:
get_possible_return_values_of_func(), which describes all the possible
values a function could return for any input, and find_zero_of_func(),
which finds a set of inputs to a function such that it returns 0.
These analyses are provided both because they may be of some use themselves,
but also because they illustrate how to use haybale.
For an introductory example, let's suppose foo is the following C function:
int
We can use find_zero_of_func() to find inputs such that foo will return 0:
match find_zero_of_func
Writing custom analyses
haybale can do much more than just describe possible function return values
and find function zeroes.
In this section, we'll walk through how we could find a zero of the function
foo above without using the built-in find_zero_of_func().
This will illustrate how to write a custom analysis using haybale.
ExecutionManager
All analyses will use an ExecutionManager to control the progress of the
symbolic execution.
In the code snippet below, we call symex_function() to create an
ExecutionManager which will analyze the function foo - it will start at
the top of the function, and end when the function returns. In between, it
will also analyze any functions called by foo, as necessary and depending
on the Config settings.
let mut em = symex_function;
Here it was necessary to not only specify the default haybale
configuration, as we did when calling find_zero_of_func(), but also what
"backend" we want to use.
The default BtorBackend should be fine for most purposes.
Paths
The ExecutionManager acts like an Iterator over paths through the function foo.
Each path is one possible sequence of control-flow decisions (e.g., which direction
do we take at each if statement) leading to the function returning some value.
The function foo in this example has two paths, one following the "true" branch and
one following the "false" branch of the if.
Let's examine the first path through the function:
let retval = em.next.expect?;
We're given the function return value, retval, as a Boolector BV (bitvector)
wrapped in the ReturnValue enum.
Since we know that foo isn't a void-typed function (and won't throw an
exception or abort), we can simply unwrap the ReturnValue to get the BV:
let retval = match retval ;
States
Importantly, the ExecutionManager provides not only the final return value of
the path as a BV, but also the final program State at the end of that path,
either immutably with state() or mutably with mut_state(). (See the
ExecutionManager documentation for more.)
let state = em.mut_state; // the final program state along this path
To test whether retval can be equal to 0 in this State, we can use
state.bvs_can_be_equal():
let zero = state.zero; // The 32-bit constant 0
if state.bvs_can_be_equal?
Getting solutions for variables
If retval can be 0, let's find what values of the function parameters
would cause that.
First, we'll add a constraint to the State requiring that the return value
must be 0:
retval._eq.assert;
and then we'll ask for solutions for each of the parameters, given this constraint:
// Get a possible solution for the first parameter.
// In this case, from looking at the text-format LLVM IR, we know the variable
// we're looking for is variable #0 in the function "foo".
let a = state.get_a_solution_for_irname?
.expect
.as_u64
.expect;
// Likewise the second parameter, which is variable #1 in "foo"
let b = state.get_a_solution_for_irname?
.expect
.as_u64
.expect;
println!;
Alternately, we could also have gotten the parameter BVs from the ExecutionManager
like this:
let a_bv = em.param_bvs.clone;
let b_bv = em.param_bvs.clone;
let a = em.state.get_a_solution_for_bv?
.expect
.as_u64
.expect;
let b = em.state.get_a_solution_for_bv?
.expect
.as_u64
.expect;
println!;
Documentation
Full documentation for haybale can be found here,
or of course you can generate local documentation with cargo doc --open.
Compatibility
Currently, haybale only supports LLVM 9. A version of haybale supporting
LLVM 8 is available on the llvm-8 branch of this repo, and is approximately
at feature parity with haybale version 0.2.0. However, there is no promise
that future haybale features will be backported to the llvm-8 branch.
haybale works on stable Rust, and requires Rust 1.36+.
Under the hood
haybale is built using the Rust llvm-ir crate and the Boolector SMT
solver (via the Rust boolector crate).
Changelog
Version 0.2.0 (Jan 8, 2020)
- Support LLVM
extractvalueandinsertvalueinstructions - Support LLVM
invoke,resume, andlandingpadinstructions, and thus C++throw/catch. Also provide built-in hooks for some related C++ ABI functions such as__cxa_throw(). This support isn't perfect, particularly surrounding the matching of catch blocks to exceptions:haybalemay explore some additional paths which aren't actually valid. But all actually valid paths should be found and explored correctly. - Since functions can be called not only with the LLVM
callinstruction but also with the LLVMinvokeinstruction, function hooks now receive a&dyn IsCallobject which may represent either acallorinvokeinstruction. haybalenow uses LLVM 9 rather than LLVM 8. See the "Compatibility" section above.- Improvements for
Projects containing C++ and/or Rust code:- For the function-name arguments to
symex_function(),get_possible_return_values_of_func(),find_zero_of_func(), andProject::get_func_by_name(), you may now pass either the (mangled) function name as it appears in LLVM (as was supported previously), or the demangled function name. That is, you can pass in"foo::bar"rather than"_ZN3foo3barE". - Likewise, you may add function hooks based on the demangled name of
the hooked function. See
FunctionHooks::add_cpp_demangled()andFunctionHooks::add_rust_demangled(). - Also,
llvm-irversions 0.3.3 and later contain an important bugfix for parsing IR generated byrustc.haybale0.2.0 usesllvm-ir0.4.1.
- For the function-name arguments to
- The
ReturnValueenum now has additional optionsThrow, indicating an uncaught exception, andAbort, indicating a program abort (e.g. Rust panic, or call to Cexit()). - Relatedly,
haybalenow has built-in hooks for the Cexit()function and for Rust panics (and for a few more LLVM intrinsics). haybalealso now contains a built-ingeneric_stub_hookandabort_hookwhich you can supply as hooks for any functions which you want to ignore the implementation of, or which always abort, respectively. See docs on thefunction_hooksmodule.Config.initial_mem_watchpointsis now aHashMapinstead of aHashSetof pairs.
Version 0.1.3 (Jan 1, 2020)
- Memory watchpoints: specify a range of memory addresses, and get
a log message for any memory operation which reads or writes any data in
that range. See
State::add_mem_watchpoint(). - Convenience methods on
Statefor constructing constant-valuedBVs (rather than having to use the corresponding methods onBVand passstate.solver):bv_from_i32(),bv_from_u32(),bv_from_i64(),bv_from_u64(),bv_from_bool(),zero(),one(), andones(). - Some internal code refactoring to prepare for 0.2.0 features
Version 0.1.2 (Dec 18, 2019)
- New method
Project::get_inner_struct_type_from_named()which handles opaque struct types by searching the entireProjectfor a definition of the given struct - Support memory reads of size 1-7 bits (in particular, reads of LLVM
i1) - Performance optimization: during
Stateinitialization, global variables are now only allocated, and not initialized until first use (lazy initialization). This gives the SMT solver fewer memory writes to think about, and helps especially for largeProjects which may contain many global variables that won't actually be used in a given analysis. - Minor bugfixes and improved error messages
Version 0.1.1 (Nov 26, 2019)
Changes to README text only; no functional changes.
Version 0.1.0 (Nov 25, 2019)
Initial release!