pub struct Hooks<LD, GD> { /* private fields */ }
Manages and handles instruction hooks.
Role of Instruction Hooking in the Fuzzer
The ability to place arbitrary function hooks is a fundamental part of the fuzzer. It allows us to instrument binary programs and get information about them at runtime. With the current implementation in the fuzzer we can:
- get coverage information;
- trace executed instructions;
- call user-defined functions.
Hooking Implementation
Instructions That Do Not Alter the Execution Flow
In the current implementation of the hypervisor by Apple, there is no built-in mechanism for hooking. We can’t, for example, stop the execution at a specific address and execute arbitrary functions from there.
The only way we have to stop the execution of a guest VM at a specific address is through exceptions. In our implementation this is achieved using breakpoints.
A breakpoint is placed at an arbitrary address. When the program reaches it, an exception is raised to the hypervisor, and the fuzzer can then check whether a handler exists for this address.
Original instructions Hooked instructions
+----------------------+ +----------------------+
| 0x00: mov x0, 0x42 | | 0x00: mov x0, 0x42 |
| 0x04: mov x1, 0x43 |------->| 0x04: brk #0 --------------> executes handler
| 0x08: add x0, x0, x1 | | 0x08: add x0, x0, x1 | for address 0x04
| 0x0c: ret | | 0x0c: ret |
+----------------------+ +----------------------+
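The replacement shown in the diagram can be sketched as follows. This is a minimal illustration, not the fuzzer's implementation: `encode_brk` and `place_breakpoint` are hypothetical helper names, and a plain byte buffer stands in for guest memory. The only external fact used is the A64 encoding of `brk #imm16`, which is `0xD4200000` with the immediate in bits [20:5].

```rust
/// A64 encoding of `brk #imm16`: 0xD4200000 with the immediate in bits [20:5].
fn encode_brk(imm16: u16) -> u32 {
    0xD420_0000 | ((imm16 as u32) << 5)
}

/// Replaces the 4-byte instruction at `offset` in `code` with `brk #imm16`
/// and returns the original instruction word so it can be restored later.
/// `code` is a plain byte buffer standing in for guest memory (an assumption
/// of this sketch). Returns `None` if the instruction does not fit in `code`.
fn place_breakpoint(code: &mut [u8], offset: usize, imm16: u16) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let slot = code.get_mut(offset..end)?;
    // A64 instructions are stored as little-endian 32-bit words.
    let original = u32::from_le_bytes([slot[0], slot[1], slot[2], slot[3]]);
    slot.copy_from_slice(&encode_brk(imm16).to_le_bytes());
    Some(original)
}
```

The saved word is what makes the hook reversible: whoever places the breakpoint has to keep it somewhere, keyed by address, to be able to run the original instruction later.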
Running a hook when a given address is reached is easy. Resuming the execution from this state is the hard part. The first issue we encounter is that we need to execute the instruction that was replaced by the breakpoint.
If we could execute a given number of instructions and have the hypervisor return on its own, we could just replace the breakpoint with the original instruction, run this one instruction, return to the fuzzer and place the hook again. Unfortunately, this is not currently possible.
Another possible solution would have been to write small handlers somewhere in memory that contain the instructions we want to run. After the hook has returned, we could jump to one of them, execute the instruction, and then jump back to the location right after the original hook. However, many ARM instructions are PC-relative and we would either have to reassemble them or move some other code chunks (such as ARM literal pools) to match the expected memory layout.
As we’ll see in this section and the next, the solution implemented in this fuzzer is a rough combination of both of these approaches. First we’ll explain how we can execute one instruction at a time by dividing hook handling in two stages.
- In the first stage, a breakpoint brk #0 is placed on the instruction we want to hook. When the guest reaches the breakpoint, an exception is raised to the hypervisor. The hypervisor retrieves the address of the instruction and checks if a hook exists. If so, the hook is executed. Then the original instruction is restored while the instruction that follows is replaced by a second breakpoint. Execution is resumed from the instruction that was restored.
- In the second stage, when the second breakpoint is hit, another exception is raised to the hypervisor. We restore the second instruction and we reapply the breakpoint on the first instruction. This way, the hook can still trigger if the execution flow reaches it again.
+----------------------+
| Saves instruction at |
| address 0x04 and |
| replaces it with a |
| breakpoint |
+----------------------+
^ |
HYPERVISOR | |
---------------------|----------------|-------------------------------------------------------
GUEST VM | |
| v
+------------------+---+ +----------------------+
| 0x00: mov x0, 0x42 | | 0x00: mov x0, 0x42 |
| 0x04: mov x1, 0x43 | | 0x04: brk #0 -------------+
| 0x08: add x0, x0, x1 | | 0x08: add x0, x0, x1 | |
| 0x0c: ret | | 0x0c: ret | |
+----------------------+ +----------------------+ | Exception raised by
Original instructions Hooked instructions | the stage 1 breakpoint
(stage 1) |
|
+-----------------------------------------+
GUEST VM |
---------------------|------------------------------------------------------------------------
HYPERVISOR |
v +----------------------+
+----------------------+ | Restores the first | +----------------------+
| Finds the handler | | instruction, saves | | Resumes execution |
| for address 0x04 and |------->| the next one and |------->| from the instruction |
| runs it | | replaces it by a | | at address 0x04 |
+----------------------+ | breakpoint | +----------+-----------+
+----------+-----------+ |
| |
HYPERVISOR +----------------+-------------------------------+
----------------------------|----------------|------------------------------------------------
GUEST VM | |
| v
| +----------------------+
| | 0x00: mov x0, 0x42 |
+---->| 0x04: mov x1, 0x43 |
| 0x08: brk #1 -------------+
| 0x0c: ret | |
+----------------------+ |
Hooked instructions | Exception raised by
(stage 2) | the stage 2 breakpoint
|
+-----------------+
GUEST VM |
---------------------------------------------|------------------------------------------------
HYPERVISOR v
+----------------------+ +----------------------+
| Restores the second | | Resumes execution |
| instruction and |------->| from the instruction |
| restores the hook | | at address 0x08 |
+----------------------+ +----------+-----------+
| |
HYPERVISOR +----------------+-------------------------------+
----------------------------|----------------|------------------------------------------------
GUEST VM | |
| v
| +----------------------+
| | 0x00: mov x0, 0x42 |
| | 0x04: brk #0 |
+---->| 0x08: add x0, x0, x1 |
| 0x0c: ret |
+----------------------+
Hooked instructions
(hook is reset back to
stage 1)
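The two stages above can be modelled with a small state machine. The sketch below is only an illustration under simplifying assumptions: `Guest`, `Hook`, `stage1` and `stage2` are hypothetical names, guest memory is a vector of instruction words, and running the hook is replaced by recording the address. The values `brk #0` and `brk #1` follow the diagram.

```rust
use std::collections::HashMap;

const BRK_STAGE1: u32 = 0xD420_0000; // brk #0
const BRK_STAGE2: u32 = 0xD420_0020; // brk #1

/// Hypothetical per-hook bookkeeping: the instruction words that were
/// overwritten by breakpoints.
struct Hook {
    saved_first: u32,
    saved_next: Option<u32>,
}

/// Toy guest: one u32 per instruction, indexed by address / 4.
struct Guest {
    code: Vec<u32>,
    hooks: HashMap<u64, Hook>,
    /// Addresses whose hook ran, recorded in place of a real handler.
    calls: Vec<u64>,
}

impl Guest {
    fn new(code: Vec<u32>) -> Self {
        Guest { code, hooks: HashMap::new(), calls: Vec::new() }
    }

    /// Saves the instruction at `addr` and replaces it with `brk #0`.
    /// Panics if `addr` is outside the toy code buffer.
    fn add_hook(&mut self, addr: u64) {
        let idx = (addr / 4) as usize;
        let saved_first = std::mem::replace(&mut self.code[idx], BRK_STAGE1);
        self.hooks.insert(addr, Hook { saved_first, saved_next: None });
    }

    /// Stage 1: "runs" the hook, restores the hooked instruction and arms
    /// `brk #1` on the next one. Returns the address to resume from.
    fn stage1(&mut self, pc: u64) -> Option<u64> {
        let idx = (pc / 4) as usize;
        let hook = self.hooks.get_mut(&pc)?;
        self.calls.push(pc);
        self.code[idx] = hook.saved_first;
        let next = std::mem::replace(self.code.get_mut(idx + 1)?, BRK_STAGE2);
        hook.saved_next = Some(next);
        Some(pc)
    }

    /// Stage 2: restores the second instruction and re-arms `brk #0` on the
    /// hooked one. Returns the address to resume from.
    fn stage2(&mut self, pc: u64) -> Option<u64> {
        let hook_addr = pc.checked_sub(4)?;
        let hook = self.hooks.get_mut(&hook_addr)?;
        let next = hook.saved_next.take()?;
        self.code[(pc / 4) as usize] = next;
        self.code[(hook_addr / 4) as usize] = BRK_STAGE1;
        Some(pc)
    }
}
```

After a full stage 1 and stage 2 cycle the code is back in its hooked state, which is the property the last diagram panel shows.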
Instructions Changing the Execution Flow
Things get a little more complicated for instructions that alter the execution flow, such as bl, ret, etc. We can place the first stage breakpoint, but if we set the second stage breakpoint on the instruction right after, we’ll never reach it.
Fortunately for us, there aren’t that many instructions that modify PC and we can simply disassemble them and emulate their behavior.
For example, let’s imagine that we put a hook on a blr instruction.
+----------------------+ +----------------------+
| 0x00: mov x0, 0x1000 | | 0x00: mov x0, 0x1000 |
| 0x08: blr x0 | | 0x08: brk #0 |
| 0x0c: ret | | 0x0c: ret |
+----------------------+ +----------------------+
Original instructions Hooked instructions
When the execution reaches the breakpoint, it raises an exception to the hypervisor. We disassemble the instruction and, since it’s a branch, we don’t set a stage 2 breakpoint. Instead we retrieve information from the instruction to emulate its behaviour. In our example this means that we read the address to jump to from x0 and set lr to the address of the instruction after the branch, which is 0xc. The operation performed depends on the hooked instruction. You can refer to the source code of the private Emulator implementation for more information.
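To make the blr example concrete, here is a minimal sketch of emulating that single instruction. It is not the private Emulator implementation: `Regs`, `decode_blr` and `emulate_blr` are hypothetical names, and only `blr xn` is handled. It relies on the A64 encoding of `blr xn`, which is `0xD63F0000 | (n << 5)`.

```rust
/// Minimal register file for the sketch: x0..x30 plus pc.
/// x30 is the link register (lr).
struct Regs {
    x: [u64; 31],
    pc: u64,
}

/// Returns `Some(n)` if `insn` encodes `blr xn` (0xD63F0000 | n << 5).
/// Register number 31 is left out of this sketch.
fn decode_blr(insn: u32) -> Option<usize> {
    if insn & 0xFFFF_FC1F != 0xD63F_0000 {
        return None;
    }
    let rn = ((insn >> 5) & 0x1F) as usize;
    if rn < 31 { Some(rn) } else { None }
}

/// Emulates `blr xn` located at `regs.pc`: lr receives the address of the
/// next instruction and pc receives the value of xn.
/// Returns false (and changes nothing) if `insn` is not a supported `blr`.
fn emulate_blr(regs: &mut Regs, insn: u32) -> bool {
    match decode_blr(insn) {
        Some(rn) => {
            // Read the target first: xn may be x30 itself.
            let target = regs.x[rn];
            regs.x[30] = regs.pc.wrapping_add(4);
            regs.pc = target;
            true
        }
        None => false,
    }
}
```

With x0 = 0x1000 and the hooked blr x0 at 0x08, this leaves pc at 0x1000 and lr at 0xc, matching the example above. Other flow-changing instructions would each need their own decoding and register updates.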
With both categories of instructions handled, we now have a full hooking system that can be used to implement features such as coverage, tracing, etc. The next section describes the types of hooks that can be used with this fuzzer.
Hook Types
There are currently five types of hooks that can be called when hooking an instruction. One instruction can have only one hook of each type. This system could be refactored in the future to be more generic and allow more hooks to be executed, but it should be enough for most use cases.
- Tracer: hook applied on all instructions by Tracer::add_hooks.
- Coverage: hook applied on branch and comparison instructions by GlobalCoverage::add_hooks.
- Backtrace: hook applied on function entries and exits by Backtrace::add_hooks.
- Custom: user-defined hook applied by the user through Executor::add_custom_hook.
- Exit: hook that stops the program when it’s reached; it can be applied using Executor::add_exit_hook.
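The "one hook of each type per instruction" rule can be pictured as a map keyed by address and hook type. The sketch below is an assumption about how such a constraint could be modelled, not a description of the internal layout of Hooks: `HookKind`, `Handler` and `Registry` are hypothetical, and the handler signature is deliberately simplified.

```rust
use std::collections::HashMap;

/// The hook types listed above.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum HookKind {
    Tracer,
    Coverage,
    Backtrace,
    Custom,
    Exit,
}

const ALL_KINDS: [HookKind; 5] = [
    HookKind::Tracer,
    HookKind::Coverage,
    HookKind::Backtrace,
    HookKind::Custom,
    HookKind::Exit,
];

/// Hypothetical handler type: takes the hooked address, returns a value.
type Handler = fn(u64) -> u64;

/// At most one handler per (address, kind) pair.
struct Registry {
    slots: HashMap<(u64, HookKind), Handler>,
}

impl Registry {
    fn new() -> Self {
        Registry { slots: HashMap::new() }
    }

    /// Registers `handler`. Returns true if it replaced an existing hook of
    /// the same kind at `addr`.
    fn add(&mut self, addr: u64, kind: HookKind, handler: Handler) -> bool {
        self.slots.insert((addr, kind), handler).is_some()
    }

    /// Removes the hook of the given kind. Returns whether one was deleted.
    fn remove(&mut self, addr: u64, kind: HookKind) -> bool {
        self.slots.remove(&(addr, kind)).is_some()
    }

    /// Lists the kinds of hooks present at `addr`, in declaration order.
    fn kinds_at(&self, addr: u64) -> Vec<HookKind> {
        ALL_KINDS
            .iter()
            .copied()
            .filter(|k| self.slots.contains_key(&(addr, *k)))
            .collect()
    }

    /// Runs the hook of the given kind at `addr`, if any.
    fn run(&self, addr: u64, kind: HookKind) -> Option<u64> {
        self.slots.get(&(addr, kind)).map(|h| h(addr))
    }
}
```

Keying on the pair means that adding a second custom hook at an address overwrites the first, while a coverage hook at the same address is unaffected.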
Implementations
impl<LD: Clone, GD: Clone> Hooks<LD, GD>
pub fn handle(
&mut self,
vcpu: &mut Vcpu,
vma: &mut VirtMemAllocator,
vma_snapshot: &VirtMemAllocator,
ldata: &mut LD,
gdata: &Arc<RwLock<GD>>,
cdata: &mut Coverage,
bdata: &mut Backtrace
) -> Result<ExitKind>
Handles the exceptions raised by the breakpoints that the fuzzer uses to hook instructions. Unrecognized breakpoint values return an error.
Return Value
Returns, wrapped in a Result, an ExitKind indicating whether the thread needs to exit.
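As an illustration of how a breakpoint value might be recognized, the sketch below classifies a 32-bit exception syndrome value. The use of the syndrome register is an assumption: the documentation above does not say how handle obtains the breakpoint value. `Stage`, `BrkError` and `classify_breakpoint` are hypothetical names. The architectural facts used are that the exception class sits in bits [31:26], that `brk` in AArch64 state reports class 0x3C, and that the `brk` immediate is found in bits [15:0]. The mapping of brk #0 to stage 1 and brk #1 to stage 2 follows the diagrams above.

```rust
/// Which hook stage a breakpoint belongs to.
#[derive(Debug, PartialEq, Eq)]
enum Stage {
    One,
    Two,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum BrkError {
    /// The exception class was not a `brk` from AArch64 state.
    NotABreakpoint(u32),
    /// The `brk` immediate is not one used for hooking.
    UnknownImmediate(u16),
}

/// Classifies an exception syndrome value.
/// Bits [31:26] hold the exception class and bits [15:0] the `brk` immediate.
fn classify_breakpoint(esr: u32) -> Result<Stage, BrkError> {
    const EC_BRK_A64: u32 = 0x3C;
    let ec = esr >> 26;
    if ec != EC_BRK_A64 {
        return Err(BrkError::NotABreakpoint(ec));
    }
    match (esr & 0xFFFF) as u16 {
        0 => Ok(Stage::One),
        1 => Ok(Stage::Two),
        other => Err(BrkError::UnknownImmediate(other)),
    }
}
```

Returning an error for unknown immediates mirrors the behaviour described above for unrecognized breakpoint values.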
pub fn add_exit_hook(&mut self, addr: u64)
Tries to add an exit hook at the given address.
pub fn remove_exit_hook(&mut self, addr: u64)
Tries to remove the exit hook at the given address.
pub fn add_custom_hook(&mut self, addr: u64, handler: HookFn<LD, GD>)
Tries to add a hook into the custom handler hashmap using its address.
pub fn remove_custom_hook(&mut self, addr: u64) -> bool
Removes a hook from the custom handler hashmap using its address. Returns whether or not the corresponding hook object was deleted.
pub fn add_coverage_hook(&mut self, addr: u64, handler: HookFn<LD, GD>)
Tries to add a hook into the coverage handler hashmap using its address.
pub fn remove_coverage_hook(&mut self, addr: u64) -> bool
Removes a hook from the coverage handler hashmap using its address. Returns whether or not the corresponding hook object was deleted.
pub fn add_backtrace_hook(&mut self, addr: u64, handler: HookFn<LD, GD>)
Tries to add a hook into the backtrace handler hashmap using its address.
pub fn remove_backtrace_hook(&mut self, addr: u64) -> bool
Removes a hook from the backtrace handler hashmap using its address. Returns whether or not the corresponding hook object was deleted.
pub fn add_tracer_hook(&mut self, addr: u64, handler: HookFn<LD, GD>)
Tries to add a hook into the tracer handler hashmap using its address.
pub fn remove_tracer_hook(&mut self, addr: u64) -> bool
Removes a hook from the tracer handler hashmap using its address. Returns whether or not the corresponding hook object was deleted.
pub fn apply(&mut self, vma: &mut VirtMemAllocator) -> Result<()>
Iterates over the hooks in the hooks hashmap, saves the instructions at the corresponding addresses and replaces them with HandlerStage1 breakpoints.
pub fn fill_instructions(&mut self, vma: &mut VirtMemAllocator) -> Result<()>
Iterates over the hooks in the hooks hashmap, saves the instructions at the corresponding addresses and replaces them with HandlerStage1 breakpoints.
pub fn apply_inner(
&mut self,
vma: &mut VirtMemAllocator,
apply: bool
) -> Result<()>
Iterates over the hooks in the hooks hashmap, saves the instructions at the corresponding addresses and replaces them with HandlerStage1 breakpoints.
pub fn revert_coverage_hooks(
&mut self,
addr: u64,
vma: &mut VirtMemAllocator,
vma_snapshot: &mut VirtMemAllocator
) -> Result<()>
Removes coverage hooks in the current virtual address space and in its snapshot at address addr.
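The idea of reverting a hook in two address spaces at once can be sketched as below. This is a simplified model, not the actual method: `Mem` is a hypothetical stand-in for VirtMemAllocator backed by a byte vector, and `saved` is a hypothetical map from hooked addresses to their original instruction words.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an address space: a flat byte buffer where
/// the address is used directly as an offset.
struct Mem(Vec<u8>);

impl Mem {
    /// Writes a little-endian instruction word, or returns `None` if the
    /// address is out of range.
    fn write_word(&mut self, addr: u64, word: u32) -> Option<()> {
        let start = addr as usize;
        let end = start.checked_add(4)?;
        self.0.get_mut(start..end)?.copy_from_slice(&word.to_le_bytes());
        Some(())
    }

    /// Reads a little-endian instruction word, or returns `None` if the
    /// address is out of range.
    fn read_word(&self, addr: u64) -> Option<u32> {
        let start = addr as usize;
        let end = start.checked_add(4)?;
        let s = self.0.get(start..end)?;
        Some(u32::from_le_bytes([s[0], s[1], s[2], s[3]]))
    }
}

/// Restores the original instruction at `addr` in both the live memory and
/// its snapshot, then forgets the hook. Returns `None` if no hook is known
/// at `addr` or if a write is out of range.
fn revert_coverage_hook(
    saved: &mut HashMap<u64, u32>,
    addr: u64,
    mem: &mut Mem,
    snapshot: &mut Mem,
) -> Option<()> {
    let original = *saved.get(&addr)?;
    mem.write_word(addr, original)?;
    snapshot.write_word(addr, original)?;
    saved.remove(&addr);
    Some(())
}
```

Patching the snapshot as well means that a later restore from the snapshot would not bring the breakpoint back, which is presumably why the method takes both address spaces.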