Trait yaxpeax_core::analyses::DFG

pub trait DFG<V: Value, A: Arch + ValueLocations, When = <A as Arch>::Address> where
    When: Copy
{
    type Indirect: IndirectQuery<V>;

    fn read_loc(&self, when: When, loc: A::Location) -> V;
    fn write_loc(&mut self, when: When, loc: A::Location, value: V);
    fn indirect_loc(&self, _when: When, _loc: A::Location) -> Self::Indirect;

    fn read<T: ToDFGLoc<A::Location>>(&self, when: When, loc: &T) -> V { ... }
    fn write<T: ToDFGLoc<A::Location>>(&mut self, when: When, loc: &T, value: V) { ... }
    fn indirect<T: ToDFGLoc<A::Location>>(
        &self,
        when: When,
        loc: &T
    ) -> Self::Indirect { ... }
    fn query_at(
        &self,
        when: When
    ) -> DFGLocationQueryCursor<'_, When, V, A, Self> { ... }
    fn query_at_mut(
        &mut self,
        when: When
    ) -> DFGLocationQueryCursorMut<'_, When, V, A, Self> { ... }
}

interface to query a data flow graph (dfg). this interface is …. in flux.

TODOs in order of “how hard i think they are”:

  • it should be possible to look up a def site for a value
  • it should be possible to iterate the use sites of a value
  • perhaps it should be possible to insert new values to the dfg? optionally? this approaches supporting general patching
  • it should be possible to detach and move values

conceptually, these graphs have vertices at places where values are read or written, edges from uses to some write, and a value associated with the write describing what subsequent reads will see. these graphs describe the relation between values in a machine with architecture-defined locations for values to exist. in many cases these graphs are operated on in a manner consistent with the most atomic changes for a given architecture - typically an instruction’s execution. in an ideal world, this means DFG would have vertices at a triple (A::Address, A::Instruction, A::Location); “at a given address in memory, with a corresponding instruction, the value at a specific architectural location is ___”.
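a minimal sketch of that shape, not the yaxpeax_core implementation. everything here (ToyDfg, the two-register Location, the Value enum, def_site) is a hypothetical stand-in, and it assumes straight-line code where a read at some address sees the nearest write at a strictly lower address:

```rust
use std::collections::BTreeMap;

// hypothetical stand-ins; these are not the yaxpeax_core types.
type Address = u64;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Location {
    Rdi,
    Rsi,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Value {
    Unknown,
    Const(u64),
}

/// toy graph for straight-line code: each write is a vertex keyed by
/// (location, address), carrying the value later reads will see.
#[derive(Default)]
struct ToyDfg {
    writes: BTreeMap<(Location, Address), Value>,
}

impl ToyDfg {
    fn write_loc(&mut self, when: Address, loc: Location, value: Value) {
        self.writes.insert((loc, when), value);
    }

    /// follow the use -> write edge: the latest write to `loc` strictly before `when`.
    fn def_site(&self, when: Address, loc: Location) -> Option<Address> {
        self.writes
            .range((loc, 0)..(loc, when))
            .next_back()
            .map(|(&(_, addr), _)| addr)
    }

    /// the value a read of `loc` at `when` observes, or Unknown if nothing wrote it.
    fn read_loc(&self, when: Address, loc: Location) -> Value {
        self.def_site(when, loc)
            .map(|addr| self.writes[&(loc, addr)])
            .unwrap_or(Value::Unknown)
    }
}

fn main() {
    let mut dfg = ToyDfg::default();
    dfg.write_loc(0x1000, Location::Rdi, Value::Const(1));
    dfg.write_loc(0x1008, Location::Rdi, Value::Const(2));

    println!("{:?}", dfg.read_loc(0x1004, Location::Rdi));
    println!("{:?}", dfg.read_loc(0x1010, Location::Rdi));
    println!("{:?}", dfg.read_loc(0x1010, Location::Rsi));
}
```

a real graph also has to handle control flow, where a use may be reached by more than one write; the toy ignores that entirely.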

why is using an (Address, Location) pair, like (0x1234, rdi), not sufficient to uniquely identify a location? because, dear reader, data at an address is not constant. if you decode data at address 0x1234, is that before or after relocations are applied? if that address is known to be modified after loading, is the instruction there before or after the modification? different answers to this temporal question mean the architectural locations referenced by the corresponding instruction can be totally different!
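to make the temporal question concrete, here is a small sketch with a hypothetical toy decoder (location_read is made up for illustration and recognizes only the one-byte x86-64 encodings of push rsi and push rdi). the same offset names a different architectural location once the byte there has been patched:

```rust
// hypothetical toy decoder; not part of yaxpeax.
#[derive(Debug, PartialEq, Eq)]
enum Location {
    Rsi,
    Rdi,
}

/// the register read by the instruction at `offset`, if the toy decoder knows it.
fn location_read(bytes: &[u8], offset: usize) -> Option<Location> {
    match *bytes.get(offset)? {
        0x56 => Some(Location::Rsi), // push rsi
        0x57 => Some(Location::Rdi), // push rdi
        _ => None,
    }
}

fn main() {
    // as loaded: push rdi; ret
    let mut code: Vec<u8> = vec![0x57, 0xc3];
    let before = location_read(&code, 0);

    // the program rewrites its own first byte after loading
    code[0] = 0x56;
    let after = location_read(&code, 0);

    // same address, different location, depending on when you ask
    println!("before: {:?}, after: {:?}", before, after);
}
```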

so, really, a DFG describes the architectural state of a program at every discrete point of change for any point in the program. an eventual TODO is to key on (Address, Generation) where a “Generation” describes some series of memory edits. this is approximately supported in SSA-based DFG construction, where Memory is a single architectural location that can be versioned - perhaps “the program” may be inferred to be a distinct memory region from unknown-destination memory accesses by default? in a way, a DFG might be self-describing if at some location (0x1234, Gen1) the instruction modifies code memory by writing (0x1236, Gen2), where finding bytes to decode the next instruction would have to be a DFG query? this suggests that in the most precise case, a DFG might be backed by a MemoryRepr with a series of edits for each generation layered on top? it’s not clear how this might interact with disjoint memory regions that are versioned independently.
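one way the “edits layered on top” idea might look, purely as a sketch. LayeredMemory and byte_at are hypothetical names and not the MemoryRepr trait from yaxpeax_core; the sketch assumes a single memory region, with generation 0 being the image as loaded and each later generation adding one layer of byte edits:

```rust
use std::collections::BTreeMap;

// hypothetical names throughout; none of this is yaxpeax_core api.
type Address = u64;
type Generation = usize;

/// base image plus one layer of byte edits per generation after generation 0.
struct LayeredMemory {
    base: BTreeMap<Address, u8>,
    /// edits[i] holds the bytes whose writing produces generation i + 1
    edits: Vec<BTreeMap<Address, u8>>,
}

impl LayeredMemory {
    /// the byte at `addr` as seen in `generation`: the newest visible edit wins,
    /// falling back to the base image when no layer touched the address.
    fn byte_at(&self, addr: Address, generation: Generation) -> Option<u8> {
        let visible = generation.min(self.edits.len());
        self.edits[..visible]
            .iter()
            .rev()
            .find_map(|layer| layer.get(&addr).copied())
            .or_else(|| self.base.get(&addr).copied())
    }
}

fn main() {
    let mem = LayeredMemory {
        base: BTreeMap::from([(0x1234, 0x57), (0x1236, 0x57)]),
        // an instruction at 0x1234 overwrites the byte at 0x1236
        edits: vec![BTreeMap::from([(0x1236, 0x56)])],
    };

    // decoding at 0x1236 has to say which generation it means
    println!("{:?}", mem.byte_at(0x1236, 0));
    println!("{:?}", mem.byte_at(0x1236, 1));
}
```

this does nothing for the open question above: independently versioned regions would each need their own generation counter, and the sketch has only one.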

Associated Types

Required methods

Provided methods

Implementors