Function ID database — operand-masked body hashing for function identification across stripped binaries. Ghidra FID semantic clone with pure-Rust ingest/match pipeline.
§Algorithm
- Disassemble function linearly.
- For each instruction: keep opcode + prefix bytes, zero operand slots (registers, immediates, displacements) per arch mask table.
- Hash masked byte stream with xxh3-64 → full_hash.
- Combine with hashes of direct call targets → specific_hash.
- Persist rows (full_hash, specific_hash, name, lib_id) in compact binary format; ship as gzipped blob for runtime match pass.
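The masking and hashing steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the crate's implementation: the `Insn` type, the per-byte mask layout, the way callee hashes are folded in, and both function signatures are hypothetical, and FNV-1a 64 stands in for xxh3-64 so the sketch compiles without external crates.

```rust
/// Hypothetical decoded instruction: raw bytes plus a same-length keep-mask.
/// A mask byte of 0xff keeps the byte (opcode/prefix); 0x00 zeroes an operand slot.
struct Insn<'a> {
    bytes: &'a [u8],
    mask: &'a [u8],
}

/// FNV-1a 64. Stand-in for xxh3-64 only; the real pipeline is described as using xxh3-64.
fn fnv1a64(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Hash the operand-masked byte stream of a linearly disassembled function.
fn full_hash(insns: &[Insn<'_>]) -> u64 {
    let mut masked = Vec::new();
    for insn in insns {
        // Apply the mask byte-by-byte so operand bytes contribute zeros.
        masked.extend(insn.bytes.iter().zip(insn.mask.iter()).map(|(b, m)| b & m));
    }
    fnv1a64(&masked)
}

/// Fold the hashes of direct call targets into the body hash.
/// The concatenate-then-hash scheme here is an assumption for illustration.
fn specific_hash(full: u64, callee_hashes: &[u64]) -> u64 {
    let mut buf = Vec::with_capacity(8 * (1 + callee_hashes.len()));
    buf.extend_from_slice(&full.to_le_bytes());
    for h in callee_hashes {
        buf.extend_from_slice(&h.to_le_bytes());
    }
    fnv1a64(&buf)
}
```

Two copies of a function that differ only in masked operand bytes (for example, a relocated displacement) yield the same `full_hash`, while functions with identical bodies but different callees still get distinct `specific_hash` values.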
Re-exports§
- pub use db::FidDb;
- pub use db::FidEntry;
- pub use hash::FidHashQuad;
Modules§
- db
- FID database: compact binary format for (hash_quad, name, lib_id) rows.
- hash
- Hash quad: full body hash + specific (children-aware) hash + size.
- ingest
- Build FID database rows from ELF/Mach-O/PE binaries using symbol tables.
- mask
- Per-architecture operand masking.
Functions§
- bundled_dbs
- Load all bundled FID databases matching the given architecture. Returns a list of (library_name, db) pairs. Empty if no bundled DBs ship for the architecture (e.g. ARM32, MIPS32, RISC-V today).
- identify
- Convenience: fingerprint a function body and return matching name(s) from the database. Prefers specific_hash (callee-aware) matches; falls back to full_hash if specific yields nothing.
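
The match preference described for identify can be sketched as below. This is a minimal illustration, assuming a flat slice of rows and a linear scan; the `Row` type and `lookup_names` function are hypothetical and do not reflect the actual `FidDb` API or its storage layout.

```rust
/// Hypothetical database row mirroring (full_hash, specific_hash, name, lib_id).
struct Row {
    full_hash: u64,
    specific_hash: u64,
    name: String,
    lib_id: u32,
}

/// Return names matching `specific` first; fall back to `full` only if
/// no row matches the callee-aware hash.
fn lookup_names<'a>(rows: &'a [Row], full: u64, specific: u64) -> Vec<&'a str> {
    let by_specific: Vec<&str> = rows
        .iter()
        .filter(|r| r.specific_hash == specific)
        .map(|r| r.name.as_str())
        .collect();
    if !by_specific.is_empty() {
        return by_specific;
    }
    // Fallback: body-only match, which may return several candidates.
    rows.iter()
        .filter(|r| r.full_hash == full)
        .map(|r| r.name.as_str())
        .collect()
}
```

The fallback can return more than one name, since functions with identical masked bodies but different callees share a full_hash; that ambiguity is why the callee-aware match is tried first.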