Crate pe_sigscan

§pe-sigscan

Fast in-process byte-pattern (“signature”) scanning over the executable sections of a loaded PE (Portable Executable) module on Windows.

This crate is a building block for game mods, hooking tools, debuggers, and any other in-process tool that needs to locate code that is neither exported nor reachable through a vtable, using its byte signature. It mirrors the workflow common across the reverse-engineering ecosystem: derive a pattern from a disassembler (IDA, Ghidra, Binary Ninja, Cutter), then scan the live process’s mapped image for it at runtime.

§Quick start

use pe_sigscan::{find_in_text, Pattern};

// Obtain `module_base: usize` via your preferred means (GetModuleHandleW,
// PEB walk, etc.). This example assumes it is already in scope.

// Build a pattern from an IDA-style hex string. `?` and `??` are
// wildcards; whitespace between bytes is ignored.
let pat = Pattern::from_ida("48 8B 05 ?? ?? ?? ?? 48 89 41 08").unwrap();

if let Some(addr) = find_in_text(module_base, pat.as_slice()) {
    println!("matched at {addr:#x}");
}

Or with the pattern! macro (no allocation, fully const-eligible):

use pe_sigscan::pattern;

const SIG: &[Option<u8>] = pattern![0x48, 0x8B, _, _, 0x48, 0x89];
assert_eq!(SIG.len(), 6);
assert_eq!(SIG[0], Some(0x48));
assert_eq!(SIG[2], None);
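
To make the string form concrete, here is a minimal sketch of how an IDA-style string could be turned into wildcard bytes. It is not the crate’s parser: parse_ida_sketch is a hypothetical name, the error type is a plain String, and the sketch assumes tokens are separated by whitespace and that each byte is exactly two hex digits.

```rust
/// Hypothetical stand-in for an IDA-style pattern parser (not the crate's code).
/// Assumes whitespace-separated tokens: `?` / `??` are wildcards, anything
/// else must be exactly two hex digits.
fn parse_ida_sketch(s: &str) -> Result<Vec<Option<u8>>, String> {
    s.split_whitespace()
        .map(|tok| match tok {
            // Wildcard tokens match any byte.
            "?" | "??" => Ok(None),
            // A literal byte: two ASCII hex digits.
            _ if tok.len() == 2 && tok.bytes().all(|b| b.is_ascii_hexdigit()) => {
                u8::from_str_radix(tok, 16)
                    .map(Some)
                    .map_err(|e| format!("invalid token {:?}: {}", tok, e))
            }
            // Everything else is rejected.
            _ => Err(format!("invalid token {:?}", tok)),
        })
        .collect()
}
```

The result has the same shape as the slices produced by the pattern! macro: Some(b) for a literal byte and None for a wildcard.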

§Two scanning modes

  • find_in_text / count_in_text / iter_in_text — walk only the section literally named .text. The simplest case, suitable for MSVC-built DLLs that put everything in one code section.
  • find_in_exec_sections / count_in_exec_sections / iter_in_exec_sections — walk every section whose IMAGE_SCN_MEM_EXECUTE characteristic is set. Required when the function you’re scanning for might live in a companion section like .text$mn, .textbss, a jump-table arena, or any of the optimized-layout code sections that some compilers and linkers emit.

Both modes have find_in_slice / count_in_slice / iter_in_slice companions that work on a &[u8] instead of a loaded PE — useful for offline analysis, unit testing, and scanning extracted bytes.
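
As a rough illustration of what scanning a plain &[u8] involves, the sketch below implements a naive wildcard search and a non-overlapping count (the semantics the function index gives for count_in_slice). The names matches_at, find_first and count_non_overlapping are hypothetical, the treatment of an empty pattern is an assumption, and the crate’s actual implementation may differ.

```rust
/// True when `window` has the pattern's length and every `Some(b)` entry
/// equals the corresponding byte; `None` entries match any byte.
fn matches_at(window: &[u8], pattern: &[Option<u8>]) -> bool {
    window.len() == pattern.len()
        && window
            .iter()
            .zip(pattern)
            .all(|(b, p)| p.map_or(true, |want| want == *b))
}

/// Offset of the first match, or `None`.
/// Assumption: an empty pattern never matches.
fn find_first(haystack: &[u8], pattern: &[Option<u8>]) -> Option<usize> {
    if pattern.is_empty() || pattern.len() > haystack.len() {
        return None;
    }
    haystack
        .windows(pattern.len())
        .position(|w| matches_at(w, pattern))
}

/// Number of non-overlapping matches: after a match at offset `i`,
/// the search resumes at `i + pattern.len()` instead of `i + 1`.
fn count_non_overlapping(haystack: &[u8], pattern: &[Option<u8>]) -> usize {
    if pattern.is_empty() {
        return 0;
    }
    let mut count = 0;
    let mut i = 0;
    while i + pattern.len() <= haystack.len() {
        if matches_at(&haystack[i..i + pattern.len()], pattern) {
            count += 1;
            i += pattern.len();
        } else {
            i += 1;
        }
    }
    count
}
```

A count of exactly one is the usual precondition for installing a hook at a matched address, which is why counting sits alongside finding.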

§Resolving rel32 displacements

Real signature workflows almost always end with “match the instruction, then follow its rel32 displacement to the actual target address”. The resolve_rel32 / resolve_rel32_at helpers package that arithmetic so callers don’t reinvent the off-by-one-prone next_ip + disp32 calculation:

use pe_sigscan::{find_in_text, pattern, resolve_rel32_at};

// mov rax, [rip+disp32]: 48 8B 05 ?? ?? ?? ?? (7 bytes total).
const SIG: &[Option<u8>] = pattern![0x48, 0x8B, 0x05, _, _, _, _];
if let Some(addr) = find_in_text(module_base, SIG) {
    let target = unsafe { resolve_rel32_at(addr, 3, 7) };
    println!("global at {target:#x}");
}
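
The arithmetic itself is small. The following sketch reproduces it on a byte slice rather than live memory, so it can be checked offline. resolve_rel32_sketch and its parameters are hypothetical and only mirror the roles of match_addr, rel32_offset and instr_len described here; it is not the crate’s implementation.

```rust
/// Hypothetical offline version of the rel32 calculation.
/// `code` holds the bytes, `base` is the address that `code[0]` would have
/// in the loaded image, and `match_off` is the match offset within `code`.
fn resolve_rel32_sketch(
    code: &[u8],
    base: usize,
    match_off: usize,
    rel32_offset: usize,
    instr_len: usize,
) -> Option<usize> {
    // Locate the 4 displacement bytes; bail out if they fall outside `code`.
    let start = match_off.checked_add(rel32_offset)?;
    let end = start.checked_add(4)?;
    let mut raw = [0u8; 4];
    raw.copy_from_slice(code.get(start..end)?);

    // The displacement is a little-endian signed 32-bit value.
    let disp = i32::from_le_bytes(raw);

    // It is relative to the address of the NEXT instruction, not the
    // address of the displacement field or of the instruction itself.
    let next_ip = base.wrapping_add(match_off).wrapping_add(instr_len);
    Some(next_ip.wrapping_add_signed(disp as isize))
}
```

For the 7-byte mov rax, [rip+disp32] above, rel32_offset is 3 and instr_len is 7; for a 5-byte call rel32 (opcode E8) they would be 1 and 5.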

§Why direct memory reads?

The .text section of a loaded DLL is page-aligned, RX-protected, and stays committed for the lifetime of the module, so there is no TOCTOU concern: the bytes don’t change between reads. A typical scan walks tens of megabytes, and routing every probe through ReadProcessMemory would cost tens of millions of syscalls (minutes of wall time). This crate instead reads directly via raw pointer dereference, bounded to the PE-declared section ranges.

§Safety

Public functions take a module_base: usize that you must obtain from the OS (e.g. GetModuleHandleW). The implementation parses the PE headers at that base before any other access, so a pointer that does not refer to a PE image is rejected cleanly. Inside the validated section ranges, the unsafe pointer reads are bounded by the VirtualSize field from the section header. Short of the loader handing us a malformed PE (which the loader itself would have rejected), there is no path to an out-of-bounds read.

The slice variants are safe by Rust’s slice invariants and need no further trust from the caller.
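
For readers unfamiliar with the header check mentioned above, this is a minimal sketch of the first steps of PE validation, written against a byte slice so it is safe to run anywhere. nt_headers_offset is a hypothetical helper and not part of the crate; the crate’s own parsing presumably goes further (optional header, section table), which this sketch does not attempt.

```rust
/// Hypothetical helper: returns the offset of the NT headers if `image`
/// begins with a plausible DOS header that points at a "PE\0\0" signature.
fn nt_headers_offset(image: &[u8]) -> Option<usize> {
    // The DOS header starts with the two-byte magic "MZ".
    if image.get(0..2)? != &b"MZ"[..] {
        return None;
    }

    // e_lfanew: little-endian u32 at offset 0x3C, the file offset
    // (and, for a mapped image, the RVA) of the NT headers.
    let mut raw = [0u8; 4];
    raw.copy_from_slice(image.get(0x3C..0x40)?);
    let off = u32::from_le_bytes(raw) as usize;

    // The NT headers begin with the four-byte signature "PE\0\0".
    let sig = image.get(off..off.checked_add(4)?)?;
    if sig == &b"PE\0\0"[..] {
        Some(off)
    } else {
        None
    }
}
```

Every read goes through slice::get, so truncated or garbage input yields None rather than a panic. The in-process functions cannot rely on slice bounds and need the equivalent checks on raw pointers.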

§Platform

Windows / PE only.

The crate compiles on every platform — the parsing is pure compute — but the in-process function signatures assume a module_base that came from the Windows loader. On non-Windows targets, the slice variants still work for analysing PE bytes you have mapped manually.

§License

MIT OR Apache-2.0.

Macros§

pattern
Build a &'static [Option<u8>; N] at compile time from a list of byte literals and _ wildcards.

Structs§

Matches
Iterator over non-overlapping match addresses within one or more raw byte ranges of a loaded PE module. Returned by iter_in_text and iter_in_exec_sections.
ParsePatternError
Error returned by crate::Pattern::from_ida when the input string contains an invalid token.
Pattern
A wildcard byte pattern with parsed-from-string ergonomics.
SliceMatches
Iterator over non-overlapping match addresses within a &[u8] haystack. Returned by iter_in_slice.

Enums§

ParseErrorKind
Categories of ParsePatternError.

Functions§

count_in_exec_sections
Count occurrences of pattern across ALL executable sections of the PE module loaded at module_base. Companion to find_in_exec_sections; same hook-install uniqueness contract as count_in_text.
count_in_slice
Count occurrences of pattern within the slice haystack. Non-overlapping: after a match at offset i, the search resumes at i + pattern.len() rather than i + 1.
count_in_text
Count occurrences of pattern within the named .text section of the PE module loaded at module_base.
find_in_exec_sections
Find the first occurrence of pattern within ANY executable section of the PE module loaded at module_base.
find_in_slice
Find the first occurrence of pattern within the slice haystack.
find_in_text
Find the first occurrence of pattern within the named .text section of the PE module loaded at module_base.
iter_in_exec_sections
Iterate over every non-overlapping occurrence of pattern across ALL executable sections of the PE module loaded at module_base.
iter_in_slice
Iterate over every non-overlapping occurrence of pattern within the slice haystack.
iter_in_text
Iterate over every non-overlapping occurrence of pattern within the section literally named .text of the PE module loaded at module_base.
module_size
Read IMAGE_OPTIONAL_HEADER.SizeOfImage — the total mapped size of the module in bytes. The virtual address range [module_base, module_base + module_size) covers every section the loader mapped.
read_rel32
Read a little-endian signed 32-bit displacement from a byte slice.
resolve_rel32
Read a signed 32-bit displacement at rel32_addr and add it to next_ip to produce an absolute target address.
resolve_rel32_at
Convenience wrapper over resolve_rel32 for the typical workflow: you have a match_addr from find_in_text, and you know the byte offset of the displacement inside the matched instruction (rel32_offset) and the total length of the instruction (instr_len).

Type Aliases§

WildcardPattern
Wildcard-aware byte pattern as a slice of Option<u8>. Some(b) matches the literal byte b; None matches any byte (the IDA-style ? token).