Function parse 

pub fn parse(pat: &str) -> Result<Pattern, ParsePatError>

Parses a pelite-style signature string into atoms.

Parsing injects an implicit Save(0) at the beginning so slot 0 always represents the match base cursor for parsed patterns.

This is the main runtime entry point for pattern text.

§Syntax tutorial

The following examples illustrate the syntax supported by goblin-sigscan.

55 89 e5 83 ? ec

Hexadecimal pairs (case-insensitive) match exact bytes, and question marks match any byte.

A single ? matches a full byte. Partial nibble masks are not currently supported.

Whitespace has no semantic meaning and is only for readability.
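
To make the byte and wildcard rules concrete, here is a minimal standalone sketch. It is not goblin-sigscan's parser: `Elem`, `tokenize`, and `matches_at` are hypothetical names, and the sketch handles only hex pairs, `?`, and whitespace.

```rust
/// One element in this sketch: `Some(b)` is an exact byte, `None` is a `?` wildcard.
type Elem = Option<u8>;

/// Hypothetical tokenizer for hex pairs and `?` only.
/// Whitespace between tokens is ignored; anything else yields `None`.
fn tokenize(pat: &str) -> Option<Vec<Elem>> {
    let mut out = Vec::new();
    let mut chars = pat.chars();
    while let Some(c) = chars.next() {
        if c.is_whitespace() {
            continue;
        }
        if c == '?' {
            out.push(None);
            continue;
        }
        // `to_digit(16)` accepts both upper and lower case hex digits.
        let hi = c.to_digit(16)?;
        let lo = chars.next()?.to_digit(16)?;
        out.push(Some((hi * 16 + lo) as u8));
    }
    Some(out)
}

/// Returns true if `elems` matches the start of `bytes`.
fn matches_at(elems: &[Elem], bytes: &[u8]) -> bool {
    elems.len() <= bytes.len()
        && elems.iter().zip(bytes).all(|(e, b)| match e {
            Some(x) => x == b,
            None => true,
        })
}
```

With this sketch, `tokenize("55 89 e5 83 ? ec")` yields six elements, the fifth of which accepts any byte.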

b9 ' 37 13 00 00

A single quote (') stores the current cursor into the next save slot.

Save slot ordering is deterministic:

  • save[0] is always the overall match start (Save(0) injected by parser)
  • save[1..] are captures in order of appearance (', i*, u*, z, …)

b8 [16] 50 [13-42] ff

Bracket operands skip bytes:

  • [N] skips exactly N bytes
  • [A-B] tries the range non-greedily (smallest skip first)

Internally [A-B] compiles to SkipRange(A, B - 1).
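
The non-greedy behavior can be sketched as a loop that tries the smallest skip first. This is an illustration only: `skip_non_greedy` is a hypothetical helper, it takes inclusive bounds, and how the bracket operands map onto those bounds is left to the parser.

```rust
/// Tries each skip length from `min_skip` to `max_skip` (both inclusive),
/// smallest first, and returns the first skip after which `tail` matches.
fn skip_non_greedy(
    bytes: &[u8],
    min_skip: usize,
    max_skip: usize,
    tail: &[u8],
) -> Option<usize> {
    (min_skip..=max_skip).find(|&skip| {
        // `get` returns `None` when the skip runs past the end of the buffer.
        bytes
            .get(skip..)
            .map_or(false, |rest| rest.starts_with(tail))
    })
}
```

Because the search stops at the first success, a buffer in which `tail` occurs after both 1 and 3 skipped bytes reports a skip of 1.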

31 c0 74 % ' c3
e8 $ ' 31 c0 c3

% follows a signed rel8 target and $ follows a signed rel32 target.

This composes with captures and read ops to recover referenced addresses and values without manual offset arithmetic.
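
For reference, this is the offset arithmetic that % and $ save you from writing by hand. The sketch assumes little-endian displacements measured from the byte after the displacement field (the usual x86 convention); `follow_rel8` and `follow_rel32` are hypothetical helpers, not crate APIs.

```rust
/// Resolves a signed rel8 displacement stored at `cursor`.
/// Assumption: the target is relative to the byte after the displacement.
/// The returned offset is not checked against the buffer length.
fn follow_rel8(bytes: &[u8], cursor: usize) -> Option<usize> {
    let disp = *bytes.get(cursor)? as i8 as i64;
    let target = cursor as i64 + 1 + disp;
    if target >= 0 { Some(target as usize) } else { None }
}

/// Resolves a signed little-endian rel32 displacement stored at `cursor`,
/// under the same assumption as `follow_rel8`.
fn follow_rel32(bytes: &[u8], cursor: usize) -> Option<usize> {
    let end = cursor.checked_add(4)?;
    let d = bytes.get(cursor..end)?;
    let disp = i32::from_le_bytes([d[0], d[1], d[2], d[3]]) as i64;
    let target = cursor as i64 + 4 + disp;
    if target >= 0 { Some(target as usize) } else { None }
}
```

Both helpers return `None` when the displacement cannot be read or when the target would fall before the start of the buffer.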

e8 $ { ' } 83 f0 5c c3

Curly braces must follow %, $, or *. The sub-pattern inside {...} runs at the jump destination. After it succeeds, scanning returns to the original stream position, skips the bytes of the jump operand, and continues.
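
The control flow of a braced sub-pattern can be sketched for the specific shape `e8 $ { sub } tail`. This is a simplified model with literal byte slices standing in for sub-patterns; `match_call_with_sub` is a hypothetical helper, and the rel32 convention (little-endian, relative to the end of the displacement) is an assumption.

```rust
/// Matches `e8 $ { sub } tail` at the start of `bytes` and returns the
/// jump destination on success.
fn match_call_with_sub(bytes: &[u8], sub: &[u8], tail: &[u8]) -> Option<usize> {
    if *bytes.first()? != 0xe8 {
        return None;
    }
    // Read the rel32 displacement that follows the opcode.
    let d = bytes.get(1..5)?;
    let disp = i32::from_le_bytes([d[0], d[1], d[2], d[3]]) as i64;
    // Assumption: the destination is relative to the end of the displacement.
    let target = 5 + disp;
    if target < 0 {
        return None;
    }
    let target = target as usize;
    // The sub-pattern runs at the jump destination.
    if !bytes.get(target..)?.starts_with(sub) {
        return None;
    }
    // Back in the original stream, continue after the 4 displacement bytes.
    if !bytes.get(5..)?.starts_with(tail) {
        return None;
    }
    Some(target)
}
```

Note that a failure at the destination fails the whole match, even when the bytes after the call would have matched.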

e8 $ @4

@n checks alignment at that point in the scan. Alignment is 1 << n bytes, so @4 means 16-byte alignment.
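
The alignment rule reduces to a mask test. A minimal sketch, where `is_aligned` is a hypothetical helper and `address` stands for whatever address the scan cursor currently represents:

```rust
/// Returns true if `address` is a multiple of `1 << n`.
/// `n` must be less than 64 in this sketch.
fn is_aligned(address: u64, n: u32) -> bool {
    let alignment = 1u64 << n;
    (address & (alignment - 1)) == 0
}
```

For example, `is_aligned(0x1008, 4)` is false because 0x1008 is not a multiple of 16, while `is_aligned(0x1008, 3)` is true.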

e8 i1 a0 u4 z

The i and u operands read memory into save slots and advance the cursor by the operand size:

  • signed reads: i1, i2, i4
  • unsigned reads: u1, u2, u4
  • z writes a literal zero to a fresh slot
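
The read operands can be modeled as follows. This sketch assumes little-endian reads and uses `u64`/`i64` for the stored values; both are assumptions, and `read_u` and `read_i` are hypothetical helpers rather than crate functions. Each returns the value together with the advanced cursor.

```rust
/// Reads `size` bytes (1, 2, or 4) at `cursor` as an unsigned little-endian
/// value and returns it with the cursor advanced past the operand.
fn read_u(bytes: &[u8], cursor: usize, size: usize) -> Option<(u64, usize)> {
    if !matches!(size, 1 | 2 | 4) {
        return None;
    }
    let end = cursor.checked_add(size)?;
    let field = bytes.get(cursor..end)?;
    let mut value = 0u64;
    for (i, b) in field.iter().enumerate() {
        value |= (*b as u64) << (8 * i);
    }
    Some((value, end))
}

/// Same as `read_u`, but sign-extends the value from `size` bytes.
fn read_i(bytes: &[u8], cursor: usize, size: usize) -> Option<(i64, usize)> {
    let (raw, end) = read_u(bytes, cursor, size)?;
    // Shift the value to the top of the word, then shift back arithmetically.
    let shift = 64 - 8 * size as u32;
    Some((((raw << shift) as i64) >> shift, end))
}
```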

83 c0 2a ( 6a ? | 68 ? ? ? ? ) e8

Parentheses define alternatives separated by |. Arms are attempted from left to right and the pattern fails only if every arm fails.
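
Left-to-right arm selection can be sketched as below. The model is deliberately small: each arm is a slice of `Option<u8>` (with `None` as a wildcard), `first_matching_arm` is a hypothetical helper, and the sketch does not model retrying later arms when the rest of the pattern fails.

```rust
/// Tries each arm in order against the start of `bytes` and returns the
/// index of the first arm that matches plus the number of bytes it consumed.
fn first_matching_arm(bytes: &[u8], arms: &[&[Option<u8>]]) -> Option<(usize, usize)> {
    arms.iter().enumerate().find_map(|(index, arm)| {
        let ok = arm.len() <= bytes.len()
            && arm.iter().zip(bytes).all(|(e, b)| match e {
                Some(x) => x == b,
                None => true,
            });
        if ok { Some((index, arm.len())) } else { None }
    })
}
```

With the arms `6a ?` and `68 ? ? ? ?`, an input starting with 0x6a selects arm 0 and consumes 2 bytes, while an input starting with 0x68 selects arm 1 and consumes 5.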

b8 "MZ" 00

Double-quoted strings emit literal byte sequences.

§Pelite compatibility notes

goblin-sigscan intentionally tracks a practical subset of pelite syntax.

  • Supported: hex bytes, ?, ', %, $, *, {...}, [N], [A-B], @n, i1/i2/i4, u1/u2/u4, z, alternation, and strings.
  • Programmatic-only atoms (not parser syntax): Pir(slot).

§Save-slot semantics

parse always prepends Save(0), so parsed patterns always require at least one save slot. Use save_len to allocate scanner buffers.

If you are calling scanner APIs from goblin-sigscan, this means save[0] is always the match start for parsed patterns.

§Examples

use goblin_sigscan_pattern::{Atom, parse, save_len};

let atoms = parse("48 8B ? ? ? ? 48 89")?;
assert_eq!(atoms.first(), Some(&Atom::Save(0)));
assert_eq!(save_len(&atoms), 1);

Following a rel32 jump and capturing the cursor at its destination:

use goblin_sigscan_pattern::{Atom, parse, save_len};

let atoms = parse("e8 ${'}")?;
assert!(matches!(atoms[0], Atom::Save(0)));
assert!(atoms.iter().any(|atom| matches!(atom, Atom::Jump4)));
assert!(save_len(&atoms) >= 2);

Group alternatives:

use goblin_sigscan_pattern::{Atom, parse};

let atoms = parse("(85 c0 | 48 85 c0)")?;
assert!(atoms.iter().any(|atom| matches!(atom, Atom::Case(_))));
assert!(atoms.iter().any(|atom| matches!(atom, Atom::Break(_))));

§Errors

Returns ParsePatError with:

  • a kind (PatError)
  • a byte position in the source string
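
A byte position is most useful when shown against the pattern text. The sketch below is one way a caller might render it; `render_error` is a hypothetical helper that takes the position and a message as plain arguments, since the accessors on ParsePatError are not shown here. It assumes an ASCII pattern, so byte offsets and display columns coincide.

```rust
/// Renders the pattern with a caret under the byte at `position`.
/// Positions past the end are clamped to the end of the pattern.
fn render_error(pat: &str, position: usize, message: &str) -> String {
    let column = position.min(pat.len());
    format!("{}\n{}^ {}", pat, " ".repeat(column), message)
}
```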

Common error kinds include:
