nimrod 0.3.0

Parse and inspect Nim-compiled native binaries

nimrod

A pure-Rust parser and forensic-artifact extractor for Nim-compiled native binaries. Built for malware analysis and reverse engineering.

Nim compiles to C/C++/ObjC and then to a standard ELF, PE, or Mach-O binary — there is no Nim-specific container. This crate recovers the runtime artifacts that the Nim compiler leaves behind: entry shims, module init functions, RTTI tables, string literals, stack-trace metadata, build-host attribution paths, and exception raise sites.

What it extracts

Artifact            Description
Detection           11 independent fingerprint probes; reliable even on stripped -d:danger builds
GC mode / version   refc vs arc/orc, plus a best-effort Nim1xRefc / Nim2xArc / Nim2xOrc hint
Entry shims         NimMain, PreMain, NimMainModule, etc. with addresses
Init functions      *Init000 / *DatInit000 with decoded build-host module paths
Type graph          Cross-linked types: members, offsets, sizes, alignment, inheritance, enum values, resolved destructors
Code entrypoints    One VA-tagged stream of shims, inits, procs, raise-enclosing fns, and RTTI procs for disassembler labelling
Module map          Every Nim module compiled into the binary, with per-function name, address, and size (ELF)
Symbol demangling   Reverses Nim's <ident>__<module>_u<id> mangling back to identifiers
RTTI                TNimTypeV2 fields (size, align, depth, destructor) and TNimType with field-name recovery
String literals     V2 (NIM_STRLIT_FLAG) and V1 (NimStringDesc) scans
Stack traces        Proc names and .nim file paths; absolute paths leak the build host
Nimble paths        .nimble/pkgs leaks parsed into package name, version, hash, and username
Exception types     *Error / *Defect cstrings found in rodata
Raise sites         Full (type, proc, file, line) tuples recovered via x86_64/AArch64 instruction analysis
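
To make the Symbol demangling row concrete, here is a minimal sketch of splitting a name of the form <ident>__<module>_u<id> into its parts. It is an illustration only, not the crate's demangler: the Demangled struct and demangle function are hypothetical names, and the sketch assumes the plain three-part pattern with no escaped characters inside the identifier.

```rust
/// Parts of a mangled Nim symbol (hypothetical type, for illustration).
#[derive(Debug, PartialEq)]
struct Demangled<'a> {
    ident: &'a str,
    module: &'a str,
    id: u32,
}

/// Splits `<ident>__<module>_u<id>`; returns None for anything else.
fn demangle(sym: &str) -> Option<Demangled<'_>> {
    // Peel the trailing `_u<id>` suffix off first.
    let (rest, id) = sym.rsplit_once("_u")?;
    if id.is_empty() || !id.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let id = id.parse().ok()?;
    // The first `__` separates the identifier from the module name.
    let (ident, module) = rest.split_once("__")?;
    if ident.is_empty() || module.is_empty() {
        return None;
    }
    Some(Demangled { ident, module, id })
}
```

Applied to delete__closureiters_u3150, this yields the identifier delete, the module closureiters, and the id 3150. A name without the suffix, such as nsuParseHexInt, is rejected.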

Quick start

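use nimrod::NimBinary;

// Assumes nimrod's error type implements std::error::Error so `?` can box it.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Read the whole file; parsing works on an in-memory byte slice.
    let data = std::fs::read("sample.exe")?;
    let bin = NimBinary::from_bytes(&data)?;

    // Bail out early when the binary does not look like Nim output.
    if !bin.is_nim() {
        println!("Not a Nim binary");
        return Ok(());
    }

    println!("Format: {:?}, GC: {:?}", bin.format(), bin.gc_mode());
    Ok(())
}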
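
For intuition about what a fingerprint probe behind is_nim() might look like, the sketch below counts how many marker strings occur in the raw bytes. It is not nimrod's probe set: the marker list (taken from artifact names in this README), the threshold of two hits, and the function names are all assumptions, and the crate itself lists memchr as its scanning dependency rather than the standard-library search used here.

```rust
/// Marker byte strings borrowed from artifact names in this README.
/// Both the list and the threshold below are illustrative assumptions.
const MARKERS: [&[u8]; 4] = [
    b"NimMainModule",
    b"PreMain",
    b"DatInit000",
    b"IndexDefect",
];

/// Naive substring search over raw bytes (std only).
fn contains(haystack: &[u8], needle: &[u8]) -> bool {
    !needle.is_empty() && haystack.windows(needle.len()).any(|w| w == needle)
}

/// Number of distinct markers present anywhere in the file.
fn marker_hits(data: &[u8]) -> usize {
    MARKERS.iter().filter(|m| contains(data, **m)).count()
}

/// Treat two or more independent markers as a positive (assumed threshold).
fn looks_like_nim(data: &[u8]) -> bool {
    marker_hits(data) >= 2
}
```

Requiring more than one marker is a design choice that keeps a single coincidental string from producing a false positive. A stripped build would need probes that do not rely on symbol names at all.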

Type graph

bin.types() recovers every Nim type from RTTI (V1 TNimType and V2 TNimTypeV2) into one cross-linked graph: size, alignment, inheritance depth, member fields with offsets and resolved field types, enum values, and destructor procs resolved to function symbols.

for t in bin.types() {
    let name = t.name.as_deref().or(t.type_fragment.as_deref()).unwrap_or("?");
    println!("{} {} (size={}, align={})", t.version, name, t.size, t.align);

    if let Some(parent) = &t.parent {
        println!("  inherits: {}", parent.name.as_deref().unwrap_or("?"));
    }
    for f in &t.fields {
        let fty = f.type_ref.as_ref().and_then(|r| r.name.as_deref()).unwrap_or("?");
        println!("  +{:<4} {}: {}", f.offset, f.name, fty);
    }
    for e in &t.enum_values {
        println!("  = {} ({})", e.name, e.ordinal);
    }
    if let Some(d) = &t.destructor {
        println!("  =destroy: {}", d.function.as_deref().unwrap_or("?"));
    }
}

V2 (ARC/ORC) object layouts also expose the inheritance chain via the display class-token array. On Mach-O, legacy V1 globals are stored in __DATA,__common with no file backing, so they degrade gracefully to name-only entries (t.is_readable() == false) carrying the type-name fragment — never a panic, never a dropped type.

Code entrypoints

bin.code_entrypoints() collapses every confidently-labelled code address — entry shims, module inits, demangled procs, raise-enclosing functions, and RTTI destructor / trace procs — into one deduplicated, VA-sorted stream so a disassembler front-end can label a whole binary from a single call:

for ep in bin.code_entrypoints() {
    println!("{:#x}  {}  {}", ep.va, ep.kind, ep.name);
}
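
The deduplicated, VA-sorted stream can be pictured with the following sketch, which merges several labelled address lists and keeps one entry per address. The Kind ordering, the Entrypoint struct, and merge_entrypoints are hypothetical; the README does not say how nimrod resolves two labels for the same address, so the priority rule here is an assumption.

```rust
use std::collections::BTreeMap;

/// Label sources, declared from lowest to highest priority (assumed order).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Kind {
    RttiProc,
    RaiseEnclosing,
    Proc,
    Init,
    Shim,
}

#[derive(Debug, Clone, PartialEq)]
struct Entrypoint {
    va: u64,
    kind: Kind,
    name: String,
}

/// Merges all sources into one list: unique by VA, sorted by VA.
fn merge_entrypoints(sources: Vec<Vec<Entrypoint>>) -> Vec<Entrypoint> {
    // BTreeMap keeps keys ordered, so the output is already VA-sorted.
    let mut by_va: BTreeMap<u64, Entrypoint> = BTreeMap::new();
    for e in sources.into_iter().flatten() {
        // Keep the existing label unless the new one has higher priority.
        let replace = by_va.get(&e.va).map_or(true, |old| e.kind > old.kind);
        if replace {
            by_va.insert(e.va, e);
        }
    }
    by_va.into_values().collect()
}
```

Keying the map on the virtual address gives deduplication and ordering in a single pass, which is convenient when a front-end wants to apply labels in address order.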

Module map

The module map cross-references init functions, demangled symbols, and stack-trace file paths into a per-module view. Each module lists every function with its demangled name, virtual address, and size (ELF):

let mmap = bin.module_map();
for (name, info) in &mmap.modules {
    println!("{name}: {} functions", info.symbol_count());
    if let Some(ref path) = info.init_path {
        println!("  source: {path}");
    }
    for sym in &info.symbols {
        println!("  {:#x} {} ({} bytes)", sym.address, sym.name, sym.size);
    }
}

Example output:

cgen: 650 functions
  source: cgen.nim
  0x7b6100 cProcParams (439 bytes)
  0x7b62c0 genProcPrototype (312 bytes)
  ...
system: 224 functions
  source: system.nim
  0x405e70 rawAlloc (1284 bytes)
  0x406360 collectCyclesBacon (820 bytes)
  ...

This gives downstream tools (Binary Ninja, Ghidra, IDA) the function boundaries they need for disassembly and analysis.
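
As an example of what a downstream tool can do with (address, size) pairs, the sketch below maps an arbitrary code address to the function that contains it. FuncSym and enclosing_function are hypothetical names that mirror the address, name, and size fields printed above; the slice is assumed to be sorted by address with non-overlapping ranges.

```rust
/// One function symbol with its half-open range [address, address + size).
#[derive(Debug, Clone, PartialEq)]
struct FuncSym {
    address: u64,
    size: u64,
    name: String,
}

/// Returns the function containing `addr`, if any.
/// `syms` must be sorted by `address`.
fn enclosing_function(syms: &[FuncSym], addr: u64) -> Option<&FuncSym> {
    // Index of the first symbol that starts strictly after `addr`.
    let idx = syms.partition_point(|s| s.address <= addr);
    // The candidate is the symbol just before that index, if there is one.
    let cand = syms.get(idx.checked_sub(1)?)?;
    // Reject addresses that fall into padding after the function body.
    (addr - cand.address < cand.size).then_some(cand)
}
```

With the cgen symbols shown above, 0x7b6110 resolves to cProcParams, while 0x7b62b8 lies in the gap between the end of cProcParams (0x7b6100 + 439) and the start of genProcPrototype and resolves to nothing.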

Raise-site recovery

Phase 2 raise-site recovery analyses x86_64 and AArch64 instructions around calls to raiseExceptionEx to extract the full exception tuple:

for rs in &bin.raise_sites() {
    println!(
        "{} in {} at {}:{} [fn: {}]",
        rs.exception_type.as_deref().unwrap_or("?"),
        rs.proc_name.as_deref().unwrap_or("?"),
        rs.file.as_deref().unwrap_or("?"),
        rs.line.map(|l| l.to_string()).unwrap_or_else(|| "?".into()),
        rs.enclosing_function.as_deref().unwrap_or("?"),
    );
}

Example output:

ValueError in parseHexInt at strutils.nim:1242  [fn: nsuParseHexInt]
IndexDefect in delete at system.nim:2196        [fn: delete__closureiters_u3150]
MyError in inner at exceptions.nim:7            [fn: outer__exceptions_u129]
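
The first step of this kind of analysis, locating the call sites themselves, can be sketched for x86_64 as a scan for direct call rel32 instructions (opcode 0xE8) that land on a known target address. This is a simplified illustration, not nimrod's implementation: find_direct_calls is a hypothetical helper, the target VA is assumed to come from a symbol lookup for raiseExceptionEx, and a raw byte scan can match 0xE8 bytes that are really operands of other instructions. Recovering the (type, proc, file, line) arguments would require further analysis of the instructions before each hit.

```rust
/// Finds x86_64 `call rel32` instructions in `code` that target `target_va`.
/// `base_va` is the virtual address of `code[0]`.
/// Returns the virtual address of each matching call instruction.
fn find_direct_calls(code: &[u8], base_va: u64, target_va: u64) -> Vec<u64> {
    let mut hits = Vec::new();
    // A direct near call is 5 bytes: 0xE8 followed by a little-endian i32.
    for (i, w) in code.windows(5).enumerate() {
        if w[0] != 0xE8 {
            continue;
        }
        let rel = i32::from_le_bytes([w[1], w[2], w[3], w[4]]);
        // The displacement is relative to the address of the next instruction.
        let next_insn = base_va.wrapping_add(i as u64 + 5);
        if next_insn.wrapping_add(rel as i64 as u64) == target_va {
            hits.push(base_va.wrapping_add(i as u64));
        }
    }
    hits
}
```

Sign-extending the displacement before the wrapping add lets the same arithmetic handle both forward and backward calls.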

Build-host attribution

Debug and standard-release Nim builds leak build-host paths via stack-trace metadata and nimble package paths:

// Absolute .nim file paths (build-host leak)
let harvest = bin.stack_trace();
for f in &harvest.file_paths {
    if f.is_absolute {
        println!("leaked: {}", f.path);
    }
}

// Nimble package paths (username + package intel)
for p in &bin.nimble_paths() {
    println!(
        "pkg: {}@{}",
        p.pkg_name.as_deref().unwrap_or("?"),
        p.pkg_version.as_deref().unwrap_or("?")
    );
    if let Some(ref user) = p.user_hint {
        println!("  user: {user}");
    }
}

Example output:

leaked: /opt/homebrew/Cellar/nim/2.2.8/nim/lib/system.nim
pkg: nimSHA2@0.1.1
  user: alex
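
To show how a leaked nimble path breaks down into those fields, here is a rough sketch of a parser. It assumes package directories named <name>-<version> under .nimble/pkgs, or <name>-<version>-<40-hex-digit hash> under .nimble/pkgs2, and it guesses the username from the path component directly before .nimble. NimblePath and parse_nimble_path are hypothetical names, and nimrod's own parsing may differ.

```rust
/// Fields recovered from a leaked nimble package path (hypothetical type).
#[derive(Debug, PartialEq)]
struct NimblePath {
    user: Option<String>,
    name: String,
    version: String,
    hash: Option<String>,
}

fn parse_nimble_path(path: &str) -> Option<NimblePath> {
    // Normalise Windows separators so one splitting rule covers both styles.
    let norm = path.replace('\\', "/");
    let parts: Vec<&str> = norm.split('/').collect();
    let idx = parts.iter().position(|p| *p == ".nimble")?;
    let pkgs_dir = *parts.get(idx + 1)?;
    if pkgs_dir != "pkgs" && pkgs_dir != "pkgs2" {
        return None;
    }
    // Package directory: <name>-<version>[-<hash>] (assumed layout).
    let mut fields: Vec<&str> = parts.get(idx + 2)?.split('-').collect();
    let last = *fields.last()?;
    let is_hash = fields.len() >= 3
        && last.len() == 40
        && last.bytes().all(|b| b.is_ascii_hexdigit());
    let hash = if is_hash { fields.pop().map(str::to_owned) } else { None };
    let version = fields.pop()?.to_owned();
    if fields.is_empty() {
        return None;
    }
    let name = fields.join("-");
    // Heuristic: the directory holding `.nimble` is the user's home directory.
    let user = idx
        .checked_sub(1)
        .and_then(|i| parts.get(i))
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string());
    Some(NimblePath { user, name, version, hash })
}
```

Parsing from the right-hand side means a hyphen inside a package name does not confuse the version field.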

Dump example

The included dump example prints every recoverable artifact:

cargo run --example dump -- sample.exe

Supported formats

  • ELF (Linux, BSD) — full support including function sizes from st_size
  • PE (Windows) — COFF symbol table + exports; MinGW and MSVC linked
  • Mach-O (macOS) — single-arch and universal/fat binaries

Dependencies

Deliberately minimal:

  • goblin — ELF/PE/Mach-O parsing
  • memchr — fast byte-level rodata scanning

License

Apache-2.0