ud-arch-codec 0.2.0

Arch-codec trait + open registry: the shared interface every univdreams arch backend implements so the compile/decompile pipeline can dispatch generically.
Homepage: KarpelesLab/univdreams
Owners: MagicalTux

ud-arch-codec

The shared spine every univdreams arch backend speaks. One trait, one open registry; the compile and decompile pipelines dispatch through it so every arch sees the same plumbing without duplicating it.

What's in the crate

  • ArchCodec — the trait. Methods cover the encoder surface the rest of the pipeline needs (assemble, jump, call, conditional jump, switch dispatch, move, arith, return, size queries) with Unsupported defaults so arches only implement what they model.
  • ArchError — soft (Unsupported) vs hard (Assemble, OutOfRange, UnknownArch) errors. The pipeline treats Unsupported as "fall back to pinned bytes."
  • EncodeHints — per-call interpretation hints (today just wide, x86's short-vs-rel32 toggle).
  • SwitchSpec — shared dispatch descriptor (today only "msvc-jmp-table" is recognised).
  • registry::register / for_arch — open registry of (arch_name, e_machine) -> Box<dyn ArchCodec> factories.
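To make the "Unsupported defaults" idea concrete, here is a minimal, self-contained sketch. The type and method names echo the list above, but the signatures, the unit-variant error shape, the made-up jump encoding, and the `encode_or_pinned` helper are all assumptions for illustration, not the crate's actual API.

```rust
// Hypothetical stand-ins: names mirror the README, signatures are illustrative.
#[derive(Debug, PartialEq)]
pub enum ArchError {
    /// Soft: the codec does not model this operation.
    Unsupported,
    /// Hard: the operand does not fit the encoding.
    OutOfRange,
}

pub trait ArchCodec {
    /// Required: every arch in this sketch must encode a jump.
    fn encode_jump(&self, source: u64, target: u64) -> Result<Vec<u8>, ArchError>;

    /// Optional: defaults to Unsupported, so an arch opts in by overriding.
    fn encode_move(&self, _dst: &str, _src: &str) -> Result<Vec<u8>, ArchError> {
        Err(ArchError::Unsupported)
    }
}

/// A toy codec that only models jumps.
pub struct ToyCodec;

impl ArchCodec for ToyCodec {
    fn encode_jump(&self, source: u64, target: u64) -> Result<Vec<u8>, ArchError> {
        // Made-up encoding: one opcode byte plus a signed 8-bit displacement.
        let delta = target.wrapping_sub(source) as i64;
        let rel = i8::try_from(delta).map_err(|_| ArchError::OutOfRange)?;
        Ok(vec![0xEB, rel as u8])
    }
}

/// Caller-side policy: keep the pinned bytes on a soft error,
/// pass hard errors and successful encodings through unchanged.
pub fn encode_or_pinned(
    attempt: Result<Vec<u8>, ArchError>,
    pinned: &[u8],
) -> Result<Vec<u8>, ArchError> {
    match attempt {
        Err(ArchError::Unsupported) => Ok(pinned.to_vec()),
        other => other,
    }
}
```

In this sketch `ToyCodec` never overrides `encode_move`, so a caller using `encode_or_pinned` keeps whatever bytes were pinned for that statement, while an out-of-range jump still surfaces as a hard error.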

Adding a new arch

  1. Create the arch crate (crates/ud-arch-<name>/).

  2. Add ud-arch-codec to its Cargo.toml.

  3. Implement ArchCodec for your codec type. Start with the four "always-supported" methods (assemble_one, encode_jump, encode_call, encode_cond_jump) and the three size queries — return Unsupported from everything else until you need it.

  4. Expose a pub fn register() that submits your factory:

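    use ud_arch_codec::ArchCodec;

    /// Submit this crate's factory to the shared registry.
    pub fn register() {
        ud_arch_codec::register(factory);
    }

    /// Return a codec when the module names this arch, `None` otherwise.
    /// `RiscV64Codec` is your type implementing `ArchCodec` (step 3).
    fn factory(
        arch_name: Option<&str>,
        _e_machine: Option<u64>, // unused: this example matches by name only
    ) -> Option<Box<dyn ArchCodec>> {
        if arch_name == Some("riscv64") {
            Some(Box::new(RiscV64Codec))
        } else {
            None
        }
    }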
    
  5. Call your_crate::register() from ud_translate::register_all_arches (the workspace's single entry point for arch registration). The CLI and wasm playground both invoke that automatically.

That's the contract. The lower path and decompile-side byte-drop pass start using your codec as soon as for_arch resolves it from a parsed @module block.
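For readers who want a mental model of what an open registry like this might look like, the following is a rough sketch under stated assumptions. The mutex-guarded global list, the first-match-wins lookup order, and the one-method `ArchCodec` stand-in are hypothetical; only the factory signature is taken from step 4.

```rust
use std::sync::Mutex;

// Hypothetical stand-in for the real trait, reduced to one method.
pub trait ArchCodec {
    fn name(&self) -> &'static str;
}

/// Same shape as the factory in step 4.
pub type Factory = fn(Option<&str>, Option<u64>) -> Option<Box<dyn ArchCodec>>;

// Assumption: a process-wide list guarded by a mutex. The real crate
// may store its factories differently.
static FACTORIES: Mutex<Vec<Factory>> = Mutex::new(Vec::new());

/// Append a factory; any arch crate can call this.
pub fn register(factory: Factory) {
    FACTORIES.lock().unwrap().push(factory);
}

/// Ask each factory in registration order; the first `Some` wins.
pub fn for_arch(
    arch_name: Option<&str>,
    e_machine: Option<u64>,
) -> Option<Box<dyn ArchCodec>> {
    let factories = FACTORIES.lock().unwrap();
    factories.iter().find_map(|f| f(arch_name, e_machine))
}

struct RiscV64Codec;

impl ArchCodec for RiscV64Codec {
    fn name(&self) -> &'static str {
        "riscv64"
    }
}

fn riscv64_factory(
    arch_name: Option<&str>,
    _e_machine: Option<u64>,
) -> Option<Box<dyn ArchCodec>> {
    if arch_name == Some("riscv64") {
        Some(Box::new(RiscV64Codec))
    } else {
        None
    }
}
```

After `register(riscv64_factory)`, a call such as `for_arch(Some("riscv64"), None)` yields the codec in this sketch, and any name no factory claims yields `None`. Keeping factories as plain function pointers means the registry itself needs no knowledge of individual arches.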

Stmt → trait method map

| Stmt:: variant                      | Codec method asked at lower time            | Falls back to |
|-------------------------------------|---------------------------------------------|---------------|
| Asm { text } (empty bytes)          | assemble_one(text, ip) (+ desymbolize)      | Hard error    |
| Goto { target, wide }               | encode_jump(source, target, hints)          | Hard error    |
| IfGoto { cond_code, target, wide }  | encode_cond_jump_with_code(cond_code, …)    | Hard error    |
| Call { direct_target } (when set)   | encode_call(source, target, hints)          | Hard error    |
| Switch { dispatch, … }              | encode_switch_dispatch(spec)                | Hard error    |
| IfBlock cond/tail (empty)           | encode_cond_jump(text, …) / encode_jump(…)  | Hard error    |
| WhileBlock entry/tail (empty)       | same                                        | Hard error    |
| Move { dst, src } (empty bytes)     | encode_move(dst, src)                       | Hard error    |
| Return { value } (empty bytes)      | encode_return(value)                        | Hard error    |

The decompile-side byte-drop pass mirrors each of these rows: it clears a statement's bytes only when the codec's re-encoding is byte-identical to the original. That byte-identity guard is unconditional.
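The byte-identity guard can be pictured with a small sketch. Everything here is hypothetical: `PinnedStmt`, `drop_if_identical`, and the one-instruction `toy_encode` are illustrative stand-ins, not types or functions from the univdreams crates.

```rust
/// Hypothetical statement record: the source text plus the bytes
/// pinned from the original binary.
pub struct PinnedStmt {
    pub text: String,
    pub bytes: Vec<u8>,
}

/// Clear the pinned bytes only if `reencode` yields exactly the same bytes.
/// Returns true when the bytes were dropped.
pub fn drop_if_identical<F>(stmt: &mut PinnedStmt, reencode: F) -> bool
where
    F: Fn(&str) -> Option<Vec<u8>>,
{
    match reencode(&stmt.text) {
        Some(encoded) if encoded == stmt.bytes => {
            stmt.bytes.clear();
            true
        }
        // Different bytes, or the encoder could not handle the text:
        // keep the pin so the original bytes survive a round trip.
        _ => false,
    }
}

/// Toy encoder standing in for a codec: knows a single x86 instruction.
pub fn toy_encode(text: &str) -> Option<Vec<u8>> {
    match text {
        "ret" => Some(vec![0xC3]),
        _ => None,
    }
}
```

With this sketch, a `ret` pinned as `C3` is dropped because the toy encoder reproduces it, while a `ret` pinned as `C2 00 00` keeps its bytes: the re-encoding would differ, so clearing the pin would change the output.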

Layering note

ud-arch-codec deliberately doesn't depend on ud-ast. ud-ast already depends on ud-arch-x86 (for the canonical-emit derivation of head_bytes), and ud-arch-x86, like every arch backend, depends on ud-arch-codec, so making ud-arch-codec depend on ud-ast would close a cycle. Instead, the registry takes raw (arch_name, e_machine) pairs that ud-translate extracts from a parsed module.

If you need richer Stmt-aware encoder types here later, do it through a side crate that depends on both — not by broadening ud-arch-codec.