chipi-core 0.9.1

Core library for chipi: parser, IR, and code generation backends for instruction decoder generation

chipi

A declarative instruction decoder generator. You define a CPU's encoding in a portable .chipi DSL file, then describe per-project codegen choices in a *.bindings.chipi file. From these, chipi produces decoders, disassemblers, and emulator dispatch code.

.chipi files are language-agnostic. They describe bit patterns, field extractions and display formats. They contain no language-specific information.

*.bindings.chipi files are project-specific. They select which decoders to lower, choose the target (Rust, C++, IDA, or Binary Ninja), and set per-target options such as output paths and type mappings.

Backends

Backend  Output
rust     Decoder enum + decode() + Display. Optional emulator dispatch.
cpp      Single-header decoder with std::format.
ida      IDA Pro 9.x processor module (Python).
binja    Binary Ninja Architecture plugin (Python).

The IDA and Binary Ninja outputs are experimental. They do not replace hand-written processor modules.

Architecture

*.chipi --------------+
                      |
*.bindings.chipi -----+--> bindings parser/lower --> codegen
                      |
                      +--> chipi-cli / chipi-build

Three crates:

Crate        Purpose
chipi-core   Parser, IR, validation, bindings frontend, codegen backends.
chipi-cli    CLI: generate, check, explain, preview.
chipi-build  build.rs helper for Rust projects.

Quick start

A .chipi instruction spec describes the encoding:

decoder Gekko {
    width = 32
    bit_order = msb0
}

addi   [0:5]=001110 rd:u5[6:10] ra:u5[11:15] simm:s16[16:31]
       | "addi r{rd}, r{ra}, {simm}"

ori    [0:5]=011000 rs:u5[6:10] ra:u5[11:15] uimm:u16[16:31]
       | "ori r{ra}, r{rs}, 0x{uimm:04x}"
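
To make the msb0 bit ranges concrete, here is a hand-written Rust sketch of the field extraction that the addi line describes. It is not chipi's generated output; the helper names field_msb0 and sign_extend are invented for illustration.

```rust
/// Extract bits [start:end] (inclusive) from a 32-bit word using msb0
/// numbering, where bit 0 is the most significant bit.
fn field_msb0(word: u32, start: u32, end: u32) -> u32 {
    assert!(start <= end && end < 32);
    let len = end - start + 1;
    let shift = 31 - end;
    let mask = if len == 32 { u32::MAX } else { (1u32 << len) - 1 };
    (word >> shift) & mask
}

/// Sign-extend the low `bits` bits of `value` to an i32.
fn sign_extend(value: u32, bits: u32) -> i32 {
    assert!((1..=32).contains(&bits));
    let shift = 32 - bits;
    ((value << shift) as i32) >> shift
}

fn main() {
    let word = 0x3860_0001u32;
    // [0:5]=001110 selects addi.
    assert_eq!(field_msb0(word, 0, 5), 0b001110);
    let rd = field_msb0(word, 6, 10);
    let ra = field_msb0(word, 11, 15);
    let simm = sign_extend(field_msb0(word, 16, 31), 16);
    println!("addi r{rd}, r{ra}, {simm}"); // prints "addi r3, r0, 1"
}
```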

A *.bindings.chipi file picks which decoder/dispatch to generate:

include "gekko.chipi"

target rust {
    decoder Gekko {
        output "$OUT_DIR/gekko.rs"

        type gpr = crate::cpu::Gpr
        type simm16 = i32
    }

    dispatch Gekko {
        output "$OUT_DIR/gekko_dispatch.rs"

        context crate::Cpu
        handlers crate::cpu::interpreter
        strategy fn_ptr_lut

        invalid_handler crate::cpu::interpreter::invalid

        instruction_type crate::cpu::Instruction {
            output "$OUT_DIR/gekko_instr.rs"
        }

        handler alu {
            addi
            ori
        }
    }
}

Generate everything:

chipi generate specs/gekko.bindings.chipi

CLI

# Run all targets in the file.
chipi generate <bindings>
chipi generate <bindings> --target rust
chipi generate <bindings> --target rust --decoder Gekko

# Validate without writing.
chipi check <bindings>

# Decode an opcode and print the match.
chipi explain <bindings> --decoder Gekko 0x38600001

# Print the lowered configuration.
chipi preview <bindings>
chipi preview <bindings> --target rust

A bindings file may contain multiple targets. If --target is omitted in that case, chipi reports an error. The same applies to --decoder when more than one decoder or dispatch is reachable.

build.rs

Drive codegen from build.rs via chipi-build:

// build.rs
fn main() {
    chipi_build::generate_bindings("specs/gekko.bindings.chipi")
        .expect("chipi codegen failed");
}

To select a single target or decoder:

chipi_build::generate_bindings_target("specs/gekko.bindings.chipi", "rust")?;
chipi_build::generate_bindings_decoder("specs/dsp.bindings.chipi", "rust", "GcDsp")?;

chipi-build automatically emits cargo:rerun-if-changed for the bindings file. It also emits one for every transitively included bindings file and one for every included .chipi spec.
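
For reference, the directives involved are ordinary Cargo build-script output lines. The sketch below shows what emitting them by hand would look like; the helper name and the file list are hypothetical, and chipi-build's actual implementation may differ.

```rust
/// Build one `cargo:rerun-if-changed` line per input file.
fn rerun_if_changed_lines(paths: &[&str]) -> Vec<String> {
    paths
        .iter()
        .map(|p| format!("cargo:rerun-if-changed={p}"))
        .collect()
}

fn main() {
    // Hypothetical inputs: a bindings file plus the spec it includes.
    let inputs = ["specs/gekko.bindings.chipi", "specs/gekko.chipi"];
    for line in rerun_if_changed_lines(&inputs) {
        // Cargo reads these lines from the build script's stdout.
        println!("{line}");
    }
}
```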

Bindings reference

Targets

A bindings file contains one or more target <kind> { ... } blocks:

  • target rust: Supports decoder and dispatch blocks.
  • target cpp: Supports decoder blocks.
  • target ida: Supports processor blocks.
  • target binja: Supports architecture blocks.

include "*.chipi" brings in an instruction spec. include "*.bindings.chipi" recursively merges another bindings file's targets. The latter is useful for combining multiple CPUs in one project.

Rust decoders

target rust {
    decoder Gekko {
        output "$OUT_DIR/gekko.rs"

        type gpr = crate::cpu::Gpr
        type fpr = crate::cpu::Fpr
        type simm16 = i32

        subdecoder GekkoExt {
            output "$OUT_DIR/gekko_ext.rs"
        }
    }
}

Type aliases declared in the .chipi file map to the Rust paths listed in type ... = .... Sub-decoder blocks share the same set of options.

Rust dispatches

target rust {
    dispatch Gekko {
        output "$OUT_DIR/gekko_dispatch.rs"

        context crate::Cpu
        handlers crate::cpu::interpreter
        strategy fn_ptr_lut

        invalid_handler crate::cpu::interpreter::invalid

        instruction_type crate::cpu::Instruction {
            output "$OUT_DIR/gekko_instr.rs"
        }

        handler alu {
            addi
            ori
        }
    }
}

If no handler blocks are present, every instruction maps to a same-named function under the handlers module path:

  • addi -> crate::cpu::interpreter::addi
  • ori -> crate::cpu::interpreter::ori

Each handler <name> { ... } block groups the listed instructions under one const-generic handler taking <const OP: u32>:

  • addi -> crate::cpu::interpreter::alu::<{ OP_ADDI }>
  • ori -> crate::cpu::interpreter::alu::<{ OP_ORI }>

The OP_* constants are emitted into the generated dispatch file. The user writes:

pub fn alu<const OP: u32>(ctx: &mut Cpu, instr: Instruction) {
    match OP {
        OP_ADDI => { /* ... */ }
        OP_ORI  => { /* ... */ }
        _ => unreachable!(),
    }
}
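
The following self-contained sketch shows how such a const-generic handler can sit behind a function-pointer table. Everything outside alu is a stand-in: the Cpu and Instruction types, the OP_* values, and the LUT layout are assumptions for illustration, not what chipi emits.

```rust
// Hypothetical stand-ins: the real OP_* constants come from the generated
// dispatch file, and Cpu / Instruction come from the user's crate.
pub const OP_ADDI: u32 = 0;
pub const OP_ORI: u32 = 1;

pub struct Cpu {
    pub gpr: [u32; 32],
}

#[derive(Clone, Copy)]
pub struct Instruction(pub u32);

impl Instruction {
    fn rd(self) -> usize { ((self.0 >> 21) & 0x1f) as usize } // msb0 bits 6:10
    fn ra(self) -> usize { ((self.0 >> 16) & 0x1f) as usize } // msb0 bits 11:15
    fn imm(self) -> u16 { self.0 as u16 }                      // msb0 bits 16:31
}

pub fn alu<const OP: u32>(ctx: &mut Cpu, instr: Instruction) {
    match OP {
        OP_ADDI => {
            // Simplified semantics: sign-extend the immediate and add.
            let simm = instr.imm() as i16 as i32 as u32;
            ctx.gpr[instr.rd()] = ctx.gpr[instr.ra()].wrapping_add(simm);
        }
        OP_ORI => {
            // For ori the source register sits in bits 6:10, the destination in 11:15.
            ctx.gpr[instr.ra()] = ctx.gpr[instr.rd()] | instr.imm() as u32;
        }
        _ => unreachable!(),
    }
}

type Handler = fn(&mut Cpu, Instruction);

// Each entry is a separate monomorphized copy of `alu`.
static LUT: [Handler; 2] = [alu::<{ OP_ADDI }>, alu::<{ OP_ORI }>];

fn main() {
    let mut cpu = Cpu { gpr: [0; 32] };
    LUT[OP_ADDI as usize](&mut cpu, Instruction(0x3860_0001)); // addi r3, r0, 1
    LUT[OP_ORI as usize](&mut cpu, Instruction(0x6064_00F0)); // ori r4, r3, 0x00f0
    println!("r3 = {}, r4 = {:#x}", cpu.gpr[3], cpu.gpr[4]);
}
```

Because OP is a compile-time constant, the compiler can discard the untaken match arms in each instantiation, so grouping instructions this way need not cost a runtime branch.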

Extra const-generic handler arguments

handler_const <expr> appends one or more extra const-generic arguments to every handler reference in the generated LUT. Use it when handlers take more const generics than just OP and the value is constant for the whole binding (e.g. a SystemId selecting which CPU configuration this LUT is for):

target rust {
    dispatch Gekko {
        output "$OUT_DIR/gekko_lut_gc.rs"
        context crate::gamecube::GameCube
        handlers crate::gekko::interpreter
        handler_const crate::system::GC

        handler alu { addi, ori }
    }

    dispatch Gekko {
        output "$OUT_DIR/gekko_lut_wii.rs"
        context crate::wii::Wii
        handlers crate::gekko::interpreter
        handler_const crate::system::WII

        handler alu { addi, ori }
    }
}

This generates alu::<{ OP_ADDI }, { crate::system::GC }> for the first dispatch and alu::<{ OP_ADDI }, { crate::system::WII }> for the second: two LUTs backed by one shared generic handler module. The directive is repeatable for handlers with three or more const generics; each entry becomes its own { ... }-wrapped argument, in declaration order.

For ungrouped instructions the same arguments apply: sc becomes sc::<{ crate::system::GC }>.
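
A minimal sketch of what the handler side could look like with one extra const generic. It assumes the system id is a u8 constant, since stable Rust limits const generic parameters to integer, bool, and char types. It also uses a single hypothetical Ctx type for both tables to stay short, whereas the bindings above use a different context per dispatch. The names SYS, Ctx, LUT_GC, and LUT_WII are invented.

```rust
// Hypothetical constants; in a real project OP_* come from generated code.
pub const OP_ADDI: u32 = 0;
pub const OP_ORI: u32 = 1;
pub const GC: u8 = 0;
pub const WII: u8 = 1;

#[derive(Default)]
pub struct Ctx {
    /// Records which (OP, SYS) instantiation ran, for demonstration only.
    pub trace: Vec<(u32, u8)>,
}

pub fn alu<const OP: u32, const SYS: u8>(ctx: &mut Ctx, _raw: u32) {
    ctx.trace.push((OP, SYS));
}

type Handler = fn(&mut Ctx, u32);

// One table per system, both pointing into the same generic handler.
static LUT_GC: [Handler; 2] = [alu::<{ OP_ADDI }, { GC }>, alu::<{ OP_ORI }, { GC }>];
static LUT_WII: [Handler; 2] = [alu::<{ OP_ADDI }, { WII }>, alu::<{ OP_ORI }, { WII }>];

fn main() {
    let mut ctx = Ctx::default();
    LUT_GC[OP_ADDI as usize](&mut ctx, 0);
    LUT_WII[OP_ORI as usize](&mut ctx, 0);
    println!("{:?}", ctx.trace); // prints "[(0, 0), (1, 1)]"
}
```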

Dispatch strategies

  • fn_ptr_lut. Static [Handler; N] arrays per decision-tree branch.
  • jump_table. One #[inline(always)] function with nested matches.
  • flat_lut. Full-width function-pointer table indexed by raw decoder value.
  • flat_match. Full-width match with adjacent equal handlers compressed into ranges.
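
The range compression mentioned for flat_match can be pictured as run-length grouping over a per-value handler table. The sketch below is an illustration of that idea only; compress_ranges and the string handler names are hypothetical, and chipi's internal representation is not shown in this document.

```rust
/// Collapse a table indexed by raw value into inclusive (first, last, handler)
/// ranges, merging adjacent entries that name the same handler.
fn compress_ranges<'a>(table: &[&'a str]) -> Vec<(usize, usize, &'a str)> {
    let mut ranges: Vec<(usize, usize, &'a str)> = Vec::new();
    for (raw, &handler) in table.iter().enumerate() {
        if let Some(last) = ranges.last_mut() {
            if last.2 == handler {
                last.1 = raw; // extend the current run
                continue;
            }
        }
        ranges.push((raw, raw, handler));
    }
    ranges
}

fn main() {
    // Hypothetical 3-bit decoder: 8 raw values.
    let table = ["invalid", "add", "add", "add", "ori", "invalid", "invalid", "invalid"];
    for (first, last, handler) in compress_ranges(&table) {
        // Each run would become one arm of the generated match.
        println!("{first}..={last} => {handler}");
    }
}
```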

flat_lut and flat_match enumerate the entire 2^width key space, so they are best suited to small decoders or sub-decoders. chipi does not cap the width for you: a 32-bit decoder means 2^32 (about 4.3 billion) keys, and chipi will generate that output if asked. Restrict these strategies to narrow decoders.
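
A quick way to estimate the cost before choosing a flat strategy is to compute the table size directly. This is a back-of-the-envelope sketch with invented helper names; the 8-byte entry size is an assumption that holds for function pointers on typical 64-bit targets.

```rust
/// Number of keys in a full-width table, or None if 2^width overflows u64.
fn flat_entries(width: u32) -> Option<u64> {
    1u64.checked_shl(width)
}

/// Approximate table size in bytes, assuming `entry_bytes` per entry.
fn flat_lut_bytes(width: u32, entry_bytes: u64) -> Option<u64> {
    flat_entries(width)?.checked_mul(entry_bytes)
}

fn main() {
    for width in [8, 16, 32] {
        println!(
            "width {width}: {:?} entries, {:?} bytes",
            flat_entries(width),
            flat_lut_bytes(width, 8)
        );
    }
}
```

With 8-byte entries, a 16-bit decoder needs 512 KiB and a 32-bit decoder needs 32 GiB.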

Subdecoder / subdispatch

target rust {
    decoder GcDsp {
        output "$OUT_DIR/dsp.rs"

        type reg5 = crate::dsp::Register

        subdecoder GcDspExt {
            output "$OUT_DIR/dsp_ext.rs"
        }
    }

    dispatch GcDsp {
        output "$OUT_DIR/dsp_dispatch.rs"

        context crate::dsp::Dsp
        handlers crate::dsp::interpreter
        strategy fn_ptr_lut
        invalid_handler crate::dsp::interpreter::invalid

        subdispatch GcDspExt {
            handlers crate::dsp::interpreter::ext
            strategy flat_lut
            invalid_handler crate::dsp::interpreter::invalid_ext
        }
    }
}

A subdispatch inherits context, strategy, and invalid_handler from its parent unless overridden. handlers and instruction_type may also be overridden.
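
The inheritance rule amounts to "the child's value wins, otherwise use the parent's". A minimal sketch of that resolution, using a hypothetical DispatchOpts struct that models only the three inherited options:

```rust
#[derive(Debug, Clone, PartialEq)]
struct DispatchOpts {
    context: Option<String>,
    strategy: Option<String>,
    invalid_handler: Option<String>,
}

/// Resolve a subdispatch's options against its parent.
fn resolve(parent: &DispatchOpts, child: &DispatchOpts) -> DispatchOpts {
    DispatchOpts {
        context: child.context.clone().or_else(|| parent.context.clone()),
        strategy: child.strategy.clone().or_else(|| parent.strategy.clone()),
        invalid_handler: child
            .invalid_handler
            .clone()
            .or_else(|| parent.invalid_handler.clone()),
    }
}

fn main() {
    // Values taken from the GcDsp / GcDspExt example above.
    let parent = DispatchOpts {
        context: Some("crate::dsp::Dsp".into()),
        strategy: Some("fn_ptr_lut".into()),
        invalid_handler: Some("crate::dsp::interpreter::invalid".into()),
    };
    let child = DispatchOpts {
        context: None, // not overridden, so inherited
        strategy: Some("flat_lut".into()),
        invalid_handler: Some("crate::dsp::interpreter::invalid_ext".into()),
    };
    println!("{:#?}", resolve(&parent, &child));
}
```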

IDA processor

target ida {
    processor GcDsp {
        output "plugins/ida/dsp_proc.py"

        name "gcdsp"
        long_name "GameCube DSP"
        id 0x8002

        address_size 16
        bytes_per_unit 2

        registers {
            ar0
            ar1
            ar2
            ar3
            CS
            DS
        }

        segment_registers {
            CS
            DS
        }

        flow {
            calls {
                callcc
            }
            returns {
                retcc
            }
            stops {
                halt
            }
        }
    }
}

segment_registers must be a subset of registers. Instruction names in flow must exist in the decoder.
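
Both constraints are membership checks. The sketch below shows the kind of check involved, with a hypothetical missing_from helper; it is not chipi's validation code.

```rust
/// Return every name in `subset` that does not appear in `superset`.
fn missing_from<'a>(subset: &[&'a str], superset: &[&str]) -> Vec<&'a str> {
    subset
        .iter()
        .copied()
        .filter(|n| !superset.iter().any(|s| *s == *n))
        .collect()
}

fn main() {
    let registers = ["ar0", "ar1", "ar2", "ar3", "CS", "DS"];
    let segment_registers = ["CS", "DS"];
    let missing = missing_from(&segment_registers, &registers);
    if missing.is_empty() {
        println!("segment_registers is a subset of registers");
    } else {
        println!("not declared in registers: {missing:?}");
    }
}
```

The same helper applies to the flow block by passing the flow instruction names as subset and the decoder's instruction names as superset.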

Binary Ninja architecture

target binja {
    architecture GcDsp {
        output "plugins/binja/dsp_arch.py"

        name "gcdsp"

        address_size 2
        default_int_size 2
        endianness big

        registers {
            ar0
            ar1
            ar2
            ar3
        }
    }
}

endianness must be big or little.

Diagnostics

chipi reports validation errors with a span and, where possible, a "did you mean" suggestion. For example:

error: unknown instruction 'halttt' in handler group
 --> test_grouped.bindings.chipi:14
 = help: did you mean "halt"?
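
Suggestions of this kind are commonly computed with an edit-distance search over the known names. The document does not say which algorithm chipi uses, so the sketch below (Levenshtein distance with a hypothetical max_distance cutoff) is only one plausible way to produce such a hint.

```rust
/// Classic Levenshtein edit distance using two rolling rows.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        let mut cur = vec![i; b.len() + 1]; // cur[0] = i deletions
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            cur[j] = (prev[j] + 1).min(cur[j - 1] + 1).min(prev[j - 1] + cost);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Pick the closest known name within `max_distance` edits, if any.
fn suggest<'a>(unknown: &str, known: &[&'a str], max_distance: usize) -> Option<&'a str> {
    known
        .iter()
        .copied()
        .map(|name| (levenshtein(unknown, name), name))
        .filter(|(d, _)| *d <= max_distance)
        .min_by_key(|(d, _)| *d)
        .map(|(_, name)| name)
}

fn main() {
    let known = ["halt", "callcc", "retcc"];
    match suggest("halttt", &known, 2) {
        Some(name) => println!("help: did you mean \"{name}\"?"),
        None => println!("no suggestion"),
    }
}
```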

With flat_lut or flat_match, a raw value that matches two distinct handlers is ambiguous. chipi reports each conflicting instruction:

error: flat dispatch cannot resolve raw opcode 0x0000007c
   matched instructions:
     add  -> crate::cpu::interpreter::add
     addx -> crate::cpu::interpreter::addx
   flat dispatch requires each raw value to resolve to exactly one handler.
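
To see how such a conflict arises, the sketch below enumerates a small key space against mask/value patterns and collects every raw value that matches more than one. The Pattern struct and the two 8-bit encodings are invented for illustration; they are not the real add/addx encodings, and this is not chipi's implementation.

```rust
struct Pattern {
    name: &'static str,
    mask: u32,
    value: u32,
}

/// Return each raw value that matches more than one pattern.
fn find_conflicts(width: u32, patterns: &[Pattern]) -> Vec<(u32, Vec<&'static str>)> {
    assert!(width <= 16, "sketch only: keep the key space small");
    let mut conflicts = Vec::new();
    for raw in 0..(1u32 << width) {
        let hits: Vec<&'static str> = patterns
            .iter()
            .filter(|p| (raw & p.mask) == p.value)
            .map(|p| p.name)
            .collect();
        if hits.len() > 1 {
            conflicts.push((raw, hits));
        }
    }
    conflicts
}

fn main() {
    // Hypothetical encodings: `addx` is a more specific overlap of `add`.
    let patterns = [
        Pattern { name: "add", mask: 0xF0, value: 0x70 },
        Pattern { name: "addx", mask: 0xFC, value: 0x7C },
    ];
    for (raw, names) in find_conflicts(8, &patterns) {
        println!("raw {raw:#04x} matched {names:?}");
    }
}
```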

Examples

Project          Description
chipi-gekko      GameCube CPU & DSP disassembler (Rust)
chipi-gekko-cpp  GameCube CPU disassembler (C++)
gc-dsp-ida       GameCube DSP processor plugin for IDA Pro 9.x
gc-dsp-binja     GameCube DSP processor plugin for Binary Ninja
chipi-spec       Reusable .chipi specs
chipi-vscode     VS Code syntax highlighting for .chipi files