chipi
A declarative instruction decoder generator. You define a CPU's encoding in
a portable .chipi DSL file. You then describe per-project codegen choices
in a *.bindings.chipi file. chipi produces decoders, disassemblers and
emulator dispatch code.
.chipi files are language-agnostic. They describe bit patterns, field
extractions and display formats. They contain no language-specific
information.
*.bindings.chipi files are project-specific. They select which decoders to
lower, which backend to target (Rust, C++, IDA, Binary Ninja), and per-target
options such as output paths, type mappings and dispatch strategies.
Backends
| Backend | Output |
|---|---|
| rust | Decoder enum + decode() + Display. Optional emulator dispatch. |
| cpp | Single-header decoder with std::format. |
| ida | IDA Pro 9.x processor module (Python). |
| binja | Binary Ninja Architecture plugin (Python). |
The IDA and Binary Ninja outputs are experimental. They do not replace hand-written processor modules.
Architecture
*.chipi --------------+
|
*.bindings.chipi -----+--> bindings parser/lower --> codegen
|
+--> chipi-cli / chipi-build
Three crates:
| Crate | Purpose |
|---|---|
| chipi-core | Parser, IR, validation, bindings frontend, codegen backends. |
| chipi-cli | CLI: generate, check, explain, preview. |
| chipi-build | build.rs helper for Rust projects. |
Quick start
A .chipi instruction spec describes the encoding:
decoder Gekko {
width = 32
bit_order = msb0
}
addi [0:5]=001110 rd:u5[6:10] ra:u5[11:15] simm:s16[16:31]
| "addi r{rd}, r{ra}, {simm}"
ori [0:5]=011000 rs:u5[6:10] ra:u5[11:15] uimm:u16[16:31]
| "ori r{ra}, r{rs}, 0x{uimm:04x}"
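To make the msb0 bit ranges concrete, here is a hand-written Rust sketch of what the two patterns above describe. It is not chipi's generated output; the names `field_msb0` and `disassemble` are made up for this illustration, and the real generated API may look different.

```rust
/// Extract bits [start:end] (inclusive) from a 32-bit word using msb0
/// numbering, where bit 0 is the most significant bit.
/// Precondition: start <= end <= 31.
pub fn field_msb0(word: u32, start: u32, end: u32) -> u32 {
    let len = end - start + 1;
    let shifted = word >> (31 - end);
    shifted & (u32::MAX >> (32 - len))
}

/// Hypothetical hand-written equivalent of the `addi` and `ori` patterns.
/// Returns the formatted text, or None if neither primary opcode matches.
pub fn disassemble(word: u32) -> Option<String> {
    let reg_a = field_msb0(word, 6, 10); // rd for addi, rs for ori
    let ra = field_msb0(word, 11, 15);
    let imm = field_msb0(word, 16, 31);
    match field_msb0(word, 0, 5) {
        0b001110 => {
            // simm is declared as s16, so sign-extend the low 16 bits.
            let simm = imm as u16 as i16;
            Some(format!("addi r{}, r{}, {}", reg_a, ra, simm))
        }
        0b011000 => Some(format!("ori r{}, r{}, 0x{:04x}", ra, reg_a, imm)),
        _ => None,
    }
}
```

With this sketch, the word 0x3861FFF8 formats as `addi r3, r1, -8` and 0x60A400FF as `ori r4, r5, 0x00ff`.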
A *.bindings.chipi file picks which decoder/dispatch to generate:
include "gekko.chipi"
target rust {
decoder Gekko {
output "$OUT_DIR/gekko.rs"
type gpr = crate::cpu::Gpr
type simm16 = i32
}
dispatch Gekko {
output "$OUT_DIR/gekko_dispatch.rs"
context crate::Cpu
handlers crate::cpu::interpreter
strategy fn_ptr_lut
invalid_handler crate::cpu::interpreter::invalid
instruction_type crate::cpu::Instruction {
output "$OUT_DIR/gekko_instr.rs"
}
handler alu {
addi
ori
}
}
}
Generate everything:
CLI
# Run all targets in the file.
# Validate without writing.
# Decode an opcode and print the match.
# Print the lowered configuration.
A bindings file may contain multiple targets. If --target is omitted in
that case, chipi reports an error. The same applies to --decoder when
more than one decoder or dispatch is reachable.
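The selection rule can be sketched in a few lines of Rust. This is only a model of the behaviour described above, not chipi's code: the function name `select` and the error strings are assumptions.

```rust
/// Pick one entry from `available`.
/// - An explicit request must name an existing entry.
/// - Without a request, exactly one entry must exist.
/// `kind` is only used in the (hypothetical) error text, e.g. "target".
pub fn select<'a>(
    kind: &str,
    available: &[&'a str],
    requested: Option<&str>,
) -> Result<&'a str, String> {
    match requested {
        Some(name) => available
            .iter()
            .copied()
            .find(|candidate| *candidate == name)
            .ok_or_else(|| format!("unknown {} '{}'", kind, name)),
        None => match available {
            [only] => Ok(*only),
            [] => Err(format!("no {} defined", kind)),
            _ => Err(format!("multiple {}s found; pass --{}", kind, kind)),
        },
    }
}
```

A single-target file needs no flag, while a file with both `rust` and `cpp` targets yields an error until a name is passed.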
build.rs
Drive codegen from build.rs via chipi-build:
// build.rs
To select a single target or decoder:
generate_bindings_target?;
generate_bindings_decoder?;
chipi-build automatically emits cargo:rerun-if-changed for the bindings
file. It also emits one for every transitively included bindings file and one
for every included .chipi spec.
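For readers unfamiliar with the directive: a build script asks Cargo to re-run it by printing `cargo:rerun-if-changed=<path>` lines to stdout. The sketch below shows the general shape of that output for a set of input files. It is not chipi-build's implementation; `rerun_directives` and the file names in the usage note are hypothetical.

```rust
use std::collections::BTreeSet;

/// Build one `cargo:rerun-if-changed` line per distinct input path.
/// Duplicates (e.g. a spec included from two bindings files) collapse
/// into a single line; output is sorted for stable build logs.
pub fn rerun_directives(paths: &[&str]) -> Vec<String> {
    let unique: BTreeSet<&str> = paths.iter().copied().collect();
    unique
        .into_iter()
        .map(|path| format!("cargo:rerun-if-changed={}", path))
        .collect()
}
```

In a build script, each returned line would be printed with `println!`. For example, the inputs `gekko.chipi`, `cpu.bindings.chipi`, `gekko.chipi` produce two lines, one per distinct file.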
Bindings reference
Targets
A bindings file contains one or more target <kind> { ... } blocks:
- target rust: Supports decoder and dispatch blocks.
- target cpp: Supports decoder blocks.
- target ida: Supports processor blocks.
- target binja: Supports architecture blocks.
include "*.chipi" brings in an instruction spec. include "*.bindings.chipi"
recursively merges another bindings file's targets.
The latter is useful for combining multiple CPUs in one project.
Rust decoders
target rust {
decoder Gekko {
output "$OUT_DIR/gekko.rs"
type gpr = crate::cpu::Gpr
type fpr = crate::cpu::Fpr
type simm16 = i32
subdecoder GekkoExt {
output "$OUT_DIR/gekko_ext.rs"
}
}
}
Type aliases declared in the .chipi file map to the Rust paths listed in
type ... = .... Sub-decoder blocks share the same set of options.
Rust dispatches
target rust {
dispatch Gekko {
output "$OUT_DIR/gekko_dispatch.rs"
context crate::Cpu
handlers crate::cpu::interpreter
strategy fn_ptr_lut
invalid_handler crate::cpu::interpreter::invalid
instruction_type crate::cpu::Instruction {
output "$OUT_DIR/gekko_instr.rs"
}
handler alu {
addi
ori
}
}
}
If no handler blocks are present, every instruction maps to a same-named
function under the handlers module path:
- addi -> crate::cpu::interpreter::addi
- ori -> crate::cpu::interpreter::ori
Each handler <name> { ... } block groups the listed instructions under
one const-generic handler taking <const OP: u32>:
- addi -> crate::cpu::interpreter::alu::<{ OP_ADDI }>
- ori -> crate::cpu::interpreter::alu::<{ OP_ORI }>
The OP_* constants are emitted into the generated dispatch file. The user
writes:
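As a rough illustration of a const-generic handler of this kind, the following self-contained sketch uses stand-in types. Everything here is an assumption made for the example: the `Cpu` and `Instruction` layouts, the handler signature, the constant values, and the simplified semantics. In a real project the OP_* constants come from the generated dispatch file.

```rust
// Stand-ins for the generated constants (values chosen for this sketch).
pub const OP_ADDI: u32 = 0;
pub const OP_ORI: u32 = 1;

/// Hypothetical CPU state and decoded instruction.
pub struct Cpu {
    pub gpr: [u32; 32],
}

#[derive(Clone, Copy)]
pub struct Instruction {
    pub dst: usize,
    pub src: usize,
    pub imm: u32,
}

/// One generic handler for the whole `alu` group. OP is a compile-time
/// constant, so each instantiation keeps only its own match arm.
pub fn alu<const OP: u32>(cpu: &mut Cpu, ins: Instruction) {
    let a = cpu.gpr[ins.src];
    cpu.gpr[ins.dst] = match OP {
        OP_ADDI => a.wrapping_add(ins.imm), // simplified semantics
        OP_ORI => a | ins.imm,
        _ => unreachable!("alu is only instantiated for grouped opcodes"),
    };
}

/// A tiny table in the spirit of a function-pointer LUT.
pub type Handler = fn(&mut Cpu, Instruction);
pub static LUT: [Handler; 2] = [alu::<{ OP_ADDI }>, alu::<{ OP_ORI }>];
```

Calling `LUT[OP_ADDI as usize]` runs the `addi` instantiation; both table entries share one source-level function.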
Extra const-generic handler arguments
handler_const <expr> appends one or more extra const-generic arguments to
every handler reference in the generated LUT. Use it when handlers take
more const generics than just OP and the value is constant for the whole
binding (e.g. a SystemId selecting which CPU configuration this LUT is
for):
target rust {
dispatch Gekko {
output "$OUT_DIR/gekko_lut_gc.rs"
context crate::gamecube::GameCube
handlers crate::gekko::interpreter
handler_const crate::system::GC
handler alu { addi, ori }
}
dispatch Gekko {
output "$OUT_DIR/gekko_lut_wii.rs"
context crate::wii::Wii
handlers crate::gekko::interpreter
handler_const crate::system::WII
handler alu { addi, ori }
}
}
This generates alu::<{ OP_ADDI }, { crate::system::GC }> for the first dispatch
and alu::<{ OP_ADDI }, { crate::system::WII }> for the second: two LUTs backed
by one shared generic handler module. The directive is repeatable for handlers
with three or more const generics; each entry becomes its own { ... }-wrapped
argument in declaration order.
For ungrouped instructions the same arguments apply: sc becomes
sc::<{ crate::system::GC }>.
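A minimal sketch of a handler with a second const generic follows. It assumes the system identifier is a plain `u8`, since stable Rust only accepts integer, `bool` and `char` types as const generic parameters. The `system` module, the constant values and the handler signature are all hypothetical.

```rust
/// Hypothetical system identifiers (a u8 stand-in for a SystemId).
pub mod system {
    pub const GC: u8 = 0;
    pub const WII: u8 = 1;
}

// Stand-ins for the generated OP_* constants.
pub const OP_ADDI: u32 = 0;
pub const OP_ORI: u32 = 1;

/// One handler body, specialised at compile time by opcode and system.
/// Returns the computed value plus a label so the sketch is observable.
pub fn alu<const OP: u32, const SYS: u8>(a: u32, b: u32) -> (u32, &'static str) {
    let value = match OP {
        OP_ADDI => a.wrapping_add(b),
        OP_ORI => a | b,
        _ => unreachable!("alu is only instantiated for grouped opcodes"),
    };
    let label = match SYS {
        system::GC => "gc",
        system::WII => "wii",
        _ => "unknown",
    };
    (value, label)
}

pub type Handler = fn(u32, u32) -> (u32, &'static str);

// Two tables, one per binding, sharing the same generic handler.
pub static LUT_GC: [Handler; 2] =
    [alu::<{ OP_ADDI }, { system::GC }>, alu::<{ OP_ORI }, { system::GC }>];
pub static LUT_WII: [Handler; 2] =
    [alu::<{ OP_ADDI }, { system::WII }>, alu::<{ OP_ORI }, { system::WII }>];
```

Each table entry is a distinct monomorphised function, so the per-system branch is resolved at compile time rather than on every dispatch.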
Dispatch strategies
- fn_ptr_lut. Static [Handler; N] arrays per decision-tree branch.
- jump_table. One #[inline(always)] function with nested matches.
- flat_lut. Full-width function-pointer table indexed by raw decoder value.
- flat_match. Full-width match with adjacent equal handlers compressed into ranges.
flat_lut and flat_match enumerate the entire 2^width key space, so they
are best suited to small decoders or sub-decoders. chipi does not cap the
width for you: the key space of a 32-bit decoder has 2^32 values, so avoid
these strategies on wide decoders.
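To show the difference between the two flat strategies, here is a hand-written sketch for an imaginary 4-bit decoder. The instruction layout and handler names are invented for the example, and the code is only meant to convey the shape of each strategy, not chipi's actual output.

```rust
pub type Handler = fn() -> &'static str;

fn add() -> &'static str { "add" }
fn sub() -> &'static str { "sub" }
fn halt() -> &'static str { "halt" }
fn invalid() -> &'static str { "invalid" }

/// flat_lut style: one table slot for every possible raw value (2^4 = 16).
pub static FLAT_LUT: [Handler; 16] = [
    add, add, add, add, sub, sub, sub, sub,
    halt, invalid, invalid, invalid, invalid, invalid, invalid, invalid,
];

/// flat_match style: the same mapping, with adjacent equal handlers
/// collapsed into ranges.
pub fn flat_match(raw: u8) -> &'static str {
    match raw & 0x0F {
        0x0..=0x3 => add(),
        0x4..=0x7 => sub(),
        0x8 => halt(),
        _ => invalid(),
    }
}

/// Number of raw values a flat strategy has to cover (width <= 63).
pub fn key_space(width: u32) -> u64 {
    1u64 << width
}
```

Here `key_space(4)` is 16, which is trivial, while `key_space(32)` is 4294967296, which explains the warning about wide decoders.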
Subdecoder / subdispatch
target rust {
decoder GcDsp {
output "$OUT_DIR/dsp.rs"
type reg5 = crate::dsp::Register
subdecoder GcDspExt {
output "$OUT_DIR/dsp_ext.rs"
}
}
dispatch GcDsp {
output "$OUT_DIR/dsp_dispatch.rs"
context crate::dsp::Dsp
handlers crate::dsp::interpreter
strategy fn_ptr_lut
invalid_handler crate::dsp::interpreter::invalid
subdispatch GcDspExt {
handlers crate::dsp::interpreter::ext
strategy flat_lut
invalid_handler crate::dsp::interpreter::invalid_ext
}
}
}
A subdispatch inherits context, strategy, and invalid_handler from
its parent unless overridden. handlers and instruction_type may also
be overridden.
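The inheritance rule amounts to "use the child's value if present, otherwise the parent's". A small model of that rule is sketched below; the `DispatchOpts` struct is hypothetical, and treating `handlers` as inherited when it is not overridden is an assumption of this sketch.

```rust
/// Hypothetical model of the options a dispatch block can set.
/// `None` means "not written in this block".
#[derive(Debug, Clone, PartialEq, Default)]
pub struct DispatchOpts {
    pub context: Option<String>,
    pub handlers: Option<String>,
    pub strategy: Option<String>,
    pub invalid_handler: Option<String>,
}

/// Child value wins; otherwise fall back to the parent.
fn inherit(child: &Option<String>, parent: &Option<String>) -> Option<String> {
    child.clone().or_else(|| parent.clone())
}

/// Compute the effective options of a subdispatch.
pub fn resolve(parent: &DispatchOpts, child: &DispatchOpts) -> DispatchOpts {
    DispatchOpts {
        context: inherit(&child.context, &parent.context),
        // Assumption: handlers also falls back to the parent's path.
        handlers: inherit(&child.handlers, &parent.handlers),
        strategy: inherit(&child.strategy, &parent.strategy),
        invalid_handler: inherit(&child.invalid_handler, &parent.invalid_handler),
    }
}
```

Applied to the GcDsp example, the GcDspExt subdispatch keeps the parent's `crate::dsp::Dsp` context but uses its own handlers path, `flat_lut` strategy and invalid handler.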
IDA processor
target ida {
processor GcDsp {
output "plugins/ida/dsp_proc.py"
name "gcdsp"
long_name "GameCube DSP"
id 0x8002
address_size 16
bytes_per_unit 2
registers {
ar0
ar1
ar2
ar3
CS
DS
}
segment_registers {
CS
DS
}
flow {
calls {
callcc
}
returns {
retcc
}
stops {
halt
}
}
}
}
segment_registers must be a subset of registers. Instruction names in
flow must exist in the decoder.
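Both rules are simple membership checks. The sketch below models them over plain string slices; the function name and the error wording are assumptions, not chipi's actual diagnostics.

```rust
/// Check the two constraints on a processor block and collect every
/// violation instead of stopping at the first one.
pub fn validate_processor(
    registers: &[&str],
    segment_registers: &[&str],
    flow_instructions: &[&str],
    decoder_instructions: &[&str],
) -> Vec<String> {
    let mut errors = Vec::new();
    for seg in segment_registers {
        if !registers.contains(seg) {
            errors.push(format!(
                "segment register '{}' is not listed in registers",
                seg
            ));
        }
    }
    for name in flow_instructions {
        if !decoder_instructions.contains(name) {
            errors.push(format!(
                "flow instruction '{}' does not exist in the decoder",
                name
            ));
        }
    }
    errors
}
```

For the GcDsp block above, the check passes as long as the decoder defines `callcc`, `retcc` and `halt`; adding a segment register such as `ES` without listing it under registers would produce one error.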
Binary Ninja architecture
target binja {
architecture GcDsp {
output "plugins/binja/dsp_arch.py"
name "gcdsp"
address_size 2
default_int_size 2
endianness big
registers {
ar0
ar1
ar2
ar3
}
}
}
endianness must be big or little.
Diagnostics
chipi reports validation errors with a span and an optional "did you mean"
suggestion. For example:
error: unknown instruction 'halttt' in handler group
--> test_grouped.bindings.chipi:14
= help: did you mean "halt"?
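How chipi picks its suggestion is not described here. One common technique for this kind of hint is Levenshtein edit distance, sketched below under that assumption; the distance threshold of 2 and the function names are also assumptions.

```rust
/// Classic Levenshtein distance using a rolling row.
pub fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + if ca == cb { 0 } else { 1 };
            let deletion = prev[j + 1] + 1;
            let insertion = cur[j] + 1;
            cur.push(substitution.min(deletion).min(insertion));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Return the closest known name if it is within two edits.
pub fn suggest<'a>(unknown: &str, known: &[&'a str]) -> Option<&'a str> {
    known
        .iter()
        .copied()
        .map(|name| (edit_distance(unknown, name), name))
        .filter(|(distance, _)| *distance <= 2)
        .min_by_key(|(distance, _)| *distance)
        .map(|(_, name)| name)
}
```

With the known instructions `callcc`, `retcc` and `halt`, the misspelling `halttt` is two deletions away from `halt`, so `halt` is suggested.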
flat_lut / flat_match ambiguity reports each conflicting instruction.
This happens when one raw value matches two distinct handlers:
error: flat dispatch cannot resolve raw opcode 0x0000007c
matched instructions:
add -> crate::cpu::interpreter::add
addx -> crate::cpu::interpreter::addx
flat dispatch requires each raw value to resolve to exactly one handler.
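The condition can be modelled as a brute-force scan: for every raw value, collect the patterns that match and check whether they agree on a handler. The sketch below assumes each instruction is a simple mask/value pair, which is a simplification made for this example; the `Pattern` type and `find_conflict` are hypothetical.

```rust
/// Simplified instruction pattern: matches when (raw & mask) == value.
pub struct Pattern {
    pub name: &'static str,
    pub mask: u32,
    pub value: u32,
    pub handler: &'static str,
}

/// Scan the whole key space of a `width`-bit decoder (width <= 32) and
/// return the first raw value whose matches map to different handlers,
/// together with the names of the matching instructions.
pub fn find_conflict(
    patterns: &[Pattern],
    width: u32,
) -> Option<(u32, Vec<&'static str>)> {
    for raw in 0..(1u64 << width) {
        let raw = raw as u32;
        let matched: Vec<&Pattern> = patterns
            .iter()
            .filter(|p| (raw & p.mask) == p.value)
            .collect();
        let mut handlers: Vec<&str> = matched.iter().map(|p| p.handler).collect();
        handlers.sort();
        handlers.dedup();
        if handlers.len() > 1 {
            return Some((raw, matched.iter().map(|p| p.name).collect()));
        }
    }
    None
}
```

Two overlapping patterns are only a problem when their handlers differ; if both instructions are grouped under the same handler, every raw value still resolves to exactly one handler and the scan reports nothing.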
Examples
| Project | Description |
|---|---|
| chipi-gekko | GameCube CPU & DSP disassembler (Rust) |
| chipi-gekko-cpp | GameCube CPU disassembler (C++) |
| gc-dsp-ida | GameCube DSP processor plugin for IDA Pro 9.x |
| gc-dsp-binja | GameCube DSP processor plugin for Binary Ninja |
| chipi-spec | Reusable .chipi specs |
| chipi-vscode | VS Code syntax highlighting for .chipi files |