chipi
A declarative instruction decoder generator. You define a CPU's encoding in
a portable .chipi DSL file. You then describe per-project codegen choices
in a *.bindings.chipi file. chipi produces decoders, disassemblers and
emulator dispatch code.
.chipi files are language-agnostic. They describe bit patterns, field
extractions and display formats. They contain no language-specific
information.
*.bindings.chipi files are project-specific. They pick which decoders to
lower, which backend to target (Rust, C++, IDA, Binary Ninja), and
per-target options such as output paths, type mappings and dispatch strategy.
Backends
| Backend | Output |
|---|---|
| rust | Decoder enum + decode() + Display. Optional emulator dispatch. |
| cpp | Single-header decoder with std::format. |
| ida | IDA Pro 9.x processor module (Python). |
| binja | Binary Ninja Architecture plugin (Python). |
The IDA and Binary Ninja outputs are experimental. They do not replace hand-written processor modules.
Architecture
*.chipi --------------+
                      |
*.bindings.chipi -----+--> bindings parser/lower --> codegen
                                                       |
                                                       +--> chipi-cli / chipi-build
Three crates:
| Crate | Purpose |
|---|---|
| chipi-core | Parser, IR, validation, bindings frontend, codegen backends. |
| chipi-cli | CLI: generate, check, explain, preview. |
| chipi-build | build.rs helper for Rust projects. |
Quick start
A .chipi instruction spec describes the encoding:
decoder Gekko {
    width = 32
    bit_order = msb0
}
addi [0:5]=001110 rd:u5[6:10] ra:u5[11:15] simm:s16[16:31]
    | "addi r{rd}, r{ra}, {simm}"
ori [0:5]=011000 rs:u5[6:10] ra:u5[11:15] uimm:u16[16:31]
    | "ori r{ra}, r{rs}, 0x{uimm:04x}"
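To make the bit ranges concrete, here is a hand-written sketch of what the two specs above describe. It is not chipi's generated output: the `Instr` enum, `bits_msb0`, `sign_extend` and `decode` are names invented for this illustration. The only facts taken from the spec are the MSB0 bit ranges, the opcode patterns and the display strings.

```rust
use std::fmt;

/// Extract the inclusive bit range [start:end] from a 32-bit word using
/// MSB0 numbering (bit 0 is the most significant bit).
fn bits_msb0(word: u32, start: u32, end: u32) -> u32 {
    let len = end - start + 1;
    let mask = if len == 32 { u32::MAX } else { (1u32 << len) - 1 };
    (word >> (31 - end)) & mask
}

/// Sign-extend the low `bits` bits of `value`.
fn sign_extend(value: u32, bits: u32) -> i32 {
    let shift = 32 - bits;
    ((value << shift) as i32) >> shift
}

/// Hypothetical instruction type; the real generated enum may differ.
#[derive(Debug, PartialEq)]
enum Instr {
    Addi { rd: u8, ra: u8, simm: i16 },
    Ori { rs: u8, ra: u8, uimm: u16 },
}

/// Match the fixed opcode bits [0:5], then extract the remaining fields.
fn decode(word: u32) -> Option<Instr> {
    let rt = bits_msb0(word, 6, 10) as u8; // rd for addi, rs for ori
    let ra = bits_msb0(word, 11, 15) as u8;
    let imm = bits_msb0(word, 16, 31);
    match bits_msb0(word, 0, 5) {
        0b001110 => Some(Instr::Addi { rd: rt, ra, simm: sign_extend(imm, 16) as i16 }),
        0b011000 => Some(Instr::Ori { rs: rt, ra, uimm: imm as u16 }),
        _ => None,
    }
}

/// The format strings mirror the display templates in the spec.
impl fmt::Display for Instr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Instr::Addi { rd, ra, simm } => write!(f, "addi r{rd}, r{ra}, {simm}"),
            Instr::Ori { rs, ra, uimm } => write!(f, "ori r{ra}, r{rs}, 0x{uimm:04x}"),
        }
    }
}
```

With these definitions, `decode(0x3861FFFF)` yields an `Addi` that displays as `addi r3, r1, -1`, and `decode(0x60A400FF)` displays as `ori r4, r5, 0x00ff`.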
A *.bindings.chipi file picks which decoder/dispatch to generate:
include "gekko.chipi"
target rust {
    decoder Gekko {
        output "$OUT_DIR/gekko.rs"
        type gpr = crate::cpu::Gpr
        type simm16 = i32
    }
    dispatch Gekko {
        output "$OUT_DIR/gekko_dispatch.rs"
        context crate::Cpu
        handlers crate::cpu::interpreter
        strategy fn_ptr_lut
        invalid_handler crate::cpu::interpreter::invalid
        instruction_type crate::cpu::Instruction {
            output "$OUT_DIR/gekko_instr.rs"
        }
        handler alu<const OP> {
            addi
            ori
        }
    }
}
Generate everything:
CLI
# Run all targets in the file.
# Validate without writing.
# Decode an opcode and print the match.
# Print the lowered configuration.
A bindings file may contain multiple targets. If --target is omitted in
that case, chipi reports an error. The same applies to --decoder when
more than one decoder or dispatch is reachable.
build.rs
Drive codegen from build.rs via chipi-build:
// build.rs
To select a single target or decoder:
generate_bindings_target?;
generate_bindings_decoder?;
chipi-build automatically emits cargo:rerun-if-changed for the bindings
file. It also emits one for every transitively included bindings file and one
for every included .chipi spec.
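For readers unfamiliar with the mechanism: `cargo:rerun-if-changed=<path>` is a standard Cargo build-script directive, printed to stdout, that tells Cargo to re-run build.rs when that file changes. The sketch below shows the general idea for the direct includes of a single file. It is not chipi-build's implementation; `include_paths` and `rerun_directives` are hypothetical helpers, and the line-based scan is a simplification that does not follow transitive includes.

```rust
/// Collect the paths named by `include "..."` lines in a bindings source.
/// Simplified line scan for illustration; a real parser would be stricter.
fn include_paths(source: &str) -> Vec<String> {
    source
        .lines()
        .filter_map(|line| {
            let rest = line.trim().strip_prefix("include")?.trim();
            let path = rest.strip_prefix('"')?.strip_suffix('"')?;
            Some(path.to_string())
        })
        .collect()
}

/// Format one Cargo directive per path; a build script would print these.
fn rerun_directives(paths: &[String]) -> Vec<String> {
    paths
        .iter()
        .map(|p| format!("cargo:rerun-if-changed={p}"))
        .collect()
}
```

Since chipi-build emits these directives itself, a build script using it should not need to do this by hand.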
Bindings reference
Targets
A bindings file contains one or more target <kind> { ... } blocks:
- target rust: Supports decoder and dispatch blocks.
- target cpp: Supports decoder blocks.
- target ida: Supports processor blocks.
- target binja: Supports architecture blocks.
include "*.chipi" brings in an instruction spec. include "*.bindings.chipi"
recursively merges another bindings file's targets.
The latter is useful for combining multiple CPUs in one project.
Rust decoders
target rust {
    decoder Gekko {
        output "$OUT_DIR/gekko.rs"
        type gpr = crate::cpu::Gpr
        type fpr = crate::cpu::Fpr
        type simm16 = i32
        subdecoder GekkoExt {
            output "$OUT_DIR/gekko_ext.rs"
        }
    }
}
Type aliases declared in the .chipi file map to the Rust paths listed in
type ... = .... Sub-decoder blocks share the same set of options.
Rust dispatches
target rust {
    dispatch Gekko {
        output "$OUT_DIR/gekko_dispatch.rs"
        context crate::Cpu
        handlers crate::cpu::interpreter
        strategy fn_ptr_lut
        invalid_handler crate::cpu::interpreter::invalid
        instruction_type crate::cpu::Instruction {
            output "$OUT_DIR/gekko_instr.rs"
        }
        handler alu<const OP> {
            addi
            ori
        }
    }
}
If no handler blocks are present, every instruction maps to a same-named
function under the handlers module path:
- addi -> crate::cpu::interpreter::addi
- ori -> crate::cpu::interpreter::ori
Each handler <name><const OP> { ... } block groups the listed
instructions under one const-generic handler:
- addi -> crate::cpu::interpreter::alu::<{ OP_ADDI }>
- ori -> crate::cpu::interpreter::alu::<{ OP_ORI }>
The OP_* constants are emitted into the generated dispatch file.
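The following sketch shows what a const-generic handler of this shape could look like on the user's side, together with the name-to-path mapping described above. Everything here is an assumption made for illustration: the `OP_*` values, the `Cpu` stand-in, the handler's parameter list, and the `handler_path` helper are not taken from chipi's generated code.

```rust
// Hypothetical values; chipi emits the real OP_* constants into the
// generated dispatch file.
const OP_ADDI: u32 = 0;
const OP_ORI: u32 = 1;

/// Stand-in for the `context` type (crate::Cpu in the bindings above).
struct Cpu {
    gpr: [u32; 32],
}

/// One body serves both instructions; `OP` is resolved at compile time,
/// so each instantiation keeps only its own match arm.
/// The parameter list is an assumption, not chipi's handler signature.
fn alu<const OP: u32>(cpu: &mut Cpu, dst: usize, src: usize, imm: u32) {
    cpu.gpr[dst] = match OP {
        OP_ADDI => cpu.gpr[src].wrapping_add(imm),
        OP_ORI => cpu.gpr[src] | imm,
        _ => unreachable!("unknown OP"),
    };
}

/// Resolve an instruction name to a handler path: grouped instructions go
/// to `module::group::<{ OP_NAME }>`, everything else to `module::name`.
fn handler_path(module: &str, instr: &str, groups: &[(&str, &[&str])]) -> String {
    for (group, members) in groups {
        if members.contains(&instr) {
            return format!("{module}::{group}::<{{ OP_{} }}>", instr.to_uppercase());
        }
    }
    format!("{module}::{instr}")
}
```

A call site then looks like `alu::<{ OP_ADDI }>(&mut cpu, 3, 1, imm)`. Grouping this way avoids writing near-identical handler functions while still letting the compiler specialize each one.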
Dispatch strategies
- fn_ptr_lut. Static [Handler; N] arrays per decision-tree branch.
- jump_table. One #[inline(always)] function with nested matches.
- flat_lut. Full-width function-pointer table indexed by raw decoder value.
- flat_match. Full-width match with adjacent equal handlers compressed into ranges.
flat_lut and flat_match enumerate the entire 2^width key space, so they
are best suited to small decoders or sub-decoders. chipi does not cap the
width for you: a 32-bit decoder means 2^32 keys, and chipi will generate
output of that size if asked. Check the decoder width before selecting
either strategy.
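As a rough illustration of the two flat strategies, here is a hand-written sketch for an imaginary 2-bit decoder. It is not chipi output: `State`, the three handlers, and the handler signature are assumptions made for this example. `flat_entries` simply computes the 2^width key count mentioned above.

```rust
/// Stand-in context type for this example.
struct State {
    acc: i32,
    invalid: u32,
}

type Handler = fn(&mut State);

fn inc(s: &mut State) { s.acc += 1; }
fn dec(s: &mut State) { s.acc -= 1; }
fn invalid(s: &mut State) { s.invalid += 1; }

/// flat_lut style: one table entry per raw value (2^2 = 4 entries).
static FLAT_LUT: [Handler; 4] = [inc, inc, dec, invalid];

fn dispatch_lut(s: &mut State, raw: u8) {
    FLAT_LUT[(raw & 0b11) as usize](s)
}

/// flat_match style: the same mapping, with equal neighbours folded
/// into a range arm.
fn dispatch_match(s: &mut State, raw: u8) {
    match raw & 0b11 {
        0..=1 => inc(s),
        2 => dec(s),
        _ => invalid(s),
    }
}

/// Number of keys a flat strategy enumerates (valid for width < 64).
fn flat_entries(width: u32) -> u64 {
    1u64 << width
}
```

`flat_entries(16)` is 65,536, which is manageable. `flat_entries(32)` is 4,294,967,296; with 8-byte function pointers on a 64-bit target, a table of that size would occupy 32 GiB.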
Subdecoder / subdispatch
target rust {
    decoder GcDsp {
        output "$OUT_DIR/dsp.rs"
        type reg5 = crate::dsp::Register
        subdecoder GcDspExt {
            output "$OUT_DIR/dsp_ext.rs"
        }
    }
    dispatch GcDsp {
        output "$OUT_DIR/dsp_dispatch.rs"
        context crate::dsp::Dsp
        handlers crate::dsp::interpreter
        strategy fn_ptr_lut
        invalid_handler crate::dsp::interpreter::invalid
        subdispatch GcDspExt {
            handlers crate::dsp::interpreter::ext
            strategy flat_lut
            invalid_handler crate::dsp::interpreter::invalid_ext
        }
    }
}
A subdispatch inherits context, strategy, and invalid_handler from
its parent unless overridden. handlers and instruction_type may also
be overridden.
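The inheritance rule can be pictured as a per-field fallback from child to parent. The sketch below models only the three options the text names as inherited; `DispatchOpts` and `resolve` are hypothetical names for this illustration and say nothing about how chipi represents bindings internally.

```rust
/// Hypothetical model of the inheritable dispatch options.
/// `None` means "not set in this block".
#[derive(Clone, Debug, Default, PartialEq)]
struct DispatchOpts {
    context: Option<String>,
    strategy: Option<String>,
    invalid_handler: Option<String>,
}

/// Each field takes the child's value if present, else the parent's.
fn resolve(parent: &DispatchOpts, child: &DispatchOpts) -> DispatchOpts {
    DispatchOpts {
        context: child.context.clone().or_else(|| parent.context.clone()),
        strategy: child.strategy.clone().or_else(|| parent.strategy.clone()),
        invalid_handler: child
            .invalid_handler
            .clone()
            .or_else(|| parent.invalid_handler.clone()),
    }
}
```

Applied to the GcDsp example above, the subdispatch would keep `crate::dsp::Dsp` as its context while using its own `flat_lut` strategy and `invalid_ext` handler.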
IDA processor
target ida {
    processor GcDsp {
        output "plugins/ida/dsp_proc.py"
        name "gcdsp"
        long_name "GameCube DSP"
        id 0x8002
        address_size 16
        bytes_per_unit 2
        registers {
            ar0
            ar1
            ar2
            ar3
            CS
            DS
        }
        segment_registers {
            CS
            DS
        }
        flow {
            calls {
                callcc
            }
            returns {
                retcc
            }
            stops {
                halt
            }
        }
    }
}
segment_registers must be a subset of registers. Instruction names in
flow must exist in the decoder.
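Both rules are subset checks: every name in one list must appear in another. A minimal sketch of such a check follows; `missing_from` is a hypothetical helper written for this illustration, not part of chipi.

```rust
use std::collections::HashSet;

/// Return the names in `subset` that do not appear in `superset`,
/// preserving their original order. An empty result means the rule holds.
fn missing_from<'a>(subset: &[&'a str], superset: &[&str]) -> Vec<&'a str> {
    let known: HashSet<&str> = superset.iter().copied().collect();
    subset
        .iter()
        .copied()
        .filter(|name| !known.contains(*name))
        .collect()
}
```

For the processor block above, checking `["CS", "DS"]` against the six declared registers returns an empty list. The same function could check the flow instruction names against the decoder's instruction list.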
Binary Ninja architecture
target binja {
    architecture GcDsp {
        output "plugins/binja/dsp_arch.py"
        name "gcdsp"
        address_size 2
        default_int_size 2
        endianness big
        registers {
            ar0
            ar1
            ar2
            ar3
        }
    }
}
endianness must be big or little.
Diagnostics
chipi reports validation errors with a span and, where a close match
exists, a "did you mean" suggestion. For example:
error: unknown instruction 'halttt' in handler group
--> test_grouped.bindings.chipi:14
= help: did you mean "halt"?
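Suggestions like this are commonly produced by picking the known name with the smallest edit distance to the unknown one. The source does not say which algorithm chipi uses, so the sketch below is only one plausible approach: `levenshtein`, `suggest`, and the `max_distance` cutoff are all assumptions for illustration.

```rust
/// Classic Levenshtein edit distance, computed row by row.
fn levenshtein(a: &str, b: &str) -> usize {
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitute = prev[j] + if ca == *cb { 0 } else { 1 };
            let delete = prev[j + 1] + 1;
            let insert = cur[j] + 1;
            let best = substitute.min(delete).min(insert);
            cur.push(best);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Return the closest known name within `max_distance`, if any.
fn suggest<'a>(unknown: &str, known: &[&'a str], max_distance: usize) -> Option<&'a str> {
    known
        .iter()
        .copied()
        .map(|name| (levenshtein(unknown, name), name))
        .filter(|(distance, _)| *distance <= max_distance)
        .min_by_key(|(distance, _)| *distance)
        .map(|(_, name)| name)
}
```

With a cutoff of 2, `suggest("halttt", &["callcc", "retcc", "halt"], 2)` returns `Some("halt")`, while a name with no close match returns `None` and the help line would be omitted.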
flat_lut / flat_match ambiguity reports each conflicting instruction.
This happens when one raw value matches two distinct handlers:
error: flat dispatch cannot resolve raw opcode 0x0000007c
matched instructions:
add -> crate::cpu::interpreter::add
addx -> crate::cpu::interpreter::addx
flat dispatch requires each raw value to resolve to exactly one handler.
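One way to picture this check is to enumerate every raw value and collect the distinct handlers whose pattern matches it. The sketch below does that for a tiny decoder. The mask/value `Pattern` representation and the `flat_conflicts` function are hypothetical and are not chipi's internal data model.

```rust
/// Hypothetical encoding pattern: a raw value matches when
/// `(raw & mask) == value`.
struct Pattern {
    handler: &'static str,
    mask: u32,
    value: u32,
}

/// Enumerate all 2^width raw values (width < 32 assumed) and report those
/// that match more than one distinct handler.
fn flat_conflicts(patterns: &[Pattern], width: u32) -> Vec<(u32, Vec<&'static str>)> {
    let mut conflicts = Vec::new();
    for raw in 0..(1u32 << width) {
        let mut hits: Vec<&'static str> = patterns
            .iter()
            .filter(|p| (raw & p.mask) == p.value)
            .map(|p| p.handler)
            .collect();
        // Two instructions sharing one handler are not a conflict.
        hits.sort_unstable();
        hits.dedup();
        if hits.len() > 1 {
            conflicts.push((raw, hits));
        }
    }
    conflicts
}
```

In a 4-bit example where `add` matches `01xx` and `addx` matches `0111`, only the raw value `0b0111` is reported, with both handlers listed.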
Examples
| Project | Description |
|---|---|
| chipi-gekko | GameCube CPU & DSP disassembler (Rust) |
| chipi-gekko-cpp | GameCube CPU disassembler (C++) |
| gc-dsp-ida | GameCube DSP processor plugin for IDA Pro 9.x |
| gc-dsp-binja | GameCube DSP processor plugin for Binary Ninja |
| chipi-spec | Reusable .chipi specs |
| chipi-vscode | VS Code syntax highlighting for .chipi files |