# chipi
A declarative instruction decoder generator using a custom DSL. Define your CPU's instruction encoding in a `.chipi` file, and chipi generates a decoder and disassembler for you. Extracted fields map seamlessly onto Rust types.
An example disassembler for the GameCube CPU and DSP can be found [here](https://github.com/ioncodes/chipi-gekko).
## Usage
Add to your `Cargo.toml`:
```toml
[build-dependencies]
chipi = "0.3.0"
```
In `build.rs`:
```rs
use std::env;
use std::path::PathBuf;

fn main() {
    let out_dir = PathBuf::from(env::var("OUT_DIR").unwrap());
    chipi::generate("ppc.chipi", out_dir.join("ppc.rs").to_str().unwrap())
        .expect("failed to generate decoder");
    println!("cargo:rerun-if-changed=ppc.chipi");
}
```
Then use it:
```rs
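mod ppc {
    include!(concat!(env!("OUT_DIR"), "/ppc.rs"));
}

// Inside a function: `data: &[u8]` holds the code to disassemble.
let mut offset = 0;
while offset + 4 <= data.len() {
    match ppc::PpcInstruction::decode(&data[offset..]) {
        Some((instr, bytes)) => {
            println!("{}", instr); // uses generated Display impl
            offset += bytes;
        }
        None => {
            // Unknown encoding: print the raw big-endian word and skip it
            let raw = u32::from_be_bytes([
                data[offset], data[offset + 1], data[offset + 2], data[offset + 3],
            ]);
            println!(".long {:#010x}", raw);
            offset += 4;
        }
    }
}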
```
The generated `decode()` has the signature `pub fn decode(data: &[u8]) -> Option<(Self, usize)>`, where `usize` is the number of bytes consumed.
The generated `Display` impl uses format lines defined in the DSL. You can also override formatting per-instruction by implementing the generated trait:
```rs
struct MyFormat;

impl ppc::PpcFormat for MyFormat {
    fn fmt_bx(li: i32, aa: bool, lk: bool, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(f, "BRANCH {:#x}", li)
    }
}

// Use custom formatting
println!("{}", instr.display::<MyFormat>());
```
## DSL
Create a `.chipi` file describing your instruction set:
```chipi
import crate::cpu::Register
decoder Ppc {
width = 32
bit_order = msb0
endian = big
}
type reg = u8 as Register
type simm16 = i32 { sign_extend(16) }
type simm24 = i32 { sign_extend(24), shift_left(2) }
# Branch
bx [0:5]=010010 li:simm24[6:29] aa:bool[30] lk:bool[31]
| "b{lk ? l}{aa ? a} {li:#x}"
# Arithmetic
addi [0:5]=001110 rd:reg[6:10] ra:reg[11:15] simm:simm16[16:31]
| "addi {rd}, {ra}, {simm}"
```
- `decoder` block sets the instruction width (in bits), bit ordering, and byte endianness
- Each line defines an instruction: a name, fixed-bit patterns for matching, and named fields to extract
- Fields have a name, a type (`u8`, `u16`, ...), and a bit range
- Fixed bits use `[range]=value` syntax
  - Use `0` or `1` for bits that must match exactly
  - Use `?` for wildcard bits
- Format lines start with `|` and define disassembly output
- Comments start with `#`
### Bit ordering
- `msb0`: position 0 is the most significant bit
- `lsb0`: position 0 is the least significant bit
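
To make the two conventions concrete, here is a minimal Rust sketch of how a bit range could be pulled out of a word under each ordering. The helpers `extract_msb0` and `extract_lsb0` are hypothetical names used only for illustration; they are not part of chipi's generated API, and the generated code may extract fields differently.

```rust
/// Extract bits [start:end] (inclusive) from `word`, where position 0 is the
/// most significant bit of a `width`-bit unit (msb0 numbering).
/// Hypothetical helper, assuming start <= end < width <= 32.
fn extract_msb0(word: u32, width: u32, start: u32, end: u32) -> u32 {
    let len = end - start + 1;
    // In msb0, position `end` sits `width - 1 - end` bits above the LSB.
    let shift = width - 1 - end;
    let mask = if len == 32 { u32::MAX } else { (1u32 << len) - 1 };
    (word >> shift) & mask
}

/// Extract bits [hi:lo] (inclusive) from `word`, where position 0 is the
/// least significant bit (lsb0 numbering).
/// Hypothetical helper, assuming lo <= hi < 32.
fn extract_lsb0(word: u32, hi: u32, lo: u32) -> u32 {
    let len = hi - lo + 1;
    let mask = if len == 32 { u32::MAX } else { (1u32 << len) - 1 };
    (word >> lo) & mask
}
```

With these helpers, `extract_msb0(0x4800_0100, 32, 0, 5)` yields `0b010010`, the opcode pattern of the `bx` instruction shown earlier, while `extract_lsb0(0x8C12, 15, 8)` yields `0x8C`.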
### Wildcard bits
Use `?` in fixed bit patterns to indicate "don't care" bits that can be any value. This is useful when certain bit positions are reserved, unused, or ignored by the hardware:
```chipi
# Match when bits [15:8] are 0x8c, bits [7:0] can be anything
clr15 [15:0]=10001100????????
| "CLR15"
# Mix wildcards with specific bits
nop [7:4]=0000 [3:0]=????
| "nop"
```
Wildcard bits are excluded from the matching mask, meaning instructions will match regardless of the values in those positions. This allows you to accurately represent instruction encodings where certain bits are architecturally undefined or reserved.
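
The mask-based matching described above can be sketched in Rust as follows. This is an illustration under assumptions, not chipi's actual implementation: `pattern_to_mask_value` and `matches` are hypothetical names, and the pattern is assumed to be at most 32 characters long.

```rust
/// Convert a fixed-bit pattern such as "10001100????????" into a
/// (mask, value) pair. Wildcard positions get a 0 in the mask, so they
/// never influence the comparison.
/// Hypothetical helper; returns None for characters other than 0, 1, or ?.
fn pattern_to_mask_value(pattern: &str) -> Option<(u32, u32)> {
    let mut mask = 0u32;
    let mut value = 0u32;
    for c in pattern.chars() {
        mask <<= 1;
        value <<= 1;
        match c {
            '0' => mask |= 1,
            '1' => {
                mask |= 1;
                value |= 1;
            }
            '?' => {} // don't care: leave both bits cleared
            _ => return None,
        }
    }
    Some((mask, value))
}

/// A word matches when all non-wildcard bits equal the expected value.
fn matches(word: u32, mask: u32, value: u32) -> bool {
    (word & mask) == value
}
```

For the `clr15` pattern this gives a mask of `0xFF00` and a value of `0x8C00`, so both `0x8C00` and `0x8CFF` match while `0x8D00` does not.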
### Variable-Length Instructions
chipi supports variable-length instructions. When a bit position exceeds `width - 1`, it implicitly references subsequent units (1 unit = `width` bits). The unit index is automatically computed as `bit_position / width`.
```chipi
decoder GcDsp {
width = 16
bit_order = msb0
endian = big # byte endianness (big or little, default: big)
max_units = 2 # optional safety check
}
# 1-unit instruction: all bits within [0:15]
nop [0:15]=0000000000000000
| "nop"
# 2-unit instruction: bits [16:31] are in the second unit
lri [0:10]=00000010000 rd:u5[11:15] imm:u16[16:31]
| "lri r{rd}, #0x{imm:04x}"
call [0:15]=0000001010111111 addr:u16[16:31]
| "call 0x{addr:04x}"
```
The generated decoder always accepts `&[u8]` and returns bytes consumed. For single-unit instructions, it returns `width / 8` bytes; for multi-unit instructions, it returns `unit_count * (width / 8)` bytes.
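
The unit and byte arithmetic above can be written out as a short Rust sketch. The function names are hypothetical and chosen for illustration only; the sketch assumes `width` is a non-zero multiple of 8.

```rust
/// Index of the unit containing `bit_position` (1 unit = `width` bits).
fn unit_index(bit_position: u32, width: u32) -> u32 {
    bit_position / width
}

/// Bytes consumed by an instruction whose highest referenced bit position
/// is `max_bit`: unit_count * (width / 8).
fn bytes_consumed(max_bit: u32, width: u32) -> u32 {
    let unit_count = unit_index(max_bit, width) + 1;
    unit_count * (width / 8)
}
```

With `width = 16`, `nop` (highest bit 15) consumes 2 bytes, while `lri` and `call` (highest bit 31) consume 4 bytes.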
### Optional Safety Guard: `max_units`
The `max_units` decoder option acts as a compile-time safety net:
```chipi
decoder GcDsp {
width = 16
bit_order = msb0
endian = big
max_units = 2 # enforce maximum instruction length
}
```
It checks at compile time that no bit range extends past `max_units * width`, which helps catch typos in bit positions.
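
Conceptually, the check amounts to a bounds test like the one sketched below. This is an assumption about the rule's shape, not chipi's actual code: `check_bit_range` is a hypothetical name, and the sketch assumes the highest valid bit position is `max_units * width - 1`.

```rust
/// Reject a bit position that falls outside `max_units` units of `width` bits.
/// Hypothetical helper; valid positions are 0..max_units * width.
fn check_bit_range(end_bit: u32, width: u32, max_units: u32) -> Result<(), String> {
    let limit = max_units * width;
    if end_bit < limit {
        Ok(())
    } else {
        Err(format!(
            "bit position {} exceeds the limit of {} bits ({} units of {} bits)",
            end_bit, limit, max_units, width
        ))
    }
}
```

Under this reading, with `width = 16` and `max_units = 2`, a field ending at bit 31 is accepted, while one ending at bit 32 (for example, a mistyped `[16:32]`) is rejected.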
### Custom types
Use `type` to create type aliases with optional transformations or wrappers:
```chipi
# Simple alias
type byte = u8
# With transformation
type simm16 = i32 { sign_extend(16) }
# With custom wrapper (must be imported)
type reg = u8 as Register
# Multiple transformations (comma-separated)
type addr = u32 { shift_left(2), zero_extend(32) }
# With display format hint
type simm16 = i32 { sign_extend(16), display(signed_hex) }
type uimm = u16 { display(hex) }
```
**Builtin types:**
- `bool`: Converts extracted bit to `bool`
- `u1` through `u7`: Extracted as `u8` in Rust
- `u8`, `u16`, `u32`: Unsigned integer types
- `i8`, `i16`, `i32`: Signed integer types
**Builtin transformations:**
- `sign_extend(n)`: Sign-extends the extracted value from n bits
- `zero_extend(n)`: Zero-extends the extracted value from n bits
- `shift_left(n)`: Shifts the value left by n bits
**Display formats:**
- `display(signed_hex)`: Formats as signed hex (`0x1A`, `-0x1A`, `0`)
- `display(hex)`: Formats as unsigned hex (`0x1A`, `0`)
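
As a rough illustration of what these transformations and display hints compute, here is a Rust sketch. The function names mirror the DSL keywords but are hypothetical stand-ins, not chipi's generated code, and the sketch assumes transformations are applied in the order they are listed.

```rust
/// Sign-extend the low `bits` bits of `value` to a full i32.
/// Assumes 1 <= bits <= 32.
fn sign_extend(value: u32, bits: u32) -> i32 {
    let shift = 32 - bits;
    // Move the field's sign bit to bit 31, then shift back arithmetically.
    ((value << shift) as i32) >> shift
}

/// Equivalent of `type simm24 = i32 { sign_extend(24), shift_left(2) }`,
/// assuming left-to-right application.
fn simm24(raw: u32) -> i32 {
    sign_extend(raw, 24) << 2
}

/// Equivalent of `display(signed_hex)`: `0x1A`, `-0x1A`, or `0`.
fn signed_hex(v: i32) -> String {
    match v {
        0 => "0".to_string(),
        v if v < 0 => format!("-0x{:X}", v.unsigned_abs()),
        v => format!("0x{:X}", v),
    }
}
```

For example, `sign_extend(0xFFFF, 16)` yields `-1`, `simm24(0x40)` yields `0x100`, and `signed_hex(-26)` yields `-0x1A`.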
### Format lines
Format lines follow an instruction definition and control how it is displayed. They start with `|` and contain a quoted format string:
```chipi
bx [0:5]=010010 li:simm24[6:29] aa:bool[30] lk:bool[31]
| "b{lk ? l}{aa ? a} {li:#x}"
```
This produces `b 0x100`, `bl 0x100`, `ba 0x100`, or `bla 0x100` depending on the flag fields.
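
One way to read this format line is as the hand-written Rust function sketched below. It is only an approximation for illustration: the real generated method writes into a `std::fmt::Formatter` rather than returning a `String`, and its exact body is not shown in this README.

```rust
/// Hand-written equivalent of the format line
/// `"b{lk ? l}{aa ? a} {li:#x}"` (illustrative sketch only).
fn fmt_bx(li: i32, aa: bool, lk: bool) -> String {
    format!(
        "b{}{} {:#x}",
        if lk { "l" } else { "" }, // {lk ? l}
        if aa { "a" } else { "" }, // {aa ? a}
        li                         // {li:#x}
    )
}
```

Calling `fmt_bx(0x100, false, true)` returns `bl 0x100`, and `fmt_bx(0x100, true, true)` returns `bla 0x100`.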
**Field references:** `{field}` inserts a field value. Add a format specifier with `{field:#x}` (hex), `{field:#b}` (binary), etc.
**Ternary expressions:** `{field ? text}` emits `text` if the field is nonzero, nothing otherwise. `{field ? yes : no}` provides an else branch.
**Arithmetic:** `{a + b * 4}` evaluates inline arithmetic (`+`, `-`, `*`, `/`, `%`).
**Unary negation:** `{-field}` negates a field value.
```chipi
addi [0:5]=001110 rd:reg[6:10] ra:reg[11:15] simm:simm16[16:31]
| simm < 0 : "subi {rd}, {ra}, {-simm}"
| "addi {rd}, {ra}, {simm}"
```
**Builtin functions:** `{rotate_right(val, amt)}` and `{rotate_left(val, amt)}`.
**Guards:** Multiple format lines can be used with guard conditions to select different output based on field values:
```chipi
addi [0:5]=001110 rd:reg[6:10] ra:reg[11:15] simm:simm16[16:31]
| ra == 0: "li {rd}, {simm}"
| "addi {rd}, {ra}, {simm}"
```
Guard conditions support `==`, `!=`, `<`, `<=`, `>`, `>=` and can be joined with `,` or `&&`. Guard operands can be field names, integer literals, or arithmetic expressions (`sh == 32 - mb`). The last format line may omit the guard (acts as the default).
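
Guarded format lines behave like an if/else chain evaluated top to bottom. The Rust sketch below mirrors the `addi` example; it is illustrative only, uses plain `u8` register numbers instead of the `Register` wrapper, and assumes the first matching guard wins.

```rust
/// Hand-written equivalent of the guarded `addi` format lines
/// (illustrative sketch; registers are plain u8 values here).
fn fmt_addi(rd: u8, ra: u8, simm: i32) -> String {
    if ra == 0 {
        // | ra == 0: "li {rd}, {simm}"
        format!("li {}, {}", rd, simm)
    } else {
        // | "addi {rd}, {ra}, {simm}" (unguarded default)
        format!("addi {}, {}, {}", rd, ra, simm)
    }
}
```

Here `fmt_addi(3, 0, 5)` returns `li 3, 5`, and `fmt_addi(3, 1, 5)` returns `addi 3, 1, 5`.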
**Escapes:** Use `\{`, `\}`, `\?`, `\:` to emit literal characters.
### Maps
Maps define lookup tables for use in format strings:
```chipi
map spr_name(spr) {
1 => "xer"
8 => "lr"
9 => "ctr"
_ => "???"
}
mtspr [0:5]=011111 rs:reg[6:10] spr:u16[11:20] [21:30]=0111010011 [31]=0
| "mtspr {spr_name(spr)}, {rs}"
```
Map parameters can also use `{param}` interpolation in the output, in which case the map returns a `String` instead of `&'static str`:
```chipi
map ea(mode, reg) {
0, _ => "d{reg}"
1, _ => "a{reg}"
_ => "???"
}
```
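
In Rust terms, the two maps above correspond roughly to the `match`-based functions sketched below. The parameter types (`u16`, `u8`) are assumptions made for illustration, and the generated code may look different.

```rust
/// Rough equivalent of `map spr_name(spr)`: every arm is a plain literal,
/// so a `&'static str` can be returned.
fn spr_name(spr: u16) -> &'static str {
    match spr {
        1 => "xer",
        8 => "lr",
        9 => "ctr",
        _ => "???",
    }
}

/// Rough equivalent of `map ea(mode, reg)`: the arms interpolate `reg`,
/// so an owned `String` is returned instead.
fn ea(mode: u8, reg: u8) -> String {
    match mode {
        0 => format!("d{}", reg),
        1 => format!("a{}", reg),
        _ => "???".to_string(),
    }
}
```

For instance, `spr_name(8)` returns `lr`, and `ea(1, 3)` returns `a3`.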
### Formatting trait
chipi generates a trait (e.g. `PpcFormat`) with one method per instruction. Each method has a default implementation from the format lines. To override specific instructions, implement the trait on your own struct:
```rs
struct MyFormat;

impl ppc::PpcFormat for MyFormat {
    // Override just this one; all others keep their defaults
    fn fmt_addi(rd: &Register, ra: &Register, simm: i32,
                f: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(f, "ADDI r{}, r{}, {}", rd, ra, simm)
    }
}

println!("{}", instr.display::<MyFormat>());
```
Instructions without format lines get a raw fallback: `instr_name field1, field2, ...`.
## Syntax Highlighting
Syntax highlighting for VS Code: [chipi-vscode](https://github.com/ioncodes/chipi-vscode)