unarm
Parser for the ARM instruction set, inspired by powerpc-rs. It currently supports the following versions:
- ARMv4
- ARMv4T
- ARMv5T
- ARMv5TE
- ARMv5TEJ
- ARMv6
- ARMv6K
It also supports these extensions:
- VFPv2
About
- Most of the parser is generated from isa.yaml by the /generator/ module.
- It accepts all 2^32 possible ARM instructions and 2^16 Thumb instructions (plus the 4-byte instructions) without errors.
- No promises that the output is 100% correct.
- Some illegal instructions may not be parsed as illegal.
- Some instructions may not stringify correctly.
- (more, probably)
Performance
Tested on all 2^32 ARM and Thumb instructions in ARMv6K using the /fuzz/ module on a single thread.
AMD Ryzen 7 7700X:
| Test | Duration | Throughput |
|---|---|---|
| Parse ARM | 27.11s | ~604 MB/s |
| Parse and stringify ARM | 379.15s | ~43 MB/s |
| Parse Thumb | 20.89s | ~392 MB/s |
| Parse and stringify Thumb | 305.58s | ~27 MB/s |
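As a rough cross-check of the table, the throughput column can be reproduced from the durations. The sketch below assumes that each of the 2^32 test inputs counts as 4 bytes for ARM and 2 bytes for Thumb, and that the unit is 2^20 bytes per second. Both assumptions are inferred from the numbers, not stated by the benchmark itself, and the function name is made up for illustration.

```rust
/// Throughput in units of 2^20 bytes per second for `count` encodings of
/// `bytes_each` bytes processed in `seconds`.
/// (Hypothetical helper, not part of unarm.)
fn throughput_mib_per_s(count: u64, bytes_each: u64, seconds: f64) -> f64 {
    let total_bytes = (count * bytes_each) as f64;
    total_bytes / (1024.0 * 1024.0) / seconds
}

fn main() {
    // Assumption: every test run covers 2^32 inputs.
    let count = 1u64 << 32;

    // Durations taken from the table above.
    let rows = [
        ("Parse ARM", 4, 27.11),
        ("Parse and stringify ARM", 4, 379.15),
        ("Parse Thumb", 2, 20.89),
        ("Parse and stringify Thumb", 2, 305.58),
    ];

    for (name, bytes_each, seconds) in rows {
        let rate = throughput_mib_per_s(count, bytes_each, seconds);
        println!("{name}: ~{rate:.0} per second (in units of 2^20 bytes)");
    }
}
```

Under these assumptions the computed rates round to the values in the table, which suggests the Thumb rows are measured over 2-byte inputs.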
Usage
Parsing one instruction
use *;
let pc = 0;
let options = default;
let ins = parse_arm;
assert_eq!;
assert_eq!;
4-byte Thumb instructions
Some instructions in Thumb are 4 bytes long instead of 2. To parse them properly, put the second
half of the instruction in the upper half of the code passed to parse_thumb.
let first = 0xf099;
let second = 0xe866;
let = parse_thumb;
You can do this for 2-byte instructions as well by passing two consecutive instructions in the same way. parse_thumb returns both the parsed instruction and its size, so you can tell if only one or both of the 16-bit words were parsed.
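The packing and the size-driven stepping described above can be sketched without the crate. In this example, pack_thumb, toy_size and walk are hypothetical helpers, and toy_size is a deliberately simplified stand-in for the size that parse_thumb reports: it only recognises the classic BL/BLX halfword pair and is not unarm's decoding logic.

```rust
/// Packs two consecutive 16-bit halfwords into one 32-bit code word:
/// the first halfword goes in the lower half, the second in the upper half.
fn pack_thumb(first: u16, second: u16) -> u32 {
    u32::from(first) | (u32::from(second) << 16)
}

/// Simplified stand-in for the size a real parser would report (in bytes).
/// Assumption: only a BL/BLX prefix followed by its suffix counts as 4 bytes.
fn toy_size(code: u32) -> usize {
    let first_top5 = (code >> 11) & 0x1f;
    let second_top5 = (code >> 27) & 0x1f;
    if first_top5 == 0b11110 && (second_top5 == 0b11111 || second_top5 == 0b11101) {
        4
    } else {
        2
    }
}

/// Walks a halfword stream, always passing two halfwords to `size_of` and
/// advancing by however many bytes it says were consumed.
/// Returns (byte offset, size in bytes) for each instruction.
fn walk(halfwords: &[u16], size_of: impl Fn(u32) -> usize) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < halfwords.len() {
        let first = halfwords[i];
        // Pad with zero when there is no following halfword.
        let second = halfwords.get(i + 1).copied().unwrap_or(0);
        let size = size_of(pack_thumb(first, second));
        out.push((i * 2, size));
        i += size / 2;
    }
    out
}

fn main() {
    assert_eq!(pack_thumb(0xf099, 0xe866), 0xe866_f099);

    // A 2-byte instruction, the 4-byte pair from the example, another 2-byte one.
    let stream = [0x4770, 0xf099, 0xe866, 0x4770];
    for (offset, size) in walk(&stream, toy_size) {
        println!("offset {offset}: {size} bytes");
    }
}
```

Always passing the following halfword keeps the calling code uniform: the caller does not need to know in advance whether an instruction is 2 or 4 bytes long, and only the reported size decides how far to advance.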
The FormatIns trait
The FormatIns trait is used for formatting an instruction. You can implement this trait yourself to
customize the formatted output. Here is an example:
// Write is a supertrait of FormatIns
let mut formatter = MyFormatter ;
let ins = parse_arm;
formatter.write_ins.unwrap;
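The general pattern behind such a trait, a formatting trait with Write as its supertrait and overridable default methods, can be sketched in isolation. Everything in this example (ToyIns, ToyFormat, AliasFormatter and their methods) is a hypothetical stand-in written for illustration; it does not reproduce the real FormatIns trait or its method signatures.

```rust
use std::fmt::{self, Write};

/// Hypothetical, heavily simplified instruction: a mnemonic plus register operands.
struct ToyIns {
    mnemonic: &'static str,
    regs: Vec<u8>,
}

/// Hypothetical formatting trait. Because `Write` is a supertrait, the default
/// methods can emit text through `write_str` without knowing the concrete sink.
trait ToyFormat: Write {
    /// Default register formatting; implementors may override this.
    fn write_reg(&mut self, reg: u8) -> fmt::Result {
        self.write_str(&format!("r{}", reg))
    }

    /// Writes the mnemonic followed by comma-separated register operands.
    fn write_ins(&mut self, ins: &ToyIns) -> fmt::Result {
        self.write_str(ins.mnemonic)?;
        for (i, reg) in ins.regs.iter().enumerate() {
            self.write_str(if i == 0 { " " } else { ", " })?;
            self.write_reg(*reg)?;
        }
        Ok(())
    }
}

/// Custom formatter that collects output in a String and prints
/// r13, r14 and r15 under their conventional aliases.
struct AliasFormatter {
    out: String,
}

impl Write for AliasFormatter {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.out.push_str(s);
        Ok(())
    }
}

impl ToyFormat for AliasFormatter {
    fn write_reg(&mut self, reg: u8) -> fmt::Result {
        match reg {
            13 => self.write_str("sp"),
            14 => self.write_str("lr"),
            15 => self.write_str("pc"),
            _ => self.write_str(&format!("r{}", reg)),
        }
    }
}

/// Convenience wrapper used by the example below.
fn format_with_aliases(ins: &ToyIns) -> String {
    let mut formatter = AliasFormatter { out: String::new() };
    formatter.write_ins(ins).unwrap();
    formatter.out
}

fn main() {
    let ins = ToyIns { mnemonic: "mov", regs: vec![0, 13] };
    println!("{}", format_with_aliases(&ins));
}
```

Overriding a single method such as write_reg changes how registers appear everywhere, while the default write_ins keeps handling the overall layout.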