Crate iced_x86
iced-x86 is a high performance and correct x86 (16/32/64-bit) instruction decoder, disassembler and assembler written in Rust.
It can be used for static analysis of x86/x64 binaries, to rewrite code (eg. remove garbage instructions), to relocate code or as a disassembler.
- ✔️Supports all Intel and AMD instructions
- ✔️Correct: All instructions are tested and iced has been tested against other disassemblers/assemblers (xed, gas, objdump, masm, dumpbin, nasm, ndisasm) and fuzzed
- ✔️100% Rust code
- ✔️The formatter supports masm, nasm, gas (AT&T), Intel (XED) and there are many options to customize the output
- ✔️The decoder is 4x+ faster than other similar libraries and doesn't allocate any memory
- ✔️Small decoded instructions, only 32 bytes
- ✔️The encoder can be used to re-encode decoded instructions at any address
- ✔️API to get instruction info, eg. read/written registers, memory and rflags bits; CPUID feature flag, flow control info, etc
- ✔️Supports #![no_std] and WebAssembly
- ✔️Supports rustc 1.20.0 or later
- ✔️Few dependencies (static_assertions and lazy_static)
- ✔️License: MIT
Usage
Add this to your Cargo.toml:
[dependencies]
iced-x86 = "1.9.0"
Or to customize which features to use:
[dependencies.iced-x86]
version = "1.9.0"
default-features = false
# See below for all features
features = ["std", "decoder", "masm"]
If you're using Rust 2015 edition you must also add this to your lib.rs or main.rs:
extern crate iced_x86;
Crate feature flags
You can enable/disable these in your Cargo.toml file.
- decoder: (✔️Enabled by default) Enables the decoder
- encoder: (✔️Enabled by default) Enables the encoder
- block_encoder: (✔️Enabled by default) Enables the BlockEncoder. This feature enables encoder
- op_code_info: (✔️Enabled by default) Enables getting instruction metadata (OpCodeInfo). This feature enables encoder
- instr_info: (✔️Enabled by default) Enables the instruction info code
- gas: (✔️Enabled by default) Enables the GNU Assembler (AT&T) formatter
- intel: (✔️Enabled by default) Enables the Intel (XED) formatter
- masm: (✔️Enabled by default) Enables the masm formatter
- nasm: (✔️Enabled by default) Enables the nasm formatter
- fast_fmt: (✔️Enabled by default) Enables FastFormatter (masm syntax) which is ~1.9x faster than the other formatters (the time includes decoding + formatting). Use it if formatting speed is more important than being able to re-assemble formatted instructions or if targeting wasm (this formatter uses less code).
- db: Enables creating db, dw, dd, dq instructions. It's not enabled by default because it's possible to store up to 16 bytes in the instruction and then use another method to read an enum value.
- std: (✔️Enabled by default) Enables the std crate. std or no_std must be defined, but not both.
- no_std: Enables #![no_std]. std or no_std must be defined, but not both. This feature uses the alloc crate (rustc 1.36.0+) and the hashbrown crate.
- exhaustive_enums: Enables exhaustive enums, i.e., no enum has the #[non_exhaustive] attribute
- no_vex: Disables all VEX instructions. See below for more info.
- no_evex: Disables all EVEX instructions. See below for more info.
- no_xop: Disables all XOP instructions. See below for more info.
- no_d3now: Disables all 3DNow! instructions. See below for more info.
If you use no_vex, no_evex, no_xop or no_d3now, you should run the generator again (before building iced) to generate even smaller output.
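As a sketch of how the flags above combine, a no_std build that only needs the decoder and the nasm formatter might be configured as shown here. The particular feature selection is an illustrative assumption, not a recommended set; the only hard rule taken from the list above is that exactly one of std or no_std is enabled.

```toml
[dependencies.iced-x86]
version = "1.9.0"
# Turn off the default set (which includes "std") so "no_std" can be chosen instead
default-features = false
# Exactly one of "std" / "no_std" must be enabled
features = ["no_std", "decoder", "nasm"]
```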
How-tos
- Disassemble (decode and format instructions)
- Create and encode instructions
- Disassemble with a symbol resolver
- Disassemble with colorized text
- Move code in memory (eg. hook a function)
- Get instruction info, eg. read/written regs/mem, control flow info, etc
- Get the virtual address of a memory operand
- Disassemble old/deprecated CPU instructions
Disassemble (decode and format instructions)
This example uses a Decoder and one of the Formatters to decode and format the code, eg. GasFormatter, IntelFormatter, MasmFormatter, NasmFormatter, FastFormatter.
use iced_x86::{Decoder, DecoderOptions, Formatter, Instruction, NasmFormatter};

/*
This method produces the following output:
00007FFAC46ACDA4 48895C2410           mov       [rsp+10h],rbx
00007FFAC46ACDA9 4889742418           mov       [rsp+18h],rsi
00007FFAC46ACDAE 55                   push      rbp
00007FFAC46ACDAF 57                   push      rdi
00007FFAC46ACDB0 4156                 push      r14
00007FFAC46ACDB2 488DAC2400FFFFFF     lea       rbp,[rsp-100h]
00007FFAC46ACDBA 4881EC00020000       sub       rsp,200h
00007FFAC46ACDC1 488B0518570A00       mov       rax,[rel 7FFA`C475`24E0h]
00007FFAC46ACDC8 4833C4               xor       rax,rsp
00007FFAC46ACDCB 488985F0000000       mov       [rbp+0F0h],rax
00007FFAC46ACDD2 4C8B052F240A00       mov       r8,[rel 7FFA`C474`F208h]
00007FFAC46ACDD9 488D05787C0400       lea       rax,[rel 7FFA`C46F`4A58h]
00007FFAC46ACDE0 33FF                 xor       edi,edi
*/
pub(crate) fn how_to_disassemble() {
    let bytes = EXAMPLE_CODE;
    let mut decoder = Decoder::new(EXAMPLE_CODE_BITNESS, bytes, DecoderOptions::NONE);
    decoder.set_ip(EXAMPLE_CODE_RIP);

    // Formatters: Masm*, Nasm*, Gas* (AT&T) and Intel* (XED).
    // There's also `FastFormatter` which is ~1.9x faster. Use it if formatting speed is more
    // important than being able to re-assemble formatted instructions.
    let mut formatter = NasmFormatter::new();

    // Change some options, there are many more
    formatter.options_mut().set_digit_separator("`");
    formatter.options_mut().set_first_operand_char_index(10);

    // String implements FormatterOutput
    let mut output = String::new();

    // Initialize this outside the loop because decode_out() writes to every field
    let mut instruction = Instruction::default();

    // The decoder also implements Iterator/IntoIterator so you could use a for loop:
    //      for instruction in &mut decoder { /* ... */ }
    // or collect():
    //      let instructions: Vec<_> = decoder.into_iter().collect();
    // but can_decode()/decode_out() is a little faster:
    while decoder.can_decode() {
        // There's also a decode() method that returns an instruction but that also
        // means it copies an instruction (32 bytes):
        //     instruction = decoder.decode();
        decoder.decode_out(&mut instruction);

        // Format the instruction ("disassemble" it)
        output.clear();
        formatter.format(&instruction, &mut output);

        // Eg. "00007FFAC46ACDB2 488DAC2400FFFFFF     lea       rbp,[rsp-100h]"
        print!("{:016X} ", instruction.ip());
        let start_index = (instruction.ip() - EXAMPLE_CODE_RIP) as usize;
        let instr_bytes = &bytes[start_index..start_index + instruction.len()];
        for b in instr_bytes.iter() {
            print!("{:02X}", b);
        }
        if instr_bytes.len() < HEXBYTES_COLUMN_BYTE_LENGTH {
            // Pad with two spaces per missing byte (each byte is two hex digits)
            // so the mnemonic column lines up
            for _ in 0..HEXBYTES_COLUMN_BYTE_LENGTH - instr_bytes.len() {
                print!("  ");
            }
        }
        println!(" {}", output);
    }
}

const HEXBYTES_COLUMN_BYTE_LENGTH: usize = 10;
const EXAMPLE_CODE_BITNESS: u32 = 64;
const EXAMPLE_CODE_RIP: u64 = 0x0000_7FFA_C46A_CDA4;
static EXAMPLE_CODE: &[u8] = &[
    0x48, 0x89, 0x5C, 0x24, 0x10, 0x48, 0x89, 0x74, 0x24, 0x18, 0x55, 0x57, 0x41, 0x56, 0x48, 0x8D,
    0xAC, 0x24, 0x00, 0xFF, 0xFF, 0xFF, 0x48, 0x81, 0xEC, 0x00, 0x02, 0x00, 0x00, 0x48, 0x8B, 0x05,
    0x18, 0x57, 0x0A, 0x00, 0x48, 0x33, 0xC4, 0x48, 0x89, 0x85, 0xF0, 0x00, 0x00, 0x00, 0x4C, 0x8B,
    0x05, 0x2F, 0x24, 0x0A, 0x00, 0x48, 0x8D, 0x05, 0x78, 0x7C, 0x04, 0x00, 0x33, 0xFF,
];
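The hex-bytes column in the example is padded to a fixed width so that the mnemonics line up. A minimal std-only sketch of that padding step, building a String instead of printing, could look like this. The helper name `hex_column` is hypothetical and is not part of iced-x86.

```rust
/// Formats `bytes` as uppercase hex and pads the result with spaces so that
/// it occupies at least `column_bytes * 2` characters (two hex digits per byte).
/// Inputs longer than the column are not truncated.
/// NOTE: `hex_column` is a hypothetical helper, not an iced-x86 API.
fn hex_column(bytes: &[u8], column_bytes: usize) -> String {
    let mut s = String::with_capacity(column_bytes.max(bytes.len()) * 2);
    for b in bytes {
        s.push_str(&format!("{:02X}", b));
    }
    // Two spaces for every byte that is missing from the column.
    // The range is empty when the instruction is longer than the column.
    for _ in bytes.len()..column_bytes {
        s.push_str("  ");
    }
    s
}
```

With a 10-byte column, `hex_column(&[0x41, 0x56], 10)` yields `4156` followed by 16 spaces, which matches the layout of the `push r14` line in the output above.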
Create and encode instructions
This example uses a BlockEncoder to encode created Instructions. This example needs the db feature because it creates db "instructions".
use iced_x86::{
    BlockEncoder, BlockEncoderOptions, Code, Decoder, DecoderOptions, Formatter, GasFormatter,
    Instruction, InstructionBlock, MemoryOperand, Register,
};

pub(crate) fn how_to_encode_instructions() {
    let bitness = 64;

    // All created instructions get an IP of 0. The label id is just an IP.
    // The branch instruction's *target* IP should be equal to the IP of the
    // target instruction.
    let mut label_id: u64 = 1;
    let mut create_label = || {
        let id = label_id;
        label_id += 1;
        id
    };
    fn add_label(id: u64, mut instruction: Instruction) -> Instruction {
        instruction.set_ip(id);
        instruction
    }

    let label1 = create_label();

    let mut instructions: Vec<Instruction> = Vec::new();
    instructions.push(Instruction::with_reg(Code::Push_r64, Register::RBP));
    instructions.push(Instruction::with_reg(Code::Push_r64, Register::RDI));
    instructions.push(Instruction::with_reg(Code::Push_r64, Register::RSI));
    instructions.push(Instruction::with_reg_u32(Code::Sub_rm64_imm32, Register::RSP, 0x50));
    instructions.push(Instruction::with(Code::VEX_Vzeroupper));
    instructions.push(Instruction::with_reg_mem(
        Code::Lea_r64_m,
        Register::RBP,
        MemoryOperand::with_base_displ(Register::RSP, 0x60),
    ));
    instructions.push(Instruction::with_reg_reg(Code::Mov_r64_rm64, Register::RSI, Register::RCX));
    instructions.push(Instruction::with_reg_mem(
        Code::Lea_r64_m,
        Register::RDI,
        MemoryOperand::with_base_displ(Register::RBP, -0x38),
    ));
    instructions.push(Instruction::with_reg_i32(Code::Mov_r32_imm32, Register::ECX, 0x0A));
    instructions.push(Instruction::with_reg_reg(Code::Xor_r32_rm32, Register::EAX, Register::EAX));
    instructions.push(Instruction::with_rep_stosd(bitness));
    instructions.push(Instruction::with_reg_u64(Code::Cmp_rm64_imm32, Register::RSI, 0x1234_5678));
    // Create a branch instruction that references label1
    instructions.push(Instruction::with_branch(Code::Jne_rel32_64, label1));
    instructions.push(Instruction::with(Code::Nopd));
    // Add the instruction that is the target of the branch
    instructions.push(add_label(
        label1,
        Instruction::with_reg_reg(Code::Xor_r32_rm32, Register::R15D, Register::R15D),
    ));

    // Create an instruction that accesses some data using an RIP relative memory operand
    let data1 = create_label();
    instructions.push(Instruction::with_reg_mem(
        Code::Lea_r64_m,
        Register::R14,
        MemoryOperand::with_base_displ(Register::RIP, data1 as i32),
    ));
    instructions.push(Instruction::with(Code::Nopd));
    let raw_data: &[u8] = &[0x12, 0x34, 0x56, 0x78];
    // Creating db/dw/dd/dq instructions requires the `db` feature or it will panic!()
    instructions.push(add_label(data1, Instruction::with_declare_byte(raw_data)));

    // Use BlockEncoder to encode a block of instructions. This block can contain any
    // number of branches and any number of instructions. It does support encoding more
    // than one block but it's rarely needed.
    // It uses Encoder to encode all instructions.
    // If the target of a branch is too far away, it can fix it to use a longer branch.
    // This can be disabled by enabling some BlockEncoderOptions flags.
    let target_rip = 0x0000_1248_FC84_0000;
    let block = InstructionBlock::new(&instructions, target_rip);
    let result = match BlockEncoder::encode(bitness, block, BlockEncoderOptions::NONE) {
        Err(error) => panic!("Failed to encode it: {}", error),
        Ok(result) => result,
    };

    // Now disassemble the encoded instructions. Note that the 'jmp near'
    // instruction was turned into a 'jmp short' instruction because we
    // didn't disable branch optimizations.
    let bytes = result.code_buffer;
    let mut output = String::new();
    let bytes_code = &bytes[0..bytes.len() - raw_data.len()];
    let bytes_data = &bytes[bytes.len() - raw_data.len()..];
    let mut decoder = Decoder::new(bitness, bytes_code, DecoderOptions::NONE);
    decoder.set_ip(target_rip);
    let mut formatter = GasFormatter::new();
    formatter.options_mut().set_first_operand_char_index(8);
    for instruction in &mut decoder {
        output.clear();
        formatter.format(&instruction, &mut output);
        println!("{:016X} {}", instruction.ip(), output);
    }
    // Creating db/dw/dd/dq instructions requires the `db` feature or it will panic!()
    let db = Instruction::with_declare_byte(bytes_data);
    output.clear();
    formatter.format(&db, &mut output);
    println!("{:016X} {}", decoder.ip(), output);
}
/*
Output:
00001248FC840000 push    %rbp
00001248FC840001 push    %rdi
00001248FC840002 push    %rsi
00001248FC840003 sub     $0x50,%rsp
00001248FC84000A vzeroupper
00001248FC84000D lea     0x60(%rsp),%rbp
00001248FC840012 mov     %rcx,%rsi
00001248FC840015 lea     -0x38(%rbp),%rdi
00001248FC840019 mov     $0xA,%ecx
00001248FC84001E xor     %eax,%eax
00001248FC840020 rep stos %eax,(%rdi)
00001248FC840022 cmp     $0x12345678,%rsi
00001248FC840029 jne     0x00001248FC84002C
00001248FC84002B nop
00001248FC84002C xor     %r15d,%r15d
00001248FC84002F lea     0x1248FC840037,%r14
00001248FC840036 nop
00001248FC840037 .byte   0x12,0x34,0x56,0x78
*/
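The comments in the example mention that a near branch can be shortened when its target is close enough. The std-only sketch below illustrates the general x86 rule behind that: a relative branch stores the distance from the end of the branch instruction to its target, and the short form is possible when that distance fits in a signed byte. This is not BlockEncoder's actual implementation, and both function names are hypothetical.

```rust
/// Signed displacement of a relative branch: the distance from the end of the
/// branch instruction (`next_ip`) to `target`.
/// NOTE: hypothetical helper illustrating the x86 rule, not an iced-x86 API.
fn branch_displacement(next_ip: u64, target: u64) -> i64 {
    target.wrapping_sub(next_ip) as i64
}

/// True if the displacement fits in a signed 8-bit field, i.e. a short
/// (rel8) branch can reach `target`.
fn fits_short_branch(next_ip: u64, target: u64) -> bool {
    let disp = branch_displacement(next_ip, target);
    disp >= i64::from(i8::MIN) && disp <= i64::from(i8::MAX)
}
```

In the output above, the `jne` at 00001248FC840029 is 2 bytes long, so `next_ip` is 00001248FC84002B and the displacement to 00001248FC84002C is 1, which fits in the short form. Note that `next_ip` itself depends on which form is chosen, because the short and near encodings have different lengths.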
Disassemble with a symbol resolver
Creates a custom SymbolResolver that is called by a Formatter.
use iced_x86::{
    Decoder, DecoderOptions, Formatter, Instruction, MasmFormatter, SymbolResolver, SymbolResult,
};
use std::collections::HashMap;

struct MySymbolResolver {
    map: HashMap<u64, String>,
}

impl SymbolResolver for MySymbolResolver {
    fn symbol(
        &mut self,
        _instruction: &Instruction,
        _operand: u32,
        _instruction_operand: Option<u32>,
        address: u64,
        _address_size: u32,
    ) -> Option<SymbolResult> {
        if let Some(symbol_string) = self.map.get(&address) {
            // The 'address' arg is the address of the symbol and doesn't have to be identical
            // to the 'address' arg passed to symbol(). If it's different from the input
            // address, the formatter will add +N or -N, eg. '[rax+symbol+123]'
            Some(SymbolResult::with_str(address, symbol_string.as_str()))
        } else {
            None
        }
    }
}

pub(crate) fn how_to_resolve_symbols() {
    let bytes = b"\x48\x8B\x8A\xA5\x5A\xA5\x5A";
    let mut decoder = Decoder::new(64, bytes, DecoderOptions::NONE);
    let instr = decoder.decode();

    let mut sym_map: HashMap<u64, String> = HashMap::new();
    sym_map.insert(0x5AA5_5AA5, String::from("my_data"));

    let mut output = String::new();
    let resolver = Box::new(MySymbolResolver { map: sym_map });
    // Create a formatter that uses our symbol resolver
    let mut formatter = MasmFormatter::with_options(Some(resolver), None);

    // This will call the symbol resolver for each immediate / displacement
    // it finds in the instruction.
    formatter.format(&instr, &mut output);

    // Prints: mov rcx,[rdx+my_data]
    println!("{}", output);
}
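The resolver above only matches exact addresses. Because the formatter adds +N when the returned symbol address differs from the requested one, a resolver could instead return the nearest preceding symbol. A std-only sketch of that lookup, assuming the symbols are kept in a BTreeMap, follows. The functions `nearest_symbol` and `describe` are hypothetical, and the `symbol+Nh` string is only an illustration, not the formatter's exact output.

```rust
use std::collections::BTreeMap;

/// Finds the symbol with the greatest address that is <= `address`.
/// Returns the symbol's own address and its name.
/// NOTE: hypothetical helper, not an iced-x86 API.
fn nearest_symbol(map: &BTreeMap<u64, String>, address: u64) -> Option<(u64, &str)> {
    map.range(..=address)
        .next_back()
        .map(|(&sym_addr, name)| (sym_addr, name.as_str()))
}

/// Illustrates the resulting text: `symbol` for an exact match,
/// `symbol+Nh` when `address` lies N bytes past the symbol.
fn describe(map: &BTreeMap<u64, String>, address: u64) -> Option<String> {
    let (sym_addr, name) = nearest_symbol(map, address)?;
    let offset = address - sym_addr;
    Some(if offset == 0 {
        name.to_string()
    } else {
        format!("{}+{:X}h", name, offset)
    })
}
```

Inside symbol(), the pair returned by `nearest_symbol` would be what gets passed to SymbolResult::with_str, leaving the offset arithmetic to the formatter. A real resolver would likely also cap the maximum offset so unrelated addresses are not attributed to a distant symbol.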
Disassemble with colorized text
Creates a custom FormatterOutput that is called by a Formatter.
This example will fail to compile unless you install the colored crate, see below.
// This example uses crate colored = "2.0.0"
use colored::{ColoredString, Colorize};
use iced_x86::{
    Decoder, DecoderOptions, Formatter, FormatterOutput, FormatterTextKind, IntelFormatter,
};

// Custom formatter output that stores the output in a vector.
struct MyFormatterOutput {
    vec: Vec<(String, FormatterTextKind)>,
}

impl MyFormatterOutput {
    pub fn new() -> Self {
        Self { vec: Vec::new() }
    }
}

impl FormatterOutput for MyFormatterOutput {
    fn write(&mut self, text: &str, kind: FormatterTextKind) {
        // This allocates a string. If that's a problem, just call print!() here
        // instead of storing the result in a vector.
        self.vec.push((String::from(text), kind));
    }
}

pub(crate) fn how_to_colorize_text() {
    let bytes = EXAMPLE_CODE;
    let mut decoder = Decoder::new(EXAMPLE_CODE_BITNESS, bytes, DecoderOptions::NONE);
    decoder.set_ip(EXAMPLE_CODE_RIP);

    let mut formatter = IntelFormatter::new();
    formatter.options_mut().set_first_operand_char_index(8);
    let mut output = MyFormatterOutput::new();
    for instruction in &mut decoder {
        output.vec.clear();
        // The formatter calls output.write() which will update vec with text/colors
        formatter.format(&instruction, &mut output);
        for (text, kind) in output.vec.iter() {
            print!("{}", get_color(text.as_str(), *kind));
        }
        println!();
    }
}

fn get_color(s: &str, kind: FormatterTextKind) -> ColoredString {
    match kind {
        FormatterTextKind::Directive | FormatterTextKind::Keyword => s.bright_yellow(),
        FormatterTextKind::Prefix | FormatterTextKind::Mnemonic => s.bright_red(),
        FormatterTextKind::Register => s.bright_blue(),
        FormatterTextKind::Number => s.bright_cyan(),
        _ => s.white(),
    }
}

const EXAMPLE_CODE_BITNESS: u32 = 64;
const EXAMPLE_CODE_RIP: u64 = 0x0000_7FFA_C46A_CDA4;
static EXAMPLE_CODE: &[u8] = &[
    0x48, 0x89, 0x5C, 0x24, 0x10, 0x48, 0x89, 0x74, 0x24, 0x18, 0x55, 0x57, 0x41, 0x56, 0x48, 0x8D,
    0xAC, 0x24, 0x00, 0xFF, 0xFF, 0xFF, 0x48, 0x81, 0xEC, 0x00, 0x02, 0x00, 0x00, 0x48, 0x8B, 0x05,
    0x18, 0x57, 0x0A, 0x00, 0x48, 0x33, 0xC4, 0x48, 0x89, 0x85, 0xF0, 0x00, 0x00, 0x00, 0x4C, 0x8B,
    0x05, 0x2F, 0x24, 0x0A, 0x00, 0x48, 0x8D, 0x05, 0x78, 0x7C, 0x04, 0x00, 0x33, 0xFF,
];
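If pulling in the colored crate is not an option, the same idea can be sketched with plain ANSI escape sequences and only the standard library. In the sketch below, `TextKind` is a hypothetical, reduced stand-in for iced's FormatterTextKind, and the colour choices simply mirror get_color() above. It assumes a terminal that understands ANSI SGR codes.

```rust
/// Hypothetical, reduced stand-in for iced's `FormatterTextKind`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TextKind {
    Mnemonic,
    Register,
    Number,
    Other,
}

/// Maps a text kind to an ANSI SGR escape sequence.
fn ansi_color(kind: TextKind) -> &'static str {
    match kind {
        TextKind::Mnemonic => "\x1b[91m", // bright red
        TextKind::Register => "\x1b[94m", // bright blue
        TextKind::Number => "\x1b[96m",   // bright cyan
        TextKind::Other => "\x1b[37m",    // white
    }
}

/// Wraps every (text, kind) pair in its colour code followed by a reset code.
fn colorize(parts: &[(&str, TextKind)]) -> String {
    let mut out = String::new();
    for &(text, kind) in parts {
        out.push_str(ansi_color(kind));
        out.push_str(text);
        out.push_str("\x1b[0m"); // reset attributes
    }
    out
}
```

The `parts` slice plays the role of the vector filled by MyFormatterOutput::write(), so the string returned by `colorize` could be printed in place of the inner print!() loop.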
Move code in memory (eg. hook a function)
Uses instruction info API and the encoder to patch a function to jump to the programmer's function.
use iced_x86::{
    BlockEncoder, BlockEncoderOptions, Code, Decoder, DecoderOptions, FlowControl, Formatter,
    Instruction, InstructionBlock, NasmFormatter, OpKind,
};

// Decodes instructions from some address, then encodes them starting at some
// other address. This can be used to hook a function. You decode enough instructions
// until you have enough bytes to add a JMP instruction that jumps to your code.
// Your code will then conditionally jump to the original code that you re-encoded.
//
// This code uses the BlockEncoder which will help with some things, eg. converting
// short branches to longer branches if the target is too far away.
//
// 64-bit mode also supports RIP relative addressing, but the encoder can't rewrite
// those to use a longer displacement. If any of the moved instructions have RIP
// relative addressing and it tries to access data too far away, the encoder will fail.
// The easiest solution is to use OS alloc functions that allocate memory close to the
// original code (+/-2GB).

/*
This method produces the following output:
Original code:
00007FFAC46ACDA4 mov [rsp+10h],rbx
00007FFAC46ACDA9 mov [rsp+18h],rsi
00007FFAC46ACDAE push rbp
00007FFAC46ACDAF push rdi
00007FFAC46ACDB0 push r14
00007FFAC46ACDB2 lea rbp,[rsp-100h]
00007FFAC46ACDBA sub rsp,200h
00007FFAC46ACDC1 mov rax,[rel 7FFAC47524E0h]
00007FFAC46ACDC8 xor rax,rsp
00007FFAC46ACDCB mov [rbp+0F0h],rax
00007FFAC46ACDD2 mov r8,[rel 7FFAC474F208h]
00007FFAC46ACDD9 lea rax,[rel 7FFAC46F4A58h]
00007FFAC46ACDE0 xor edi,edi

Original + patched code:
00007FFAC46ACDA4 mov rax,123456789ABCDEF0h
00007FFAC46ACDAE jmp rax
00007FFAC46ACDB0 push r14
00007FFAC46ACDB2 lea rbp,[rsp-100h]
00007FFAC46ACDBA sub rsp,200h
00007FFAC46ACDC1 mov rax,[rel 7FFAC47524E0h]
00007FFAC46ACDC8 xor rax,rsp
00007FFAC46ACDCB mov [rbp+0F0h],rax
00007FFAC46ACDD2 mov r8,[rel 7FFAC474F208h]
00007FFAC46ACDD9 lea rax,[rel 7FFAC46F4A58h]
00007FFAC46ACDE0 xor edi,edi

Moved code:
00007FFAC48ACDA4 mov [rsp+10h],rbx
00007FFAC48ACDA9 mov [rsp+18h],rsi
00007FFAC48ACDAE push rbp
00007FFAC48ACDAF push rdi
00007FFAC48ACDB0 jmp 00007FFAC46ACDB0h
*/
pub(crate) fn how_to_move_code() {
    let example_code = EXAMPLE_CODE.to_vec();
    println!("Original code:");
    disassemble(&example_code, EXAMPLE_CODE_RIP);

    let mut decoder = Decoder::new(EXAMPLE_CODE_BITNESS, &example_code, DecoderOptions::NONE);
    decoder.set_ip(EXAMPLE_CODE_RIP);

    // In 64-bit mode, we need 12 bytes to jump to any address:
    //      mov rax,imm64   // 10
    //      jmp rax         // 2
    // We overwrite rax because it's probably not used by the called function.
    // In 32-bit mode, a normal JMP is just 5 bytes
    let required_bytes = 10 + 2;
    let mut total_bytes = 0;
    let mut orig_instructions: Vec<Instruction> = Vec::new();
    for instr in &mut decoder {
        orig_instructions.push(instr);
        total_bytes += instr.len() as u32;
        if instr.is_invalid() {
            panic!("Found garbage");
        }
        if total_bytes >= required_bytes {
            break;
        }

        match instr.flow_control() {
            FlowControl::Next => {}

            FlowControl::UnconditionalBranch => {
                if instr.op0_kind() == OpKind::NearBranch64 {
                    let _target = instr.near_branch_target();
                    // You could check if it's just jumping forward a few bytes and follow it
                    // but this is a simple example so we'll fail.
                }
                panic!("Not supported by this simple example");
            }

            FlowControl::IndirectBranch
            | FlowControl::ConditionalBranch
            | FlowControl::Return
            | FlowControl::Call
            | FlowControl::IndirectCall
            | FlowControl::Interrupt
            | FlowControl::XbeginXabortXend
            | FlowControl::Exception => panic!("Not supported by this simple example"),
        }
    }
    if total_bytes < required_bytes {
        panic!("Not enough bytes!");
    }
    assert!(!orig_instructions.is_empty());
    // Create a JMP instruction that branches to the original code, except those instructions
    // that we'll re-encode. We don't need to do it if it already ends in 'ret'
    let (jmp_back_addr, add) = {
        let last_instr = orig_instructions.last().unwrap();
        if last_instr.flow_control() != FlowControl::Return {
            (last_instr.next_ip(), true)
        } else {
            (last_instr.next_ip(), false)
        }
    };
    if add {
        orig_instructions.push(Instruction::with_branch(Code::Jmp_rel32_64, jmp_back_addr));
    }

    // Relocate the code to some new location. It can fix short/near branches and
    // convert them to short/near/long forms if needed. This also works even if it's a
    // jrcxz/loop/loopcc instruction which only have short forms.
    //
    // It can currently only fix RIP relative operands if the new location is within 2GB
    // of the target data location.
    //
    // Note that a block is not the same thing as a basic block. A block can contain any
    // number of instructions, including any number of branch instructions. One block
    // should be enough unless you must relocate different blocks to different locations.
    let relocated_base_address = EXAMPLE_CODE_RIP + 0x20_0000;
    let block = InstructionBlock::new(&orig_instructions, relocated_base_address);
    // This method can also encode more than one block but that's rarely needed, see above comment.
    let result = match BlockEncoder::encode(decoder.bitness(), block, BlockEncoderOptions::NONE) {
        Err(err) => panic!("{}", err),
        Ok(result) => result,
    };
    let new_code = result.code_buffer;

    // Patch the original code. Pretend that we use some OS API to write to memory...
    // We could use the BlockEncoder/Encoder for this but it's easy to do yourself too.
    // This is 'mov rax,imm64; jmp rax'
    const YOUR_FUNC: u64 = 0x1234_5678_9ABC_DEF0; // Address of your code
    let mut example_code = example_code.to_vec();
    example_code[0] = 0x48; // \ 'MOV RAX,imm64'
    example_code[1] = 0xB8; // /
    let mut v = YOUR_FUNC;
    for p in &mut example_code[2..10] {
        *p = v as u8;
        v >>= 8;
    }
    example_code[10] = 0xFF; // \ JMP RAX
    example_code[11] = 0xE0; // /

    // Disassemble it
    println!("Original + patched code:");
    disassemble(&example_code, EXAMPLE_CODE_RIP);

    // Disassemble the moved code
    println!("Moved code:");
    disassemble(&new_code, relocated_base_address);
}

fn disassemble(data: &[u8], ip: u64) {
    let mut formatter = NasmFormatter::new();
    let mut output = String::new();
    let mut decoder = Decoder::new(EXAMPLE_CODE_BITNESS, data, DecoderOptions::NONE);
    decoder.set_ip(ip);
    for instruction in &mut decoder {
        output.clear();
        formatter.format(&instruction, &mut output);
        println!("{:016X} {}", instruction.ip(), output);
    }
    println!();
}

const EXAMPLE_CODE_BITNESS: u32 = 64;
const EXAMPLE_CODE_RIP: u64 = 0x0000_7FFA_C46A_CDA4;
static EXAMPLE_CODE: &[u8] = &[
    0x48, 0x89, 0x5C, 0x24, 0x10, 0x48, 0x89, 0x74, 0x24, 0x18, 0x55, 0x57, 0x41, 0x56, 0x48, 0x8D,
    0xAC, 0x24, 0x00, 0xFF, 0xFF, 0xFF, 0x48, 0x81, 0xEC, 0x00, 0x02, 0x00, 0x00, 0x48, 0x8B, 0x05,
    0x18, 0x57, 0x0A, 0x00, 0x48, 0x33, 0xC4, 0x48, 0x89, 0x85, 0xF0, 0x00, 0x00, 0x00, 0x4C, 0x8B,
    0x05, 0x2F, 0x24, 0x0A, 0x00, 0x48, 0x8D, 0x05, 0x78, 0x7C, 0x04, 0x00, 0x33, 0xFF,
];
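Two pieces of arithmetic in the example can be isolated into small std-only helpers: building the 12-byte `mov rax,imm64; jmp rax` patch, and checking whether a target is within the +/-2GB range that a 32-bit relative displacement (such as a RIP-relative operand) can express. This is a sketch under those assumptions; both function names are hypothetical and not part of iced-x86.

```rust
/// Builds the 12-byte sequence `mov rax,imm64; jmp rax` that jumps to `target`.
/// NOTE: hypothetical helper, not an iced-x86 API.
fn absolute_jump_patch(target: u64) -> [u8; 12] {
    let mut patch = [0u8; 12];
    patch[0] = 0x48; // \ MOV RAX,imm64
    patch[1] = 0xB8; // /
    // x86 immediates are stored little-endian
    patch[2..10].copy_from_slice(&target.to_le_bytes());
    patch[10] = 0xFF; // \ JMP RAX
    patch[11] = 0xE0; // /
    patch
}

/// True if `target` can be reached from `next_ip` (the address of the byte
/// following the instruction) with a signed 32-bit displacement.
fn rel32_reachable(next_ip: u64, target: u64) -> bool {
    let disp = target.wrapping_sub(next_ip) as i64;
    disp >= i64::from(i32::MIN) && disp <= i64::from(i32::MAX)
}
```

`absolute_jump_patch(0x1234_5678_9ABC_DEF0)` produces the same 12 bytes that the loop in the example writes into example_code. A check like `rel32_reachable` could be run on every RIP-relative operand of the decoded instructions before relocating them, to detect up front the case where the encoder would fail.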
Get instruction info, eg. read/written regs/mem, control flow info, etc
Shows how to get used registers/memory and other info. It uses Instruction methods and an InstructionInfoFactory to get this info.
use iced_x86::{
    ConditionCode, Decoder, DecoderOptions, Instruction, InstructionInfoFactory, OpKind, RflagsBits,
};

/*
This method produces the following output:
00007FFAC46ACDA4 mov [rsp+10h],rbx
 OpCode: o64 89 /r
 Instruction: MOV r/m64, r64
 Encoding: Legacy
 Mnemonic: Mov
 Code: Mov_rm64_r64
 CpuidFeature: X64
 FlowControl: Next
 Displacement offset = 4, size = 1
 Memory size: 8
 Op0Access: Write
 Op1Access: Read
 Op0: r64_or_mem
 Op1: r64_reg
 Used reg: RSP:Read
 Used reg: RBX:Read
 Used mem: [SS:RSP+0x10;UInt64;Write]
00007FFAC46ACDA9 mov [rsp+18h],rsi
 OpCode: o64 89 /r
 Instruction: MOV r/m64, r64
 Encoding: Legacy
 Mnemonic: Mov
 Code: Mov_rm64_r64
 CpuidFeature: X64
 FlowControl: Next
 Displacement offset = 4, size = 1
 Memory size: 8
 Op0Access: Write
 Op1Access: Read
 Op0: r64_or_mem
 Op1: r64_reg
 Used reg: RSP:Read
 Used reg: RSI:Read
 Used mem: [SS:RSP+0x18;UInt64;Write]
00007FFAC46ACDAE push rbp
 OpCode: o64 50+ro
 Instruction: PUSH r64
 Encoding: Legacy
 Mnemonic: Push
 Code: Push_r64
 CpuidFeature: X64
 FlowControl: Next
 SP Increment: -8
 Op0Access: Read
 Op0: r64_opcode
 Used reg: RBP:Read
 Used reg: RSP:ReadWrite
 Used mem: [SS:RSP+0xFFFFFFFFFFFFFFF8;UInt64;Write]
00007FFAC46ACDAF push rdi
 OpCode: o64 50+ro
 Instruction: PUSH r64
 Encoding: Legacy
 Mnemonic: Push
 Code: Push_r64
 CpuidFeature: X64
 FlowControl: Next
 SP Increment: -8
 Op0Access: Read
 Op0: r64_opcode
 Used reg: RDI:Read
 Used reg: RSP:ReadWrite
 Used mem: [SS:RSP+0xFFFFFFFFFFFFFFF8;UInt64;Write]
00007FFAC46ACDB0 push r14
 OpCode: o64 50+ro
 Instruction: PUSH r64
 Encoding: Legacy
 Mnemonic: Push
 Code: Push_r64
 CpuidFeature: X64
 FlowControl: Next
 SP Increment: -8
 Op0Access: Read
 Op0: r64_opcode
 Used reg: R14:Read
 Used reg: RSP:ReadWrite
 Used mem: [SS:RSP+0xFFFFFFFFFFFFFFF8;UInt64;Write]
00007FFAC46ACDB2 lea rbp,[rsp-100h]
 OpCode: o64 8D /r
 Instruction: LEA r64, m
 Encoding: Legacy
 Mnemonic: Lea
 Code: Lea_r64_m
 CpuidFeature: X64
 FlowControl: Next
 Displacement offset = 4, size = 4
 Op0Access: Write
 Op1Access: NoMemAccess
 Op0: r64_reg
 Op1: mem
 Used reg: RBP:Write
 Used reg: RSP:Read
00007FFAC46ACDBA sub rsp,200h
 OpCode: o64 81 /5 id
 Instruction: SUB r/m64, imm32
 Encoding: Legacy
 Mnemonic: Sub
 Code: Sub_rm64_imm32
 CpuidFeature: X64
 FlowControl: Next
 Immediate offset = 3, size = 4
 RFLAGS Written: OF, SF, ZF, AF, CF, PF
 RFLAGS Modified: OF, SF, ZF, AF, CF, PF
 Op0Access: ReadWrite
 Op1Access: Read
 Op0: r64_or_mem
 Op1: imm32sex64
 Used reg: RSP:ReadWrite
00007FFAC46ACDC1 mov rax,[7FFAC47524E0h]
 OpCode: o64 8B /r
 Instruction: MOV r64, r/m64
 Encoding: Legacy
 Mnemonic: Mov
 Code: Mov_r64_rm64
 CpuidFeature: X64
 FlowControl: Next
 Displacement offset = 3, size = 4
 Memory size: 8
 Op0Access: Write
 Op1Access: Read
 Op0: r64_reg
 Op1: r64_or_mem
 Used reg: RAX:Write
 Used mem: [DS:0x7FFAC47524E0;UInt64;Read]
00007FFAC46ACDC8 xor rax,rsp
 OpCode: o64 33 /r
 Instruction: XOR r64, r/m64
 Encoding: Legacy
 Mnemonic: Xor
 Code: Xor_r64_rm64
 CpuidFeature: X64
 FlowControl: Next
 RFLAGS Written: SF, ZF, PF
 RFLAGS Cleared: OF, CF
 RFLAGS Undefined: AF
 RFLAGS Modified: OF, SF, ZF, AF, CF, PF
 Op0Access: ReadWrite
 Op1Access: Read
 Op0: r64_reg
 Op1: r64_or_mem
 Used reg: RAX:ReadWrite
 Used reg: RSP:Read
00007FFAC46ACDCB mov [rbp+0F0h],rax
 OpCode: o64 89 /r
 Instruction: MOV r/m64, r64
 Encoding: Legacy
 Mnemonic: Mov
 Code: Mov_rm64_r64
 CpuidFeature: X64
 FlowControl: Next
 Displacement offset = 3, size = 4
 Memory size: 8
 Op0Access: Write
 Op1Access: Read
 Op0: r64_or_mem
 Op1: r64_reg
 Used reg: RBP:Read
 Used reg: RAX:Read
 Used mem: [SS:RBP+0xF0;UInt64;Write]
00007FFAC46ACDD2 mov r8,[7FFAC474F208h]
 OpCode: o64 8B /r
 Instruction: MOV r64, r/m64
 Encoding: Legacy
 Mnemonic: Mov
 Code: Mov_r64_rm64
 CpuidFeature: X64
 FlowControl: Next
 Displacement offset = 3, size = 4
 Memory size: 8
 Op0Access: Write
 Op1Access: Read
 Op0: r64_reg
 Op1: r64_or_mem
 Used reg: R8:Write
 Used mem: [DS:0x7FFAC474F208;UInt64;Read]
00007FFAC46ACDD9 lea rax,[7FFAC46F4A58h]
 OpCode: o64 8D /r
 Instruction: LEA r64, m
 Encoding: Legacy
 Mnemonic: Lea
 Code: Lea_r64_m
 CpuidFeature: X64
 FlowControl: Next
 Displacement offset = 3, size = 4
 Op0Access: Write
 Op1Access: NoMemAccess
 Op0: r64_reg
 Op1: mem
 Used reg: RAX:Write
00007FFAC46ACDE0 xor edi,edi
 OpCode: o32 33 /r
 Instruction: XOR r32, r/m32
 Encoding: Legacy
 Mnemonic: Xor
 Code: Xor_r32_rm32
 CpuidFeature: INTEL386
 FlowControl: Next
 RFLAGS Cleared: OF, SF, CF
 RFLAGS Set: ZF, PF
 RFLAGS Undefined: AF
 RFLAGS Modified: OF, SF, ZF, AF, CF, PF
 Op0Access: Write
 Op1Access: None
 Op0: r32_reg
 Op1: r32_or_mem
 Used reg: RDI:Write
*/
pub(crate) fn how_to_get_instruction_info() {
    let mut decoder = Decoder::new(EXAMPLE_CODE_BITNESS, EXAMPLE_CODE, DecoderOptions::NONE);
    decoder.set_ip(EXAMPLE_CODE_RIP);

    // Use a factory to create the instruction info if you need register and
    // memory usage. If it's something else, eg. encoding, flags, etc, there
    // are Instruction methods that can be used instead.
    let mut info_factory = InstructionInfoFactory::new();
    let mut instr = Instruction::default();
    while decoder.can_decode() {
        decoder.decode_out(&mut instr);

        // Gets offsets in the instruction of the displacement and immediates and their sizes.
        // This can be useful if there are relocations in the binary. The encoder has a similar
        // method. This method must be called after decode() and you must pass in the last
        // instruction decode() returned.
        let offsets = decoder.get_constant_offsets(&instr);

        // For quick hacks, it's fine to use the Display trait to format an instruction,
        // but for real code, use a formatter, eg. MasmFormatter. See other examples.
        println!("{:016X} {}", instr.ip(), instr);

        let op_code = instr.op_code();
        let info = info_factory.info(&instr);
        println!(" OpCode: {}", op_code.op_code_string());
        println!(" Instruction: {}", op_code.instruction_string());
        println!(" Encoding: {:?}", instr.encoding());
        println!(" Mnemonic: {:?}", instr.mnemonic());
        println!(" Code: {:?}", instr.code());
        println!(
            " CpuidFeature: {}",
            instr
                .cpuid_features()
                .iter()
                .map(|&a| format!("{:?}", a))
                .collect::<Vec<String>>()
                .join(" and ")
        );
        println!(" FlowControl: {:?}", instr.flow_control());
        if offsets.has_displacement() {
            println!(
                " Displacement offset = {}, size = {}",
                offsets.displacement_offset(),
                offsets.displacement_size()
            );
        }
        if offsets.has_immediate() {
            println!(
                " Immediate offset = {}, size = {}",
                offsets.immediate_offset(),
                offsets.immediate_size()
            );
        }
        if offsets.has_immediate2() {
            println!(
                " Immediate #2 offset = {}, size = {}",
                offsets.immediate_offset2(),
                offsets.immediate_size2()
            );
        }
        if instr.is_stack_instruction() {
            println!(" SP Increment: {}", instr.stack_pointer_increment());
        }
        if instr.condition_code() != ConditionCode::None {
            println!(" Condition code: {:?}", instr.condition_code());
        }
        if instr.rflags_read() != RflagsBits::NONE {
            println!(" RFLAGS Read: {}", flags(instr.rflags_read()));
        }
        if instr.rflags_written() != RflagsBits::NONE {
            println!(" RFLAGS Written: {}", flags(instr.rflags_written()));
        }
        if instr.rflags_cleared() != RflagsBits::NONE {
            println!(" RFLAGS Cleared: {}", flags(instr.rflags_cleared()));
        }
        if instr.rflags_set() != RflagsBits::NONE {
            println!(" RFLAGS Set: {}", flags(instr.rflags_set()));
        }
        if instr.rflags_undefined() != RflagsBits::NONE {
            println!(" RFLAGS Undefined: {}", flags(instr.rflags_undefined()));
        }
        if instr.rflags_modified() != RflagsBits::NONE {
            println!(" RFLAGS Modified: {}", flags(instr.rflags_modified()));
        }
        for i in 0..instr.op_count() {
            let op_kind = instr.op_kind(i);
            if op_kind == OpKind::Memory || op_kind == OpKind::Memory64 {
                let size = instr.memory_size().size();
                if size != 0 {
                    println!(" Memory size: {}", size);
                }
                break;
            }
        }
        for i in 0..instr.op_count() {
            println!(" Op{}Access: {:?}", i, info.op_access(i));
        }
        for i in 0..op_code.op_count() {
            println!(" Op{}: {:?}", i, op_code.op_kind(i));
        }
        for reg_info in info.used_registers() {
            println!(" Used reg: {:?}", reg_info);
        }
        for mem_info in info.used_memory() {
            println!(" Used mem: {:?}", mem_info);
        }
    }
}

fn flags(rf: u32) -> String {
    fn append(sb: &mut String, s: &str) {
        if !sb.is_empty() {
            sb.push_str(", ");
        }
        sb.push_str(s);
    }

    let mut sb = String::new();
    if (rf & RflagsBits::OF) != 0 {
        append(&mut sb, "OF");
    }
    if (rf & RflagsBits::SF) != 0 {
        append(&mut sb, "SF");
    }
    if (rf & RflagsBits::ZF) != 0 {
        append(&mut sb, "ZF");
    }
    if (rf & RflagsBits::AF) != 0 {
        append(&mut sb, "AF");
    }
    if (rf & RflagsBits::CF) != 0 {
        append(&mut sb, "CF");
    }
    if (rf & RflagsBits::PF) != 0 {
        append(&mut sb, "PF");
    }
    if (rf & RflagsBits::DF) != 0 {
        append(&mut sb, "DF");
    }
    if (rf & RflagsBits::IF) != 0 {
        append(&mut sb, "IF");
    }
    if (rf & RflagsBits::AC) != 0 {
        append(&mut sb, "AC");
    }
    if sb.is_empty() {
        sb.push_str("<empty>");
    }
    sb
}

const EXAMPLE_CODE_BITNESS: u32 = 64;
const EXAMPLE_CODE_RIP: u64 = 0x0000_7FFA_C46A_CDA4;
static EXAMPLE_CODE: &[u8] = &[
    0x48, 0x89, 0x5C, 0x24, 0x10, 0x48, 0x89, 0x74, 0x24, 0x18, 0x55, 0x57, 0x41, 0x56, 0x48, 0x8D,
    0xAC, 0x24, 0x00, 0xFF, 0xFF, 0xFF, 0x48, 0x81, 0xEC, 0x00, 0x02, 0x00, 0x00, 0x48, 0x8B, 0x05,
    0x18, 0x57, 0x0A, 0x00, 0x48, 0x33, 0xC4, 0x48, 0x89, 0x85, 0xF0, 0x00, 0x00, 0x00, 0x4C, 0x8B,
    0x05, 0x2F, 0x24, 0x0A, 0x00, 0x48, 0x8D, 0x05, 0x78, 0x7C, 0x04, 0x00, 0x33, 0xFF,
];
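The flags() helper above tests one bit per `if`. The same output can be produced from a lookup table, which some readers may find easier to extend. The sketch below is std-only, so the bit values in `FLAG_NAMES` are hypothetical placeholders; in real code each entry would use the corresponding RflagsBits constant. The function name `flags_to_string` is likewise hypothetical.

```rust
/// (bit, name) pairs in the order flags() prints them.
/// NOTE: the bit values are hypothetical placeholders for this sketch; real
/// code would use `RflagsBits::OF`, `RflagsBits::SF`, and so on.
const FLAG_NAMES: &[(u32, &str)] = &[
    (1 << 0, "OF"),
    (1 << 1, "SF"),
    (1 << 2, "ZF"),
    (1 << 3, "AF"),
    (1 << 4, "CF"),
    (1 << 5, "PF"),
    (1 << 6, "DF"),
    (1 << 7, "IF"),
    (1 << 8, "AC"),
];

/// Returns the names of all set flags joined by ", ", or "<empty>" if none is set.
fn flags_to_string(rf: u32) -> String {
    let names: Vec<&str> = FLAG_NAMES
        .iter()
        .filter(|&&(bit, _)| rf & bit != 0)
        .map(|&(_, name)| name)
        .collect();
    if names.is_empty() {
        String::from("<empty>")
    } else {
        names.join(", ")
    }
}
```

The table fixes the print order in one place, so adding a flag means adding one row rather than another `if` block.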
Get the virtual address of a memory operand
use iced_x86::{Decoder, DecoderOptions, Register};

pub(crate) fn how_to_get_virtual_address() {
    // add [rdi+r12*8-5AA5EDCCh],esi
    let bytes = b"\x42\x01\xB4\xE7\x34\x12\x5A\xA5";
    let mut decoder = Decoder::new(64, bytes, DecoderOptions::NONE);
    let instr = decoder.decode();

    // There's also try_virtual_address() which returns an Option<u64>
    let va = instr.virtual_address(0, 0, |register, _element_index, _element_size| {
        match register {
            // The base address of ES, CS, SS and DS is always 0 in 64-bit mode
            Register::ES | Register::CS | Register::SS | Register::DS => 0,
            Register::RDI => 0x0000_0000_1000_0000,
            Register::R12 => 0x0000_0004_0000_0000,
            _ => unimplemented!(),
        }
    });
    assert_eq!(0x0000_001F_B55A_1234, va);
}
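The asserted value can be checked by hand. The following is a minimal sketch of the address arithmetic only, using nothing from iced-x86: `effective_address` and `example_virtual_address` are hypothetical helper names introduced for illustration, and the sketch assumes the usual x86 rule of segment base + base + index * scale + sign-extended displacement. It is not how the crate implements `virtual_address()`.

```rust
/// Hypothetical helper (not part of iced-x86): computes
/// segment base + base + index * scale + displacement with wrapping
/// 64-bit arithmetic.
fn effective_address(seg_base: u64, base: u64, index: u64, scale: u64, disp: i64) -> u64 {
    seg_base
        .wrapping_add(base)
        .wrapping_add(index.wrapping_mul(scale))
        // Casting a negative i64 to u64 keeps the two's complement bits,
        // so wrapping_add performs the subtraction.
        .wrapping_add(disp as u64)
}

/// Reproduces the operand of `add [rdi+r12*8-5AA5EDCCh],esi` with the
/// register values used in the example above.
fn example_virtual_address() -> u64 {
    // The last four instruction bytes (34 12 5A A5) are the little-endian
    // disp32 0xA55A1234, which is -0x5AA5EDCC as a signed 32-bit value.
    let disp32 = 0xA55A_1234u32 as i32;
    let rdi = 0x0000_0000_1000_0000u64;
    let r12 = 0x0000_0004_0000_0000u64;
    // The example treats the DS base as 0 in 64-bit mode.
    let ds_base = 0u64;
    effective_address(ds_base, rdi, r12, 8, i64::from(disp32))
}
```

Calling `example_virtual_address()` gives `0x0000_001F_B55A_1234`, the same value the example asserts.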
Disassemble old/deprecated CPU instructions
use iced_x86::{Decoder, DecoderOptions, Formatter, Instruction, NasmFormatter};

/*
This method produces the following output:
731E0A03 bndmov bnd1, [eax]
731E0A07 mov tr3, esi
731E0A0A rdshr [eax]
731E0A0D dmint
731E0A0F svdc [eax], cs
731E0A12 cpu_read
731E0A14 pmvzb mm1, [eax]
731E0A17 frinear
731E0A19 altinst
*/
pub(crate) fn how_to_disassemble_old_instrs() {
    #[rustfmt::skip]
    let bytes = &[
        // bndmov bnd1,[eax]
        0x66, 0x0F, 0x1A, 0x08,
        // mov tr3,esi
        0x0F, 0x26, 0xDE,
        // rdshr [eax]
        0x0F, 0x36, 0x00,
        // dmint
        0x0F, 0x39,
        // svdc [eax],cs
        0x0F, 0x78, 0x08,
        // cpu_read
        0x0F, 0x3D,
        // pmvzb mm1,[eax]
        0x0F, 0x58, 0x08,
        // frinear
        0xDF, 0xFC,
        // altinst
        0x0F, 0x3F,
    ];

    // Enable decoding of Cyrix/Geode instructions, Centaur ALTINST, MOV to/from TR
    // and MPX instructions.
    // There are other options to enable other instructions such as UMOV, etc.
    // These are deprecated instructions or only used by old CPUs so they're not
    // enabled by default. Some newer instructions also use the same opcodes as
    // some of these old instructions.
    const DECODER_OPTIONS: u32 = DecoderOptions::MPX
        | DecoderOptions::MOV_TR
        | DecoderOptions::CYRIX
        | DecoderOptions::CYRIX_DMI
        | DecoderOptions::ALTINST;
    let mut decoder = Decoder::new(32, bytes, DECODER_OPTIONS);
    decoder.set_ip(0x731E0A03);

    let mut formatter = NasmFormatter::new();
    formatter.options_mut().set_space_after_operand_separator(true);
    let mut output = String::new();

    let mut instruction = Instruction::default();
    while decoder.can_decode() {
        decoder.decode_out(&mut instruction);
        output.clear();
        formatter.format(&instruction, &mut output);
        println!("{:08X} {}", instruction.ip(), &output);
    }
}
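The addresses in the expected output follow from the start address passed to `set_ip()` and the encoded length of each instruction. Below is a small stand-alone sketch of that bookkeeping. `instruction_addresses` and `EXAMPLE_LENGTHS` are hypothetical names introduced for illustration; the lengths are read off the byte array in the example, and the code does not call iced-x86.

```rust
/// Hypothetical helper (not part of iced-x86): returns the address of each
/// instruction, given the start address and the encoded length of every
/// instruction in order.
fn instruction_addresses(start_ip: u32, lengths: &[u32]) -> Vec<u32> {
    let mut ip = start_ip;
    let mut addresses = Vec::with_capacity(lengths.len());
    for &len in lengths {
        addresses.push(ip);
        // A 32-bit instruction pointer wraps around at 2^32.
        ip = ip.wrapping_add(len);
    }
    addresses
}

/// Byte lengths of the nine instructions in the example, in order:
/// bndmov, mov tr3, rdshr, dmint, svdc, cpu_read, pmvzb, frinear, altinst.
const EXAMPLE_LENGTHS: [u32; 9] = [4, 3, 3, 2, 3, 2, 3, 2, 2];
```

With a start address of `0x731E0A03`, `instruction_addresses(0x731E_0A03, &EXAMPLE_LENGTHS)` yields the addresses from `731E0A03` to `731E0A19` listed in the comment above.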
Minimum supported rustc version
iced-x86 supports rustc 1.20.0 or later.
This is checked in CI builds where the minimum supported version and the latest stable version are used to build the source code and run tests.
If you use an older version of rustc, you may need to pin some iced-x86 dependencies to older versions, because cargo prefers the latest version of each dependency, which may not support your rustc.
For example, iced-x86 needs lazy_static 1.1.1 (or later), but cargo wants to use the latest version, which is currently 1.4.0 and doesn't support the minimum supported rustc version.
Here's how you can force a compatible version of any iced-x86 dependency without updating iced-x86's Cargo.toml:
cargo generate-lockfile
cargo update --package lazy_static --precise 1.1.1
Bumping the minimum supported version of rustc is considered a minor breaking change. The minor version of iced-x86 will be incremented.
Structs
BlockEncoder | Encodes instructions. It can be used to move instructions from one location to another location. |
BlockEncoderOptions |
|
BlockEncoderResult |
|
ConstantOffsets | Contains the offsets of the displacement and immediate. Call |
Decoder | Decodes 16/32/64-bit x86 instructions |
DecoderIntoIter | An iterator that consumes a |
DecoderIter | An iterator that borrows a |
DecoderOptions | Decoder options |
Encoder | Encodes instructions decoded by the decoder or instructions created by other code.
See also |
FastFormatter | Fast formatter with fewer formatting options and with masm-like syntax. Use it if formatting speed is more important than being able to re-assemble formatted instructions. |
FastFormatterOptions | Fast formatter options |
FormatMnemonicOptions | Format mnemonic options |
FormatterOperandOptions | Operand options |
FormatterOptions | Formatter options |
GasFormatter | GNU assembler (AT&T) formatter |
IcedFeatures | Gets the available features |
Instruction | A 16/32/64-bit x86 instruction. Created by |
InstructionBlock | Contains a slice of instructions that should be encoded by |
InstructionInfo | Contains information about an instruction, eg. read/written registers, read/written |
InstructionInfoFactory | Creates |
InstructionInfoOptions | Instruction info options used by |
IntelFormatter | Intel formatter (same as Intel XED) |
MasmFormatter | Masm formatter |
MemoryOperand | Memory operand passed to one of |
MemorySizeInfo |
|
NasmFormatter | Nasm formatter |
NumberFormattingOptions | Gets initialized with the default options and can be overridden by a |
OpCodeInfo | Opcode info, returned by |
RegisterInfo |
|
RelocInfo | Relocation info |
RflagsBits |
|
SymResTextPart | Contains text and colors |
SymbolFlags | Symbol flags |
SymbolResult | Created by a |
UsedMemory | A memory location used by an instruction |
UsedRegister | A register used by an instruction |
Enums
CC_a | Mnemonic condition code selector (eg. |
CC_ae | Mnemonic condition code selector (eg. |
CC_b | Mnemonic condition code selector (eg. |
CC_be | Mnemonic condition code selector (eg. |
CC_e | Mnemonic condition code selector (eg. |
CC_g | Mnemonic condition code selector (eg. |
CC_ge | Mnemonic condition code selector (eg. |
CC_l | Mnemonic condition code selector (eg. |
CC_le | Mnemonic condition code selector (eg. |
CC_ne | Mnemonic condition code selector (eg. |
CC_np | Mnemonic condition code selector (eg. |
CC_p | Mnemonic condition code selector (eg. |
Code | x86 instruction code |
CodeSize | The code size (16/32/64) that was used when an instruction was decoded |
ConditionCode | Instruction condition code (used by |
CpuidFeature |
|
DecoderError | Decoder error |
DecoratorKind | Decorator |
EncodingKind | Instruction encoding |
FlowControl | Flow control |
FormatterTextKind | Formatter text kind |
MandatoryPrefix | Mandatory prefix |
MemorySize | Size of a memory reference |
MemorySizeOptions | Memory size options used by the formatters |
Mnemonic | Mnemonic |
NumberBase | Number base |
NumberKind | Number kind |
OpAccess | Operand, register and memory access |
OpCodeOperandKind | Operand kind |
OpCodeTableKind | Opcode table |
OpKind | Instruction operand kind |
PrefixKind | Prefix |
Register | A register |
RelocKind | Relocation kind |
RepPrefixKind |
|
RoundingControl | Rounding control |
SymResString | Contains a |
SymResTextInfo | Contains one or more |
TupleType | Tuple type (EVEX) which can be used to get the disp8 scale factor |
Traits
Formatter | Formats instructions |
FormatterOptionsProvider | Can override options used by a |
FormatterOutput | Used by a |
SymbolResolver | Used by a |