Module assembly
CIL instruction processing: disassembly, analysis, and assembly based on ECMA-335.
This module provides comprehensive CIL (Common Intermediate Language) instruction processing capabilities, including both disassembly (bytecode to instructions) and assembly (instructions to bytecode). It implements the complete ECMA-335 instruction set with support for control flow analysis, stack effect tracking, and bidirectional instruction processing.
§Architecture
The assembly module is built around several core concepts:
- Instruction Decoding: Binary CIL bytecode to structured instruction representation
- Instruction Encoding: Structured instructions back to binary CIL bytecode
- Control Flow Analysis: Building basic blocks and analyzing program flow
- Stack Effect Analysis: Tracking how instructions affect the evaluation stack
- Label Resolution: Automatic resolution of branch targets and labels
- Type Safety: Compile-time validation of instruction operand types
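To make the stack effect analysis above concrete, here is a minimal standalone sketch. It does not use dotscope's own types: `stack_effect` and `max_stack` are hypothetical helpers, and the pop/push counts are hard-coded for a few operand-less ECMA-335 opcodes. It also assumes `ret` pops one value, which only holds for methods that return something.

```rust
/// (pops, pushes) for a handful of single-byte, operand-less opcodes.
/// Hypothetical helper for illustration; not part of dotscope.
fn stack_effect(opcode: u8) -> Option<(u32, u32)> {
    match opcode {
        0x00 => Some((0, 0)),        // nop
        0x02..=0x05 => Some((0, 1)), // ldarg.0 .. ldarg.3
        0x58 => Some((2, 1)),        // add
        0x2A => Some((1, 0)),        // ret (assumes a value-returning method)
        _ => None,                   // outside this sketch's subset
    }
}

/// Walks a straight-line sequence and returns the maximum stack depth.
/// Returns `None` on stack underflow or on an opcode the sketch does not know.
fn max_stack(code: &[u8]) -> Option<u32> {
    let mut depth = 0u32;
    let mut max = 0u32;
    for &op in code {
        let (pops, pushes) = stack_effect(op)?;
        depth = depth.checked_sub(pops)?; // underflow means invalid code
        depth += pushes;
        max = max.max(depth);
    }
    Some(max)
}
```

Calling `max_stack(&[0x02, 0x03, 0x58, 0x2A])` (ldarg.0, ldarg.1, add, ret) walks the depth through 1, 2, 1, 0, so the peak is 2. A real analysis would also have to follow branches and handle instructions whose effect depends on a method signature.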
§Key Components
§Disassembly Components
- crate::assembly::decode_instruction - Decode a single instruction
- crate::assembly::decode_stream - Decode a sequence of instructions
- crate::assembly::decode_blocks - Build basic blocks from instruction stream
§Assembly Components
- crate::assembly::InstructionEncoder - Low-level instruction encoding (supports all 220 CIL instructions)
- crate::assembly::InstructionAssembler - High-level fluent API for common instruction patterns
- crate::assembly::LabelFixup - Label resolution system for branch instructions
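The general idea behind label fixups can be sketched without the crate. The code below is an illustration only: `TinyAssembler` and `Fixup` are hypothetical types, not dotscope's `InstructionEncoder` or `LabelFixup`, and it handles only `br.s` (opcode 0x2B), whose one-byte signed operand is relative to the start of the next instruction. A branch first gets a placeholder operand, and the real offset is patched in once all labels are known.

```rust
use std::collections::HashMap;
use std::convert::TryFrom;

/// A pending short-branch operand waiting for its label (hypothetical type).
struct Fixup {
    operand_pos: usize, // index of the placeholder byte in `code`
    label: String,
}

/// Minimal assembler that supports operand-less opcodes and `br.s`.
struct TinyAssembler {
    code: Vec<u8>,
    labels: HashMap<String, usize>,
    fixups: Vec<Fixup>,
}

impl TinyAssembler {
    fn new() -> Self {
        TinyAssembler { code: Vec::new(), labels: HashMap::new(), fixups: Vec::new() }
    }

    /// Emits a single-byte instruction with no operand.
    fn emit(&mut self, opcode: u8) {
        self.code.push(opcode);
    }

    /// Binds `name` to the current position in the byte stream.
    fn label(&mut self, name: &str) {
        self.labels.insert(name.to_string(), self.code.len());
    }

    /// Emits `br.s` with a placeholder operand and records a fixup for it.
    fn br_s(&mut self, label: &str) {
        self.code.push(0x2B);
        self.fixups.push(Fixup { operand_pos: self.code.len(), label: label.to_string() });
        self.code.push(0);
    }

    /// Patches every recorded branch and returns the final bytes.
    fn finish(self) -> Result<Vec<u8>, String> {
        let TinyAssembler { mut code, labels, fixups } = self;
        for fixup in &fixups {
            let target = *labels
                .get(&fixup.label)
                .ok_or_else(|| format!("undefined label: {}", fixup.label))?;
            // The offset is measured from the instruction after the branch.
            let next = fixup.operand_pos + 1;
            let delta = target as i64 - next as i64;
            let offset = i8::try_from(delta)
                .map_err(|_| format!("branch to {} out of range", fixup.label))?;
            code[fixup.operand_pos] = offset as u8;
        }
        Ok(code)
    }
}
```

Deferring the patch to `finish` is what lets forward references work: a branch to a label that is defined later simply waits in the fixup list.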
§Shared Components
- crate::assembly::Instruction - Represents a decoded CIL instruction
- crate::assembly::BasicBlock - A sequence of instructions with single entry/exit
- crate::assembly::Operand - Instruction operands (immediates, tokens, targets)
- crate::assembly::FlowType - How instructions affect control flow
§Usage Examples
§Disassembly
use dotscope::{assembly::decode_instruction, Parser};
let bytecode = &[0x00, 0x2A]; // nop, ret
let mut parser = Parser::new(bytecode);
let instruction = decode_instruction(&mut parser, 0x1000)?;
println!("Mnemonic: {}", instruction.mnemonic);
println!("Flow type: {:?}", instruction.flow_type);
§High-Level Assembly
use dotscope::assembly::InstructionAssembler;
let mut asm = InstructionAssembler::new();
asm.ldarg_0()?   // Load first argument
    .ldarg_1()?  // Load second argument
    .add()?      // Add them together
    .ret()?;     // Return result
let bytecode = asm.finish()?;
§Low-Level Assembly
use dotscope::assembly::{InstructionEncoder, Operand, Immediate};
let mut encoder = InstructionEncoder::new();
encoder.emit_instruction("nop", None)?;
encoder.emit_instruction("ldarg.s", Some(Operand::Immediate(Immediate::Int8(1))))?;
encoder.emit_instruction("ret", None)?;
let bytecode = encoder.finalize()?;
§Integration
The assembly module integrates with the metadata system to resolve tokens and provide semantic information about method calls, field access, and type operations. The encoder and assembler use the same instruction metadata as the disassembler, which keeps assembly and disassembly consistent with each other.
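Why a shared instruction table keeps both directions in agreement can be shown with a small standalone sketch. Everything here is hypothetical: `TABLE`, `encode`, and `decode` are illustration-only names, not dotscope's `INSTRUCTIONS` table or its encoder API, and only five operand-less opcodes are covered.

```rust
/// One table drives both directions (hypothetical; covers five opcodes only).
const TABLE: &[(u8, &str)] = &[
    (0x00, "nop"),
    (0x02, "ldarg.0"),
    (0x03, "ldarg.1"),
    (0x2A, "ret"),
    (0x58, "add"),
];

/// Looks up the opcode byte for a mnemonic.
fn opcode_for(mnemonic: &str) -> Option<u8> {
    TABLE.iter().find(|entry| entry.1 == mnemonic).map(|entry| entry.0)
}

/// Looks up the mnemonic for an opcode byte.
fn mnemonic_for(opcode: u8) -> Option<&'static str> {
    TABLE.iter().find(|entry| entry.0 == opcode).map(|entry| entry.1)
}

/// Mnemonics to bytes; `None` if any mnemonic is not in the table.
fn encode(mnemonics: &[&str]) -> Option<Vec<u8>> {
    mnemonics.iter().map(|m| opcode_for(m)).collect()
}

/// Bytes to mnemonics; `None` if any byte is not in the table.
fn decode(bytes: &[u8]) -> Option<Vec<&'static str>> {
    bytes.iter().map(|&b| mnemonic_for(b)).collect()
}
```

Because both lookups read the same entries, `decode(&encode(&["ldarg.0", "ldarg.1", "add", "ret"])?)` returns the original mnemonics. There is no second table that could drift out of sync.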
§Thread Safety
All assembly types are std::marker::Send and std::marker::Sync for safe concurrent processing.
CIL (Common Intermediate Language) instruction processing engine.
This module provides comprehensive support for processing CIL bytecode from .NET assemblies according to ECMA-335 specifications. It implements both disassembly and assembly pipelines, including instruction parsing, encoding, control flow analysis, stack effect tracking, and basic block construction for advanced static analysis and code generation capabilities.
§Architecture
The module is organized into several cooperating components: instruction decoding and encoding transform between raw bytecode and structured instruction objects, control flow analysis builds basic blocks with predecessor/successor relationships, and metadata integration provides semantic context for method-level analysis and code generation.
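As a rough illustration of the basic-block step, the sketch below finds block start offsets ("leaders") in a tiny opcode subset. It is not how `decode_blocks` is implemented, which this page does not describe. `block_starts` is a hypothetical helper that understands only `nop`, `ldarg.0` to `ldarg.3`, `add`, `ret`, and the short branches `br.s`, `brfalse.s`, and `brtrue.s`. It does not check that a branch target lands on an instruction boundary.

```rust
use std::collections::BTreeSet;
use std::convert::TryFrom;

/// Returns the sorted offsets at which basic blocks start, or `None` if the
/// bytes contain an opcode outside this sketch's subset, a truncated branch,
/// or a branch target outside the method body.
fn block_starts(code: &[u8]) -> Option<Vec<usize>> {
    let mut leaders = BTreeSet::new();
    if !code.is_empty() {
        leaders.insert(0usize); // the first instruction always starts a block
    }
    let mut pc = 0usize;
    while pc < code.len() {
        match code[pc] {
            // nop, ldarg.0..ldarg.3, add: execution continues with the next byte
            0x00 | 0x02..=0x05 | 0x58 => pc += 1,
            // ret: ends a block, so whatever follows starts a new one
            0x2A => {
                pc += 1;
                if pc < code.len() {
                    leaders.insert(pc);
                }
            }
            // br.s, brfalse.s, brtrue.s: one signed byte, relative to the next instruction
            0x2B | 0x2C | 0x2D => {
                let offset = *code.get(pc + 1)? as i8;
                let next = pc + 2;
                let target = usize::try_from(next as i64 + i64::from(offset)).ok()?;
                if target >= code.len() {
                    return None;
                }
                leaders.insert(target); // the branch target starts a block
                if next < code.len() {
                    leaders.insert(next); // so does the instruction after the branch
                }
                pc = next;
            }
            _ => return None,
        }
    }
    Some(leaders.into_iter().collect())
}
```

For the bytes `[0x02, 0x2D, 0x01, 0x00, 0x2A]` (ldarg.0, brtrue.s +1, nop, ret) the leaders are offsets 0, 3, and 4. Predecessor and successor edges would then be derived from which block ends in a branch to, or falls through into, which other block.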
§Key Components
- crate::assembly::Instruction - Complete CIL instruction representation
- crate::assembly::BasicBlock - Control flow basic block with instruction sequences
- crate::assembly::Operand - Type-safe instruction operand representation
- crate::assembly::FlowType - Control flow behavior classification
- crate::assembly::decode_instruction - Core single instruction decoder
- crate::assembly::decode_stream - Linear instruction sequence decoder
- crate::assembly::decode_blocks - Complete control flow analysis with basic blocks
- crate::assembly::InstructionEncoder - Core instruction encoding engine for assembly generation
- crate::assembly::InstructionAssembler - High-level fluent API for convenient instruction assembly
§Usage Examples
§Disassembly Examples
use dotscope::assembly::{decode_instruction, decode_stream, decode_blocks};
use dotscope::Parser;
// Decode a single instruction
let bytecode = &[0x2A]; // ret
let mut parser = Parser::new(bytecode);
let instruction = decode_instruction(&mut parser, 0x1000)?;
println!("Instruction: {}", instruction.mnemonic);
// Decode a sequence of instructions
let bytecode = &[0x00, 0x2A]; // nop, ret
let mut parser = Parser::new(bytecode);
let instructions = decode_stream(&mut parser, 0x1000)?;
assert_eq!(instructions.len(), 2);
// Decode with control flow analysis
let bytecode = &[0x00, 0x2A]; // nop, ret
let blocks = decode_blocks(bytecode, 0, 0x1000, None)?;
assert_eq!(blocks.len(), 1);
§Assembly Examples
use dotscope::assembly::{InstructionAssembler, InstructionEncoder};
use dotscope::assembly::{Operand, Immediate};
// High-level fluent API
let mut assembler = InstructionAssembler::new();
assembler
    .ldarg_0()?
    .ldarg_1()?
    .add()?
    .ret()?;
let bytecode = assembler.finish()?;
// Low-level encoder API
let mut encoder = InstructionEncoder::new();
encoder.emit_instruction("ldarg.0", None)?;
encoder.emit_instruction("ldarg.1", None)?;
encoder.emit_instruction("add", None)?;
encoder.emit_instruction("ret", None)?;
let bytecode2 = encoder.finalize()?;
assert_eq!(bytecode, bytecode2); // Both produce identical results
§Thread Safety
All public types in this module are designed to be thread-safe where appropriate.
crate::assembly::Instruction, crate::assembly::BasicBlock, and related types
implement std::marker::Send and std::marker::Sync as they contain only
thread-safe data. The decoder functions can be called concurrently from different threads
with separate parser instances.
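The pattern described here, one decoder state per thread and results that are `Send + Sync`, can be sketched with the standard library alone. The types below are stand-ins: `DecodedOp` and `decode_simple` are hypothetical and much simpler than dotscope's `Instruction` and decoder functions. The `assert_send_sync` helper is a common compile-time check, not something this page says the crate provides.

```rust
use std::thread;

/// Stand-in for a decoded instruction (hypothetical; holds only plain data).
#[derive(Debug, Clone, PartialEq)]
struct DecodedOp {
    offset: usize,
    mnemonic: &'static str,
}

/// Compiles only if `T` is both `Send` and `Sync`.
fn assert_send_sync<T: Send + Sync>() {}

/// Toy decoder: every byte is treated as one operand-less instruction.
fn decode_simple(code: &[u8]) -> Vec<DecodedOp> {
    code.iter()
        .enumerate()
        .map(|(offset, &byte)| DecodedOp {
            offset,
            mnemonic: match byte {
                0x00 => "nop",
                0x2A => "ret",
                0x58 => "add",
                _ => "unknown",
            },
        })
        .collect()
}

/// Decodes each method body on its own thread; no state is shared between them.
fn decode_in_parallel(methods: Vec<Vec<u8>>) -> Vec<Vec<DecodedOp>> {
    assert_send_sync::<DecodedOp>();
    let handles: Vec<_> = methods
        .into_iter()
        .map(|body| thread::spawn(move || decode_simple(&body)))
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("decoder thread panicked"))
        .collect()
}
```

Each thread owns its input buffer and produces its own output, which mirrors the "separate parser instances" rule above. Results come back in the same order as the input because the handles are joined in order.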
§Integration
This module integrates with:
- crate::metadata::method - Provides method-level disassembly and caching
- crate::metadata::token - Resolves metadata token references in operands
Structs§
- BasicBlock - Represents a basic block in the control flow graph.
- CilInstruction - Metadata for a CIL instruction definition.
- Instruction - A decoded CIL instruction with all metadata needed for analysis and emulation.
- InstructionAssembler - High-level fluent API for assembling CIL instructions.
- InstructionEncoder - Core CIL instruction encoder.
- LabelFixup - Label fixup information for branch instruction resolution.
- StackBehavior - Stack effect of an instruction.
Enums§
- FlowType - How an instruction affects control flow.
- Immediate - Represents an immediate value type embedded in CIL instructions.
- InstructionCategory - Categorization of instructions by their primary function.
- Operand - Represents an operand in a more structured way.
- OperandType - Types of operands for CIL instructions.
Constants§
- INSTRUCTIONS - Lookup table for single-byte CIL instruction metadata.
- INSTRUCTIONS_FE - Lookup table for double-byte CIL instruction metadata (0xFE prefix).
- INSTRUCTIONS_FE_MAX - Maximum opcode value for double-byte CIL instructions prefixed with 0xFE.
- INSTRUCTIONS_MAX - Maximum opcode value for single-byte CIL instructions.
Functions§
- decode_blocks - Decodes bytecode into a collection of basic blocks with control flow analysis.
- decode_instruction - Decodes a single CIL instruction from the current parser position.
- decode_stream - Decodes a continuous stream of CIL instructions from a byte stream.