//! ECMA-335 Metadata Streams for .NET Assembly Processing
//!
//! This module implements comprehensive parsing, representation, and access to metadata streams
//! according to the ECMA-335 standard. Metadata streams are the fundamental data structures
//! within .NET assemblies that store type definitions, method signatures, string literals,
//! binary data, and global identifiers in optimized, compressed formats.
//!
//! # Metadata Stream Architecture
//!
//! The .NET metadata format organizes data into distinct streams, each optimized for specific
//! data types and access patterns. This separation enables efficient compression, fast lookup
//! operations, and minimal memory overhead during assembly processing.
//!
//! ## Physical Layout
//! ```text
//! Metadata Root
//! ├── Stream Header Directory
//! │   ├── #Strings - String identifier heap
//! │   ├── #US - User string literals heap
//! │   ├── #Blob - Binary data heap
//! │   ├── #GUID - Global identifier array
//! │   └── #~ - Compressed metadata tables
//! └── Stream Data Sections
//! ```
//!
//! ## Stream Identification
//! Each stream is identified by a specific name and serves a distinct purpose:
//! - Fixed names like `#Strings`, `#US`, `#Blob`, `#GUID`
//! - Table streams use `#~` (compressed) or `#-` (uncompressed)
//! - Custom streams may exist but are non-standard
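//!
//! As a rough illustration of how one directory entry is laid out (a sketch only, not this
//! crate's `StreamHeader` parser), the hypothetical `parse_stream_header` below reads the
//! little-endian offset and size fields and then the null-terminated name, which ECMA-335
//! pads to a four-byte boundary:
//!
//! ```rust
//! // Hypothetical helper, not part of this crate's API.
//! // Returns (offset, size, name, bytes consumed) for one stream header.
//! fn parse_stream_header(data: &[u8]) -> Option<(u32, u32, &str, usize)> {
//!     if data.len() < 8 {
//!         return None;
//!     }
//!     let offset = u32::from_le_bytes([data[0], data[1], data[2], data[3]]);
//!     let size = u32::from_le_bytes([data[4], data[5], data[6], data[7]]);
//!
//!     // The name is a null-terminated string that starts right after the two fields.
//!     let name_bytes = &data[8..];
//!     let nul = name_bytes.iter().position(|&b| b == 0)?;
//!     let name = std::str::from_utf8(&name_bytes[..nul]).ok()?;
//!
//!     // Name plus terminator is rounded up to a multiple of four bytes.
//!     let consumed = 8 + ((nul + 1 + 3) & !3);
//!     Some((offset, size, name, consumed))
//! }
//! ```
//!
//! Under these assumptions, the bytes `6C 00 00 00 10 00 00 00 23 7E 00 00` decode to
//! offset `0x6C`, size `0x10`, and name `#~`, consuming 12 bytes.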
//!
//! # Standard Stream Types
//!
//! The ECMA-335 specification defines five standard stream types with specific formats
//! and access patterns optimized for their data characteristics.
//!
//! ## String Storage Streams
//!
//! ### `#Strings` - Identifier Heap
//! **Purpose**: Stores UTF-8 encoded identifier strings referenced by metadata tables
//! - **Content**: Type names, member names, namespace identifiers, attribute names
//! - **Format**: Null-terminated UTF-8 strings with mandatory null entry at offset 0
//! - **Access**: 0-based offset indexing with O(1) random access
//! - **Compression**: Shared string storage eliminates duplication
//! - **Performance**: Optimized for frequent lookup during type resolution
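//!
//! The lookup described above can be sketched on a raw byte slice. The helper `string_at`
//! is hypothetical and only illustrates the format; it is not how `Strings::get` is
//! implemented:
//!
//! ```rust
//! // Hypothetical helper, not part of this crate's API.
//! // Returns the identifier that starts at `offset`, or `None` if the offset is out of
//! // bounds, the terminator is missing, or the bytes are not valid UTF-8.
//! fn string_at(heap: &[u8], offset: usize) -> Option<&str> {
//!     let tail = heap.get(offset..)?;
//!     let end = tail.iter().position(|&b| b == 0)?;
//!     std::str::from_utf8(&tail[..end]).ok()
//! }
//! ```
//!
//! For a heap containing `\0Foo\0Bar\0`, offset 0 yields the mandatory empty string,
//! offset 1 yields `Foo`, and offset 5 yields `Bar`.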
//!
//! ### `#US` - User String Heap
//! **Purpose**: Stores UTF-16 encoded string literals from IL code
//! - **Content**: String constants, resource names, exception messages
//! - **Format**: Length-prefixed UTF-16 with terminal flag byte
//! - **Access**: 0-based offset indexing with variable-length entries
//! - **Encoding**: Little-endian UTF-16 with embedded null support
//! - **Performance**: Optimized for runtime string literal access
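//!
//! A minimal sketch of decoding one entry, assuming the compressed length prefix has
//! already been read and stripped. The function `decode_user_string` is hypothetical and
//! not part of this crate; it returns the decoded text together with the terminal flag:
//!
//! ```rust
//! // Hypothetical helper, not part of this crate's API.
//! // `payload` holds the bytes after the length prefix: UTF-16LE code units followed by
//! // a single terminal flag byte.
//! fn decode_user_string(payload: &[u8]) -> Option<(String, bool)> {
//!     // Split off the terminal flag byte; an empty payload carries no string.
//!     let (flag, utf16) = payload.split_last()?;
//!     if utf16.len() % 2 != 0 {
//!         return None;
//!     }
//!     let units: Vec<u16> = utf16
//!         .chunks_exact(2)
//!         .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
//!         .collect();
//!     let text = String::from_utf16(&units).ok()?;
//!     Some((text, *flag != 0))
//! }
//! ```
//!
//! With this sketch, the payload `48 00 69 00 00` decodes to `"Hi"` with the flag cleared.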
//!
//! ## Binary Data Streams
//!
//! ### `#Blob` - Binary Data Heap
//! **Purpose**: Stores variable-length binary data referenced by metadata tables
//! - **Content**: Method signatures, field types, custom attribute values, constants
//! - **Format**: Size-prefixed binary chunks with compressed length encoding
//! - **Access**: 0-based offset indexing with O(1) blob retrieval
//! - **Compression**: ECMA-335 compressed integer size prefixes
//! - **Performance**: Lazy parsing for on-demand signature decoding
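//!
//! The size prefix uses the ECMA-335 compressed unsigned integer encoding (one, two, or
//! four bytes, selected by the high bits of the first byte). The sketch below shows that
//! decoding and a blob lookup built on it. Both function names are hypothetical and are
//! not this crate's API:
//!
//! ```rust
//! // Hypothetical helper: decodes a compressed unsigned integer and returns the value
//! // together with the number of bytes it occupied.
//! fn read_compressed_u32(data: &[u8]) -> Option<(u32, usize)> {
//!     let first = *data.first()?;
//!     if first & 0x80 == 0 {
//!         // 0xxxxxxx: one byte, 7-bit value.
//!         Some((u32::from(first), 1))
//!     } else if first & 0xC0 == 0x80 {
//!         // 10xxxxxx: two bytes, 14-bit value.
//!         let b1 = *data.get(1)?;
//!         Some(((u32::from(first & 0x3F) << 8) | u32::from(b1), 2))
//!     } else if first & 0xE0 == 0xC0 {
//!         // 110xxxxx: four bytes, 29-bit value.
//!         let rest = data.get(1..4)?;
//!         let value = (u32::from(first & 0x1F) << 24)
//!             | (u32::from(rest[0]) << 16)
//!             | (u32::from(rest[1]) << 8)
//!             | u32::from(rest[2]);
//!         Some((value, 4))
//!     } else {
//!         None
//!     }
//! }
//!
//! // Hypothetical helper: returns the blob whose size prefix starts at `offset`.
//! fn blob_at(heap: &[u8], offset: usize) -> Option<&[u8]> {
//!     let tail = heap.get(offset..)?;
//!     let (len, prefix) = read_compressed_u32(tail)?;
//!     tail.get(prefix..prefix + len as usize)
//! }
//! ```
//!
//! For example, the prefix bytes `AE 57` decode to `0x2E57`, and in the heap
//! `00 03 AA BB CC` the blob at offset 1 is the three bytes `AA BB CC`.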
//!
//! ### `#GUID` - Global Identifier Array
//! **Purpose**: Stores 128-bit globally unique identifiers for assembly correlation
//! - **Content**: Assembly GUIDs, module identifiers, type library references
//! - **Format**: Sequential 16-byte GUID entries in little-endian format
//! - **Access**: 1-based indexing (unique among metadata streams)
//! - **Alignment**: Fixed 16-byte boundaries for optimal memory access
//! - **Performance**: Direct array access with minimal validation overhead
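//!
//! Because indexing is 1-based, entry `n` starts at byte `(n - 1) * 16`, and index 0 means
//! that no GUID is present. A minimal sketch with a hypothetical `guid_at` helper (not this
//! crate's `Guid::get`), returning the raw 16 bytes:
//!
//! ```rust
//! // Hypothetical helper, not part of this crate's API.
//! fn guid_at(heap: &[u8], index: usize) -> Option<[u8; 16]> {
//!     // Index 0 is rejected here; index 1 maps to the first 16-byte entry.
//!     let start = index.checked_sub(1)?.checked_mul(16)?;
//!     let bytes = heap.get(start..start.checked_add(16)?)?;
//!     let mut guid = [0u8; 16];
//!     guid.copy_from_slice(bytes);
//!     Some(guid)
//! }
//! ```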
//!
//! ## Metadata Table Streams
//!
//! ### `#~` - Compressed Metadata Tables
//! **Purpose**: Stores structural metadata in compressed tabular format
//! - **Content**: Type definitions, method signatures, field layouts, references
//! - **Format**: Variable-width compressed tables with optimized storage
//! - **Access**: Token-based indexing with cross-table references
//! - **Compression**: Row-based compression with minimal table overhead
//! - **Performance**: Bulk operations optimized for metadata scanning
//!
//! # Access Patterns and Performance
//!
//! ## Unified Interface Design
//! All heap types provide consistent access patterns:
//! - **Indexed Access**: `get(index)` for direct element retrieval
//! - **Sequential Access**: `iter()` for complete traversal
//! - **Zero-Copy**: Direct references to heap data without allocation
//! - **Error Handling**: Comprehensive bounds checking and format validation
//!
//! # Advanced Features
//!
//! ## Cross-Stream References
//! Metadata tables use indices to reference data across different streams:
//! ```text
//! Method Table Entry
//! ├── Name → #Strings offset
//! ├── Signature → #Blob offset
//! ├── RVA → Code location
//! └── Flags → Method attributes
//! ```
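//!
//! The width of each heap reference inside a table row is not fixed. In ECMA-335 the
//! `HeapSizes` byte of the `#~` header selects 2-byte or 4-byte indices per heap (bit
//! `0x01` for `#Strings`, `0x02` for `#GUID`, `0x04` for `#Blob`). A small sketch, using
//! the hypothetical helper `heap_index_widths`:
//!
//! ```rust
//! // Hypothetical helper, not part of this crate's API.
//! // Returns the index width in bytes for (#Strings, #GUID, #Blob).
//! fn heap_index_widths(heap_sizes: u8) -> (usize, usize, usize) {
//!     let width = |bit: u8| if heap_sizes & bit != 0 { 4 } else { 2 };
//!     (width(0x01), width(0x02), width(0x04))
//! }
//! ```
//!
//! A `HeapSizes` value of `0x05`, for instance, gives 4-byte `#Strings` and `#Blob`
//! indices and 2-byte `#GUID` indices.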
//!
//! ## Compression Techniques
//! - **String deduplication**: Shared storage for identical strings
//! - **Compressed integers**: Variable-length encoding for sizes and counts
//! - **Table optimization**: Minimal overhead for sparse tables
//! - **Reference packing**: Optimized token formats for cross-references
//!
//! ## Format Evolution
//! The module supports multiple metadata format versions:
//! - Legacy uncompressed format (`#-` tables)
//! - Modern compressed format (`#~` tables)
//! - Extended table schemas for newer .NET versions
//! - Backward compatibility with older assemblies
//!
//! # Examples
//!
//! ## Basic Stream Access
//! ```rust,no_run
//! use dotscope::CilObject;
//!
//! # fn example() -> dotscope::Result<()> {
//! let assembly = CilObject::from_path("example.dll")?;
//!
//! // Access string heap for type and member names
//! if let Some(strings) = assembly.strings() {
//!     let type_name = strings.get(0x123)?; // Get string at offset 0x123
//!     println!("Type name: {}", type_name);
//!
//!     // Enumerate all strings in the heap
//!     for (offset, string) in strings.iter() {
//!         if !string.is_empty() {
//!             println!("String at 0x{:X}: '{}'", offset, string);
//!         }
//!     }
//! }
//! # Ok(())
//! # }
//! ```
//!
//! ## Signature Analysis
//! ```rust,no_run
//! use dotscope::CilObject;
//!
//! # fn example() -> dotscope::Result<()> {
//! let assembly = CilObject::from_path("example.dll")?;
//!
//! // Access blob heap for method signatures and field types
//! if let Some(blob) = assembly.blob() {
//!     let signature_data = blob.get(1)?; // Get blob at offset 1
//!     println!("Signature bytes: {} bytes", signature_data.len());
//!
//!     // Analyze all binary data for debugging
//!     for (offset, blob_data) in blob.iter() {
//!         if !blob_data.is_empty() {
//!             println!("Blob at 0x{:X}: {} bytes", offset, blob_data.len());
//!         }
//!     }
//! }
//! # Ok(())
//! # }
//! ```
//!
//! ## Assembly Identity and Versioning
//! ```rust,no_run
//! use dotscope::CilObject;
//!
//! # fn example() -> dotscope::Result<()> {
//! let assembly = CilObject::from_path("example.dll")?;
//!
//! // Access GUID heap for assembly and module identifiers
//! if let Some(guid) = assembly.guids() {
//!     let assembly_guid = guid.get(1)?; // Get GUID at index 1
//!     println!("Assembly GUID: {}", assembly_guid);
//!
//!     // Enumerate all GUIDs for correlation analysis
//!     for (index, guid_value) in guid.iter() {
//!         let null_guid = uguid::guid!("00000000-0000-0000-0000-000000000000");
//!         if guid_value != null_guid {
//!             println!("Active GUID at index {}: {}", index, guid_value);
//!         }
//!     }
//! }
//! # Ok(())
//! # }
//! ```
//!
//! ## String Literal Processing
//! ```rust,no_run
//! use dotscope::CilObject;
//!
//! # fn example() -> dotscope::Result<()> {
//! let assembly = CilObject::from_path("example.dll")?;
//!
//! // Access user strings heap for IL string literals
//! if let Some(user_strings) = assembly.userstrings() {
//!     let literal = user_strings.get(0x100)?; // Get user string at offset 0x100
//!     println!("String literal: '{}'", literal.to_string_lossy());
//!
//!     // Process all string literals for analysis
//!     for (offset, string_data) in user_strings.iter() {
//!         if !string_data.is_empty() {
//!             println!("User string at 0x{:X}: '{}'", offset, string_data.to_string_lossy());
//!         }
//!     }
//! }
//! # Ok(())
//! # }
//! ```
//!
//! ## Comprehensive Metadata Analysis
//! ```rust,no_run
//! use dotscope::CilObject;
//!
//! # fn example() -> dotscope::Result<()> {
//! let assembly = CilObject::from_path("example.dll")?;
//!
//! // Analyze all available streams for comprehensive metadata overview
//! println!("=== Metadata Stream Analysis ===");
//!
//! if let Some(strings) = assembly.strings() {
//!     let string_count = strings.iter().count();
//!     println!("String heap: {} entries", string_count);
//! }
//!
//! if let Some(user_strings) = assembly.userstrings() {
//!     let literal_count = user_strings.iter().count();
//!     println!("User string heap: {} entries", literal_count);
//! }
//!
//! if let Some(blob) = assembly.blob() {
//!     let blob_count = blob.iter().count();
//!     println!("Blob heap: {} entries", blob_count);
//! }
//!
//! if let Some(guid) = assembly.guids() {
//!     let guid_count = guid.iter().count();
//!     println!("GUID heap: {} entries", guid_count);
//! }
//!
//! if let Some(tables) = assembly.tables() {
//!     let table_count = tables.table_count();
//!     println!("Metadata tables: {} tables present", table_count);
//! }
//! # Ok(())
//! # }
//! ```
//!
//! # Error Handling and Validation
//!
//! All stream operations provide comprehensive error handling:
//! - **Format Validation**: Ensures stream headers and data integrity
//! - **Bounds Checking**: Prevents access beyond stream boundaries
//! - **Encoding Validation**: Verifies string encoding and compressed integers
//! - **Cross-Reference Validation**: Validates indices between streams and tables
//!
//! ## Iterator Error Behavior
//!
//! Stream iterators have two error handling strategies, chosen based on data characteristics:
//!
//! | Iterator | Strategy | Rationale |
//! |----------|----------|-----------|
//! | `StringsIterator` | Fail-fast | Null-terminated strings have clear boundaries |
//! | `BlobIterator` | Fail-fast | Length prefixes define exact extents |
//! | `GuidIterator` | Fail-fast | Fixed 16-byte entries have rigid structure |
//! | `UserStringsIterator` | Best-effort recovery | Complex format may have recoverable corruption |
//!
//! **Fail-fast iterators** return `None` immediately on encountering malformed data,
//! terminating iteration. This is appropriate when data corruption affects parsing
//! of subsequent entries.
//!
//! **Best-effort iterators** attempt to skip over corrupted entries (with bounded
//! retry limits) to continue yielding valid data. This is useful for forensic
//! analysis of partially corrupted assemblies.
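//!
//! The fail-fast strategy can be sketched with a small iterator over null-terminated
//! strings. `FailFastStrings` is a hypothetical type written for illustration only, not
//! this crate's `StringsIterator`; it stops at the first entry that is unterminated or
//! not valid UTF-8 instead of trying to resynchronize:
//!
//! ```rust
//! // Hypothetical type, not part of this crate's API.
//! struct FailFastStrings<'a> {
//!     heap: &'a [u8],
//!     pos: usize,
//! }
//!
//! impl<'a> Iterator for FailFastStrings<'a> {
//!     type Item = (usize, &'a str);
//!
//!     fn next(&mut self) -> Option<Self::Item> {
//!         let heap: &'a [u8] = self.heap;
//!         let tail = heap.get(self.pos..)?;
//!         // A missing terminator (including end of heap) ends iteration.
//!         let end = tail.iter().position(|&b| b == 0)?;
//!         // Malformed UTF-8 also ends iteration; `pos` is left untouched, so every
//!         // later call keeps returning `None`.
//!         let text = std::str::from_utf8(&tail[..end]).ok()?;
//!         let offset = self.pos;
//!         self.pos += end + 1;
//!         Some((offset, text))
//!     }
//! }
//! ```
//!
//! Given the heap `\0Foo\0\xFF\0Bar\0`, this sketch yields `(0, "")` and `(1, "Foo")` and
//! then stops at the invalid byte, so `Bar` is never reached. A best-effort iterator would
//! instead skip the damaged entry and continue.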
//!
//! ## Common Error Scenarios
//! - Corrupted or truncated stream data
//! - Invalid offset or index values
//! - Malformed compressed integer encoding
//! - Inconsistent cross-stream references
//! - Unsupported metadata format versions
//!
//! # ECMA-335 Compliance
//!
//! This implementation fully complies with ECMA-335 requirements:
//! - Correct parsing of all standard stream formats
//! - Proper handling of compressed and uncompressed metadata
//! - Support for all defined string encodings (UTF-8, UTF-16)
//! - Accurate implementation of compressed integer formats
//! - Complete validation of stream headers and data integrity
//!
//! # Implementation Notes
//!
//! ## Memory Management
//! - **Zero-copy design**: All stream data accessed via direct references
//! - **Lazy evaluation**: Stream parsing deferred until first access
//! - **Minimal allocation**: Iterator state requires minimal heap allocation
//! - **Reference counting**: Safe lifetime management across thread boundaries
//!
//! ## Optimization Strategies
//! - **Compressed integer caching**: Frequently accessed sizes cached
//! - **String interning**: Identical strings share storage automatically
//! - **Sequential access optimization**: Iterator patterns optimized for cache locality
//! - **Bulk operations**: Table scanning operations optimized for performance
//!
//! ## Cross-Platform Considerations
//! - **Endianness handling**: Proper little-endian conversion on all platforms
//! - **Alignment requirements**: Respect platform-specific alignment constraints
//! - **Unicode normalization**: Consistent string handling across operating systems
//! - **Path separators**: Platform-agnostic resource name processing
//!
//! # See Also
//! - [`crate::metadata::cilobject::CilObject`]: Main assembly access interface
//! - [`crate::metadata::tables`]: Metadata table processing and analysis
//! - [`crate::File`]: PE file format handling and data access
//! - [`crate::metadata::signatures`]: Binary signature parsing and analysis
//!
//! # References
//! - **ECMA-335 II.24.2.2**: Stream header specification and directory format
//! - **ECMA-335 II.24.2.3**: `#Strings` heap format and encoding rules
//! - **ECMA-335 II.24.2.4**: `#US` and `#Blob` heap formats and length-prefix encoding
//! - **ECMA-335 II.24.2.5**: `#GUID` heap format and indexing convention
//! - **ECMA-335 II.24.2.6**: `#~` stream header and table layout
//! - **ECMA-335 II.22**: Metadata table definitions and relationships
/// Stream header parsing and validation for ECMA-335 metadata directory.
///
/// Provides [`crate::metadata::streams::streamheader::StreamHeader`] for parsing stream directory entries, validating
/// stream names, and calculating data offsets within the metadata section.
pub use streamheader::StreamHeader;
/// UTF-8 identifier string heap (`#Strings`) implementation.
///
/// Provides [`crate::metadata::streams::strings::Strings`] and [`crate::metadata::streams::strings::StringsIterator`] for accessing null-terminated
/// UTF-8 strings used for type names, member names, and other identifiers.
pub use strings::{Strings, StringsIterator};
/// UTF-16 user string heap (`#US`) implementation.
///
/// Provides [`crate::metadata::streams::userstrings::UserStrings`] and [`crate::metadata::streams::userstrings::UserStringsIterator`] for accessing
/// length-prefixed UTF-16 string literals from IL code and resources.
pub use userstrings::{UserStrings, UserStringsIterator};
/// Metadata tables header (`#~` / `#-`) parsing and validation.
///
/// Provides [`crate::metadata::streams::tablesheader::TablesHeader`] for parsing compressed and uncompressed
/// metadata table headers, schemas, and row count information.
pub use tablesheader::TablesHeader;
/// 128-bit GUID array (`#GUID`) implementation.
///
/// Provides [`crate::metadata::streams::guid::Guid`] and [`crate::metadata::streams::guid::GuidIterator`] for accessing globally unique
/// identifiers used for assembly identity and cross-reference correlation.
pub use guid::{Guid, GuidIterator};
/// Variable-length binary data heap (`#Blob`) implementation.
///
/// Provides [`crate::metadata::streams::blob::Blob`] and [`crate::metadata::streams::blob::BlobIterator`] for accessing size-prefixed
/// binary data including signatures, custom attributes, and constants.
pub use blob::{Blob, BlobIterator};