// Copyright 2015 Ted Mielczarek. See the COPYRIGHT
// file at the top-level directory of this distribution.

//! A parser for the minidump file format.
//!
//! The `minidump` module provides a parser for the
//! [minidump][minidump] file format as produced by Microsoft's
//! [`MinidumpWriteDump`][minidumpwritedump] API and the
//! [Google Breakpad][breakpad] library.
//!
//!
//!
//! # Usage
//!
//! The primary API for this library is the [`Minidump`][] struct, which can be
//! instantiated by calling the [`Minidump::read`][] or [`Minidump::read_path`][]
//! methods.
//!
//! Successfully parsing a Minidump struct means the minidump has a minimally valid
//! header and stream directory. Individual streams are only parsed when they're
//! requested.
//!
//! Although you may enumerate the streams in a minidump with methods like
//! [`Minidump::all_streams`][], this is only really useful for debugging. Instead
//! you should statically request streams with [`Minidump::get_stream`][].
//! Depending on what analysis you're trying to perform, you may:
//!
//! * Consider it an error for a stream to be missing (using `?` or `unwrap`)
//! * Branch on the presence of a stream to conditionally refine your analysis
//! * Use a stream's `Default` implementation to get an "empty" instance
//!   (with `unwrap_or_default`)
//!
//! ```
//! use minidump::*;
//!
//! fn main() -> Result<(), Error> {
//!     // Read the minidump from a file
//!     let mut dump = minidump::Minidump::read_path("../testdata/test.dmp")?;
//!
//!     // Statically request (and require) several streams we care about:
//!     let system_info = dump.get_stream::<MinidumpSystemInfo>()?;
//!     let exception = dump.get_stream::<MinidumpException>()?;
//!
//!     // Combine the contents of the streams to perform more refined analysis
//!     let crash_reason = exception.get_crash_reason(system_info.os, system_info.cpu);
//!
//!     // Conditionally analyze a stream
//!     if let Ok(threads) = dump.get_stream::<MinidumpThreadList>() {
//!         // Use `Default` to try to make progress when a stream is missing.
//!         // This is especially natural for MinidumpMemoryList because
//!         // everything needs to handle memory lookups failing anyway.
//!         let mem = dump.get_memory().unwrap_or_default();
//!
//!         for thread in &threads.threads {
//!             let stack = thread.stack_memory(&mem);
//!             // ...
//!         }
//!     }
//!
//!     Ok(())
//! }
//! ```
//!
//! Generally speaking, there isn't any reason to distinguish between a stream being
//! absent and it being corrupt. Just ask for what you want and we'll do our best
//! to give it to you.
//!
//! Everything else you would want to do with a Minidump is specific to the
//! individual streams:
//!
//! * [`MinidumpAssertion`][]
//! * [`MinidumpBreakpadInfo`][]
//! * [`MinidumpCrashpadInfo`][]
//! * [`MinidumpException`][]
//! * [`MinidumpLinuxCpuInfo`][]
//! * [`MinidumpLinuxEnviron`][]
//! * [`MinidumpLinuxLsbRelease`][]
//! * [`MinidumpLinuxMaps`][]
//! * [`MinidumpLinuxProcStatus`][]
//! * [`MinidumpMacCrashInfo`][]
//! * [`MinidumpMacBootargs`][]
//! * [`MinidumpMemoryList`][]
//! * [`MinidumpMemoryInfoList`][]
//! * [`MinidumpMiscInfo`][]
//! * [`MinidumpModuleList`][]
//! * [`MinidumpSystemInfo`][]
//! * [`MinidumpThreadList`][]
//! * [`MinidumpThreadNames`][]
//! * [`MinidumpUnloadedModuleList`][]
//! * [`MinidumpLinuxProcLimits`][]
//!
//!
//!
//!
//! # Notable Streams
//!
//! There are a lot of different minidump streams, but some are especially
//! notable/fundamental:
//!
//! [`MinidumpSystemInfo`][] includes details about the hardware and operating
//! system that the crash occurred on. This information is often required to
//! properly interpret the other streams of the minidump, as they contain
//! platform-specific values.
//!
//! [`MinidumpException`][] includes actual details about where and why the crash
//! occurred.
//!
//! [`MinidumpThreadList`][] includes the registers and stack memory of every thread
//! in the program at the time of the crash. This enables generating backtraces for
//! every thread.
//!
//! [`MinidumpMemoryList`][] maps the crashing program's runtime addresses (such as
//! `$rsp`) to ranges of memory in the Minidump.
//!
//! [`MinidumpModuleList`][] includes info on all the modules (libraries) that were
//! linked into the crashing program. This enables symbolication, as you can map
//! instruction addresses back to offsets in a specific library's binary.
//!
//!
//!
//!
//! # What is a Minidump?
//!
//! Minidumps capture the state of a crashing process (threads, stack memory,
//! registers, dlls), why it crashed (crashing thread, error codes, error
//! messages), and details about the system the program was running on (os, cpu).
//!
//! The information in a minidump is divided up into a series of
//! independent "streams". If you want a specific piece of information, you must
//! know the stream that contains it, and then look up that stream in the
//! minidump's directory. Most streams are pretty straightforward (you can guess
//! what you might find in [`MinidumpThreadList`][] or [`MinidumpSystemInfo`][]),
//! but others, like [`MinidumpMiscInfo`][], are a bit more random.
//!
//! This [format][minidump] was initially defined by Microsoft, as Windows has long
//! included [system APIs to generate minidumps][minidumpwritedump]. But lots of
//! software gets made for operating systems other than Windows, where no such
//! native support for minidumps is present. [google-breakpad][breakpad] was
//! created to extend Microsoft's minidump format to other platforms, and defines
//! minidump generators for things like Linux and macOS.
//!
//! I do not believe that Microsoft and Breakpad officially collaborate on the
//! format. It's just designed to be very extensible, so it's easy to add random
//! stuff to a minidump in ways that don't break old tools and likely won't
//! interfere with future versions. That said, Microsoft does now develop
//! cross-platform products that make use of Breakpad, such as VSCode, so at the
//! very least their crash reporting infra deals with Breakpad minidumps.
//!
//! The rust-minidump crates are specifically designed to support Breakpad's
//! extended minidump format (and native Windows minidumps, which should in theory
//! just be a subset). That said, rust-minidump doesn't yet (and probably won't
//! ever) support *everything*. There's a lot of random stuff that either Microsoft
//! or Breakpad has defined over the years that we just do not have any use for
//! at the moment. There's not a lot of demand for handling minidumps for
//! PlayStation 3, SPARC, or Windows CE these days.
//!
//!
//!
//!
//!
//! # The Minidump Format
//!
//! This section is dedicated to describing how to parse minidumps, for anyone
//! wanting to maintain this code or write their own parser.
//!
//! Minidumps are a binary format. This format is simultaneously very simple and
//! very complicated.
//!
//! The simple part of a minidump is that it's basically just an array of pointers
//! to different typed "Streams" (system info, exception info, threads, memory
//! mappings, etc.). So if you want to look up the system info, you just search the
//! array for a system info stream and interpret that range of memory as that
//! stream.
//!
//! The complicated part of a minidump is the fact that every stream contains
//! totally different information in totally different formats. Sure, there are
//! families of streams that have the same general structure, but you've still got
//! to write custom code to interpret the values meaningfully and figure out what
//! on earth that information is useful for.
//!
//! Sometimes the answer to "what is it useful for?" is "I don't know but maybe
//! we'll find a use for it later". This is genuinely useful because it allows us
//! to add new analyses long after a crash occurs and gain new insights that the
//! minidump format wasn't explicitly designed to provide.
//!
//! This is all to say that, beyond the basic layout of the minidump header and
//! directory, it's basically just a big ball of unrelated stream formats, each
//! with its own layout. Everyone is technically free to come up with their own
//! custom Streams that they can just toss in there, so trying to cover
//! everything is kind of impossible. Let's see how far we get!
//!
//!
//!
//! ## The Minidump Header and Directory
//!
//! The first thing in a Minidump is the [`MINIDUMP_HEADER`][format::MINIDUMP_HEADER], which has the
//! following layout:
//!
//! ```
//! pub struct MINIDUMP_HEADER {
//!     pub signature: u32,
//!     pub version: u32,
//!     pub stream_count: u32,
//!     pub stream_directory_rva: RVA,
//!     pub checksum: u32,
//!     pub time_date_stamp: u32,
//!     pub flags: u64,
//! }
//!
//! /// Offset into the minidump
//! pub type RVA = u32;
//! ```
//!
//! The `signature` is always [`MINIDUMP_SIGNATURE`][format::MINIDUMP_SIGNATURE] = `0x504d444d`
//! ("MDMP" in ASCII). You can use this to detect whether the minidump is little-endian or
//! big-endian (minidumps always have the endianness of the platform they were generated
//! on, since they contain lots of raw memory from the process, but at this point
//! we don't know what that platform is).
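//!
//! As a rough illustration of that check, the sketch below reads the first four
//! bytes in both byte orders and compares each against the signature. The names
//! `DumpEndian` and `detect_endian` are hypothetical and exist only for this
//! example; they are not part of this crate's API.
//!
//! ```rust
//! /// Byte order of a minidump, inferred from its signature (hypothetical type).
//! #[derive(Debug, PartialEq, Clone, Copy)]
//! enum DumpEndian {
//!     Little,
//!     Big,
//! }
//!
//! /// The signature value given above.
//! const SIGNATURE: u32 = 0x504d444d;
//!
//! /// Hypothetical helper: returns the byte order, or `None` if the first
//! /// four bytes are not a minidump signature in either order.
//! fn detect_endian(bytes: &[u8]) -> Option<DumpEndian> {
//!     if bytes.len() < 4 {
//!         return None;
//!     }
//!     let raw = [bytes[0], bytes[1], bytes[2], bytes[3]];
//!     if u32::from_le_bytes(raw) == SIGNATURE {
//!         Some(DumpEndian::Little)
//!     } else if u32::from_be_bytes(raw) == SIGNATURE {
//!         Some(DumpEndian::Big)
//!     } else {
//!         None
//!     }
//! }
//! ```
//!
//! Once the byte order is known, every other integer in the file would be read
//! with that same order.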
//!
//! The lower 16 bits of `version` are always
//! [`MINIDUMP_VERSION`][format::MINIDUMP_VERSION] = 42899.
//! (The high bits contain implementation-specific values that you should just
//! ignore).
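//!
//! A minimal sketch of that comparison, masking off the high bits before
//! checking the value. The helper name `version_matches` is made up for this
//! example, and the constant is redefined locally so the snippet stands alone.
//!
//! ```rust
//! /// The low 16 bits every minidump is expected to carry in `version`.
//! const MINIDUMP_VERSION: u32 = 42899; // 0xa793
//!
//! /// Hypothetical helper: compares only the low 16 bits of `version`.
//! fn version_matches(version: u32) -> bool {
//!     version & 0xffff == MINIDUMP_VERSION
//! }
//! ```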
//!
//! `stream_directory_rva` and `stream_count` are the location (offset from the
//! start of the file, in bytes) and size of the stream directory, respectively.
//!
//! `checksum` is some kind of checksum of the minidump itself (which may be zero),
//! but the algorithm isn't specified, and rust-minidump doesn't check it.
//!
//! `time_date_stamp` is a Windows `time_t` of when the minidump was generated.
//!
//! `flags` is a [`MINIDUMP_TYPE`][MINIDUMP_TYPE] value which largely just specifies what you can expect
//! to find in the minidump. This is unused by rust-minidump since this information
//! is generally redundant with the stream directory and flags within the streams
//! that we need to check anyway. (e.g. instead of checking that this is a
//! `MiniDumpWithUnloadedModules`, you can just check the directory for the
//! [`MinidumpUnloadedModuleList`][] stream.)
//!
//! At `stream_directory_rva` (typically immediately after the header) you will find
//! an array of `stream_count` [`MINIDUMP_DIRECTORY`][format::MINIDUMP_DIRECTORY] entries,
//! with the following layout:
//!
//! ```
//! pub struct MINIDUMP_DIRECTORY {
//!     /// The type of the stream
//!     pub stream_type: u32,
//!     /// The location of the stream contents within the dump.
//!     pub location: MINIDUMP_LOCATION_DESCRIPTOR,
//! }
//!
//! /// A "slice" of the minidump
//! pub struct MINIDUMP_LOCATION_DESCRIPTOR {
//!     /// The size of this data (in bytes)
//!     pub data_size: u32,
//!     /// The offset to this data within the minidump file.
//!     pub rva: RVA,
//! }
//!
//! /// Offset into the minidump
//! pub type RVA = u32;
//! ```
//!
//! Known `stream_type` values are defined in
//! [`MINIDUMP_STREAM_TYPE`][format::MINIDUMP_STREAM_TYPE], but users
//! are allowed to define their own stream types, so it's normal to see unknown
//! types (this is the primary mechanism breakpad uses to extend the format without
//! causing upstream problems).
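//!
//! Putting the header and directory together, a lookup could be sketched as
//! follows. This assumes a little-endian dump held entirely in memory, and the
//! names `DirEntry`, `read_directory`, and `find_stream` are hypothetical; the
//! crate's real entry point is [`Minidump::get_stream`][].
//!
//! ```rust
//! /// One parsed directory entry (mirrors `MINIDUMP_DIRECTORY` above).
//! #[derive(Debug, PartialEq)]
//! struct DirEntry {
//!     stream_type: u32,
//!     data_size: u32,
//!     rva: u32,
//! }
//!
//! /// Size in bytes of one entry: three `u32` fields.
//! const ENTRY_SIZE: usize = 12;
//!
//! /// Reads a little-endian `u32` at `offset`, or `None` if out of bounds.
//! fn read_u32_le(bytes: &[u8], offset: usize) -> Option<u32> {
//!     let end = offset.checked_add(4)?;
//!     let b = bytes.get(offset..end)?;
//!     Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
//! }
//!
//! /// Hypothetical helper: reads `stream_count` entries starting at
//! /// `directory_rva`, returning `None` if any entry is out of bounds.
//! fn read_directory(dump: &[u8], directory_rva: u32, stream_count: u32) -> Option<Vec<DirEntry>> {
//!     let mut entries = Vec::new();
//!     for i in 0..stream_count as usize {
//!         let base = (directory_rva as usize).checked_add(i.checked_mul(ENTRY_SIZE)?)?;
//!         entries.push(DirEntry {
//!             stream_type: read_u32_le(dump, base)?,
//!             data_size: read_u32_le(dump, base + 4)?,
//!             rva: read_u32_le(dump, base + 8)?,
//!         });
//!     }
//!     Some(entries)
//! }
//!
//! /// Returns the bytes of the first stream with the given type, if present.
//! fn find_stream<'a>(dump: &'a [u8], entries: &[DirEntry], stream_type: u32) -> Option<&'a [u8]> {
//!     let entry = entries.iter().find(|e| e.stream_type == stream_type)?;
//!     let start = entry.rva as usize;
//!     let end = start.checked_add(entry.data_size as usize)?;
//!     dump.get(start..end)
//! }
//! ```
//!
//! Every offset and size comes from untrusted input, so the sketch uses checked
//! arithmetic and bounds-checked slicing rather than indexing directly.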
//!
//! And that's it! Everything else in a minidump is just all the different types of
//! stream. As of this writing, rust-minidump is aware of 51 different types of
//! stream, and implements 18 of them (there's a long tail of platform-specific and
//! domain-specific streams, so that isn't as bad as it sounds).
//!
//!
//!
//!
//! ## Stream Format Families
//!
//! Although every stream can do whatever it wants, there are a lot of streams that
//! are basically "a struct" or "a list of structs", so the same header formats and
//! layouts are used in several places. (This is descriptive, so these aren't
//! necessarily official terms/concepts.)
//!
//!
//!
//! ### Plain Old Struct Streams
//!
//! A stream that's just a struct.
//!
//! That's it. Just read the struct out of the stream. Although it might contain
//! RVAs to other data, which may or may not be relative to the start of the stream
//! or the start of the file (annoyingly inconsistent between streams).
//!
//! Known members of this family:
//!
//! * [`MinidumpAssertion`][] (contains [`MINIDUMP_ASSERTION_INFO`][format::MINIDUMP_ASSERTION_INFO])
//! * [`MinidumpBreakpadInfo`][] (contains [`MINIDUMP_BREAKPAD_INFO`][format::MINIDUMP_BREAKPAD_INFO])
//! * [`MinidumpCrashpadInfo`][] (contains [`MINIDUMP_CRASHPAD_INFO`][format::MINIDUMP_CRASHPAD_INFO])
//! * [`MinidumpException`][] (contains [`MINIDUMP_EXCEPTION_STREAM`][format::MINIDUMP_EXCEPTION_STREAM])
//! * [`MinidumpSystemInfo`][] (contains [`MINIDUMP_SYSTEM_INFO`][format::MINIDUMP_SYSTEM_INFO])
//!
//!
//!
//! ### List Streams
//!
//! A list of some entry type.
//!
//! A `u32` count of entries followed by an array of entries. There may be padding
//! between the count and the entries. The array should be "right-justified" in the
//! stream (the stream ends exactly where the array does), so you can use the
//! difference between the array's expected size and the rest of the stream's size
//! to determine the padding.
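//!
//! The padding calculation could be sketched like this, assuming a
//! little-endian stream and a caller-supplied entry size. `list_entries` is a
//! hypothetical helper for illustration, not how this crate exposes list
//! streams.
//!
//! ```rust
//! /// Hypothetical helper: splits a list stream into its fixed-size entries.
//! /// `entry_size` is the size in bytes of one entry for this stream type.
//! fn list_entries(stream: &[u8], entry_size: usize) -> Option<Vec<&[u8]>> {
//!     if entry_size == 0 || stream.len() < 4 {
//!         return None;
//!     }
//!     let count = u32::from_le_bytes([stream[0], stream[1], stream[2], stream[3]]) as usize;
//!     let array_size = count.checked_mul(entry_size)?;
//!     let rest = &stream[4..];
//!     // The array is right-justified, so whatever precedes it is padding.
//!     // If the array doesn't fit in the stream, the subtraction fails.
//!     let padding = rest.len().checked_sub(array_size)?;
//!     Some(rest[padding..].chunks_exact(entry_size).collect())
//! }
//! ```
//!
//! A stricter parser might also reject implausibly large padding values
//! instead of accepting any amount.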
//!
//! This format is used by a lot of the oldest (and therefore most important)
//! minidump streams.
//!
//! Known members of this family:
//!
//! * [`MinidumpMemoryList`] (entries are [`MINIDUMP_MEMORY_DESCRIPTOR`][format::MINIDUMP_MEMORY_DESCRIPTOR])
//! * [`MinidumpModuleList`] (entries are [`MINIDUMP_MODULE`][format::MINIDUMP_MODULE])
//! * [`MinidumpThreadList`] (entries are [`MINIDUMP_THREAD`][format::MINIDUMP_THREAD])
//! * [`MinidumpThreadNames`] (entries are [`MINIDUMP_THREAD_NAME`][format::MINIDUMP_THREAD_NAME])
//! * `MINIDUMP_THREAD_EX_LIST` (yes, the stream with "EX_LIST" in the name isn't an
//!   EX list, names are hard.)
//!
//! The stream [`MinidumpMemory64List`] is a variant of list stream. It starts with
//! a `u64` count of entries and a 64-bit RVA shared by all entries, followed by an
//! array of [`MINIDUMP_MEMORY_DESCRIPTOR64`][format::MINIDUMP_MEMORY_DESCRIPTOR64] entries.
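//!
//! Because the RVA is shared, each entry's data has to be located by position.
//! The sketch below assumes (this is not stated above) that each descriptor is
//! a pair of `u64`s, a start address and a data size, and that the memory
//! contents are stored back to back starting at the shared RVA.
//! `memory64_offsets` is a hypothetical helper.
//!
//! ```rust
//! /// Hypothetical helper. Each descriptor is `(start_of_memory_range, data_size)`.
//! /// Returns `(file_offset, data_size)` for each region, assuming the regions
//! /// are stored contiguously starting at `base_rva`.
//! fn memory64_offsets(base_rva: u64, descriptors: &[(u64, u64)]) -> Option<Vec<(u64, u64)>> {
//!     let mut offset = base_rva;
//!     let mut out = Vec::with_capacity(descriptors.len());
//!     for &(_start, size) in descriptors {
//!         out.push((offset, size));
//!         // Advance past this region; bail out on overflow.
//!         offset = offset.checked_add(size)?;
//!     }
//!     Some(out)
//! }
//! ```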
//!
//!
//! ### EX List Streams
//!
//! A newer and more flexible version of list streams. (so EXtreme!!!)
//!
//! EX list streams start with this header:
//!
//! ```
//! struct EX_LIST_HEADER {
//!   /// Size (in bytes) of this header (array starts immediately after)
//!   pub size_of_header: u32,
//!   /// Size (in bytes) of an entry in the array
//!   pub size_of_entry: u32,
//!   /// The number of entries in the array
//!   pub number_of_entries: u32,
//! }
//! ```
//!
//! This design allows newer versions of the stream to be introduced, and for fields
//! to be added to the end of an entry type. I am not aware of an instance where
//! this flexibility has been used yet, but in theory you could identify "versions"
//! of the stream format by size, and older versions don't need to worry about
//! unknown future revisions, because they can just ignore the trailing bytes of
//! each entry.
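//!
//! A reader that tolerates larger future entries could be sketched as below.
//! It assumes a little-endian stream, and `ex_list_entries` and
//! `known_entry_size` are hypothetical names: the latter is the number of bytes
//! per entry that this reader knows how to interpret.
//!
//! ```rust
//! /// Hypothetical helper: returns each entry of an EX list stream, truncated
//! /// to the `known_entry_size` bytes that this reader understands.
//! fn ex_list_entries(stream: &[u8], known_entry_size: usize) -> Option<Vec<&[u8]>> {
//!     // Reads the `i`th `u32` field of the header.
//!     let field = |i: usize| -> Option<usize> {
//!         let b = stream.get(i * 4..i * 4 + 4)?;
//!         Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]) as usize)
//!     };
//!     let (size_of_header, size_of_entry, number_of_entries) = (field(0)?, field(1)?, field(2)?);
//!     // The header must at least hold its own three fields, and an entry
//!     // smaller than what we understand can't be read as our type.
//!     if size_of_header < 12 || size_of_entry == 0 || size_of_entry < known_entry_size {
//!         return None;
//!     }
//!     let array_size = size_of_entry.checked_mul(number_of_entries)?;
//!     let array_end = size_of_header.checked_add(array_size)?;
//!     let array = stream.get(size_of_header..array_end)?;
//!     // Step by the size the file declares, but keep only the bytes we know.
//!     Some(
//!         array
//!             .chunks_exact(size_of_entry)
//!             .map(|entry| &entry[..known_entry_size])
//!             .collect(),
//!     )
//! }
//! ```
//!
//! The key point is that the stride comes from `size_of_entry` in the file, not
//! from the size of the reader's own struct.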
//!
//! Known members of this family:
//!
//! * [`MinidumpMemoryInfoList`][] (entries are [`MINIDUMP_MEMORY_INFO`][format::MINIDUMP_MEMORY_INFO])
//! * [`MinidumpUnloadedModuleList`][] (entries are [`MINIDUMP_UNLOADED_MODULE`][format::MINIDUMP_UNLOADED_MODULE])
//! * [`MinidumpHandleDataStream`][] is a slight variation of this format with different
//!   field names and a trailing `u32` member reserved for future use (entries
//!   are [`MINIDUMP_HANDLE_DESCRIPTOR`][format::MINIDUMP_HANDLE_DESCRIPTOR] and
//!   [`MINIDUMP_HANDLE_DESCRIPTOR_2`][format::MINIDUMP_HANDLE_DESCRIPTOR_2])
//! * [`MinidumpThreadInfoList`][] (entries are [`MINIDUMP_THREAD_INFO`][format::MINIDUMP_THREAD_INFO])
//!
//!
//!
//! ### Linux List Streams
//!
//! A dump of a special Linux file like `/proc/cpuinfo`.
//!
//! These streams are plain text ([`strings::LinuxOsString`][]) files containing
//! line-delimited key-value pairs, like:
//!
//! ```text
//! processor       : 0
//! vendor_id       : GenuineIntel
//! cpu family      : 6
//! model           : 45
//! model name      : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
//! ```
//!
//! Whitespace and separators vary from stream to stream.
//!
//! Known members of this family:
//!
//! * [`MinidumpLinuxCpuInfo`][] (separator is `:`)
//! * [`MinidumpLinuxEnviron`][] (separator is `=`)
//! * [`MinidumpLinuxLsbRelease`][] (separator is `=`)
//! * [`MinidumpLinuxProcStatus`][] (separator is `:`)
//! * [`MinidumpLinuxProcLimits`][] (separator is ` `)
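//!
//! For the `:` and `=` separated streams, the parsing could be sketched as
//! below. This assumes the stream has already been decoded to a `&str` (the
//! real data is not guaranteed to be valid UTF-8), and `parse_pairs` is a
//! hypothetical helper. Space-separated data needs different handling, since
//! keys may themselves contain spaces.
//!
//! ```rust
//! /// Hypothetical helper: splits each line at the first `separator` and trims
//! /// surrounding whitespace from the key and value. Lines without the
//! /// separator (such as blank lines) are skipped.
//! fn parse_pairs(text: &str, separator: char) -> Vec<(&str, &str)> {
//!     text.lines()
//!         .filter_map(|line| line.split_once(separator))
//!         .map(|(key, value)| (key.trim(), value.trim()))
//!         .collect()
//! }
//! ```
//!
//! Splitting at the first separator only keeps values such as
//! `PATH=/bin:/usr/bin` intact.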
//!
//!
//!
//! [MINIDUMP_TYPE]: https://docs.microsoft.com/en-us/windows/win32/api/minidumpapiset/ne-minidumpapiset-minidump_type
//! [minidump]: https://msdn.microsoft.com/en-us/library/windows/desktop/ms680369%28v=vs.85%29.aspx
//! [minidumpwritedump]: https://msdn.microsoft.com/en-us/library/windows/desktop/ms680360%28v=vs.85%29.aspx
//! [breakpad]: https://chromium.googlesource.com/breakpad/breakpad/+/master/

#![warn(missing_debug_implementations)]

#[cfg(doctest)]
doc_comment::doctest!("../README.md");

pub use scroll::Endian;

mod context;
mod iostuff;
mod minidump;

pub use minidump_common::format;
pub use minidump_common::traits::Module;

pub use crate::iostuff::Readable;
pub use crate::minidump::*;

pub mod strings;
pub mod system_info;