Outlook Email Message (.msg) parser
A parser for Microsoft Outlook .msg files (OLE Compound Document format).
Extracts message metadata, body content, recipients, attachments, and transport
headers as specified in [MS-OXMSG] and [MS-OXPROPS].
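Because .msg files are OLE Compound Documents, they begin with the fixed 8-byte compound-file signature. The sketch below shows a cheap pre-check you could run before handing bytes to the parser. The helper name `looks_like_compound_file` is hypothetical and not part of this crate; a match only means the data is some compound file (for example a legacy .doc or .xls), not necessarily a .msg.

```rust
/// The 8-byte signature that opens every OLE Compound File container.
const CFB_SIGNATURE: [u8; 8] = [0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1];

/// Returns true if `bytes` starts with the compound-file signature.
/// Hypothetical helper for illustration; not provided by the parser.
fn looks_like_compound_file(bytes: &[u8]) -> bool {
    bytes.starts_with(&CFB_SIGNATURE)
}
```

A check like this is only a fast rejection filter; the parser's own `Result` remains the authority on whether a file is a valid message.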
Usage
Add this to your Cargo.toml:
[dependencies]
= "0.3"
Quick Start
use Outlook;
Parsing from different sources
use Outlook;
// `path` is the location of a .msg file
// From a file path
let outlook = Outlook::from_path(path).unwrap();
// From a byte slice (accepts &[u8], Vec<u8>, or anything AsRef<[u8]>)
let bytes = std::fs::read(path).unwrap();
let outlook = Outlook::from_slice(&bytes).unwrap();
// Passing a Vec<u8> directly also works
let outlook = Outlook::from_slice(bytes).unwrap();
// From any std::io::Read source (file, stdin, network, etc.)
let file = std::fs::File::open(path).unwrap();
let outlook = Outlook::from_reader(file).unwrap();
Display formatting
Outlook, Person, and Attachment all implement Display for
human-readable output:
let outlook = Outlook::from_path(path).unwrap();
// Prints a summary: From, Subject, To, CC, BCC, Date, Attachments
println!("{}", outlook);
Saving attachments
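let outlook = Outlook::from_path(path).unwrap();
for attach in &outlook.attachments {
    std::fs::write(&attach.long_file_name, &attach.payload_bytes).unwrap(); // raw bytes, full filename
}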
Detecting and parsing embedded messages
Attachments with attach_method == 5 are nested .msg files (embedded
messages). Use as_message() to parse them recursively:
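let outlook = Outlook::from_path(path).unwrap();
for attach in outlook.attachments.iter().filter(|a| a.attach_method == 5) {
    if let Some(Ok(nested)) = attach.as_message() { println!("{}", nested); }
}
// Or use the convenience method:
for attach in outlook.attachments.iter().filter(|a| a.is_embedded_message()) {
    if let Some(Ok(nested)) = attach.as_message() { println!("{}", nested); }
}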
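The numeric `attach_method` values listed in this README (1, 5, 6) can be easier to work with as an enum. The following is a minimal sketch, assuming you want to branch on the attachment kind in your own code; `AttachKind` and `classify_attach_method` are hypothetical names, not part of the crate.

```rust
/// Attachment kinds named after the values documented in this README.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AttachKind {
    File,            // attach_method == 1
    EmbeddedMessage, // attach_method == 5
    OleObject,       // attach_method == 6
    Other(u32),      // any value this README does not describe
}

/// Map a raw `attach_method` value to an `AttachKind`.
fn classify_attach_method(attach_method: u32) -> AttachKind {
    match attach_method {
        1 => AttachKind::File,
        5 => AttachKind::EmbeddedMessage,
        6 => AttachKind::OleObject,
        other => AttachKind::Other(other),
    }
}
```

Keeping an `Other` variant avoids panicking on values that are valid in the file format but not covered here.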
Inline images (Content-ID)
HTML bodies reference inline images via cid: URIs. Use content_id to
resolve them:
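let outlook = Outlook::from_path(path).unwrap();
let mut html = outlook.html.clone();
// Point each cid: reference at the attachment's filename (assumes attachments are saved next to the HTML)
for attach in outlook.attachments.iter().filter(|a| !a.content_id.is_empty()) {
    html = html.replace(&format!("cid:{}", attach.content_id), &attach.long_file_name);
}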
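The cid: rewriting described above can be factored into a standalone function. This is a sketch under stated assumptions: `resolve_cid_references` is a hypothetical helper, and the replacement targets (file paths here) are whatever your application chooses. Longer Content-IDs are substituted first so that an ID such as `img1` does not clobber part of `img10`.

```rust
/// Replace every `cid:<content_id>` reference in `html` with a caller-chosen target.
/// `targets` pairs a Content-ID with its replacement, e.g. a saved file path.
/// Hypothetical helper for illustration; not provided by the parser.
fn resolve_cid_references(html: &str, targets: &[(&str, &str)]) -> String {
    // Skip attachments without a Content-ID.
    let mut ordered: Vec<&(&str, &str)> = targets
        .iter()
        .filter(|pair| !pair.0.is_empty())
        .collect();
    // Longest IDs first, so a short ID never matches inside a longer one.
    ordered.sort_by_key(|pair| std::cmp::Reverse(pair.0.len()));

    let mut resolved = html.to_string();
    for pair in ordered {
        resolved = resolved.replace(&format!("cid:{}", pair.0), pair.1);
    }
    resolved
}
```

With the parser's output, each pair would typically be built from an attachment's `content_id` and the location where you saved its bytes.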
RTF decompression and HTML extraction
Many .msg files store the body as compressed RTF rather than HTML.
Use rtf_decompressed() to get the raw RTF, or html_from_rtf() to
extract embedded HTML:
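let outlook = Outlook::from_path(path).unwrap();
// Get the best available HTML body
let html = if !outlook.html.is_empty() { outlook.html.clone() } else { outlook.html_from_rtf().unwrap_or_default() };
// Or work with the raw decompressed RTF directly
if let Some(rtf) = outlook.rtf_decompressed() { println!("{} bytes of RTF", rtf.len()); }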
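For background on what "compressed RTF" means: per [MS-OXRTFCP], the stream starts with a 16-byte little-endian header (compressed size, raw size, compression type, CRC). The sketch below parses only that header and does not decompress anything. `RtfHeader`, `CompType`, and `parse_rtf_header` are hypothetical names; the crate's own `rtf_decompressed()` handles the full format, and this is not a description of its internals.

```rust
/// Compression type stored in the header of a compressed RTF stream.
#[derive(Debug, PartialEq, Eq)]
enum CompType {
    Compressed,   // magic 0x75465A4C ("LZFu")
    Uncompressed, // magic 0x414C454D ("MELA")
}

/// The 16-byte header at the start of a compressed RTF stream.
#[derive(Debug, PartialEq, Eq)]
struct RtfHeader {
    comp_size: u32, // bytes that follow the comp_size field itself
    raw_size: u32,  // size of the RTF once decompressed
    comp_type: CompType,
    crc: u32,
}

/// Parse the header, returning None if the input is too short
/// or the compression-type magic is unrecognized.
fn parse_rtf_header(data: &[u8]) -> Option<RtfHeader> {
    if data.len() < 16 {
        return None;
    }
    // Read a little-endian u32 starting at byte offset `i`.
    let word = |i: usize| u32::from_le_bytes([data[i], data[i + 1], data[i + 2], data[i + 3]]);
    let comp_type = match word(8) {
        0x7546_5A4C => CompType::Compressed,
        0x414C_454D => CompType::Uncompressed,
        _ => return None,
    };
    Some(RtfHeader { comp_size: word(0), raw_size: word(4), comp_type, crc: word(12) })
}
```

Note that the `rtf_compressed` field in this crate is documented as hex-encoded, so it would need to be decoded to bytes before a header check like this applies.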
Named properties (MAPI 0x8000+ range)
The parser automatically resolves MAPI named properties, both well-known
dispID-based properties (e.g. ReminderSet, InternetAccountName,
AppointmentStartWhole) and custom string-named properties stored in the
__nameid_version1.0 streams. These are merged into the same property maps
used for standard MAPI properties, so they appear transparently in the parsed
output and JSON serialization.
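To illustrate the "0x8000+ range": a 32-bit MAPI property tag carries the property ID in its upper 16 bits and the property type in its lower 16 bits, and IDs at or above 0x8000 are the ones resolved through the named-property mapping. The helpers below are a hypothetical sketch of that check, not the crate's internal code.

```rust
/// Split a 32-bit MAPI property tag into (property ID, property type).
/// Hypothetical helper for illustration.
fn split_property_tag(tag: u32) -> (u16, u16) {
    ((tag >> 16) as u16, (tag & 0xFFFF) as u16)
}

/// True if the tag's property ID falls in the named-property range (0x8000 and above).
fn is_named_property(tag: u32) -> bool {
    split_property_tag(tag).0 >= 0x8000
}
```

For example, the subject tag `0x0037001F` splits into ID `0x0037` and type `0x001F`, which is a standard property rather than a named one.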
Message metadata
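let outlook = Outlook::from_path(path).unwrap();
// Timestamps (ISO 8601 UTC, empty string if unavailable)
println!("{}", outlook.client_submit_time);
println!("{}", outlook.message_delivery_time);
println!("{}", outlook.creation_time);
println!("{}", outlook.last_modification_time);
// Classification
println!("{}", outlook.message_class); // e.g. "IPM.Note"
println!("{}", outlook.importance); // 0=Low, 1=Normal, 2=High
println!("{}", outlook.sensitivity); // 0=Normal, 1=Personal, 2=Private, 3=Confidential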
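Since `importance` and `sensitivity` are plain integers, a small lookup can turn them into the labels listed in the comments above. This is a minimal sketch; `importance_label` and `sensitivity_label` are hypothetical helpers, and the "Unknown" fallback is an assumption for values outside the documented ranges.

```rust
/// Human-readable label for the `importance` field (0=Low, 1=Normal, 2=High).
fn importance_label(importance: u32) -> &'static str {
    match importance {
        0 => "Low",
        1 => "Normal",
        2 => "High",
        _ => "Unknown", // assumption: anything else is treated as unknown
    }
}

/// Human-readable label for the `sensitivity` field
/// (0=Normal, 1=Personal, 2=Private, 3=Confidential).
fn sensitivity_label(sensitivity: u32) -> &'static str {
    match sensitivity {
        0 => "Normal",
        1 => "Personal",
        2 => "Private",
        3 => "Confidential",
        _ => "Unknown", // assumption: anything else is treated as unknown
    }
}
```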
JSON output
let outlook = Outlook::from_path(path).unwrap();
let json = outlook.to_json().unwrap();
println!("{}", json);
Available fields
| Field | Type | Description |
|---|---|---|
| `headers` | `TransportHeaders` | SMTP transport headers (raw + parsed fields) |
| `sender` | `Person` | Sender name and email |
| `to` | `Vec<Person>` | Primary recipients |
| `cc` | `Vec<Person>` | Carbon-copy recipients |
| `bcc` | `Vec<Person>` | Blind carbon-copy recipients |
| `subject` | `String` | Subject line |
| `body` | `String` | Plain-text body |
| `html` | `String` | HTML body |
| `rtf_compressed` | `String` | RTF body (hex-encoded) |
| `message_class` | `String` | Message class (e.g. "IPM.Note") |
| `importance` | `u32` | 0=Low, 1=Normal, 2=High |
| `sensitivity` | `u32` | 0=Normal, 1=Personal, 2=Private, 3=Confidential |
| `client_submit_time` | `String` | ISO 8601 UTC timestamp |
| `message_delivery_time` | `String` | ISO 8601 UTC timestamp |
| `creation_time` | `String` | ISO 8601 UTC timestamp |
| `last_modification_time` | `String` | ISO 8601 UTC timestamp |
| `attachments` | `Vec<Attachment>` | File attachments with metadata and raw bytes |
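The timestamp fields are exposed as ISO 8601 strings, while MAPI stores times as FILETIME values: 64-bit counts of 100-nanosecond ticks since 1601-01-01 UTC. The sketch below shows one way such a value can be rendered using only the standard library. `filetime_to_iso8601` is a hypothetical helper; the exact string format the crate emits (for instance, whether it includes fractional seconds) is not stated here, and mapping 0 to an empty string is an assumption modeled on the "empty string if unavailable" behavior.

```rust
/// Number of 100 ns ticks in one second.
const TICKS_PER_SECOND: u64 = 10_000_000;

/// Render a FILETIME (100 ns ticks since 1601-01-01 UTC) as `YYYY-MM-DDTHH:MM:SSZ`.
/// Sub-second precision is dropped. Returns an empty string for 0 (assumed "unset").
fn filetime_to_iso8601(filetime: u64) -> String {
    if filetime == 0 {
        return String::new();
    }
    let total_secs = filetime / TICKS_PER_SECOND;
    let (days, secs_of_day) = (total_secs / 86_400, total_secs % 86_400);

    // Days-to-civil-date conversion on the proleptic Gregorian calendar,
    // counting from 0000-03-01; 1601-01-01 is 584_694 days after that date.
    let z = days + 584_694;
    let era = z / 146_097; // 400-year cycles
    let doe = z % 146_097; // day within the cycle
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + u64::from(month <= 2);

    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}Z",
        year,
        month,
        day,
        secs_of_day / 3_600,
        (secs_of_day % 3_600) / 60,
        secs_of_day % 60
    )
}
```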
Attachment fields
| Field | Type | Description |
|---|---|---|
| `display_name` | `String` | Display name shown in the mail client |
| `payload` | `String` | Hex-encoded attachment content |
| `payload_bytes` | `Vec<u8>` | Raw attachment bytes |
| `extension` | `String` | File extension (e.g. ".pdf") |
| `mime_tag` | `String` | MIME type (e.g. "image/png") |
| `file_name` | `String` | Short 8.3 filename |
| `long_file_name` | `String` | Full original filename |
| `attach_method` | `u32` | 1=file, 5=embedded .msg, 6=OLE object |
| `content_id` | `String` | Content-ID for inline images |
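`payload` carries the same content as `payload_bytes`, only hex-encoded. If you receive just the hex form (for example from the JSON output), it can be turned back into bytes with a small decoder. The sketch below assumes two hex digits per byte with no separators and accepts either letter case; `decode_hex` is a hypothetical helper, not part of the crate.

```rust
/// Decode a hex string (two digits per byte, no separators) into raw bytes.
/// Returns None on odd length or any non-hex character.
/// Hypothetical helper for illustration.
fn decode_hex(hex: &str) -> Option<Vec<u8>> {
    if hex.len() % 2 != 0 {
        return None;
    }
    hex.as_bytes()
        .chunks(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}
```

When working with the parsed struct directly, `payload_bytes` already holds the raw bytes and no decoding is needed.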
Methods
| Method | Returns | Description |
|---|---|---|
| `Outlook::from_path(path)` | `Result<Outlook, Error>` | Parse from filesystem path |
| `Outlook::from_slice(bytes)` | `Result<Outlook, Error>` | Parse from byte slice or `Vec<u8>` |
| `Outlook::from_reader(reader)` | `Result<Outlook, Error>` | Parse from any `Read` source |
| `Outlook::to_json()` | `Result<String, Error>` | Serialize to JSON |
| `Outlook::rtf_decompressed()` | `Option<Vec<u8>>` | Decompress RTF body |
| `Outlook::html_from_rtf()` | `Option<String>` | Extract HTML from compressed RTF |
| `Attachment::as_message()` | `Option<Result<Outlook, Error>>` | Parse embedded .msg attachment |
| `Attachment::is_embedded_message()` | `bool` | Check if attachment is embedded .msg |
Performance
from_path and from_slice use an optimized zero-copy header parsing path
(Reader::from_bytes) that avoids the double-allocation overhead of streaming
through BufReader. For large .msg files this reduces peak memory usage and
parse time compared to the generic from_reader path.
Requirements
- Rust edition 2024 (rustc 1.85+)
Running the example
# or with a specific file:
Running tests
cargo test
Contributions
Feel free to open pull requests to contribute, enhance, or fix bugs.
License: MIT