oletools_rs 0.1.0

Rust port of oletools — analysis tools for Microsoft Office files (VBA macros, DDE, OLE objects, RTF exploits)
# oletools_rs

[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
[![Rust](https://img.shields.io/badge/rust-1.87%2B-orange.svg)](https://www.rust-lang.org/)

Rust port of [python-oletools](https://github.com/decalage2/oletools) — a set
of tools for analyzing Microsoft Office files to detect VBA macros, DDE
exploits, embedded objects, and other potentially malicious content.

## Modules

| Module | Description |
|--------|-------------|
| **ole** | OLE2 Compound Document parsing (streams, metadata, CLSIDs, timestamps) |
| **ooxml** | Office Open XML parsing (ZIP + XML parts, content types, relationships) |
| **vba** | VBA macro extraction and analysis (MS-OVBA decompression, keyword scanning) |
| **ftguess** | File type detection via magic bytes, OLE CLSID, and OOXML content types |
| **mraptor** | MacroRaptor heuristic detection of malicious macros (AutoExec + Write/Execute) |
| **oleid** | Security indicator analysis (format, encryption, VBA, mraptor, external rels, ObjectPool, Flash) |
| **oleobj** | Embedded OLE object extraction (OLE 1.0, OleNativeStream, external relationships) |
| **rtfobj** | RTF parsing and OLE extraction with CVE detection (CVE-2017-0199, CVE-2017-11882) |
| **msodde** | DDE command detection across .doc, .docx, .xls, .xlsx, RTF, CSV |
| **oletimes** | Timestamp extraction from OLE entries (FILETIME conversion) |
| **crypto** | Encrypted document detection (feature-gated) |

## MSRV

Minimum Supported Rust Version: **1.87** (edition 2024).

## Installation

```toml
[dependencies]
oletools_rs = "0.1"
```

To enable encrypted document detection, turn on the `crypto` feature:

```toml
[dependencies]
oletools_rs = { version = "0.1", features = ["crypto"] }
```

## Quick start

### Detect file type

```rust,no_run
use oletools_rs::FileTypeGuesser;

fn main() -> oletools_rs::Result<()> {
    let result = FileTypeGuesser::from_path("document.docm")?;
    println!("Type: {}", result.file_type);
    println!("Contains VBA: {}", result.may_contain_vba);
    Ok(())
}
```
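
The magic-byte stage of file type detection can be illustrated without the crate. The sketch below is not the `ftguess` implementation: the `Sniffed` enum and the `sniff` function are hypothetical names, and only three well-known signatures are checked (OLE2 compound file, ZIP local file header, RTF header prefix).

```rust
/// Coarse container families recognizable from the first bytes of a file.
/// Hypothetical type for illustration; not the crate's `Container` enum.
#[derive(Debug, PartialEq, Eq)]
enum Sniffed {
    Ole2,
    Zip,
    Rtf,
    Unknown,
}

/// OLE2 Compound File Binary signature.
const OLE2_MAGIC: [u8; 8] = [0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1];
/// ZIP local file header signature ("PK\x03\x04"); OOXML packages are ZIP archives.
const ZIP_MAGIC: [u8; 4] = [0x50, 0x4B, 0x03, 0x04];
/// Only the `{\rt` prefix is checked, since Word accepts truncated RTF headers.
const RTF_MAGIC: &[u8] = b"{\\rt";

/// Classify a buffer by its leading bytes only.
fn sniff(data: &[u8]) -> Sniffed {
    if data.starts_with(&OLE2_MAGIC) {
        Sniffed::Ole2
    } else if data.starts_with(&ZIP_MAGIC) {
        Sniffed::Zip
    } else if data.starts_with(RTF_MAGIC) {
        Sniffed::Rtf
    } else {
        Sniffed::Unknown
    }
}

fn main() {
    let sample = b"{\\rtf1\\ansi Hello}";
    println!("{:?}", sniff(sample)); // prints "Rtf"
}
```

Magic bytes only identify the container. Telling a `.docm` from a `.xlsx`, for example, still requires looking at CLSIDs or content types, as the module table describes.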

### Extract VBA macros

```rust,no_run
use oletools_rs::VbaParser;

fn main() -> oletools_rs::Result<()> {
    let parser = VbaParser::from_path("spreadsheet.xlsm")?;
    if parser.detect_vba_macros()? {
        for m in parser.extract_macros()? {
            println!("Module: {} ({:?})", m.name, m.module_type);
            println!("{}", m.code);
        }
    }
    Ok(())
}
```
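
VBA source is stored compressed, so extraction depends on the decompression algorithm from MS-OVBA section 2.4.1. The following is a minimal, independent sketch of that algorithm, not the code in `vba/decompressor.rs`; the function names `offset_bits` and `decompress` are assumptions made for illustration, and error handling is reduced to returning `None`.

```rust
/// Number of bits used for the offset field of a copy token, given how many
/// bytes of the current chunk have already been decompressed.
fn offset_bits(difference: usize) -> u32 {
    // Smallest n with 2^n >= difference, but never fewer than 4 bits.
    let mut n = 4;
    while (1usize << n) < difference {
        n += 1;
    }
    n
}

/// Decompress an MS-OVBA compressed container; `None` signals malformed input.
fn decompress(container: &[u8]) -> Option<Vec<u8>> {
    // Every container starts with the signature byte 0x01.
    if container.first() != Some(&0x01) {
        return None;
    }
    let mut out = Vec::new();
    let mut pos = 1;
    while pos + 2 <= container.len() {
        let header = u16::from_le_bytes([container[pos], container[pos + 1]]);
        // Bits 12..=14 of the chunk header must hold the signature 0b011.
        if (header >> 12) & 0b111 != 0b011 {
            return None;
        }
        // The low 12 bits store the chunk size (header included) minus 3.
        let chunk_end = (pos + (header & 0x0FFF) as usize + 3).min(container.len());
        let compressed = header & 0x8000 != 0;
        pos += 2;
        if !compressed {
            // Raw chunk: the payload is copied through unchanged.
            out.extend_from_slice(&container[pos..chunk_end]);
            pos = chunk_end;
            continue;
        }
        let chunk_start = out.len();
        while pos < chunk_end {
            // One flag byte describes the next eight tokens, lowest bit first.
            let flags = container[pos];
            pos += 1;
            for bit in 0..8 {
                if pos >= chunk_end {
                    break;
                }
                if flags & (1 << bit) == 0 {
                    // Literal token: a single byte copied as-is.
                    out.push(container[pos]);
                    pos += 1;
                    continue;
                }
                // Copy token: 16 bits packing (offset - 1) and (length - 3).
                if pos + 2 > chunk_end {
                    return None;
                }
                let token = u16::from_le_bytes([container[pos], container[pos + 1]]);
                pos += 2;
                let difference = out.len() - chunk_start;
                if difference == 0 || difference > 4096 {
                    return None;
                }
                let bits = offset_bits(difference);
                let length_mask = 0xFFFFu16 >> bits;
                let length = (token & length_mask) as usize + 3;
                let offset = ((token & !length_mask) >> (16 - bits)) as usize + 1;
                if offset > difference {
                    return None;
                }
                // Byte-by-byte copy, because source and destination may overlap.
                let start = out.len() - offset;
                for i in 0..length {
                    let byte = out[start + i];
                    out.push(byte);
                }
            }
        }
    }
    Some(out)
}

fn main() {
    // Hand-built container: literal 'a', then a copy token (offset 1, length 6).
    let container: [u8; 7] = [0x01, 0x03, 0xB0, 0x02, 0x61, 0x03, 0x00];
    let text = decompress(&container).expect("well-formed container");
    println!("{}", String::from_utf8_lossy(&text)); // prints "aaaaaaa"
}
```

The split between offset and length bits changes as the chunk grows, which is why `offset_bits` is recomputed for every copy token.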

### Detect malicious macros (MacroRaptor)

```rust,no_run
use oletools_rs::MacroRaptor;

fn main() -> oletools_rs::Result<()> {
    let data = std::fs::read("document.docm")?;
    // The first tuple element is not used in this example.
    let (_result, flags) = MacroRaptor::scan_file(&data)?;
    if flags.is_suspicious() {
        println!("Suspicious macro detected!");
        println!("  AutoExec: {}", flags.autoexec);
        println!("  Write:    {}", flags.write);
        println!("  Execute:  {}", flags.execute);
    }
    Ok(())
}
```
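
The decision rule behind MacroRaptor is small enough to show on its own: a macro is flagged when it runs automatically and also writes to disk or executes something, i.e. `A and (W or X)`. The sketch below is a simplified stand-in, not the crate's analyzer. The `Flags` struct, the `scan` function, and the short keyword lists are hypothetical, and plain substring matching is used where a real scanner would use stricter patterns.

```rust
/// Hypothetical stand-in for the three MacroRaptor categories.
#[derive(Debug, Default)]
struct Flags {
    autoexec: bool,
    write: bool,
    execute: bool,
}

impl Flags {
    /// The MacroRaptor rule: auto-executable AND (writes OR executes).
    fn is_suspicious(&self) -> bool {
        self.autoexec && (self.write || self.execute)
    }
}

// Tiny illustrative keyword lists (lowercase); not the crate's database.
const AUTOEXEC: &[&str] = &["autoopen", "auto_open", "document_open", "workbook_open"];
const WRITE: &[&str] = &["savetofile", "createtextfile", "filecopy"];
const EXECUTE: &[&str] = &["shell", "createobject"];

/// True if any keyword occurs as a substring of `haystack`.
fn contains_any(haystack: &str, needles: &[&str]) -> bool {
    needles.iter().any(|n| haystack.contains(*n))
}

/// Case-insensitive keyword scan over VBA source text.
fn scan(code: &str) -> Flags {
    let lower = code.to_lowercase();
    Flags {
        autoexec: contains_any(&lower, AUTOEXEC),
        write: contains_any(&lower, WRITE),
        execute: contains_any(&lower, EXECUTE),
    }
}

fn main() {
    let code = "Sub AutoOpen()\n    Shell \"calc.exe\"\nEnd Sub";
    let flags = scan(code);
    println!("{:?} suspicious={}", flags, flags.is_suspicious());
}
```

Requiring the auto-execution trigger keeps the rule from flagging helper macros that merely call `Shell` when a user runs them by hand.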

### Run full security analysis (OleID)

```rust,no_run
use oletools_rs::OleID;

fn main() -> oletools_rs::Result<()> {
    // Propagate I/O errors with `?`, as the other examples do.
    let data = std::fs::read("document.doc")?;
    let oleid = OleID::new(&data);
    for ind in oleid.analyze() {
        println!("{}: {} = {} ({})", ind.id, ind.name, ind.value, ind.risk);
    }
    Ok(())
}
```

### Extract OLE objects from RTF

```rust,no_run
use oletools_rs::RtfObjParser;

fn main() -> oletools_rs::Result<()> {
    let data = std::fs::read("document.rtf")?;
    let objects = RtfObjParser::extract(&data)?;
    for obj in &objects {
        if let Some(ole) = &obj.ole_object {
            println!("Class: {}", ole.class_name);
        }
        for cve in &obj.cve_detections {
            println!("{}: {}", cve.cve_id, cve.description);
        }
    }
    Ok(())
}
```

### Detect DDE commands

```rust,no_run
use oletools_rs::msodde;

fn main() -> oletools_rs::Result<()> {
    let doc_data = std::fs::read("document.doc")?;
    for field in msodde::doc::process_doc(&doc_data)? {
        println!("DDE: {} (source: {})", field.command, field.source);
    }
    Ok(())
}
```
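
For CSV input, DDE shows up as a formula cell of the form `application|topic!item`, for example `=cmd|' /C calc'!A0`. The check below is a rough heuristic written for illustration and makes no claim about how `msodde/csv.rs` works; `looks_like_dde_cell` is a hypothetical helper that inspects one already-split cell.

```rust
/// Heuristic check for a DDE-style formula in a single CSV cell.
/// Hypothetical helper; real scanners use stricter patterns.
fn looks_like_dde_cell(cell: &str) -> bool {
    // Ignore surrounding whitespace and optional double quotes.
    let cell = cell.trim().trim_matches('"').trim_start();
    // Spreadsheet applications may evaluate cells starting with these characters.
    let Some(rest) = cell.strip_prefix(['=', '+', '-', '@']) else {
        return false;
    };
    // DDE syntax: application|topic!item
    match rest.split_once('|') {
        Some((app, tail)) => !app.trim().is_empty() && tail.contains('!'),
        None => false,
    }
}

fn main() {
    let cells = ["=cmd|' /C calc'!A0", "=SUM(A1:A2)", "hello|world!", "-5"];
    for cell in cells {
        println!("{:<22} dde={}", cell, looks_like_dde_cell(cell));
    }
}
```

A check like this only narrows down candidates. Ordinary formulas that happen to contain `|` and `!` would also match, so hits still need review.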

### Extract timestamps

```rust,no_run
use oletools_rs::oletimes;

fn main() -> oletools_rs::Result<()> {
    let data = std::fs::read("document.doc")?;
    for entry in oletimes::extract_timestamps_from_bytes(&data)? {
        println!(
            "{}: created={:?}, modified={:?}",
            entry.path, entry.created, entry.modified
        );
    }
    Ok(())
}
```
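
OLE timestamps are Windows FILETIME values: unsigned 64-bit counts of 100-nanosecond ticks since 1601-01-01 UTC. The conversion to Unix time is plain arithmetic, sketched here independently of the crate. The function name `filetime_to_unix_secs` is an assumption, and treating zero as "not set" reflects the fact that stream entries in a compound file carry zeroed timestamps.

```rust
/// FILETIME ticks (100 ns each) per second.
const TICKS_PER_SECOND: u64 = 10_000_000;
/// Seconds between 1601-01-01 (FILETIME epoch) and 1970-01-01 (Unix epoch).
const EPOCH_DIFF_SECS: i64 = 11_644_473_600;

/// Convert a raw FILETIME to whole seconds since the Unix epoch.
/// A value of zero is treated as "not set" and yields `None`.
fn filetime_to_unix_secs(filetime: u64) -> Option<i64> {
    if filetime == 0 {
        return None;
    }
    // Truncate sub-second ticks, then shift from the 1601 epoch to 1970.
    Some((filetime / TICKS_PER_SECOND) as i64 - EPOCH_DIFF_SECS)
}

fn main() {
    // 2020-01-01T00:00:00Z expressed as a FILETIME.
    let raw: u64 = 132_223_104_000_000_000;
    println!("{:?}", filetime_to_unix_secs(raw)); // prints "Some(1577836800)"
}
```

Values before 1970 come out negative, which is expected for documents carrying very old or forged timestamps.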

## Feature flags

| Flag | Default | Description |
|------|---------|-------------|
| `crypto` | off | Enable encrypted document detection via `office-crypto` |

## Project structure

```text
src/
  lib.rs              Public API and re-exports
  error.rs            Unified error types (thiserror)
  common/
    codepages.rs      Windows codepage to encoding_rs mapping
    patterns.rs       IOC regex patterns (URLs, IPs, executables)
  ole/
    container.rs      OLE2 container (cfb wrapper)
    clsid.rs          Known CLSID database
    metadata.rs       OLE metadata extraction
    directory.rs      Directory entry types
    sector_map.rs     Sector chain analysis
  ooxml/
    parser.rs         ZIP-based OOXML parser
    content_types.rs  [Content_Types].xml parser
    relationships.rs  .rels parser, external relationship detection
  vba/
    decompressor.rs   MS-OVBA 2.4.1 decompression
    project.rs        VBA project dir stream parser
    module.rs         VBA module source extraction
    parser.rs         High-level VBA parser (OLE/OOXML/FlatOPC)
    scanner.rs        Suspicious pattern scanner
    keywords.rs       AutoExec/Suspicious keyword database
  ftguess/
    detector.rs       File type detection engine
    types.rs          FileType, Container, Application enums
  mraptor/
    analyzer.rs       MacroRaptor A/W/X heuristic engine
  oleid/
    indicator.rs      Indicator and RiskLevel types
    checker.rs        7-check security analysis
  oleobj/
    native_stream.rs  OleNativeStream parser (MS-OLEDS 2.3.6)
    ole_object.rs     OLE 1.0 object parser
    extractor.rs      High-level embedded object extractor
  rtfobj/
    parser.rs         RTF state machine parser
    object.rs         OLE object extraction from RTF
    cve.rs            CVE-2017-0199 / CVE-2017-11882 detection
  msodde/
    field_parser.rs   DDE field types, QUOTE decoding, safe-field blocklist
    doc.rs            Word binary (.doc) DDE scanning
    docx.rs           Word OOXML (.docx) DDE scanning
    xls.rs            Excel binary (.xls) SupBook DDE scanning
    xlsx.rs           Excel OOXML (.xlsx) ddeLink scanning
    rtf.rs            RTF fldinst DDE scanning
    csv.rs            CSV formula injection detection
  oletimes/
    mod.rs            FILETIME conversion and timestamp extraction
  crypto/
    mod.rs            Encryption detection (feature-gated)
```

## Testing

```bash
cargo test         # 187 unit tests
cargo clippy       # zero warnings
```

## Disclaimer

This library is intended for **defensive security analysis**, malware triage,
and forensic investigation. It parses and inspects Office documents but does
not execute any embedded code. The detection heuristics (MacroRaptor, DDE,
CVE checks) are indicators, not guarantees — always combine with other tools
for production security decisions.

## License

[MIT](LICENSE)