ewf-forensic 0.5.0

Verify the image. Trust the evidence.

ewf-forensic is a pure-Rust integrity analyser for EWF images — no libewf, no C toolchain, no build complexity. It supports EWF v1 (E01 multi-segment with sibling auto-discovery), EWF v2 (Ex01/Lx01), SHA-1 and SHA-256 from digest sections, chain-of-custody external hash comparison (MD5, SHA-1, SHA-256), and optional header metadata extraction.

The analyser reports exactly what is structurally wrong across eight layers: signature forgery, broken section chains, cyclic chain attacks, Adler-32 descriptor corruption, volume geometry inconsistencies, table mismatches, out-of-bounds chunk pointers, MD5/SHA-1/SHA-256 hash mismatches, per-chunk checksum errors, and EWF v2 per-section data integrity. 40 distinct anomaly types across four severity levels.

Install

[dependencies]
ewf-forensic = "0.4"

Optional Serde support for serialising anomalies and progress events to JSON:

ewf-forensic = { version = "0.4", features = ["serde"] }

What It Checks

Layer 1 — File Header

Anomaly	Severity
`InvalidSignature` — EVF magic bytes corrupted or absent	Critical
`SegmentNumberZero` — segment number field is 0 (invalid)	Error

Layer 2 — Section Descriptor Integrity

Anomaly	Severity
`SectionDescriptorCrcMismatch { offset, section_type, computed, stored }` — Adler-32 over descriptor bytes [0..72] does not match stored checksum	Error

Layer 3 — Section Chain

Anomaly	Severity
`SectionChainBroken { at_offset, next_offset }` — `next` pointer is zero, past EOF, or points backward (cycle)	Critical
`SectionGapNonZero { gap_offset, gap_size }` — non-zero bytes exist between consecutive sections	Warning
`SectionGapZero { gap_offset, gap_size }` — zero-filled bytes between sections (alignment padding; noted as structural anomaly)	Info

Layer 4 — Section Completeness

Anomaly	Severity
`VolumeSectionMissing` — neither `volume` nor `disk` section found	Critical
`SectorsSectionMissing` — no `sectors` section found; sector data absent	Error
`TableSectionMissing` — no `table` section found; chunk index unusable	Error
`UnknownSectionType { offset, type_name }` — section type string not in the EWF v1 spec	Warning
`DoneSectionMissing` — chain ends without a `done` section	Warning

Layer 5 — Volume Geometry

Anomaly	Severity
`BytesPerSectorInvalid { bytes_per_sector }` — not 512 or 4 096	Error
`VolumeBodyCrcMismatch { computed, stored }` — Adler-32 over volume section body does not match (EWF v2)	Error
`ChunkSizeInvalid { sectors_per_chunk, bytes_per_sector }` — zero or not a power of two	Error
`SectorCountMismatch { declared, expected }` — `sector_count` is outside the valid range; last-chunk padding is normal and not flagged	Error
`MediaTypeUnknown { media_type }` — `media_type` byte is not a recognised EWF v2 value	Warning
`SetIdentifierMismatch { segment, expected }` — set GUID in segment header does not match the first segment (EWF v2 multi-segment)	Error

Layer 6 — Table Integrity

Anomaly	Severity
`TableChunkCountMismatch { in_volume, in_table }` — entry count in table header differs from volume	Error
`TableHeaderAdler32Mismatch { computed, stored }` — Adler-32 over table header bytes does not match stored checksum	Error
`Table2Mismatch { chunk_index, offset_in_table, offset_in_table2 }` — `table` and `table2` entries disagree for the same chunk index	Error
`TableEntryOutOfBounds { chunk_index, entry_offset, file_size }` — chunk offset resolves past EOF	Error
`TableEntryOutsideSectorsRange { chunk_index, entry_offset, sectors_start, sectors_end }` — entry resolves inside the file but outside the sectors data body	Error
`Ewf2ChunkTableChecksumMismatch { segment, computed, stored }` — EWF v2 chunk table Adler-32 does not match	Error
`ChunkChecksumMismatch { chunk_index, computed, stored }` — per-chunk Adler-32 (appended after uncompressed chunk data) does not match	Error
`ChunkDecompressionError { chunk_index }` — compressed chunk cannot be decompressed (corrupt DEFLATE stream)	Error
`UnsupportedCompressionAlgorithm { chunk_index, algorithm }` — compression method is not deflate	Error

Layer 7 — Hash Verification

Anomaly	Severity
`HashMismatch { computed, stored }` — MD5 of all sector data does not match stored hash	Error
`HashSectionMissing` — no `hash` section found	Warning
`DigestSha1Mismatch { computed, stored }` — SHA-1 of all sector data does not match `digest` section	Error
`DigestSha256Mismatch { computed, stored }` — SHA-256 of all sector data does not match `digest` section	Error
`BadSectorsPresent { count }` — `error2` section reports unreadable sectors at acquisition time	Warning

Layer 8 — Multi-segment and External Reference

Anomaly	Severity
`SegmentOutOfOrder { segment_number, expected }` — supplied segments are not in sequential order	Error
`ExternalMd5Mismatch { computed, expected }` — computed MD5 does not match externally-supplied chain-of-custody hash	Critical
`ExternalSha1Mismatch { computed, expected }` — computed SHA-1 does not match externally-supplied reference	Critical
`ExternalSha256Mismatch { computed, expected }` — computed SHA-256 does not match externally-supplied reference	Critical

EWF v2 — Per-section and Media Integrity

Anomaly	Severity
`Ewf2SectionDataHashMismatch { offset, section_type_id, computed, stored }` — MD5 of section body does not match `data_integrity_hash` in the descriptor	Error
`Ewf2EncryptedSection { offset }` — encrypted section found; content cannot be verified	Warning
`Ewf2HashSectionMissing` — no hash section (type 0x08 or 0x09) in the final segment	Warning
`Ewf2MediaInfoMissing` — no media_info section found in the image	Warning
`Ewf2MediaInfoParseFailed` — media_info section body is not a valid zlib stream	Error

Usage

Analyse an E01 file (recommended path-based API)

use ewf_forensic::{EwfIntegrityPath, Severity};

fn main() -> std::io::Result<()> {
    // Pass only the .E01 — siblings E02, E03 … are auto-discovered.
    let findings = EwfIntegrityPath::from_path("evidence.E01").analyse()?;

    if findings.is_empty() {
        println!("clean — no anomalies detected");
        return Ok(());
    }

    for anomaly in &findings {
        let tag = match anomaly.severity() {
            Severity::Critical => "[CRITICAL]",
            Severity::Error    => "[ERROR]   ",
            Severity::Warning  => "[WARNING] ",
            Severity::Info     => "[INFO]    ",
        };
        println!("{tag} {anomaly}");
    }
    Ok(())
}

In-memory API (e.g., when you already have the bytes)

use ewf_forensic::{EwfIntegrity, Severity};

let data = std::fs::read("evidence.E01").unwrap();
let findings = EwfIntegrity::new(&data).analyse();

Multi-segment image — explicit paths

use ewf_forensic::EwfIntegrityPath;

let findings = EwfIntegrityPath::from_paths(&[
    "evidence.E01",
    "evidence.E02",
    "evidence.E03",
]).analyse()?;

Verify against a chain-of-custody hash

use ewf_forensic::EwfIntegrityPath;

let coc_md5:    [u8; 16] = /* bytes from acquisition report */;
let coc_sha256: [u8; 32] = /* bytes from acquisition report */;

// ExternalMd5Mismatch / ExternalSha256Mismatch (Critical) fire if image was altered.
let findings = EwfIntegrityPath::from_path("evidence.E01")
    .with_expected_md5(coc_md5)
    .with_expected_sha256(coc_sha256)
    .analyse()?;

Compute hashes independently

use ewf_forensic::EwfIntegrityPath;

// Returns None if the image is not a valid EWF.
if let Some(hashes) = EwfIntegrityPath::from_path("evidence.E01").compute_hashes()? {
    println!("MD5:    {:x?}", hashes.md5);
    println!("SHA-1:  {:x?}", hashes.sha1);
    println!("SHA-256:{:x?}", hashes.sha256);
}

Single-pass: analyse and compute hashes together

use ewf_forensic::EwfIntegrityPath;

let (anomalies, hashes) = EwfIntegrityPath::from_path("evidence.E01")
    .analyse_and_compute_hashes()?;
// Both results from a single read pass — useful for large images.

Progress callback (long images, pipelines)

use ewf_forensic::{EwfIntegrityPath, AnalysisProgress};

let (anomalies, ()) = EwfIntegrityPath::from_path("evidence.E01")
    .analyse_with_progress(|p: AnalysisProgress| {
        if let Some(total) = p.chunks_total {
            eprint!("\r{}/{} chunks", p.chunks_done, total);
        }
    })?;

Read acquisition metadata from the header

use ewf_forensic::{EwfIntegrity, EwfHeaderMetadata};

let data = std::fs::read("evidence.E01").unwrap();
if let Some(meta) = EwfIntegrity::new(&data).header_metadata() {
    println!("Examiner:  {}", meta.examiner_name);
    println!("Acquired:  {}", meta.acquisition_date);
    println!("Case:      {}", meta.case_number);
}

Triage by severity

use ewf_forensic::{EwfIntegrityPath, Severity};

let findings = EwfIntegrityPath::from_path("evidence.E01").analyse()?;

let critical: Vec<_> = findings.iter()
    .filter(|a| a.severity() == Severity::Critical)
    .collect();

if !critical.is_empty() {
    eprintln!("{} critical finding(s) — image may be unreadable or tampered", critical.len());
}

Serde — serialise findings to JSON

ewf-forensic = { version = "0.4", features = ["serde"] }

use ewf_forensic::EwfIntegrityPath;

let findings = EwfIntegrityPath::from_path("evidence.E01").analyse()?;
let json = serde_json::to_string_pretty(&findings).unwrap();
println!("{json}");

CLI — `ewf-check`

ewf-check [OPTIONS] <segment>...

ARGUMENTS
    <segment>...    One or more segment paths. When a single .E01 is given,
                    consecutive siblings are discovered automatically.

OPTIONS
    --min-severity=<level>    Only report anomalies at or above this level.
                              Levels: info, warning, error, critical [default: info]
    --json                    Emit machine-readable JSON.
    --hash-md5=<hex>          Compare computed MD5 against this hex string.
    --hash-sha1=<hex>         Compare computed SHA-1 against this hex string.
    --hash-sha256=<hex>       Compare computed SHA-256 against this hex string.
    --print-hashes            Compute and print MD5, SHA-1, and SHA-256.
    --progress                Show a progress bar on stderr during analysis.
    --help / --version

EXIT CODES
    0   Clean — no anomalies at or above --min-severity
    1   Anomalies found
    2   Usage error or I/O failure

Example: verify an 8-segment acquisition against an external hash manifest:

ewf-check --hash-md5=2692f3177a389e58906b5c9080aa1add evidence.E01
# auto-discovers evidence.E02 … evidence.E08

Design

File-based API uses memory-mapped I/O — EwfIntegrityPath mmaps each segment rather than reading it into a Vec<u8>. Large images (100 GB+) do not require 100 GB of RAM.
No unsafe code in ewf-forensic — the crate itself contains no unsafe blocks. memmap2 wraps the OS mmap syscall but its unsafety is isolated to that dependency.
No panics on adversarial input — every parser path is bounded; cycle attacks and integer overflows are explicitly handled. Verified by libfuzzer (4.5 M iterations, zero crashes) and proptest (property-based, runs in cargo test).
Validated against 11 committed real-world fixtures — seven acquisition-tool images (EWF v1 and EWF v2, all confirmed clean by ewfverify), plus CTF and sleuthkit test-corpus images including structurally invalid zero-byte inputs. See docs/validation.md for image sources and reproduction steps.
MSRV 1.85 — no nightly, no unstable features.

Fuzzing

cargo +nightly fuzz run fuzz_integrity
cargo +nightly fuzz run fuzz_repair

Both targets run in CI for 30 seconds on every push. To run longer locally, remove -max_total_time.

Anomaly Catalog

docs/anomaly-catalog.md maps every detectable anomaly to its threat scenario — evidence suppression, modification, insertion, redirection, and parser exploitation — and documents known detection limits.

Limitations

MD5 is cryptographically broken

EWF v1 stores an MD5 digest in the hash section. MD5 chosen-prefix collisions are feasible on consumer hardware. A sufficiently resourced adversary can modify sector data, construct a new sectors body with the same MD5, and store the original hash — HashMismatch is not reported. ewf-forensic cannot detect a valid MD5 collision.

Mitigation: Supply --hash-sha256 (or .with_expected_sha256()) with a SHA-256 computed at acquisition time and stored separately from the image.

Sector content not inspected beyond hash verification

The analyser decompresses and hashes every chunk but does not parse filesystem structures within those sectors. It cannot identify which specific LBA ranges were modified, detect filesystem-level tampering (MFT manipulation, journal editing), or report what changed. That requires a full EWF reader such as ewf.

Recomputed Adler-32 does not prove a descriptor is unmodified

If an attacker modifies a section descriptor field and recomputes the Adler-32 over the modified bytes, the descriptor verifies correctly. Only an external hash over the full image taken at acquisition time can detect this. SectionDescriptorCrcMismatch catches the lazy attacker; it does not catch the careful one.

EWF v2 encrypted sections are not verified

When Ewf2EncryptedSection is reported, the section body is skipped. Content inside an encrypted section cannot be integrity-checked. If every data section is encrypted, the analyser provides structural checks only.

Progress callbacks report `chunks_total = None` for EWF v1

EWF v1 does not declare the total chunk count in a header field — it is discovered by walking the section chain. AnalysisProgress.chunks_total is None during EWF v1 analysis and Some(n) during EWF v2 analysis where the chunk table declares its entry count up front.

FTK Imager and X-Ways real-fixture tests are deferred

tool_fixtures_tests includes tests against real ftk_imager_clean.E01 and xways_clean.E01 fixtures. These are marked #[ignore] because the acquisition tools are Windows-only and not available in CI. The format variations they exercise are covered by always-on synthetic builder tests.