officemd_docx
DOCX extraction helpers built on officemd_core. This crate streams WordprocessingML parts to build the shared IR.
Features
- Extracts body, headers, footers, footnotes, and endnotes into DocSection blocks.
- Resolves hyperlinks via .rels and a best-effort field-code fallback.
- Parses comments and inserts inline anchors plus per-section footnotes.
- Collects document properties from docProps/*.
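
To make the field-code fallback more concrete, here is a minimal sketch of how a target could be recovered from a HYPERLINK field instruction when no relationship entry is available. It is not this crate's implementation: `hyperlink_target` and `tokenize` are hypothetical names, and the sketch assumes only the common switches (`\l` for a bookmark, `\o` and `\t` with one argument each) need handling.

```rust
/// Splits a field instruction into (token, was_quoted) pairs.
/// Hypothetical helper for illustration; not part of officemd_docx.
fn tokenize(s: &str) -> Vec<(String, bool)> {
    let mut out = Vec::new();
    let mut chars = s.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c == '"' {
            chars.next(); // consume the opening quote
            let mut buf = String::new();
            while let Some(c) = chars.next() {
                match c {
                    '"' => break,
                    // Inside quotes a backslash escapes the next character.
                    '\\' => if let Some(n) = chars.next() { buf.push(n) },
                    _ => buf.push(c),
                }
            }
            out.push((buf, true));
        } else {
            // Bare token: read up to the next whitespace.
            let mut buf = String::new();
            while let Some(&c) = chars.peek() {
                if c.is_whitespace() { break; }
                buf.push(c);
                chars.next();
            }
            out.push((buf, false));
        }
    }
    out
}

/// Best-effort target extraction from an instruction such as
/// ` HYPERLINK "https://example.com" \o "tooltip" `.
/// Returns None when the instruction is not a HYPERLINK field.
pub fn hyperlink_target(instr: &str) -> Option<String> {
    let rest = instr.trim_start();
    let keyword = "HYPERLINK";
    let head = rest.get(..keyword.len())?;
    if !head.eq_ignore_ascii_case(keyword) {
        return None;
    }
    let args = &rest[keyword.len()..];
    if !args.is_empty() && !args.starts_with(char::is_whitespace) {
        return None; // e.g. a longer field name that merely starts with HYPERLINK
    }
    let mut url: Option<String> = None;
    let mut anchor: Option<String> = None;
    let mut tokens = tokenize(args).into_iter();
    while let Some((tok, quoted)) = tokens.next() {
        if quoted {
            // The first quoted argument is treated as the link target.
            if url.is_none() { url = Some(tok); }
        } else if tok == "\\l" {
            // \l names a bookmark inside the target document.
            anchor = tokens.next().map(|(t, _)| t);
        } else if tok == "\\o" || tok == "\\t" {
            // Tooltip and target-frame arguments are skipped.
            let _ = tokens.next();
        } else if !tok.starts_with('\\') && url.is_none() {
            url = Some(tok); // unquoted target
        }
    }
    match (url, anchor) {
        (Some(u), Some(a)) => Some(format!("{u}#{a}")),
        (Some(u), None) => Some(u),
        (None, Some(a)) => Some(format!("#{a}")),
        (None, None) => None,
    }
}
```

In this sketch a bookmark-only field yields a fragment such as `#section_2`, and any non-HYPERLINK instruction yields `None`, so a caller can fall back to emitting plain text.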
Rust Usage
use officemd_docx::extract_ir;
let doc = extract_ir(...)?;
Cargo Example
Python usage
Tests
# Rust tests
# Python bindings (from dedicated adapter crate)
Fixture note:
- tests/fixtures/basic.docx is the crate-local sample fixture path (fixture-based tests skip when missing).
- Real-world fixtures can be placed in tests/data/*.docx (crate local) or ../../tests/data/*.docx (repo root).
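
The skip-when-missing behaviour can be sketched as a small test helper. This is an illustration under assumptions rather than the crate's actual test code: `run_if_present` is a hypothetical name, and a "skip" is modelled simply as logging to stderr and returning `false`.

```rust
use std::path::Path;

/// Runs `check` on the fixture bytes if the file can be read.
/// Returns true when the check ran and false when the fixture was skipped.
/// Hypothetical helper for illustration; not part of officemd_docx.
pub fn run_if_present(fixture: &Path, check: impl FnOnce(&[u8])) -> bool {
    match std::fs::read(fixture) {
        Ok(bytes) => {
            check(&bytes);
            true
        }
        Err(err) => {
            // A missing fixture is reported, not treated as a failure.
            eprintln!("skipping: cannot read {}: {err}", fixture.display());
            false
        }
    }
}
```

A `#[test]` function would call it with a path such as tests/fixtures/basic.docx and place its assertions inside the closure, so a checkout without fixtures still passes.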