ooxmlsdk 0.4.0

Open XML SDK for Rust

ooxmlsdk is a Rust library for reading, writing, and round-tripping Office Open XML documents such as .docx, .xlsx, and .pptx. The overall API shape is inspired by the .NET Open XML SDK, but the implementation is code-generated for Rust.

0.4.0 Highlights

  • Simplified public feature surface around the features that are actually used today: parts, microsoft365, and validators.
  • Stronger package-level coverage for Wordprocessing, Spreadsheet, and Presentation documents through the generated parts API.
  • Validator support remains optional behind the validators feature and has dedicated integration coverage.
  • Read-path performance work in ooxmlsdk-derive now pushes more dispatch decisions into code generation, reducing runtime branching and unnecessary allocations in generated parsers.
  • Release validation for 0.4.0 is based on checked-in generated runtime code plus full workspace test and clippy coverage.

Features

The runtime crate currently exposes these public features:

  • default: enables microsoft365 and parts
  • parts: enables package-level OOXML read/write support such as WordprocessingDocument, SpreadsheetDocument, and PresentationDocument
  • microsoft365: enables newer schema and part coverage that is not available in the Office 2007-oriented surface
  • validators: enables optional validator APIs

The always-available modules in the crate root are:

  • common
  • namespaces
  • schemas
  • sdk
  • simple_type

Feature-gated modules are:

  • parts behind parts
  • validator behind validators
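As a rough illustration of how this kind of gating is usually expressed in Rust, the sketch below uses `#[cfg(feature = "...")]` on toy modules. The module bodies and the `compiled_modules` helper are hypothetical stand-ins, not the crate's actual source; only the feature and module names come from the lists above.

```rust
/// Always compiled, like the crate's always-available modules.
pub mod common {
    pub fn name() -> &'static str {
        "common"
    }
}

/// Compiled only when the `parts` feature is enabled.
#[cfg(feature = "parts")]
pub mod parts {
    pub fn name() -> &'static str {
        "parts"
    }
}

/// Compiled only when the `validators` feature is enabled.
#[cfg(feature = "validators")]
pub mod validator {
    pub fn name() -> &'static str {
        "validator"
    }
}

/// Hypothetical helper: lists the toy modules present in this build.
pub fn compiled_modules() -> Vec<&'static str> {
    #[allow(unused_mut)]
    let mut names = vec![common::name()];
    #[cfg(feature = "parts")]
    names.push(parts::name());
    #[cfg(feature = "validators")]
    names.push(validator::name());
    names
}
```

Under this pattern, code that references a gated module fails to compile unless the matching feature is enabled, which is why downstream code that uses `parts` needs the `parts` feature switched on.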

Quick Start

Add the crate:

[dependencies]
ooxmlsdk = "0.4.0"

Read and save a package:

use ooxmlsdk::parts::wordprocessing_document::WordprocessingDocument;

fn round_trip(path: &std::path::Path) -> Result<(), Box<dyn std::error::Error>> {
    // Open the package from disk.
    let document = WordprocessingDocument::new_from_file(path)?;
    // Save into an in-memory buffer instead of overwriting the input file.
    let mut out = std::io::Cursor::new(Vec::new());
    document.save(&mut out)?;
    Ok(())
}
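The example above discards the in-memory buffer. A minimal standard-library sketch of getting the written bytes back out of a `Cursor<Vec<u8>>` follows. `save_stub` is a hypothetical stand-in for a package save call, and its `Write + Seek` bound is an assumption rather than the crate's documented signature.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom, Write};

/// Hypothetical stand-in for a package `save` call. It writes only the
/// ZIP local-file-header signature, not a real OOXML package.
fn save_stub<W: Write + Seek>(writer: &mut W) -> io::Result<()> {
    writer.write_all(b"PK\x03\x04")?;
    Ok(())
}

/// Saves into memory and returns the bytes that were written.
fn save_to_bytes() -> io::Result<Vec<u8>> {
    let mut out = Cursor::new(Vec::new());
    save_stub(&mut out)?;
    // `into_inner` consumes the cursor and hands back the underlying Vec.
    Ok(out.into_inner())
}

/// Rewinds a cursor and reads everything back, e.g. to feed a reader API.
fn read_back(mut cursor: Cursor<Vec<u8>>) -> io::Result<Vec<u8>> {
    cursor.seek(SeekFrom::Start(0))?;
    let mut bytes = Vec::new();
    cursor.read_to_end(&mut bytes)?;
    Ok(bytes)
}
```

Keeping the bytes in memory makes it easy to compare the saved output with the original file in a test, or to write it to a different path afterwards.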

Parse XML into generated schema types:

use ooxmlsdk::schemas::opc_core_properties::CoreProperties;

fn parse_core_properties(xml: &str) -> Result<CoreProperties, Box<dyn std::error::Error>> {
    // `str::parse` picks the `FromStr` implementation from the return type.
    Ok(xml.parse()?)
}
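`xml.parse()` relies on the standard `FromStr` trait, with the target type inferred from the function's return type. The sketch below shows the same shape with a toy type. `Title`, `ParseTitleError`, and the naive tag stripping are hypothetical and say nothing about how the generated parsers work internally.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical toy type; not one of the generated schema types.
#[derive(Debug, PartialEq)]
struct Title(String);

#[derive(Debug, PartialEq)]
struct ParseTitleError;

impl fmt::Display for ParseTitleError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("expected <title>...</title>")
    }
}

impl std::error::Error for ParseTitleError {}

impl FromStr for Title {
    type Err = ParseTitleError;

    fn from_str(xml: &str) -> Result<Self, Self::Err> {
        // Naive tag stripping for illustration only; a real XML parser
        // handles namespaces, attributes, and escaping.
        let inner = xml
            .trim()
            .strip_prefix("<title>")
            .and_then(|rest| rest.strip_suffix("</title>"))
            .ok_or(ParseTitleError)?;
        Ok(Title(inner.to_string()))
    }
}

/// Same shape as `parse_core_properties`: the return type selects the
/// `FromStr` implementation, and `?` boxes the error.
fn parse_title(xml: &str) -> Result<Title, Box<dyn std::error::Error>> {
    Ok(xml.parse()?)
}
```

Because the error type implements `std::error::Error`, the `?` operator can convert it into `Box<dyn std::error::Error>` without extra mapping.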

What 0.4.0 Focuses On

0.4.0 is mainly a quality and runtime-performance release rather than an expansion of the API surface.

  • The derive-generated read path was tightened so more child and choice dispatch work happens in generated code instead of generic runtime fallback logic.
  • Package handling remains centered on generated strongly typed parts instead of a thin zip wrapper.
  • Feature gating is clearer: Office 2007-oriented validation is done via --no-default-features --features parts, while newer Microsoft 365 coverage stays behind microsoft365.
  • Validator support is still optional and should be treated as an extra capability, not the default path.
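To make the first point more concrete, the sketch below contrasts a lookup table built and probed at run time with a `match` whose arms are fixed when the code is generated. The `Child` enum and both functions are hypothetical; the code that ooxmlsdk-derive actually emits may look quite different.

```rust
use std::collections::HashMap;

/// Hypothetical child kinds for a toy body element.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Child {
    Paragraph,
    Table,
    Unknown,
}

/// Runtime-style fallback: builds a table with owned keys and probes it.
fn dispatch_runtime(tag: &str) -> Child {
    let mut table: HashMap<String, Child> = HashMap::new();
    table.insert("w:p".to_string(), Child::Paragraph);
    table.insert("w:tbl".to_string(), Child::Table);
    table.get(tag).copied().unwrap_or(Child::Unknown)
}

/// Generated-style dispatch: the arms are known at build time, so no
/// table is allocated and the compiler can optimize the comparison.
fn dispatch_generated(tag: &str) -> Child {
    match tag {
        "w:p" => Child::Paragraph,
        "w:tbl" => Child::Table,
        _ => Child::Unknown,
    }
}
```

Both functions return the same result for every tag; the difference is only in where the dispatch decision is encoded.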

Project Structure

  • crates/ooxmlsdk: runtime library exposed to downstream users
  • crates/ooxmlsdk-build: generator that turns checked-in metadata into Rust code
  • crates/ooxmlsdk-derive: derive macros used by the generated runtime code
  • crates/ooxmlsdk-test: integration tests and benchmarks
  • sdk_data/: checked-in intermediate generator data
  • schemas/OpenPackagingConventions-XMLSchema/: package schema inputs used by the generator

The generated runtime code under crates/ooxmlsdk/src/schemas/, crates/ooxmlsdk/src/parts/, and related module files is intended to be checked in and reviewed as generated artifacts.

Validation And Benchmarks

For release validation, this repository uses the full workspace sequence:

cargo test -p ooxmlsdk-build test_gen -- --ignored --nocapture
cargo test --workspace
cargo test --workspace --no-default-features
cargo test --workspace --no-default-features --features parts
cargo clippy --workspace --all-targets --no-default-features -- -D warnings
cargo clippy --workspace --all-targets --no-default-features --features parts -- -D warnings
cargo clippy --workspace --all-targets -- -D warnings
cargo fmt --all

For runtime performance work, prefer evaluating cargo bench -p ooxmlsdk-test as a whole. The packages and xml suites have persistently disagreed on the wordprocessing_document/write/parsed case, so treat that case as an anomaly rather than as the sole performance signal.

Known Limitations

  • There is no serde integration.
  • The validator surface is optional and still narrower than the core read/write path.
  • Some schema shapes still map to generated enum-based child collections rather than a fully particle-aware hand-modeled API.
  • to_string() is just the Display implementation; prefer the XML-oriented APIs when write performance matters.
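One general reason a Display-based `to_string()` can be slower is that every call allocates a fresh `String`. The sketch below uses a hypothetical `Text` type (not part of the crate) to compare that with streaming into a single reused buffer. It illustrates the standard-library behavior only and is not a measurement of ooxmlsdk itself.

```rust
use std::fmt::{self, Write};

/// Hypothetical element type; it does no XML escaping.
struct Text<'a>(&'a str);

impl fmt::Display for Text<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "<t>{}</t>", self.0)
    }
}

/// Allocates a temporary `String` per element via `to_string()`.
fn join_with_to_string(items: &[Text<'_>]) -> String {
    let mut out = String::new();
    for item in items {
        out.push_str(&item.to_string());
    }
    out
}

/// Streams every element into one buffer, with no per-element String.
fn join_with_write(items: &[Text<'_>]) -> Result<String, fmt::Error> {
    let mut out = String::new();
    for item in items {
        write!(out, "{}", item)?;
    }
    Ok(out)
}
```

Both functions produce identical output; the second avoids the intermediate allocations, which is the kind of saving a writer-oriented API can offer.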

Changelog

See CHANGELOG.md.

License

MIT OR Apache-2.0

sdk_data/ is generated from the upstream .NET Open XML SDK, and schemas/OpenPackagingConventions-XMLSchema/ contains package schema inputs derived from the Open Packaging Conventions XSDs. Review upstream licensing before redistributing refreshed snapshots.