Open XML SDK for Rust
ooxmlsdk is a Rust library for reading, writing, and round-tripping Office Open XML documents such as .docx, .xlsx, and .pptx. It uses the .NET Open XML SDK as a primary reference for OOXML package and schema behavior, but exposes Rust-native generated types, serializers, and strongly typed package parts.
Features
The runtime crate exposes a small public feature surface:
- default: enables parts; this is the recommended configuration for most users
- parts: enables package-level OOXML read/write support such as WordprocessingDocument, SpreadsheetDocument, and PresentationDocument
- flat-opc: enables Flat OPC package read/write helpers and depends on parts
- mce: enables Markup Compatibility and Extensibility processing and depends on parts
- validators: enables optional validator APIs
The always-available modules in the crate root are:
- common
- schemas
- sdk
- simple_type
- units
Feature-gated modules are:
- parts: behind the parts feature
- validator: behind the validators feature
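As a rough illustration of how this kind of feature gating usually looks in a crate root, the sketch below puts cfg attributes on placeholder modules. The empty module bodies and the compiled_modules helper are hypothetical and are not taken from ooxmlsdk's source; only the module and feature names come from the lists above.

```rust
// Hypothetical crate root mirroring the module layout described above.
// The module bodies are empty placeholders, not the real ooxmlsdk contents.
pub mod common {}
pub mod schemas {}
pub mod sdk {}
pub mod simple_type {}
pub mod units {}

// Compiled only when the `parts` feature is enabled.
#[cfg(feature = "parts")]
pub mod parts {}

// Compiled only when the `validators` feature is enabled.
#[cfg(feature = "validators")]
pub mod validator {}

/// Illustrative helper: lists which of the sketched modules were compiled in.
pub fn compiled_modules() -> Vec<&'static str> {
    let mut modules = vec!["common", "schemas", "sdk", "simple_type", "units"];
    if cfg!(feature = "parts") {
        modules.push("parts");
    }
    if cfg!(feature = "validators") {
        modules.push("validator");
    }
    modules
}
```

Code that names a gated module fails to compile when the matching feature is off, so a downstream crate that disables default features has to opt back in to parts before using the package types.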
Version Coverage
Office 2007 is the baseline. The checked-in generated schemas also include newer OOXML namespaces and package parts from the upstream metadata.
Common build shapes:
- default: generated schemas plus package APIs
- --no-default-features --features parts: package APIs only
- --no-default-features --features flat-opc: package APIs plus Flat OPC helpers
- --no-default-features --features mce: package APIs plus Markup Compatibility and Extensibility processing
- --features validators: optional validator APIs
The generated runtime includes Office 2010, 2013, 2016, 2019, 2021, Microsoft 365-era extensions, and newer upstream namespace revisions currently present in the checked-in metadata. In practice this covers later DrawingML and chart extensions, SVG and 3D-related parts, threaded comments, dynamic-array-era spreadsheet extensions, and other post-2007 additions tracked by Open XML SDK metadata.
Documentation
Rust API documentation is published on docs.rs/ooxmlsdk.
For Open XML package concepts, file format background, and WordprocessingML, SpreadsheetML, PresentationML, Flat OPC, and Markup Compatibility guidance, see the Microsoft Learn Open XML SDK documentation. This crate follows many of the same package and schema concepts while exposing Rust APIs and feature flags.
Package API
The parts feature exposes package-level APIs for .docx, .xlsx, and .pptx files. The intended public surface follows upstream Open XML SDK concepts:
- open and create packages with constructors such as new, new_with_settings, new_from_file, and new_from_file_with_settings
- save packages with save
- inspect package and part relationships with parts, get_all_parts, get_part_by_id, get_parts_of_type, and relationship-specific helpers
- traverse typed related parts with helpers such as related_parts_of_type, related_part_of_type, and relationship-type-specific variants when the relationship id is needed alongside the typed part
- access well-known child parts through typed methods such as main_document_part, workbook_part, presentation_part, worksheet_parts, font_table_part, and chart-related part accessors
- read, replace, or unload parsed part payloads through public data helpers and root-element helpers
Raw package storage, raw relationship sets, generated factory internals, and unchecked dynamic part plumbing are not part of the public API. Prefer the package and part methods above when writing code that should survive generator updates.
The package API follows Open XML SDK container concepts. When relationship metadata matters, typed traversal helpers return RelatedPart<T> so callers can keep the typed part and its r:id together.
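To make the idea of keeping a typed part together with its r:id concrete, here is a minimal sketch of what such a pairing could look like. The field names, the ChartPart stand-in, and the find_by_relationship_id helper are assumptions made for illustration; the crate's actual RelatedPart<T> definition and accessors may differ, so check docs.rs before relying on any of these names.

```rust
/// Hypothetical stand-in for a typed part; not the crate's real type.
#[derive(Debug, Clone, PartialEq)]
pub struct ChartPart {
    pub uri: String,
}

/// Assumed shape only: pairs a typed part with the relationship id that
/// points at it. The real `RelatedPart<T>` may expose this differently.
#[derive(Debug, Clone, PartialEq)]
pub struct RelatedPart<T> {
    pub relationship_id: String,
    pub part: T,
}

/// Looks up the typed part that a given relationship id refers to.
pub fn find_by_relationship_id<'a, T>(
    related: &'a [RelatedPart<T>],
    id: &str,
) -> Option<&'a T> {
    related
        .iter()
        .find(|r| r.relationship_id == id)
        .map(|r| &r.part)
}
```

Keeping the id next to the part matters when markup refers to a part by its r:id attribute: a caller can resolve the reference without going back to a raw relationship table.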
Generated Schema API
The schemas module is generated from upstream Open XML SDK metadata plus checked-in schema extensions. Generated names are intended to read like Rust while staying traceable to the source schema:
- repeated child fields are named for their item type, for example paragraph, extension, or table_row
- choices use concrete child names when the schema provides enough information; generic names remain for genuinely anonymous schema groups
- common scalar shapes are typed: lists are Vec<T>, OOXML booleans are enums, and measures/percentages use unit wrappers
- extension and wildcard content is preserved, with known children exposed through typed choices where possible
Prefer these generated types and conversion helpers over raw XML strings in new code. See the changelog for release-specific API changes.
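The sketch below illustrates why typed scalars can be more useful than raw strings. It assumes an enum that remembers which lexical form an OOXML on/off value used (transitional documents accept true, false, on, off, 1, and 0) and two unit wrappers based on the standard conversions of 20 twips per point and 12,700 EMU per point. The type and method names are hypothetical and are not the generated names from the schemas module.

```rust
/// Hypothetical on/off enum that keeps the original lexical form so a
/// value can be written back exactly as it was read.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OnOffValue {
    True,
    False,
    On,
    Off,
    One,
    Zero,
}

impl OnOffValue {
    /// Parses one of the six accepted lexical forms.
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "true" => Some(Self::True),
            "false" => Some(Self::False),
            "on" => Some(Self::On),
            "off" => Some(Self::Off),
            "1" => Some(Self::One),
            "0" => Some(Self::Zero),
            _ => None,
        }
    }

    /// Returns the exact text this value was parsed from.
    pub fn as_str(self) -> &'static str {
        match self {
            Self::True => "true",
            Self::False => "false",
            Self::On => "on",
            Self::Off => "off",
            Self::One => "1",
            Self::Zero => "0",
        }
    }

    /// Collapses the lexical form to a plain boolean.
    pub fn as_bool(self) -> bool {
        matches!(self, Self::True | Self::On | Self::One)
    }
}

/// Hypothetical wrapper for twentieths of a point.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Twips(pub i64);

impl Twips {
    pub fn to_points(self) -> f64 {
        self.0 as f64 / 20.0
    }
}

/// Hypothetical wrapper for English Metric Units (914,400 per inch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Emu(pub i64);

impl Emu {
    pub fn to_points(self) -> f64 {
        self.0 as f64 / 12_700.0
    }
}
```

With wrappers like these, a DrawingML extent and a WordprocessingML indent cannot be mixed up by accident, and a boolean attribute written as on is not silently rewritten as true during a round trip.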
XML And MCE Compatibility
The generated XML reader/writer preserves markup compatibility data needed for stable round trips, including common mc:* attributes, mc:AlternateContent, choice/fallback content, unknown extension attributes, and extension namespace children used by newer Office documents.
With the mce feature enabled, package/root loading can process known Markup Compatibility and Extensibility constructs such as mc:AlternateContent and package-level ProcessAllParts behavior. Integration coverage includes upstream-derived MCE, strict, OPC, extension, and real-world compatibility samples, with tests focused on public Rust APIs and stable XML/package round trips.
Unknown-element DOM editing and markup compatibility validator behavior are still future work.
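For readers unfamiliar with mc:AlternateContent, the following sketch shows the selection rule that Markup Compatibility processing generally applies: take the first mc:Choice whose required namespaces are all understood, otherwise use the mc:Fallback if one exists. The structs and the select_branch function are hypothetical models written for this example; they are not the crate's types and say nothing about how the mce feature is implemented internally.

```rust
/// Hypothetical model of one mc:Choice branch.
pub struct Choice<'a> {
    /// Namespace names that the `Requires` attribute resolves to.
    pub requires: Vec<&'a str>,
    /// Placeholder for the branch content.
    pub content: &'a str,
}

/// Hypothetical model of an mc:AlternateContent element.
pub struct AlternateContent<'a> {
    pub choices: Vec<Choice<'a>>,
    pub fallback: Option<&'a str>,
}

/// Picks the first choice whose required namespaces are all understood,
/// falling back to mc:Fallback when no choice qualifies.
pub fn select_branch<'a>(
    ac: &AlternateContent<'a>,
    understood: &[&str],
) -> Option<&'a str> {
    ac.choices
        .iter()
        .find(|choice| {
            choice
                .requires
                .iter()
                .all(|ns| understood.iter().any(|u| u == ns))
        })
        .map(|choice| choice.content)
        .or(ac.fallback)
}
```

A consumer that understands the newer namespace gets the richer branch, while an older consumer still receives usable fallback content. Preserving both branches on write is what keeps round trips stable.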
Flat OPC
The flat-opc feature exposes Wordprocessing Flat OPC helpers for loading and writing XML package representations. Flat OPC APIs support strings and readers, and written Flat OPC preserves binary package parts such as alternative format import parts while writing XML-safe parts such as SVG media as XML data.
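In a Flat OPC file each part is a pkg:part element whose payload is either inline XML (pkg:xmlData) or base64 text (pkg:binaryData). The sketch below shows one plausible way to decide between the two from a part's content type. The rule used here (XML media types go inline, everything else is binary) is an assumption for illustration, and the enum and function names are hypothetical; the crate may apply different or additional rules.

```rust
/// Hypothetical classification of how a part is stored in Flat OPC.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FlatOpcPayload {
    /// Written inline as XML inside pkg:xmlData.
    XmlData,
    /// Written as base64 text inside pkg:binaryData.
    BinaryData,
}

/// Assumed heuristic: treat XML media types as inline XML and everything
/// else as binary. Parameters such as "; charset=utf-8" are ignored.
pub fn payload_kind(content_type: &str) -> FlatOpcPayload {
    let media_type = content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();

    let is_xml = media_type.ends_with("+xml")
        || media_type == "application/xml"
        || media_type == "text/xml";

    if is_xml {
        FlatOpcPayload::XmlData
    } else {
        FlatOpcPayload::BinaryData
    }
}
```

Under this heuristic an image/svg+xml media part is written inline as XML, and an image/png part is written as binary data.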
Project Structure
- crates/ooxmlsdk: runtime library exposed to downstream users
- crates/ooxmlsdk-build: generator that turns checked-in metadata into Rust code
- crates/ooxmlsdk-derive: derive macros used by the generated runtime code
- crates/ooxmlsdk-test: integration tests and benchmarks
- sdk_data/: checked-in intermediate generator data
- data/: upstream-derived metadata snapshots consumed by the generator pipeline
- schemas/OpenPackagingConventions-XMLSchema/: package schema inputs used by the generator
The generated runtime code under crates/ooxmlsdk/src/schemas/, crates/ooxmlsdk/src/deserializers/, crates/ooxmlsdk/src/serializers/, crates/ooxmlsdk/src/parts/, and related module files is intended to be checked in and reviewed as generated artifacts.
Validation And Benchmarks
For release validation, this repository uses the full workspace sequence:
For runtime performance work, prefer evaluating cargo bench -p ooxmlsdk-test as a whole. The packages and xml suites have shown a persistent disagreement on wordprocessing_document/write/parsed, so treat that one case as an anomaly rather than as the sole performance signal.
The compatibility round-trip lane is:
The committed fixture set includes document, presentation, MCE, OPC, DrawingML, WML, and SpreadsheetML coverage, including spreadsheet cell types, defined names, formatting, formulas, freeze panes, merged cells, number formats, row/column dimensions, sheet visibility, and rich shared strings.
Known Limitations
- There is no serde integration.
- The validator surface is optional and still narrower than the core read/write path.
- Unknown-element DOM APIs and markup compatibility validator behavior are not yet exposed.
- Some schema shapes still map to generated enum-based child collections rather than a fully particle-aware hand-modeled API.
- to_string() is just Display; prefer the XML-oriented APIs when you care about write performance.
Changelog
See CHANGELOG.md.
Data Provenance
data/ is directly copied from the upstream .NET Open XML SDK.
sdk_data/ is generated from the upstream .NET Open XML SDK, and schemas/OpenPackagingConventions-XMLSchema/ contains package schema inputs derived from the Open Packaging Conventions XSDs. Review upstream licensing before redistributing refreshed snapshots.
License
MIT OR Apache-2.0