§office_oxide
The fastest Office document processing library for Rust.
Reads, writes, and edits DOCX, XLSX, PPTX, DOC, XLS, PPT — all six Microsoft Office formats — with a single unified API and zero C/C++ dependencies.
§Quick start
use office_oxide::Document;
let doc = Document::open("report.docx")?;
println!("{}", doc.plain_text());
§Feature flags
| Flag | What it enables |
|---|---|
| python | PyO3 Python bindings |
| wasm | wasm-bindgen WASM bindings |
| mmap | Memory-mapped file I/O |
| parallel | Rayon-based parallel processing |
Re-exports§
pub use core::OfficeDocument;
pub use error::OfficeError;
pub use error::Result;
pub use format::DocumentFormat;
pub use ir::DocumentIR;
Modules§
- cfb
- Pure Rust reader for Compound File Binary Format (CFBF/OLE2) containers, used by the legacy formats.
- core
- Shared OOXML primitives: OPC, XML utilities, relationships, theme, units.
- create
- Unified document creation API: write new DOCX/XLSX/PPTX from scratch, from IR, or from markdown.
- doc
- Pure Rust reader for legacy Word Binary (.doc) files.
- docx
- Word document (.docx) reader, writer, and editor.
- edit
- Unified document editing API: modify existing DOCX/XLSX/PPTX files in place.
- error
- Top-level error type wrapping all format-specific errors.
- ffi
- C Foreign Function Interface (FFI) for office_oxide.
- format
- DocumentFormat enum and format detection utilities.
- ir
- Format-agnostic intermediate representation (IR) of a document.
- ppt
- Pure Rust reader for legacy PowerPoint Binary (.ppt) presentation files.
- pptx
- PowerPoint presentation (.pptx) reader, writer, and editor.
- xls
- Pure Rust reader for legacy Excel Binary (.xls) BIFF8 workbooks.
- xlsx
- Excel spreadsheet (.xlsx) reader, writer, and editor.
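The module list above splits the six formats into two container families: the OOXML formats (.docx, .xlsx, .pptx) are ZIP packages, while the legacy formats (.doc, .xls, .ppt) are OLE2/CFB compound files. As a rough, standalone illustration of how the two families can be told apart by their leading bytes, here is a minimal sketch using only the standard library. It is not office_oxide's implementation, and `Container` and `sniff_container` are hypothetical names invented for this example; the crate's own detection lives in the format module and may work differently.

```rust
/// Container family guessed from a file's leading bytes.
/// Hypothetical type for illustration; not part of office_oxide.
#[derive(Debug, PartialEq, Eq)]
enum Container {
    /// ZIP package, the container used by DOCX, XLSX, and PPTX.
    Zip,
    /// OLE2/CFB compound file, the container used by DOC, XLS, and PPT.
    Cfb,
    /// Neither signature matched.
    Unknown,
}

/// ZIP local file header signature: "PK\x03\x04".
const ZIP_MAGIC: [u8; 4] = [0x50, 0x4B, 0x03, 0x04];

/// CFB header signature (first eight bytes of every compound file).
const CFB_MAGIC: [u8; 8] = [0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1];

/// Classify a buffer by its magic bytes.
/// Hypothetical helper; only the container family is detected here.
fn sniff_container(bytes: &[u8]) -> Container {
    if bytes.starts_with(&ZIP_MAGIC) {
        Container::Zip
    } else if bytes.starts_with(&CFB_MAGIC) {
        Container::Cfb
    } else {
        Container::Unknown
    }
}
```

Magic bytes only identify the container. Telling DOCX from XLSX from PPTX additionally requires looking inside the ZIP package, and telling DOC from XLS from PPT requires looking at the streams inside the compound file, which this sketch does not attempt.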
Structs§
- Document
- A unified document handle supporting DOCX, XLSX, PPTX, DOC, XLS, and PPT formats.
Constants§
- VERSION
- Library version (matches the Cargo package version).
Functions§
- extract_text
- Extract plain text from any supported document file.
- to_html
- Convert any supported document file to an HTML fragment.
- to_markdown
- Convert any supported document file to Markdown.