High-level pure-Rust convenience helpers that mirror Clark’s current
soffice --headless usage without relying on LibreOffice itself.
The Clark-focused surface in this crate is:
- visual DOCX -> PDF
- visual PPTX -> PDF
- DOC -> DOCX
- XLSX recalc with cached <v> patching
- tracked-change acceptance for DOCX
- generic convert_bytes / convert_bytes_auto
- JSON recalc reports compatible with Clark’s existing recalc.py
- direct DOCX/PPTX page rasterization to PNG/JPEG
- Markdown extraction for DOCX/PPTX/XLSX
- PDF -> TXT/MD/HTML via the native PDF reader
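To make "cached <v> patching" concrete: in SpreadsheetML sheet XML, a formula cell stores its formula in `<f>` and the last computed result in `<v>`. The sketch below is not this crate's implementation. It is a minimal string-based illustration of swapping one cell's cached value while leaving the rest of the sheet XML untouched. The function name `sketch_patch_cached_value` is hypothetical, and the sketch assumes that `r` is the first attribute on `<c>` and that the new value needs no XML escaping.

```rust
/// Hypothetical helper (not part of this crate): replace the text inside the
/// first `<v>…</v>` of the `<c>` element whose `r` attribute equals `cell_ref`.
/// Returns `None` when the cell is missing, self-closing, or has no `<v>`.
pub fn sketch_patch_cached_value(
    sheet_xml: &str,
    cell_ref: &str,
    new_value: &str,
) -> Option<String> {
    // Including the closing quote keeps "A1" from matching "A10".
    // Assumption: `r` is the first attribute of the <c> element.
    let open = format!("<c r=\"{}\"", cell_ref);
    let cell_start = sheet_xml.find(&open)?;

    // A self-closing cell (`<c r="A1"/>`) has no cached value to patch.
    let tag_end = cell_start + sheet_xml[cell_start..].find('>')?;
    if sheet_xml[..tag_end].ends_with('/') {
        return None;
    }

    // Restrict the search for <v> to this cell only.
    let cell_end = cell_start + sheet_xml[cell_start..].find("</c>")?;
    let cell = &sheet_xml[cell_start..cell_end];
    let v_open = cell_start + cell.find("<v>")? + "<v>".len();
    let v_close = v_open + sheet_xml[v_open..cell_end].find("</v>")?;

    // Splice: everything before the old value, the new value, everything after.
    // Assumption: `new_value` is plain numeric text that needs no escaping.
    let mut out = String::with_capacity(sheet_xml.len() + new_value.len());
    out.push_str(&sheet_xml[..v_open]);
    out.push_str(new_value);
    out.push_str(&sheet_xml[v_close..]);
    Some(out)
}
```

A production version would more likely use a streaming XML reader than substring search. The splice-in-place idea is the same: only the cached value changes, and the formula and surrounding markup are preserved byte for byte.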
Structs
Functions
- accept_all_tracked_changes_docx_bytes - Walk every WordprocessingML part inside a DOCX, accept all common tracked revisions, then re-emit the package.
- accept_tracked_changes_docx_bytes (Deprecated) - Back-compat alias; prefer `accept_all_tracked_changes_docx_bytes`.
- base_convert_bytes - Convert a base-format byte stream from `from` to `to`.
- calc_convert_bytes - Convert a calc-format byte stream from `from` to `to`.
- convert_bytes - Convert any supported office-format byte stream from `from` to `to`.
- convert_bytes_auto - Infer the source format from the byte payload itself and dispatch to `convert_bytes`.
- convert_path_bytes - Infer the source format from `path` and dispatch to `convert_bytes`.
- doc_to_docx_bytes - Convert a legacy binary `.doc` file (Word 97-2003) into a DOCX byte stream by extracting the piece-table text and re-emitting it.
- docx_to_html_bytes
- docx_to_jpeg_pages - Rasterize a DOCX document directly to JPEG pages at the requested DPI.
- docx_to_md_bytes - Extract Markdown from an existing DOCX file using the native Writer importer.
- docx_to_odt_bytes
- docx_to_pdf_bytes - Convert a DOCX byte stream into a PDF using Writer’s native Rust layout/rendering path.
- docx_to_png_pages - Rasterize a DOCX document directly to PNG pages at the requested DPI.
- docx_to_txt_bytes
- draw_convert_bytes - Convert a draw-format byte stream from `from` to `to`.
- impress_convert_bytes - Convert an impress-format byte stream from `from` to `to`.
- math_convert_bytes - Convert a math-format byte stream from `from` to `to`.
- odp_to_pdf_bytes
- odp_to_pptx_bytes
- ods_to_csv_bytes
- ods_to_pdf_bytes
- ods_to_xlsx_bytes
- odt_to_docx_bytes
- odt_to_html_bytes
- odt_to_pdf_bytes
- pdf_to_html_bytes
- pdf_to_md_bytes
- pdf_to_txt_bytes
- pptx_to_html_bytes
- pptx_to_jpeg_pages - Rasterize a PPTX deck directly to JPEG slide images at the requested DPI.
- pptx_to_md_bytes - Extract Markdown from an existing PPTX file using the native Impress importer.
- pptx_to_odp_bytes
- pptx_to_pdf_bytes - Convert a PPTX byte stream into a PDF using Impress’s native Rust renderer.
- pptx_to_png_pages - Rasterize a PPTX deck directly to PNG slide images at the requested DPI.
- pptx_to_svg_bytes
- recalc_existing_xlsx_bytes (Deprecated) - Back-compat alias; prefer `xlsx_recalc_bytes`.
- sniff_format_from_bytes - Infer a format from raw bytes.
- sniff_format_from_path - Infer a format hint from a file path by looking at its extension.
- writer_convert_bytes - Convert a writer-format byte stream from `from` to `to`.
- xlsx_recalc_bytes - Re-evaluate every formula in an XLSX workbook and rewrite the cached `<v>` values inside the existing sheet XML. The result is a fresh XLSX byte stream with the same shape as the input, minus `xl/calcChain.xml`.
- xlsx_recalc_check_json - Produce a Clark-shaped JSON report for an existing XLSX workbook.
- xlsx_recalc_report - Produce the structured recalc report used by `xlsx_recalc_check_json`.
- xlsx_to_csv_bytes
- xlsx_to_html_bytes
- xlsx_to_md_bytes - Extract Markdown from an existing XLSX file using the native Calc importer.
- xlsx_to_ods_bytes
- xlsx_to_pdf_bytes
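
The two sniffing helpers listed above infer a format either from a file extension or from the raw bytes. Their actual signatures and return types are not shown on this page, so the following is only a sketch of the general technique under stated assumptions. `SketchFormat`, `SketchContainer`, `sketch_format_from_path`, and `sketch_container_from_bytes` are hypothetical names invented for illustration. The magic numbers used are the well-known ZIP local-file-header signature, the OLE2 compound-file signature, and the PDF header.

```rust
use std::path::Path;

/// Hypothetical format tag for this sketch; the crate's real type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchFormat {
    Docx,
    Pptx,
    Xlsx,
    Odt,
    Ods,
    Odp,
    Doc,
    Pdf,
}

/// Map a file extension (case-insensitive) to a format hint.
/// Returns `None` for a missing, non-UTF-8, or unknown extension.
pub fn sketch_format_from_path(path: &Path) -> Option<SketchFormat> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "docx" => Some(SketchFormat::Docx),
        "pptx" => Some(SketchFormat::Pptx),
        "xlsx" => Some(SketchFormat::Xlsx),
        "odt" => Some(SketchFormat::Odt),
        "ods" => Some(SketchFormat::Ods),
        "odp" => Some(SketchFormat::Odp),
        "doc" => Some(SketchFormat::Doc),
        "pdf" => Some(SketchFormat::Pdf),
        _ => None,
    }
}

/// Coarse container family that leading magic bytes alone can identify.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchContainer {
    /// ZIP archive: DOCX, PPTX, XLSX, and ODF files all start this way.
    Zip,
    /// OLE2 compound file: legacy binary .doc/.xls/.ppt.
    Ole2,
    /// PDF document.
    Pdf,
}

/// Classify a payload by its first few bytes.
pub fn sketch_container_from_bytes(bytes: &[u8]) -> Option<SketchContainer> {
    const OLE2_MAGIC: [u8; 8] = [0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1];
    if bytes.starts_with(b"PK\x03\x04") {
        Some(SketchContainer::Zip)
    } else if bytes.starts_with(&OLE2_MAGIC) {
        Some(SketchContainer::Ole2)
    } else if bytes.starts_with(b"%PDF-") {
        Some(SketchContainer::Pdf)
    } else {
        None
    }
}
```

Magic bytes only narrow the payload down to a container family. Telling DOCX from XLSX or ODT would additionally require looking inside the ZIP archive, which this sketch does not attempt.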