Crate officemd_pdf

PDF extraction wrapper for officemd IR and markdown rendering.

Structs

PdfFontInspection
PdfFontUsage

Enums

OoxmlPdfError

Functions

extract_ir
Extract PDF content as the shared officemd IR.
extract_ir_force
Extract PDF content as the shared officemd IR, optionally forcing extraction on scanned/image-based PDFs.
extract_ir_json
Extract PDF content as IR JSON.
extract_ir_json_force
Extract PDF content as IR JSON, optionally forcing extraction on scanned/image-based PDFs.
inspect_pdf
Detects parseability/classification metadata for a PDF.
inspect_pdf_fonts
Detects fonts declared/used in a PDF with diagnostics.
inspect_pdf_fonts_json
Detects PDF fonts and returns JSON.
looks_like_pdf_header
Returns true when bytes look like a PDF file header.
markdown_from_bytes_force
Render PDF bytes directly to markdown, optionally forcing extraction on scanned/image-based PDFs.
markdown_from_bytes_with_options
Render PDF bytes directly to markdown with shared render options.
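
To illustrate the kind of check that looks_like_pdf_header describes, here is a minimal standalone sketch. It does not call officemd_pdf and is not the crate's implementation: the names `PDF_MAGIC`, `starts_with_pdf_magic`, and `has_pdf_magic_within` are hypothetical, and the listing above does not say whether the crate's check is strict or lenient. The only fact relied on is that a PDF file header begins with the bytes `%PDF-`.

```rust
/// The five-byte marker that opens a PDF header line such as `%PDF-1.7`.
/// (Hypothetical constant for this sketch, not an officemd_pdf item.)
const PDF_MAGIC: &[u8] = b"%PDF-";

/// Strict check: the buffer must begin with the marker.
/// Hypothetical helper; the crate's own function may behave differently.
fn starts_with_pdf_magic(bytes: &[u8]) -> bool {
    bytes.starts_with(PDF_MAGIC)
}

/// Lenient check: accept the marker anywhere within the first `window` bytes,
/// which tolerates a byte-order mark or other leading junk.
/// The choice of window size is left to the caller (an assumption of this sketch).
fn has_pdf_magic_within(bytes: &[u8], window: usize) -> bool {
    // Never slice past the end of a short buffer.
    let end = bytes.len().min(window);
    // `windows` yields nothing when the slice is shorter than the marker.
    bytes[..end]
        .windows(PDF_MAGIC.len())
        .any(|candidate| candidate == PDF_MAGIC)
}
```

A cheap check like this is typically run before attempting full extraction, so that non-PDF input (for example a ZIP-based Office file starting with `PK`) can be rejected early. The strict form suits well-formed files; the lenient form trades a small scan for tolerance of leading bytes.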