PDF extraction wrapper for officemd IR and markdown rendering.
Structs§
Enums§
Functions§
- extract_ir - Extract PDF content as the shared officemd IR.
- extract_ir_force - Extract PDF content as the shared officemd IR, optionally forcing extraction on scanned/image-based PDFs.
- extract_ir_json - Extract PDF content as IR JSON.
- extract_ir_json_force - Extract PDF content as IR JSON, optionally forcing extraction on scanned/image-based PDFs.
- inspect_pdf - Detects parseability/classification metadata for a PDF.
- inspect_pdf_fonts - Detects fonts declared/used in a PDF with diagnostics.
- inspect_pdf_fonts_json - Detects PDF fonts and returns JSON.
- looks_like_pdf_header - Returns true when bytes look like a PDF file header.
- markdown_from_bytes_force - Render PDF bytes directly to markdown, optionally forcing extraction on scanned/image-based PDFs.
- markdown_from_bytes_with_options - Render PDF bytes directly to markdown with shared render options.
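
To make the idea behind looks_like_pdf_header concrete, here is a minimal sketch of a header check. The function name has_pdf_magic is hypothetical, and this is not the crate's implementation. It assumes that "looks like a PDF header" means the bytes begin with the `%PDF-` marker used by the PDF file format; the real function may be more lenient (for example, by tolerating leading bytes before the marker).

```rust
/// Hypothetical stand-in for a PDF header check (not the crate's code).
/// Assumption: a PDF header is the byte sequence `%PDF-` at offset 0.
pub fn has_pdf_magic(bytes: &[u8]) -> bool {
    // `starts_with` returns false for inputs shorter than the marker,
    // so empty or truncated buffers are rejected without any indexing.
    bytes.starts_with(b"%PDF-")
}
```

A check like this is cheap enough to run before handing bytes to a full extraction call, so that non-PDF input can be rejected early.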