Input extraction backends.
Each submodule turns one input format into a PNG image ready for OCR. PDF is supported today; Word and HTML are planned.
Modules
- pdf (pdf-input) - PDF text extraction and rasterization via pdfium-render.
Enums
- InputKind - Source kind detected for an input path.
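
To illustrate what "source kind detected for an input path" could look like in practice, here is a minimal sketch that classifies a path by its file extension. The variant names (`Pdf`, `Image`, `Unknown`) and the `from_path` helper are assumptions made for this example; the actual `InputKind` variants and detection logic are not shown on this page and may differ (for instance, detection could inspect file contents rather than the extension).

```rust
use std::path::Path;

/// Hypothetical stand-in for the crate's `InputKind`.
/// The variants below are assumed for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputKind {
    Pdf,
    Image,
    Unknown,
}

impl InputKind {
    /// Guess the kind from the file extension, ignoring ASCII case.
    /// (Assumed helper; not part of the documented API.)
    pub fn from_path(path: &Path) -> Self {
        // Lowercase the extension so "scan.PDF" and "scan.pdf" match alike.
        let ext = path
            .extension()
            .and_then(|e| e.to_str())
            .map(|e| e.to_ascii_lowercase());

        match ext.as_deref() {
            Some("pdf") => InputKind::Pdf,
            Some("png") | Some("jpg") | Some("jpeg") => InputKind::Image,
            // No extension, non-UTF-8 extension, or an unrecognized one.
            _ => InputKind::Unknown,
        }
    }
}
```

A caller could then dispatch on the result, routing `InputKind::Pdf` to the `pdf` module for rasterization and passing an already-rasterized image straight to OCR.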