Module parser
- FontInfo
- Font metadata resolved from a PDF font dictionary.
- RawTextSegment
- A single text segment extracted from a PDF content stream.
- extract_text_segments_for_page
- Extract raw text segments from a page’s content stream.
- load_pdf
- Load PDF bytes into a lopdf Document, mapping all failures to warnings.
- parse_pdf
- End-to-end PDF parsing: load, extract metadata, resolve fonts, extract text segments.
- resolve_fonts_for_page
- Resolve font dictionaries for a given page.
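
The summary for `load_pdf` says load failures are mapped to warnings instead of being returned as errors. The following is a minimal, standard-library-only sketch of that pattern, not this module's actual code: `Warning`, `Doc`, `try_load`, and `load_or_warn` are hypothetical names invented for illustration, `Doc` stands in for the lopdf `Document`, and the real signatures of `load_pdf` and its warning type are not shown in this index.

```rust
/// Hypothetical warning record; the module's real warning type is not shown here.
#[derive(Debug, Clone, PartialEq)]
struct Warning {
    message: String,
}

/// Hypothetical stand-in for a loaded document (the real module uses lopdf's Document).
#[derive(Debug, PartialEq)]
struct Doc {
    len: usize,
}

/// Stand-in for the fallible loader of the underlying PDF library.
/// It only checks for the `%PDF-` header, which is enough to show both outcomes.
fn try_load(bytes: &[u8]) -> Result<Doc, String> {
    if bytes.starts_with(b"%PDF-") {
        Ok(Doc { len: bytes.len() })
    } else {
        Err("missing %PDF- header".to_string())
    }
}

/// Never propagates an error: a failure becomes `None` plus one recorded warning,
/// so the caller can continue and report all warnings at the end.
fn load_or_warn(bytes: &[u8], warnings: &mut Vec<Warning>) -> Option<Doc> {
    match try_load(bytes) {
        Ok(doc) => Some(doc),
        Err(e) => {
            warnings.push(Warning {
                message: format!("failed to load PDF: {}", e),
            });
            None
        }
    }
}
```

Under these assumptions, calling `load_or_warn(b"not a pdf", &mut warnings)` yields `None` and appends one warning, while input starting with `%PDF-` yields `Some(Doc { .. })` and leaves the warning list unchanged.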