PDF ingestion for BookForge (ROADMAP §9b).
Layout extraction is delegated to poppler's command-line tools;
everything after the pdftohtml -xml output is deterministic Rust:
line merging, column detection, reading order, paragraph clustering,
heading detection, and synthetic-EPUB assembly. The resulting EPUB
flows through the ordinary BookForge pipeline, so this crate is an
ingestion front-end, not a parallel translation path.
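
As an illustration of the first deterministic stage, line merging could look roughly like the sketch below. It is not BookForge's actual code. `Fragment`, `merge_lines`, and the `tolerance` parameter are hypothetical names, and the sketch assumes each positioned text element from the `pdftohtml -xml` output has already been parsed into integer `top`/`left` coordinates plus its text.

```rust
/// One positioned text fragment, loosely modelled on a text element from
/// `pdftohtml -xml` output. Hypothetical type, used for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Fragment {
    top: i32,
    left: i32,
    text: String,
}

/// Merge fragments whose `top` lies within `tolerance` of the first fragment
/// of the current line. Lines come out top-to-bottom, and the fragments
/// inside each line are joined left-to-right with single spaces.
fn merge_lines(mut frags: Vec<Fragment>, tolerance: i32) -> Vec<String> {
    // Sorting on a total key makes the result independent of input order.
    frags.sort_by_key(|f| (f.top, f.left));

    let mut lines: Vec<Vec<Fragment>> = Vec::new();
    for f in frags {
        // Compare against the anchor (first) fragment of the last line.
        let same_line = lines
            .last()
            .map_or(false, |line| (f.top - line[0].top).abs() <= tolerance);
        if same_line {
            lines.last_mut().unwrap().push(f);
        } else {
            lines.push(vec![f]);
        }
    }

    lines
        .into_iter()
        .map(|mut line| {
            // Fragments on one line may differ slightly in `top`, so restore
            // left-to-right order before joining.
            line.sort_by_key(|f| f.left);
            line.iter()
                .map(|f| f.text.trim())
                .collect::<Vec<_>>()
                .join(" ")
        })
        .collect()
}
```

Calling `merge_lines(fragments, 2)` on fragments at `top` 100 and 101 places them on one line, while a fragment at `top` 120 starts a new one. Because the function depends only on its inputs and a total sort order, identical extraction output always produces identical lines.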