pub fn normalize_text(raw: &str) -> String
Canonicalize extracted text so output is stable across adapters:
- Normalize line endings to \n (drop \r).
- Trim trailing whitespace on each line.
- Collapse three-or-more consecutive blank lines to a single blank line.
- Trim leading/trailing blank lines, then append exactly one \n (unless the whole text is empty, which stays empty, per the image-only-PDF contract).
This is layout tidy-up only; it never reorders or drops words. Word-level content is whatever the adapter recovered.
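
The rules above can be sketched as a small standalone function. This is an illustrative reading of the description, not the crate's actual source: the name `normalize_text_sketch` is hypothetical, every \r is dropped literally (so a lone \r does not start a new line), and runs of one or two blank lines are assumed to be left untouched because the rule only mentions three or more.

```rust
/// Hypothetical re-implementation of the documented rules (assumption:
/// this mirrors the prose description, not the real function body).
fn normalize_text_sketch(raw: &str) -> String {
    // Rule 1: normalize line endings by dropping every '\r'.
    let unified = raw.replace('\r', "");

    // Rule 2: trim trailing whitespace on each line.
    let lines: Vec<&str> = unified.split('\n').map(|l| l.trim_end()).collect();

    // Rule 3: a run of three or more blank lines becomes one blank line.
    // Shorter runs are kept as they are (literal reading of the rule).
    let mut out: Vec<&str> = Vec::new();
    let mut i = 0;
    while i < lines.len() {
        if lines[i].is_empty() {
            let start = i;
            while i < lines.len() && lines[i].is_empty() {
                i += 1;
            }
            let run = i - start;
            let keep = if run >= 3 { 1 } else { run };
            out.extend(std::iter::repeat("").take(keep));
        } else {
            out.push(lines[i]);
            i += 1;
        }
    }

    // Rule 4: trim leading/trailing blank lines, then append exactly one
    // '\n'. Text with no non-blank line stays empty.
    let first = out.iter().position(|l| !l.is_empty());
    let last = out.iter().rposition(|l| !l.is_empty());
    match (first, last) {
        (Some(a), Some(b)) => {
            let mut s = out[a..=b].join("\n");
            s.push('\n');
            s
        }
        _ => String::new(),
    }
}
```

Under these assumptions, input that contains only line endings or whitespace normalizes to the empty string, while any input with at least one word ends in exactly one \n.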