PDF → text extraction for the research context layer.
Delegates to the pdf-extract crate. Because PDF parsers can panic on
malformed or unusual input — and ctx_url_read accepts arbitrary
agent-supplied URLs — extraction is wrapped in std::panic::catch_unwind
so a bad document yields an error instead of taking down the handler.
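The panic guard described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: `ExtractError`, `extract_guarded`, and the `parse` closure parameter are hypothetical names, and the closure stands in for the call into the pdf-extract crate.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical error type for this sketch (not taken from the module).
#[derive(Debug, PartialEq)]
pub enum ExtractError {
    /// The parser returned an ordinary error.
    Parser(String),
    /// The parser panicked and the panic was caught.
    Panicked,
}

/// Runs `parse` on `bytes` and turns a panic into `ExtractError::Panicked`.
/// `parse` is a stand-in for the real PDF parser call.
pub fn extract_guarded<F>(bytes: &[u8], parse: F) -> Result<String, ExtractError>
where
    F: FnOnce(&[u8]) -> Result<String, String>,
{
    // AssertUnwindSafe: nothing touched by the closure is reused after a
    // panic, so observing partially updated state is not a concern here.
    match panic::catch_unwind(AssertUnwindSafe(|| parse(bytes))) {
        Ok(Ok(text)) => Ok(text),
        Ok(Err(msg)) => Err(ExtractError::Parser(msg)),
        Err(_) => Err(ExtractError::Panicked),
    }
}
```

Note that `catch_unwind` only catches unwinding panics. With `panic = "abort"` in the build profile the process still terminates, and the default panic hook still prints the panic message before the error is returned.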
Functions
- extract_text - Extract and normalize the text content of a PDF byte buffer.
- looks_like_pdf - PDFs start with %PDF- (optionally after a small BOM/whitespace preamble).
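A magic-byte check of the kind `looks_like_pdf` describes might look like the sketch below. The function name `looks_like_pdf_sketch`, the `MAX_PREAMBLE` cap, and the exact preamble rules (a UTF-8 BOM followed by ASCII whitespace) are assumptions made for illustration. The real function's signature and tolerances are not shown on this page.

```rust
/// Hypothetical cap on how many whitespace bytes to skip before giving up.
const MAX_PREAMBLE: usize = 1024;

/// Returns true if `bytes` begins with `%PDF-`, allowing an optional
/// UTF-8 BOM and leading ASCII whitespace (assumed preamble rules).
pub fn looks_like_pdf_sketch(bytes: &[u8]) -> bool {
    // Drop a UTF-8 byte-order mark if one is present.
    let rest = if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        &bytes[3..]
    } else {
        bytes
    };

    // Count leading ASCII whitespace, looking at no more than MAX_PREAMBLE bytes.
    let skip = rest
        .iter()
        .take(MAX_PREAMBLE)
        .take_while(|b| b.is_ascii_whitespace())
        .count();

    rest[skip..].starts_with(b"%PDF-")
}
```

Checking the header before parsing lets a caller reject responses that are obviously not PDFs (an HTML error page, for example) without invoking the parser at all.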