pub fn passage_token_offsets(
    text: &str,
) -> Result<Vec<(usize, usize)>, AppError>
Returns a (start, end) byte-offset pair for each whitespace-delimited
word in text. This function previously used the tokenizers crate to return
true sub-word offsets; the LLM headless path doesn't need that granularity,
so it now returns word boundaries instead.
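
The behavior described above can be illustrated with a minimal sketch. It is not the crate's actual implementation. It makes three assumptions: `end` is exclusive, "whitespace" means Unicode whitespace as defined by `char::is_whitespace`, and this path never fails. The name `passage_token_offsets_sketch` and the `AppError` definition are hypothetical stand-ins, since the real error type is not shown on this page.

```rust
/// Hypothetical stand-in for the crate's `AppError`; the real type is not shown here.
#[derive(Debug)]
pub struct AppError(pub String);

/// Sketch: collect one (start, end) byte-offset pair per whitespace-delimited word.
/// Assumption: `end` is exclusive, so `&text[start..end]` yields the word itself.
pub fn passage_token_offsets_sketch(text: &str) -> Result<Vec<(usize, usize)>, AppError> {
    let mut offsets = Vec::new();
    // Byte index where the current word began, if we are inside a word.
    let mut word_start: Option<usize> = None;

    // `char_indices` yields byte offsets, so multi-byte characters are handled correctly.
    for (idx, ch) in text.char_indices() {
        if ch.is_whitespace() {
            // Whitespace closes any word in progress.
            if let Some(start) = word_start.take() {
                offsets.push((start, idx));
            }
        } else if word_start.is_none() {
            // First non-whitespace character opens a new word.
            word_start = Some(idx);
        }
    }

    // A word that runs to the end of the input has no trailing whitespace to close it.
    if let Some(start) = word_start {
        offsets.push((start, text.len()));
    }

    // This sketch never fails; the `Result` only mirrors the documented signature.
    Ok(offsets)
}
```

Under these assumptions, each pair can be used directly as a slice range: for a returned `(start, end)`, `&text[start..end]` is the word, and both indices always fall on character boundaries because they come from `char_indices`.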