pub fn from_html(html: &str) -> String
Produces the canonical plain-text extraction used by the location model. The output must be deterministic and stable, because char_offset semantics depend on it.
char_offset
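
To illustrate why stability matters, here is a minimal sketch, not the actual implementation. It assumes that extraction drops tags and collapses whitespace runs, and that char_offset counts Unicode scalar values (not bytes) into the extracted text. Both are assumptions for illustration, and the helper `char_offset_of` is hypothetical.

```rust
/// Illustrative stand-in for `from_html` (ASSUMPTION: the real extraction
/// rules are not specified here). Drops anything between '<' and '>' and
/// collapses each run of whitespace into a single space.
pub fn from_html(html: &str) -> String {
    let mut out = String::with_capacity(html.len());
    let mut in_tag = false;
    let mut pending_space = false;
    for ch in html.chars() {
        match ch {
            '<' => in_tag = true,
            '>' if in_tag => in_tag = false,
            _ if in_tag => {} // skip tag names and attributes
            c if c.is_whitespace() => pending_space = true,
            c => {
                // Emit one separator for a whitespace run, never at the start.
                if pending_space && !out.is_empty() {
                    out.push(' ');
                }
                pending_space = false;
                out.push(c);
            }
        }
    }
    out
}

/// Hypothetical helper: offset of `needle` in `text`, measured in chars.
/// ASSUMPTION: char_offset counts Unicode scalar values, not bytes.
pub fn char_offset_of(text: &str, needle: &str) -> Option<usize> {
    let byte_idx = text.find(needle)?;
    Some(text[..byte_idx].chars().count())
}
```

Under these assumptions, `from_html("<p>Héllo   <b>wörld</b></p>")` yields `Héllo wörld`, and `char_offset_of` locates `wörld` at offset 6, while its byte offset is 7. Any change to the extraction rules, such as how whitespace is collapsed, would shift stored offsets, which is why the output has to stay stable across versions.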