Function get_original_spans 

pub fn get_original_spans<S: Borrow<str>>(
    tokens: &[S],
    original_text: &str,
) -> Vec<Vec<Span>>

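Returns, for each token, the spans of original_text that the token corresponds to, computed from the shortest edit script (SES) between the tokens and original_text. The result has one inner Vec per token, and each span is a (start, end) pair; in the example below these are character offsets with an exclusive end, not byte offsets.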

This is useful, for example, when the tokens were produced from a normalized copy of the text (lowercased, accents removed, whitespace collapsed) and you need their positions in the original, unnormalized text.

§Examples

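// Tokens as they appear in a normalized form of the text
// (lowercased, accent removed, extra whitespace dropped).
let tokens = vec!["a", "la", "gorge"];
let original_text = "à  LA    gorge";
let spans = textspan::get_original_spans(&tokens, original_text);
// One inner Vec per token; each span is (start, end) in characters, end exclusive.
assert_eq!(spans, vec![vec![(0, 1)], vec![(3, 5)], vec![(9, 14)]]);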
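
The spans in the example above count characters rather than bytes ("à" occupies two bytes in UTF-8 but yields the span (0, 1)), so they cannot be used directly as byte ranges when slicing a &str. The following is a minimal sketch of one way to turn such a span back into a substring. It uses only the standard library; the local Span alias and the slice_by_char_span helper are hypothetical names introduced for illustration and are not part of textspan, and the half-open character-offset interpretation is an assumption inferred from the example.

```rust
/// Half-open (start, end) span in characters.
/// Assumption: this mirrors the `Span` in the signature above.
type Span = (usize, usize);

/// Hypothetical helper: returns the substring of `text` covered by a
/// character-offset span.
fn slice_by_char_span(text: &str, (start, end): Span) -> &str {
    // Map a character offset to a byte offset. An offset at or past the
    // last character maps to `text.len()`.
    let byte_at = |char_idx: usize| {
        text.char_indices()
            .nth(char_idx)
            .map_or(text.len(), |(byte_idx, _)| byte_idx)
    };
    &text[byte_at(start)..byte_at(end)]
}

fn main() {
    let original_text = "à  LA    gorge";
    // Spans copied from the example above: one inner Vec per token.
    let spans: Vec<Vec<Span>> = vec![vec![(0, 1)], vec![(3, 5)], vec![(9, 14)]];
    for token_spans in &spans {
        for &span in token_spans {
            println!("{:?} -> {:?}", span, slice_by_char_span(original_text, span));
        }
    }
}
```

With the spans from the example, the helper recovers "à", "LA", and "gorge" from the original text. Walking char_indices on every call is linear in the text length; for many lookups, precomputing a table of byte offsets once would be a reasonable alternative.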