Function passage_token_offsets 

pub fn passage_token_offsets(
    tokenizer: &Tokenizer,
    text: &str,
) -> Result<Vec<(usize, usize)>, AppError>

Returns the byte-offset pairs (start, end) for each token in text.

The passage prefix is prepended to text before tokenizing; the offsets in the returned vector are then shifted back so that they index into the original text slice, not the prefixed string.
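
The offset adjustment described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: the prefix value `PASSAGE_PREFIX`, the whitespace-based stand-in tokenizer `whitespace_offsets`, and the helper `offsets_relative_to_text` are all hypothetical names introduced here, and the decision to drop tokens that fall inside the prefix is an assumption.

```rust
/// Hypothetical prefix; the real value is not shown in this documentation.
const PASSAGE_PREFIX: &str = "passage: ";

/// Stand-in for a real tokenizer (assumption): splits on whitespace and
/// reports the (start, end) byte offsets of each piece within `input`.
fn whitespace_offsets(input: &str) -> Vec<(usize, usize)> {
    let mut offsets = Vec::new();
    let mut start: Option<usize> = None;
    for (i, ch) in input.char_indices() {
        if ch.is_whitespace() {
            // A token ends at the first whitespace character after it.
            if let Some(s) = start.take() {
                offsets.push((s, i));
            }
        } else if start.is_none() {
            start = Some(i);
        }
    }
    // Close a token that runs to the end of the input.
    if let Some(s) = start {
        offsets.push((s, input.len()));
    }
    offsets
}

/// Tokenizes `PASSAGE_PREFIX + text`, then shifts every offset back by the
/// prefix length so the pairs index into `text` itself. Tokens that start
/// inside the prefix are dropped (an assumption of this sketch).
fn offsets_relative_to_text(text: &str) -> Vec<(usize, usize)> {
    let prefixed = format!("{}{}", PASSAGE_PREFIX, text);
    let shift = PASSAGE_PREFIX.len();
    whitespace_offsets(&prefixed)
        .into_iter()
        .filter(|&(start, _)| start >= shift)
        .map(|(start, end)| (start - shift, end - shift))
        .collect()
}
```

With these assumptions, each returned pair can be used to slice the original input directly, for example `&text[start..end]`, without accounting for the prefix.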

Errors

Returns Err when the tokenizer fails to encode the input.