Function normalize_sentence_with_max_span 

pub fn normalize_sentence_with_max_span(
    input: &str,
    max_span_tokens: usize,
) -> String

Normalize a full sentence with a configurable max span size.

max_span_tokens controls the maximum number of consecutive tokens that will be considered as a single normalizable expression. Smaller values are faster but may miss multi-word expressions. Larger values catch more patterns but do more work per token.

use text_processing_rs::normalize_sentence_with_max_span;

// Short span: only catches small expressions
assert_eq!(normalize_sentence_with_max_span("I have twenty one apples", 4), "I have 21 apples");
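To make the trade-off concrete, here is a minimal sketch of how a span-limited scan could work. It is not the crate's implementation: `normalize_with_span_sketch` and `parse_number_words` are hypothetical names, and the toy parser only understands number words such as "twenty one". At each position it tries the longest window first, capped at `max_span_tokens`, and shrinks the window until something parses.

```rust
/// Hypothetical toy parser (an assumption for illustration, not part of
/// text_processing_rs): recognizes a single unit word, a single tens word,
/// or a tens word followed by a unit word.
fn parse_number_words(tokens: &[&str]) -> Option<u32> {
    fn unit(w: &str) -> Option<u32> {
        match w {
            "one" => Some(1), "two" => Some(2), "three" => Some(3),
            "four" => Some(4), "five" => Some(5), "six" => Some(6),
            "seven" => Some(7), "eight" => Some(8), "nine" => Some(9),
            _ => None,
        }
    }
    fn tens(w: &str) -> Option<u32> {
        match w {
            "twenty" => Some(20), "thirty" => Some(30), "forty" => Some(40),
            "fifty" => Some(50), "sixty" => Some(60), "seventy" => Some(70),
            "eighty" => Some(80), "ninety" => Some(90),
            _ => None,
        }
    }
    match tokens {
        [w] => unit(*w).or_else(|| tens(*w)),
        [t, u] => Some(tens(*t)? + unit(*u)?),
        _ => None,
    }
}

/// Hypothetical sketch of a span-limited normalizer: at each token, try
/// windows from `max_span_tokens` down to 1 and take the first that parses.
fn normalize_with_span_sketch(input: &str, max_span_tokens: usize) -> String {
    let tokens: Vec<&str> = input.split_whitespace().collect();
    let mut out: Vec<String> = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        // Never look past the end of the sentence.
        let longest = max_span_tokens.min(tokens.len() - i);
        let mut matched = false;
        for span in (1..=longest).rev() {
            if let Some(n) = parse_number_words(&tokens[i..i + span]) {
                out.push(n.to_string());
                i += span;
                matched = true;
                break;
            }
        }
        if !matched {
            // No window starting here parsed: keep the token unchanged.
            out.push(tokens[i].to_string());
            i += 1;
        }
    }
    out.join(" ")
}
```

In this sketch, a span of 4 lets "twenty one" be read as one expression, while a span of 1 forces each word to be handled alone, so the two-word number is split. The number of windows tried per position grows with `max_span_tokens`, which is the extra per-token work the description refers to. The real function may split or match differently.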