Function normalize_for_matching 

pub fn normalize_for_matching(input: &str, opts: &MatchingOptions) -> String

Normalize input for matching: NFKC → CaseFold → Confusable Skeleton.

Returns a canonical matching form where:

  • Compatibility equivalents are unified (NFKC)
  • Case differences are eliminated (Unicode case folding)
  • Visually confusable characters map to the same prototype (UTS #39 skeleton)

Two strings that differ only in compatibility form, case, or visually confusable characters produce the same result, so the outputs can be compared directly for keyword detection and anti-spoofing.
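
To make the order of the three stages concrete, here is a minimal, self-contained sketch of the same pipeline shape. It is not this crate's implementation: the `toy_*` functions are hypothetical, and a handful of hand-written mappings stand in for the real Unicode data (NFKC, case folding, and the UTS #39 confusables table). `char::to_lowercase` is used only as an approximation of case folding.

```rust
// Toy illustration only: tiny hand-written tables stand in for the real
// Unicode data. All `toy_*` names are hypothetical, not part of the crate.

/// Stage 1 stand-in for NFKC: unify a few compatibility characters.
fn toy_compat(c: char, out: &mut String) {
    match c {
        // U+FB01 LATIN SMALL LIGATURE FI expands to two letters.
        '\u{FB01}' => out.push_str("fi"),
        // Fullwidth Latin letters sit at a fixed offset (0xFEE0) from ASCII.
        '\u{FF21}'..='\u{FF3A}' | '\u{FF41}'..='\u{FF5A}' => {
            out.push(char::from_u32(c as u32 - 0xFEE0).unwrap());
        }
        _ => out.push(c),
    }
}

/// Stage 3 stand-in for the confusable skeleton: map a few look-alikes
/// to a single prototype character.
fn toy_prototype(c: char) -> char {
    match c {
        '\u{0131}' => 'i', // LATIN SMALL LETTER DOTLESS I
        '\u{0430}' => 'a', // CYRILLIC SMALL LETTER A
        '\u{0435}' => 'e', // CYRILLIC SMALL LETTER IE
        '\u{043E}' => 'o', // CYRILLIC SMALL LETTER O
        _ => c,
    }
}

/// Compatibility mapping, then lowercasing (approximating case folding),
/// then prototype mapping, in that order.
fn toy_normalize_for_matching(input: &str) -> String {
    let mut compat = String::with_capacity(input.len());
    for c in input.chars() {
        toy_compat(c, &mut compat);
    }
    compat
        .chars()
        .flat_map(char::to_lowercase)
        .map(toy_prototype)
        .collect()
}

fn main() {
    // Fullwidth "FILE" is unified by stage 1, then lowercased by stage 2.
    assert_eq!(toy_normalize_for_matching("\u{FF26}\u{FF29}\u{FF2C}\u{FF25}"), "file");
    // The "fi" ligature is expanded by stage 1.
    assert_eq!(toy_normalize_for_matching("\u{FB01}le"), "file");
    // Dotless i is unchanged by stages 1 and 2 and mapped by stage 3.
    assert_eq!(toy_normalize_for_matching("f\u{0131}le"), "file");
}
```

The order matters in this sketch: the compatibility step runs first so that characters such as fullwidth capitals become ordinary letters before lowercasing and prototype mapping see them. The real output is best treated as an opaque comparison key rather than display text.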

§Examples

use simd_normalizer::matching::{normalize_for_matching, MatchingOptions};

let opts = MatchingOptions::default();

// Case folding
assert_eq!(
    normalize_for_matching("File", &opts),
    normalize_for_matching("file", &opts),
);

// Dotless i (U+0131, used in Turkish) matches ASCII "i" via the confusable skeleton
assert_eq!(
    normalize_for_matching("file", &opts),
    normalize_for_matching("f\u{0131}le", &opts),
);