pub fn jaccard_similarity(a: &str, b: &str) -> f64
Computes the trigram-Jaccard similarity between two strings.
The score is |A ∩ B| / |A ∪ B| where A and B are the sets of
character-trigrams extracted from each input. The trigrams are taken
over Unicode scalar values via char_indices, so the function is
safe to call on multi-byte UTF-8 inputs without byte-boundary errors.
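The definition above can be sketched in a few lines. This is a minimal illustration, not the crate's actual implementation. The names `trigrams` and `jaccard_sketch` are hypothetical. It assumes that no padding is added around the input, and that two different inputs which are both too short to yield any trigram score 0.0; neither point is stated in the description.

```rust
use std::collections::HashSet;

/// Hypothetical helper: the set of character trigrams of `s`,
/// sliced at `char` boundaries so multi-byte input never panics.
fn trigrams(s: &str) -> HashSet<&str> {
    // Fewer than three chars means no trigram at all.
    if s.chars().nth(2).is_none() {
        return HashSet::new();
    }
    // Byte offset where each char starts.
    let starts = s.char_indices().map(|(i, _)| i);
    // Byte offset three chars later; the final trigram ends at `s.len()`.
    let ends = s
        .char_indices()
        .map(|(i, _)| i)
        .skip(3)
        .chain(std::iter::once(s.len()));
    starts.zip(ends).map(|(i, j)| &s[i..j]).collect()
}

/// Hypothetical sketch of |A ∩ B| / |A ∪ B| over trigram sets.
fn jaccard_sketch(a: &str, b: &str) -> f64 {
    // Covers "both empty" and "identical inputs".
    if a == b {
        return 1.0;
    }
    let (ta, tb) = (trigrams(a), trigrams(b));
    let inter = ta.intersection(&tb).count();
    let union = ta.len() + tb.len() - inter;
    if union == 0 {
        // Assumption: differing inputs with no trigrams have no overlap.
        return 0.0;
    }
    inter as f64 / union as f64
}
```

For example, "abcd" yields {abc, bcd} and "bcde" yields {bcd, cde}, so the sets share one trigram out of three distinct ones and the sketch returns 1/3.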
§Edge cases
- Both inputs empty: returns 1.0 (the empty trigram set is trivially contained in itself).
- One input empty, the other non-empty: returns 0.0 (no overlap).
- Identical inputs: returns 1.0.
The function is pure: it performs no I/O, allocates nothing beyond the two trigram sets, and is deterministic for a given pair of inputs. It is safe to call in hot paths.