Function tokenizations::get_charmap
pub fn get_charmap(a: &str, b: &str) -> (CharMap, CharMap)
Returns the character mappings c_a2b (from a to b) and c_b2a (from b to a) based on the shortest edit script (SES).
The inputs a and b can be noisy. For example, "bar" and "bår" are still compared properly, with each character mapped to its counterpart.
Examples
Basic usage:
use tokenizations::get_charmap;

let a = "bar";
let b = "bår";
let (c_a2b, c_b2a) = get_charmap(a, b);
assert_eq!(c_a2b, vec![vec![0], vec![1], vec![2]]);
assert_eq!(c_b2a, vec![vec![0], vec![1], vec![2]]);
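A common use of such a mapping is to project a character span from a onto b. The following is a minimal sketch of that idea, not part of the tokenizations crate: map_span and CharMapSketch are hypothetical names, and the nested-vector shape is assumed from the assertions in the example above (the crate's actual CharMap alias may differ between versions). Indices are taken to count characters rather than bytes, as the three-entry c_b2a for "bår" suggests.

```rust
// Assumption: the mapping has the nested-vector shape used in the assertions
// of the example; the crate's real `CharMap` type may differ.
type CharMapSketch = Vec<Vec<usize>>;

/// Hypothetical helper (not provided by `tokenizations`): projects the
/// half-open character range `start..end` of `a` onto `b`.
/// Returns `None` when no character in the range has a counterpart in `b`.
fn map_span(c_a2b: &CharMapSketch, start: usize, end: usize) -> Option<(usize, usize)> {
    // Clamp the range so that slicing below cannot go out of bounds.
    let end = end.min(c_a2b.len());
    if start >= end {
        return None;
    }
    let mut lo: Option<usize> = None;
    let mut hi: Option<usize> = None;
    for targets in &c_a2b[start..end] {
        for &j in targets {
            lo = Some(lo.map_or(j, |v| v.min(j)));
            hi = Some(hi.map_or(j, |v| v.max(j)));
        }
    }
    match (lo, hi) {
        // Convert the inclusive maximum back into a half-open upper bound.
        (Some(l), Some(h)) => Some((l, h + 1)),
        _ => None,
    }
}

fn main() {
    // The mapping from the example: every character of "bar" maps to "bår".
    let c_a2b: CharMapSketch = vec![vec![0], vec![1], vec![2]];
    assert_eq!(map_span(&c_a2b, 1, 3), Some((1, 3)));

    // A made-up mapping in which the middle character has no counterpart.
    let with_gap: CharMapSketch = vec![vec![0], vec![], vec![1]];
    assert_eq!(map_span(&with_gap, 1, 2), None);
    assert_eq!(map_span(&with_gap, 0, 3), Some((0, 2)));
}
```

Returning a half-open range keeps the result directly usable for slicing a Vec<char> built from b. Slicing the &str itself would first require converting character indices to byte offsets.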