Skip to main content

Crate homoglyph_detect

Crate homoglyph_detect 

Source
Expand description

§homoglyph-detect

Detect Cyrillic / Greek / fullwidth lookalike characters masquerading as ASCII letters. Used to catch domain-spoof URLs and content that mixes scripts to slip past keyword filters.

Common attack: replace the a in claude with Cyrillic а (U+0430). It renders identically but bypasses keyword matching.

§Example

use homoglyph_detect::{find_homoglyphs, normalize_to_ascii};
let attack = "cl\u{0430}ude"; // Cyrillic 'a'
let hits = find_homoglyphs(attack);
assert_eq!(hits.len(), 1);
assert_eq!(hits[0].ascii_equivalent, 'a');
assert_eq!(normalize_to_ascii(attack), "claude");

Structs§

Finding
One detected lookalike.

Functions§

ascii_equivalent
Per-char lookalike → ASCII mapping. Returns None if the char is not a known confusable for any ASCII letter.
find_homoglyphs
Return every lookalike position in s.
has_homoglyphs
True when at least one lookalike is present.
normalize_to_ascii
Replace each lookalike with its ASCII equivalent.