disarm 0.10.0

Unicode canonicalization and TR39 confusable analysis: building blocks for text-security pipelines (homoglyph/bidi/zalgo handling) plus standards-based transliteration
# Vietnamese (vi) language-specific overrides
# Source: NFKD decomposition + Vietnamese orthographic conventions

0110	D
0111	d
01A0	O
01A1	o
01AF	U
01B0	u
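
The table above maps a hexadecimal code point to its ASCII replacement, with the two fields separated by a tab. As a rough illustration of how such a table could be consumed, here is a minimal Python sketch. It is not disarm's actual implementation or API. The names `parse_overrides` and `transliterate` are hypothetical, and the fallback (NFKD decomposition followed by dropping combining marks) is an assumption based on the "Source" comment in the file header.

```python
import unicodedata

# Override table copied from the file above: hex code point, a tab, replacement.
VI_OVERRIDES = """\
# Vietnamese (vi) language-specific overrides
0110\tD
0111\td
01A0\tO
01A1\to
01AF\tU
01B0\tu
"""


def parse_overrides(text):
    """Parse 'HEX<TAB>replacement' lines, skipping blanks and '#' comments.

    Hypothetical helper, not part of disarm.
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        code_point, replacement = line.split("\t", 1)
        table[chr(int(code_point, 16))] = replacement
    return table


def transliterate(s, overrides):
    """Apply overrides first; otherwise decompose (NFKD) and drop combining marks.

    Hypothetical helper; the fallback strategy is an assumption.
    """
    out = []
    for ch in s:
        if ch in overrides:
            out.append(overrides[ch])
            continue
        decomposed = unicodedata.normalize("NFKD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


table = parse_overrides(VI_OVERRIDES)
print(transliterate("\u0110\u01b0\u1eddng", table))  # "Đường" -> Duong
```

The entries for U+0110 and U+0111 matter most here: Đ and đ have no canonical or compatibility decomposition, so NFKD alone would leave them unchanged, and an explicit override is the only way to reach `D` and `d`.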