Module basic_text_internals::unicode
Constants
BEL | ASCII BEL. |
BOM | ZERO WIDTH NO-BREAK SPACE, also known as the byte-order mark, or BOM |
CAN | ASCII CAN. |
CGJ | COMBINING GRAPHEME JOINER |
DEL | ASCII DEL, which is not what’s generated by the “delete” key on the keyboard |
ESC | ASCII ESC, known as ‘\e’ in some contexts. |
FF | ASCII FF, known as ‘\f’ in some contexts. |
LS | LINE SEPARATOR |
MAX_UTF8_SIZE | The size, in bytes, of the longest UTF-8 encoding of a Unicode scalar value. RFC 2279 allowed longer encodings, but it is obsoleted by RFC 3629, which doesn’t. This limit is also documented in the relevant section of Rust’s documentation. |
NEL | EBCDIC NEXT LINE, which is treated like generic whitespace. |
NORMALIZATION_BUFFER_LEN | |
NORMALIZATION_BUFFER_SIZE | The minimum size of a buffer needed to perform NFC normalization, and thus the minimum size needed to pass to |
ORC | OBJECT REPLACEMENT CHARACTER |
PS | PARAGRAPH SEPARATOR |
REPL | REPLACEMENT CHARACTER |
SUB | ASCII SUB. |
WJ | WORD JOINER |
ZWJ | ZERO WIDTH JOINER |
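The following is a minimal sketch of how the `MAX_UTF8_SIZE` limit relates to some of the code points listed above. It uses only the standard library. The constants are local stand-ins declared for illustration: their names mirror this module, but their types and values here are assumptions based on the Unicode code point assignments (for example U+FEFF for BOM), not copied from the crate's source. `encoded_len` is a hypothetical helper.

```rust
// Local stand-ins for illustration only. The names mirror the module's
// constants, but the types and values are assumptions taken from the
// Unicode code charts, not from the crate itself.
const MAX_UTF8_SIZE: usize = 4; // RFC 3629 limits UTF-8 to 4 bytes per scalar value
const BOM: char = '\u{feff}'; // ZERO WIDTH NO-BREAK SPACE
const REPL: char = '\u{fffd}'; // REPLACEMENT CHARACTER
const ZWJ: char = '\u{200d}'; // ZERO WIDTH JOINER
const DEL: char = '\u{7f}'; // ASCII DEL

/// Hypothetical helper: encode `c` into a stack buffer of `MAX_UTF8_SIZE`
/// bytes and return how many bytes the encoding used.
fn encoded_len(c: char) -> usize {
    let mut buf = [0u8; MAX_UTF8_SIZE];
    c.encode_utf8(&mut buf).len()
}

fn main() {
    // The largest scalar value, U+10FFFF, needs the full buffer.
    assert_eq!(encoded_len(char::MAX), MAX_UTF8_SIZE);

    // Code points in the Basic Multilingual Plane above U+07FF take 3 bytes.
    assert_eq!(encoded_len(BOM), 3);
    assert_eq!(encoded_len(REPL), 3);
    assert_eq!(encoded_len(ZWJ), 3);

    // ASCII characters take a single byte.
    assert_eq!(encoded_len(DEL), 1);

    println!("every checked char fits in {} bytes", MAX_UTF8_SIZE);
}
```

A buffer of `MAX_UTF8_SIZE` bytes is enough for `char::encode_utf8` on any `char`, so the helper never panics on a too-small buffer.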
Functions
is_normalization_form_starter |