Module basic_text_internals::unicode

Constants

BEL

ASCII BEL.

BOM

ZERO WIDTH NO-BREAK SPACE, also known as the byte-order mark, or BOM.

CAN

ASCII CAN.

CGJ

COMBINING GRAPHEME JOINER

DEL

ASCII DEL, which is not what’s generated by the “delete” key on the keyboard.

ESC

ASCII ESC, known as ‘\e’ in some contexts.

FF

ASCII FF, known as ‘\f’ in some contexts.

LS

LINE SEPARATOR

MAX_UTF8_SIZE

The size, in bytes, of the longest UTF-8 encoding of a single Unicode scalar value. RFC 2279 allowed longer encodings, but it is obsoleted by RFC 3629, which does not. This limit is also documented in the relevant section of Rust’s documentation.
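
As a rough illustration of this limit, the sketch below encodes a few scalar values into a fixed stack buffer using only the standard library. The local `MAX_UTF8_SIZE` constant is a stand-in defined for this example, with the value 4 assumed from RFC 3629; `encoded_len` is a hypothetical helper, not part of this crate.

```rust
/// Local stand-in for the crate's constant. The value 4 is an assumption
/// based on RFC 3629, which caps UTF-8 encodings at four bytes.
const MAX_UTF8_SIZE: usize = 4;

/// Hypothetical helper: encode `c` into a stack buffer of `MAX_UTF8_SIZE`
/// bytes and return how many bytes the encoding actually used.
fn encoded_len(c: char) -> usize {
    let mut buf = [0u8; MAX_UTF8_SIZE];
    c.encode_utf8(&mut buf).len()
}

fn main() {
    // One sample from each encoding length.
    assert_eq!(encoded_len('A'), 1);
    assert_eq!(encoded_len('\u{E9}'), 2);
    assert_eq!(encoded_len('\u{FFFD}'), 3);
    // The largest scalar value, U+10FFFF, still fits in the buffer.
    assert_eq!(encoded_len(char::MAX), MAX_UTF8_SIZE);
}
```

A buffer of this size is enough for `char::encode_utf8` on any `char`, so the call above cannot panic for lack of space.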

NEL

EBCDIC NEXT LINE, which is treated like generic whitespace.
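
For comparison, Rust’s standard library also classifies U+0085 as whitespace, although not as ASCII whitespace. The sketch below shows only that standard-library behavior; the local `NEL` constant and the `split_words` helper are defined for this example, and how this crate itself handles NEL is not shown here.

```rust
/// Local stand-in for the crate's constant: U+0085 NEXT LINE.
const NEL: char = '\u{85}';

/// Hypothetical helper: split on Unicode whitespace using the standard library.
fn split_words(s: &str) -> Vec<&str> {
    s.split_whitespace().collect()
}

fn main() {
    // NEL has the Unicode White_Space property, so `char::is_whitespace` accepts it.
    assert!(NEL.is_whitespace());
    // It lies outside the ASCII range, so the ASCII-only check rejects it.
    assert!(!NEL.is_ascii_whitespace());
    // `split_whitespace` therefore treats it as a separator.
    assert_eq!(split_words("one\u{85}two"), ["one", "two"]);
}
```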

NORMALIZATION_BUFFER_LEN
NORMALIZATION_BUFFER_SIZE

The minimum size of a buffer needed to perform NFC normalization, and thus the minimum size of the buffer passed to TextReader’s read.

ORC

OBJECT REPLACEMENT CHARACTER

PS

PARAGRAPH SEPARATOR

REPL

REPLACEMENT CHARACTER

SUB

ASCII SUB.

WJ

WORD JOINER

ZWJ

ZERO WIDTH JOINER
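
For quick reference, the sketch below lists the standard Unicode code points for the character names above. The constants are local definitions written for this example, on the assumption that the crate’s `char` constants carry these standard values; `utf8_bytes` is a hypothetical helper.

```rust
// Local stand-ins using the standard Unicode code points for each name.
const BEL: char = '\u{7}';
const BOM: char = '\u{FEFF}';
const CAN: char = '\u{18}';
const CGJ: char = '\u{34F}';
const DEL: char = '\u{7F}';
const ESC: char = '\u{1B}';
const FF: char = '\u{C}';
const LS: char = '\u{2028}';
const NEL: char = '\u{85}';
const ORC: char = '\u{FFFC}';
const PS: char = '\u{2029}';
const REPL: char = '\u{FFFD}';
const SUB: char = '\u{1A}';
const WJ: char = '\u{2060}';
const ZWJ: char = '\u{200D}';

/// Hypothetical helper: return the UTF-8 encoding of `c` as a byte vector.
fn utf8_bytes(c: char) -> Vec<u8> {
    let mut buf = [0u8; 4];
    c.encode_utf8(&mut buf).as_bytes().to_vec()
}

fn main() {
    let table = [
        ("BEL", BEL), ("BOM", BOM), ("CAN", CAN), ("CGJ", CGJ), ("DEL", DEL),
        ("ESC", ESC), ("FF", FF), ("LS", LS), ("NEL", NEL), ("ORC", ORC),
        ("PS", PS), ("REPL", REPL), ("SUB", SUB), ("WJ", WJ), ("ZWJ", ZWJ),
    ];
    // Print each name with its code point and UTF-8 bytes.
    for (name, c) in table {
        println!("{:<4} U+{:04X} {:02X?}", name, c as u32, utf8_bytes(c));
    }

    // The byte-order mark encodes to the familiar EF BB BF prefix.
    assert_eq!(utf8_bytes(BOM), [0xEF, 0xBB, 0xBF]);
    // REPL matches the standard library's replacement character.
    assert_eq!(REPL, char::REPLACEMENT_CHARACTER);
    // Rust has no `\e` or `\f` escapes; hex escapes express the same values.
    assert_eq!(ESC, '\x1b');
    assert_eq!(FF, '\x0c');
}
```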

Functions

is_normalization_form_starter