rehuman — Unicode‑safe text cleaning & typographic normalization.
Structs
- CleaningOptions - Configuration for cleaning.
- CleaningOptionsBuilder - Builder for CleaningOptions.
- CleaningResult - Result of a text cleaning operation.
- CleaningStats - Detailed statistics about cleaning operations.
- StreamCleaner - Incremental cleaner that processes text in newline-delimited chunks.
- StreamSummary - Summary of cumulative streaming cleanup work.
- TextCleaner - Main cleaner.
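The streaming entry above describes processing text in newline-delimited chunks. As a rough illustration of that general idea only, the sketch below buffers arbitrary input chunks and hands back complete lines. `LineChunker`, `feed`, and `finish` are hypothetical names invented for this example; they are not part of the rehuman API, and the crate's actual streaming type may work differently.

```rust
/// Hypothetical helper (NOT part of rehuman): buffers arbitrary chunks and
/// yields only complete, newline-terminated lines.
#[derive(Default)]
pub struct LineChunker {
    /// Text received so far that does not yet end in a newline.
    pending: String,
}

impl LineChunker {
    /// Feed one chunk; returns every complete line (including its '\n')
    /// that is available after appending the chunk.
    pub fn feed(&mut self, chunk: &str) -> Vec<String> {
        self.pending.push_str(chunk);
        let mut lines = Vec::new();
        while let Some(pos) = self.pending.find('\n') {
            // '\n' is a single byte, so `..=pos` always ends on a char boundary.
            let line: String = self.pending.drain(..=pos).collect();
            lines.push(line);
        }
        lines
    }

    /// Flush the remainder: a final line that had no trailing newline.
    pub fn finish(&mut self) -> Option<String> {
        if self.pending.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.pending))
        }
    }
}
```

Holding back the unterminated tail means a line split across two chunks is only ever seen whole, which is what a per-line cleaning step would need.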
Enums
- CleaningError - Errors produced by fallible cleaning APIs.
- EmojiPolicy - Policy for emoji handling when keyboard_only is enabled.
- LineEndingStyle - Line ending styles.
- NonAsciiPolicy - Policy for handling non-ASCII graphemes in keyboard_only mode.
- UnicodeNormalizationMode - Unicode normalization modes.
Functions
- clean - Convenience: clean with default options.
- humanize - Convenience: clean with the humanize preset.
- is_emoji - Emoji detection via the Unicode Emoji binary property.
- is_extended_keyboard_char - Curated non-ASCII characters allowed in extended keyboard mode.
- is_hidden_char - Hidden/format-like characters defined by Default_Ignorable_Code_Point (DI).
- is_keyboard_ascii - ASCII keyboard (US) characters + whitespace controls typically produced by keyboards.
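To make the last two predicate descriptions concrete, here is a minimal standalone sketch of what such checks might look like. It does not call rehuman: the function names ending in `_sketch` are hypothetical, the keyboard check assumes "printable ASCII plus tab, line feed, and carriage return", and the hidden-character check covers only a small sample of Default_Ignorable_Code_Point characters rather than the full Unicode property.

```rust
/// Assumed reading of "ASCII keyboard (US) characters + whitespace controls":
/// printable ASCII (U+0020..=U+007E) plus tab, line feed, and carriage return.
/// Hypothetical stand-in, not the crate's implementation.
pub fn keyboard_ascii_sketch(c: char) -> bool {
    matches!(c, ' '..='~' | '\t' | '\n' | '\r')
}

/// A small, NON-exhaustive sample of Default_Ignorable_Code_Point characters.
/// The real property covers many more code points.
pub fn hidden_char_sketch(c: char) -> bool {
    matches!(
        c,
        '\u{00AD}'                  // soft hyphen
        | '\u{200B}'..='\u{200F}'   // zero-width space/joiners, LRM, RLM
        | '\u{2060}'                // word joiner
        | '\u{FEFF}'                // zero width no-break space (BOM)
    )
}

/// Drop every character matched by the sample predicate above.
pub fn strip_hidden_sketch(input: &str) -> String {
    input.chars().filter(|c| !hidden_char_sketch(*c)).collect()
}
```

Per-character predicates like these compose naturally with iterator filters, which is presumably why the crate exposes its own checks as free functions.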