Module unicode

Unicode utilities for text processing.

Provides helpers for proper Unicode handling including character boundary detection and validation.
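To make "character boundary detection and validation" concrete, here is a minimal sketch built only on the standard library (`str::is_char_boundary` and `std::str::from_utf8`). The signatures, the clamping of out-of-range positions, and the return types are assumptions for illustration; this page does not show the module's real signatures, and its implementation may differ.

```rust
/// Hypothetical sketch: the largest char boundary at or before `pos`.
/// Positions past the end of the string are clamped to `s.len()` (an assumption).
fn find_char_boundary(s: &str, pos: usize) -> usize {
    let mut i = pos.min(s.len());
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Hypothetical sketch: the smallest char boundary at or after `pos`.
fn find_char_boundary_forward(s: &str, pos: usize) -> usize {
    let mut i = pos.min(s.len());
    // `s.len()` is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i += 1;
    }
    i
}

/// Hypothetical sketch: validation delegated to the standard library.
/// Returning the borrowed `&str` on success is an assumed design choice.
fn validate_utf8(bytes: &[u8]) -> Result<&str, std::str::Utf8Error> {
    std::str::from_utf8(bytes)
}
```

In `"héllo"` the `é` occupies bytes 1 and 2, so byte index 2 falls inside a character: the backward search returns 1 and the forward search returns 3. Slicing at either result is safe, whereas `&s[..2]` would panic.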

Functions

current_timestamp
Returns the current Unix timestamp in seconds.
find_char_boundary
Finds a valid UTF-8 character boundary at or before the given position.
find_char_boundary_forward
Finds a valid UTF-8 character boundary at or after the given position.
grapheme_byte_position
Finds the byte position of the nth grapheme cluster.
grapheme_count
Counts the number of grapheme clusters in a string.
lines_with_offsets
Iterates over lines with their byte offsets.
split_sentences
Splits text into sentences (approximate).
truncate_graphemes
Truncates a string at a grapheme cluster boundary.
validate_utf8
Validates that a byte slice is valid UTF-8.
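
As an illustration of what iterating over lines with byte offsets can look like, the following standard-library-only sketch pairs each line with the byte index where it starts. The signature, the `(offset, line)` item shape, and the stripping of `\n` and `\r\n` terminators are all assumptions; the actual `lines_with_offsets` may behave differently.

```rust
/// Hypothetical sketch: yields `(byte_offset, line)` pairs.
/// Line terminators (`\n` or `\r\n`) are excluded from the yielded line
/// but still counted when advancing the offset.
fn lines_with_offsets<'a>(text: &'a str) -> impl Iterator<Item = (usize, &'a str)> + 'a {
    let mut offset = 0;
    text.split_inclusive('\n').map(move |raw| {
        let start = offset;
        // Advance by the full piece, terminator included.
        offset += raw.len();
        let line = raw.strip_suffix('\n').unwrap_or(raw);
        let line = line.strip_suffix('\r').unwrap_or(line);
        (start, line)
    })
}
```

Because every offset marks the start of a line, it is always a valid char boundary and can be used directly to slice the original text.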