Converts string offsets between UTF-8 bytes, UTF-16 code units, Unicode code points, and lines.
Example
use string_offsets::StringOffsets;
let s = "☀️hello\n🗺️world\n";
let offsets: StringOffsets = StringOffsets::new(s);
// Find offsets where lines begin and end.
assert_eq!(offsets.line_to_utf8s(0), 0..12); // note: 0-based line numbers
// Translate string offsets between UTF-8 and other encodings.
// This map emoji is 7 UTF-8 bytes...
assert_eq!(&s[12..19], "🗺️");
// ...but only 3 UTF-16 code units...
assert_eq!(offsets.utf8_to_utf16(12), 8);
assert_eq!(offsets.utf8_to_utf16(19), 11);
// ...and only 2 Unicode code points.
assert_eq!(offsets.utf8s_to_chars(12..19), 8..10);
See StringOffsets for details.
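To see where the numbers in the example come from, the same offsets can be reproduced with the standard library alone. This is only a sketch: `naive_utf8_to_utf16` and `naive_utf8_to_chars` are hypothetical helpers, not part of `string_offsets`, and they rescan the prefix on every call, so each lookup costs time proportional to the offset.

```rust
/// Hypothetical helper (not part of string_offsets): converts a UTF-8 byte
/// offset into a UTF-16 code unit offset by re-encoding the prefix.
/// Panics if `byte_offset` is not on a char boundary.
fn naive_utf8_to_utf16(s: &str, byte_offset: usize) -> usize {
    s[..byte_offset].encode_utf16().count()
}

/// Hypothetical helper (not part of string_offsets): converts a UTF-8 byte
/// offset into a Unicode code point offset by counting chars in the prefix.
/// Panics if `byte_offset` is not on a char boundary.
fn naive_utf8_to_chars(s: &str, byte_offset: usize) -> usize {
    s[..byte_offset].chars().count()
}

/// The string from the example above, written with escapes so the
/// variation selectors (U+FE0F) after each emoji are visible.
const SAMPLE: &str = "\u{2600}\u{FE0F}hello\n\u{1F5FA}\u{FE0F}world\n";
```

With `SAMPLE`, byte offsets 12 and 19 map to UTF-16 offsets 8 and 11 and to code point offsets 8 and 10, matching the assertions in the example. Each emoji here is a base character followed by the variation selector U+FE0F, which is why the map emoji counts as two code points.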
Structs
- AllConfig
- Configuration type that enables all features.
- OnlyLines
- Configuration type that only enables line conversions.
- Pos
- A position in a string, specified by line and column number.
- StringOffsets
- Converts positions within a given string between UTF-8 byte offsets (the usual in Rust), UTF-16 code units, Unicode code points, and line numbers.
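The line-number side of these conversions can be illustrated with a standard-library sketch. `naive_line_ranges` is a hypothetical helper, not the crate's API. It assumes that lines are 0-based and that a line's byte range includes its trailing `\n`, which is what `line_to_utf8s(0) == 0..12` in the example suggests.

```rust
use std::ops::Range;

/// Hypothetical helper (not part of string_offsets): returns the UTF-8 byte
/// range of every line in `s`, with each trailing '\n' counted as part of
/// its line. Index `i` of the result is the range of 0-based line `i`.
fn naive_line_ranges(s: &str) -> Vec<Range<usize>> {
    let mut start = 0;
    s.split_inclusive('\n')
        .map(|line| {
            // Each line begins where the previous one ended.
            let range = start..start + line.len();
            start = range.end;
            range
        })
        .collect()
}
```

For the string from the example, this yields `0..12` for line 0 and `12..25` for line 1.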