Crate string_offsets

Converts string offsets between UTF-8 bytes, UTF-16 code units, Unicode code points, and lines.

§Example

use string_offsets::StringOffsets;

let s = "☀️hello\n🗺️world\n";
let offsets: StringOffsets = StringOffsets::new(s);

// Find offsets where lines begin and end.
assert_eq!(offsets.line_to_utf8s(0), 0..12);  // note: 0-based line numbers

// Translate string offsets between UTF-8 and other encodings.
// This map emoji is 7 UTF-8 bytes...
assert_eq!(&s[12..19], "🗺️");
// ...but only 3 UTF-16 code units...
assert_eq!(offsets.utf8_to_utf16(12), 8);
assert_eq!(offsets.utf8_to_utf16(19), 11);
// ...and only 2 Unicode code points.
assert_eq!(offsets.utf8s_to_chars(12..19), 8..10);

See StringOffsets for details.
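The numbers in the example above can be cross-checked with the standard library alone. The sketch below is not how the crate works internally. It is a naive re-scan of the prefix on every call, and the helper names `naive_utf8_to_utf16` and `naive_utf8_to_char` are hypothetical, introduced only for illustration.

```rust
/// Naive conversion of a UTF-8 byte offset into a UTF-16 code unit offset.
/// Re-encodes the whole prefix, so each call is linear in `byte_offset`.
/// Panics if `byte_offset` is out of range or not on a char boundary.
fn naive_utf8_to_utf16(s: &str, byte_offset: usize) -> usize {
    s[..byte_offset].encode_utf16().count()
}

/// Naive conversion of a UTF-8 byte offset into a Unicode code point offset.
/// Counts the `char`s in the prefix, so each call is linear in `byte_offset`.
/// Panics if `byte_offset` is out of range or not on a char boundary.
fn naive_utf8_to_char(s: &str, byte_offset: usize) -> usize {
    s[..byte_offset].chars().count()
}
```

Applied to the example string, `naive_utf8_to_utf16(s, 12)` and `naive_utf8_to_utf16(s, 19)` give 8 and 11, and `naive_utf8_to_char` gives 8 and 10 for the same byte offsets, matching the assertions above. Each emoji in the string is a base character followed by the variation selector U+FE0F, which is why the map emoji counts as two code points rather than one.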

Structs§

AllConfig
Configuration type that enables all features.
OnlyLines
Configuration type that only enables line conversions.
Pos
A position in a string, specified by line and column number.
StringOffsets
Converts positions within a given string between UTF-8 byte offsets (the unit Rust uses to index strings), UTF-16 code units, Unicode code points, and line numbers.
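To make the line-oriented conversions concrete, here is a minimal standard-library sketch. It is an illustration under stated assumptions, not the crate's implementation: the `LineCol` struct and both function names are hypothetical, lines are assumed to end at `\n` with the terminator counted as part of the line, and columns are assumed to be measured in Unicode code points.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a line/column position (0-based line, 0-based column).
#[derive(Debug, PartialEq, Eq)]
struct LineCol {
    line: usize,
    col: usize,
}

/// UTF-8 byte range of the given 0-based line, including its trailing '\n'.
/// Returns `None` if the string has no such line.
fn naive_line_to_utf8s(s: &str, line: usize) -> Option<Range<usize>> {
    let mut start = 0;
    for (i, text) in s.split_inclusive('\n').enumerate() {
        let end = start + text.len();
        if i == line {
            return Some(start..end);
        }
        start = end;
    }
    None
}

/// Line and column (in code points) of a UTF-8 byte offset.
/// Panics if `byte_offset` is out of range or not on a char boundary.
fn naive_line_col(s: &str, byte_offset: usize) -> LineCol {
    let before = &s[..byte_offset];
    // The line number is the count of newlines that precede the offset.
    let line = before.matches('\n').count();
    // The current line starts just after the last preceding newline.
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let col = before[line_start..].chars().count();
    LineCol { line, col }
}
```

Both helpers scan the string on every call. A precomputed index avoids that repeated work, which is the kind of cost a dedicated offsets structure is meant to remove.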