This crate provides replacement types for String and &str that allow for safe indexing by
character. They avoid panics and the usual pitfalls of working with multi-byte UTF-8
characters, namely the case where the byte length of a string and its character length
are not the same.
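The mismatch described above can be reproduced with the standard library alone. The following is a minimal sketch; the helper names `byte_and_char_len` and `try_byte_slice` are made up for illustration and are not part of this crate.

```rust
/// Returns (byte length, character count) of `s` using only the standard library.
/// Hypothetical helper for illustration.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    // `len` counts UTF-8 bytes; `chars().count()` walks the string and counts
    // Unicode scalar values.
    (s.len(), s.chars().count())
}

/// Slices by byte range without panicking. `str::get` returns `None` when
/// either end falls inside a multi-byte character or past the end of the
/// string, whereas `&s[start..end]` would panic in those cases.
/// Hypothetical helper for illustration.
fn try_byte_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}
```

For "世界", `byte_and_char_len` returns `(6, 2)`, and `try_byte_slice("世界", 0, 1)` returns `None` because byte 1 lies inside the first character.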
Specifically, IndexedString (replaces String) and IndexedSlice (replaces &str) allow for
O(1) slicing and indexing by character, and they will never panic when indexing or slicing.
This is accomplished by storing the character offsets of each character in the string,
along with the original String, and using this information to calculate the byte offsets
of each character on the fly. Thus IndexedString uses ~2x the memory of a normal String,
but IndexedSlice and other types implementing IndexedStr have only one usize of extra
overhead over that of a regular &str slice / fat pointer. In theory this could be reduced
to the same size as a fat pointer using unsafe Rust, but this way the code stays
completely safe and the difference is negligible.
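To illustrate the general idea of an offset table, here is a minimal sketch in safe Rust. It is not this crate's implementation: the type `CharIndexed` and its fields are hypothetical, and it assumes one stored byte offset per character plus a trailing end offset.

```rust
use std::ops::Range;

/// Hypothetical illustration type; not the crate's `IndexedString`.
struct CharIndexed {
    text: String,
    /// Byte offset at which each character starts, followed by one
    /// trailing entry equal to `text.len()`.
    offsets: Vec<usize>,
}

impl CharIndexed {
    fn new(s: &str) -> Self {
        let mut offsets: Vec<usize> = s.char_indices().map(|(i, _)| i).collect();
        offsets.push(s.len());
        CharIndexed { text: s.to_string(), offsets }
    }

    /// Number of characters (not bytes).
    fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Character at `index`, or `None` when out of bounds.
    fn char_at(&self, index: usize) -> Option<char> {
        if index >= self.len() {
            return None;
        }
        // One table lookup, then decoding a single character (at most 4 bytes).
        self.text[self.offsets[index]..].chars().next()
    }

    /// Slices by character range, clamping both ends so it never panics.
    fn slice(&self, range: Range<usize>) -> &str {
        let end = range.end.min(self.len());
        let start = range.start.min(end);
        // Both offsets are character boundaries by construction.
        &self.text[self.offsets[start]..self.offsets[end]]
    }
}
```

With `CharIndexed::new("Hello, 世界!")`, `char_at(7)` yields `Some('世')` and `slice(20..30)` yields an empty string rather than panicking. Each lookup is a constant-time table access, at the cost of one usize per character in this sketch.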
Examples
use safe_string::{IndexedString, IndexedStr, IndexedSlice};
let message = IndexedString::from("Hello, 世界! 👋🌍");
assert_eq!(message.as_str(), "Hello, 世界! 👋🌍");
assert_eq!(message, "Hello, 世界! 👋🌍"); // handy PartialEq impls
// Access characters by index
assert_eq!(message.char_at(7), Some('世'));
assert_eq!(message.char_at(100), None); // Out of bounds access returns None
// Slice the IndexedString
let slice = message.slice(7..9);
assert_eq!(slice.as_str(), "世界");
// Convert slice back to IndexedString
let sliced_message = slice.to_indexed_string();
assert_eq!(sliced_message.as_str(), "世界");
// Nested slicing
let slice = message.slice(0..10);
let nested_slice = slice.slice(3..6);
assert_eq!(nested_slice.as_str(), "lo,");
// Display byte length and character length
assert_eq!(IndexedString::from_str("世界").byte_len(), 6); // "世界" is 6 bytes in UTF-8
assert_eq!(IndexedString::from_str("世界").len(), 2); // "世界" has 2 characters
// Demonstrate clamped slicing (no panic)
let clamped_slice = message.slice(20..30);
assert_eq!(clamped_slice.as_str(), "");
// Using `as_str` to interface with standard Rust string handling
let slice = message.slice(0..5);
let standard_str_slice = slice.as_str();
assert_eq!(standard_str_slice, "Hello");
Structs
- IndexedLines - An iterator over the lines of an IndexedStr.
- IndexedSlice - A &str replacement that allows for safe indexing and slicing of multi-byte characters.
- IndexedString - A String replacement that allows for safe indexing and slicing of multi-byte characters.
Traits
- IndexedStr - A trait that facilitates safe interaction with strings that contain multi-byte characters.