pub struct CharsCows<'a, Offset: OffsetType = u32, Length: LengthType = u16> { /* private fields */ }
A memory-efficient collection of string slices with configurable offset and length types.
CharsCows stores references to string slices in a shared data buffer using compact
(offset, length) pairs. This is ideal for large datasets where you want to reference
substrings without duplicating the underlying data.
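To make the (offset, length) idea concrete, here is a minimal sketch of such a collection. `TinyCows` and `demo_words` are hypothetical names used only for illustration, with fixed `u32`/`u16` entry types; this is not the crate's actual layout or implementation.

```rust
/// Hypothetical model of a shared buffer plus compact (offset, length) entries.
/// Illustration only; not how `CharsCows` is implemented internally.
struct TinyCows<'a> {
    data: &'a [u8],
    entries: Vec<(u32, u16)>, // (offset into data, length in bytes)
}

impl<'a> TinyCows<'a> {
    /// Looks up entry `index` and returns the slice of `data` it describes.
    fn get(&self, index: usize) -> Option<&'a str> {
        let &(offset, length) = self.entries.get(index)?;
        let start = offset as usize;
        let end = start + length as usize;
        let data: &'a [u8] = self.data;
        // Returns None if the range is out of bounds or not valid UTF-8.
        std::str::from_utf8(data.get(start..end)?).ok()
    }
}

/// Builds the four-word example by hand and reads entries 0..5 back.
fn demo_words() -> Vec<Option<&'static str>> {
    let data = "hello world foo bar";
    let cows = TinyCows {
        data: data.as_bytes(),
        entries: vec![(0, 5), (6, 5), (12, 3), (16, 3)],
    };
    (0..5).map(|i| cows.get(i)).collect()
}
```

Each entry costs 6 bytes here regardless of how long the referenced word is, and no word is ever copied out of the buffer.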
§Type Parameters
- Offset: The offset type (u8, u16, u32, u64), which determines the maximum data size
- Length: The length type (u8, u16, u32, u64), which determines the maximum slice size
§Memory Efficiency
For 500M words (8 bytes avg) from a 4GB file:
- Vec<String>: ~66 GB (24 bytes per String + heap overhead)
- CharsCows<u32, u16>: ~7 GB (4+2 bytes per entry + shared 4GB data)
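The ~7 GB figure can be reproduced with simple arithmetic. The sketch below uses hypothetical helper names (`compact_bytes`, `compact_estimate_gb`, `string_header_bytes`) and decimal gigabytes. It only covers the parts stated above: the per-entry cost, the shared buffer, and the 24-byte `String` header. Allocator overhead for `Vec<String>` is not modelled.

```rust
use std::mem::size_of;

/// Entry overhead plus one shared copy of the data, in bytes.
fn compact_bytes(entries: u64, offset_size: u64, length_size: u64, data_bytes: u64) -> u64 {
    entries * (offset_size + length_size) + data_bytes
}

/// The 500M-word, 4 GB example with u32 offsets and u16 lengths, in decimal GB.
fn compact_estimate_gb() -> f64 {
    let entries = 500_000_000u64;
    let data_bytes = 4_000_000_000u64;
    let total = compact_bytes(
        entries,
        size_of::<u32>() as u64, // 4 bytes per offset
        size_of::<u16>() as u64, // 2 bytes per length
        data_bytes,
    );
    total as f64 / 1e9
}

/// Bytes spent on `String` headers alone, using the 24-byte figure quoted above
/// (pointer + capacity + length on a 64-bit target).
fn string_header_bytes(entries: u64) -> u64 {
    entries * 24
}
```

The headers alone account for 12 GB of the `Vec<String>` estimate; the rest comes from per-allocation heap costs, which depend on the allocator.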
§Examples
use stringtape::{CharsCows, StringTapeError};
use std::borrow::Cow;
let data = "hello world foo bar";
let cows = CharsCows::<u32, u16>::from_iter_and_data(
    data.split_whitespace(),
    Cow::Borrowed(data.as_bytes()),
)?;
assert_eq!(cows.len(), 4);
assert_eq!(cows.get(0), Some("hello"));
assert_eq!(cows.get(3), Some("bar"));
Implementations§
impl<'a, Offset: OffsetType, Length: LengthType> CharsCows<'a, Offset, Length>
pub fn from_iter_and_data<I>(
    iter: I,
    data: Cow<'a, [u8]>,
) -> Result<Self, StringTapeError>
Creates a CharsCows from an iterator of string slices and shared data buffer.
The slices must be subslices of the data buffer. Offsets and lengths are inferred from the slice pointers.
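One way such inference can work is plain pointer arithmetic: subtract the buffer's start address from the slice's start address, then check bounds and type limits. The sketch below is an assumption about the general technique, not the crate's code. `locate`, `locate_all`, and `LocateError` are hypothetical names, and the two error variants only mirror the roles of IndexOutOfBounds and OffsetOverflow.

```rust
use std::convert::TryFrom;

/// Hypothetical error type standing in for the crate's error variants.
#[derive(Debug, PartialEq)]
enum LocateError {
    OutOfBounds, // slice does not lie inside the data buffer
    Overflow,    // offset or length does not fit the chosen integer type
}

/// Derives a (u32 offset, u16 length) pair for `slice` relative to `data`.
fn locate(data: &[u8], slice: &str) -> Result<(u32, u16), LocateError> {
    let base = data.as_ptr() as usize;
    let start = slice.as_ptr() as usize;
    // A slice starting before the buffer cannot be a subslice of it.
    let offset = start.checked_sub(base).ok_or(LocateError::OutOfBounds)?;
    let end = offset
        .checked_add(slice.len())
        .ok_or(LocateError::OutOfBounds)?;
    if end > data.len() {
        return Err(LocateError::OutOfBounds);
    }
    let offset = u32::try_from(offset).map_err(|_| LocateError::Overflow)?;
    let length = u16::try_from(slice.len()).map_err(|_| LocateError::Overflow)?;
    Ok((offset, length))
}

/// Applies `locate` to every whitespace-separated word of `data`.
fn locate_all(data: &str) -> Result<Vec<(u32, u16)>, LocateError> {
    data.split_whitespace()
        .map(|word| locate(data.as_bytes(), word))
        .collect()
}
```

Because only addresses are compared, a string with the same contents that lives in a different allocation is rejected rather than matched by value.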
§Arguments
- iter: Iterator yielding string slices that are subslices of data
- data: Cow-wrapped data buffer (borrowed or owned)
§Errors
- OffsetOverflow if an offset or length exceeds the type's maximum
- IndexOutOfBounds if a slice is not within the data buffer
§Example
use stringtape::CharsCowsU32U8;
use std::borrow::Cow;
let data = "hello world";
let cows = CharsCowsU32U8::from_iter_and_data(
    data.split_whitespace(),
    Cow::Borrowed(data.as_bytes()),
)?;
pub fn get(&self, index: usize) -> Option<&str>
Returns a reference to the string at the given index, or None if the index is out of bounds.
pub fn iter(&self) -> CharsCowsIter<'_, Offset, Length>
Returns an iterator over the string slices.
pub fn sort(&mut self)
where
    Offset: OffsetType,
    Length: LengthType,
Sorts the slices in-place using the default string comparison.
This is a stable sort that preserves the order of equal elements.
§Examples
use stringtape::CharsCowsU32U8;
use std::borrow::Cow;
let data = "zebra apple banana";
let mut cows = CharsCowsU32U8::from_iter_and_data(
    data.split_whitespace(),
    Cow::Borrowed(data.as_bytes()),
).unwrap();
cows.sort();
let sorted: Vec<&str> = cows.iter().collect();
assert_eq!(sorted, vec!["apple", "banana", "zebra"]);
pub fn sort_unstable(&mut self)
where
    Offset: OffsetType,
    Length: LengthType,
Sorts the slices in-place using an unstable sorting algorithm.
This is faster than stable sort but may not preserve the order of equal elements.
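With the default string comparison, equal elements have identical contents, so the difference between stable and unstable sorting is rarely visible. It becomes visible when the ordering ignores part of the value, for example when sorting by length. The sketch below illustrates the distinction on a plain `Vec<&str>` from the standard library; `stable_by_len` and `unstable_by_len` are hypothetical helpers, and the assumption is only that CharsCows follows the same stable/unstable convention as std slices.

```rust
/// Stable sort by length: words of equal length keep their original order.
fn stable_by_len(mut words: Vec<&str>) -> Vec<&str> {
    words.sort_by_key(|w| w.len());
    words
}

/// Unstable sort by length: lengths end up ordered, but words of equal
/// length may appear in any relative order.
fn unstable_by_len(mut words: Vec<&str>) -> Vec<&str> {
    words.sort_unstable_by_key(|w| w.len());
    words
}
```

For the input `["bb", "aa", "c", "dd"]`, the stable version is guaranteed to yield `["c", "bb", "aa", "dd"]`, while the unstable version only guarantees that `"c"` comes first.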
pub fn sort_by<F>(&mut self, compare: F)
Sorts the slices in-place using a custom comparison function.
§Examples
use stringtape::CharsCowsU32U8;
use std::borrow::Cow;
let data = "aaa bb c";
let mut cows = CharsCowsU32U8::from_iter_and_data(
    data.split_whitespace(),
    Cow::Borrowed(data.as_bytes()),
).unwrap();
// Sort by length, then alphabetically
cows.sort_by(|a, b| a.len().cmp(&b.len()).then(a.cmp(b)));
let sorted: Vec<&str> = cows.iter().collect();
assert_eq!(sorted, vec!["c", "bb", "aaa"]);
pub fn sort_by_key<K, F>(&mut self, f: F)
Sorts the slices in-place using a key extraction function.
§Examples
use stringtape::CharsCowsU32U8;
use std::borrow::Cow;
let data = "aaa bb c";
let mut cows = CharsCowsU32U8::from_iter_and_data(
    data.split_whitespace(),
    Cow::Borrowed(data.as_bytes()),
).unwrap();
// Sort by string length
cows.sort_by_key(|s| s.len());
let sorted: Vec<&str> = cows.iter().collect();
assert_eq!(sorted, vec!["c", "bb", "aaa"]);