Struct CharsCows

pub struct CharsCows<'a, Offset: OffsetType = u32, Length: LengthType = u16> { /* private fields */ }

A memory-efficient collection of string slices with configurable offset and length types.

CharsCows stores references to string slices in a shared data buffer using compact (offset, length) pairs. This is ideal for large datasets where you want to reference substrings without duplicating the underlying data.
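
The (offset, length) idea can be illustrated with a minimal sketch that uses only the standard library. The helpers `index_words` and `resolve` are hypothetical names made up for this illustration and are not part of the crate; the sketch also assumes words separated by single spaces and inputs small enough that the casts cannot truncate.

```rust
/// Records each space-separated word as an (offset, length) pair.
/// Hypothetical helper for illustration; not the crate's implementation.
fn index_words(data: &str) -> Vec<(u32, u16)> {
    let mut entries = Vec::new();
    let mut offset = 0usize;
    for word in data.split(' ') {
        if !word.is_empty() {
            // Casts are fine for this tiny demo; real code must check for overflow.
            entries.push((offset as u32, word.len() as u16));
        }
        offset += word.len() + 1; // skip the word and its one-byte separator
    }
    entries
}

/// Turns a pair back into a string slice of the shared buffer.
fn resolve(data: &str, entry: (u32, u16)) -> &str {
    let start = entry.0 as usize;
    &data[start..start + entry.1 as usize]
}

fn main() {
    let data = "hello world foo bar";
    let entries = index_words(data);
    // No word is copied: each entry only points into `data`.
    assert_eq!(entries, vec![(0, 5), (6, 5), (12, 3), (16, 3)]);
    assert_eq!(resolve(data, entries[1]), "world");
}
```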

§Type Parameters

  • Offset - The offset type (u8, u16, u32, u64) determining maximum data size
  • Length - The length type (u8, u16, u32, u64) determining maximum slice size

§Memory Efficiency

For 500 million words (8 bytes on average) from a 4 GB file:

  • Vec<String>: ~66 GB (24 bytes per String + heap overhead)
  • CharsCows<u32, u16>: ~7 GB (4 + 2 bytes per entry, about 3 GB in total, + the shared 4 GB data)
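
The ~7 GB figure can be checked with a few lines of arithmetic. This is a rough sketch: `pair_index_bytes` is a hypothetical helper, gigabytes are taken as 10^9 bytes, and the heap overhead behind the Vec<String> figure is not modeled.

```rust
/// Bytes needed for `entries` (offset, length) pairs plus one shared buffer.
/// Hypothetical helper for illustration only.
fn pair_index_bytes(entries: u64, offset_bytes: u64, length_bytes: u64, data_bytes: u64) -> u64 {
    entries * (offset_bytes + length_bytes) + data_bytes
}

fn main() {
    let entries = 500_000_000; // 500M words
    let data = 4_000_000_000; // 4 GB file
    let total = pair_index_bytes(entries, 4, 2, data);
    println!("u32 + u16 pairs: {} GB", total / 1_000_000_000); // prints 7
    // The 24-byte String headers alone, before any heap allocation is counted.
    println!("String headers:  {} GB", entries * 24 / 1_000_000_000); // prints 12
}
```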

§Examples

use stringtape::{CharsCows, StringTapeError};
use std::borrow::Cow;

let data = "hello world foo bar";
let cows = CharsCows::<u32, u16>::from_iter_and_data(
    data.split_whitespace(),
    Cow::Borrowed(data.as_bytes())
)?;

assert_eq!(cows.len(), 4);
assert_eq!(cows.get(0), Some("hello"));
assert_eq!(cows.get(3), Some("bar"));

Implementations§

impl<'a, Offset: OffsetType, Length: LengthType> CharsCows<'a, Offset, Length>

pub fn from_iter_and_data<I>(iter: I, data: Cow<'a, [u8]>) -> Result<Self, StringTapeError>
where I: IntoIterator, I::Item: AsRef<str>,

Creates a CharsCows from an iterator of string slices and a shared data buffer.

The slices must be subslices of the data buffer. Offsets and lengths are inferred from the slice pointers.

§Arguments

  • iter - Iterator yielding string slices that are subslices of data
  • data - Cow-wrapped data buffer (borrowed or owned)

§Errors

  • OffsetOverflow if an offset or length exceeds the maximum of its type
  • IndexOutOfBounds if a slice does not lie within the data buffer

§Example

use stringtape::CharsCowsU32U8;
use std::borrow::Cow;

let data = "hello world";
let cows = CharsCowsU32U8::from_iter_and_data(
    data.split_whitespace(),
    Cow::Borrowed(data.as_bytes())
)?;
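
How an offset might be inferred from a slice pointer, and how the two error cases could arise, is sketched below with the standard library only. `locate` and `LocateError` are hypothetical names standing in for the crate's internals and its StringTapeError variants; the crate's real checks may differ.

```rust
#[derive(Debug, PartialEq)]
enum LocateError {
    OutOfBounds, // hypothetical stand-in for IndexOutOfBounds
    Overflow,    // hypothetical stand-in for OffsetOverflow
}

/// Computes (offset, length) of `slice` relative to `data` from raw addresses.
fn locate(data: &[u8], slice: &str) -> Result<(u32, u8), LocateError> {
    let base = data.as_ptr() as usize;
    let start = slice.as_ptr() as usize;
    let end = start + slice.len();
    // The slice must lie entirely inside the buffer.
    if start < base || end > base + data.len() {
        return Err(LocateError::OutOfBounds);
    }
    // Narrowing conversions fail when the value does not fit the chosen type.
    let offset = u32::try_from(start - base).map_err(|_| LocateError::Overflow)?;
    let length = u8::try_from(slice.len()).map_err(|_| LocateError::Overflow)?;
    Ok((offset, length))
}

fn main() {
    let data = "hello world";
    assert_eq!(locate(data.as_bytes(), &data[6..]), Ok((6, 5)));

    // A string from a different allocation is not a subslice of `data`.
    let other = String::from("elsewhere");
    assert_eq!(locate(data.as_bytes(), &other), Err(LocateError::OutOfBounds));

    // 300 bytes do not fit into a u8 length.
    let long = "a".repeat(300);
    assert_eq!(locate(long.as_bytes(), &long), Err(LocateError::Overflow));
}
```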

pub fn get(&self, index: usize) -> Option<&str>

Returns a reference to the string at the given index, or None if out of bounds.

pub fn len(&self) -> usize

Returns the number of slices in the collection.

pub fn is_empty(&self) -> bool

Returns true if the collection contains no slices.

pub fn iter(&self) -> CharsCowsIter<'_, Offset, Length>

Returns an iterator over the string slices.

pub fn data(&self) -> &[u8]

Returns a reference to the underlying data buffer.

pub fn sort(&mut self)
where Offset: OffsetType, Length: LengthType,

Sorts the slices in-place using the default string comparison.

This is a stable sort that preserves the order of equal elements.

§Examples
use stringtape::CharsCowsU32U8;
use std::borrow::Cow;

let data = "zebra apple banana";
let mut cows = CharsCowsU32U8::from_iter_and_data(
    data.split_whitespace(),
    Cow::Borrowed(data.as_bytes())
).unwrap();

cows.sort();
let sorted: Vec<&str> = cows.iter().collect();
assert_eq!(sorted, vec!["apple", "banana", "zebra"]);
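
One way to picture an in-place sort over this kind of collection is to reorder only the (offset, length) pairs while the shared buffer stays untouched. The sketch below assumes that design, which the documentation above does not confirm; `resolve` and `sort_entries` are hypothetical helpers built on the standard library.

```rust
/// Resolves a pair to the string slice it refers to (hypothetical helper).
fn resolve(data: &str, entry: (u32, u16)) -> &str {
    let start = entry.0 as usize;
    &data[start..start + entry.1 as usize]
}

/// Stable sort of the pairs by the strings they point at; `data` is never moved.
fn sort_entries(data: &str, entries: &mut [(u32, u16)]) {
    entries.sort_by(|&a, &b| resolve(data, a).cmp(resolve(data, b)));
}

fn main() {
    let data = "zebra apple banana";
    let mut entries = vec![(0u32, 5u16), (6, 5), (12, 6)];
    sort_entries(data, &mut entries);

    // Only the small pairs changed places.
    assert_eq!(entries, [(6, 5), (12, 6), (0, 5)]);
    let words: Vec<&str> = entries.iter().map(|&e| resolve(data, e)).collect();
    assert_eq!(words, ["apple", "banana", "zebra"]);
}
```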

pub fn sort_unstable(&mut self)
where Offset: OffsetType, Length: LengthType,

Sorts the slices in-place using an unstable sorting algorithm.

This is faster than stable sort but may not preserve the order of equal elements.
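
For strings, "equal elements" are slices with the same content at different offsets. The sketch below shows what stability means in that case, using the standard library's `sort_by` and `sort_unstable_by` on plain (offset, length) pairs. The helpers are hypothetical and say nothing about which algorithm the crate uses internally.

```rust
/// Resolves a pair to the string slice it refers to (hypothetical helper).
fn resolve(data: &str, entry: (u32, u16)) -> &str {
    let start = entry.0 as usize;
    &data[start..start + entry.1 as usize]
}

/// Stable: pairs that resolve to equal strings keep their original order.
fn sort_stable(data: &str, entries: &mut [(u32, u16)]) {
    entries.sort_by(|&a, &b| resolve(data, a).cmp(resolve(data, b)));
}

/// Unstable: pairs that resolve to equal strings may end up in either order.
fn sort_unstable(data: &str, entries: &mut [(u32, u16)]) {
    entries.sort_unstable_by(|&a, &b| resolve(data, a).cmp(resolve(data, b)));
}

fn main() {
    let data = "b a b";
    let mut stable = vec![(0u32, 1u16), (2, 1), (4, 1)];
    sort_stable(data, &mut stable);
    // The "b" at offset 0 stays ahead of the "b" at offset 4.
    assert_eq!(stable, [(2, 1), (0, 1), (4, 1)]);

    let mut unstable = vec![(0u32, 1u16), (2, 1), (4, 1)];
    sort_unstable(data, &mut unstable);
    // Only the resulting string order is guaranteed here.
    let words: Vec<&str> = unstable.iter().map(|&e| resolve(data, e)).collect();
    assert_eq!(words, ["a", "b", "b"]);
}
```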

pub fn sort_by<F>(&mut self, compare: F)
where F: FnMut(&str, &str) -> Ordering, Offset: OffsetType, Length: LengthType,

Sorts the slices in-place using a custom comparison function.

§Examples
use stringtape::CharsCowsU32U8;
use std::borrow::Cow;

let data = "aaa bb c";
let mut cows = CharsCowsU32U8::from_iter_and_data(
    data.split_whitespace(),
    Cow::Borrowed(data.as_bytes())
).unwrap();

// Sort by length, then alphabetically
cows.sort_by(|a, b| a.len().cmp(&b.len()).then(a.cmp(b)));
let sorted: Vec<&str> = cows.iter().collect();
assert_eq!(sorted, vec!["c", "bb", "aaa"]);

pub fn sort_by_key<K, F>(&mut self, f: F)
where F: FnMut(&str) -> K, K: Ord, Offset: OffsetType, Length: LengthType,

Sorts the slices in-place using a key extraction function.

§Examples
use stringtape::CharsCowsU32U8;
use std::borrow::Cow;

let data = "aaa bb c";
let mut cows = CharsCowsU32U8::from_iter_and_data(
    data.split_whitespace(),
    Cow::Borrowed(data.as_bytes())
).unwrap();

// Sort by string length
cows.sort_by_key(|s| s.len());
let sorted: Vec<&str> = cows.iter().collect();
assert_eq!(sorted, vec!["c", "bb", "aaa"]);

impl<'a, Offset: OffsetType, Length: LengthType> CharsCows<'a, Offset, Length>

pub fn into_bytes_slices(self) -> BytesCows<'a, Offset, Length>

Trait Implementations§

impl<'a, Offset: Clone + OffsetType, Length: Clone + LengthType> Clone for CharsCows<'a, Offset, Length>

fn clone(&self) -> CharsCows<'a, Offset, Length>

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl<'a, Offset: Debug + OffsetType, Length: Debug + LengthType> Debug for CharsCows<'a, Offset, Length>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'a, Offset: OffsetType, Length: LengthType> From<CharsCows<'a, Offset, Length>> for BytesCows<'a, Offset, Length>

fn from(chars_slices: CharsCows<'a, Offset, Length>) -> Self

Converts to this type from the input type.

impl<'a, Offset: OffsetType, Length: LengthType> Index<usize> for CharsCows<'a, Offset, Length>

type Output = str

The returned type after indexing.

fn index(&self, index: usize) -> &Self::Output

Performs the indexing (container[index]) operation.

impl<'a, Offset: OffsetType, Length: LengthType> IntoIterator for &'a CharsCows<'a, Offset, Length>

type Item = &'a str

The type of the elements being iterated over.

type IntoIter = CharsCowsIter<'a, Offset, Length>

Which kind of iterator are we turning this into?

fn into_iter(self) -> Self::IntoIter

Creates an iterator from a value.

impl<'a, Offset: OffsetType, Length: LengthType> TryFrom<BytesCows<'a, Offset, Length>> for CharsCows<'a, Offset, Length>

type Error = StringTapeError

The type returned in the event of a conversion error.

fn try_from(bytes_slices: BytesCows<'a, Offset, Length>) -> Result<Self, Self::Error>

Performs the conversion.
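
The page does not say when this conversion fails. One plausible reason, and it is only an assumption here, is that some byte slice is not valid UTF-8 and so cannot be exposed as &str. A minimal sketch of such a check, with the hypothetical helper `validate_entries` and the standard library's `std::str::from_utf8`:

```rust
use std::str::Utf8Error;

/// Checks that every (offset, length) range of `data` is valid UTF-8.
/// Hypothetical helper; the crate's actual validation may differ.
fn validate_entries(data: &[u8], entries: &[(u32, u16)]) -> Result<(), Utf8Error> {
    for &(offset, length) in entries {
        let start = offset as usize;
        std::str::from_utf8(&data[start..start + length as usize])?;
    }
    Ok(())
}

fn main() {
    let data = b"ok\xff";
    // The first two bytes are plain ASCII.
    assert!(validate_entries(data, &[(0, 2)]).is_ok());
    // 0xFF can never appear in valid UTF-8.
    assert!(validate_entries(data, &[(2, 1)]).is_err());
}
```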

Auto Trait Implementations§

impl<'a, Offset, Length> Freeze for CharsCows<'a, Offset, Length>

impl<'a, Offset, Length> RefUnwindSafe for CharsCows<'a, Offset, Length>
where Offset: RefUnwindSafe, Length: RefUnwindSafe,

impl<'a, Offset, Length> Send for CharsCows<'a, Offset, Length>
where Offset: Send, Length: Send,

impl<'a, Offset, Length> Sync for CharsCows<'a, Offset, Length>
where Offset: Sync, Length: Sync,

impl<'a, Offset, Length> Unpin for CharsCows<'a, Offset, Length>
where Offset: Unpin, Length: Unpin,

impl<'a, Offset, Length> UnwindSafe for CharsCows<'a, Offset, Length>
where Offset: UnwindSafe, Length: UnwindSafe,

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.