Enum Utf32Str 

pub enum Utf32Str<'a> {
    Ascii(&'a [u8]),
    Unicode(&'a [char]),
}

A UTF-32 encoded (char array) string that is used as input to (fuzzy) matching.

Usually Rust's UTF-8 encoded strings are great. However, fuzzy matching operates on codepoints (ideally it would operate on graphemes, but that is too much hassle to deal with), and we want to iterate over these codepoints quickly, multiple times (up to 5) during matching.

Doing codepoint segmentation on the fly not only blows through the cache (lookup tables and I-cache) but also has nontrivial runtime compared to the matching itself. Furthermore, there are a lot of extra optimizations available for ASCII-only text, but checking for ASCII during each match has too much overhead.

Of course, this comes at an extra memory cost, as we usually still need the UTF-8 encoded variant for rendering. In the (dominant) case of ASCII-only text, no copy is required. Furthermore, fuzzy matching is usually applied on the fly while the user is typing, so the same item is potentially matched many times (making the upfront cost more worthwhile). That means it is basically always worth it to presegment the string.

For use cases that only match (a lot of) strings once, it is possible to keep a char buffer around that is filled with the presegmented chars.
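The buffer-reuse idea can be sketched without the crate. The snippet below is a minimal illustration, not the crate's implementation: `presegment` and `count_items_containing` are hypothetical helpers that decode each string into one caller-owned `Vec<char>`, so the allocation is made once and reused for every item.

```rust
/// Hypothetical helper: presegment `text` into `buf`, reusing its allocation.
/// Returns the filled slice of codepoints.
fn presegment<'a>(text: &str, buf: &'a mut Vec<char>) -> &'a [char] {
    buf.clear(); // drops the old contents but keeps the capacity
    buf.extend(text.chars()); // decode UTF-8 once, up front
    buf.as_slice()
}

/// Hypothetical helper: counts the items that contain `needle`,
/// using a single shared buffer for all of them.
fn count_items_containing(items: &[&str], needle: char) -> usize {
    let mut buf = Vec::new(); // allocated once, reused for every item
    let mut hits = 0;
    for item in items {
        let chars = presegment(item, &mut buf);
        if chars.contains(&needle) {
            hits += 1;
        }
    }
    hits
}
```

Because the buffer is cleared rather than reallocated, its capacity settles at the length of the longest item seen so far.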

Another advantage of this approach is that the matcher naturally produces char indices (instead of UTF-8 offsets) anyway. With a codepoint-based representation like this, the indices can be used directly.
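To illustrate the difference between char indices and UTF-8 offsets, here is a small sketch using only the standard library. Both function names are made up for this example. With a plain `&str`, a char index has to be translated to a byte offset by walking the string, whereas a presegmented `[char]` slice can be indexed directly.

```rust
/// Converts a codepoint index into a UTF-8 byte offset by walking the string.
/// This O(n) lookup is what a presegmented `[char]` slice avoids.
fn byte_offset_of_char(text: &str, char_idx: usize) -> Option<usize> {
    text.char_indices().nth(char_idx).map(|(offset, _)| offset)
}

/// Looks up the characters at the given codepoint indices directly.
/// Panics if an index is out of bounds.
fn chars_at(chars: &[char], indices: &[u32]) -> Vec<char> {
    indices.iter().map(|&i| chars[i as usize]).collect()
}
```

For the string `"añb"`, the char index 2 corresponds to byte offset 3 because `ñ` occupies two bytes in UTF-8.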

Variants

Ascii(&'a [u8])

A string represented as ASCII encoded bytes. Correctness invariant: must only contain valid ASCII (<= 127).

Unicode(&'a [char])

A string represented as an array of unicode codepoints (basically UTF-32).
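The two variants and the ASCII invariant can be sketched with a local stand-in type. `Utf32Sketch`, `ascii_variant`, and `sketch_len` are hypothetical names for this illustration and are not part of the crate. The point is that the `Ascii` variant should only ever wrap bytes that pass an ASCII check.

```rust
/// Local stand-in with the same shape as the documented enum (not the crate's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Utf32Sketch<'a> {
    Ascii(&'a [u8]),
    Unicode(&'a [char]),
}

/// Wraps `bytes` as `Ascii` only if the invariant (every byte <= 127) holds.
fn ascii_variant(bytes: &[u8]) -> Option<Utf32Sketch<'_>> {
    bytes.is_ascii().then(|| Utf32Sketch::Ascii(bytes))
}

/// Number of codepoints; one byte or one `char` is one codepoint.
fn sketch_len(s: Utf32Sketch<'_>) -> usize {
    match s {
        Utf32Sketch::Ascii(bytes) => bytes.len(),
        Utf32Sketch::Unicode(chars) => chars.len(),
    }
}
```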

Implementations

impl<'a> Utf32Str<'a>

pub fn new(str: &'a str, buf: &'a mut Vec<char>) -> Self

Convenience method to construct a Utf32Str from a normal UTF-8 str.
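A rough sketch of what such a constructor might do, based on the description above: borrow the bytes when the text is ASCII-only, otherwise decode into the supplied buffer. `Presegmented` and `sketch_new` are hypothetical names, and plain codepoint decoding via `str::chars` is an assumption here; the crate's actual segmentation may differ.

```rust
/// Local stand-in with the same shape as the documented enum (not the crate's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Presegmented<'a> {
    Ascii(&'a [u8]),
    Unicode(&'a [char]),
}

/// Hypothetical constructor mirroring the documented signature.
fn sketch_new<'a>(text: &'a str, buf: &'a mut Vec<char>) -> Presegmented<'a> {
    if text.is_ascii() {
        // ASCII-only: no copy needed, the UTF-8 bytes are the codepoints.
        Presegmented::Ascii(text.as_bytes())
    } else {
        // Otherwise decode once into the caller's buffer.
        buf.clear();
        buf.extend(text.chars());
        Presegmented::Unicode(buf.as_slice())
    }
}
```

Taking the buffer as a parameter lets the caller decide whether to allocate per string or to reuse one buffer across many strings.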

pub fn len(self) -> usize

Returns the number of characters in this string.

pub fn is_empty(self) -> bool

Returns whether this string is empty.

pub fn slice(self, range: impl RangeBounds<usize>) -> Utf32Str<'a>

Creates a slice of this string that contains the characters in the specified character range.

pub fn slice_u32(self, range: impl RangeBounds<u32>) -> Utf32Str<'a>

Same as slice, but accepts a u32 range for convenience, since those are the indices returned by the matcher.
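Slicing by character range over presegmented codepoints can be sketched as follows. This works on a plain `&[char]` rather than the crate's type, and `resolve_range`, `slice_chars`, and `slice_chars_u32` are hypothetical helpers. The `u32` variant here only accepts a half-open `Range<u32>` to keep the sketch short.

```rust
use std::ops::{Bound, Range, RangeBounds};

/// Resolves any range form (`a..b`, `a..=b`, `..b`, `a..`, `..`) to `(start, end)`.
fn resolve_range(range: impl RangeBounds<usize>, len: usize) -> (usize, usize) {
    let start = match range.start_bound() {
        Bound::Included(&s) => s,
        Bound::Excluded(&s) => s + 1,
        Bound::Unbounded => 0,
    };
    let end = match range.end_bound() {
        Bound::Included(&e) => e + 1,
        Bound::Excluded(&e) => e,
        Bound::Unbounded => len,
    };
    (start, end)
}

/// Slices presegmented codepoints by character range; panics if out of bounds.
fn slice_chars<'a>(chars: &'a [char], range: impl RangeBounds<usize>) -> &'a [char] {
    let (start, end) = resolve_range(range, chars.len());
    &chars[start..end]
}

/// Same, but with `u32` bounds: widen to `usize` and delegate.
fn slice_chars_u32<'a>(chars: &'a [char], range: Range<u32>) -> &'a [char] {
    slice_chars(chars, range.start as usize..range.end as usize)
}
```

Since every element is one codepoint, the range bounds are character positions and no UTF-8 boundary checks are needed.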

pub fn is_ascii(self) -> bool

Returns whether this string only contains ASCII text.

pub fn get(self, n: u32) -> char

Returns the nth character in this string.

pub fn chars(self) -> Chars<'a>

Returns an iterator over the characters in this string.
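How character access might work uniformly across both variants is sketched below with a local stand-in type. `CodepointStr`, `nth_char`, and `collect_string` are hypothetical names. The sketch relies on the `Ascii` invariant: a byte <= 127 converts to the matching `char` with a plain cast.

```rust
/// Local stand-in with the same shape as the documented enum (not the crate's type).
#[derive(Clone, Copy)]
enum CodepointStr<'a> {
    Ascii(&'a [u8]),
    Unicode(&'a [char]),
}

/// Returns the nth character; panics if `n` is out of bounds.
fn nth_char(s: CodepointStr<'_>, n: u32) -> char {
    match s {
        CodepointStr::Ascii(bytes) => bytes[n as usize] as char,
        CodepointStr::Unicode(chars) => chars[n as usize],
    }
}

/// Walks all characters and collects them back into a UTF-8 `String`.
fn collect_string(s: CodepointStr<'_>) -> String {
    match s {
        CodepointStr::Ascii(bytes) => bytes.iter().map(|&b| b as char).collect(),
        CodepointStr::Unicode(chars) => chars.iter().collect(),
    }
}
```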

Trait Implementations

impl<'a> Clone for Utf32Str<'a>

fn clone(&self) -> Utf32Str<'a>

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Stable since 1.0.0.

impl Debug for Utf32Str<'_>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Display for Utf32Str<'_>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'a> Hash for Utf32Str<'a>

fn hash<__H: Hasher>(&self, state: &mut __H)

Feeds this value into the given Hasher.

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher. Stable since 1.3.0.

impl<'a> Ord for Utf32Str<'a>

fn cmp(&self, other: &Utf32Str<'a>) -> Ordering

This method returns an Ordering between self and other.

fn max(self, other: Self) -> Self
where Self: Sized,

Compares and returns the maximum of two values. Stable since 1.21.0.

fn min(self, other: Self) -> Self
where Self: Sized,

Compares and returns the minimum of two values. Stable since 1.21.0.

fn clamp(self, min: Self, max: Self) -> Self
where Self: Sized,

Restrict a value to a certain interval. Stable since 1.50.0.

impl<'a> PartialEq for Utf32Str<'a>

fn eq(&self, other: &Utf32Str<'a>) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason. Stable since 1.0.0.

impl<'a> PartialOrd for Utf32Str<'a>

fn partial_cmp(&self, other: &Utf32Str<'a>) -> Option<Ordering>

This method returns an ordering between self and other values if one exists.

fn lt(&self, other: &Rhs) -> bool

Tests less than (for self and other) and is used by the < operator. Stable since 1.0.0.

fn le(&self, other: &Rhs) -> bool

Tests less than or equal to (for self and other) and is used by the <= operator. Stable since 1.0.0.

fn gt(&self, other: &Rhs) -> bool

Tests greater than (for self and other) and is used by the > operator. Stable since 1.0.0.

fn ge(&self, other: &Rhs) -> bool

Tests greater than or equal to (for self and other) and is used by the >= operator. Stable since 1.0.0.

impl<'a> Copy for Utf32Str<'a>

impl<'a> Eq for Utf32Str<'a>

impl<'a> StructuralPartialEq for Utf32Str<'a>

Auto Trait Implementations

impl<'a> Freeze for Utf32Str<'a>

impl<'a> RefUnwindSafe for Utf32Str<'a>

impl<'a> Send for Utf32Str<'a>

impl<'a> Sync for Utf32Str<'a>

impl<'a> Unpin for Utf32Str<'a>

impl<'a> UnwindSafe for Utf32Str<'a>

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T> ToString for T
where T: Display + ?Sized,

fn to_string(&self) -> String

Converts the given value to a String.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.