Struct Wtf8Buf

pub struct Wtf8Buf { /* private fields */ }

An owned, growable string of well-formed WTF-8 data.

Similar to String, but can additionally contain surrogate code points if they’re not in a surrogate pair.
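To make the difference from String concrete, the sketch below encodes a single code point with the usual UTF-8 bit layout but without rejecting surrogates. It uses only the standard library; encode_code_point is a hypothetical helper written for illustration and is not part of this crate's API.

```rust
/// Encodes one code point (0..=0x10FFFF) using the UTF-8 bit layout,
/// without rejecting surrogates. Hypothetical helper, not the crate's API.
fn encode_code_point(cp: u32) -> Vec<u8> {
    assert!(cp <= 0x10FFFF, "not a Unicode code point");
    match cp {
        // One byte: plain ASCII.
        0x0000..=0x007F => vec![cp as u8],
        // Two bytes: 110xxxxx 10xxxxxx.
        0x0080..=0x07FF => vec![0xC0 | (cp >> 6) as u8, 0x80 | (cp & 0x3F) as u8],
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx.
        // Surrogates (U+D800..=U+DFFF) land here; strict UTF-8 forbids them.
        0x0800..=0xFFFF => vec![
            0xE0 | (cp >> 12) as u8,
            0x80 | ((cp >> 6) & 0x3F) as u8,
            0x80 | (cp & 0x3F) as u8,
        ],
        // Four bytes: supplementary code points.
        _ => vec![
            0xF0 | (cp >> 18) as u8,
            0x80 | ((cp >> 12) & 0x3F) as u8,
            0x80 | ((cp >> 6) & 0x3F) as u8,
            0x80 | (cp & 0x3F) as u8,
        ],
    }
}
```

For scalar values the result matches UTF-8 byte for byte. A lone surrogate such as U+D800 yields the bytes ED A0 80, which std::str::from_utf8 rejects; that is the extra content a WTF-8 string can carry.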

Implementations

impl Wtf8Buf

pub fn new() -> Wtf8Buf

Creates a new, empty WTF-8 string.

pub fn with_capacity(capacity: usize) -> Wtf8Buf

Creates a new, empty WTF-8 string with pre-allocated capacity for capacity bytes.

pub const unsafe fn from_bytes_unchecked(value: Vec<u8>) -> Wtf8Buf

Creates a WTF-8 string from a WTF-8 byte vec.

§Safety

value must contain valid WTF-8.

pub fn from_bytes(value: Vec<u8>) -> Result<Self, Vec<u8>>

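Creates a WTF-8 string from a byte vec after checking that it contains valid WTF-8. The Result<Self, Vec<u8>> return type suggests that an invalid input is handed back in the Err variant.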

pub fn from_string(string: String) -> Wtf8Buf

Creates a WTF-8 string from a UTF-8 String.

This takes ownership of the String and does not copy.

Since WTF-8 is a superset of UTF-8, this always succeeds.

pub fn join<I, S>(sep: impl AsRef<Wtf8>, iter: I) -> Wtf8Buf
where I: IntoIterator<Item = S>, S: AsRef<Wtf8>,

pub fn clear(&mut self)

pub fn from_wide(v: &[u16]) -> Wtf8Buf

Creates a WTF-8 string from a potentially ill-formed UTF-16 slice of 16-bit code units.

This is lossless: calling .encode_wide() on the resulting string will always return the original code units.
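The round trip can be sketched with the standard library alone. The example works on plain code point values (u32) rather than on this crate's types, and decode_wide and encode_wide are hypothetical free functions named for illustration only.

```rust
use std::char::decode_utf16;

/// Decodes possibly ill-formed UTF-16 into code points, keeping unpaired
/// surrogates instead of rejecting them.
fn decode_wide(units: &[u16]) -> Vec<u32> {
    decode_utf16(units.iter().copied())
        .map(|r| match r {
            // A scalar value, possibly assembled from a surrogate pair.
            Ok(c) => c as u32,
            // A surrogate with no partner: keep it as a code point.
            Err(e) => e.unpaired_surrogate() as u32,
        })
        .collect()
}

/// Re-encodes code points as UTF-16 code units.
fn encode_wide(code_points: &[u32]) -> Vec<u16> {
    let mut out = Vec::new();
    for &cp in code_points {
        if cp < 0x1_0000 {
            // Basic Multilingual Plane, including lone surrogates.
            out.push(cp as u16);
        } else {
            // Supplementary code point: split into a surrogate pair.
            let v = cp - 0x1_0000;
            out.push(0xD800 | (v >> 10) as u16);
            out.push(0xDC00 | (v & 0x3FF) as u16);
        }
    }
    out
}
```

Because decoding always merges a high surrogate that is directly followed by a low one, the decoded sequence never contains such a pair as two separate code points, so re-encoding reproduces the input units exactly.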

pub fn as_slice(&self) -> &Wtf8

pub fn as_mut_slice(&mut self) -> &mut Wtf8

pub fn reserve(&mut self, additional: usize)

Reserves capacity for at least additional more bytes to be inserted in the given Wtf8Buf. The collection may reserve more space to avoid frequent reallocations.

§Panics

Panics if the new capacity exceeds isize::MAX bytes.

pub fn try_reserve(&mut self, additional: usize) -> Result<(), TryReserveError>

Tries to reserve capacity for at least additional more bytes to be inserted in the given Wtf8Buf. The Wtf8Buf may reserve more space to avoid frequent reallocations. After calling try_reserve, capacity will be greater than or equal to self.len() + additional. Does nothing if capacity is already sufficient. This method preserves the contents even if an error occurs.

§Errors

If the capacity overflows, or the allocator reports a failure, then an error is returned.
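The same reserve-then-append pattern exists on std's String, which makes for a self-contained sketch. It is an analogue under the assumption that Wtf8Buf behaves like String here; append_checked is a hypothetical helper.

```rust
use std::collections::TryReserveError;

/// Appends `extra` to `buf`, reporting an allocation failure to the caller
/// instead of aborting the process.
fn append_checked(buf: &mut String, extra: &str) -> Result<(), TryReserveError> {
    // Fails with an error if the capacity would overflow or the allocator
    // cannot satisfy the request; `buf` is left unchanged in that case.
    buf.try_reserve(extra.len())?;
    // Enough capacity was reserved above, so this does not reallocate.
    buf.push_str(extra);
    Ok(())
}
```

Requesting an impossible amount, for example try_reserve(usize::MAX), returns an error and leaves the existing contents intact.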

pub fn reserve_exact(&mut self, additional: usize)

pub fn try_reserve_exact(&mut self, additional: usize) -> Result<(), TryReserveError>

Tries to reserve the minimum capacity for exactly additional more bytes to be inserted in the given Wtf8Buf. After calling try_reserve_exact, capacity will be greater than or equal to self.len() + additional if it returns Ok(()). Does nothing if the capacity is already sufficient.

Note that the allocator may give the Wtf8Buf more space than it requests. Therefore, capacity can not be relied upon to be precisely minimal. Prefer try_reserve if future insertions are expected.

§Errors

If the capacity overflows, or the allocator reports a failure, then an error is returned.

pub const fn capacity(&self) -> usize

Returns the number of bytes that this string buffer can hold without reallocating.

pub fn push_str(&mut self, other: &str)

Append a UTF-8 slice at the end of the string.

pub fn push_wtf8(&mut self, other: &Wtf8)

Append a WTF-8 slice at the end of the string.

pub fn push_char(&mut self, c: char)

Append a Unicode scalar value at the end of the string.

pub fn push(&mut self, code_point: CodePoint)

Append a code point at the end of the string.

pub fn pop(&mut self) -> Option<CodePoint>

pub fn truncate(&mut self, new_len: usize)

Shortens a string to the specified length.

§Panics

Panics if new_len > current length, or if new_len is not a code point boundary.
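What counts as a code point boundary can be sketched on raw bytes. The check below assumes the input is valid WTF-8 (or UTF-8); is_boundary is a hypothetical helper, not necessarily how this crate implements is_code_point_boundary.

```rust
/// Returns true if `index` falls between two code points of `bytes`,
/// assuming `bytes` holds valid WTF-8 (or UTF-8).
fn is_boundary(bytes: &[u8], index: usize) -> bool {
    match bytes.get(index) {
        // Continuation bytes have the form 0b10xx_xxxx; any other byte
        // starts a new code point.
        Some(&b) => (b & 0xC0) != 0x80,
        // The end of the data is a boundary; anything past it is not.
        None => index == bytes.len(),
    }
}
```

For the bytes 61 ED A0 80 62 (the letter a, a lone U+D800, the letter b), indices 0, 1, 4 and 5 are boundaries, while 2 and 3 fall inside the three-byte surrogate.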

pub fn insert(&mut self, idx: usize, c: CodePoint)

Inserts a code point into this Wtf8Buf at a byte position.

pub fn insert_wtf8(&mut self, idx: usize, w: &Wtf8)

Inserts a WTF-8 slice into this Wtf8Buf at a byte position.

pub fn into_bytes(self) -> Vec<u8>

Consumes the WTF-8 string and converts it to a vec of bytes.

pub fn into_string(self) -> Result<String, Wtf8Buf>

Consumes the WTF-8 string and tries to convert it to UTF-8.

This does not copy the data.

If the contents are not well-formed UTF-8 (that is, if the string contains surrogates), the original WTF-8 string is returned instead.
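The give-it-back-on-failure shape mirrors String::from_utf8 in the standard library. The sketch below uses raw bytes in place of a Wtf8Buf, and bytes_into_string is a hypothetical helper for illustration.

```rust
/// Tries to turn bytes into a `String` without copying, handing the
/// original bytes back if they are not well-formed UTF-8.
fn bytes_into_string(bytes: Vec<u8>) -> Result<String, Vec<u8>> {
    // `FromUtf8Error::into_bytes` returns the buffer that failed to convert.
    String::from_utf8(bytes).map_err(|e| e.into_bytes())
}
```

With the WTF-8 bytes 61 ED A0 80 (the letter a followed by a lone U+D800) the call returns Err containing the same four bytes, so the caller keeps ownership of the data and can fall back to a lossy conversion.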

pub fn into_string_lossy(self) -> String

Consumes the WTF-8 string and converts it lossily to UTF-8.

This does not copy the data (but may overwrite parts of it in place).

Surrogates are replaced with "\u{FFFD}" (the replacement character “�”).
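In-place replacement is possible because a surrogate occupies three bytes in WTF-8 (ED A0..BF 80..BF) and U+FFFD also occupies three bytes in UTF-8 (EF BF BD). The following sketch works on raw bytes and assumes valid WTF-8 input; replace_surrogates is a hypothetical helper, not this crate's implementation.

```rust
/// Overwrites every three-byte surrogate sequence with U+FFFD in place and
/// returns the result as a `String`. Assumes `bytes` is valid WTF-8.
fn replace_surrogates(mut bytes: Vec<u8>) -> String {
    let replacement = "\u{FFFD}".as_bytes(); // EF BF BD
    let mut i = 0;
    while i + 2 < bytes.len() {
        // In valid WTF-8, 0xED only appears as a lead byte, and a second
        // byte of 0xA0 or above marks the surrogate range U+D800..=U+DFFF.
        if bytes[i] == 0xED && bytes[i + 1] >= 0xA0 {
            bytes[i..i + 3].copy_from_slice(replacement);
            i += 3;
        } else {
            i += 1;
        }
    }
    String::from_utf8(bytes).expect("input was not valid WTF-8")
}
```

The buffer length never changes, so no reallocation is needed.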

pub fn into_box(self) -> Box<Wtf8>

Converts this Wtf8Buf into a boxed Wtf8.

pub fn from_box(boxed: Box<Wtf8>) -> Wtf8Buf

Converts a Box<Wtf8> into a Wtf8Buf.

Methods from Deref<Target = Wtf8>

pub fn len(&self) -> usize

Returns the length, in WTF-8 bytes.

pub fn is_empty(&self) -> bool

pub fn ascii_byte_at(&self, position: usize) -> u8

Returns the code point at position if it is in the ASCII range, or b'\xFF' otherwise.

§Panics

Panics if position is beyond the end of the string.

pub fn code_points(&self) -> Wtf8CodePoints<'_>

Returns an iterator for the string’s code points.

pub fn code_point_indices(&self) -> Wtf8CodePointIndices<'_>

Returns an iterator for the string’s code points and their indices.

pub fn as_bytes(&self) -> &[u8]

Accesses the raw bytes of the WTF-8 data.

pub fn as_str(&self) -> Result<&str, Utf8Error>

Tries to convert the string to UTF-8 and return a &str slice.

Returns an error if the string contains surrogates.

This does not copy the data.

pub fn to_wtf8_buf(&self) -> Wtf8Buf

Creates an owned Wtf8Buf from a borrowed Wtf8.

pub fn to_string_lossy(&self) -> Cow<'_, str>

Lossily converts the string to UTF-8. Returns a borrowed UTF-8 &str slice if the contents are well-formed UTF-8.

Surrogates are replaced with "\u{FFFD}" (the replacement character “�”).

This only copies the data if necessary (if it contains any surrogate).

pub fn encode_wide(&self) -> EncodeWide<'_>

Converts the WTF-8 string to potentially ill-formed UTF-16 and returns an iterator of 16-bit code units.

This is lossless: calling Wtf8Buf::from_wide on the resulting code units would always return the original WTF-8 string.

pub fn chunks(&self) -> Wtf8Chunks<'_>

pub fn map_utf8<'a, I>(&'a self, f: impl Fn(&'a str) -> I) -> impl Iterator<Item = CodePoint>
where I: Iterator<Item = char>,

pub fn is_code_point_boundary(&self, index: usize) -> bool

pub fn into_box(&self) -> Box<Wtf8>

Boxes this Wtf8.

pub fn make_ascii_lowercase(&mut self)

pub fn make_ascii_uppercase(&mut self)

pub fn to_ascii_lowercase(&self) -> Wtf8Buf

pub fn to_ascii_uppercase(&self) -> Wtf8Buf

pub fn to_lowercase(&self) -> Wtf8Buf

pub fn to_uppercase(&self) -> Wtf8Buf

pub fn is_ascii(&self) -> bool

pub fn is_utf8(&self) -> bool

pub fn eq_ignore_ascii_case(&self, other: &Self) -> bool

pub fn split(&self, pat: &Wtf8) -> impl Iterator<Item = &Self>

pub fn splitn(&self, n: usize, pat: &Wtf8) -> impl Iterator<Item = &Self>

pub fn rsplit(&self, pat: &Wtf8) -> impl Iterator<Item = &Self>

pub fn rsplitn(&self, n: usize, pat: &Wtf8) -> impl Iterator<Item = &Self>

pub fn trim(&self) -> &Self

pub fn trim_start(&self) -> &Self

pub fn trim_end(&self) -> &Self

pub fn trim_start_matches(&self, f: impl Fn(CodePoint) -> bool) -> &Self

pub fn trim_end_matches(&self, f: impl Fn(CodePoint) -> bool) -> &Self

pub fn trim_matches(&self, f: impl Fn(CodePoint) -> bool) -> &Self

pub fn find(&self, pat: &Wtf8) -> Option<usize>

pub fn rfind(&self, pat: &Wtf8) -> Option<usize>

pub fn find_iter(&self, pat: &Wtf8) -> impl Iterator<Item = usize>

pub fn rfind_iter(&self, pat: &Wtf8) -> impl Iterator<Item = usize>

pub fn contains(&self, pat: &Wtf8) -> bool

pub fn contains_code_point(&self, pat: CodePoint) -> bool

pub fn get(&self, range: impl RangeBounds<usize>) -> Option<&Self>

pub fn ends_with(&self, w: impl AsRef<Wtf8>) -> bool

pub fn starts_with(&self, w: impl AsRef<Wtf8>) -> bool

pub fn strip_prefix(&self, w: impl AsRef<Wtf8>) -> Option<&Self>

pub fn strip_suffix(&self, w: impl AsRef<Wtf8>) -> Option<&Self>

pub fn replace(&self, from: &Wtf8, to: &Wtf8) -> Wtf8Buf

pub fn replacen(&self, from: &Wtf8, to: &Wtf8, n: usize) -> Wtf8Buf

Trait Implementations

impl AsRef<Wtf8> for Wtf8Buf

fn as_ref(&self) -> &Wtf8

Converts this type into a shared reference of the (usually inferred) input type.

impl Borrow<Wtf8> for Wtf8Buf

fn borrow(&self) -> &Wtf8

Immutably borrows from an owned value.

impl Clone for Wtf8Buf

fn clone(&self) -> Wtf8Buf

Returns a duplicate of the value.

1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for Wtf8Buf

Formats the string in double quotes, with characters escaped according to char::escape_debug and unpaired surrogates represented as \u{xxxx}, where each x is a hexadecimal digit.

For example, the code units [U+0061, U+D800, U+000A] are formatted as "a\u{D800}\n".
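The formatting rule can be sketched over plain code point values with the standard library. debug_format is a hypothetical helper; the uppercase hexadecimal digits follow the example above and are an assumption about the exact output of the real implementation.

```rust
/// Formats code points in double quotes: scalar values are escaped with
/// `char::escape_debug`, surrogates are written as `\u{XXXX}`.
fn debug_format(code_points: &[u32]) -> String {
    let mut out = String::from("\"");
    for &cp in code_points {
        match char::from_u32(cp) {
            // A valid scalar value: reuse the escaping rules of `char`.
            Some(c) => out.extend(c.escape_debug()),
            // `from_u32` returns None for surrogates; print them as hex.
            None => out.push_str(&format!("\\u{{{:X}}}", cp)),
        }
    }
    out.push('"');
    out
}
```

Calling debug_format(&[0x61, 0xD800, 0x0A]) produces the text "a\u{D800}\n", including the surrounding quotes.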

fn fmt(&self, formatter: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for Wtf8Buf

fn default() -> Wtf8Buf

Returns the “default value” for a type.

impl Deref for Wtf8Buf

type Target = Wtf8

The resulting type after dereferencing.

fn deref(&self) -> &Wtf8

Dereferences the value.

impl DerefMut for Wtf8Buf

fn deref_mut(&mut self) -> &mut Wtf8

Mutably dereferences the value.

impl Display for Wtf8Buf

Formats the string with unpaired surrogates substituted with the replacement character, U+FFFD.

fn fmt(&self, formatter: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Extend<CodePoint> for Wtf8Buf

Append code points from an iterator to the string.

This replaces surrogate code point pairs with supplementary code points, like concatenating ill-formed UTF-16 strings effectively would.
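The pair-merging rule can be sketched on a plain Vec<u32> of code points. push_code_point and extend_code_points are hypothetical helpers that illustrate the arithmetic, not this crate's implementation.

```rust
/// Appends `cp`, merging it with a preceding high surrogate if `cp` is a
/// low surrogate, so that the pair becomes one supplementary code point.
fn push_code_point(buf: &mut Vec<u32>, cp: u32) {
    let is_low = (0xDC00..=0xDFFF).contains(&cp);
    match buf.last().copied() {
        Some(hi) if is_low && (0xD800..=0xDBFF).contains(&hi) => {
            buf.pop();
            // Standard UTF-16 surrogate pair arithmetic.
            buf.push(0x1_0000 + ((hi - 0xD800) << 10) + (cp - 0xDC00));
        }
        _ => buf.push(cp),
    }
}

/// Appends every code point from `iter`, merging pairs across calls too.
fn extend_code_points(buf: &mut Vec<u32>, iter: impl IntoIterator<Item = u32>) {
    for cp in iter {
        push_code_point(buf, cp);
    }
}
```

Extending first with [0x61, 0xD83D] and then with [0xDE00, 0xDC00] leaves [0x61, 0x1F600, 0xDC00]: the high surrogate left at the end by the first call merges with the low surrogate that starts the second, and the trailing low surrogate stays unpaired.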

fn extend<T: IntoIterator<Item = CodePoint>>(&mut self, iter: T)

Extends a collection with the contents of an iterator.

fn extend_one(&mut self, item: CodePoint)

🔬This is a nightly-only experimental API. (extend_one)

Extends a collection with exactly one element.

fn extend_reserve(&mut self, additional: usize)

🔬This is a nightly-only experimental API. (extend_one)

Reserves capacity in a collection for the given number of additional elements.

impl<W: AsRef<Wtf8>> Extend<W> for Wtf8Buf

fn extend<T: IntoIterator<Item = W>>(&mut self, iter: T)

Extends a collection with the contents of an iterator.

fn extend_one(&mut self, item: W)

🔬This is a nightly-only experimental API. (extend_one)

Extends a collection with exactly one element.

fn extend_reserve(&mut self, additional: usize)

🔬This is a nightly-only experimental API. (extend_one)

Reserves capacity in a collection for the given number of additional elements.

impl Extend<char> for Wtf8Buf

fn extend<T: IntoIterator<Item = char>>(&mut self, iter: T)

Extends a collection with the contents of an iterator.

fn extend_one(&mut self, item: char)

🔬This is a nightly-only experimental API. (extend_one)

Extends a collection with exactly one element.

fn extend_reserve(&mut self, additional: usize)

🔬This is a nightly-only experimental API. (extend_one)

Reserves capacity in a collection for the given number of additional elements.
