pub struct U32Str { /* private fields */ }

32-bit wide string slice with undefined encoding.

U32Str is to U32String as OsStr is to OsString.

U32Str values are string slices that do not have a defined encoding. While it is sometimes assumed that they contain possibly invalid or ill-formed UTF-32 data, they may be used for any wide-encoded string. This is because U32Str is intended to be used with FFI functions, where proper encoding cannot be guaranteed. If you need string slices that are always valid UTF-32 strings, use Utf32Str instead.

Because U32Str does not have a defined encoding, no restrictions are placed on mutating or indexing the slice. This means that even if the string contained properly encoded UTF-32 or data in another encoding, mutating or indexing may result in malformed data. Convert to a Utf32Str if retaining proper UTF-32 encoding is desired.
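
To illustrate the point about unrestricted mutation, here is a minimal sketch that works on a plain u32 buffer rather than on U32Str itself. The helper names is_valid_utf32 and corrupt_at are hypothetical and not part of widestring; the sketch only shows how a single element write can turn well-formed UTF-32 into ill-formed data.

```rust
/// Hypothetical helper: true if every element is a Unicode scalar value.
fn is_valid_utf32(units: &[u32]) -> bool {
    units.iter().all(|&u| char::from_u32(u).is_some())
}

/// Hypothetical helper: overwrites one element with a UTF-16 surrogate,
/// which is never a valid UTF-32 value.
fn corrupt_at(units: &mut [u32], index: usize) {
    units[index] = 0xD800;
}
```

A buffer holding 0x48, 0x69, 0x1F496 passes is_valid_utf32 before the write and fails it afterwards, which is the situation a Utf32Str is meant to rule out.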

FFI considerations

U32Str is not aware of nul values and may or may not be nul-terminated. It is intended to be used with FFI functions that directly use string length, where the strings are known to have proper nul-termination already, or where strings are merely being passed through without modification.

U32CStr should be used instead if nul-aware strings are required.
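
Because the length is carried separately from the data, a nul value is just another element. As a rough sketch on a plain u32 slice (the helper name len_before_nul is hypothetical and not a widestring function), code that wants C-style semantics has to look for the terminator itself:

```rust
/// Hypothetical helper: number of elements before the first nul, or the
/// whole length if the buffer contains no nul at all.
fn len_before_nul(units: &[u32]) -> usize {
    units.iter().position(|&u| u == 0).unwrap_or(units.len())
}
```

For a buffer with an interior nul, the slice length and the value returned here differ, which is why a nul-aware type is the better fit when terminators matter.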

Examples

The easiest way to use U32Str outside of FFI is with the u32str! macro to convert string literals into UTF-32 string slices at compile time:

use widestring::u32str;
let hello = u32str!("Hello, world!");

You can also convert any u32 slice directly:

use widestring::{u32str, U32Str};

let sparkle_heart = [0x1f496];
let sparkle_heart = U32Str::from_slice(&sparkle_heart);

assert_eq!(u32str!("💖"), sparkle_heart);

// This UTF-16 surrogate is invalid UTF-32, but is perfectly valid in U32Str
let malformed_utf32 = [0x0, 0xd83d]; // Note that nul values are also valid and untouched
let s = U32Str::from_slice(&malformed_utf32);

assert_eq!(s.len(), 2);

When working with an FFI, it is useful to create a U32Str from a pointer and a length:

use widestring::{u32str, U32Str};

let sparkle_heart = [0x1f496];
let sparkle_heart = unsafe {
    U32Str::from_ptr(sparkle_heart.as_ptr(), sparkle_heart.len())
};
assert_eq!(u32str!("💖"), sparkle_heart);

Implementations

Coerces a value into a wide string slice.

Constructs a wide string slice from a pointer and a length.

The len argument is the number of elements, not the number of bytes. No copying or allocation is performed; the resulting value is a direct reference to the pointer bytes.

Safety

This function is unsafe as there is no guarantee that the given pointer is valid for len elements.

In addition, the data must meet the safety conditions of std::slice::from_raw_parts. In particular, the returned string reference must not be mutated for the duration of lifetime 'a, except inside an UnsafeCell.

Panics

This function panics if p is null.

Caveat

The lifetime for the returned string is inferred from its usage. To prevent accidental misuse, it’s suggested to tie the lifetime to whichever source lifetime is safe in the context, such as by providing a helper function taking the lifetime of a host value for the string, or by explicit annotation.
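
The helper-function approach suggested above can be sketched with the standard library's std::slice::from_raw_parts on a plain u32 buffer. The Host type and the view_in function are hypothetical names introduced only for this illustration; the same pattern would apply to a pointer-based U32Str constructor.

```rust
use std::slice;

/// Hypothetical owner of a buffer that foreign code filled in.
struct Host {
    data: Vec<u32>,
}

/// Borrows `len` elements starting at `ptr`, with the result tied to `host`.
///
/// # Safety
/// `ptr` must point into `host.data` and be valid for `len` elements.
unsafe fn view_in<'a>(_host: &'a Host, ptr: *const u32, len: usize) -> &'a [u32] {
    // The explicit `'a` stops the compiler from inferring a longer lifetime
    // than the host value actually provides.
    unsafe { slice::from_raw_parts(ptr, len) }
}
```

With this signature the borrow checker rejects any use of the returned slice after the Host has been dropped or mutably borrowed, which an unconstrained inferred lifetime would not catch.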

Constructs a mutable wide string slice from a mutable pointer and a length.

The len argument is the number of elements, not the number of bytes. No copying or allocation is performed; the resulting value is a direct reference to the pointer bytes.

Safety

This function is unsafe as there is no guarantee that the given pointer is valid for len elements.

In addition, the data must meet the safety conditions of std::slice::from_raw_parts_mut.

Panics

This function panics if p is null.

Caveat

The lifetime for the returned string is inferred from its usage. To prevent accidental misuse, it’s suggested to tie the lifetime to whichever source lifetime is safe in the context, such as by providing a helper function taking the lifetime of a host value for the string, or by explicit annotation.

Constructs a wide string slice from a slice of character data.

No checks are performed on the slice. It may be of any encoding and may contain invalid or malformed data for that encoding.

Constructs a mutable wide string slice from a mutable slice of character data.

No checks are performed on the slice. It may be of any encoding and may contain invalid or malformed data for that encoding.

Copies the string reference to a new owned wide string.

Converts to a slice of the underlying elements of the string.

Converts to a mutable slice of the underlying elements of the string.

Returns a raw pointer to the string.

The caller must ensure that the string outlives the pointer this function returns, or else it will end up pointing to garbage.

The caller must also ensure that the memory the pointer (non-transitively) points to is never written to (except inside an UnsafeCell) using this pointer or any pointer derived from it. If you need to mutate the contents of the string, use as_mut_ptr.

Modifying the container referenced by this string may cause its buffer to be reallocated, which would also make any pointers to it invalid.

Returns an unsafe mutable raw pointer to the string.

The caller must ensure that the string outlives the pointer this function returns, or else it will end up pointing to garbage.

Modifying the container referenced by this string may cause its buffer to be reallocated, which would also make any pointers to it invalid.

Returns the two raw pointers spanning the string slice.

The returned range is half-open, which means that the end pointer points one past the last element of the slice. This way, an empty slice is represented by two equal pointers, and the difference between the two pointers represents the size of the slice.

See as_ptr for warnings on using these pointers. The end pointer requires extra caution, as it does not point to a valid element in the slice.

This function is useful for interacting with foreign interfaces which use two pointers to refer to a range of elements in memory, as is common in C++.
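
As an illustration of the half-open range, the sketch below recovers the element count from a start and end pointer. It uses the standard library's slice method as_ptr_range on a plain u32 array in the usage that follows; range_len is a hypothetical helper name, not a widestring function.

```rust
/// Number of elements spanned by a half-open pointer range.
///
/// # Safety
/// Both pointers must come from the same slice, with `start <= end`.
unsafe fn range_len(start: *const u32, end: *const u32) -> usize {
    // `offset_from` counts elements, not bytes.
    unsafe { end.offset_from(start) as usize }
}
```

Calling data.as_ptr_range() on a three-element array and passing range.start and range.end to range_len gives 3, and an empty array gives two equal pointers and a length of 0. The end pointer is only ever compared or subtracted here, never dereferenced.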

Returns the two unsafe mutable pointers spanning the string slice.

The returned range is half-open, which means that the end pointer points one past the last element of the slice. This way, an empty slice is represented by two equal pointers, and the difference between the two pointers represents the size of the slice.

See as_mut_ptr for warnings on using these pointers. The end pointer requires extra caution, as it does not point to a valid element in the slice.

This function is useful for interacting with foreign interfaces which use two pointers to refer to a range of elements in memory, as is common in C++.

Returns the length of the string as number of elements (not number of bytes).

Returns whether this string contains no data.

Converts a boxed wide string slice into an owned wide string without copying or allocating.

Returns an object that implements Display for printing strings that may contain non-Unicode data.

This method assumes the string is intended to be UTF-32 encoded, but handles ill-formed UTF-32 sequences lossily. The returned struct implements the Display trait so that the string is decoded lossily as UTF-32 while formatting; unlike to_string_lossy, it performs no heap allocations.

By default, invalid Unicode data is replaced with U+FFFD REPLACEMENT CHARACTER (�). If you wish to simply skip any invalid Unicode data and forgo the replacement, you may use the alternate formatting with {:#}.

Examples

Basic usage:

use widestring::U32Str;

// 𝄞mus<invalid>ic<invalid>
let s = U32Str::from_slice(&[
    0x1d11e, 0x006d, 0x0075, 0x0073, 0xDD1E, 0x0069, 0x0063, 0xD834,
]);

assert_eq!(format!("{}", s.display()),
"𝄞mus�ic�"
);

Using alternate formatting style to skip invalid values entirely:

use widestring::U32Str;

// 𝄞mus<invalid>ic<invalid>
let s = U32Str::from_slice(&[
    0x1d11e, 0x006d, 0x0075, 0x0073, 0xDD1E, 0x0069, 0x0063, 0xD834,
]);

assert_eq!(format!("{:#}", s.display()),
"𝄞music"
);
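
One way such an allocation-free wrapper could work is sketched below over a plain u32 slice. LossyUtf32 is a hypothetical stand-in for the struct returned by display(); how widestring implements it internally is not shown on this page, so treat this as an assumption about the general technique rather than the crate's code.

```rust
use std::fmt::{self, Write};

/// Hypothetical stand-in for the wrapper returned by `display()`.
struct LossyUtf32<'a>(&'a [u32]);

impl fmt::Display for LossyUtf32<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for &unit in self.0 {
            match char::from_u32(unit) {
                // Valid scalar values are written straight to the formatter.
                Some(c) => f.write_char(c)?,
                // `{:#}` sets the alternate flag: skip invalid values.
                None if f.alternate() => {}
                // Default formatting: substitute U+FFFD.
                None => f.write_char(char::REPLACEMENT_CHARACTER)?,
            }
        }
        Ok(())
    }
}
```

Each element is decoded and written one at a time, so no intermediate String is built; the caller decides whether the output ends up in a buffer, a file, or the terminal.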

Returns a subslice of the string.

This is the non-panicking alternative to indexing the string. Returns None whenever the equivalent indexing operation would panic.
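
The same contract exists on standard slices, so it can be sketched on a plain u32 slice with the standard get method. The function name window is hypothetical.

```rust
use std::ops::Range;

/// Returns the requested window, or `None` where `&units[range]` would panic.
fn window(units: &[u32], range: Range<usize>) -> Option<&[u32]> {
    units.get(range)
}
```

Both an out-of-bounds end and a start greater than the end produce None instead of a panic.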

Returns a mutable subslice of the string.

This is the non-panicking alternative to indexing the string. Returns None whenever the equivalent indexing operation would panic.

Returns an unchecked subslice of the string.

This is the unchecked alternative to indexing the string.

Safety

Callers of this function are responsible for ensuring that these preconditions are satisfied:

  • The starting index must not exceed the ending index;
  • Indexes must be within bounds of the original slice.

Failing that, the returned string slice may reference invalid memory.
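
The two preconditions can be made concrete with a small wrapper that checks them before taking the unchecked path. This sketch uses the standard slice method get_unchecked on a plain u32 slice; the function name subslice is hypothetical.

```rust
/// Validates both preconditions, then takes the unchecked subslice.
fn subslice(units: &[u32], start: usize, end: usize) -> Option<&[u32]> {
    if start <= end && end <= units.len() {
        // SAFETY: the check above establishes start <= end <= len.
        Some(unsafe { units.get_unchecked(start..end) })
    } else {
        None
    }
}
```

In real code the unchecked variant is only worth using when the bounds are already known from elsewhere; repeating the check as done here simply reproduces the checked accessor.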

Returns a mutable, unchecked subslice of the string.

This is the unchecked alternative to indexing the string.

Safety

Callers of this function are responsible for ensuring that these preconditions are satisfied:

  • The starting index must not exceed the ending index;
  • Indexes must be within bounds of the original slice.

Failing that, the returned string slice may reference invalid memory.

Divide one string slice into two at an index.

The argument, mid, should be an offset from the start of the string.

The two slices returned go from the start of the string slice to mid, and from mid to the end of the string slice.

To get mutable string slices instead, see the split_at_mut method.

Divide one mutable string slice into two at an index.

The argument, mid, should be an offset from the start of the string.

The two slices returned go from the start of the string slice to mid, and from mid to the end of the string slice.

To get immutable string slices instead, see the split_at method.
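
The value of the mutable split is that both halves can be written to at the same time without overlapping borrows. A minimal sketch on a plain u32 slice, using the standard split_at_mut method (fill_halves is a hypothetical name):

```rust
/// Splits at `mid` and fills the two halves with different values in place.
///
/// Panics if `mid > units.len()`, like the standard slice method.
fn fill_halves(units: &mut [u32], mid: usize, left: u32, right: u32) {
    let (head, tail) = units.split_at_mut(mid);
    // `head` covers 0..mid and `tail` covers mid..len, so they never alias.
    head.fill(left);
    tail.fill(right);
}
```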

Creates a new owned string by repeating this string n times.

Panics

This function will panic if the capacity would overflow.
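
To show where the overflow comes from, the following sketch checks the total size up front before repeating a plain u32 slice with the standard repeat method. The function repeat_checked is hypothetical; it is one possible way for a caller to avoid the panic, not something widestring provides.

```rust
/// Repeats `units` `n` times, returning `None` when the resulting buffer
/// size could not be represented. Allocation failure is a separate matter
/// and is not handled here.
fn repeat_checked(units: &[u32], n: usize) -> Option<Vec<u32>> {
    // Total element count must fit in usize.
    let total = units.len().checked_mul(n)?;
    // Total byte size must fit in isize, the limit for a single allocation.
    let bytes = total.checked_mul(std::mem::size_of::<u32>())?;
    if bytes > isize::MAX as usize {
        return None;
    }
    Some(units.repeat(n))
}
```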

Constructs a U32Str from a char pointer and a length.

The len argument is the number of char elements, not the number of bytes. No copying or allocation is performed; the resulting value is a direct reference to the pointer bytes.

Safety

This function is unsafe as there is no guarantee that the given pointer is valid for len elements.

In addition, the data must meet the safety conditions of std::slice::from_raw_parts. In particular, the returned string reference must not be mutated for the duration of lifetime 'a, except inside an UnsafeCell.

Panics

This function panics if p is null.

Caveat

The lifetime for the returned string is inferred from its usage. To prevent accidental misuse, it’s suggested to tie the lifetime to whichever source lifetime is safe in the context, such as by providing a helper function taking the lifetime of a host value for the string, or by explicit annotation.

Constructs a mutable U32Str from a mutable char pointer and a length.

The len argument is the number of char elements, not the number of bytes. No copying or allocation is performed; the resulting value is a direct reference to the pointer bytes.

Safety

This function is unsafe as there is no guarantee that the given pointer is valid for len elements.

In addition, the data must meet the safety conditions of std::slice::from_raw_parts_mut.

Panics

This function panics if p is null.

Caveat

The lifetime for the returned string is inferred from its usage. To prevent accidental misuse, it’s suggested to tie the lifetime to whichever source lifetime is safe in the context, such as by providing a helper function taking the lifetime of a host value for the string, or by explicit annotation.

Constructs a U32Str from a char slice.

No checks are performed on the slice.
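
No checks are needed in this direction because every char is already a valid 32-bit value. The sketch below shows one way a char slice can be viewed as u32 data without copying; whether widestring does it exactly this way is an assumption, and chars_as_u32 is a hypothetical name.

```rust
/// Views a `char` slice as `u32` values without copying.
fn chars_as_u32(chars: &[char]) -> &[u32] {
    // SAFETY: `char` has the same size and alignment as `u32`, and every
    // `char` is a valid `u32`. The lifetime is tied to the input slice.
    unsafe { std::slice::from_raw_parts(chars.as_ptr().cast::<u32>(), chars.len()) }
}
```

The reverse cast, from u32 to char, would not be sound without validation, since not every u32 is a Unicode scalar value.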

Constructs a mutable U32Str from a mutable char slice.

No checks are performed on the slice.

Decodes a string to an owned OsString.

This makes a string copy of the U32Str. Since U32Str makes no guarantees that its encoding is UTF-32 or that the data is valid UTF-32, there is no guarantee that the resulting OsString will have a valid underlying encoding either.

Note that the encoding of OsString is platform-dependent, so on some platforms this may perform an encoding conversion, while on other platforms no changes to the string will be made.

Examples

use widestring::U32String;
use std::ffi::OsString;
let s = "MyString";
// Create a wide string from the string
let wstr = U32String::from_str(s);
// Create an OsString from the wide string
let osstr = wstr.to_os_string();

assert_eq!(osstr, OsString::from(s));

Decodes the string to a String if it contains valid UTF-32 data.

This method assumes this string is encoded as UTF-32 and attempts to decode it as such.

Failures

Returns an error if the string contains any invalid UTF-32 data.

Examples

use widestring::U32String;
let s = "MyString";
// Create a wide string from the string
let wstr = U32String::from_str(s);
// Create a regular string from the wide string
let s2 = wstr.to_string().unwrap();

assert_eq!(s2, s);
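
The failure case can be sketched without the crate by decoding a plain u32 slice with the standard char::from_u32. The function decode_utf32 and its error type (the raw offending value) are hypothetical simplifications; the crate's own error type is not described in this section.

```rust
/// Decodes UTF-32, returning the first offending value on failure.
fn decode_utf32(units: &[u32]) -> Result<String, u32> {
    // Collecting into `Result` stops at the first `Err`.
    units.iter().map(|&u| char::from_u32(u).ok_or(u)).collect()
}
```

Surrogate values and values above 0x10FFFF are both rejected, since neither is a Unicode scalar value.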

Decodes the string reference to a String even if it is invalid UTF-32 data.

This method assumes this string is encoded as UTF-32 and attempts to decode it as such. Any invalid sequences are replaced with U+FFFD REPLACEMENT CHARACTER, which looks like this: �

Examples

use widestring::U32String;
let s = "MyString";
// Create a wide string from the string
let wstr = U32String::from_str(s);
// Create a regular string from the wide string
let lossy = wstr.to_string_lossy();

assert_eq!(lossy, s);
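
The example above only exercises valid input. As a minimal sketch of the replacement behavior on a plain u32 slice (decode_utf32_lossy is a hypothetical name, and this is an approximation of the technique rather than the crate's implementation):

```rust
/// Decodes UTF-32, substituting U+FFFD for every invalid value.
fn decode_utf32_lossy(units: &[u32]) -> String {
    units
        .iter()
        .map(|&u| char::from_u32(u).unwrap_or(char::REPLACEMENT_CHARACTER))
        .collect()
}
```

Each invalid element maps to exactly one replacement character, so the number of chars in the output equals the number of elements in the input.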

Returns an iterator over the chars of a string slice.

As this string has no defined encoding, this method assumes the string is UTF-32. Since it may consist of invalid UTF-32, the iterator returned by this method is an iterator over Result<char, DecodeUtf32Error> instead of chars directly. If you would like a lossy iterator over chars directly, use chars_lossy instead.

It’s important to remember that char represents a Unicode Scalar Value, and may not match your idea of what a ‘character’ is. Iteration over grapheme clusters may be what you actually want. That functionality is not provided by this crate.

Returns a lossy iterator over the chars of a string slice.

As this string has no defined encoding, this method assumes the string is UTF-32. Since it may consist of invalid UTF-32, the iterator returned by this method will replace invalid values with U+FFFD REPLACEMENT CHARACTER (�). This is a lossy version of chars.

It’s important to remember that char represents a Unicode Scalar Value, and may not match your idea of what a ‘character’ is. Iteration over grapheme clusters may be what you actually want. That functionality is not provided by this crate.

Returns an iterator over the chars of a string slice, and their positions.

As this string has no defined encoding, this method assumes the string is UTF-32. Since it may consist of invalid UTF-32, the iterator returned by this method is an iterator over Result<char, DecodeUtf32Error> as well as their positions, instead of chars directly. If you would like a lossy indices iterator over chars directly, use char_indices_lossy instead.

The iterator yields tuples. The position is first, the char is second.
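
A typical use of position-and-result tuples is locating the first bad element. The sketch below reproduces that shape on a plain u32 slice with standard iterators; first_invalid is a hypothetical helper, and the raw u32 stands in for the crate's error type.

```rust
/// Position and raw value of the first element that is not a valid `char`.
fn first_invalid(units: &[u32]) -> Option<(usize, u32)> {
    units
        .iter()
        .enumerate()
        // Position first, decode result second, matching the tuple order.
        .map(|(i, &u)| (i, char::from_u32(u).ok_or(u)))
        .find_map(|(i, r)| r.err().map(|bad| (i, bad)))
}
```

In this sketch every element decodes to at most one char, so the position is simply the element index.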

Returns a lossy iterator over the chars of a string slice, and their positions.

As this string slice may consist of invalid UTF-32, the iterator returned by this method will replace invalid values with U+FFFD REPLACEMENT CHARACTER (�) and yield the position of each character alongside it. This is a lossy version of char_indices.

The iterator yields tuples. The position is first, the char is second.

Trait Implementations

The resulting type after applying the + operator.

Performs the + operation.

Performs the += operation.

Converts this type into a mutable reference of the (usually inferred) input type.

Converts this type into a shared reference of the (usually inferred) input type.

Immutably borrows from an owned value.

Mutably borrows from an owned value.

Formats the value using the given formatter.

Returns the “default value” for a type.

Extends a collection with the contents of an iterator.

🔬 This is a nightly-only experimental API. (extend_one)

Extends a collection with exactly one element.

🔬 This is a nightly-only experimental API. (extend_one)

Reserves capacity in a collection for the given number of additional elements.

Converts to this type from the input type.

Creates a value from an iterator.

Feeds this value into the given Hasher.

Feeds a slice of this type into the given Hasher.

The returned type after indexing.

Performs the indexing (container[index]) operation.

Performs the mutable indexing (container[index]) operation.

This method returns an Ordering between self and other.

Compares and returns the maximum of two values.

Compares and returns the minimum of two values.

Restrict a value to a certain interval.

This method tests for self and other values to be equal, and is used by ==.

This method tests for !=.

This method returns an ordering between self and other values if one exists.

This method tests less than (for self and other) and is used by the < operator.

This method tests less than or equal to (for self and other) and is used by the <= operator.

This method tests greater than (for self and other) and is used by the > operator.

This method tests greater than or equal to (for self and other) and is used by the >= operator.

The resulting type after obtaining ownership.

Creates owned data from borrowed data, usually by cloning.

Uses borrowed data to replace owned data, usually by cloning.

The type returned in the event of a conversion error.

Performs the conversion.

Auto Trait Implementations

Blanket Implementations

Gets the TypeId of self.