pub struct Utf8Error { /* private fields */ }
Errors which can occur when attempting to interpret a sequence of u8
as a string.
For example, the from_utf8 family of functions and methods for both Strings
and &strs make use of this error.
§Examples
This error type’s methods can be used to create functionality
similar to String::from_utf8_lossy without allocating heap memory:
fn from_utf8_lossy<F>(mut input: &[u8], mut push: F) where F: FnMut(&str) {
    loop {
        match std::str::from_utf8(input) {
            Ok(valid) => {
                push(valid);
                break
            }
            Err(error) => {
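                // Everything before `valid_up_to()` is known to be valid UTF-8.
                let (valid, after_valid) = input.split_at(error.valid_up_to());
                // SAFETY: `from_utf8` already verified `valid` as UTF-8.
                unsafe {
                    push(std::str::from_utf8_unchecked(valid))
                }
                push("\u{FFFD}");
                if let Some(invalid_sequence_length) = error.error_len() {
                    // Skip the invalid sequence and keep decoding.
                    input = &after_valid[invalid_sequence_length..]
                } else {
                    // The input ended in the middle of a sequence.
                    break
                }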
            }
        }
    }
}
Implementations
impl Utf8Error
1.5.0 (const: 1.63.0)
pub const fn valid_up_to(&self) -> usize
Returns the index in the given string up to which valid UTF-8 was verified.
It is the maximum index such that from_utf8(&input[..index])
would return Ok(_).
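As a minimal sketch of that guarantee, the hypothetical helper below (`valid_prefix` is not part of the standard library) returns the longest valid UTF-8 prefix of a byte slice. It re-validates the prefix instead of using `unsafe`, which costs a second scan but keeps the example entirely safe:

```rust
use std::str;

/// Returns the longest prefix of `input` that is valid UTF-8.
/// `valid_prefix` is a hypothetical helper for illustration only.
fn valid_prefix(input: &[u8]) -> &str {
    match str::from_utf8(input) {
        // The whole input is valid.
        Ok(all) => all,
        Err(error) => {
            let index = error.valid_up_to();
            // `from_utf8(&input[..index])` is documented to return `Ok(_)`,
            // so this `expect` should never fire.
            str::from_utf8(&input[..index])
                .expect("prefix up to valid_up_to() is valid UTF-8")
        }
    }
}
```

For instance, `valid_prefix(b"ab\xFFcd")` yields `"ab"`, because the byte `0xFF` at index 2 can never appear in UTF-8.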
§Examples
Basic usage:
use std::str;
// some invalid bytes, in a vector
let sparkle_heart = vec![0, 159, 146, 150];
// std::str::from_utf8 returns a Utf8Error
let error = str::from_utf8(&sparkle_heart).unwrap_err();
// the second byte is invalid here
assert_eq!(1, error.valid_up_to());
1.20.0 (const: 1.63.0)
pub const fn error_len(&self) -> Option<usize>
Provides more information about the failure:
- None: the end of the input was reached unexpectedly. self.valid_up_to() is 1 to 3 bytes from the end of the input. If a byte stream (such as a file or a network socket) is being decoded incrementally, this could be a valid char whose UTF-8 byte sequence is spanning multiple chunks.
- Some(len): an unexpected byte was encountered. The length provided is that of the invalid byte sequence that starts at the index given by valid_up_to(). Decoding should resume after that sequence (after inserting a U+FFFD REPLACEMENT CHARACTER) in case of lossy decoding.
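
The two cases can be combined into an incremental lossy decoder. The following is only a sketch under stated assumptions: `decode_chunk`, `pending`, and `out` are hypothetical names, and the prefix is re-validated with `from_utf8` rather than using `unsafe`. On `Some(len)` it inserts U+FFFD and resumes after the invalid sequence; on `None` it keeps the incomplete tail so the next chunk can complete it:

```rust
use std::str;

/// Lossily decodes `chunk`, appending the text to `out`.
/// Bytes of an incomplete trailing sequence stay in `pending`
/// until the next call. All names here are hypothetical.
fn decode_chunk(pending: &mut Vec<u8>, chunk: &[u8], out: &mut String) {
    pending.extend_from_slice(chunk);
    let mut consumed = 0;
    loop {
        match str::from_utf8(&pending[consumed..]) {
            Ok(valid) => {
                out.push_str(valid);
                consumed = pending.len();
                break;
            }
            Err(error) => {
                let valid_end = consumed + error.valid_up_to();
                // Safe re-validation of the prefix that is known to be valid.
                out.push_str(str::from_utf8(&pending[consumed..valid_end]).unwrap());
                match error.error_len() {
                    // Invalid sequence: replace it and resume after it.
                    Some(len) => {
                        out.push('\u{FFFD}');
                        consumed = valid_end + len;
                    }
                    // Incomplete sequence at the end: wait for more input.
                    None => {
                        consumed = valid_end;
                        break;
                    }
                }
            }
        }
    }
    // Drop everything that was decoded or replaced.
    pending.drain(..consumed);
}
```

With this design, a caller that reaches the end of the stream while `pending` is still non-empty would presumably emit one final U+FFFD, since the truncated sequence can no longer be completed.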