Crate utf8_iter

Provides iteration by char over a &[u8] that may contain invalid UTF-8. Errors are handled according to the WHATWG Encoding Standard, i.e. the same way as in String::from_utf8_lossy: each malformed sequence is replaced with one U+FFFD REPLACEMENT CHARACTER.

The trait Utf8CharsEx provides the convenience method chars() directly on byte slices, so callers do not have to write the more verbose Utf8Chars::new(slice).

use utf8_iter::Utf8CharsEx;
let data = b"\xFF\xC2\xE2\xE2\x98\xF0\xF0\x9F\xF0\x9F\x92\xE2\x98\x83";
let from_iter: String = data.chars().collect();
let from_std = String::from_utf8_lossy(data);
assert_eq!(from_iter, from_std);
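
To make the example input easier to read, the sketch below splits it at the points where lossy decoding is expected to emit one character each. It uses only the standard library (String::from_utf8_lossy), not this crate, and the names SEGMENTS and decode_segments are hypothetical helpers introduced for illustration.

```rust
// The example input, split into the pieces that each decode to one char.
const SEGMENTS: [&[u8]; 8] = [
    b"\xFF",         // byte that can never appear in UTF-8
    b"\xC2",         // two-byte lead without its continuation byte
    b"\xE2",         // three-byte lead followed by another lead byte
    b"\xE2\x98",     // three-byte sequence cut off after two bytes
    b"\xF0",         // four-byte lead followed by another lead byte
    b"\xF0\x9F",     // four-byte sequence cut off after two bytes
    b"\xF0\x9F\x92", // four-byte sequence cut off after three bytes
    b"\xE2\x98\x83", // valid encoding of U+2603 SNOWMAN
];

/// Decodes every segment on its own and concatenates the results.
fn decode_segments() -> String {
    SEGMENTS
        .iter()
        .map(|segment| String::from_utf8_lossy(segment).into_owned())
        .collect()
}
```

Under these assumptions, decode_segments() returns the same string as decoding the whole input at once: seven U+FFFD characters followed by the snowman.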

Structs

  • ErrorReportingUtf8Chars: Iterator by Result<char,Utf8CharsError> over &[u8] that contains potentially-invalid UTF-8. There is exactly one Utf8CharsError for each error as defined by the WHATWG Encoding Standard.
  • Utf8CharIndices: An iterator over the chars and their positions.
  • Utf8Chars: Iterator by char over &[u8] that contains potentially-invalid UTF-8. See the crate documentation.
  • Utf8CharsError: A type for signaling UTF-8 errors.
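
The idea of reporting one error per malformed sequence, together with byte positions, can be sketched with the standard library alone. This is not the crate's implementation: decode_with_errors is a hypothetical helper, and it represents an error as Err(length in bytes) rather than as Utf8CharsError. It relies on std::str::from_utf8 and the valid_up_to() and error_len() methods of std::str::Utf8Error.

```rust
use std::str;

/// Hypothetical helper: returns (byte offset, result) pairs, with exactly one
/// `Err` per malformed sequence. The `Err` payload is that sequence's length.
fn decode_with_errors(bytes: &[u8]) -> Vec<(usize, Result<char, usize>)> {
    let mut out = Vec::new();
    let mut offset = 0; // position of `rest` within `bytes`
    let mut rest = bytes;
    while !rest.is_empty() {
        match str::from_utf8(rest) {
            Ok(valid) => {
                // Everything that is left is well-formed.
                out.extend(valid.char_indices().map(|(i, c)| (offset + i, Ok(c))));
                break;
            }
            Err(err) => {
                let valid_len = err.valid_up_to();
                // The prefix before the error is guaranteed to be valid UTF-8.
                let valid = str::from_utf8(&rest[..valid_len]).expect("validated prefix");
                out.extend(valid.char_indices().map(|(i, c)| (offset + i, Ok(c))));
                // `error_len()` is `None` when the input ends inside a sequence;
                // the truncated tail then counts as a single error.
                let bad_len = err.error_len().unwrap_or(rest.len() - valid_len);
                out.push((offset + valid_len, Err(bad_len)));
                offset += valid_len + bad_len;
                rest = &rest[valid_len + bad_len..];
            }
        }
    }
    out
}
```

For the input b"a\xFFb\xE2\x98", this sketch yields 'a' at offset 0, an error of length 1 at offset 1, 'b' at offset 2, and an error of length 2 at offset 3.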

Traits

  • Utf8CharsEx: Convenience trait that adds chars() and char_indices() methods to byte slices, similar to the methods of the same names on string slices.
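
The extension-trait pattern behind such a convenience trait can be sketched as follows. LossyCharsExt and lossy_chars are hypothetical names, not part of this crate, and the sketch collects eagerly into a Vec<char> via String::from_utf8_lossy for simplicity instead of returning an iterator type.

```rust
/// Hypothetical extension trait (not part of utf8_iter) that adds a
/// lossy-decoding method to byte slices.
trait LossyCharsExt {
    /// Decodes `self` as UTF-8, replacing each malformed sequence with U+FFFD.
    fn lossy_chars(&self) -> Vec<char>;
}

// Implementing the trait for the unsized slice type `[u8]` makes the method
// callable on `&[u8]`, byte-string literals, arrays, and `Vec<u8>` alike.
impl LossyCharsExt for [u8] {
    fn lossy_chars(&self) -> Vec<char> {
        String::from_utf8_lossy(self).chars().collect()
    }
}
```

With the trait in scope, b"a\xFFb".lossy_chars() can be called directly on the byte string, in the same way that a use of Utf8CharsEx brings chars() into scope for byte slices.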