pub struct Utf8Iterator<R>where
R: Iterator,{ /* private fields */ }
Expand description
A Utf8Iterator
wraps a UTF-8 decoder around an iterator for Read
.
Essentially, the Utf8Iterator
converts a u8
iterator into a char
iterator. The underling iterator can be an
iterator for a BufRead
or a Cursor
, for example.
It is meant to iterate around an I/O. Therefore, it is expecting the inner iterator to be of type Iterator<Item = Result<u8, std::io::Error>>
.
The next()
method will return an Option
, where None
indicates the end of the sequence and a value
will be of type Result
containing a char
or an error, which will describe an UTF-8 decoding error or an IO error from the underling iterator.
Decoding errors will contain the malformed sequences.
§Examples
use rustf8::*;
use std::io::prelude::*;
use std::io::Cursor;
fn some_correct_utf_8_text() {
let input: Vec<u8> = vec![
0xce, 0xba, 0xe1, 0xbd, 0xb9, 0xcf, 0x83, 0xce, 0xbc, 0xce, 0xb5,
];
let stream = Cursor::new(input);
let iter = stream.bytes();
let mut chiter = Utf8Iterator::new(iter);
assert_eq!('κ', chiter.next().unwrap().unwrap());
assert_eq!('ό', chiter.next().unwrap().unwrap());
assert_eq!('σ', chiter.next().unwrap().unwrap());
assert_eq!('μ', chiter.next().unwrap().unwrap());
assert_eq!('ε', chiter.next().unwrap().unwrap());
assert!(chiter.next().is_none());
}
§Errors
The Utf8Iterator
will identify UTF-8 decoding errors returning the enum Utf8IteratorError
.
The error will also contain a Box<u8>
with the malformed sequence.
Subsequent calls to next()
are allowed and will decode valid characters from the point beyond the malformed sequence.
The IO error std::io::ErrorKind::Interrupted
coming from the underling iterator will be transparently consumed by the next()
method.
Therefore there will be no need to treat such error.
§Panics
Panics if trying to use unget()
twice before calling next()
.
§Safety
This crate does not use unsafe {}
.
Once decoded, the values are converted using char::from_u32()
, which should prevent invalid characters anyway.
Implementations§
Source§impl<R> Utf8Iterator<R>
impl<R> Utf8Iterator<R>
Sourcepub fn new(inner: R) -> Self
pub fn new(inner: R) -> Self
Builds a new UTF-8 iterator using the provided iterator for a Read
.
This iterator will not reinitialize once it reaches the end of the sequence.
Also, the decoding will start at the current position of the underling iterator.
Sourcepub fn unget(&mut self, ch: char)
pub fn unget(&mut self, ch: char)
Returns a character to the iterator. That will be the item returned by the subsequent call to next()
.
Calling unget()
twice before calling next()
will panic.
§Example
use rustf8::*;
use std::io::prelude::*;
use std::io::Cursor;
fn unget_test() {
let input: Vec<u8> = vec![
0xce, 0xba, 0xe1, 0xbd, 0xb9, 0xcf, 0x83, 0xce, 0xbc, 0xce, 0xb5,
];
let stream = Cursor::new(input);
let iter = stream.bytes();
let mut chiter = Utf8Iterator::new(iter);
assert_eq!('κ', chiter.next().unwrap().unwrap());
chiter.unget('ε');
assert_eq!('ε', chiter.next().unwrap().unwrap());
assert_eq!('ό', chiter.next().unwrap().unwrap());
assert_eq!('σ', chiter.next().unwrap().unwrap());
assert_eq!('μ', chiter.next().unwrap().unwrap());
assert_eq!('ε', chiter.next().unwrap().unwrap());
chiter.unget('κ');
assert_eq!('κ', chiter.next().unwrap().unwrap());
assert!(chiter.next().is_none());
}
§Panics
Panics if trying to use unget()
twice before calling next()
.
Trait Implementations§
Source§impl<R> Iterator for Utf8Iterator<R>
impl<R> Iterator for Utf8Iterator<R>
Source§fn next(&mut self) -> Option<Self::Item>
fn next(&mut self) -> Option<Self::Item>
Decodes the next UTF-8 sequence and returns the corresponding character.
Source§fn size_hint(&self) -> (usize, Option<usize>)
fn size_hint(&self) -> (usize, Option<usize>)
Source§fn next_chunk<const N: usize>(
&mut self,
) -> Result<[Self::Item; N], IntoIter<Self::Item, N>>where
Self: Sized,
fn next_chunk<const N: usize>(
&mut self,
) -> Result<[Self::Item; N], IntoIter<Self::Item, N>>where
Self: Sized,
iter_next_chunk
)N
values. Read more1.0.0 · Source§fn count(self) -> usizewhere
Self: Sized,
fn count(self) -> usizewhere
Self: Sized,
1.0.0 · Source§fn last(self) -> Option<Self::Item>where
Self: Sized,
fn last(self) -> Option<Self::Item>where
Self: Sized,
Source§fn advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>>
fn advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>>
iter_advance_by
)n
elements. Read more1.0.0 · Source§fn nth(&mut self, n: usize) -> Option<Self::Item>
fn nth(&mut self, n: usize) -> Option<Self::Item>
n
th element of the iterator. Read more1.28.0 · Source§fn step_by(self, step: usize) -> StepBy<Self>where
Self: Sized,
fn step_by(self, step: usize) -> StepBy<Self>where
Self: Sized,
1.0.0 · Source§fn chain<U>(self, other: U) -> Chain<Self, <U as IntoIterator>::IntoIter>
fn chain<U>(self, other: U) -> Chain<Self, <U as IntoIterator>::IntoIter>
1.0.0 · Source§fn zip<U>(self, other: U) -> Zip<Self, <U as IntoIterator>::IntoIter>where
Self: Sized,
U: IntoIterator,
fn zip<U>(self, other: U) -> Zip<Self, <U as IntoIterator>::IntoIter>where
Self: Sized,
U: IntoIterator,
Source§fn intersperse(self, separator: Self::Item) -> Intersperse<Self>
fn intersperse(self, separator: Self::Item) -> Intersperse<Self>
iter_intersperse
)separator
between adjacent
items of the original iterator. Read moreSource§fn intersperse_with<G>(self, separator: G) -> IntersperseWith<Self, G>
fn intersperse_with<G>(self, separator: G) -> IntersperseWith<Self, G>
iter_intersperse
)separator
between adjacent items of the original iterator. Read more1.0.0 · Source§fn map<B, F>(self, f: F) -> Map<Self, F>
fn map<B, F>(self, f: F) -> Map<Self, F>
1.0.0 · Source§fn filter<P>(self, predicate: P) -> Filter<Self, P>
fn filter<P>(self, predicate: P) -> Filter<Self, P>
1.0.0 · Source§fn filter_map<B, F>(self, f: F) -> FilterMap<Self, F>
fn filter_map<B, F>(self, f: F) -> FilterMap<Self, F>
1.0.0 · Source§fn enumerate(self) -> Enumerate<Self>where
Self: Sized,
fn enumerate(self) -> Enumerate<Self>where
Self: Sized,
1.0.0 · Source§fn skip_while<P>(self, predicate: P) -> SkipWhile<Self, P>
fn skip_while<P>(self, predicate: P) -> SkipWhile<Self, P>
1.0.0 · Source§fn take_while<P>(self, predicate: P) -> TakeWhile<Self, P>
fn take_while<P>(self, predicate: P) -> TakeWhile<Self, P>
1.57.0 · Source§fn map_while<B, P>(self, predicate: P) -> MapWhile<Self, P>
fn map_while<B, P>(self, predicate: P) -> MapWhile<Self, P>
1.0.0 · Source§fn skip(self, n: usize) -> Skip<Self>where
Self: Sized,
fn skip(self, n: usize) -> Skip<Self>where
Self: Sized,
n
elements. Read more1.0.0 · Source§fn take(self, n: usize) -> Take<Self>where
Self: Sized,
fn take(self, n: usize) -> Take<Self>where
Self: Sized,
n
elements, or fewer
if the underlying iterator ends sooner. Read more1.0.0 · Source§fn flat_map<U, F>(self, f: F) -> FlatMap<Self, U, F>
fn flat_map<U, F>(self, f: F) -> FlatMap<Self, U, F>
1.29.0 · Source§fn flatten(self) -> Flatten<Self>
fn flatten(self) -> Flatten<Self>
Source§fn map_windows<F, R, const N: usize>(self, f: F) -> MapWindows<Self, F, N>
fn map_windows<F, R, const N: usize>(self, f: F) -> MapWindows<Self, F, N>
iter_map_windows
)f
for each contiguous window of size N
over
self
and returns an iterator over the outputs of f
. Like slice::windows()
,
the windows during mapping overlap as well. Read more1.0.0 · Source§fn inspect<F>(self, f: F) -> Inspect<Self, F>
fn inspect<F>(self, f: F) -> Inspect<Self, F>
1.0.0 · Source§fn by_ref(&mut self) -> &mut Selfwhere
Self: Sized,
fn by_ref(&mut self) -> &mut Selfwhere
Self: Sized,
Iterator
. Read moreSource§fn try_collect<B>(
&mut self,
) -> <<Self::Item as Try>::Residual as Residual<B>>::TryType
fn try_collect<B>( &mut self, ) -> <<Self::Item as Try>::Residual as Residual<B>>::TryType
iterator_try_collect
)Source§fn collect_into<E>(self, collection: &mut E) -> &mut E
fn collect_into<E>(self, collection: &mut E) -> &mut E
iter_collect_into
)1.0.0 · Source§fn partition<B, F>(self, f: F) -> (B, B)
fn partition<B, F>(self, f: F) -> (B, B)
Source§fn is_partitioned<P>(self, predicate: P) -> bool
fn is_partitioned<P>(self, predicate: P) -> bool
iter_is_partitioned
)true
precede all those that return false
. Read more1.27.0 · Source§fn try_fold<B, F, R>(&mut self, init: B, f: F) -> R
fn try_fold<B, F, R>(&mut self, init: B, f: F) -> R
1.27.0 · Source§fn try_for_each<F, R>(&mut self, f: F) -> R
fn try_for_each<F, R>(&mut self, f: F) -> R
1.0.0 · Source§fn fold<B, F>(self, init: B, f: F) -> B
fn fold<B, F>(self, init: B, f: F) -> B
1.51.0 · Source§fn reduce<F>(self, f: F) -> Option<Self::Item>
fn reduce<F>(self, f: F) -> Option<Self::Item>
Source§fn try_reduce<R>(
&mut self,
f: impl FnMut(Self::Item, Self::Item) -> R,
) -> <<R as Try>::Residual as Residual<Option<<R as Try>::Output>>>::TryType
fn try_reduce<R>( &mut self, f: impl FnMut(Self::Item, Self::Item) -> R, ) -> <<R as Try>::Residual as Residual<Option<<R as Try>::Output>>>::TryType
iterator_try_reduce
)1.0.0 · Source§fn all<F>(&mut self, f: F) -> bool
fn all<F>(&mut self, f: F) -> bool
1.0.0 · Source§fn any<F>(&mut self, f: F) -> bool
fn any<F>(&mut self, f: F) -> bool
1.0.0 · Source§fn find<P>(&mut self, predicate: P) -> Option<Self::Item>
fn find<P>(&mut self, predicate: P) -> Option<Self::Item>
1.30.0 · Source§fn find_map<B, F>(&mut self, f: F) -> Option<B>
fn find_map<B, F>(&mut self, f: F) -> Option<B>
Source§fn try_find<R>(
&mut self,
f: impl FnMut(&Self::Item) -> R,
) -> <<R as Try>::Residual as Residual<Option<Self::Item>>>::TryType
fn try_find<R>( &mut self, f: impl FnMut(&Self::Item) -> R, ) -> <<R as Try>::Residual as Residual<Option<Self::Item>>>::TryType
try_find
)1.0.0 · Source§fn position<P>(&mut self, predicate: P) -> Option<usize>
fn position<P>(&mut self, predicate: P) -> Option<usize>
1.0.0 · Source§fn max(self) -> Option<Self::Item>
fn max(self) -> Option<Self::Item>
1.0.0 · Source§fn min(self) -> Option<Self::Item>
fn min(self) -> Option<Self::Item>
1.6.0 · Source§fn max_by_key<B, F>(self, f: F) -> Option<Self::Item>
fn max_by_key<B, F>(self, f: F) -> Option<Self::Item>
1.15.0 · Source§fn max_by<F>(self, compare: F) -> Option<Self::Item>
fn max_by<F>(self, compare: F) -> Option<Self::Item>
1.6.0 · Source§fn min_by_key<B, F>(self, f: F) -> Option<Self::Item>
fn min_by_key<B, F>(self, f: F) -> Option<Self::Item>
1.15.0 · Source§fn min_by<F>(self, compare: F) -> Option<Self::Item>
fn min_by<F>(self, compare: F) -> Option<Self::Item>
1.0.0 · Source§fn unzip<A, B, FromA, FromB>(self) -> (FromA, FromB)
fn unzip<A, B, FromA, FromB>(self) -> (FromA, FromB)
1.36.0 · Source§fn copied<'a, T>(self) -> Copied<Self>
fn copied<'a, T>(self) -> Copied<Self>
Source§fn array_chunks<const N: usize>(self) -> ArrayChunks<Self, N>where
Self: Sized,
fn array_chunks<const N: usize>(self) -> ArrayChunks<Self, N>where
Self: Sized,
iter_array_chunks
)N
elements of the iterator at a time. Read more1.11.0 · Source§fn product<P>(self) -> P
fn product<P>(self) -> P
Source§fn cmp_by<I, F>(self, other: I, cmp: F) -> Ordering
fn cmp_by<I, F>(self, other: I, cmp: F) -> Ordering
iter_order_by
)Iterator
with those
of another with respect to the specified comparison function. Read more1.5.0 · Source§fn partial_cmp<I>(self, other: I) -> Option<Ordering>
fn partial_cmp<I>(self, other: I) -> Option<Ordering>
PartialOrd
elements of
this Iterator
with those of another. The comparison works like short-circuit
evaluation, returning a result without comparing the remaining elements.
As soon as an order can be determined, the evaluation stops and a result is returned. Read moreSource§fn partial_cmp_by<I, F>(self, other: I, partial_cmp: F) -> Option<Ordering>where
Self: Sized,
I: IntoIterator,
F: FnMut(Self::Item, <I as IntoIterator>::Item) -> Option<Ordering>,
fn partial_cmp_by<I, F>(self, other: I, partial_cmp: F) -> Option<Ordering>where
Self: Sized,
I: IntoIterator,
F: FnMut(Self::Item, <I as IntoIterator>::Item) -> Option<Ordering>,
iter_order_by
)Iterator
with those
of another with respect to the specified comparison function. Read moreSource§fn eq_by<I, F>(self, other: I, eq: F) -> bool
fn eq_by<I, F>(self, other: I, eq: F) -> bool
iter_order_by
)1.5.0 · Source§fn lt<I>(self, other: I) -> bool
fn lt<I>(self, other: I) -> bool
Iterator
are lexicographically
less than those of another. Read more1.5.0 · Source§fn le<I>(self, other: I) -> bool
fn le<I>(self, other: I) -> bool
Iterator
are lexicographically
less or equal to those of another. Read more1.5.0 · Source§fn gt<I>(self, other: I) -> bool
fn gt<I>(self, other: I) -> bool
Iterator
are lexicographically
greater than those of another. Read more1.5.0 · Source§fn ge<I>(self, other: I) -> bool
fn ge<I>(self, other: I) -> bool
Iterator
are lexicographically
greater than or equal to those of another. Read more