Struct ByteChunker

Source
pub struct ByteChunker<R> { /* private fields */ }
Expand description

The ByteChunker takes a bytes::Regex, wraps a byte source (that is, a type that implements std::io::Read) and iterates over chunks of bytes from that source that are delimited by the regular expression. It operates very much like bytes::Regex::split, except that it works on an incoming stream of bytes instead of a necessarily-already-in-memory slice.

use regex_chunker::ByteChunker;
use std::io::Cursor;

let text = b"One, two, three, four. Can I have a little more?";
let c = Cursor::new(text);

let chunks: Vec<String> = ByteChunker::new(c, "[ .,?]+")?
    .map(|res| {
        let v = res.unwrap();
        String::from_utf8(v).unwrap()
    }).collect();

assert_eq!(
    &chunks,
    &["One", "two", "three", "four",
    "Can", "I", "have", "a", "little", "more"].clone()
);

It’s also slightly more flexible, in that the the matched bytes can be optionally added to the beginnings or endings of the returned chunks. (By default they are just dropped.)

use regex_chunker::{ByteChunker, MatchDisposition};
use std::io::Cursor;

let text = b"One, two, three, four. Can I have a little more?";
let c = Cursor::new(text);

let chunks: Vec<String> = ByteChunker::new(c, "[ .,?]+")?
    .with_match(MatchDisposition::Append)
    .map(|res| {
        let v = res.unwrap();
        String::from_utf8(v).unwrap()
    }).collect();

assert_eq!(
    &chunks,
    &["One, ", "two, ", "three, ", "four. ",
    "Can ", "I ", "have ", "a ", "little ", "more?"].clone()
);

Implementations§

Source§

impl<R> ByteChunker<R>

Source

pub fn new(source: R, delimiter: &str) -> Result<Self, RcErr>

Return a new ByteChunker wrapping the given writer that will chunk its output by delimiting it with the supplied regex pattern.

Source

pub fn with_buffer_size(self, size: usize) -> Self

Builder-pattern method for setting the read buffer size. Default size is 1024 bytes.

Source

pub fn on_error(self, response: ErrorResponse) -> Self

Builder-pattern method for controlling how the chunker behaves when encountering an error in the course of its operation. Default value is ErrorResponse::Halt.

Source

pub fn with_match(self, behavior: MatchDisposition) -> Self

Builder-pattern method for controlling what the chunker does with the matched text. Default value is MatchDisposition::Drop.

Source

pub fn into_inner(self) -> R

Consumes the ByteChunker and returns its wrapped Reader. The ByteChunker may have read some data from its source that may not yet have been returned or successfully matched; this data may be lost. To retrieve that data, see ByteChunker::into_innards.

Source

pub fn into_innards(self) -> (R, Vec<u8>)

Consumes the ByteChunker and returns its wrapped Reader, as well as any not-yet-processed data that has been read. If this unprocessed data is unimportant, and you just want the reader back, use the more traditional ByteChunker::into_inner.

Source

pub fn with_adapter<A>(self, adapter: A) -> CustomChunker<R, A>

Creates a CustomChunker by combining this ByteChunker with an Adapter type.

Source

pub fn with_simple_adapter<A>(self, adapter: A) -> SimpleCustomChunker<R, A>

Trait Implementations§

Source§

impl<R> Debug for ByteChunker<R>

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl<R: Read> Iterator for ByteChunker<R>

The ByteChunker specifically doesn’t supply an implementation of Iterator::size_hint because, in general, it’s impossible to tell how much data is left in a reader.

Source§

type Item = Result<Vec<u8>, RcErr>

The type of the elements being iterated over.
Source§

fn next(&mut self) -> Option<Self::Item>

Advances the iterator and returns the next value. Read more
Source§

fn next_chunk<const N: usize>( &mut self, ) -> Result<[Self::Item; N], IntoIter<Self::Item, N>>
where Self: Sized,

🔬This is a nightly-only experimental API. (iter_next_chunk)
Advances the iterator and returns an array containing the next N values. Read more
1.0.0 · Source§

fn size_hint(&self) -> (usize, Option<usize>)

Returns the bounds on the remaining length of the iterator. Read more
1.0.0 · Source§

fn count(self) -> usize
where Self: Sized,

Consumes the iterator, counting the number of iterations and returning it. Read more
1.0.0 · Source§

fn last(self) -> Option<Self::Item>
where Self: Sized,

Consumes the iterator, returning the last element. Read more
Source§

fn advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>>

🔬This is a nightly-only experimental API. (iter_advance_by)
Advances the iterator by n elements. Read more
1.0.0 · Source§

fn nth(&mut self, n: usize) -> Option<Self::Item>

Returns the nth element of the iterator. Read more
1.28.0 · Source§

fn step_by(self, step: usize) -> StepBy<Self>
where Self: Sized,

Creates an iterator starting at the same point, but stepping by the given amount at each iteration. Read more
1.0.0 · Source§

fn chain<U>(self, other: U) -> Chain<Self, <U as IntoIterator>::IntoIter>
where Self: Sized, U: IntoIterator<Item = Self::Item>,

Takes two iterators and creates a new iterator over both in sequence. Read more
1.0.0 · Source§

fn zip<U>(self, other: U) -> Zip<Self, <U as IntoIterator>::IntoIter>
where Self: Sized, U: IntoIterator,

‘Zips up’ two iterators into a single iterator of pairs. Read more
Source§

fn intersperse(self, separator: Self::Item) -> Intersperse<Self>
where Self: Sized, Self::Item: Clone,

🔬This is a nightly-only experimental API. (iter_intersperse)
Creates a new iterator which places a copy of separator between adjacent items of the original iterator. Read more
Source§

fn intersperse_with<G>(self, separator: G) -> IntersperseWith<Self, G>
where Self: Sized, G: FnMut() -> Self::Item,

🔬This is a nightly-only experimental API. (iter_intersperse)
Creates a new iterator which places an item generated by separator between adjacent items of the original iterator. Read more
1.0.0 · Source§

fn map<B, F>(self, f: F) -> Map<Self, F>
where Self: Sized, F: FnMut(Self::Item) -> B,

Takes a closure and creates an iterator which calls that closure on each element. Read more
1.21.0 · Source§

fn for_each<F>(self, f: F)
where Self: Sized, F: FnMut(Self::Item),

Calls a closure on each element of an iterator. Read more
1.0.0 · Source§

fn filter<P>(self, predicate: P) -> Filter<Self, P>
where Self: Sized, P: FnMut(&Self::Item) -> bool,

Creates an iterator which uses a closure to determine if an element should be yielded. Read more
1.0.0 · Source§

fn filter_map<B, F>(self, f: F) -> FilterMap<Self, F>
where Self: Sized, F: FnMut(Self::Item) -> Option<B>,

Creates an iterator that both filters and maps. Read more
1.0.0 · Source§

fn enumerate(self) -> Enumerate<Self>
where Self: Sized,

Creates an iterator which gives the current iteration count as well as the next value. Read more
1.0.0 · Source§

fn peekable(self) -> Peekable<Self>
where Self: Sized,

Creates an iterator which can use the peek and peek_mut methods to look at the next element of the iterator without consuming it. See their documentation for more information. Read more
1.0.0 · Source§

fn skip_while<P>(self, predicate: P) -> SkipWhile<Self, P>
where Self: Sized, P: FnMut(&Self::Item) -> bool,

Creates an iterator that skips elements based on a predicate. Read more
1.0.0 · Source§

fn take_while<P>(self, predicate: P) -> TakeWhile<Self, P>
where Self: Sized, P: FnMut(&Self::Item) -> bool,

Creates an iterator that yields elements based on a predicate. Read more
1.57.0 · Source§

fn map_while<B, P>(self, predicate: P) -> MapWhile<Self, P>
where Self: Sized, P: FnMut(Self::Item) -> Option<B>,

Creates an iterator that both yields elements based on a predicate and maps. Read more
1.0.0 · Source§

fn skip(self, n: usize) -> Skip<Self>
where Self: Sized,

Creates an iterator that skips the first n elements. Read more
1.0.0 · Source§

fn take(self, n: usize) -> Take<Self>
where Self: Sized,

Creates an iterator that yields the first n elements, or fewer if the underlying iterator ends sooner. Read more
1.0.0 · Source§

fn scan<St, B, F>(self, initial_state: St, f: F) -> Scan<Self, St, F>
where Self: Sized, F: FnMut(&mut St, Self::Item) -> Option<B>,

An iterator adapter which, like fold, holds internal state, but unlike fold, produces a new iterator. Read more
1.0.0 · Source§

fn flat_map<U, F>(self, f: F) -> FlatMap<Self, U, F>
where Self: Sized, U: IntoIterator, F: FnMut(Self::Item) -> U,

Creates an iterator that works like map, but flattens nested structure. Read more
1.29.0 · Source§

fn flatten(self) -> Flatten<Self>
where Self: Sized, Self::Item: IntoIterator,

Creates an iterator that flattens nested structure. Read more
Source§

fn map_windows<F, R, const N: usize>(self, f: F) -> MapWindows<Self, F, N>
where Self: Sized, F: FnMut(&[Self::Item; N]) -> R,

🔬This is a nightly-only experimental API. (iter_map_windows)
Calls the given function f for each contiguous window of size N over self and returns an iterator over the outputs of f. Like slice::windows(), the windows during mapping overlap as well. Read more
1.0.0 · Source§

fn fuse(self) -> Fuse<Self>
where Self: Sized,

Creates an iterator which ends after the first None. Read more
1.0.0 · Source§

fn inspect<F>(self, f: F) -> Inspect<Self, F>
where Self: Sized, F: FnMut(&Self::Item),

Does something with each element of an iterator, passing the value on. Read more
1.0.0 · Source§

fn by_ref(&mut self) -> &mut Self
where Self: Sized,

Creates a “by reference” adapter for this instance of Iterator. Read more
1.0.0 · Source§

fn collect<B>(self) -> B
where B: FromIterator<Self::Item>, Self: Sized,

Transforms an iterator into a collection. Read more
Source§

fn try_collect<B>( &mut self, ) -> <<Self::Item as Try>::Residual as Residual<B>>::TryType
where Self: Sized, Self::Item: Try, <Self::Item as Try>::Residual: Residual<B>, B: FromIterator<<Self::Item as Try>::Output>,

🔬This is a nightly-only experimental API. (iterator_try_collect)
Fallibly transforms an iterator into a collection, short circuiting if a failure is encountered. Read more
Source§

fn collect_into<E>(self, collection: &mut E) -> &mut E
where E: Extend<Self::Item>, Self: Sized,

🔬This is a nightly-only experimental API. (iter_collect_into)
Collects all the items from an iterator into a collection. Read more
1.0.0 · Source§

fn partition<B, F>(self, f: F) -> (B, B)
where Self: Sized, B: Default + Extend<Self::Item>, F: FnMut(&Self::Item) -> bool,

Consumes an iterator, creating two collections from it. Read more
Source§

fn is_partitioned<P>(self, predicate: P) -> bool
where Self: Sized, P: FnMut(Self::Item) -> bool,

🔬This is a nightly-only experimental API. (iter_is_partitioned)
Checks if the elements of this iterator are partitioned according to the given predicate, such that all those that return true precede all those that return false. Read more
1.27.0 · Source§

fn try_fold<B, F, R>(&mut self, init: B, f: F) -> R
where Self: Sized, F: FnMut(B, Self::Item) -> R, R: Try<Output = B>,

An iterator method that applies a function as long as it returns successfully, producing a single, final value. Read more
1.27.0 · Source§

fn try_for_each<F, R>(&mut self, f: F) -> R
where Self: Sized, F: FnMut(Self::Item) -> R, R: Try<Output = ()>,

An iterator method that applies a fallible function to each item in the iterator, stopping at the first error and returning that error. Read more
1.0.0 · Source§

fn fold<B, F>(self, init: B, f: F) -> B
where Self: Sized, F: FnMut(B, Self::Item) -> B,

Folds every element into an accumulator by applying an operation, returning the final result. Read more
1.51.0 · Source§

fn reduce<F>(self, f: F) -> Option<Self::Item>
where Self: Sized, F: FnMut(Self::Item, Self::Item) -> Self::Item,

Reduces the elements to a single one, by repeatedly applying a reducing operation. Read more
Source§

fn try_reduce<R>( &mut self, f: impl FnMut(Self::Item, Self::Item) -> R, ) -> <<R as Try>::Residual as Residual<Option<<R as Try>::Output>>>::TryType
where Self: Sized, R: Try<Output = Self::Item>, <R as Try>::Residual: Residual<Option<Self::Item>>,

🔬This is a nightly-only experimental API. (iterator_try_reduce)
Reduces the elements to a single one by repeatedly applying a reducing operation. If the closure returns a failure, the failure is propagated back to the caller immediately. Read more
1.0.0 · Source§

fn all<F>(&mut self, f: F) -> bool
where Self: Sized, F: FnMut(Self::Item) -> bool,

Tests if every element of the iterator matches a predicate. Read more
1.0.0 · Source§

fn any<F>(&mut self, f: F) -> bool
where Self: Sized, F: FnMut(Self::Item) -> bool,

Tests if any element of the iterator matches a predicate. Read more
1.0.0 · Source§

fn find<P>(&mut self, predicate: P) -> Option<Self::Item>
where Self: Sized, P: FnMut(&Self::Item) -> bool,

Searches for an element of an iterator that satisfies a predicate. Read more
1.30.0 · Source§

fn find_map<B, F>(&mut self, f: F) -> Option<B>
where Self: Sized, F: FnMut(Self::Item) -> Option<B>,

Applies function to the elements of iterator and returns the first non-none result. Read more
Source§

fn try_find<R>( &mut self, f: impl FnMut(&Self::Item) -> R, ) -> <<R as Try>::Residual as Residual<Option<Self::Item>>>::TryType
where Self: Sized, R: Try<Output = bool>, <R as Try>::Residual: Residual<Option<Self::Item>>,

🔬This is a nightly-only experimental API. (try_find)
Applies function to the elements of iterator and returns the first true result or the first error. Read more
1.0.0 · Source§

fn position<P>(&mut self, predicate: P) -> Option<usize>
where Self: Sized, P: FnMut(Self::Item) -> bool,

Searches for an element in an iterator, returning its index. Read more
1.0.0 · Source§

fn max(self) -> Option<Self::Item>
where Self: Sized, Self::Item: Ord,

Returns the maximum element of an iterator. Read more
1.0.0 · Source§

fn min(self) -> Option<Self::Item>
where Self: Sized, Self::Item: Ord,

Returns the minimum element of an iterator. Read more
1.6.0 · Source§

fn max_by_key<B, F>(self, f: F) -> Option<Self::Item>
where B: Ord, Self: Sized, F: FnMut(&Self::Item) -> B,

Returns the element that gives the maximum value from the specified function. Read more
1.15.0 · Source§

fn max_by<F>(self, compare: F) -> Option<Self::Item>
where Self: Sized, F: FnMut(&Self::Item, &Self::Item) -> Ordering,

Returns the element that gives the maximum value with respect to the specified comparison function. Read more
1.6.0 · Source§

fn min_by_key<B, F>(self, f: F) -> Option<Self::Item>
where B: Ord, Self: Sized, F: FnMut(&Self::Item) -> B,

Returns the element that gives the minimum value from the specified function. Read more
1.15.0 · Source§

fn min_by<F>(self, compare: F) -> Option<Self::Item>
where Self: Sized, F: FnMut(&Self::Item, &Self::Item) -> Ordering,

Returns the element that gives the minimum value with respect to the specified comparison function. Read more
1.0.0 · Source§

fn unzip<A, B, FromA, FromB>(self) -> (FromA, FromB)
where FromA: Default + Extend<A>, FromB: Default + Extend<B>, Self: Sized + Iterator<Item = (A, B)>,

Converts an iterator of pairs into a pair of containers. Read more
1.36.0 · Source§

fn copied<'a, T>(self) -> Copied<Self>
where T: 'a + Copy, Self: Sized + Iterator<Item = &'a T>,

Creates an iterator which copies all of its elements. Read more
1.0.0 · Source§

fn cloned<'a, T>(self) -> Cloned<Self>
where T: 'a + Clone, Self: Sized + Iterator<Item = &'a T>,

Creates an iterator which clones all of its elements. Read more
Source§

fn array_chunks<const N: usize>(self) -> ArrayChunks<Self, N>
where Self: Sized,

🔬This is a nightly-only experimental API. (iter_array_chunks)
Returns an iterator over N elements of the iterator at a time. Read more
1.11.0 · Source§

fn sum<S>(self) -> S
where Self: Sized, S: Sum<Self::Item>,

Sums the elements of an iterator. Read more
1.11.0 · Source§

fn product<P>(self) -> P
where Self: Sized, P: Product<Self::Item>,

Iterates over the entire iterator, multiplying all the elements Read more
1.5.0 · Source§

fn cmp<I>(self, other: I) -> Ordering
where I: IntoIterator<Item = Self::Item>, Self::Item: Ord, Self: Sized,

Lexicographically compares the elements of this Iterator with those of another. Read more
Source§

fn cmp_by<I, F>(self, other: I, cmp: F) -> Ordering
where Self: Sized, I: IntoIterator, F: FnMut(Self::Item, <I as IntoIterator>::Item) -> Ordering,

🔬This is a nightly-only experimental API. (iter_order_by)
Lexicographically compares the elements of this Iterator with those of another with respect to the specified comparison function. Read more
1.5.0 · Source§

fn partial_cmp<I>(self, other: I) -> Option<Ordering>
where I: IntoIterator, Self::Item: PartialOrd<<I as IntoIterator>::Item>, Self: Sized,

Lexicographically compares the PartialOrd elements of this Iterator with those of another. The comparison works like short-circuit evaluation, returning a result without comparing the remaining elements. As soon as an order can be determined, the evaluation stops and a result is returned. Read more
Source§

fn partial_cmp_by<I, F>(self, other: I, partial_cmp: F) -> Option<Ordering>
where Self: Sized, I: IntoIterator, F: FnMut(Self::Item, <I as IntoIterator>::Item) -> Option<Ordering>,

🔬This is a nightly-only experimental API. (iter_order_by)
Lexicographically compares the elements of this Iterator with those of another with respect to the specified comparison function. Read more
1.5.0 · Source§

fn eq<I>(self, other: I) -> bool
where I: IntoIterator, Self::Item: PartialEq<<I as IntoIterator>::Item>, Self: Sized,

Determines if the elements of this Iterator are equal to those of another. Read more
Source§

fn eq_by<I, F>(self, other: I, eq: F) -> bool
where Self: Sized, I: IntoIterator, F: FnMut(Self::Item, <I as IntoIterator>::Item) -> bool,

🔬This is a nightly-only experimental API. (iter_order_by)
Determines if the elements of this Iterator are equal to those of another with respect to the specified equality function. Read more
1.5.0 · Source§

fn ne<I>(self, other: I) -> bool
where I: IntoIterator, Self::Item: PartialEq<<I as IntoIterator>::Item>, Self: Sized,

Determines if the elements of this Iterator are not equal to those of another. Read more
1.5.0 · Source§

fn lt<I>(self, other: I) -> bool
where I: IntoIterator, Self::Item: PartialOrd<<I as IntoIterator>::Item>, Self: Sized,

Determines if the elements of this Iterator are lexicographically less than those of another. Read more
1.5.0 · Source§

fn le<I>(self, other: I) -> bool
where I: IntoIterator, Self::Item: PartialOrd<<I as IntoIterator>::Item>, Self: Sized,

Determines if the elements of this Iterator are lexicographically less or equal to those of another. Read more
1.5.0 · Source§

fn gt<I>(self, other: I) -> bool
where I: IntoIterator, Self::Item: PartialOrd<<I as IntoIterator>::Item>, Self: Sized,

Determines if the elements of this Iterator are lexicographically greater than those of another. Read more
1.5.0 · Source§

fn ge<I>(self, other: I) -> bool
where I: IntoIterator, Self::Item: PartialOrd<<I as IntoIterator>::Item>, Self: Sized,

Determines if the elements of this Iterator are lexicographically greater than or equal to those of another. Read more
1.82.0 · Source§

fn is_sorted(self) -> bool
where Self: Sized, Self::Item: PartialOrd,

Checks if the elements of this iterator are sorted. Read more
1.82.0 · Source§

fn is_sorted_by<F>(self, compare: F) -> bool
where Self: Sized, F: FnMut(&Self::Item, &Self::Item) -> bool,

Checks if the elements of this iterator are sorted using the given comparator function. Read more
1.82.0 · Source§

fn is_sorted_by_key<F, K>(self, f: F) -> bool
where Self: Sized, F: FnMut(Self::Item) -> K, K: PartialOrd,

Checks if the elements of this iterator are sorted using the given key extraction function. Read more

Auto Trait Implementations§

§

impl<R> Freeze for ByteChunker<R>
where R: Freeze,

§

impl<R> RefUnwindSafe for ByteChunker<R>
where R: RefUnwindSafe,

§

impl<R> Send for ByteChunker<R>
where R: Send,

§

impl<R> Sync for ByteChunker<R>
where R: Sync,

§

impl<R> Unpin for ByteChunker<R>
where R: Unpin,

§

impl<R> UnwindSafe for ByteChunker<R>
where R: UnwindSafe,

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<I> IntoIterator for I
where I: Iterator,

Source§

type Item = <I as Iterator>::Item

The type of the elements being iterated over.
Source§

type IntoIter = I

Which kind of iterator are we turning this into?
Source§

fn into_iter(self) -> I

Creates an iterator from a value. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more