pub struct Pipeline<T: Send + 'static> { /* private fields */ }Expand description
A parallel data pipeline backed by bounded crossbeam channels and worker threads.
Each stage method consumes the pipeline and returns a new one whose type
reflects the output of that stage. All worker threads are tracked in a
shared Arc<Mutex<Vec<JoinHandle>>> so that calling one of the terminal
methods (collect, for_each, wait_for_completion) joins every thread
spawned by the whole graph.
Implementations§
Source§impl<T: Send + 'static> Pipeline<T>
impl<T: Send + 'static> Pipeline<T>
Sourcepub fn new(iter: impl IntoIterator<Item = T> + Send + 'static) -> Self
pub fn new(iter: impl IntoIterator<Item = T> + Send + 'static) -> Self
Create a pipeline from any iterator. A feeder thread is spawned immediately to push items into the first bounded channel.
Sourcepub fn with_buffer(self, size: usize) -> Self
pub fn with_buffer(self, size: usize) -> Self
Override the inter-stage channel capacity (default: 4, matching olympipe).
Source§impl<T: Send + 'static> Pipeline<T>
impl<T: Send + 'static> Pipeline<T>
Sourcepub fn task<U, F>(self, f: F, count: usize) -> Pipeline<U>
pub fn task<U, F>(self, f: F, count: usize) -> Pipeline<U>
Map each item through f. count worker threads run concurrently,
sharing the input channel (MPMC) and writing to a shared output channel.
Sourcepub fn task_or<U, E, F, H>(self, f: F, on_error: H) -> Pipeline<U>
pub fn task_or<U, E, F, H>(self, f: F, on_error: H) -> Pipeline<U>
Map each item through f which may fail. On error, on_error is
called with a clone of the original item and the error; if it returns
Some(v), v is forwarded downstream; None skips the item.
Sourcepub fn batch(self, size: usize) -> Pipeline<Vec<T>>
pub fn batch(self, size: usize) -> Pipeline<Vec<T>>
Group items into Vec<T> of at most size elements.
The last (potentially incomplete) batch is always emitted.
Sourcepub fn temporal_batch(self, window: Duration) -> Pipeline<Vec<T>>
pub fn temporal_batch(self, window: Duration) -> Pipeline<Vec<T>>
Collect items arriving within window of each other into a Vec<T>.
A new batch begins whenever the inter-item gap exceeds window, or when
the upstream channel disconnects.
Sourcepub fn explode<U, I, F>(self, f: F) -> Pipeline<U>
pub fn explode<U, I, F>(self, f: F) -> Pipeline<U>
Apply f to each item and forward every element yielded by the
returned iterator (flatMap).
Sourcepub fn split<A, B, F>(self, f: F) -> (Pipeline<A>, Pipeline<B>)
pub fn split<A, B, F>(self, f: F) -> (Pipeline<A>, Pipeline<B>)
Route each item to one or both of two output pipelines according to f.
Both returned pipelines share the same handle registry, so a single terminal call on either one will join all threads.
The two output channels are unbounded so that the router thread can
fully drain its input even when the two branches are consumed
sequentially (calling collect() on one branch at a time). If you add
further bounded stages after split, back-pressure is restored there.
Sourcepub fn gather(self, others: Vec<Pipeline<T>>) -> Pipeline<T>
pub fn gather(self, others: Vec<Pipeline<T>>) -> Pipeline<T>
Merge self and all pipelines in others into a single output stream.
One forwarder thread is spawned per source; all handles are merged into
the returned pipeline’s handle registry.
Sourcepub fn reduce<U, F>(self, init: U, f: F) -> Pipeline<U>
pub fn reduce<U, F>(self, init: U, f: F) -> Pipeline<U>
Fold all items into a single accumulated value and emit it as the sole output item.
Source§impl<T: Send + 'static> Pipeline<T>
impl<T: Send + 'static> Pipeline<T>
Sourcepub fn collect(self) -> Vec<T>
pub fn collect(self) -> Vec<T>
Collect all output items into a Vec, then join every worker thread.
Sourcepub fn for_each<F>(self, f: F)where
F: FnMut(T),
pub fn for_each<F>(self, f: F)where
F: FnMut(T),
Apply f to each output item, then join every worker thread.
Sourcepub fn wait_for_completion(self)
pub fn wait_for_completion(self)
Drain and discard all output, then join every worker thread.