Trait ParallelReader

Source

pub trait ParallelReader {
    // Required method
    fn process_parallel<P: ParallelProcessor + Clone + 'static>(
        &self,
        processor: P,
        num_threads: usize,
    ) -> Result<()>;
}

Expand description

Trait for IBU readers that can process records in parallel.

This trait is implemented by readers that can efficiently distribute records across multiple threads for parallel processing. Currently implemented by MmapReader for memory-mapped file access.

§Threading Model

The parallel processing uses a divide-and-conquer approach:

The total number of records is divided evenly across threads
Each thread processes its assigned range independently
Within each thread, records are processed in batches for efficiency
Results are aggregated through the processor’s on_batch_complete method

§Performance

Parallel processing typically scales linearly with the number of CPU cores for CPU-bound operations. For I/O-bound operations, the benefits depend on the underlying storage system.

§Examples

use ibu::{MmapReader, ParallelProcessor, ParallelReader, Record};
use std::sync::{Arc, Mutex};

#[derive(Clone, Default)]
struct SimpleCounter {
    local: u64,
    global: Arc<Mutex<u64>>,
}

impl ParallelProcessor for SimpleCounter {
    fn process_record(&mut self, _record: Record) -> ibu::Result<()> {
        self.local += 1;
        Ok(())
    }

    fn on_batch_complete(&mut self) -> ibu::Result<()> {
        *self.global.lock().unwrap() += self.local;
        self.local = 0;
        Ok(())
    }
}

let reader = MmapReader::new("data.ibu")?;
let counter = SimpleCounter::default();

// Process with 4 threads
reader.process_parallel(counter.clone(), 4)?;

// Check results
let total = *counter.global.lock().unwrap();
println!("Processed {} records", total);

Required Methods§

Source

fn process_parallel<P: ParallelProcessor + Clone + 'static>( &self, processor: P, num_threads: usize, ) -> Result<()>

Processes all records in parallel using the specified processor.

Divides the records across the specified number of threads and processes them in parallel. Each thread gets its own clone of the processor.

§Arguments

processor - The processor to use for handling records
num_threads - Number of threads to use (0 = use all available cores)

§Errors

Returns an error if:

Any thread encounters a processing error
Thread creation or coordination fails
The processor returns an error from process_record or on_batch_complete

§Examples

use ibu::{MmapReader, ParallelProcessor, ParallelReader, Record};

#[derive(Clone, Default)]
struct NoOpProcessor;

impl ParallelProcessor for NoOpProcessor {
    fn process_record(&mut self, _record: Record) -> ibu::Result<()> {
        Ok(()) // Do nothing
    }
}

let reader = MmapReader::new("data.ibu")?;
let processor = NoOpProcessor::default();

// Use all available cores
reader.process_parallel(processor, 0)?;

Dyn Compatibility§

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.

Implementors§

Source §

ParallelReader

Trait ParallelReader Copy item path

§Threading Model

§Performance

§Examples

Required Methods§

fn process_parallel<P: ParallelProcessor + Clone + 'static>( &self, processor: P, num_threads: usize, ) -> Result<()>

§Arguments

§Errors

§Examples

Dyn Compatibility§

Implementors§

impl ParallelReader for MmapReader

Trait ParallelReader