ParallelReader

Trait ParallelReader 

Source
pub trait ParallelReader {
    // Required method
    fn process_parallel<P: ParallelProcessor + Clone + 'static>(
        &self,
        processor: P,
        num_threads: usize,
    ) -> Result<()>;
}
Expand description

Trait for IBU readers that can process records in parallel.

This trait is implemented by readers that can efficiently distribute records across multiple threads for parallel processing. Currently implemented by MmapReader for memory-mapped file access.

§Threading Model

The parallel processing uses a divide-and-conquer approach:

  1. The total number of records is divided evenly across threads
  2. Each thread processes its assigned range independently
  3. Within each thread, records are processed in batches for efficiency
  4. Results are aggregated through the processor’s on_batch_complete method

§Performance

Parallel processing typically scales linearly with the number of CPU cores for CPU-bound operations. For I/O-bound operations, the benefits depend on the underlying storage system.

§Examples

use ibu::{MmapReader, ParallelProcessor, ParallelReader, Record};
use std::sync::{Arc, Mutex};

#[derive(Clone, Default)]
struct SimpleCounter {
    local: u64,
    global: Arc<Mutex<u64>>,
}

impl ParallelProcessor for SimpleCounter {
    fn process_record(&mut self, _record: Record) -> ibu::Result<()> {
        self.local += 1;
        Ok(())
    }

    fn on_batch_complete(&mut self) -> ibu::Result<()> {
        *self.global.lock().unwrap() += self.local;
        self.local = 0;
        Ok(())
    }
}

let reader = MmapReader::new("data.ibu")?;
let counter = SimpleCounter::default();

// Process with 4 threads
reader.process_parallel(counter.clone(), 4)?;

// Check results
let total = *counter.global.lock().unwrap();
println!("Processed {} records", total);

Required Methods§

Source

fn process_parallel<P: ParallelProcessor + Clone + 'static>( &self, processor: P, num_threads: usize, ) -> Result<()>

Processes all records in parallel using the specified processor.

Divides the records across the specified number of threads and processes them in parallel. Each thread gets its own clone of the processor.

§Arguments
  • processor - The processor to use for handling records
  • num_threads - Number of threads to use (0 = use all available cores)
§Errors

Returns an error if:

  • Any thread encounters a processing error
  • Thread creation or coordination fails
  • The processor returns an error from process_record or on_batch_complete
§Examples
use ibu::{MmapReader, ParallelProcessor, ParallelReader, Record};

#[derive(Clone, Default)]
struct NoOpProcessor;

impl ParallelProcessor for NoOpProcessor {
    fn process_record(&mut self, _record: Record) -> ibu::Result<()> {
        Ok(()) // Do nothing
    }
}

let reader = MmapReader::new("data.ibu")?;
let processor = NoOpProcessor::default();

// Use all available cores
reader.process_parallel(processor, 0)?;

Dyn Compatibility§

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.

Implementors§