Module batch

Batch processing for multiple HEDL files with parallel execution and progress reporting.

This module provides efficient batch processing capabilities for operations on multiple HEDL files. It uses Rayon for parallel processing when beneficial and provides real-time progress reporting with detailed error tracking.

§Features

  • Parallel Processing: Automatic parallelization using Rayon’s work-stealing scheduler
  • Progress Reporting: Real-time progress with file counts and success/failure tracking
  • Error Resilience: Continues processing on errors, collecting all failures for reporting
  • Performance Optimization: Serial or parallel mode selected automatically from the batch size (see Performance Characteristics)
  • Type Safety: Strongly typed operation definitions with compile-time guarantees

§Architecture

The batch processing system uses a functional architecture with:

  • Operation trait for extensible batch operations
  • Result aggregation with detailed error context
  • Atomic counters for thread-safe progress tracking
  • Zero-copy file path handling
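
The trait-plus-aggregation design listed above can be illustrated with a minimal, std-only sketch. Every name here (`Operation`, `ExtensionCheck`, `Summary`, `run_all`) is hypothetical and chosen for illustration; the module's real `BatchOperation` trait and `BatchResults` type may look quite different.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for an operation trait (not the module's real trait).
trait Operation: Send + Sync {
    type Output;
    fn run(&self, path: &Path) -> Result<Self::Output, String>;
}

/// Toy operation: accepts only paths with a `.hedl` extension.
struct ExtensionCheck;

impl Operation for ExtensionCheck {
    type Output = ();

    fn run(&self, path: &Path) -> Result<(), String> {
        match path.extension().and_then(|ext| ext.to_str()) {
            Some("hedl") => Ok(()),
            _ => Err("not a .hedl file".to_string()),
        }
    }
}

/// Aggregated outcome: each failure keeps its path and error message.
struct Summary {
    succeeded: usize,
    failures: Vec<(PathBuf, String)>,
}

/// Runs the operation on every file and keeps going after an error.
fn run_all<O: Operation>(op: &O, files: &[PathBuf]) -> Summary {
    let mut summary = Summary { succeeded: 0, failures: Vec::new() };
    for file in files {
        match op.run(file) {
            Ok(_) => summary.succeeded += 1,
            Err(msg) => summary.failures.push((file.clone(), msg)),
        }
    }
    summary
}

fn main() {
    let files = vec![PathBuf::from("a.hedl"), PathBuf::from("notes.txt")];
    let summary = run_all(&ExtensionCheck, &files);
    println!("{} succeeded, {} failed", summary.succeeded, summary.failures.len());
    for (path, msg) in &summary.failures {
        eprintln!("{}: {}", path.display(), msg);
    }
}
```

Keeping the failing path next to its message is what lets a caller report all failures at the end rather than stopping at the first one.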

§Examples

use hedl_cli::batch::{BatchExecutor, BatchConfig, ValidationOperation};
use std::path::PathBuf;

// Create a batch processor with default configuration
let processor = BatchExecutor::new(BatchConfig::default());

// Validate multiple files in parallel
let files = vec![
    PathBuf::from("file1.hedl"),
    PathBuf::from("file2.hedl"),
    PathBuf::from("file3.hedl"),
];

let operation = ValidationOperation { strict: true };
let results = processor.process(&files, operation, true)?;

println!("Processed {} files, {} succeeded, {} failed",
    results.total_files(),
    results.success_count(),
    results.failure_count()
);

§Performance Characteristics

  • Small batches (< 10 files): Serial processing to avoid overhead
  • Medium batches (10-100 files): Parallel with Rayon thread pool
  • Large batches (> 100 files): Chunked parallel processing with progress updates
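
As a rough illustration of the thresholds above, the selection could be expressed as a single match on the file count. The `Mode` enum and `choose_mode` function are assumptions made for this sketch, not part of the documented API, and the real executor may weigh other factors.

```rust
/// Hypothetical processing strategies mirroring the three tiers above.
#[derive(Debug, PartialEq, Eq)]
enum Mode {
    Serial,
    Parallel,
    ChunkedParallel,
}

/// Picks a strategy from the batch size alone (assumed policy).
fn choose_mode(file_count: usize) -> Mode {
    match file_count {
        0..=9 => Mode::Serial,         // fewer than 10 files
        10..=100 => Mode::Parallel,    // 10 to 100 files
        _ => Mode::ChunkedParallel,    // more than 100 files
    }
}

fn main() {
    for n in [3, 10, 100, 101] {
        println!("{} files -> {:?}", n, choose_mode(n));
    }
}
```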

§Thread Safety

All progress tracking uses atomic operations for lock-free concurrent access. Operations are required to be Send + Sync for parallel execution.
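
The lock-free counting described above can be sketched with standard-library atomics and scoped threads. This is an illustration only: it uses `std::thread::scope` instead of Rayon to stay dependency-free, and `Progress` and `process_in_threads` are hypothetical names. Odd numbers stand in for files that fail.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical progress tracker shared by all workers without a lock.
struct Progress {
    processed: AtomicUsize,
    failed: AtomicUsize,
}

/// Splits `items` across worker threads; returns (processed, failed).
fn process_in_threads(items: &[u32], workers: usize) -> (usize, usize) {
    let progress = Progress {
        processed: AtomicUsize::new(0),
        failed: AtomicUsize::new(0),
    };
    let workers = workers.max(1);
    // Ceiling division, clamped so `chunks` never receives 0.
    let chunk_size = ((items.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        for chunk in items.chunks(chunk_size) {
            let progress = &progress;
            scope.spawn(move || {
                for &item in chunk {
                    if item % 2 == 1 {
                        // Pretend odd items are files that failed.
                        progress.failed.fetch_add(1, Ordering::Relaxed);
                    }
                    progress.processed.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    }); // every spawned thread has been joined here

    (
        progress.processed.load(Ordering::Relaxed),
        progress.failed.load(Ordering::Relaxed),
    )
}

fn main() {
    let (processed, failed) = process_in_threads(&[1, 2, 3, 4, 5], 2);
    println!("processed {processed}, failed {failed}");
}
```

`Relaxed` ordering is enough for independent counters because the final values are only read after the scope has joined every worker.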

§Thread Pool Management

The batch processor supports two thread pool strategies:

§Global Thread Pool (Default)

When max_threads is None, operations use Rayon’s global thread pool:

  • No per-operation setup cost (no pool creation)
  • Shared across all Rayon operations in the process
  • Thread count defaults to the number of logical CPUs and can be overridden with the RAYON_NUM_THREADS environment variable

§Local Thread Pool (Isolated)

When max_threads is Some(n), each operation creates an isolated local pool:

  • Guaranteed thread count of exactly n threads
  • No global state pollution
  • Supports concurrent operations with different configurations
  • Small creation overhead (~0.5-1ms) and memory cost (~2-8MB per thread)

§Examples

use hedl_cli::batch::{BatchExecutor, BatchConfig};
use std::path::PathBuf;

// Run two batches concurrently, each with its own thread count
use std::thread;

let files: Vec<PathBuf> = vec!["a.hedl".into(), "b.hedl".into()];

let handle1 = thread::spawn(|| {
    let processor = BatchExecutor::new(BatchConfig {
        max_threads: Some(2),
        ..Default::default()
    });
    // Uses 2 threads
});

let handle2 = thread::spawn(|| {
    let processor = BatchExecutor::new(BatchConfig {
        max_threads: Some(4),
        ..Default::default()
    });
    // Uses 4 threads, isolated from handle1
});

// Wait for both batches to finish before continuing
handle1.join().unwrap();
handle2.join().unwrap();

Structs§

BatchConfig
Configuration for batch processing operations.
BatchExecutor
High-performance batch processor for HEDL files.
BatchResults
Aggregated results from a batch processing operation.
FileResult
Result of processing a single file in a batch operation.
FormatOperation
Batch format operation.
LintOperation
Batch lint operation.
StreamingValidationOperation
Streaming validation operation for memory-efficient validation of large files.
ValidationOperation
Batch validation operation.
ValidationStats
Statistics collected during streaming validation.

Traits§

BatchOperation
Trait for batch operations on HEDL files.
StreamingBatchOperation
Trait for streaming batch operations on HEDL files.

Functions§

get_max_batch_files
Get maximum batch files from environment variable or default.
validate_file_count
Validate file count against configured limit.
warn_large_batch
Warn if file count is large and suggest verbose mode.
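
The pattern of reading a limit from the environment with a fallback, then validating a count against it, might look like the sketch below. The variable name `EXAMPLE_MAX_BATCH_FILES`, the default of 10,000, and the function names are all assumptions made for illustration; they are not taken from this module, whose real names, defaults and signatures are not shown here.

```rust
use std::env;

/// Hypothetical default limit (not the module's actual default).
const DEFAULT_MAX_FILES: usize = 10_000;
/// Hypothetical environment variable name.
const LIMIT_VAR: &str = "EXAMPLE_MAX_BATCH_FILES";

/// Parses a raw setting, falling back to the default when it is
/// missing, non-numeric, or zero.
fn parse_limit(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(DEFAULT_MAX_FILES)
}

/// Reads the limit from the environment, or uses the default.
fn max_files_from_env() -> usize {
    parse_limit(env::var(LIMIT_VAR).ok().as_deref())
}

/// Rejects batches larger than the configured limit.
fn check_file_count(count: usize, limit: usize) -> Result<(), String> {
    if count > limit {
        Err(format!("{count} files exceeds the limit of {limit}"))
    } else {
        Ok(())
    }
}

fn main() {
    let limit = max_files_from_env();
    match check_file_count(3, limit) {
        Ok(()) => println!("3 files accepted (limit {limit})"),
        Err(msg) => eprintln!("rejected: {msg}"),
    }
}
```

Separating the parsing from the environment lookup keeps the fallback logic testable without mutating process-wide state.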