Batch processing for multiple HEDL files with parallel execution and progress reporting.
This module provides efficient batch processing capabilities for operations on multiple HEDL files. It uses Rayon for parallel processing when beneficial and provides real-time progress reporting with detailed error tracking.
§Features
- Parallel Processing: Automatic parallelization using Rayon’s work-stealing scheduler
- Progress Reporting: Real-time progress with file counts and success/failure tracking
- Error Resilience: Continues processing on errors, collecting all failures for reporting
- Performance Optimization: Automatic choice between serial and parallel mode based on batch size
- Type Safety: Strongly typed operation definitions with compile-time guarantees
§Architecture
The batch processing system uses a functional architecture with:
- Operation trait for extensible batch operations
- Result aggregation with detailed error context
- Atomic counters for thread-safe progress tracking
- Zero-copy file path handling
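The architecture points above can be sketched in a few lines of standard-library Rust. This is an illustrative sketch only, not the module's actual definitions: the `Operation` trait, `FileOutcome`, `Outcomes`, `run_all`, and `ExtensionCheck` are hypothetical stand-ins chosen for the example. It shows an extensible operation trait, per-file results that keep their error context, and paths that are borrowed rather than cloned.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for an extensible batch operation.
/// `Send + Sync` mirrors the bound needed for parallel execution.
trait Operation: Send + Sync {
    type Output;
    fn run(&self, path: &Path) -> Result<Self::Output, String>;
}

/// Outcome for one file: a borrowed path plus a value or an error message.
struct FileOutcome<'a, T> {
    path: &'a Path,
    result: Result<T, String>,
}

/// Aggregated outcomes for a whole batch.
struct Outcomes<'a, T> {
    items: Vec<FileOutcome<'a, T>>,
}

impl<T> Outcomes<'_, T> {
    fn success_count(&self) -> usize {
        self.items.iter().filter(|o| o.result.is_ok()).count()
    }

    fn failure_count(&self) -> usize {
        self.items.len() - self.success_count()
    }
}

/// Runs `op` on every path and records failures instead of stopping early.
fn run_all<'a, O: Operation>(op: &O, paths: &'a [PathBuf]) -> Outcomes<'a, O::Output> {
    let items = paths
        .iter()
        .map(|p| FileOutcome { path: p.as_path(), result: op.run(p) })
        .collect();
    Outcomes { items }
}

/// Toy operation for the example: accepts only paths ending in `.hedl`.
struct ExtensionCheck;

impl Operation for ExtensionCheck {
    type Output = ();

    fn run(&self, path: &Path) -> Result<(), String> {
        match path.extension().and_then(|e| e.to_str()) {
            Some("hedl") => Ok(()),
            _ => Err(format!("{}: not a .hedl file", path.display())),
        }
    }
}
```

Because each failure is stored next to the path that produced it, a caller can report every problem after the batch finishes rather than only the first one.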
§Examples
use hedl_cli::batch::{BatchExecutor, BatchConfig, ValidationOperation};
use std::path::PathBuf;
// Create a batch processor with default configuration
let processor = BatchExecutor::new(BatchConfig::default());
// Validate multiple files in parallel
let files = vec![
    PathBuf::from("file1.hedl"),
    PathBuf::from("file2.hedl"),
    PathBuf::from("file3.hedl"),
];
let operation = ValidationOperation { strict: true };
let results = processor.process(&files, operation, true)?;
println!("Processed {} files, {} succeeded, {} failed",
    results.total_files(),
    results.success_count(),
    results.failure_count()
);
§Performance Characteristics
- Small batches (< 10 files): Serial processing to avoid overhead
- Medium batches (10-100 files): Parallel with Rayon thread pool
- Large batches (> 100 files): Chunked parallel processing with progress updates
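The thresholds above amount to a simple decision on the number of files. The sketch below is one way to express it, assuming the boundaries are exactly as listed (fewer than 10, 10 to 100 inclusive, more than 100). The `Mode` enum and `select_mode` function are hypothetical names for illustration and are not part of the documented API.

```rust
/// Hypothetical processing modes matching the three batch-size tiers.
#[derive(Debug, PartialEq, Eq)]
enum Mode {
    /// Fewer than 10 files: run on the calling thread.
    Serial,
    /// 10 to 100 files: spread the work over a thread pool.
    Parallel,
    /// More than 100 files: process in chunks with progress updates.
    ChunkedParallel,
}

/// Picks a mode from the file count using the thresholds listed above.
fn select_mode(file_count: usize) -> Mode {
    match file_count {
        0..=9 => Mode::Serial,
        10..=100 => Mode::Parallel,
        _ => Mode::ChunkedParallel,
    }
}
```

Keeping the decision in one small function makes the cut-off points easy to test and to tune if measurements suggest different thresholds.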
§Thread Safety
All progress tracking uses atomic operations for lock-free concurrent access. Operations are required to be Send + Sync for parallel execution.
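As a rough illustration of lock-free progress tracking, the sketch below shares two `AtomicUsize` counters between scoped standard-library threads. It uses `std::thread::scope` instead of Rayon so that it stays self-contained. The `Progress` type and `run_with_progress` function are hypothetical and only show the pattern: the closure must be `Send + Sync` because several threads call it through a shared reference.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical progress tracker backed by atomic counters (no locks).
#[derive(Default)]
struct Progress {
    processed: AtomicUsize,
    failed: AtomicUsize,
}

impl Progress {
    /// Records one finished item; `ok == false` also counts as a failure.
    fn record(&self, ok: bool) {
        self.processed.fetch_add(1, Ordering::Relaxed);
        if !ok {
            self.failed.fetch_add(1, Ordering::Relaxed);
        }
    }

    /// Returns `(processed, failed)` as currently observed.
    fn snapshot(&self) -> (usize, usize) {
        (
            self.processed.load(Ordering::Relaxed),
            self.failed.load(Ordering::Relaxed),
        )
    }
}

/// Splits `items` across up to `workers` threads and counts the results.
fn run_with_progress<F>(items: &[u32], workers: usize, check: F) -> (usize, usize)
where
    F: Fn(u32) -> bool + Send + Sync,
{
    let progress = Progress::default();
    let workers = workers.max(1);
    // Ceiling division; at least 1 because `chunks(0)` would panic.
    let chunk_len = ((items.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        for chunk in items.chunks(chunk_len) {
            let (progress, check) = (&progress, &check);
            s.spawn(move || {
                for &item in chunk {
                    progress.record(check(item));
                }
            });
        }
    });

    // All scoped threads have been joined here, so the counts are final.
    progress.snapshot()
}
```

`Ordering::Relaxed` is enough in this sketch because the counters are independent tallies and the final read happens after every worker thread has been joined.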
§Thread Pool Management
The batch processor supports two thread pool strategies:
§Global Thread Pool (Default)
When max_threads is None, operations use Rayon’s global thread pool:
- No pool-creation overhead
- Shared across all Rayon operations in the process
- Thread count typically matches CPU core count
§Local Thread Pool (Isolated)
When max_threads is Some(n), each operation creates an isolated local pool:
- Guaranteed thread count of exactly n threads
- No global state pollution
- Supports concurrent operations with different configurations
- Small creation overhead (~0.5-1ms) and memory cost (~2-8MB per thread)
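The choice between the two strategies depends only on whether `max_threads` is set. The sketch below models that decision with the standard library alone. `PoolChoice`, `choose_pool`, and `expected_threads` are hypothetical names, and the fallback to `std::thread::available_parallelism` is an assumption standing in for "thread count typically matches CPU core count". With Rayon itself, a local pool would normally be built through `rayon::ThreadPoolBuilder`, which is not used here.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Hypothetical description of which pool an operation would run on.
#[derive(Debug, PartialEq, Eq)]
enum PoolChoice {
    /// `max_threads` is `None`: reuse the process-wide pool.
    Global,
    /// `max_threads` is `Some(n)`: build an isolated pool with `n` threads.
    Local(usize),
}

/// Maps the `max_threads` setting to a pool strategy.
fn choose_pool(max_threads: Option<usize>) -> PoolChoice {
    match max_threads {
        None => PoolChoice::Global,
        Some(n) => PoolChoice::Local(n),
    }
}

/// Estimates the worker count for a choice. For the global pool this
/// assumes one thread per available CPU, falling back to 1 if unknown.
fn expected_threads(choice: &PoolChoice) -> usize {
    match choice {
        PoolChoice::Local(n) => *n,
        PoolChoice::Global => thread::available_parallelism()
            .map(NonZeroUsize::get)
            .unwrap_or(1),
    }
}
```

Separating the decision from pool construction keeps the creation cost of a local pool limited to callers that explicitly request a thread count.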
§Examples
use hedl_cli::batch::{BatchExecutor, BatchConfig};
use std::path::PathBuf;
// Concurrent operations with different thread counts
use std::thread;
let files: Vec<PathBuf> = vec!["a.hedl".into(), "b.hedl".into()];
let handle1 = thread::spawn(|| {
    let processor = BatchExecutor::new(BatchConfig {
        max_threads: Some(2),
        ..Default::default()
    });
    // Uses 2 threads
});
let handle2 = thread::spawn(|| {
    let processor = BatchExecutor::new(BatchConfig {
        max_threads: Some(4),
        ..Default::default()
    });
    // Uses 4 threads, isolated from handle1
});
Structs§
- BatchConfig: Configuration for batch processing operations.
- BatchExecutor: High-performance batch processor for HEDL files.
- BatchResults: Aggregated results from a batch processing operation.
- FileResult: Result of processing a single file in a batch operation.
- FormatOperation: Batch format operation.
- LintOperation: Batch lint operation.
- StreamingValidationOperation: Streaming validation operation for memory-efficient validation of large files.
- ValidationOperation: Batch validation operation.
- ValidationStats: Statistics collected during streaming validation.
Traits§
- BatchOperation: Trait for batch operations on HEDL files.
- StreamingBatchOperation: Trait for streaming batch operations on HEDL files.
Functions§
- get_max_batch_files: Get maximum batch files from environment variable or default.
- validate_file_count: Validate file count against configured limit.
- warn_large_batch: Warn if file count is large and suggest verbose mode.
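The limit helpers above follow a common pattern: read an optional override from the environment, fall back to a default, then check the batch size against the result. The sketch below shows that pattern under stated assumptions. The variable name `HEDL_MAX_BATCH_FILES`, the default of 10,000, and the functions `parse_limit`, `max_files_from_env`, and `check_file_count` are all hypothetical, since this page does not give the real names, values, or signatures.

```rust
use std::env;

/// Hypothetical default limit; the real default is not stated on this page.
const DEFAULT_MAX_FILES: usize = 10_000;

/// Parses an optional raw setting, falling back to the default when the
/// value is missing, not a number, or zero.
fn parse_limit(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(DEFAULT_MAX_FILES)
}

/// Reads the limit from a hypothetical environment variable.
fn max_files_from_env() -> usize {
    parse_limit(env::var("HEDL_MAX_BATCH_FILES").ok().as_deref())
}

/// Rejects batches that exceed `limit`, with a message naming both numbers.
fn check_file_count(count: usize, limit: usize) -> Result<(), String> {
    if count > limit {
        Err(format!("{count} files exceeds the limit of {limit}"))
    } else {
        Ok(())
    }
}
```

Splitting parsing from the environment lookup keeps the fallback logic testable without modifying process-wide state.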