Module file_io_service


§File I/O Service Interface

Service trait for efficient file operations: chunked reading, memory mapping for large files, streaming, and async I/O. Chunk size, buffer size, and concurrency limits are configurable. The interface also covers checksum verification, metadata extraction, and error handling, and its operations are thread-safe. See the mdBook for configuration and optimization strategies.

When an operation fails, the service aims for:

  • Partial Results: Return the data read so far when possible
  • Resource Cleanup: Release file handles and buffers automatically on errors
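
A minimal sketch of the two points above, using only the standard library and a synchronous `Read` for brevity. `PartialRead`, `read_keep_partial`, and `FailAfter` are hypothetical names chosen for illustration; they are not part of this module, and the real service's async signatures and error types are not shown on this page.

```rust
use std::io::{self, Read};

/// Hypothetical result type: whatever was read, plus the error (if any)
/// that stopped the read early.
pub struct PartialRead {
    pub data: Vec<u8>,
    pub error: Option<io::Error>,
}

/// Reads until EOF or the first hard error, keeping the bytes read so far.
/// The reader is owned by this function, so it is dropped (and any
/// underlying handle released) on every return path.
pub fn read_keep_partial<R: Read>(mut reader: R, chunk_size: usize) -> PartialRead {
    let mut data = Vec::new();
    // Guard against a zero-sized buffer, which would look like EOF.
    let mut buf = vec![0u8; chunk_size.max(1)];
    loop {
        match reader.read(&mut buf) {
            Ok(0) => return PartialRead { data, error: None },
            Ok(n) => data.extend_from_slice(&buf[..n]),
            // Interrupted reads are retried rather than reported.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return PartialRead { data, error: Some(e) },
        }
    }
}

/// Test double: yields `remaining` bytes of `b'x'`, then fails.
pub struct FailAfter {
    pub remaining: usize,
}

impl Read for FailAfter {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.remaining == 0 {
            return Err(io::Error::new(io::ErrorKind::Other, "simulated disk error"));
        }
        let n = self.remaining.min(buf.len());
        buf[..n].fill(b'x');
        self.remaining -= n;
        Ok(n)
    }
}
```

Calling `read_keep_partial(FailAfter { remaining: 5 }, 2)` yields five bytes of data together with the simulated error, so the caller can decide whether the partial data is still useful.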

§Performance Considerations

§Memory Usage

  • Streaming: Process files without loading them entirely into memory
  • Memory Mapping: Map large files into memory instead of copying them into buffers
  • Buffer Management: Allocate buffers once and reuse them across reads
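
As an illustration of streaming with buffer reuse, the sketch below reads any `std::io::Read` source in fixed-size chunks through a single buffer. The function name `for_each_chunk` and its signature are assumptions made for this example; the actual trait is async and may be shaped differently.

```rust
use std::io::{self, Read};

/// Streams `reader` in chunks of at most `chunk_size` bytes, handing each
/// chunk to `on_chunk`. One buffer is allocated up front and reused.
/// Returns the total number of bytes read.
pub fn for_each_chunk<R: Read>(
    mut reader: R,
    chunk_size: usize,
    mut on_chunk: impl FnMut(&[u8]),
) -> io::Result<u64> {
    if chunk_size == 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "chunk_size must be non-zero",
        ));
    }
    let mut buf = vec![0u8; chunk_size];
    let mut total = 0u64;
    loop {
        // A single `read` may return fewer bytes than requested, so keep
        // filling until the chunk is full or the source is exhausted.
        let mut filled = 0;
        while filled < chunk_size {
            match reader.read(&mut buf[filled..]) {
                Ok(0) => break,
                Ok(n) => filled += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
        if filled == 0 {
            return Ok(total);
        }
        on_chunk(&buf[..filled]);
        total += filled as u64;
        if filled < chunk_size {
            // A short chunk means end of input was reached.
            return Ok(total);
        }
    }
}
```

With a 10-byte input and `chunk_size` of 4, the callback sees chunks of 4, 4, and 2 bytes, and peak memory stays at one chunk regardless of file size.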

§I/O Optimization

  • Sequential Access: Optimize for sequential file access patterns
  • Prefetching: Read ahead of the current position to hide I/O latency
  • Caching: Rely on the operating system's file system cache

§Integration

The file I/O service integrates with:

  • File Processor: Used by the file processor for chunk-based processing
  • Pipeline Service: Integrated into the pipeline processing workflow
  • Storage Systems: Abstracts various storage backend implementations
  • Monitoring: Provides metrics for I/O operations

§Thread Safety

The service interface is designed for thread safety:

  • Concurrent Operations: Safe concurrent access to file operations
  • Resource Sharing: Safe sharing of file handles and resources
  • State Management: Thread-safe state management
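
One common way to get thread-safe state is to keep counters in atomics and share them through an `Arc`. The sketch below assumes a hypothetical `SharedStats` type; the fields of the real `FileIOStats` are not shown on this page, and the service may use a different synchronization strategy.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical I/O counters that can be updated from many threads
/// through a shared reference.
#[derive(Default)]
pub struct SharedStats {
    bytes_read: AtomicU64,
    reads: AtomicU64,
}

impl SharedStats {
    /// Records one completed read of `bytes` bytes.
    pub fn record_read(&self, bytes: u64) {
        self.bytes_read.fetch_add(bytes, Ordering::Relaxed);
        self.reads.fetch_add(1, Ordering::Relaxed);
    }

    /// Returns `(bytes_read, reads)`.
    pub fn snapshot(&self) -> (u64, u64) {
        (
            self.bytes_read.load(Ordering::Relaxed),
            self.reads.load(Ordering::Relaxed),
        )
    }
}

/// Spawns `workers` threads that each record `reads_per_worker` reads,
/// then returns the final counters.
pub fn simulate_concurrent_reads(
    workers: usize,
    reads_per_worker: usize,
    bytes_per_read: u64,
) -> (u64, u64) {
    let stats = Arc::new(SharedStats::default());
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let stats = Arc::clone(&stats);
            thread::spawn(move || {
                for _ in 0..reads_per_worker {
                    stats.record_read(bytes_per_read);
                }
            })
        })
        .collect();
    // Joining the workers makes all of their updates visible here.
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    stats.snapshot()
}
```

Four workers recording 100 reads of 8 bytes each always produce `(3200, 400)`, with no locks and no lost updates.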

§Future Enhancements

Planned enhancements include:

  • Compression: Built-in compression for file operations
  • Encryption: Transparent encryption/decryption during I/O
  • Network Storage: Support for network-based storage systems
  • Caching: Intelligent caching layer for frequently accessed files

§Architecture Note - Infrastructure Port

Important: This service trait is async and represents an infrastructure port, not a pure domain service. This is an intentional exception to the “domain traits should be sync” principle.

§Why FileIOService is Async

File I/O operations are inherently I/O-bound, not CPU-bound:

  • I/O-Bound Operations: File operations involve waiting for disk I/O
  • Non-Blocking Benefits: Async I/O prevents blocking the runtime
  • tokio Integration: Async file operations integrate naturally with tokio
  • Performance: Async I/O provides better concurrency for I/O operations

§Architectural Classification

This trait is classified as an infrastructure port rather than a domain service:

  • Domain Services: CPU-bound business logic (compression, encryption, checksums)
  • Infrastructure Ports: I/O-bound operations (file I/O, network, database)
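
The split can be sketched as a synchronous trait for CPU-bound logic next to an async trait for I/O. Everything here is hypothetical: `ChecksumService`, `FileReadPort`, `ByteSum`, and `InMemoryFiles` are illustrative names, the boxed-future return type is only one way to declare async trait methods, and the tiny `block_on` loop stands in for tokio so the example needs nothing beyond the standard library.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::io;
use std::path::{Path, PathBuf};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Boxed future, the shape that async trait methods are often lowered to.
pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Domain service (hypothetical): CPU-bound, so it stays synchronous.
pub trait ChecksumService {
    fn checksum(&self, data: &[u8]) -> u32;
}

/// Infrastructure port (hypothetical): I/O-bound, so it is async.
pub trait FileReadPort: Send + Sync {
    fn read_all<'a>(&'a self, path: &'a Path) -> BoxFuture<'a, io::Result<Vec<u8>>>;
}

/// Toy checksum: wrapping sum of bytes. Not a real integrity check.
pub struct ByteSum;

impl ChecksumService for ByteSum {
    fn checksum(&self, data: &[u8]) -> u32 {
        data.iter().fold(0u32, |acc, b| acc.wrapping_add(u32::from(*b)))
    }
}

/// In-memory adapter standing in for a disk-backed implementation.
pub struct InMemoryFiles(pub HashMap<PathBuf, Vec<u8>>);

impl FileReadPort for InMemoryFiles {
    fn read_all<'a>(&'a self, path: &'a Path) -> BoxFuture<'a, io::Result<Vec<u8>>> {
        Box::pin(async move {
            self.0
                .get(path)
                .cloned()
                .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
        })
    }
}

/// Waker that does nothing; enough for futures that never park.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal polling executor for the sketch. A real application would
/// drive the future on its async runtime instead.
pub fn block_on<T>(mut fut: BoxFuture<'_, T>) -> T {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

The domain side (`ByteSum`) can be called from any context without a runtime, while the port can be swapped for a disk, network, or in-memory adapter without touching the checksum logic.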

§Design Trade-offs

We considered making this sync (using std::fs) but chose async because:

  1. Most of the application runs on the tokio async runtime
  2. File operations benefit from non-blocking I/O
  3. The alternative would be a blocking thread pool, which adds complexity
  4. The trait is already an infrastructure concern (port/interface)

§References

See REFACTORING_STATUS.md Phase 1, item 2 for full discussion.

Structs§

  • FileIOConfig: Configuration for file I/O operations
  • FileIOStats: Statistics for file I/O operations
  • FileInfo: Information about a file being processed
  • ReadOptions: Options for reading files
  • ReadResult: Result of a file read operation
  • WriteOptions: Options for writing files
  • WriteResult: Result of a file write operation

Traits§

  • FileIOService: Trait for file I/O operations with memory mapping support