§File I/O Service Interface
Domain service trait for efficient file operations. It supports chunked reading, memory mapping for large files, streaming, and async I/O, with configurable chunk size, buffer size, and concurrency limits. It also provides checksum verification, metadata extraction, and error handling, and its operations are thread-safe. See the mdBook for configuration and optimization strategies.
- Partial Results: Return partial results when possible
- Resource Cleanup: Automatic cleanup on errors
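The two error-handling points above can be illustrated with a minimal sketch. It assumes a reader-based helper that keeps whatever bytes arrived before a failure; `PartialRead`, `read_with_partial`, and `FailAfter` are hypothetical names for illustration and are not part of this module's API.

```rust
use std::io::{self, Read};

/// Hypothetical result type: the bytes read so far plus the error, if any.
pub struct PartialRead {
    pub data: Vec<u8>,
    pub error: Option<io::Error>,
}

/// Reads until end of input or the first hard error, keeping what was read.
pub fn read_with_partial<R: Read>(mut reader: R, chunk_size: usize) -> PartialRead {
    let mut data = Vec::new();
    // Guard against a zero-sized buffer, which would look like end of input.
    let mut buf = vec![0u8; chunk_size.max(1)];
    loop {
        match reader.read(&mut buf) {
            Ok(0) => return PartialRead { data, error: None },
            Ok(n) => data.extend_from_slice(&buf[..n]),
            // An interrupted read is transient, so try again.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            // Any other error ends the read but keeps the partial data.
            Err(e) => return PartialRead { data, error: Some(e) },
        }
    }
    // `reader` is dropped on every return path, which is where a real
    // file handle would be closed (cleanup via RAII).
}

/// Hypothetical reader that yields `good` and then fails, to simulate a fault.
pub struct FailAfter<'a> {
    pub good: &'a [u8],
}

impl Read for FailAfter<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.good.is_empty() {
            return Err(io::Error::new(io::ErrorKind::Other, "simulated disk error"));
        }
        // `&[u8]` implements `Read` and advances itself past the bytes it returns.
        self.good.read(buf)
    }
}
```

Calling `read_with_partial(FailAfter { good: b"abcdef" }, 4)` returns the six bytes together with the simulated error, so the caller can decide whether the partial data is usable.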
§Performance Considerations
§Memory Usage
- Streaming: Process files without loading entirely into memory
- Memory Mapping: Efficient memory usage for large files
- Buffer Management: Efficient buffer allocation and reuse
§I/O Optimization
- Sequential Access: Optimize for sequential file access patterns
- Prefetching: Intelligent prefetching for better performance
- Caching: File system cache utilization
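As a rough illustration of streaming with buffer reuse and sequential access, the sketch below reads any `std::io::Read` source in fixed-size chunks through a single reused buffer. `process_in_chunks` is a hypothetical helper, not a method of the service trait, and it uses only the standard library.

```rust
use std::io::{self, Read};

/// Reads `reader` sequentially in chunks of at most `chunk_size` bytes and
/// passes each chunk to `on_chunk`. Returns the total number of bytes read.
/// Hypothetical helper for illustration only.
pub fn process_in_chunks<R: Read>(
    mut reader: R,
    chunk_size: usize,
    mut on_chunk: impl FnMut(&[u8]),
) -> io::Result<u64> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    // One buffer is allocated up front and reused for every chunk, so memory
    // use stays at `chunk_size` regardless of the input size.
    let mut buf = vec![0u8; chunk_size];
    let mut total = 0u64;
    loop {
        match reader.read(&mut buf) {
            Ok(0) => return Ok(total), // end of input
            Ok(n) => {
                on_chunk(&buf[..n]);
                total += n as u64;
            }
            // An interrupted read is transient, so try again.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

For a file on disk, the same helper can be called as `process_in_chunks(std::fs::File::open(path)?, 64 * 1024, |chunk| { /* ... */ })`; the 64 KiB chunk size here is an arbitrary example value, not a documented default.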
§Integration
The file I/O service integrates with:
- File Processor: Used by the file processor for chunk-based processing
- Pipeline Service: Integrated into the pipeline processing workflow
- Storage Systems: Abstracts various storage backend implementations
- Monitoring: Provides metrics for I/O operations
§Thread Safety
The service interface is designed for thread safety:
- Concurrent Operations: Safe concurrent access to file operations
- Resource Sharing: Safe sharing of file handles and resources
- State Management: Thread-safe state management
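One common way to get thread-safe state of the kind listed above is to keep counters in atomics and share them through an `Arc`. The sketch below assumes that approach. `SharedStats` and its fields are hypothetical, since the actual fields of `FileIOStats` are not shown on this page.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical I/O counters that can be updated through a shared reference.
#[derive(Default)]
pub struct SharedStats {
    bytes_read: AtomicU64,
    reads: AtomicU64,
}

impl SharedStats {
    /// Records one read of `bytes` bytes without taking a lock.
    pub fn record_read(&self, bytes: u64) {
        self.bytes_read.fetch_add(bytes, Ordering::Relaxed);
        self.reads.fetch_add(1, Ordering::Relaxed);
    }

    /// Returns `(total bytes read, number of reads)`.
    pub fn snapshot(&self) -> (u64, u64) {
        (
            self.bytes_read.load(Ordering::Relaxed),
            self.reads.load(Ordering::Relaxed),
        )
    }
}

/// Spawns `workers` threads that each record `reads_per_worker` reads of
/// `bytes` bytes, then returns the final snapshot.
pub fn simulate_concurrent_reads(workers: u64, reads_per_worker: u64, bytes: u64) -> (u64, u64) {
    let stats = Arc::new(SharedStats::default());
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let stats = Arc::clone(&stats);
            thread::spawn(move || {
                for _ in 0..reads_per_worker {
                    stats.record_read(bytes);
                }
            })
        })
        .collect();
    // Joining every worker guarantees all updates are visible to the snapshot.
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    stats.snapshot()
}
```

Relaxed ordering is enough here because the counters are independent and only read after the threads are joined. State that must change together would need a lock or a stronger ordering.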
§Future Enhancements
Planned enhancements include:
- Compression: Built-in compression for file operations
- Encryption: Transparent encryption/decryption during I/O
- Network Storage: Support for network-based storage systems
- Caching: Intelligent caching layer for frequently accessed files
§Architecture Note - Infrastructure Port
Important: This service trait is async and represents an infrastructure port, not a pure domain service. This is an intentional exception to the “domain traits should be sync” principle.
§Why FileIOService is Async
File I/O operations are inherently I/O-bound, not CPU-bound:
- I/O-Bound Operations: File operations involve waiting for disk I/O
- Non-Blocking Benefits: Async I/O prevents blocking the runtime
- tokio Integration: Async file operations integrate naturally with tokio
- Performance: Async I/O provides better concurrency for I/O operations
§Architectural Classification
This trait is classified as an infrastructure port rather than a domain service:
- Domain Services: CPU-bound business logic (compression, encryption, checksums)
- Infrastructure Ports: I/O-bound operations (file I/O, network, database)
§Design Trade-offs
We considered making this sync (using std::fs) but chose async because:
- Most of the application uses the tokio async runtime
- File operations benefit from non-blocking I/O
- The alternative would be to run sync calls on a blocking thread pool, which adds complexity
- The trait is already an infrastructure concern (port/interface)
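To make the rejected alternative concrete, here is a minimal sketch of offloading a blocking `std::fs` read to another thread and handing the result back over a channel. It uses a single thread instead of a pool and only the standard library, so it stays self-contained. `read_on_worker` is a hypothetical name, and in a tokio application the usual tool for this job would be `tokio::task::spawn_blocking`.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::mpsc;
use std::thread;

/// Starts a blocking read of `path` on a separate thread and returns a
/// receiver that will yield the result. Hypothetical helper for illustration.
pub fn read_on_worker(path: PathBuf) -> mpsc::Receiver<io::Result<Vec<u8>>> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The blocking call runs here, off the caller's thread.
        let result = fs::read(&path);
        // The caller may have dropped the receiver; ignore that case.
        let _ = tx.send(result);
    });
    rx
}
```

Even in this small form the caller has to manage a thread, a channel, and two layers of errors (channel failure and I/O failure), which is the extra complexity the list above refers to.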
§References
See REFACTORING_STATUS.md Phase 1, item 2 for full discussion.
Structs§
- FileIOConfig - Configuration for file I/O operations
- FileIOStats - Statistics for file I/O operations
- FileInfo - Information about a file being processed
- ReadOptions - Options for reading files
- ReadResult - Result of a file read operation
- WriteOptions - Options for writing files
- WriteResult - Result of a file write operation
Traits§
- FileIOService - Trait for file I/O operations with memory mapping support