embeddenator-io
Comprehensive I/O library for the Embeddenator ecosystem. Provides serialization, buffering, streaming, and compression utilities for efficient data handling.
Independent component extracted from the Embeddenator monolithic repository. Part of the Embeddenator workspace.
Repository: https://github.com/tzervas/embeddenator-io
Features
- Serialization: Binary (bincode) and JSON formats
- Buffering: Configurable buffer sizes for optimal performance
- Streaming: Memory-efficient processing of large datasets
- Compression: Zstandard and LZ4 support (optional)
- Async Support: Tokio-based async I/O (optional)
- Envelope Format: Binary container with compression metadata
Status
Production Ready - Fully tested and documented I/O component.
Usage
[dependencies]
embeddenator-io = { path = "../embeddenator-io" }
# Enable async support
embeddenator-io = { path = "../embeddenator-io", features = ["async"] }
# Enable compression
embeddenator-io = { path = "../embeddenator-io", features = ["compression-zstd", "compression-lz4"] }
Quick Examples
Serialization
use *;
// Write data to file in bincode format
let data = vec!;
write_bincode_file?;
// Read data from file
let loaded: = read_bincode_file?;
// JSON format
write_json_file?;
let loaded_json: = read_json_file?;
Buffering
use *;
use std::fs::File;
// Buffered file reading
let file = open?;
let mut reader = buffered_reader;
// Read in chunks
let chunks = read_chunks?;
// Efficient copy
let mut src = open?;
let mut dst = create?;
copy_buffered?;
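For readers who want to see what chunked reading involves, here is a minimal standard-library sketch. It is not the crate's `read_chunks`: the helper name `read_in_chunks`, its signature, and its return type are assumptions made for illustration only.

```rust
use std::io::{self, Read};

/// Reads `reader` to the end and returns its contents split into chunks of
/// at most `chunk_size` bytes.
/// Hypothetical helper for illustration; not part of embeddenator-io.
fn read_in_chunks<R: Read>(mut reader: R, chunk_size: usize) -> io::Result<Vec<Vec<u8>>> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut chunks = Vec::new();
    loop {
        let mut chunk = Vec::with_capacity(chunk_size);
        // `take` caps how many bytes a single chunk may pull from the reader.
        let n = reader
            .by_ref()
            .take(chunk_size as u64)
            .read_to_end(&mut chunk)?;
        if n == 0 {
            break; // end of input
        }
        chunks.push(chunk);
    }
    Ok(chunks)
}
```

Every chunk except possibly the last has exactly `chunk_size` bytes. Note that this version collects all chunks in memory, so it suits inputs that fit in RAM; the streaming approach in the next section avoids that.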
Streaming
use *;
use std::fs::File;
// Stream large file
let file = open?;
let mut stream = new;
// Count bytes without loading all data
let total = stream.count_bytes?;
// Fold operation for aggregation
let sum = stream.fold?;
// Stream writer
let mut output = Vec::new();
let mut writer = new;
writer.write_chunk?;
writer.write_chunk?;
writer.flush?;
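The idea behind counting and folding without loading all data can be sketched with only `std::io`. The functions below are hypothetical stand-ins, not the crate's stream type or its `count_bytes` and `fold` methods; the 8 KiB chunk size is likewise an arbitrary choice for the example.

```rust
use std::io::{self, Read};

/// Folds over a reader chunk by chunk, keeping at most `chunk_size` bytes
/// of input in memory at any time.
/// Hypothetical helper for illustration; not part of embeddenator-io.
fn fold_chunks<R, T, F>(mut reader: R, chunk_size: usize, init: T, mut f: F) -> io::Result<T>
where
    R: Read,
    F: FnMut(T, &[u8]) -> T,
{
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut buf = vec![0u8; chunk_size];
    let mut acc = init;
    loop {
        match reader.read(&mut buf) {
            Ok(0) => return Ok(acc), // end of input
            Ok(n) => acc = f(acc, &buf[..n]),
            // A read interrupted by a signal is safe to retry.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}

/// Counts the bytes in a reader by folding over its chunks.
fn count_bytes<R: Read>(reader: R) -> io::Result<u64> {
    fold_chunks(reader, 8 * 1024, 0u64, |total, chunk| total + chunk.len() as u64)
}
```

Because the same buffer is reused for every chunk, memory use stays constant regardless of input size.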
Envelope Format
use *;
// Wrap data with compression
let data = b"Some data to compress";
let wrapped = wrap_or_legacy?;
// Unwrap automatically detects format
let unwrapped = unwrap_auto?;
assert_eq!;
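To make the idea of a binary container with compression metadata concrete, here is a small sketch of one possible layout. Everything in it is an assumption for illustration: the `ENV1` magic bytes, the header fields, the `Codec` ids, and the function names are hypothetical and do not describe the crate's actual on-disk format. The payload is stored as given; the compression step itself is left out.

```rust
/// Hypothetical 4-byte magic marking a wrapped payload.
const MAGIC: [u8; 4] = *b"ENV1";
/// Header layout (assumed): magic (4) + codec id (1) + original length (8, little-endian).
const HEADER_LEN: usize = 4 + 1 + 8;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Codec {
    None = 0,
    Zstd = 1,
    Lz4 = 2,
}

/// Prepends the header to an (already encoded) payload.
fn wrap(codec: Codec, original_len: u64, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.extend_from_slice(&MAGIC);
    out.push(codec as u8);
    out.extend_from_slice(&original_len.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Splits input into (codec, original length, payload).
/// Input without the magic prefix is treated as legacy, unwrapped data.
fn parse_envelope(bytes: &[u8]) -> Result<(Codec, u64, &[u8]), String> {
    if bytes.len() < HEADER_LEN || bytes[..4] != MAGIC {
        return Ok((Codec::None, bytes.len() as u64, bytes));
    }
    let codec = match bytes[4] {
        0 => Codec::None,
        1 => Codec::Zstd,
        2 => Codec::Lz4,
        other => return Err(format!("unknown codec id {}", other)),
    };
    let mut len = [0u8; 8];
    len.copy_from_slice(&bytes[5..HEADER_LEN]);
    Ok((codec, u64::from_le_bytes(len), &bytes[HEADER_LEN..]))
}
```

Recording the codec and the original length in the header is what lets a reader pick the right decompressor and pre-size its output buffer, and the magic check is what allows older, unwrapped data to keep working.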
Development
# Build
cargo build
# Run all tests
cargo test
# Run with all features
cargo test --all-features
# Build documentation
cargo doc
Testing
- Unit Tests: 12 tests covering core functionality
- Integration Tests: 16 comprehensive end-to-end tests
- Documentation Tests: 18 examples in documentation
- Total: 46 tests (100% passing)
Performance
Buffer Sizes
- SMALL_BUFFER_SIZE (4KB): Small files, memory-constrained
- DEFAULT_BUFFER_SIZE (64KB): Optimal for most use cases
- LARGE_BUFFER_SIZE (1MB): Large file operations
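As a rough illustration of how a buffer size feeds into a copy, the sketch below uses only the standard library. The constants are redefined locally with the sizes listed above, assuming binary units (1KB = 1024 bytes); `copy_with_buffer` is a hypothetical helper and not the crate's `copy_buffered`.

```rust
use std::io::{self, BufReader, BufWriter, Read, Write};

// Local stand-ins mirroring the sizes listed above (binary units assumed).
const SMALL_BUFFER_SIZE: usize = 4 * 1024;
const DEFAULT_BUFFER_SIZE: usize = 64 * 1024;
const LARGE_BUFFER_SIZE: usize = 1024 * 1024;

/// Copies everything from `src` to `dst` through buffers of `buf_size` bytes
/// and returns the number of bytes copied.
/// Hypothetical helper for illustration; not part of embeddenator-io.
fn copy_with_buffer<R: Read, W: Write>(src: R, dst: W, buf_size: usize) -> io::Result<u64> {
    let mut reader = BufReader::with_capacity(buf_size, src);
    let mut writer = BufWriter::with_capacity(buf_size, dst);
    let copied = io::copy(&mut reader, &mut writer)?;
    // Flush explicitly so write errors surface here rather than being lost on drop.
    writer.flush()?;
    Ok(copied)
}
```

Larger buffers reduce the number of system calls for big files at the cost of more memory per open stream, which is the trade-off the three sizes represent.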
Format Comparison
| Format | Size | Speed | Precision | Human Readable |
|---|---|---|---|---|
| Bincode | Small | Fast | Exact | No |
| JSON | Large | Slower | Approximate | Yes |
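The size column can be illustrated without any dependencies. The sketch below compares a fixed-width binary encoding with a JSON-style text encoding of the same integers. It is a simplification and an assumption on both sides: real bincode output also carries a length prefix and has configurable integer encoding, and the crate's own functions are not used here.

```rust
/// Fixed-width binary encoding: 4 little-endian bytes per value.
/// A simplified stand-in for a binary format, not bincode itself.
fn encode_binary(values: &[u32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// JSON-style text encoding, e.g. `[1,2,3]`.
fn encode_json_text(values: &[u32]) -> String {
    let items: Vec<String> = values.iter().map(|v| v.to_string()).collect();
    format!("[{}]", items.join(","))
}
```

For values such as `1_000_000`, each number costs 4 bytes in binary but 7 digits plus a separator as text. For very small numbers the text form can be shorter than a fixed-width encoding, so the table describes the typical case rather than a guarantee.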
Architecture
See ADR-016 for component decomposition rationale.
See MIGRATION_SUMMARY.md for detailed migration information.
License
MIT