# Streaming Encoding/Decoding
base-d supports streaming mode for processing large files without loading them entirely into memory. This is particularly useful when working with files larger than available RAM.
## Overview
Streaming mode processes data in 4KB chunks, making it memory-efficient for large files while maintaining the same encoding/decoding guarantees.
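The chunked processing described above can be pictured as a read loop over a single fixed-size buffer. The following is a minimal sketch using only the Rust standard library; `for_each_chunk` and `CHUNK_SIZE` are hypothetical names chosen for illustration and are not part of base-d's API.

```rust
use std::io::{self, Read};

/// Chunk size used in this sketch; matches the 4KB figure quoted above.
pub const CHUNK_SIZE: usize = 4096;

/// Reads `reader` to the end in chunks of at most CHUNK_SIZE bytes and hands
/// each chunk to `on_chunk`. Only one buffer is allocated, so memory use does
/// not grow with the input. Returns the total number of bytes processed.
pub fn for_each_chunk<R, F>(mut reader: R, mut on_chunk: F) -> io::Result<u64>
where
    R: Read,
    F: FnMut(&[u8]) -> io::Result<()>,
{
    let mut buf = [0u8; CHUNK_SIZE];
    let mut total = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // A read interrupted by a signal is simply retried.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        on_chunk(&buf[..n])?;
        total += n as u64;
    }
    Ok(total)
}
```

A caller would pass a `File` or `stdin` handle as `reader` and do the per-chunk encoding work inside the closure.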
## Usage
Add the `--stream` (or `-s`) flag to enable streaming mode:
```bash
# Stream encode a large file
base-d --stream -e base64 large_file.bin > encoded.txt
# Stream decode
base-d --stream -d base64 encoded.txt > decoded.bin
# Works with stdin/stdout
cat large_file.bin | base-d --stream -e base64 | base-d --stream -d base64 > output.bin
```
## Encoding Mode Support
### Chunked Mode (Full Streaming Support)
RFC 4648 encodings (base64, base32, base16) fully support streaming:
```bash
# Encode 1GB file with streaming
base-d --stream -e base64 huge_file.dat > encoded.txt
```
Benefits:
- Constant memory usage (~4KB buffer)
- No temporary files
- Progress can be monitored with `pv`
### Byte Range Mode (Full Streaming Support)
Direct byte-to-character mapping also streams perfectly:
```bash
# base100 with streaming
base-d --stream -e base100 data.bin > emoji.txt
```
Benefits:
- 1:1 byte mapping
- Minimal overhead
- Perfect for emoji encodings
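To illustrate why a 1:1 byte mapping streams so easily, here is a small sketch in which every byte maps to the code point `start + byte`. The function names and the `start` parameter are assumptions made for this example; the dictionaries shipped with base-d may be defined differently.

```rust
/// Maps each byte to the character with code point `start + byte`.
/// Returns None if any resulting code point is not a valid `char`.
/// `start` is a hypothetical dictionary offset used only in this sketch.
pub fn encode_byte_range(input: &[u8], start: u32) -> Option<String> {
    input
        .iter()
        .map(|&b| char::from_u32(start + u32::from(b)))
        .collect()
}

/// Inverse mapping: every character must lie in `start..=start + 255`.
pub fn decode_byte_range(encoded: &str, start: u32) -> Option<Vec<u8>> {
    encoded
        .chars()
        .map(|c| {
            (c as u32)
                .checked_sub(start)
                .filter(|&d| d <= 0xFF)
                .map(|d| d as u8)
        })
        .collect()
}
```

Because each output character depends on exactly one input byte, encoding two chunks separately and concatenating the results gives the same text as encoding the whole input at once.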
### Mathematical Base Conversion (Limited Support)
Mathematical mode treats data as a single large number, so it requires the entire input:
```bash
# This will still load the entire file into memory
base-d --stream -e cards large_file.bin > output.txt
```
While the `--stream` flag is accepted, mathematical mode (used by cards, dna, etc.) will read the entire input before encoding.
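The reason for this limitation can be seen in a toy base conversion. The sketch below treats the input as one big-endian integer and extracts digits by repeated long division; `to_base_digits` is a hypothetical helper for illustration, not base-d's implementation, and it ignores leading zero bytes for brevity.

```rust
/// Converts big-endian `bytes`, read as a single unsigned integer, into
/// digits of `base` (most significant digit first) by repeated long division.
pub fn to_base_digits(bytes: &[u8], base: u32) -> Vec<u32> {
    assert!(base >= 2, "base must be at least 2");
    let base = u64::from(base);
    let mut number = bytes.to_vec();
    let mut digits = Vec::new();
    // Each pass divides the whole number by `base` and yields one digit,
    // so every pass has to touch every remaining byte of the input.
    while number.iter().any(|&b| b != 0) {
        let mut remainder = 0u64;
        for byte in number.iter_mut() {
            let acc = remainder * 256 + u64::from(*byte);
            *byte = (acc / base) as u8; // quotient is always below 256
            remainder = acc % base;
        }
        digits.push(remainder as u32);
    }
    digits.reverse();
    digits
}
```

For example, the single byte `0x01` yields the decimal digit `1`, while the two bytes `0x01 0x00` yield `2, 5, 6`: appending one byte changes every digit, so no output can be emitted before the last byte has been read.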
## Performance Comparison
### Memory Usage
| Encoding | Standard Mode | Streaming Mode |
|----------|---------------|----------------|
| base64 (10MB file) | 10MB | ~4KB |
| base64 (1GB file) | 1GB | ~4KB |
| base100 (10MB file) | 10MB | ~4KB |
| cards (any size) | Full file size | Full file size* |
*Mathematical mode always requires full input
### Speed
Streaming mode has minimal performance overhead:
```bash
# Benchmark encoding 100MB file
$ time base-d -e base64 100mb.bin > /dev/null
real 0m0.45s
$ time base-d --stream -e base64 100mb.bin > /dev/null
real 0m0.47s
```
In this run the overhead is about 4% (0.47s vs 0.45s), a small cost for the memory savings.
## xxHash Configuration in Streaming
When using streaming mode with hashing enabled, you can configure xxHash algorithms with custom seeds and secrets (for XXH3 variants). This is particularly useful for large files where you want consistent hash customization without loading the entire file into memory.
### CLI Usage
```bash
# Stream encode with xxHash and custom seed
base-d --stream -e base64 --hash xxhash64 --hash-seed 42 large_file.bin > encoded.txt
# Stream with XXH3 secret
# Combine streaming, compression, and custom hash
base-d --stream --compress zstd --hash xxhash3-128 --hash-seed 999 multi_gb_file.bin > compressed.txt
```
### API Usage
```rust
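use base_d::{StreamingEncoder, XxHashConfig, HashAlgorithm, Dictionary};
use std::fs::File;

// `dictionary` is assumed to be built as shown in the "API Usage" section
// near the end of this document.
let mut large_file = File::open("large_file.bin")?;
let output = File::create("encoded.txt")?;

// Create streaming encoder with custom xxHash config
let xxhash_config = XxHashConfig::with_seed(42);
let mut encoder = StreamingEncoder::new(&dictionary, output)
    .with_xxhash_config(xxhash_config)
    .with_hash(HashAlgorithm::XxHash64);
encoder.encode(&mut large_file)?;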
```
The hash is computed incrementally as data streams through, so memory usage remains constant regardless of file size or hash configuration.
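Incremental hashing can be sketched with any hash whose state is updated byte by byte. The example below uses FNV-1a (64-bit) purely as a small stand-in for xxHash; the `IncrementalHash` type and its seed handling (XORing the seed into the initial state) are assumptions made for illustration and do not reflect how base-d or xxHash handle seeds.

```rust
/// FNV-1a 64-bit hasher, used here only as a compact stand-in for xxHash.
pub struct IncrementalHash {
    state: u64,
}

impl IncrementalHash {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;

    /// Hypothetical seeding scheme: XOR the seed into the initial state.
    pub fn with_seed(seed: u64) -> Self {
        Self { state: Self::OFFSET_BASIS ^ seed }
    }

    /// Folds one chunk into the running state. Only the 8-byte state is
    /// kept between calls, regardless of how much data has been seen.
    pub fn update(&mut self, chunk: &[u8]) {
        for &b in chunk {
            self.state ^= u64::from(b);
            self.state = self.state.wrapping_mul(Self::PRIME);
        }
    }

    /// Returns the hash of everything fed in so far.
    pub fn finish(&self) -> u64 {
        self.state
    }
}
```

Feeding the same bytes in 4KB chunks or in a single call produces the same result, which is the property that lets a streaming encoder hash data as it passes through.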
## When to Use Streaming
### Use Streaming When:
1. **Large files** - Files larger than available RAM
2. **Pipeline processing** - Data flows through multiple tools
3. **Memory constraints** - Running on low-memory systems
4. **Chunked/ByteRange modes** - Using base64, base32, base100, etc.
### Don't Use Streaming When:
1. **Small files** - Files under 1MB (overhead not worth it)
2. **Mathematical modes** - cards, dna, emoji_faces (no benefit)
3. **Random access needed** - Need to seek within encoded data
## Examples
### Example 1: Process Large Log File
```bash
# Encode a 5GB log file
base-d --stream -e base64 access.log > access.b64
# Decode back
base-d --stream -d base64 access.b64 > access.log
```
### Example 2: Pipeline with Other Tools
```bash
# Compress, encode, and upload
```
### Example 3: Monitor Progress
```bash
# Show progress while encoding
pv huge_file.bin | base-d --stream -e base64 > encoded.txt
```
### Example 4: Split Large Encoded File
```bash
# Encode and split into 100MB chunks
```
## Technical Details
### Chunk Size
Default chunk size is 4KB (4096 bytes), which balances:
- Memory efficiency
- I/O performance
- CPU cache usage
For chunked mode, chunks are aligned to encoding group boundaries to avoid padding issues.
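As an illustration of group alignment, the sketch below streams standard base64 with a read size rounded down to a multiple of 3 bytes (one base64 group), so `=` padding can only appear at the very end of the output. The 4095-byte figure and all function names are assumptions of this sketch, not a description of base-d's internals.

```rust
use std::io::{self, Read, Write};

const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Largest multiple of 3 that fits in 4096 bytes (assumed for this sketch).
const ALIGNED_CHUNK: usize = 4096 / 3 * 3; // 4095

/// Encodes one group of 1 to 3 bytes into 4 characters, padding with '='.
fn encode_group(group: &[u8], out: &mut Vec<u8>) {
    let b0 = group[0];
    let b1 = *group.get(1).unwrap_or(&0);
    let b2 = *group.get(2).unwrap_or(&0);
    out.push(ALPHABET[(b0 >> 2) as usize]);
    out.push(ALPHABET[(((b0 & 0x03) << 4) | (b1 >> 4)) as usize]);
    out.push(if group.len() > 1 {
        ALPHABET[(((b1 & 0x0f) << 2) | (b2 >> 6)) as usize]
    } else {
        b'='
    });
    out.push(if group.len() > 2 { ALPHABET[(b2 & 0x3f) as usize] } else { b'=' });
}

/// Fills `buf` as far as possible; a short count means end of input.
fn read_full<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break,
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Streams base64 from `reader` to `writer` one aligned chunk at a time.
pub fn encode_stream<R: Read, W: Write>(mut reader: R, mut writer: W) -> io::Result<()> {
    let mut buf = vec![0u8; ALIGNED_CHUNK];
    let mut out = Vec::with_capacity(ALIGNED_CHUNK / 3 * 4);
    loop {
        let n = read_full(&mut reader, &mut buf)?;
        if n == 0 {
            break;
        }
        out.clear();
        // Every full chunk holds whole 3-byte groups; only the final,
        // shorter chunk can end in a partial group that needs padding.
        for group in buf[..n].chunks(3) {
            encode_group(group, &mut out);
        }
        writer.write_all(&out)?;
        if n < buf.len() {
            break; // short read: end of input reached
        }
    }
    Ok(())
}
```

If the read size were not a multiple of the group size, a chunk boundary could fall inside a group and padding would be written mid-stream, which a decoder would reject or misread.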
### Buffer Management
- **Encoding**: Reads 4KB, encodes, writes immediately
- **Decoding**: Reads 4KB, accumulates complete character groups, decodes, writes
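The decoding side has to cope with chunks that end in the middle of a character group. A minimal sketch for base16, where a group is two characters, is shown below; `HexStreamDecoder` is a hypothetical type written for this example and is not base-d's `StreamingDecoder`.

```rust
/// Streaming base16 decoder that carries an unpaired character across chunks.
#[derive(Default)]
pub struct HexStreamDecoder {
    pending: Option<u8>, // high nibble waiting for its partner
    offset: usize,       // characters consumed so far, for error messages
}

/// Returns the 4-bit value of a hex digit, or None for any other byte.
fn nibble(c: u8) -> Option<u8> {
    match c {
        b'0'..=b'9' => Some(c - b'0'),
        b'a'..=b'f' => Some(c - b'a' + 10),
        b'A'..=b'F' => Some(c - b'A' + 10),
        _ => None,
    }
}

impl HexStreamDecoder {
    /// Decodes as many complete groups as `chunk` allows and appends them to
    /// `out`. Fails at the first invalid character without reading further.
    pub fn feed(&mut self, chunk: &[u8], out: &mut Vec<u8>) -> Result<(), String> {
        for &c in chunk {
            let value = nibble(c)
                .ok_or_else(|| format!("invalid character at offset {}", self.offset))?;
            self.offset += 1;
            match self.pending.take() {
                Some(high) => out.push((high << 4) | value),
                None => self.pending = Some(value),
            }
        }
        Ok(())
    }

    /// Call after the last chunk; fails if a group was left incomplete.
    pub fn finish(self) -> Result<(), String> {
        match self.pending {
            Some(_) => Err("input ended in the middle of a character group".to_string()),
            None => Ok(()),
        }
    }
}
```

Feeding `48656` and then `c6c6f` decodes to `Hello`, even though the chunk boundary splits the `6c` group, because the lone `6` is held in `pending` until the next chunk arrives.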
### Error Handling
Streaming mode provides the same error detection as standard mode:
```bash
```
Errors are detected as soon as invalid data is encountered, not after reading the entire input.
## Limitations
1. **Mathematical mode limitation**: No memory savings for base_conversion mode
2. **No random access**: Streaming is forward-only
3. **Progress reporting**: Standard mode can show a progress percentage; streaming mode cannot (an external tool such as `pv` can be used instead, as in Example 3)
## API Usage
For library users, streaming is available programmatically:
```rust
use base_d::{DictionariesConfig, Dictionary, StreamingEncoder, StreamingDecoder};
use std::fs::File;
// Encode
let config = DictionariesConfig::load_default()?;
let dict_config = config.get_dictionary("base64").unwrap();
let chars: Vec<char> = dict_config.chars.chars().collect();
let dictionary = Dictionary::new_with_mode(
    chars,
    dict_config.mode.clone(),
    dict_config.padding.as_ref().and_then(|s| s.chars().next()),
)?;
let mut input = File::open("large_file.bin")?;
let mut output = File::create("encoded.txt")?;
let mut encoder = StreamingEncoder::new(&dictionary, output);
encoder.encode(&mut input)?;
```
## See Also
- [Encoding Modes](ENCODING_MODES.md) - Understanding which modes support streaming
- [Performance Tips](../README.md#performance) - Optimization recommendations
- Issue #4 - Original feature request