engram-rs
A unified Rust library for creating, reading, and managing Engram archives - compressed, cryptographically signed archive files with embedded metadata and SQLite database support.
Features
- 📦 Compressed Archives: LZ4 (fast) and Zstd (high compression ratio) with automatic format selection
- 🔐 Cryptographic Signatures: Ed25519 signatures for authenticity and integrity verification
- 📋 Manifest System: JSON-based metadata with file registry, author info, and capabilities
- 💾 Virtual File System (VFS): Direct SQL queries on embedded SQLite databases without extraction
- ⚡ Fast Lookups: O(1) file access via central directory with 320-byte fixed entries
- ✅ Integrity Verification: CRC32 checksums for all files
- 🔒 Encryption Support: AES-256-GCM encryption (per-file or full-archive)
- 🎯 Frame-based Compression: Efficient handling of large files (≥50MB) with incremental decompression
- 🛡️ Battle-Tested: 166 tests covering security, performance, concurrency, and reliability
Installation
Add this to your Cargo.toml:
[dependencies]
engram-rs = "1.0"
Quick Start
Creating an Archive
use ;
Reading from an Archive
use engram_rs::ArchiveReader;
Working with Manifests and Signatures
use ;
use SigningKey;
use OsRng;
Querying Embedded SQLite Databases
use engram_rs::VfsReader;
Archive Format
Engram uses a custom binary format (v1.0) with the following structure:
┌─────────────────────────────────────────┐
│ File Header (64 bytes) │
│ - Magic: 0x89 'E' 'N' 'G' 0x0D 0x0A 0x1A 0x0A │
│ - Format version (major.minor) │
│ - Central directory offset/size │
│ - Entry count, content version │
│ - CRC32 checksum │
├─────────────────────────────────────────┤
│ Local Entry Header 1 (LOCA) │
│ Compressed File Data 1 │
├─────────────────────────────────────────┤
│ Local Entry Header 2 (LOCA) │
│ Compressed File Data 2 │
├─────────────────────────────────────────┤
│ ... │
├─────────────────────────────────────────┤
│ Central Directory │
│ - Entry 1 (320 bytes fixed) │
│ - Entry 2 (320 bytes fixed) │
│ - ... │
├─────────────────────────────────────────┤
│ End of Central Directory (ENDR) │
├─────────────────────────────────────────┤
│ manifest.json (optional) │
│ - Metadata, author, signatures │
└─────────────────────────────────────────┘
Key Features:
- Magic Number: PNG-style magic bytes for file type detection
- Fixed-Width Entries: 320-byte central directory entries enable O(1) file lookup
- Local Headers: Enable sequential streaming reads without central directory
- End-Placed Directory: Enables streaming creation without manifest foreknowledge
- Manifest: JSON metadata with Ed25519 signature support
See ENGRAM_SPECIFICATION.md for complete binary format specification.
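As a rough illustration of the layout above, the sketch below checks the 8-byte magic number and computes where a fixed-width central directory entry would start. The names `has_engram_magic` and `entry_offset` are hypothetical helpers written for this README, not part of the engram-rs API, and the remaining header fields are left out because their byte offsets are defined only in the specification.

```rust
/// Magic bytes from the header diagram: 0x89 'E' 'N' 'G' CR LF 0x1A LF.
const ENGRAM_MAGIC: [u8; 8] = [0x89, b'E', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// Size of one central directory entry, as stated in the format description.
const ENTRY_SIZE: u64 = 320;

/// Returns true if `bytes` begins with the Engram magic number.
/// (Hypothetical helper, not the crate's API.)
fn has_engram_magic(bytes: &[u8]) -> bool {
    bytes.starts_with(&ENGRAM_MAGIC)
}

/// Byte offset of entry `index`, given the central directory's start offset.
/// Fixed-width entries make this a single multiply-and-add.
/// Returns None if the arithmetic would overflow.
fn entry_offset(dir_offset: u64, index: u64) -> Option<u64> {
    index.checked_mul(ENTRY_SIZE)?.checked_add(dir_offset)
}
```

Because every entry is exactly 320 bytes, locating the n-th entry needs no scan of the entries before it.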
Compression
The library automatically selects compression based on file type and size:
| File Type | Size | Compression | Typical Ratio |
|---|---|---|---|
| Text files (.txt, .json, .md, etc.) | ≥ 4KB | Zstd (best ratio) | 50-100x |
| Binary files (.db, .wasm, etc.) | ≥ 4KB | LZ4 (fastest) | 2-5x |
| Already compressed (.png, .jpg, .zip, etc.) | Any | None | 1x |
| Small files | < 4KB | None | N/A |
| Large files | ≥ 50MB | Frame-based | Varies |
Compression Performance:
- Highly compressible data (zeros, patterns): 200-750x
- Text files (JSON, Markdown, code): 50-100x
- Mixed data: 50-100x
- Large files (≥50MB): Automatic 64KB frame compression
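The selection rules in the table can be sketched roughly as follows. This is an illustration only: `Codec`, `Plan`, and `choose_compression` are hypothetical names, the extension lists contain just the examples named in the table, and two points the table leaves open are assumptions here (unknown extensions fall back to LZ4, and frame-based handling applies only to files that are actually compressed).

```rust
use std::path::Path;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Codec {
    None,
    Lz4,
    Zstd,
}

#[derive(Debug, PartialEq, Clone, Copy)]
struct Plan {
    codec: Codec,
    /// True when the file would be split into frames (large-file path).
    framed: bool,
}

const MIN_COMPRESS_SIZE: u64 = 4 * 1024; // 4KB threshold from the table
const FRAME_THRESHOLD: u64 = 50 * 1024 * 1024; // 50MB threshold from the table

/// Lower-cased file extension, or an empty string if there is none.
fn extension(name: &str) -> String {
    Path::new(name)
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase())
        .unwrap_or_default()
}

/// Picks a codec from file name and size, following the table's rows.
fn choose_compression(name: &str, size: u64) -> Plan {
    let ext = extension(name);
    let codec = if matches!(ext.as_str(), "png" | "jpg" | "zip") || size < MIN_COMPRESS_SIZE {
        Codec::None // already compressed, or too small to benefit
    } else if matches!(ext.as_str(), "txt" | "json" | "md") {
        Codec::Zstd // text: best ratio
    } else {
        Codec::Lz4 // binary and (by assumption) everything else: fastest
    };
    Plan {
        codec,
        framed: codec != Codec::None && size >= FRAME_THRESHOLD,
    }
}
```

For instance, `choose_compression("notes.md", 10_000)` yields Zstd without framing, while a 60MB `.db` file yields LZ4 with framing.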
You can also manually specify compression:
writer.add_file_with_compression?;
Cryptography
Signatures (Ed25519)
use SigningKey;
use OsRng;
// Generate keypair
let signing_key = SigningKey::generate(&mut OsRng);
// Sign manifest
manifest.sign?;
// Verify signatures
let results = manifest.verify_signatures()?;
println!;
Security:
- Constant-time signature verification (no timing attack vulnerabilities)
- Multiple signatures supported (multi-party signing)
- Modifying signed data invalidates its signatures, and verification detects this
Encryption (AES-256-GCM)
// Encrypt individual files (per-file encryption)
writer.add_encrypted_file?;
// Decrypt when reading
let data = reader.read_encrypted_file?;
Encryption Modes:
- Archive-level: Entire archive encrypted (backup/secure storage)
- Per-file: Individual file encryption (selective decryption, database queries on unencrypted DBs)
Performance
Benchmarks on a test file (10MB, Intel i7-12700K, NVMe SSD):
| Compression | Write Speed | Read Speed | Ratio |
|---|---|---|---|
| None | 450 MB/s | 500 MB/s | 1.0x |
| LZ4 | 380 MB/s | 420 MB/s | 2.1x |
| Zstd | 95 MB/s | 180 MB/s | 3.8x |
Scalability (tested):
- Archive size: Up to 1GB (500MB routinely tested)
- File count: Up to 10,000 files (1,000 files in <50ms)
- File access: O(1) HashMap lookup (sub-millisecond)
- Path length: Up to 255 bytes (engram format limit)
- Directory depth: Up to 20 levels tested
VFS Performance:
- SQLite queries: 80-90% of native filesystem performance
- Cold cache: 60-70% of native (decompression overhead)
- Warm cache: 85-95% of native (cache hits)
Testing & Quality Assurance
engram-rs has undergone comprehensive testing across 4 major phases:
Test Statistics
- Total Tests: 166 (all passing)
- 23 unit tests
- 46 Phase 1 tests (security & integrity)
- 33 Phase 2 tests (concurrency & reliability)
- 16 Phase 3 tests + 4 stress tests (performance & scale)
- 26 Phase 4 tests (security audit)
- 10 integration tests
- 7 v1 feature tests
- 5 debug tests
Phase 1: Security & Integrity (46 tests)
Coverage:
- ✅ Corruption detection (15 tests): Magic number, version, header, central directory, truncation
- ✅ Fuzzing infrastructure: cargo-fuzz ready with seed corpus
- ✅ Signature security (13 tests): Tampering, replay attacks, algorithm downgrade, multi-sig
- ✅ Encryption security (18 tests): Archive-level, per-file, wrong keys, compression+encryption
Findings:
- All corruption scenarios properly detected and rejected
- Signature verification cryptographically sound
- AES-256-GCM implementation secure
- No undefined behavior on malformed inputs
Phase 2: Concurrency & Reliability (33 tests)
Coverage:
- ✅ Concurrent VFS/SQLite access (5 tests): 10 threads × 1,000 queries
- ✅ Multi-reader stress tests (6 tests): 100 concurrent readers, 64K operations
- ✅ Crash recovery (13 tests): Incomplete archives, truncation at 10-90%, corruption
- ✅ Frame compression edge cases (9 tests): 50MB threshold, 200MB files, data integrity
Findings:
- Thread-safe VFS with no resource leaks
- True parallelism via separate file handles
- All incomplete archives properly rejected
- Frame compression works correctly for large files (≥50MB)
Operations Tested:
- 10,000+ concurrent VFS database queries
- 64,000+ multi-reader operations
- 500MB+ data processed
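The "separate file handles" finding can be illustrated with the standard library alone. In the sketch below each thread opens its own handle to the same file, so no reader shares a seek position with another. `read_in_parallel` is a hypothetical helper for illustration and does not use or represent the engram-rs reader types.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;
use std::thread;

/// Reads the whole file from `readers` threads at once.
/// Each thread opens its own handle, so there is no shared cursor to lock.
fn read_in_parallel(path: &Path, readers: usize) -> io::Result<Vec<Vec<u8>>> {
    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let path = path.to_path_buf();
            thread::spawn(move || -> io::Result<Vec<u8>> {
                let mut file = File::open(&path)?; // handle owned by this thread
                let mut buf = Vec::new();
                file.read_to_end(&mut buf)?;
                Ok(buf)
            })
        })
        .collect();

    // Join every thread; the first I/O error (if any) is returned.
    handles
        .into_iter()
        .map(|h| h.join().expect("reader thread panicked"))
        .collect()
}
```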
Phase 3: Performance & Scale (16 tests + 4 stress)
Coverage:
- ✅ Large archives (8 tests): 500MB-1GB archives, 10K files, path edge cases
- ✅ Compression validation (8 tests): Text, binary, pre-compressed, effectiveness
Findings:
- Scales to 1GB+ archives with no issues
- 10,000+ files handled efficiently (O(1) lookup)
- Compression ratios: 50-227x typical, 227x for zeros, 59x for text
- Performance: ~120 MB/s write, ~200 MB/s read
Stress Tests (run with --ignored):
- 500MB archive: 4.3 seconds (500MB → 1MB, 500x compression)
- 1GB archive: ~10 seconds
- 10,000 files: ~1 second
Phase 4: Security Audit (26 tests)
Coverage:
- ✅ Path traversal prevention (10 tests): ../, absolute paths, null bytes, normalization
- ✅ ZIP bomb protection (8 tests): Compression ratios, decompression safety
- ✅ Cryptographic attacks (8 tests): Timing attacks, weak keys, side-channels
Findings:
Path Security:
- ⚠️ Path traversal attempts (../, absolute paths) accepted but normalized
- ⚠️ Applications must sanitize paths during extraction
- ✅ 255-byte path limit enforced (rejected at finalize())
- ✅ Case-sensitive storage (File.txt ≠ file.txt)
Compression Security:
- ✅ Excellent compression ratios (200-750x)
- ✅ No recursive compression (prevents nested bombs)
- ✅ Frame compression limits memory (64KB frames)
- ⚠️ Relies on zstd/lz4 library safety checks (no explicit bomb detection)
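To show why fixed-size frames bound memory use, here is a minimal sketch that walks any reader in 64KB frames while holding only one frame in memory. `for_each_frame` is a hypothetical helper, and the sketch covers only the framing loop: the real on-disk frame layout and the compression step are defined by the library and the specification, not by this code.

```rust
use std::io::{self, Read};

const FRAME_SIZE: usize = 64 * 1024; // 64KB frames, as stated above

/// Calls `handle` once per frame and returns the total bytes seen.
/// Memory use stays at one frame regardless of input size.
fn for_each_frame<R: Read>(mut reader: R, mut handle: impl FnMut(&[u8])) -> io::Result<u64> {
    let mut buf = vec![0u8; FRAME_SIZE];
    let mut total = 0u64;
    loop {
        // Fill the buffer completely unless the input ends first.
        let mut filled = 0;
        while filled < FRAME_SIZE {
            let n = reader.read(&mut buf[filled..])?;
            if n == 0 {
                break;
            }
            filled += n;
        }
        if filled == 0 {
            return Ok(total); // clean end of input
        }
        handle(&buf[..filled]);
        total += filled as u64;
        if filled < FRAME_SIZE {
            return Ok(total); // short final frame
        }
    }
}
```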
Cryptographic Security:
- ✅ Ed25519 signatures with constant-time verification
- ✅ No timing attack vulnerabilities detected
- ✅ Weak keys avoided (OsRng used)
- ✅ Modification of signed data invalidates signatures and is detected
- ✅ Multiple signatures supported
Verdict: No critical security vulnerabilities found. engram-rs is production-ready, provided applications sanitize paths at extraction time.
Documentation
Comprehensive testing documentation:
- TESTING_PLAN.md - Overall testing strategy and status
- TESTING_PHASE_1.1_FINDINGS.md - Corruption detection
- TESTING_PHASE_1.2_FUZZING.md - Fuzzing infrastructure
- TESTING_PHASE_1.3_SIGNATURES.md - Signature security
- TESTING_PHASE_1.4_ENCRYPTION.md - Encryption security
- TESTING_PHASE_2_CONCURRENCY.md - Concurrency tests
- TESTING_PHASE_3_PERFORMANCE.md - Performance tests
- TESTING_PHASE_4_SECURITY.md - Security audit
API Overview
Core Types
- ArchiveWriter - Create and write to archives
- ArchiveReader - Read from existing archives
- VfsReader - Query SQLite databases in archives
- Manifest - Archive metadata and signatures
- CompressionMethod - Compression algorithm selection
- EngramError - Error types
Convenience Methods
| Operation | Method |
|---|---|
| Create archive | ArchiveWriter::create(path) |
| Open archive | ArchiveReader::open_and_init(path) |
| Open encrypted | ArchiveReader::open_encrypted(path, key) |
| Add file | writer.add_file(name, data) |
| Add from disk | writer.add_file_from_disk(name, path) |
| Read file | reader.read_file(name) |
| List files | reader.list_files() |
| Add manifest | writer.add_manifest(manifest) |
| Sign manifest | manifest.sign(key, signer) |
| Verify signatures | manifest.verify_signatures() |
| Query database | vfs.open_database(name) |
Examples
See the examples/ directory for complete examples:
- basic.rs - Creating and reading archives
- manifest.rs - Working with manifests and signatures
- compression.rs - Compression options
- vfs.rs - Querying embedded databases
Run examples with:
cargo run --example basic
Running Tests
# Run all tests (fast)
cargo test
# Run with output
cargo test -- --nocapture
# Run specific test file
cargo test --test <test_file>
# Run stress tests (large archives, many files)
cargo test -- --ignored
Test Execution Time:
- Regular tests (162 tests): <2 seconds
- Stress tests (4 tests): 5-15 seconds (run with --ignored)
Compatibility
- Rust: 1.75+ (2021 edition)
- Platforms: Windows, macOS, Linux, BSD
- Architectures: x86_64, aarch64 (ARM64)
Migration from engram-core/engram-vfs
This library replaces the previous two-crate structure:
// Old
use ;
use engram_vfs::VfsReader;
// New (engram-rs)
use ;
All functionality is now unified in a single crate with improved APIs:
- open_and_init() convenience method (was: open() then initialize())
- open_encrypted() convenience method for encrypted archives
- Simplified manifest signing workflow
Security Considerations
Path Extraction Safety
engram-rs does not reject path traversal attempts during archive creation. Applications must sanitize paths during extraction:
use ;
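One way to do this with only the standard library is sketched below. `safe_extract_path` is a hypothetical helper, not part of engram-rs. It takes a deliberately strict approach: any entry name containing `..`, a root, a Windows drive prefix, or a null byte is rejected outright, even where the path would resolve back inside the destination.

```rust
use std::path::{Component, Path, PathBuf};

/// Joins an archive entry name onto `dest`, refusing names that could escape it.
/// Returns None for unsafe or empty names. (Hypothetical helper.)
fn safe_extract_path(dest: &Path, entry_name: &str) -> Option<PathBuf> {
    if entry_name.is_empty() || entry_name.contains('\0') {
        return None;
    }
    let mut out = dest.to_path_buf();
    let mut pushed = false;
    for component in Path::new(entry_name).components() {
        match component {
            Component::Normal(part) => {
                out.push(part);
                pushed = true;
            }
            Component::CurDir => {} // "./" is harmless, skip it
            // "..", a leading "/", or a drive prefix could all leave `dest`.
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    if pushed {
        Some(out)
    } else {
        None
    }
}
```

A caller would apply this to each name from `reader.list_files()` before writing anything to disk, and skip or report the entries that return `None`.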
Signature Verification
Always verify signatures before trusting archive contents:
let manifest: Manifest = from_json?;
let results = manifest.verify_signatures()?;
if !results.iter.all
Resource Limits
For untrusted archives, set resource limits:
# Unix/Linux: Set memory limit
# Monitor decompression size
if
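An in-process cap on decompressed size can be sketched with `std::io::Read::take`. The helper below, `read_capped`, is hypothetical and works on any `Read` source; how a decompressing reader is obtained from engram-rs is not shown here and would depend on the crate's actual API.

```rust
use std::io::{self, Read};

/// Reads everything from `reader`, failing if it yields more than `max_bytes`.
/// (Hypothetical helper for illustration.)
fn read_capped<R: Read>(reader: R, max_bytes: u64) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    // Allow one byte past the limit so an oversized stream is detectable.
    let read = reader
        .take(max_bytes.saturating_add(1))
        .read_to_end(&mut out)?;
    if read as u64 > max_bytes {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "decompressed size exceeds limit",
        ));
    }
    Ok(out)
}
```

Because `take` stops the underlying reader after the limit, a highly compressible input cannot expand beyond `max_bytes + 1` bytes in memory before the error is returned.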
Contributing
Contributions are welcome! See CONTRIBUTING.md for guidelines.
License
Licensed under the MIT License.
Related Projects
- engram-cli - Command-line tool for managing Engram archives
- engram-specification - Complete format specification
- engram-nodejs - Node.js bindings (native module)
Links
- Crates.io: https://crates.io/crates/engram-rs
- Documentation: https://docs.rs/engram-rs
- Repository: https://github.com/blackfall-labs/engram-rs
- Issues: https://github.com/blackfall-labs/engram-rs/issues
- Format Specification: ENGRAM_SPECIFICATION.md