Table of Contents
- Features
- Architecture
- Installation
- Quick Start
- API Documentation
- Performance
- Storage Format
- Security
- CLI Usage
- Advanced Features
- Development
- License
Features
Core Capabilities
- Incremental Snapshots: Only changed files are stored between checkpoints, minimizing storage overhead
- Content-Addressable Storage: Automatic deduplication across all checkpoints using SHA-256 hashing
- LZ4 Compression: Extreme-speed compression (500+ MB/s) with adaptive strategies
- Parallel Processing: Multi-threaded file scanning and hashing leveraging all available CPU cores
- Merkle Tree Verification: Cryptographic integrity checking for tamper detection
- Timeline Branching: Git-like branching model supporting multiple independent timelines
- Atomic Operations: All checkpoint and restore operations are atomic with rollback on failure
- Memory Efficient: Streaming architecture processes large files without loading them fully into memory
- Line-Level Diffs: Git-like unified diff output showing exactly what changed between checkpoints
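To illustrate the incremental snapshot idea, here is a minimal sketch of how two checkpoint manifests could be compared so that only changed files need new storage. The `Manifest` alias, `ManifestDelta` struct, and `diff_manifests` function are hypothetical names for illustration and are not taken from Titor's API; hashes are treated as opaque strings.

```rust
use std::collections::BTreeMap;

/// Hypothetical manifest: relative file path -> hex content hash.
pub type Manifest = BTreeMap<String, String>;

/// Paths that differ between two manifests (illustrative type, not Titor's).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ManifestDelta {
    pub added: Vec<String>,
    pub modified: Vec<String>,
    pub deleted: Vec<String>,
}

/// Compare two manifests; only `added` and `modified` paths would need
/// their content written to the object store for the new checkpoint.
pub fn diff_manifests(old: &Manifest, new: &Manifest) -> ManifestDelta {
    let mut delta = ManifestDelta::default();
    for (path, new_hash) in new {
        match old.get(path) {
            None => delta.added.push(path.clone()),
            Some(old_hash) if old_hash != new_hash => delta.modified.push(path.clone()),
            Some(_) => {} // unchanged: the existing object is reused
        }
    }
    for path in old.keys() {
        if !new.contains_key(path) {
            delta.deleted.push(path.clone());
        }
    }
    delta
}
```

Because `BTreeMap` iterates in key order, each list in the result is sorted by path.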
Technical Specifications
- Compression: LZ4 via the lz4_flex crate with 4+ GB/s decompression speeds
- Hashing: SHA-256 for content identification and verification
- Serialization: Bincode for efficient binary storage of metadata
- Concurrency: Rayon-based parallel processing with configurable worker pools
- Storage: Sharded object storage with reference counting for garbage collection
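Reference counting for garbage collection can be sketched as follows. This is an in-memory illustration under stated assumptions: the `RefCounts` type and its methods are hypothetical and do not describe how Titor persists counts under refs/.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory reference counter keyed by object hash.
#[derive(Default)]
pub struct RefCounts {
    counts: HashMap<String, u64>,
}

impl RefCounts {
    /// A checkpoint starts referencing the object.
    pub fn retain(&mut self, hash: &str) {
        *self.counts.entry(hash.to_string()).or_insert(0) += 1;
    }

    /// A checkpoint stops referencing the object (never goes below zero).
    pub fn release(&mut self, hash: &str) {
        if let Some(count) = self.counts.get_mut(hash) {
            *count = count.saturating_sub(1);
        }
    }

    /// Current count, or zero for unknown objects.
    pub fn count(&self, hash: &str) -> u64 {
        self.counts.get(hash).copied().unwrap_or(0)
    }

    /// Remove and return (sorted) every object no checkpoint references.
    pub fn collect_garbage(&mut self) -> Vec<String> {
        let mut dead: Vec<String> = self
            .counts
            .iter()
            .filter(|(_, count)| **count == 0)
            .map(|(hash, _)| hash.clone())
            .collect();
        dead.sort();
        for hash in &dead {
            self.counts.remove(hash);
        }
        dead
    }
}
```

An object shared by two checkpoints survives the deletion of one of them, which is what makes collection safe in the presence of deduplication.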
Architecture
System Overview
Titor implements a content-addressable storage system with the following components:
- Storage Layer: Manages compressed object storage with automatic deduplication
- Checkpoint System: Creates and manages independent snapshots with metadata
- Timeline Manager: Maintains DAG structure of checkpoints with branching support
- Verification Engine: Provides cryptographic integrity checking via Merkle trees
- Compression Engine: Adaptive LZ4 compression with configurable strategies
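The deduplication behaviour of the storage layer can be shown with a small in-memory content-addressable store. This is a sketch, not Titor's implementation: `ObjectStore` is a hypothetical type, and the standard library's `DefaultHasher` stands in for SHA-256 only to keep the example dependency-free. It is not cryptographic and must not be used for integrity checking.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical in-memory content-addressable store.
#[derive(Default)]
pub struct ObjectStore {
    objects: HashMap<u64, Vec<u8>>,
}

impl ObjectStore {
    /// Store content under its hash; identical content is stored once.
    pub fn put(&mut self, content: &[u8]) -> u64 {
        // Stand-in for SHA-256 (assumption made for this sketch only).
        let mut hasher = DefaultHasher::new();
        content.hash(&mut hasher);
        let key = hasher.finish();
        self.objects.entry(key).or_insert_with(|| content.to_vec());
        key
    }

    /// Look up content by the key returned from `put`.
    pub fn get(&self, key: u64) -> Option<&[u8]> {
        self.objects.get(&key).map(Vec::as_slice)
    }

    /// Number of distinct objects held.
    pub fn len(&self) -> usize {
        self.objects.len()
    }
}
```

Storing the same bytes from two different checkpoints yields the same key and a single stored object.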
Storage Layout
storage_root/
├── metadata.json # Storage configuration and version info
├── timeline.json # Timeline DAG structure
├── checkpoints/ # Checkpoint metadata directory
│ └── {checkpoint_id}/
│ ├── metadata.json # Checkpoint metadata (size, timestamps, etc.)
│ └── manifest.bin # Binary manifest of all files (bincode format)
├── objects/ # Content-addressable object storage
│ └── {prefix}/ # Two-character sharding for performance
│ └── {hash} # LZ4-compressed file content
└── refs/ # Reference counting for garbage collection
└── {object_hash} # Reference count per object
Object Storage
Files are stored using content-based addressing:
- SHA-256 hash computed for file content
- Objects sharded by first 2 hash characters for filesystem performance
- Reference counting enables safe garbage collection
- Compression applied based on configurable strategies
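The sharding rule above can be sketched as a path computation. The function name `object_path` is hypothetical, and the sketch assumes the hash is a 64-character lowercase hex string and that the full hash is used as the file name, as the storage layout suggests.

```rust
use std::path::{Path, PathBuf};

/// Map a hex-encoded SHA-256 hash to `objects/{prefix}/{hash}` under the
/// storage root. Returns `None` if the input is not 64 lowercase hex chars.
pub fn object_path(storage_root: &Path, hash_hex: &str) -> Option<PathBuf> {
    let is_valid = hash_hex.len() == 64
        && hash_hex
            .bytes()
            .all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
    if !is_valid {
        return None;
    }
    // The first two hex characters select the shard directory.
    let (prefix, _) = hash_hex.split_at(2);
    Some(storage_root.join("objects").join(prefix).join(hash_hex))
}
```

With a two-character hex prefix there are at most 256 shard directories, so no single directory has to hold every object.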
Installation
Add to your Cargo.toml:
[dependencies]
titor = "0.2.0"
Or install the CLI tool with:
Dependencies
Required system dependencies:
- Rust 1.70+ (for stable async traits)
- Platform-specific filesystem capabilities for symbolic links
Quick Start
Basic Usage
use ;
use PathBuf;
Line-Level Diff Example
use DiffOptions;
// Get detailed diff with line-level changes
let options = DiffOptions ;
let detailed_diff = titor.diff_detailed?;
// Display results
println!;
println!;
for file_diff in &detailed_diff.file_diffs
Advanced Configuration
let titor = new
.compression_strategy
.ignore_patterns
.max_file_size // 100MB limit
.follow_symlinks
.parallel_workers
.build?;
API Documentation
Core Types
Titor
The main interface for checkpoint operations.
CompressionStrategy
Configurable compression strategies for different use cases.
Checkpoint
Represents a point-in-time snapshot.
DiffOptions
Configure line-level diff generation.
FileDiff
Line-level diff information for a single file.
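For readers unfamiliar with line-level diffing, the following sketch shows one common approach based on the longest common subsequence (LCS). It is an illustration only: `LineChange` and `diff_lines` are hypothetical names, the source does not state which diff algorithm Titor uses, and this version omits hunks and context limits.

```rust
/// One line of a diff (illustrative type, not Titor's FileDiff).
#[derive(Debug, PartialEq, Eq)]
pub enum LineChange<'a> {
    Context(&'a str),
    Removed(&'a str),
    Added(&'a str),
}

/// Line diff via a longest-common-subsequence table, O(n*m) time and space.
pub fn diff_lines<'a>(old: &'a str, new: &'a str) -> Vec<LineChange<'a>> {
    let a: Vec<&str> = old.lines().collect();
    let b: Vec<&str> = new.lines().collect();

    // lcs[i][j] = length of the LCS of a[i..] and b[j..].
    let mut lcs = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in (0..a.len()).rev() {
        for j in (0..b.len()).rev() {
            lcs[i][j] = if a[i] == b[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }

    // Walk the table from the start, emitting removals before additions.
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        if a[i] == b[j] {
            out.push(LineChange::Context(a[i]));
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            out.push(LineChange::Removed(a[i]));
            i += 1;
        } else {
            out.push(LineChange::Added(b[j]));
            j += 1;
        }
    }
    out.extend(a[i..].iter().copied().map(LineChange::Removed));
    out.extend(b[j..].iter().copied().map(LineChange::Added));
    out
}
```

The quadratic table is fine for small files; production diff tools typically use more memory-efficient algorithms for large inputs.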
Error Handling
Titor uses a comprehensive error type hierarchy.
Performance
Optimization Strategies
- Parallel File Scanning: Utilizes Rayon for concurrent directory traversal
- Streaming Compression: Processes large files in chunks to minimize memory usage
- Content Deduplication: Identical files stored only once across all checkpoints
- Lazy Loading: Objects loaded from storage only when needed
- Sharded Storage: Two-character prefix sharding prevents filesystem bottlenecks
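The parallel hashing strategy can be sketched with scoped threads from the standard library. Two assumptions apply: `std::thread::scope` stands in for Rayon, and `DefaultHasher` stands in for SHA-256, both only to keep the example free of external crates. The names `digest` and `hash_all_parallel` are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::thread;

/// Non-cryptographic stand-in for SHA-256 (assumption for this sketch).
pub fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Hash every input buffer using up to `workers` threads.
/// Results are returned in the same order as `inputs`.
pub fn hash_all_parallel(inputs: &[Vec<u8>], workers: usize) -> Vec<u64> {
    let workers = workers.max(1);
    // Ceiling division so every input lands in exactly one chunk.
    let chunk_len = ((inputs.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        // Spawn one thread per chunk; scoped threads may borrow `inputs`.
        let handles: Vec<_> = inputs
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().map(|b| digest(b)).collect::<Vec<u64>>())
            })
            .collect();

        // Joining in spawn order preserves the input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect::<Vec<u64>>()
    })
}
```

A work-stealing pool such as Rayon balances uneven file sizes better than the fixed chunking used here.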
Storage Format
Object Format
Each stored object consists of:
- 4-byte header indicating compression status
- LZ4-compressed content (if compression applied)
- Original content (if compression not beneficial)
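The framing described above can be sketched as follows. The source only states that a 4-byte header indicates compression status; the concrete encoding here (a little-endian u32, 0 for raw and 1 for LZ4) is an assumption for illustration, as are the names `frame_object` and `parse_object`. The compressed bytes are supplied by the caller, so no LZ4 library is needed.

```rust
/// Hypothetical header values; the real on-disk encoding is not documented here.
const FLAG_RAW: u32 = 0;
const FLAG_LZ4: u32 = 1;

/// Frame an object, keeping the compressed form only when it is smaller.
pub fn frame_object(original: &[u8], compressed: &[u8]) -> Vec<u8> {
    let (flag, payload) = if compressed.len() < original.len() {
        (FLAG_LZ4, compressed)
    } else {
        (FLAG_RAW, original)
    };
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&flag.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Split a framed object into (is_compressed, payload).
pub fn parse_object(bytes: &[u8]) -> Result<(bool, &[u8]), String> {
    if bytes.len() < 4 {
        return Err("object is shorter than the 4-byte header".to_string());
    }
    let (header, payload) = bytes.split_at(4);
    let flag = u32::from_le_bytes([header[0], header[1], header[2], header[3]]);
    match flag {
        FLAG_RAW => Ok((false, payload)),
        FLAG_LZ4 => Ok((true, payload)),
        other => Err(format!("unknown compression flag {other}")),
    }
}
```

A reader would pass the payload to an LZ4 decompressor only when the flag says the content is compressed.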
Manifest Format
File manifests use Bincode serialization for efficiency.
Security
Cryptographic Verification
Titor implements multiple layers of integrity checking:
- Content Hashing: SHA-256 hash for each file
- Merkle Trees: Cryptographic proof of entire checkpoint state
- State Hashing: Combined hash of all checkpoint components
- Tamper Detection: Automatic corruption detection during verification
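To make the Merkle tree layer concrete, here is a minimal sketch of computing a root over per-file leaves. It is not Titor's implementation: `merkle_root`, `digest`, and `combine` are hypothetical names, odd nodes are paired with themselves (one of several possible conventions), and `DefaultHasher` stands in for SHA-256 purely to avoid external crates. A real verifier needs a cryptographic hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Non-cryptographic stand-in for SHA-256 (assumption for this sketch).
pub fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Hash of an interior node: digest of the two child hashes concatenated.
pub fn combine(left: u64, right: u64) -> u64 {
    let mut buf = [0u8; 16];
    buf[..8].copy_from_slice(&left.to_be_bytes());
    buf[8..].copy_from_slice(&right.to_be_bytes());
    digest(&buf)
}

/// Merkle root over the leaves, or `None` when there are no leaves.
pub fn merkle_root(leaves: &[&[u8]]) -> Option<u64> {
    if leaves.is_empty() {
        return None;
    }
    let mut level: Vec<u64> = leaves.iter().map(|leaf| digest(leaf)).collect();
    // Reduce pairwise until a single hash remains.
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| match pair {
                [left, right] => combine(*left, *right),
                [single] => combine(*single, *single), // odd node paired with itself
                _ => unreachable!("chunks(2) yields one or two items"),
            })
            .collect();
    }
    Some(level[0])
}
```

Changing any single leaf changes the root, so comparing a stored root with a recomputed one detects modification anywhere in the checkpoint.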
Verification API
let verifier = new;
let report = verifier.verify_complete?;
if !report.is_valid
CLI Usage
Titor includes a comprehensive command-line interface.
Installation
# Install the titor CLI from crates.io
# Or install from local path
Commands
# Initialize repository
# Create checkpoint
# List checkpoints
# Restore to checkpoint
# Show timeline tree
# Compare checkpoints
# Compare with line-level differences (git-like)
# Compare with custom context lines
# Show only statistics
# Ignore whitespace changes
# Verify integrity
# Garbage collection
Advanced Features
Timeline Branching
Create independent branches for experimentation:
// Fork from existing checkpoint
let fork = titor.fork?;
// Work on branch...
// Later, restore to original
titor.restore?;
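The branching model can be pictured as a graph in which each checkpoint records its parent, so forking is just adding a second child to an existing checkpoint. The sketch below is a simplified, hypothetical `Timeline` type (single parent per checkpoint, in-memory only) and is not Titor's timeline manager.

```rust
use std::collections::HashMap;

/// Hypothetical timeline: checkpoint id -> optional parent id.
#[derive(Default)]
pub struct Timeline {
    parents: HashMap<String, Option<String>>,
}

impl Timeline {
    /// Add a checkpoint. The parent, if given, must already exist,
    /// which keeps the structure acyclic.
    pub fn add(&mut self, id: &str, parent: Option<&str>) -> Result<(), String> {
        if self.parents.contains_key(id) {
            return Err(format!("checkpoint {id} already exists"));
        }
        if let Some(p) = parent {
            if !self.parents.contains_key(p) {
                return Err(format!("unknown parent {p}"));
            }
        }
        self.parents.insert(id.to_string(), parent.map(str::to_string));
        Ok(())
    }

    /// Checkpoint ids from `id` back to the root (empty if `id` is unknown).
    pub fn lineage(&self, id: &str) -> Vec<String> {
        let mut chain = Vec::new();
        let mut current = Some(id.to_string());
        while let Some(cid) = current {
            match self.parents.get(&cid) {
                Some(parent) => {
                    current = parent.clone();
                    chain.push(cid);
                }
                None => break,
            }
        }
        chain
    }

    /// Direct children of a checkpoint, sorted; more than one means a fork.
    pub fn children(&self, id: &str) -> Vec<String> {
        let mut kids: Vec<String> = self
            .parents
            .iter()
            .filter(|(_, parent)| parent.as_deref() == Some(id))
            .map(|(child, _)| child.clone())
            .collect();
        kids.sort();
        kids
    }
}
```

Restoring to the original checkpoint after working on a fork leaves both branches in the graph, which is what allows independent timelines to coexist.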
Auto-Checkpoint Strategies
Configure automatic checkpoint creation:
titor.set_auto_checkpoint;
Checkpoint Hooks
Implement custom logic for checkpoint events:
titor.add_hook;
Development
Building from Source
Running Tests
# Unit and integration tests
# Include ignored tests
# Run with logging
RUST_LOG=debug
Project Structure
titor/
├── src/
│ ├── lib.rs # Public API
│ ├── titor.rs # Core implementation
│ ├── storage.rs # Storage backend
│ ├── checkpoint.rs # Checkpoint types
│ ├── timeline.rs # Timeline management
│ ├── compression.rs # Compression engine
│ ├── verification.rs # Integrity checking
│ └── merkle.rs # Merkle tree implementation
├── benches/ # Performance benchmarks
├── examples/ # Example implementations
└── tests/ # Integration tests
License
Licensed under the MIT License. See LICENSE for more information.