FeatherDB MVCC Layer
Multi-Version Concurrency Control (MVCC) provides snapshot isolation with single-writer, multiple-reader concurrency.
Overview
MVCC allows multiple transactions to read the database concurrently while a single writer makes modifications, without blocking each other. Each transaction sees a consistent snapshot of the database as of its start time, enabling:
- Non-blocking reads: Readers never wait for writers
- Non-blocking writes: Writers never wait for readers
- Consistent snapshots: Each transaction sees a stable view of data
- Conflict detection: Write-write conflicts are detected and prevented
Architecture
┌─────────────────────────────────────────────────────────────────────────┐
│ TransactionManager │
│ - Assigns unique transaction IDs │
│ - Tracks active and aborted transactions │
│ - Enforces single-writer semantics via write lock │
│ - Manages version store for old versions │
│ - Provides GC watermark based on oldest active transaction │
└────────────────────────┬───────────────────────────────────────────────-┘
│
┌────────────────┼────────────────┬────────────────────┐
│ │ │ │
▼ ▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐
│ Transaction │ │ Snapshot │ │ VersionStore │ │ AbortedTxnSet │
│ │ │ │ │ │ │ │
│ - id │ │ - txn_id │ │ - versions │ │ Tracks aborted │
│ - mode │ │ - hwm │ │ - next_ptr │ │ transaction IDs │
│ - snapshot │ │ - in_progress│ │ │ │ for visibility │
│ - manager ref│ │ - ancestors │ │ │ │ checks │
└──────────────┘ └──────────────┘ └──────────────┘ └──────────────────┘
Key Components
1. TransactionManager (transaction.rs)
Central coordinator for all transactions in the database.
Responsibilities:
- Assign unique, monotonically increasing transaction IDs
- Track active transactions for snapshot creation
- Track aborted transactions for visibility checks
- Enforce single-writer semantics via a mutex
- Provide the oldest active transaction ID for garbage collection
- Coordinate transaction commit and abort (with optional WAL integration)
2. Transaction (transaction.rs)
Represents a unit of work in the database.
Transaction Modes:
TransactionMode::ReadOnly- Cannot perform writes, returns error on write attemptsTransactionMode::ReadWrite- Full read and write capability
Key Features:
- Auto-abort on drop: If a transaction is dropped without explicit commit, it is automatically aborted
- WAL integration: Commit and abort can optionally log to the Write-Ahead Log
3. Snapshot (snapshot.rs)
Represents a point-in-time view of the database for consistent reads.
Visibility Rules:
A transaction can see changes made by created_by if:
Row Visibility:
A row is visible if we can see its creation AND cannot see its deletion:
4. Version Chain (version.rs)
Stores multiple versions of data for MVCC reads.
VersionedValue - Current (latest) version with MVCC metadata:
OldVersion - Historical versions in the version chain:
Version Chain Structure:
Current Version ──→ OldVersion₁ ──→ OldVersion₂ ──→ ... ──→ None
(newest) (oldest)
VersionChain Iterator:
The VersionChain struct provides an iterator for walking the version chain and finding visible versions:
let mut chain = new;
if let Some = chain.find_visible
5. Garbage Collection
Removes old versions that no active transaction can see.
GC Rules:
- A version is garbage if
deleted_by < oldest_active_txn - This means all active (and future) transactions can see the deletion
- The old version will never be needed again
GcStats:
Running GC:
let stats = tm.run_gc;
println!;
Isolation Level
The MVCC implementation provides Snapshot Isolation:
| Property | Behavior |
|---|---|
| Dirty Reads | Prevented - uncommitted changes are invisible |
| Non-repeatable Reads | Prevented - snapshot is fixed at transaction start |
| Phantom Reads | Prevented - snapshot is fixed at transaction start |
| Write Skew | Possible - not prevented by snapshot isolation |
Note: Snapshot isolation is weaker than Serializable isolation. Write skew anomalies can occur when two transactions read overlapping data and make disjoint writes based on what they read.
Write-Write Conflict Detection
Write-write conflicts are prevented through single-writer serialization:
// Only one read-write transaction can hold the write lock at a time
+ '_
Conflict detection at the row level:
// When attempting to modify a row, check if it was modified after our snapshot
if !self.snapshot.can_see
Aborted Transaction Handling
The implementation tracks aborted transactions to ensure their changes remain invisible:
// Check visibility accounting for aborted transactions
Cleanup of old aborted transaction IDs:
// Remove aborted txn IDs older than any active transaction's snapshot
tm.cleanup_aborted_txns;
Transaction Lifecycle
1. BEGIN
├── Assign unique TransactionId
├── Create Snapshot (capture high_water_mark, in_progress set)
└── Register in active_txns map
2. READ (can repeat)
├── Check current version visibility using snapshot
├── If not visible, walk version chain
└── Return first visible version (or not found)
3. WRITE (can repeat, ReadWrite mode only)
├── Acquire write lock (blocks other writers)
├── Check for write-write conflict
├── Create new version with our txn_id as created_by
└── Move old version to version chain
4. COMMIT or ABORT
├── COMMIT:
│ ├── Log to WAL (if provided)
│ ├── Remove from active_txns
│ └── Update oldest_active watermark
└── ABORT:
├── Log to WAL (if provided)
├── Remove from active_txns
├── Add to aborted_txns set
└── Update oldest_active watermark
5. DROP (if not committed/aborted)
└── Auto-abort the transaction
Usage Examples
Basic Transaction Usage
use ;
let tm = new;
// Start a read-write transaction
let txn = tm.begin;
println!;
// Check if we can write (will error for read-only transactions)
txn.check_write?;
// Check visibility of another transaction's changes
if txn.can_see
// Commit the transaction
txn.commit?;
Read-Only Transaction
let txn = tm.begin_read_only;
// Read operations here...
// Attempting to write will fail
assert!;
// Read-only transactions can commit or abort
txn.commit?;
Snapshot Visibility Check
let snapshot = txn.snapshot;
// Check if a version is visible
if snapshot.is_visible
Running Garbage Collection
// Run GC manually
let stats = tm.run_gc;
println!;
// Check version store size
println!;
println!;
println!;
WAL Integration
use Wal;
let mut wal = open?;
let txn = tm.begin_read_write;
// ... perform operations ...
// Commit with WAL logging
txn.commit_with_wal?;
// Or abort with WAL logging
// txn.abort_with_wal(&mut wal)?;
Savepoint/Sub-transaction Support
use HashSet;
// Create a snapshot with visible ancestors (for sub-transactions)
let mut ancestors = new;
ancestors.insert;
let snapshot = with_ancestors;
// The child snapshot can now see the parent's uncommitted changes
assert!;
Concurrency Guarantees
| Operation | Behavior |
|---|---|
| Read-Read | No blocking - concurrent reads always allowed |
| Read-Write | No blocking - readers see their snapshot |
| Write-Read | No blocking - writer's uncommitted changes invisible to readers |
| Write-Write | Serialized - single writer lock prevents concurrent writes |
Configuration Options
Currently, the MVCC layer uses sensible defaults. Configurable options include:
| Option | Default | Description |
|---|---|---|
| Initial TxnID | 1 | Starting transaction ID |
| Write Lock | Mutex | Single-writer enforcement |
| Version Store | In-memory HashMap | Storage for old versions |
Performance Considerations
-
Version Chain Length: Long chains slow down reads as they require traversal. Run GC regularly to keep chains short.
-
Active Transaction Count: Long-running transactions prevent GC from reclaiming old versions. Keep transactions short.
-
Write Lock Contention: All writes are serialized through a single lock. This simplifies implementation but can bottleneck write-heavy workloads.
-
Aborted Transaction Set: The aborted transaction set grows until cleanup. Call
cleanup_aborted_txns()periodically. -
Snapshot Memory: Each snapshot stores the set of in-progress transactions. Many concurrent transactions increase memory usage.
Testing
# Run all MVCC tests (using Make)
# Or with cargo directly
# Test specific components
# Run with output
# Run coverage (from project root)
Transaction Timeout
Transactions can be configured with automatic timeouts to prevent long-running transactions from blocking the system.
Configuration
use TransactionConfig;
use Duration;
// Create config with 5 second timeout
let config = new
.with_timeout;
let txn = tm.begin_with_config;
Timeout Behavior
When a transaction exceeds its configured timeout:
- All subsequent operations fail with
Error::TransactionTimeout - The transaction must be rolled back
- Timeout is checked automatically on every operation
Timeout API
// Check if transaction has expired
if txn.is_expired
// Get remaining time before timeout
if let Some = txn.remaining_time
// Get elapsed time since transaction started
let elapsed = txn.elapsed;
println!;
Error Handling
match txn.execute
Deadlock Detection
Deadlock detection prevents circular wait conditions between transactions using cycle detection in a wait-for graph.
How It Works
- Wait Graph: Tracks which transactions are waiting for which other transactions
- Cycle Detection: Uses DFS to detect cycles in the wait graph
- Victim Selection: Aborts the youngest transaction (highest ID) in the cycle
- Automatic Resolution: Deadlocks are detected and resolved automatically
WaitGraph API
use WaitGraph;
let mut graph = new;
// Record that txn1 is waiting for txn2
graph.add_wait;
graph.add_wait;
graph.add_wait; // Creates a cycle
// Detect the cycle
if let Some = graph.detect_cycle
// Clean up when transaction completes
graph.remove_transaction;
Integration Example
// Deadlock detection is integrated into the transaction manager
// When a transaction waits for a lock:
wait_graph.add_wait;
if let Some = wait_graph.detect_cycle
Long-Running Transaction Detection
Monitor long-running transactions to identify potential GC blockers and performance issues.
Detection API
use Duration;
// Find transactions running longer than 5 seconds
let long_running = tm.long_running_transactions;
for txn_info in &long_running
GC Status Monitoring
// Check if GC is blocked
let gc_status = tm.gc_status;
if gc_status.blocked
TransactionInfo
GcStatus
Use Cases
Database Monitoring:
// Periodic health check
let long_txns = tm.long_running_transactions;
if !long_txns.is_empty
GC Health Check:
let status = tm.gc_status;
if status.blocked
Performance Analysis:
// Correlate long-running transactions with GC blocking
let long_running = tm.long_running_transactions;
let gc_status = tm.gc_status;
if let Some = gc_status.oldest_active_txn
Adaptive Garbage Collection
Adaptive GC automatically adjusts collection frequency based on workload patterns and metrics.
Configuration
use ;
let config = GcConfig ;
let mut scheduler = new;
GC Metrics
// Get current metrics
let metrics = scheduler.metrics.snapshot;
println!;
println!;
println!;
println!;
println!;
Adaptive Scheduling
// Check if GC should run based on adaptive heuristics
if scheduler.should_run_gc
// Record version creation
scheduler.metrics.record_version_created;
How Adaptive GC Works
- Metrics Tracking: Monitors versions created, cleaned, and pending
- Heuristics: Decides when to run GC based on:
- Time since last GC run (respects min_interval)
- Number of pending versions (respects max_versions_before_gc)
- Historical GC efficiency
- Performance: Prevents unnecessary GC runs when system is idle
- Safety: Ensures GC runs when memory pressure is high
Future Improvements
- First-committer-wins write-write conflict detection
- Serializable isolation level (prevent write skew)
- Multi-version indexes
- Background GC thread with configurable interval
- Read-only transaction optimization (skip active tracking)