laburnum 1.17.1

# Database Module

A content-addressed, incremental compilation database designed for language servers and build systems. This database provides snapshot isolation, automatic dependency tracking, and adaptive query parallelization for high-performance concurrent workloads.

## Table of Contents

- [Overview](#overview)
- [Core Concepts](#core-concepts)
- [Architecture](#architecture)
- [Writing Data](#writing-data)
- [Reading Data](#reading-data)
- [Advanced Topics](#advanced-topics)
- [Common Patterns](#common-patterns)
- [Important Conventions](#important-conventions)
- [Testing](#testing)
- [File Structure](#file-structure)
- [Summary](#summary)

## Overview

### Purpose

This database is purpose-built for incremental compilation systems and language servers where:

- **Incremental recompilation** requires tracking which computations depend on which inputs
- **Concurrent reads** must see consistent snapshots while writes are happening
- **Query performance** needs to adapt based on data distribution
- **Cache invalidation** must be precise and automatic via content addressing

### Design Goals

1. **Incremental Compilation**: Content-addressed chunks enable automatic cache invalidation - when source changes, its content hash changes, creating a new chunk rather than replacing the old one.

2. **Snapshot Isolation**: Readers see a consistent view of the database at a point in time, isolated from concurrent writes.

3. **Dependency Tracking**: Every query automatically tracks which chunks were read, building a DAG from source files through compilation stages.

4. **Adaptive Performance**: The system measures sequential vs parallel query execution and automatically chooses the faster approach for each query pattern.

5. **Thread Safety**: Concurrent access via DashMap (sharded locking, no global lock), cheap cloning via Arc.

### When to Use This Database

**Use this database when you need:**

- Incremental recompilation with automatic cache invalidation
- Dependency tracking to rebuild only what changed
- Concurrent reads with snapshot isolation
- Integration with async task systems

**Don't use this database when you need:**

- Traditional ACID transactions with rollback
- Mutable records (everything here is immutable)
- SQL queries or relational joins
- Persistent storage (this is an in-memory database)

## Core Concepts

### Content-Addressed Storage

Chunks are identified by their content hash, not by location or name. This has important implications:

**Immutability**: Once a chunk is created, it never changes. This is enforced in `mod.rs:55-62`:

```rust
// Chunks are immutable once created
// If you need to update data, create a new chunk with different content
// The new content will have a different hash, so it will be a different chunk
```

**Automatic Deduplication**: If a task is re-run and produces identical results from the same dependencies, it gets the same `ChunkId` and shares storage with the existing chunk.

**No Replacement**: When source code changes, parsing it creates a new chunk with a different hash. The old chunk remains until garbage collected. This is why incremental compilation works - you can detect "has this input changed?" by comparing content hashes.
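
As a rough illustration of these properties, the sketch below keys a store by the hash of its content. `ContentId`, `content_id`, and `ContentStore` are hypothetical names, and the standard library's `DefaultHasher` stands in for whatever hash function the real `ChunkId` uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for `ChunkId`: a 64-bit hash of the content.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ContentId(u64);

/// Derive an identifier from the bytes alone (illustrative hash function).
pub fn content_id(bytes: &[u8]) -> ContentId {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    ContentId(hasher.finish())
}

/// Hypothetical content-addressed store: values are keyed by their own hash.
#[derive(Default)]
pub struct ContentStore {
    entries: HashMap<ContentId, Vec<u8>>,
}

impl ContentStore {
    /// Insert content and return its id. An existing entry is never overwritten,
    /// so identical content is stored once and changed content gets a new entry.
    pub fn put(&mut self, bytes: &[u8]) -> ContentId {
        let id = content_id(bytes);
        self.entries.entry(id).or_insert_with(|| bytes.to_vec());
        id
    }

    /// Number of distinct pieces of content held.
    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Putting the same bytes twice yields one entry and one id; putting edited bytes yields a second entry while the first stays in place.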

### Partition Keys and Sort Keys

Records are organized hierarchically:

- **Partition Key** (`Ident`): Logical grouping, typically represents record type or source file
- **Sort Key** (`String`): Hierarchical key within partition, enables range queries

Example from diagnostics:

```rust
// Partition key groups all diagnostics together
const DIAGNOSTICS_PK: Ident = Ident::new("diagnostics");

// Sort key enables querying by file, severity, sequence
let sort_key = format!("{}|{:01}|{:04}", source_key, severity, sequence);
writer.insert(DIAGNOSTICS_PK, sort_key, diagnostic);
```

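This structure enables efficient prefix queries like "all diagnostics for file X" (prefix `X|`) or "all severity-2 diagnostics for file X" (prefix `X|2|`). Because the file comes first in the key, a query by severity across all files is not a prefix query under this layout.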
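
To make the prefix idea concrete, here is a minimal sketch over a plain `BTreeMap<String, V>`. The helper names `diagnostic_sort_key` and `begins_with` are hypothetical and are not part of the module's API; the key layout is copied from the example above.

```rust
use std::collections::BTreeMap;

/// Build a sort key with the same layout as the diagnostics example.
pub fn diagnostic_sort_key(source_key: &str, severity: u8, sequence: u32) -> String {
    format!("{}|{:01}|{:04}", source_key, severity, sequence)
}

/// Return every entry whose sort key starts with `prefix`.
/// The map is ordered, so the scan starts at the prefix and stops at the
/// first key that no longer matches.
pub fn begins_with<'a, V>(
    index: &'a BTreeMap<String, V>,
    prefix: &'a str,
) -> impl Iterator<Item = (&'a String, &'a V)> + 'a {
    index
        .range(prefix.to_string()..)
        .take_while(move |(key, _)| key.starts_with(prefix))
}
```

Keys for one file sit next to each other in the map, which is why the file prefix can be answered with a single ordered scan.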

### Chunks and Dependencies

A `Chunk` represents the immutable result of a computation (`chunk.rs:23-35`):

```rust
pub struct Chunk<S: RecordStorage> {
    id: ChunkId,                    // Content hash
    task_id: Ident,                 // Stable task identifier
    dependencies: Vec<ChunkId>,     // Chunks read during computation
    index: /* ... */,               // Records organized by partition/sort key
    // ...
}
```

The `dependencies` field forms a directed acyclic graph (DAG):

- **Source file chunk** → **Parse chunk** → **Symbol resolution chunk** → **Type check chunk**

This DAG enables incremental recompilation: if a source file changes (new content hash), the system knows to rebuild all dependent chunks.
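
One way to picture the rebuild decision is a walk over the inverted dependency edges. The sketch below is an assumption about the general technique, not the scheduler's actual code; `Id` and `stale_chunks` are hypothetical names.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical chunk identifier for the sketch.
pub type Id = u32;

/// `dependencies[c]` lists the chunks that `c` read. Returns every chunk that
/// transitively depends on `changed` (excluding `changed` itself).
pub fn stale_chunks(dependencies: &HashMap<Id, Vec<Id>>, changed: Id) -> HashSet<Id> {
    // Invert the edges: for each chunk, which chunks read it?
    let mut dependents: HashMap<Id, Vec<Id>> = HashMap::new();
    for (&chunk, deps) in dependencies {
        for &dep in deps {
            dependents.entry(dep).or_default().push(chunk);
        }
    }

    // Breadth-first walk over the inverted edges.
    let mut stale = HashSet::new();
    let mut queue = VecDeque::from([changed]);
    while let Some(current) = queue.pop_front() {
        for &next in dependents.get(&current).into_iter().flatten() {
            if stale.insert(next) {
                queue.push_back(next);
            }
        }
    }
    stale
}
```

With a chain source → parse → resolve → typecheck, changing the source marks the three downstream chunks stale and leaves unrelated chunks alone.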

### Snapshot Isolation (Lamport Clocks)

Every chunk receives a monotonically increasing timestamp when added to the database (`mod.rs:165`):

```rust
let commit_time = self.current_timestamp.fetch_add(1, Ordering::SeqCst);
```

When you create a `QueryClient`, it captures the current timestamp (`query/client.rs:67-74`). Queries only see chunks with `commit_time <= snapshot_time`, providing a consistent view even as new chunks are added concurrently.
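
The visibility rule can be sketched with a single atomic counter. `Clock`, `commit`, `snapshot`, and `is_visible` are hypothetical names. Note one deliberate assumption: the sketch uses `fetch_add(1) + 1` as the commit time so that `<=` excludes chunks committed after the snapshot was taken; how the real code aligns the two values is not shown in this document.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical logical clock shared by writers and readers.
pub struct Clock {
    current: AtomicU64,
}

impl Clock {
    pub fn new() -> Self {
        Clock { current: AtomicU64::new(0) }
    }

    /// Writers: take the next timestamp. `fetch_add` returns the previous
    /// value, so `+ 1` is the value the clock holds after this commit
    /// (an assumption made for this sketch).
    pub fn commit(&self) -> u64 {
        self.current.fetch_add(1, Ordering::SeqCst) + 1
    }

    /// Readers: capture the clock once, when the snapshot is created.
    pub fn snapshot(&self) -> u64 {
        self.current.load(Ordering::SeqCst)
    }
}

/// A chunk is visible to a snapshot if it was committed at or before it.
pub fn is_visible(commit_time: u64, snapshot_time: u64) -> bool {
    commit_time <= snapshot_time
}
```

A reader that took its snapshot between two commits sees the first and not the second; a reader created later sees both.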

### Fragmentation as a Feature

Multiple chunks can contain records for the same partition key. This is intentional (`mod.rs:63-69`):

**Why this is good:**

- Enables parallel query processing across chunks
- Natural result of incremental compilation (many tasks produce diagnostics)
- No defragmentation overhead

**How performance is maintained:**

- Adaptive parallelization automatically distributes work
- For small chunk counts, sequential processing is faster (avoids overhead)
- For large chunk counts, parallel processing dominates

## Architecture

### Core Types

#### Database<S: RecordStorage> (`mod.rs:74-108`)

The central data structure, cheaply clonable via Arc:

```rust
pub struct Database<S: RecordStorage> {
    chunks: Arc<DashMap<ChunkId, Arc<Chunk<S>>>>,              // All chunks by content hash
    primary_index: Arc<DashMap<Ident, Vec<ChunkId>>>,          // Partition key → chunks
    content_index: Arc<DashMap<ContentHash, (Ident, SortKey)>>,// Hash → location
    entry_chunks: Arc<DashMap<Ident, ChunkId>>,                // GC roots
    query_perf_decisions: Arc<DashMap<QueryKey, QueryModeDecision>>, // Adaptive perf
    current_timestamp: Arc<AtomicU64>,                         // Lamport clock
}
```

**Cloning is cheap** (`mod.rs:110-121`): All fields are Arc-wrapped, so `Database::clone()` just increments reference counts. Pass clones to async tasks freely.

#### Chunk<S: RecordStorage> (`chunk.rs:23-35`)

Immutable computation result:

```rust
pub struct Chunk<S: RecordStorage> {
    id: ChunkId,                                              // Content hash
    task_id: Ident,                                           // Stable task identifier
    parent_task_id: Option<Ident>,                            // Task that spawned this
    commit_time: u64,                                         // Lamport timestamp
    record_count: usize,                                      // Total records
    index: IdentHashMap<BTreeMap<SortKey, S::Index>>,        // Partition → sorted records
    storage: S,                                               // Actual record data
    dependencies: Vec<ChunkId>,                               // Chunks read during computation
}
```

Within each partition, sort keys are indexed in a `BTreeMap` that maps to storage handles (`S::Index`), giving O(log n) lookups and ordered iteration for range queries.

#### RecordStorage Trait (`storage.rs:51-138`)

Abstraction for different storage backends:

```rust
pub trait RecordStorage: Send + Sync + 'static {
    type Index: /* ... */;          // Opaque handle to stored records
    type RecordRef<'a>: /* ... */;  // Borrowed view of a record
    type Builder: RecordStorageBuilder<Storage = Self>;

    fn get(&self, index: &Self::Index) -> Self::RecordRef<'_>;
    fn content_hash(&self) -> ContentHash;
    // ...
}
```

This makes the backend pluggable. Possible implementations include:

- In-memory with Vec (see `TestStorage` in `tests/storage.rs:79-139`)
- Compressed storage for large datasets
- Persistent storage backends
- Distributed storage

#### QueryClient (`query/client.rs:49-56`)

Snapshot-isolated query interface with dependency tracking:

```rust
pub struct QueryClient<'a, S: RecordStorage> {
    db: &'a Database<S>,
    snapshot_time: u64,                           // Snapshot timestamp
    accessed_chunks: RefCell<HashSet<ChunkId>>,   // Dependency tracking
    pending_deps: RefCell<HashSet<Ident>>,        // Missing partition keys
}
```

The `accessed_chunks` field is automatically populated during queries, building the dependency DAG for incremental recompilation.

#### RecordWriter (`chunk.rs:134-204`)

Builder for creating chunks:

```rust
pub struct RecordWriter<S: RecordStorage> {
    task_id: Ident,                               // Stable task identifier
    parent_task_id: Option<Ident>,                // Parent task
    dependencies: Vec<ChunkId>,                   // Input chunks
    storage_builder: S::Builder,                  // Building storage
    index: /* ... */,                             // Building index
}
```

Call `writer.build()` to finalize into an immutable `Chunk` (`chunk.rs:179-204`).

### Index Structure

The database maintains multiple indexes for different access patterns:

1. **Primary Index** (`primary_index`): Maps partition key → list of chunk IDs containing records for that partition. Used by all queries.

2. **Content Index** (`content_index`): Maps content hash → (partition key, sort key). Used to look up records by their content hash.

3. **Entry Chunks** (`entry_chunks`): Maps task ID → chunk ID for chunks that have no `parent_task_id`, typically source files. These are the GC roots.

4. **Query Performance Index** (`query_perf_decisions`): Maps query pattern → performance decision (sequential vs parallel). Populated by adaptive parallelization.

### Thread Safety Model

All data structures use Arc + DashMap:

- **Arc**: Cheap cloning, shared ownership across threads
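- **DashMap**: Concurrent hash map with internal sharding; each shard is guarded by its own read-write lock, so there is no global lock

This design enables:

- Concurrent reads that do not block one another
- Concurrent writes to different partitions with little contention (keys in different shards do not contend)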
- Snapshot isolation via Lamport clocks (readers don't block writers)
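
The cloning model can be tried out with the standard library alone. In this sketch `SharedDb` is a hypothetical type, and `RwLock<HashMap>` is a deliberately simpler stand-in for `DashMap` (one lock instead of many shards); only the `Arc` handle behaviour is the point.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical database handle: cloning copies the `Arc`, not the map.
#[derive(Clone, Default)]
pub struct SharedDb {
    chunks: Arc<RwLock<HashMap<u64, String>>>,
}

impl SharedDb {
    pub fn insert(&self, id: u64, value: &str) {
        self.chunks.write().unwrap().insert(id, value.to_string());
    }

    pub fn len(&self) -> usize {
        self.chunks.read().unwrap().len()
    }

    /// Number of live handles pointing at the same underlying map.
    pub fn handle_count(&self) -> usize {
        Arc::strong_count(&self.chunks)
    }
}

/// Give each worker thread its own clone and let it write one entry.
pub fn write_from_threads(db: &SharedDb, workers: u64) {
    let handles: Vec<_> = (0..workers)
        .map(|i| {
            let db = db.clone();
            thread::spawn(move || db.insert(i, "chunk"))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```

Every clone observes writes made through any other clone, because all of them share one map behind the `Arc`.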

## Writing Data

### RecordWriter API

The typical write workflow (`chunk.rs:134-163`):

```rust
// 1. Create writer with stable task_id
let mut writer = RecordWriter::new(task_id, dependencies);

// 2. Insert records
writer.insert(partition_key, sort_key, record_data);
writer.insert(partition_key, another_key, more_data);

// 3. Build chunk (finalizes storage, computes content hash)
let chunk = writer.build();

// 4. Add to database (assigns timestamp, updates indexes)
db.add_chunk(chunk)?;
```

In practice, the task system handles steps 3-4 automatically. You just create the writer, insert records, and return it from your task (see `scheduler/task/mod.rs:95-166`).

### Stable Task IDs

**Critical**: Task IDs must be stable across runs for incremental compilation to work.

From `chunk.rs:70-128`:

```rust
// ✓ GOOD - stable, includes input context
let task_id = Ident::new(&format!("parse:{}", uri));
let task_id = Ident::new(&format!("resolve:{}:{}", module, symbol));

// ✗ BAD - not stable
let task_id = Ident::new("parse");                    // No input context
let task_id = Ident::new(&format!("parse:{}", timestamp)); // Non-deterministic
```

- **Must include**: computation type, input identifiers, parameters that affect output
- **Must NOT include**: timestamps, random values, version numbers (unless they affect output), output data

### Building Chunks

When you call `writer.build()` (`chunk.rs:179-204`):

1. Storage builder is finalized
2. Content hash is computed from: task_id + storage content hash + dependencies
3. Immutable `Chunk` is created with the computed `ChunkId`

The content hash means: same inputs + same computation + same dependencies = same ChunkId.
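
A minimal sketch of that composition, assuming a 64-bit hash and the standard library's `DefaultHasher` (the real hash function and `ChunkId` width are not shown here). `chunk_id` is a hypothetical free function, and treating the dependency list as order-sensitive is also an assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Combine the three inputs named above into one identifier.
/// Changing any one of them changes the result.
pub fn chunk_id(task_id: &str, storage_hash: u64, dependencies: &[u64]) -> u64 {
    let mut hasher = DefaultHasher::new();
    task_id.hash(&mut hasher);
    storage_hash.hash(&mut hasher);
    // Hashing the slice includes its length and element order.
    dependencies.hash(&mut hasher);
    hasher.finish()
}
```

Because the task ID is part of the input, an unstable task ID produces a different chunk ID even when the records and dependencies are unchanged.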

### Integration with Scheduler

The scheduler automatically integrates with the database (`scheduler/task/mod.rs:95-166`):

```rust
// Your task returns a RecordWriter
scheduler.queue(move |ctx| async move {
    let mut writer = ctx.new_record_writer(task_id);
    writer.insert(pk, sk, data);
    Some(writer)  // Scheduler builds chunk and adds to DB
}, DEFAULT_LANE);
```

The scheduler:

1. Polls the async task
2. Receives `RecordWriter` result
3. Calls `writer.build()` to create chunk
4. Calls `db.add_chunk(chunk)` to add to database
5. Triggers watchers for affected partition keys

## Reading Data

### QueryClient and Snapshot Isolation

Create a query client from `TaskContext` (`scheduler/task/task_context.rs:91-93`):

```rust
let query_client = ctx.query_client();
```

The client captures the current timestamp (`query/client.rs:67-74`), providing a consistent snapshot:

```rust
let snapshot_time = db.current_timestamp.load(Ordering::SeqCst);
```

All queries filter chunks by `chunk.commit_time <= snapshot_time` (`query/client.rs:77-83`).

### Query API

The fluent `QueryBuilder` API (`query/mod.rs:140-299`):

```rust
// Query all records in a partition
let all_diagnostics = query_client
    .query(DIAGNOSTICS_PK)
    .execute()
    .await;

// Exact match
let record = query_client
    .query(SYMBOLS_PK)
    .sort_key("main.rs|function|main")
    .execute()
    .await;

// Prefix search (hierarchical queries)
let file_diagnostics = query_client
    .query(DIAGNOSTICS_PK)
    .sort_key_begins_with("file:///src/main.rs|")
    .execute()
    .await;

// Range queries
let errors_and_warnings = query_client
    .query(DIAGNOSTICS_PK)
    .sort_key_between("001", "003")  // severity range
    .execute()
    .await;

let low_severity = query_client
    .query(DIAGNOSTICS_PK)
    .sort_key_less_than("002")
    .execute()
    .await;

let high_severity = query_client
    .query(DIAGNOSTICS_PK)
    .sort_key_greater_than("002")
    .execute()
    .await;
```

### QueryResults Iteration

Results implement `QueryResults` (`query_results.rs:24-67`):

```rust
let results = query_client.query(pk).execute().await;

// Check if empty
if results.is_empty() {
    return Ok(());
}

// Get count
let count = results.len();

// Iterate over (metadata, record_ref) pairs
for (metadata, record_ref) in results.iter() {
    // metadata: &RecordMetadata - partition_key, sort_key, chunk_id
    // record_ref: S::RecordRef<'_> - actual record data

    process_record(record_ref)?;
}

// Access by index
if let Some(record) = results.get(&results.records()[0]) {
    // ...
}
```

### Dependency Tracking

Every query automatically tracks accessed chunks (`query/client.rs:365-367`):

```rust
// Record this chunk as a dependency
self.accessed_chunks.borrow_mut().insert(*chunk_id);
```

After your task completes, call `query_client.dependencies()` to get the list of chunks read. The scheduler uses this to build the dependency DAG.

If a query doesn't find a partition key, it records a pending dependency (`query/client.rs:370-391`):

```rust
self.pending_deps.borrow_mut().insert(partition_key);
```

This allows the system to track "task X depends on partition Y, but Y doesn't exist yet" relationships.
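
The two tracking sets can be sketched with `RefCell`, which lets a read-only `query(&self)` record what it touched. `TrackingReader` and its methods are hypothetical and much simpler than `QueryClient`; the index here is a plain map from partition key to chunk IDs.

```rust
use std::cell::RefCell;
use std::collections::{HashMap, HashSet};

/// Hypothetical reader that records every chunk and missing partition it sees.
pub struct TrackingReader<'a> {
    index: &'a HashMap<&'static str, Vec<u64>>,
    accessed_chunks: RefCell<HashSet<u64>>,
    pending_deps: RefCell<HashSet<&'static str>>,
}

impl<'a> TrackingReader<'a> {
    pub fn new(index: &'a HashMap<&'static str, Vec<u64>>) -> Self {
        TrackingReader {
            index,
            accessed_chunks: RefCell::new(HashSet::new()),
            pending_deps: RefCell::new(HashSet::new()),
        }
    }

    /// Look up a partition. Hits are recorded as dependencies,
    /// misses as pending dependencies.
    pub fn query(&self, partition: &'static str) -> &'a [u64] {
        match self.index.get(partition) {
            Some(chunks) => {
                self.accessed_chunks.borrow_mut().extend(chunks.iter().copied());
                chunks.as_slice()
            }
            None => {
                self.pending_deps.borrow_mut().insert(partition);
                &[]
            }
        }
    }

    /// Chunks read so far, sorted for stable output.
    pub fn dependencies(&self) -> Vec<u64> {
        let mut deps: Vec<u64> = self.accessed_chunks.borrow().iter().copied().collect();
        deps.sort_unstable();
        deps
    }

    /// Partitions that were queried but did not exist, sorted.
    pub fn pending(&self) -> Vec<&'static str> {
        let mut pending: Vec<&'static str> = self.pending_deps.borrow().iter().copied().collect();
        pending.sort_unstable();
        pending
    }
}
```

Using sets means a chunk read by several queries is recorded once.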

### Batch Operations

Query multiple records efficiently (`query/client.rs:115-136`):

```rust
let items = vec![
    (SYMBOLS_PK, "main.rs|function|main".to_string()),
    (SYMBOLS_PK, "lib.rs|function|init".to_string()),
    (TYPES_PK, "main.rs|struct|Config".to_string()),
];

let results = query_client.batch_get_items(items).await;
```

## Advanced Topics

### Adaptive Parallelization

The system automatically learns whether sequential or parallel execution is faster for each query pattern.

**How it works** (`query/mod.rs:27-59`, `query/client.rs:912-989`):

1. Queries are bucketed by `(partition_key, query_type, chunk_count_bucket)`
2. Chunk count ≤ 10: Always sequential (parallel overhead is not worth it)
3. Chunk count > 10,000: Always parallel (obvious win)
4. In between: Measure both approaches

**Measurement cycle** (`query/client.rs:912-943`):

1. Run query sequentially, record execution time
2. Run query in parallel, record execution time
3. Compare and lock in the faster mode for this query pattern
4. Store decision in `query_perf_decisions` index

**Chunk count bucketing** (`query/mod.rs:71-98`):

```rust
// Buckets: 0-10, 11-100, 101-1000, 1001-10000, 10001+
let bucket = match chunk_count {
    0..=10 => ChunkCountBucket::Small,
    11..=100 => ChunkCountBucket::Medium,
    // ...
};
```

This prevents re-measuring when chunk count changes slightly.

**Why this matters**: Some queries benefit from parallelization (many chunks, CPU-bound filtering), while others are faster sequentially (few chunks, overhead dominates). The system adapts to actual workload characteristics.
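
The decision logic can be sketched as two small functions. The bucket boundaries follow the comment in the excerpt above; the variant names `Large`, `XLarge`, and `Huge`, the `Mode` enum, and `choose_mode` are hypothetical, and the real `QueryModeDecision` type may look different.

```rust
use std::time::Duration;

/// Chunk-count buckets. Only `Small` and `Medium` appear in the excerpt;
/// the remaining names are made up for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Bucket {
    Small,
    Medium,
    Large,
    XLarge,
    Huge,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Mode {
    Sequential,
    Parallel,
}

pub fn bucket_for(chunk_count: usize) -> Bucket {
    match chunk_count {
        0..=10 => Bucket::Small,
        11..=100 => Bucket::Medium,
        101..=1_000 => Bucket::Large,
        1_001..=10_000 => Bucket::XLarge,
        _ => Bucket::Huge,
    }
}

/// Pick a mode for a bucket. `measured` holds (sequential, parallel) timings
/// once both runs have been recorded; `None` means "not measured yet".
pub fn choose_mode(bucket: Bucket, measured: Option<(Duration, Duration)>) -> Option<Mode> {
    match (bucket, measured) {
        (Bucket::Small, _) => Some(Mode::Sequential),
        (Bucket::Huge, _) => Some(Mode::Parallel),
        (_, Some((seq, par))) => Some(if par < seq { Mode::Parallel } else { Mode::Sequential }),
        (_, None) => None,
    }
}
```

A returned `None` corresponds to the measurement cycle still being in progress for that query pattern.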

### Garbage Collection

Mark-and-sweep GC from entry chunks (`mod.rs:228-303`):

**Entry chunks** are GC roots - chunks with no `parent_task_id`, typically representing source files (`mod.rs:172-175`):

```rust
if chunk.parent_task_id().is_none() {
    self.entry_chunks.insert(task_id, chunk_id);
}
```

**Mark phase** (`mod.rs:243-263`): Walk dependency graph from entry chunks, marking reachable chunks.

**Sweep phase** (`mod.rs:278-303`): Remove unmarked chunks from all indexes:

- Remove from `chunks` map
- Remove from `primary_index` (if no longer referenced)
- Remove from `content_index`
- Remove from `entry_chunks` (if no longer an entry)

This ensures referential integrity - you can't have dangling references to non-existent chunks.
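
A compact mark-and-sweep sketch follows. It assumes the walk goes from each root to the chunks that read it (the inverse of the `dependencies` field), since root chunks themselves have no dependencies; the actual traversal in `mod.rs` may differ. `collect_garbage` is a hypothetical function over a plain map of chunk ID to dependency IDs.

```rust
use std::collections::{HashMap, HashSet};

/// Remove every chunk that is not reachable from `roots`.
/// `chunks` maps chunk id -> ids of the chunks it read.
/// Returns the swept ids in ascending order.
pub fn collect_garbage(chunks: &mut HashMap<u64, Vec<u64>>, roots: &[u64]) -> Vec<u64> {
    // Invert the edges so the walk can go from a root to its readers.
    let mut dependents: HashMap<u64, Vec<u64>> = HashMap::new();
    for (&id, deps) in chunks.iter() {
        for &dep in deps {
            dependents.entry(dep).or_default().push(id);
        }
    }

    // Mark: depth-first walk from the roots.
    let mut marked = HashSet::new();
    let mut stack = roots.to_vec();
    while let Some(id) = stack.pop() {
        if marked.insert(id) {
            if let Some(next) = dependents.get(&id) {
                stack.extend(next.iter().copied());
            }
        }
    }

    // Sweep: drop everything that was not marked.
    let mut swept: Vec<u64> = chunks
        .keys()
        .copied()
        .filter(|id| !marked.contains(id))
        .collect();
    swept.sort_unstable();
    for id in &swept {
        chunks.remove(id);
    }
    swept
}
```

In this model, an old source chunk that is no longer a root is swept together with the chunks derived only from it.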

### Custom Storage Backends

Implement `RecordStorage` and `RecordStorageBuilder` traits (`storage.rs:51-138`):

```rust
pub struct MyStorage {
    records: Vec<MyRecordType>,
}

impl RecordStorage for MyStorage {
    type Index = usize;  // Vec index
    type RecordRef<'a> = &'a MyRecordType;
    type Builder = MyStorageBuilder;

    fn get(&self, index: &Self::Index) -> Self::RecordRef<'_> {
        &self.records[*index]
    }

    fn content_hash(&self) -> ContentHash {
        // Hash all records
        let mut hasher = ContentHasher::new();
        for record in &self.records {
            record.hash(&mut hasher);
        }
        hasher.finish()
    }
}

pub struct MyStorageBuilder {
    records: Vec<MyRecordType>,
}

impl RecordStorageBuilder for MyStorageBuilder {
    type Storage = MyStorage;

    fn insert(&mut self, record: impl LaburnumRecord) -> usize {
        self.records.push(record.downcast());
        self.records.len() - 1
    }

    fn finalize(self) -> MyStorage {
        MyStorage { records: self.records }
    }
}
```

See `tests/storage.rs:79-139` for a complete example.

### Performance Characteristics

- **Query parallelization**: Automatic work distribution across CPU cores for large chunk counts
- **Sharded concurrent access**: DashMap with internal sharding and per-shard locks, no global locks
- **Cheap cloning**: Arc-based database clones (reference count increment only)
- **Content deduplication**: Identical chunks share storage via content hashing
- **Adaptive thresholds**: System learns optimal execution modes per query pattern
- **BTreeMap indexes**: O(log n) range queries within chunks, ordered iteration

## Common Patterns

### Pattern 1: Parse and Write Results

Parsing a file and writing the results:

```rust
async fn on_file_version(
    uri: Uri,
    source: Arc<Source>,
    source_key: SourceKey,
    _ctx: TaskContext<MyStorage, MyServer>,
    mut writer: RecordWriter<MyStorage>,
) -> Result<RecordWriter<MyStorage>, LaburnumError> {
    // Parse the source (reify_content converts rope/string to owned String)
    let content = source.reify_content().ok_or(LaburnumError::FileEvicted)?;
    let (tokens, lex_errors, lex_state) = lex(source_key, &content);
    let (ast, parse_errors, parse_state) = parse(lex_state, &tokens);

    // Write structured data
    build_symbol_table(&uri, source_key, ast, &source, &mut writer).await?;

    // Write diagnostics
    for (sequence, error) in lex_errors.iter().enumerate() {
        let diagnostic = error_to_diagnostic(error, source_key);
        let severity = diagnostic.severity()?;
        let sort_key = format!("{}|{:01}|{:04}", source_key, severity, sequence);

        writer.insert(DIAGNOSTICS_PK, sort_key, diagnostic);
    }

    // Return writer - scheduler builds chunk and adds to DB
    Ok(writer)
}
```

### Pattern 2: Query and React to Changes (Watchers)

Defining a watcher that reacts to partition changes:

```rust
// Define watcher for partition key
laburnum::watchers! {
    (Server: MyServer, Storage: MyStorage),
    SYMBOL_TABLE_PK => handle_symbol_table_changes,
}

fn handle_symbol_table_changes<'a>(
    ctx: &'a mut TaskContext<MyStorage, MyServer>,
    writer: &'a mut RecordWriter<MyStorage>,
) -> Pin<Box<dyn Future<Output = ()> + Send + 'a>> {
    Box::pin(async move {
        let query_client = ctx.query_client();

        // Query all symbols
        let all_symbols = query_client
            .query(SYMBOL_TABLE_PK)
            .execute()
            .await;

        // Build lookup map
        let mut symbol_map = HashMap::new();
        for (_metadata, record_ref) in all_symbols.iter() {
            if let MyRecordRef::Symbol(sym) = record_ref {
                symbol_map.insert(sym.name.clone(), Arc::clone(sym));
            }
        }

        // Get changed keys from watcher context
        let updated = ctx.matched_keys_updated().to_vec();

        // Process each changed symbol
        for key in updated.iter() {
            let query_results = key.get_record(query_client).await;

            // Resolve references, update derived data, etc.
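            for (_metadata, record_ref) in query_results.iter() {
                // The future returns `()`, so `?` cannot be used here;
                // stop at the first error instead.
                if process_symbol(record_ref, &symbol_map, writer).is_err() {
                    return;
                }
            }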
        }
    })
}
```

### Pattern 3: Scheduler Integration

From `scheduler/mod.rs:238-244`:

```rust
// Queue a task that writes to the database
scheduler.queue(move |ctx| async move {
    // Your task logic
    let result = compute_something().await?;

    // Create writer with stable task_id
    let task_id = Ident::new(&format!("compute:{}", input_id));
    let mut writer = ctx.new_record_writer(task_id);

    // Insert results
    writer.insert(OUTPUT_PK, sort_key, result);

    // Return writer - scheduler handles chunk building and DB insertion
    Some(writer)
}, DEFAULT_LANE);
```

## Important Conventions

### Sort Key Formatting

From project CLAUDE.md:

**NO HEX FORMATTING** in sort keys - hex wastes memory with letters a-f. Use zero-padded decimal for proper lexicographic sorting:

```rust
// ❌ BAD - hex wastes memory
format!("{:016x}|{:010x}", source_key, line)

// ✅ GOOD - zero-padded decimal for lexicographic sorting
format!("{}|{:010}", source_key, line)
```

Zero-padding (`:010`) is required for numeric sort keys to ensure proper lexicographic ordering: `"0000000001"` < `"0000000010"` < `"0000000100"`.
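
The effect of padding is easy to check by sorting a few keys as strings. `padded_key` and `unpadded_key` are hypothetical helpers written only for this comparison.

```rust
/// Sort key with a zero-padded line number, as recommended above.
pub fn padded_key(source_key: &str, line: u64) -> String {
    format!("{}|{:010}", source_key, line)
}

/// Same key without padding, shown only to illustrate the problem.
pub fn unpadded_key(source_key: &str, line: u64) -> String {
    format!("{}|{}", source_key, line)
}

/// Build keys for the given lines and sort them as plain strings,
/// which is how a string-keyed ordered index would order them.
pub fn sorted_keys(make_key: fn(&str, u64) -> String, lines: &[u64]) -> Vec<String> {
    let mut keys: Vec<String> = lines.iter().map(|&n| make_key("main.rs", n)).collect();
    keys.sort();
    keys
}
```

For lines 10, 9, and 100, the padded keys sort in numeric order, while the unpadded keys sort as `10`, `100`, `9`.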

### Built-in Partition Key Constants

Built-in Laburnum features export partition key constants:

```rust
use laburnum::diagnostics::DIAGNOSTICS_PK;

writer.insert(DIAGNOSTICS_PK, sort_key, diagnostic);
```

When adding new built-in features:

1. Define `const FEATURE_PK: Ident = Ident::new("feature_name")` in feature module
2. Export the constant publicly
3. Document sort key format in module-level documentation
4. Provide helper functions for sort key generation if complex
5. Use the constant consistently - never create the Ident inline

Example:

```rust
// In src/my_feature/mod.rs
pub const MY_FEATURE_PK: Ident = Ident::new("my_feature");

pub fn my_feature_sort_key(id: u64, timestamp: u64) -> String {
    format!("{:016}|{:016}", id, timestamp)  // Zero-padded decimal, not hex
}
```

### Stable Task Identifier Rules

From `chunk.rs:70-128`, task IDs must be:

**Stable across runs:**

- Same computation on same input must produce same task_id
- Enables incremental compilation cache hits

**Include in task_id:**

- Computation type (parse, resolve, typecheck, etc.)
- Input identifiers (file URI, symbol name, etc.)
- Parameters that affect output (configuration flags, etc.)

**Exclude from task_id:**

- Timestamps or wall clock time
- Random values or UUIDs
- Version numbers (unless they affect output)
- Output data or results
- Process IDs or thread IDs

**Examples:**

```rust
// ✓ GOOD
Ident::new(&format!("parse:{}", file_uri))
Ident::new(&format!("resolve:{}:{}", module_name, symbol_name))
Ident::new(&format!("typecheck:{}:strict={}", function_id, strict_mode))

// ✗ BAD
Ident::new("parse")                                    // Missing input context
Ident::new(&format!("parse:{}:{}", uri, timestamp))   // Non-deterministic
Ident::new(&format!("resolve:{}", Uuid::new_v4()))    // Random component
```

## Testing

### TestStorage Example

The test suite includes a complete `RecordStorage` implementation (`tests/storage.rs:79-139`):

```rust
pub enum TestRecordData {
    Module { exports: Vec<Ident> },
    Function { name: Ident, body: String },
    Struct { name: Ident, fields: Vec<Ident> },
    Laburnum(LaburnumRecord),
}

pub struct TestStorage {
    modules: Vec<TestRecordData>,    // Separate Vecs for each variant
    functions: Vec<TestRecordData>,
    structs: Vec<TestRecordData>,
    laburnum: Vec<LaburnumRecord>,
}

pub enum TestIndex {
    Module(usize),
    Function(usize),
    Struct(usize),
    Laburnum(usize),
}

impl RecordStorage for TestStorage {
    type Index = TestIndex;
    type RecordRef<'a> = TestRecordRef<'a>;
    type Builder = TestStorageBuilder;

    fn get(&self, index: &Self::Index) -> Self::RecordRef<'_> {
        match index {
            TestIndex::Module(i) => TestRecordRef::Module(&self.modules[*i]),
            TestIndex::Function(i) => TestRecordRef::Function(&self.functions[*i]),
            // ...
        }
    }

    // ...
}
```

This demonstrates:

- Enum-based record types with pattern matching
- Indexed storage via separate Vecs per variant
- Content hashing implementation
- `LaburnumRecordRef` downcasting support

### Test Usage Patterns

From `tests/generic_database.rs:34-95`:

```rust
#[test]
async fn test_basic_write_and_read() {
    // Create scheduler (contains database)
    let (scheduler, _conn) = test_scheduler();

    // Write task
    scheduler.queue(move |_ctx| async move {
        let task_id = Ident::new(&format!("module:{}", module_name));
        let mut writer = RecordWriter::new(task_id, vec![]);

        writer.insert(
            task_id,
            module_name.clone(),
            TestRecordData::Module { exports: vec![/* ... */] },
        );

        Some(writer)
    }, DEFAULT_LANE);

    scheduler.spawn_workers();

    // Read task
    scheduler.queue(move |mut ctx| async move {
        let query_client = ctx.query_client();
        let results = query_client
            .get_record(module_id, module_name.clone())
            .await;

        assert!(!results.is_empty());

        for (_metadata, record_ref) in results.iter() {
            match record_ref {
                TestRecordRef::Module(data) => {
                    // Verify data
                }
                _ => panic!("Expected module record"),
            }
        }

        None
    }, DEFAULT_LANE);
}
```

## File Structure

Quick reference of module contents:

| File                        | Purpose                                                                                      |
| --------------------------- | -------------------------------------------------------------------------------------------- |
| `mod.rs`                    | Core `Database<S>` struct, `add_chunk()`, garbage collection, chunk visibility filtering     |
| `chunk.rs`                  | `Chunk<S>`, `ChunkId`, `RecordWriter<S>` types and builders                                  |
| `storage.rs`                | `RecordStorage` and `RecordStorageBuilder` traits for pluggable storage backends             |
| `query_results.rs`          | `QueryResults<S>` container for query output, iteration support                              |
| `stats.rs`                  | Statistics collection (`DbStats`, `TaskStats`) for monitoring                                |
| `query/mod.rs`              | Query enums, `QueryBuilder` fluent API, adaptive parallelization types                       |
| `query/client.rs`           | `QueryClient<S>` with parallel/sequential execution, snapshot isolation, dependency tracking |
| `tests/mod.rs`              | Test module exports and test helpers                                                         |
| `tests/storage.rs`          | `TestStorage` implementation example                                                         |
| `tests/generic_database.rs` | Basic database operation tests (write, read, query, GC)                                      |
| `tests/compilation_dag.rs`  | Dependency graph and incremental compilation tests                                           |

## Summary

This database provides a robust foundation for incremental compilation systems and language servers. Key takeaways:

1. **Content addressing** enables automatic cache invalidation - when inputs change, content hashes change, creating new chunks rather than mutating old ones.

2. **Snapshot isolation** via Lamport clocks provides consistent reads during concurrent writes without blocking.

3. **Dependency tracking** automatically builds a DAG from source files through compilation stages, enabling precise incremental recompilation.

4. **Adaptive parallelization** measures actual performance and chooses the optimal execution mode for each query pattern.

5. **Pluggable storage** via the `RecordStorage` trait allows customization for different workload characteristics.

The architecture embraces immutability, uses content hashing for cache invalidation, and provides both high-level ergonomics (fluent query API, scheduler integration) and low-level control (custom storage implementations).