Module memory

Memory management utilities for efficient query execution

This module provides memory-bounded execution for SQL query operators, enabling processing of datasets larger than available memory through disk spilling.

§Components

  • Memory Controller (MemoryController): Budget management and tracking
  • Memory Reservation (MemoryReservation): Per-operator memory tracking
  • External Sort (ExternalSort): Disk-spilling merge sort
  • External Aggregate (ExternalAggregate): Partition-based GROUP BY
  • External Hash Join (ExternalHashJoin): Grace hash join with spilling
  • Spill Files (SpillFile): Temporary file management with auto-cleanup
  • Arena Allocator (QueryArena): Fast bump-pointer allocator
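
To illustrate what "bump-pointer" means for the arena allocator, here is a minimal sketch. `BumpArena` and its methods are hypothetical names chosen for this example and are not the API of `QueryArena`; the sketch hands out byte ranges inside one preallocated buffer, and alignment is relative to the start of that buffer.

```rust
use std::ops::Range;

/// Hypothetical bump arena over a fixed byte buffer (not the crate's QueryArena).
pub struct BumpArena {
    buf: Vec<u8>,
    next: usize, // offset of the first free byte
}

impl BumpArena {
    pub fn with_capacity(capacity: usize) -> Self {
        BumpArena { buf: vec![0; capacity], next: 0 }
    }

    /// Reserve `size` bytes at an offset that is a multiple of `align`
    /// (which must be a power of two). Returns None when the arena is full.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<Range<usize>> {
        assert!(align.is_power_of_two(), "align must be a power of two");
        // Round the free offset up to the requested alignment.
        let start = self.next.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.buf.len() {
            return None;
        }
        // The "bump": allocating only moves a single offset forward.
        self.next = end;
        Some(start..end)
    }

    /// Access the bytes of a previously returned range.
    pub fn bytes_mut(&mut self, range: Range<usize>) -> &mut [u8] {
        &mut self.buf[range]
    }

    /// Free everything at once by rewinding the offset.
    pub fn reset(&mut self) {
        self.next = 0;
    }

    pub fn used(&self) -> usize {
        self.next
    }
}
```

Individual allocations are never freed; the whole arena is released in one step with `reset`, which is what makes this scheme cheap for per-query scratch memory.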

§Architecture

┌─────────────────────────────────────────────────────────────────┐
│                       MemoryController                          │
│ ┌────────────────┐ ┌────────────────┐ ┌───────────────────────┐ │
│ │  Budget Pool   │ │    Tracking    │ │        Metrics        │ │
│ │ (configurable) │ │ (per-operator) │ │ (spills, peak, etc.)  │ │
│ └────────────────┘ └────────────────┘ └───────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
           │                │                │
           ▼                ▼                ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│ External     │  │ External     │  │ External     │
│ Sort         │  │ Aggregate    │  │ Hash Join    │
│ (merge sort) │  │ (partitioned)│  │ (grace join) │
└──────────────┘  └──────────────┘  └──────────────┘
           │                │                │
           ▼                ▼                ▼
┌─────────────────────────────────────────────────────────────────┐
│                        SpillFile (temp files)                   │
│         Auto-cleanup on drop, buffered I/O, seeking             │
└─────────────────────────────────────────────────────────────────┘
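
The "auto-cleanup on drop" behaviour in the diagram can be sketched with a `Drop` implementation. `TempSpill` is a hypothetical type written for this example, not the crate's `SpillFile`, and the caller supplies the file path.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::{Path, PathBuf};

/// Hypothetical temp-file handle that deletes its file when dropped.
pub struct TempSpill {
    path: PathBuf,
    // `None` only while the handle is being dropped.
    writer: Option<BufWriter<File>>,
}

impl TempSpill {
    pub fn create(path: PathBuf) -> io::Result<Self> {
        let file = File::create(&path)?;
        Ok(TempSpill { path, writer: Some(BufWriter::new(file)) })
    }

    fn writer(&mut self) -> &mut BufWriter<File> {
        self.writer.as_mut().expect("writer is present until drop")
    }

    /// Buffered write; data may not reach the file until `flush` is called.
    pub fn write_all(&mut self, bytes: &[u8]) -> io::Result<()> {
        self.writer().write_all(bytes)
    }

    pub fn flush(&mut self) -> io::Result<()> {
        self.writer().flush()
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for TempSpill {
    fn drop(&mut self) {
        // Close the file handle first, then delete the file.
        drop(self.writer.take());
        // Best-effort cleanup: the error is ignored because drop cannot fail.
        let _ = fs::remove_file(&self.path);
    }
}
```

Tying deletion to `Drop` means the file is also removed when an operator returns early with `?` or unwinds from a panic.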

§Memory-Bounded Execution

use std::sync::Arc;
use vibesql_executor::memory::MemoryController;

// Create a controller with a 1GB budget
let controller = Arc::new(MemoryController::with_budget(1024 * 1024 * 1024));

// Each operator creates a reservation to track its own memory
let mut reservation = controller.create_reservation();

// `batch_size`, `data`, and `spill_to_disk` are placeholders for operator state.
// If the budget cannot cover the next batch, spill to disk and release the memory
if !reservation.try_grow(batch_size) {
    spill_to_disk(&data);
    reservation.shrink(data.size());
}

// Check statistics after execution
let stats = controller.stats();
println!("{}", stats); // e.g. "Memory: 512MB/1GB (50%), peak: 950MB, spilled: 2GB (3 ops)"
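
For readers curious how a `try_grow` / `shrink` pair can be backed by a shared budget, the following is a minimal sketch using an atomic counter. `Budget` and `Reservation` are hypothetical stand-ins written for this example; they are not the implementation of `MemoryController` or `MemoryReservation`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical shared budget (not the crate's MemoryController).
pub struct Budget {
    limit: usize,
    used: AtomicUsize,
}

impl Budget {
    pub fn new(limit: usize) -> Arc<Self> {
        Arc::new(Budget { limit, used: AtomicUsize::new(0) })
    }

    pub fn used(&self) -> usize {
        self.used.load(Ordering::Acquire)
    }
}

/// Hypothetical per-operator reservation that returns its bytes on drop.
pub struct Reservation {
    budget: Arc<Budget>,
    size: usize,
}

impl Reservation {
    pub fn new(budget: Arc<Budget>) -> Self {
        Reservation { budget, size: 0 }
    }

    /// Try to reserve `bytes` more. Returns false, reserving nothing,
    /// if that would exceed the shared limit.
    pub fn try_grow(&mut self, bytes: usize) -> bool {
        let limit = self.budget.limit;
        let result = self.budget.used.fetch_update(
            Ordering::AcqRel,
            Ordering::Acquire,
            |used| used.checked_add(bytes).filter(|&total| total <= limit),
        );
        if result.is_ok() {
            self.size += bytes;
        }
        result.is_ok()
    }

    /// Give back up to `bytes` of what this reservation holds.
    pub fn shrink(&mut self, bytes: usize) {
        let released = bytes.min(self.size);
        self.size -= released;
        self.budget.used.fetch_sub(released, Ordering::AcqRel);
    }

    pub fn size(&self) -> usize {
        self.size
    }
}

impl Drop for Reservation {
    fn drop(&mut self) {
        // Return whatever is still reserved to the shared budget.
        self.budget.used.fetch_sub(self.size, Ordering::AcqRel);
    }
}
```

Because a failed `try_grow` leaves the counter untouched, the caller can spill, call `shrink`, and retry without any bookkeeping of its own.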

§External Operators

§External Sort

Two-phase external merge sort:

  1. Run generation: Sort in-memory chunks, spill as sorted runs
  2. K-way merge: Merge runs using a tournament tree

let mut sort = ExternalSort::new(controller, config, sort_keys);
for row in input {
    sort.add_row(&row)?;  // Automatically spills when needed
}
for result in sort.finish()? {
    // Rows come out in sorted order
}
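
The two phases can be sketched in a few lines. This is an illustration only: runs are kept in memory as `Vec<i64>` instead of being written to spill files, the merge uses `std::collections::BinaryHeap` rather than a tournament tree, and `generate_runs` and `merge_runs` are hypothetical helper names.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Phase 1: split the input into chunks of at most `run_len` values and
/// sort each chunk. A real external sort would write each run to disk.
pub fn generate_runs(input: &[i64], run_len: usize) -> Vec<Vec<i64>> {
    input
        .chunks(run_len.max(1)) // `chunks(0)` would panic
        .map(|chunk| {
            let mut run = chunk.to_vec();
            run.sort_unstable();
            run
        })
        .collect()
}

/// Phase 2: k-way merge of sorted runs with a min-heap of (value, run index).
pub fn merge_runs(runs: &[Vec<i64>]) -> Vec<i64> {
    let mut heap = BinaryHeap::new();
    // Next position to read in each run.
    let mut cursors = vec![0usize; runs.len()];
    for (i, run) in runs.iter().enumerate() {
        if let Some(&first) = run.first() {
            heap.push(Reverse((first, i)));
            cursors[i] = 1;
        }
    }
    let mut out = Vec::with_capacity(runs.iter().map(Vec::len).sum());
    while let Some(Reverse((value, i))) = heap.pop() {
        out.push(value);
        // Refill the heap from the run that just produced a value.
        if let Some(&next) = runs[i].get(cursors[i]) {
            heap.push(Reverse((next, i)));
            cursors[i] += 1;
        }
    }
    out
}
```

The heap holds at most one value per run, so merge memory grows with the number of runs and not with the size of the data.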

§External Aggregate

Partition-based aggregation for GROUP BY:

  1. Hash rows to partitions
  2. Spill partitions when memory exhausted
  3. Process each partition’s groups

// The remaining AggregateSpec fields are elided here (`..`)
let specs = vec![AggregateSpec { function_name: "SUM".into(), .. }];
let mut agg = ExternalAggregate::new(controller, config, specs, 2);
for row in input {
    agg.add_row(&row)?;
}
for result in agg.finish()? {
    // (group_key..., aggregate_values...)
}
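
As an illustration of the steps above, here is a minimal sketch of partitioned aggregation for a single `SUM` over `(group key, value)` rows. Partitions stay in memory where a real operator would spill them, and `partition_rows` and `sum_by_group` are hypothetical names, not part of `ExternalAggregate`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Step 1: route each (group key, value) row to a partition by key hash.
/// A real operator could spill any of these partitions to disk (step 2).
pub fn partition_rows(
    rows: &[(String, i64)],
    num_partitions: usize,
) -> Vec<Vec<(String, i64)>> {
    let n = num_partitions.max(1);
    let mut partitions = vec![Vec::new(); n];
    for (key, value) in rows {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let idx = (hasher.finish() % n as u64) as usize;
        partitions[idx].push((key.clone(), *value));
    }
    partitions
}

/// Step 3: aggregate one partition at a time. Every row of a given group
/// lives in exactly one partition, so per-partition sums are final.
pub fn sum_by_group(partitions: &[Vec<(String, i64)>]) -> Vec<(String, i64)> {
    let mut results = Vec::new();
    for partition in partitions {
        let mut groups: HashMap<&str, i64> = HashMap::new();
        for (key, value) in partition {
            *groups.entry(key.as_str()).or_insert(0) += *value;
        }
        results.extend(groups.into_iter().map(|(k, v)| (k.to_string(), v)));
    }
    results
}
```

Only one partition's hash table is alive at a time, which is what bounds peak memory.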

§External Hash Join

Grace hash join with partition-based spilling:

  1. Partition both build and probe sides by join key hash
  2. Spill partitions when memory exhausted
  3. Process matching partitions together

let mut join = ExternalHashJoin::new(
    controller, config,
    vec![0],  // build key columns
    vec![0],  // probe key columns
    JoinType::Inner,
);
for row in build_side { join.add_build_row(&row)?; }
for row in probe_side { join.add_probe_row(&row)?; }
for result in join.finish()? {
    // Joined rows
}
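
The partition-then-join idea can be sketched as follows for an inner join on a `u32` key. This is an in-memory illustration under simplifying assumptions (no spilling, rows are `(key, payload)` pairs); `Row`, `partition`, and `grace_inner_join` are hypothetical names, not the API of `ExternalHashJoin`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical row shape for this sketch: (join key, payload).
type Row = (u32, &'static str);

fn partition_of(key: u32, n: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % n as u64) as usize
}

/// Step 1: split one side into `n` partitions by join-key hash.
fn partition(rows: &[Row], n: usize) -> Vec<Vec<Row>> {
    let mut parts = vec![Vec::new(); n];
    for &row in rows {
        parts[partition_of(row.0, n)].push(row);
    }
    parts
}

/// Step 3: join partition i of the build side with partition i of the
/// probe side. Matching keys always hash to the same partition index.
pub fn grace_inner_join(
    build: &[Row],
    probe: &[Row],
    num_partitions: usize,
) -> Vec<(u32, &'static str, &'static str)> {
    let n = num_partitions.max(1);
    let build_parts = partition(build, n);
    let probe_parts = partition(probe, n);
    let mut out = Vec::new();
    for (b_part, p_part) in build_parts.iter().zip(&probe_parts) {
        // Hash table for this build partition only.
        let mut table: HashMap<u32, Vec<&'static str>> = HashMap::new();
        for &(key, payload) in b_part {
            table.entry(key).or_default().push(payload);
        }
        for &(key, probe_payload) in p_part {
            if let Some(matches) = table.get(&key) {
                for &build_payload in matches {
                    out.push((key, build_payload, probe_payload));
                }
            }
        }
    }
    out
}
```

Since both sides use the same hash function and partition count, a probe row never needs to look outside its own partition pair.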

§Configuration

Environment variables:

Variable                 Description                        Default
VIBESQL_MEMORY_LIMIT     Total memory budget (e.g., “4GB”)  1GB
VIBESQL_TEMP_DIR         Directory for spill files          system temp
VIBESQL_SPILL_THRESHOLD  When to start spilling (0.0-1.0)   0.8
VIBESQL_PARTITION_SIZE   Target partition size              64MB
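
A minimal sketch of how a size value such as “4GB” could be read from the environment. `parse_size` and `memory_limit_from_env` are hypothetical helpers, and the sketch assumes binary units (1GB = 1024 * 1024 * 1024 bytes, as in the budget example earlier on this page); the crate's actual parsing rules may differ.

```rust
use std::env;

/// Hypothetical parser for sizes such as "4GB" or "64MB".
/// Assumes binary units: KB = 2^10, MB = 2^20, GB = 2^30 bytes.
pub fn parse_size(text: &str) -> Option<u64> {
    let text = text.trim();
    // Split at the first character that is not an ASCII digit.
    let split = text
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(text.len());
    let (digits, unit) = text.split_at(split);
    let value: u64 = digits.parse().ok()?;
    let multiplier: u64 = match unit.trim().to_ascii_uppercase().as_str() {
        "" | "B" => 1,
        "KB" => 1 << 10,
        "MB" => 1 << 20,
        "GB" => 1 << 30,
        _ => return None,
    };
    value.checked_mul(multiplier)
}

/// Read VIBESQL_MEMORY_LIMIT, falling back to `default_bytes` when the
/// variable is unset or cannot be parsed.
pub fn memory_limit_from_env(default_bytes: u64) -> u64 {
    env::var("VIBESQL_MEMORY_LIMIT")
        .ok()
        .and_then(|raw| parse_size(&raw))
        .unwrap_or(default_bytes)
}
```

Falling back to the default on a malformed value is one possible policy; returning an error instead would make misconfiguration more visible.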

Modules§

row_serialization
Row serialization for disk spilling

Structs§

AggregateResultIterator
Iterator over aggregate results
AggregateSpec
Specification for an aggregate function
ExternalAggregate
External aggregate operator
ExternalAggregateConfig
Configuration for external aggregate
ExternalHashJoin
External Hash Join operator
ExternalHashJoinConfig
Configuration for external hash join
ExternalSort
External sort operator
ExternalSortConfig
Configuration for external sort
HashJoinResultIterator
Iterator over hash join results
MemoryConfig
Configuration for memory-bounded execution
MemoryController
Global memory controller for query execution
MemoryReservation
A memory reservation for a single operator
MemoryStats
Statistics snapshot from the memory controller
QueryArena
SpillFile
A handle to a temporary spill file
SpillFileSet
A collection of spill files for managing multiple sorted runs

Enums§

JoinType
Join type for the external hash join
SortedIterator
Iterator over sorted results

Constants§

DEFAULT_MEMORY_BUDGET
Default memory budget: 1GB. Conservative default that works on most systems.
DEFAULT_SPILL_THRESHOLD
Default spill threshold: 80%. Start spilling when 80% of the budget is used.
DEFAULT_TARGET_PARTITION_BYTES
Default target partition size for external operators: 64MB. Tuned for good I/O efficiency while limiting memory per partition.
MIN_OPERATOR_MEMORY
Minimum memory for an operator: 4MB. Below this, operators may not function correctly.

Type Aliases§

SortKey
Sort key for a row: the evaluated ORDER BY values with their directions