Struct MorselConfig

pub struct MorselConfig {
    pub morsel_size: usize,
    pub filter_size: usize,
    pub group_by_size: usize,
    pub join_build_size: usize,
    pub join_probe_size: usize,
    pub sort_size: usize,
    pub scan_size: usize,
    pub aggregate_size: usize,
}

Configuration for morsel-driven execution.

Supports per-operation morsel sizes. Benchmark data shows that different operations perform best with different morsel sizes:

  • Filter/Aggregate: Smaller sizes (2K) improve cache locality
  • GROUP BY: Smaller sizes (2K) benefit from faster hash table merges
  • Join: Medium sizes (4K) balance hash table operations
  • Sort: Larger sizes (8K) improve merge phase efficiency
  • Scan: Larger sizes (8K) optimize sequential I/O

For comparison, DuckDB uses a vector size of 2048 for SIMD vectorization alignment (32 x 64-byte AVX-512 elements).
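
The per-operation sizes listed above could be consumed by an executor roughly as follows. This is a minimal sketch, not the crate's implementation: the struct is a local mirror of the documented fields so the snippet compiles on its own, and `Operation`, `size_for`, and `documented_defaults` are hypothetical names introduced for illustration. The general `morsel_size` default is not stated in this page, so the value used here is an assumption.

```rust
/// Local mirror of the documented struct so the sketch compiles on its own.
#[derive(Clone, Debug)]
pub struct MorselConfig {
    pub morsel_size: usize,
    pub filter_size: usize,
    pub group_by_size: usize,
    pub join_build_size: usize,
    pub join_probe_size: usize,
    pub sort_size: usize,
    pub scan_size: usize,
    pub aggregate_size: usize,
}

/// Hypothetical operation tag; not part of the documented API.
#[derive(Clone, Copy, Debug)]
pub enum Operation {
    Filter,
    GroupBy,
    JoinBuild,
    JoinProbe,
    Sort,
    Scan,
    Aggregate,
    Other,
}

/// Hypothetical helper: pick the morsel size for an operation, falling back
/// to `morsel_size` when no operation-specific field applies.
pub fn size_for(config: &MorselConfig, op: Operation) -> usize {
    match op {
        Operation::Filter => config.filter_size,
        Operation::GroupBy => config.group_by_size,
        Operation::JoinBuild => config.join_build_size,
        Operation::JoinProbe => config.join_probe_size,
        Operation::Sort => config.sort_size,
        Operation::Scan => config.scan_size,
        Operation::Aggregate => config.aggregate_size,
        Operation::Other => config.morsel_size,
    }
}

/// Build a config from the sizes listed in the description (2K, 4K, 8K).
pub fn documented_defaults() -> MorselConfig {
    MorselConfig {
        morsel_size: 2048, // assumption: the general default is not documented here
        filter_size: 2048,
        group_by_size: 2048,
        join_build_size: 4096,
        join_probe_size: 4096,
        sort_size: 8192,
        scan_size: 8192,
        aggregate_size: 2048,
    }
}
```

Because every field is public, a caller can also build the struct literally and override a single size with struct update syntax, for example `MorselConfig { sort_size: 4096, ..documented_defaults() }`.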

Fields§

§morsel_size: usize

Default morsel size (used when no operation-specific size applies)

§filter_size: usize

Morsel size for filter operations (default: 2048)

§group_by_size: usize

Morsel size for GROUP BY operations (default: 2048)

§join_build_size: usize

Morsel size for hash join build phase (default: 4096)

§join_probe_size: usize

Morsel size for hash join probe phase (default: 4096)

§sort_size: usize

Morsel size for sort operations (default: 8192)

§scan_size: usize

Morsel size for scan/materialize operations (default: 8192)

§aggregate_size: usize

Morsel size for aggregate operations (default: 2048)

Implementations§

impl MorselConfig

pub fn new(morsel_size: usize) -> Self

Create a new configuration with the given morsel size for all operations.

This uses the same size for all operations, which is useful for testing or when you want uniform behavior. For production use, prefer optimal(), which uses per-operation sizes based on benchmark data.
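
To make the effect of a uniform size concrete, the sketch below splits a row count into consecutive morsels of a fixed size. `morsel_ranges` is a hypothetical helper written for this illustration; how the crate actually schedules morsels is not shown on this page.

```rust
use std::ops::Range;

/// Hypothetical helper: split `total_rows` into consecutive row ranges of at
/// most `morsel_size` rows each. The final range may be shorter.
pub fn morsel_ranges(total_rows: usize, morsel_size: usize) -> Vec<Range<usize>> {
    assert!(morsel_size > 0, "morsel size must be non-zero");
    (0..total_rows)
        .step_by(morsel_size)
        .map(|start| start..(start + morsel_size).min(total_rows))
        .collect()
}
```

With 10,000 rows and a size of 4096, this yields three morsels, the last one covering only the remaining 1,808 rows.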

pub fn with_per_operation_sizes() -> Self

Create a new configuration with per-operation optimal sizes.

Uses different morsel sizes for each operation based on benchmark data:

  • Filter/Aggregate: 2048 (cache locality)
  • GROUP BY: 2048 (hash table merge efficiency)
  • Join: 4096 (hash table operations)
  • Sort: 8192 (merge phase efficiency)
  • Scan: 8192 (sequential I/O)

pub fn optimal() -> Self

Calculate optimal morsel size based on hardware characteristics.

Uses per-operation morsel sizes based on benchmark data (see issue #4282).

The MORSEL_SIZE environment variable overrides this: when it is set, its value is used for all operations (uniform mode). Otherwise, the per-operation optimal sizes are used.
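
The override logic can be pictured as below. This is a sketch under assumptions: `parse_override` and `resolve_size` are hypothetical names, and treating missing, unparsable, or zero values as "no override" is a guess, since this page does not document how invalid input is handled.

```rust
use std::env;

/// Hypothetical helper: interpret a raw `MORSEL_SIZE` value.
/// Returns `None` for missing, unparsable, or zero values (an assumption).
pub fn parse_override(raw: Option<&str>) -> Option<usize> {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
}

/// Resolve the size for one operation: the environment override wins when
/// present and valid; otherwise the per-operation default is used.
pub fn resolve_size(per_operation_default: usize) -> usize {
    let raw = env::var("MORSEL_SIZE").ok();
    parse_override(raw.as_deref()).unwrap_or(per_operation_default)
}
```

Keeping the parsing separate from the environment lookup makes the rule easy to unit test without mutating process-wide state.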

pub fn for_row_width(avg_row_bytes: usize) -> Self

Create an adaptive configuration based on estimated row width in bytes.

Adjusts morsel size to maintain consistent L3 cache occupancy regardless of row width. Wide rows get smaller morsels; narrow rows get larger ones.

§Arguments
  • avg_row_bytes - Estimated average size of each row in bytes
§Example
use vibesql_executor::select::morsel::MorselConfig;

// For wide rows (~500 bytes each), use smaller morsels
let wide_config = MorselConfig::for_row_width(500);

// For narrow rows (~20 bytes each), use larger morsels
let narrow_config = MorselConfig::for_row_width(20);

// Narrow rows get larger morsels than wide rows
assert!(narrow_config.morsel_size > wide_config.morsel_size);
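
One way to keep cache occupancy roughly constant is to divide a fixed byte budget by the row width. The sketch below shows that idea only; the budget and the clamping bounds are hypothetical values chosen for illustration, and `rows_for_width` is not a function of the crate.

```rust
/// Assumed cache budget per morsel, in bytes (hypothetical value).
const CACHE_BUDGET_BYTES: usize = 256 * 1024;
/// Assumed lower and upper bounds on morsel size, in rows (hypothetical values).
const MIN_ROWS: usize = 256;
const MAX_ROWS: usize = 16_384;

/// Hypothetical sizing rule: rows per morsel = budget / row width, clamped
/// so that extreme widths still produce a usable morsel size.
pub fn rows_for_width(avg_row_bytes: usize) -> usize {
    let width = avg_row_bytes.max(1); // avoid division by zero
    (CACHE_BUDGET_BYTES / width).clamp(MIN_ROWS, MAX_ROWS)
}
```

Under these assumed constants, 500-byte rows give 524 rows per morsel and 20-byte rows give 13,107, which matches the documented direction: narrower rows, larger morsels.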

pub fn for_schema(schema: &[DataType]) -> Self

Create an adaptive configuration based on a schema (list of column types).

Estimates row width from the schema and adjusts morsel size accordingly. This is the recommended method when schema information is available.

§Arguments
  • schema - Slice of column data types in the row
§Example
use vibesql_executor::select::morsel::MorselConfig;
use vibesql_types::DataType;

let schema = [
    DataType::Integer,
    DataType::Varchar { max_length: Some(100) },
    DataType::Date,
];
let config = MorselConfig::for_schema(&schema);
assert!(config.morsel_size > 0);
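
Estimating row width from a schema might look like the following. Everything here is an assumption made for illustration: `ColType` is a stand-in enum (it is not `vibesql_types::DataType`), and the per-type byte widths, including the half-full guess for bounded strings, are not taken from the crate.

```rust
/// Hypothetical column type used only in this sketch.
#[derive(Clone, Copy, Debug)]
pub enum ColType {
    Integer,
    Bigint,
    Date,
    Varchar { max_length: Option<usize> },
}

/// Assumed average byte width of a single column value.
pub fn estimated_bytes(col: ColType) -> usize {
    match col {
        ColType::Integer => 4,
        ColType::Bigint => 8,
        ColType::Date => 4,
        // Assume bounded strings are half-full on average.
        ColType::Varchar { max_length: Some(n) } => (n / 2).max(1),
        // Assume 32 bytes when no maximum length is declared.
        ColType::Varchar { max_length: None } => 32,
    }
}

/// Sum the per-column estimates to get an estimated row width in bytes.
pub fn estimated_row_bytes(schema: &[ColType]) -> usize {
    schema.iter().copied().map(estimated_bytes).sum()
}
```

The resulting estimate would then feed a width-based sizing rule such as the one `for_row_width` describes.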

pub fn for_selectivity(selectivity: f64) -> Self

Create an adaptive configuration based on estimated filter selectivity.

For filter operations with known selectivity, adjusts morsel size:

  • Low selectivity value (few rows pass) -> larger morsels to reduce overhead
  • High selectivity value (many rows pass) -> smaller morsels for better load balancing
§Arguments
  • selectivity - Fraction of rows expected to pass the filter (0.0 to 1.0)
§Example
use vibesql_executor::select::morsel::MorselConfig;

// For a highly selective filter (1% pass rate), use larger morsels
let selective = MorselConfig::for_selectivity(0.01);

// For an unselective filter (90% pass rate), use smaller morsels
let unselective = MorselConfig::for_selectivity(0.90);

// More selective filters get larger morsels
assert!(selective.morsel_size > unselective.morsel_size);
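
The relationship between pass rate and morsel size can be sketched with a simple linear scale. The formula and its endpoints (2x the base size when no rows pass, 0.5x when all rows pass) are hypothetical; the crate's actual scaling is not documented on this page.

```rust
/// Hypothetical rule: scale a base morsel size by the expected pass rate.
/// `selectivity` is the fraction of rows that pass the filter (0.0 to 1.0).
pub fn rows_for_selectivity(base_rows: usize, selectivity: f64) -> usize {
    // Treat NaN as "everything passes" and clamp out-of-range inputs.
    let s = if selectivity.is_nan() {
        1.0
    } else {
        selectivity.clamp(0.0, 1.0)
    };
    let factor = 2.0 - 1.5 * s; // 2.0 at s = 0.0, 0.5 at s = 1.0
    ((base_rows as f64) * factor).round() as usize
}
```

A filter that passes 1% of rows therefore gets a larger morsel than one that passes 90%, which is the ordering the example above asserts.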

pub fn adaptive(schema: &[DataType], selectivity: Option<f64>) -> Self

Create an adaptive configuration combining row width and selectivity hints.

This is the most accurate method when both schema and selectivity estimates are available (e.g., from query optimizer statistics).

§Arguments
  • schema - Slice of column data types
  • selectivity - Optional filter selectivity (0.0 to 1.0)
§Example
use vibesql_executor::select::morsel::MorselConfig;
use vibesql_types::DataType;

let schema = [DataType::Integer, DataType::Bigint];
let config = MorselConfig::adaptive(&schema, Some(0.05));
assert!(config.morsel_size > 0);
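
Combining both hints could amount to sizing by row width first and then scaling by selectivity when it is known. The sketch below assumes exactly that order; the constants, the scaling factor, and the name `adaptive_rows` are all hypothetical and are not taken from the crate.

```rust
/// Hypothetical combination of a width-based size and a selectivity factor.
/// `est_row_bytes` is an estimated row width; `selectivity` is the optional
/// fraction of rows expected to pass a filter (0.0 to 1.0).
pub fn adaptive_rows(est_row_bytes: usize, selectivity: Option<f64>) -> usize {
    const CACHE_BUDGET_BYTES: usize = 256 * 1024; // assumed byte budget
    const MIN_ROWS: usize = 256; // assumed lower bound
    const MAX_ROWS: usize = 16_384; // assumed upper bound

    // Step 1: rows that fit the budget at this row width.
    let by_width = CACHE_BUDGET_BYTES / est_row_bytes.max(1);

    // Step 2: scale by selectivity if a usable estimate is available.
    let factor = match selectivity {
        Some(s) if !s.is_nan() => 2.0 - 1.5 * s.clamp(0.0, 1.0),
        _ => 1.0,
    };
    let scaled = (by_width as f64 * factor).round() as usize;

    // Step 3: keep the result within the assumed bounds.
    scaled.clamp(MIN_ROWS, MAX_ROWS)
}
```

Passing `None` leaves the width-based size unchanged, which mirrors the optional `selectivity` parameter in the documented signature.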

Trait Implementations§

impl Clone for MorselConfig

fn clone(&self) -> MorselConfig

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for MorselConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for MorselConfig

fn default() -> Self

Returns the “default value” for a type.

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Same for T

type Output = T

Should always be Self

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<G1, G2> Within<G2> for G1
where G2: Contains<G1>,

fn is_within(&self, b: &G2) -> bool

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,