pub struct TableStatistics {
pub row_count: usize,
pub columns: HashMap<String, ColumnStatistics>,
pub last_updated: SystemTime,
pub is_stale: bool,
pub sample_metadata: Option<SampleMetadata>,
pub avg_row_bytes: Option<f64>,
}
Statistics for an entire table
Fields

row_count: usize
Total number of rows

columns: HashMap<String, ColumnStatistics>
Per-column statistics

last_updated: SystemTime
Timestamp when stats were last updated

is_stale: bool
Whether stats are stale (need recomputation)

sample_metadata: Option<SampleMetadata>
Sampling metadata (Phase 5.2). None if no sampling was used (small table).

avg_row_bytes: Option<f64>
Average row size in bytes (computed from sampled data)
This provides actual row size measurements that account for:
- Real string/varchar fill ratios (not heuristic estimates)
- Actual NULL prevalence
- True BLOB/CLOB sizes
Used by DML cost estimation to scale WAL write costs. None if statistics were estimated from schema (no actual data sampled).
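For illustration, a cost model might scale a per-row WAL write cost by avg_row_bytes, falling back to a fixed guess when the statistics were schema-estimated. A minimal sketch; the WAL_BYTE_COST weight and the 128-byte fallback are assumptions for illustration, not values from this crate:

fn wal_write_cost(stats: &TableStatistics, rows_written: usize) -> f64 {
    // Hypothetical per-byte WAL cost weight (assumption, for illustration).
    const WAL_BYTE_COST: f64 = 0.001;
    // Prefer the measured average row size; fall back to an assumed
    // 128 bytes when stats were estimated from schema (avg_row_bytes is None).
    let row_bytes = stats.avg_row_bytes.unwrap_or(128.0);
    rows_written as f64 * row_bytes * WAL_BYTE_COST
}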
Implementations
impl TableStatistics

pub fn estimate_from_schema(row_count: usize, schema: &TableSchema) -> Self
Create estimated statistics with basic column estimates
This method provides reasonable defaults for column statistics without requiring a full ANALYZE scan. It uses data type information to generate basic statistics using conservative heuristics.
Heuristics Used
- Boolean columns: n_distinct = 2
- Integer/Smallint/Bigint/Unsigned columns: n_distinct = sqrt(row_count) (conservative)
- Float/Real/DoublePrecision columns: n_distinct = sqrt(row_count) to 100 (high cardinality)
- Varchar/Character/Name columns: n_distinct = row_count * 0.5 (assume moderate uniqueness)
- Date/Timestamp/Time columns: n_distinct = row_count * 0.8 (high cardinality)
- Numeric/Decimal columns: n_distinct = sqrt(row_count) (moderate)
- Nullable columns: null_count ≈ row_count * 0.01 (1% estimated nulls)
- Non-nullable columns: null_count = 0
- All columns: is_stale = true (clearly marked as estimates)
Arguments
- row_count - Total number of rows in the table
- schema - Table schema with column definitions
Example
// Assume `schema` is a TableSchema with a Boolean, an Integer, and a Varchar column.
let stats = TableStatistics::estimate_from_schema(5000, &schema);
// Boolean col: n_distinct = 2
// Integer col: n_distinct = sqrt(5000) ≈ 70
// Varchar col: n_distinct = 2500
// All columns: is_stale = true
pub fn compute(rows: &[Row], schema: &TableSchema) -> Self
Compute statistics by scanning the table
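A usage sketch, assuming rows: Vec<Row> and schema: TableSchema are already in scope:

// Full-scan statistics: every row is examined.
let stats = TableStatistics::compute(&rows, &schema);
assert_eq!(stats.row_count, rows.len());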
pub fn compute_with_config(
    rows: &[Row],
    schema: &TableSchema,
    sampling_config: Option<SamplingConfig>,
    enable_histograms: bool,
    histogram_buckets: usize,
    bucket_strategy: BucketStrategy,
) -> Self
Compute statistics with sampling (Phase 5.2) and histogram support (Phase 5.1)
Arguments
- rows - All table rows
- schema - Table schema
- sampling_config - Optional sampling configuration (None = adaptive)
- enable_histograms - Whether to build histograms
- histogram_buckets - Number of histogram buckets
- bucket_strategy - Histogram bucketing strategy
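A hedged example call; BucketStrategy::EquiDepth is an assumed variant name for illustration and may differ in this crate:

let stats = TableStatistics::compute_with_config(
    &rows,
    &schema,
    None,                      // None = adaptive sampling
    true,                      // build histograms
    64,                        // histogram bucket count (arbitrary choice)
    BucketStrategy::EquiDepth, // assumed variant name
);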
pub fn compute_sampled(rows: &[Row], schema: &TableSchema) -> Self
Compute statistics using adaptive sampling (Phase 5.2 convenience method)
This automatically:
- Uses full scan for small tables (< 1000 rows)
- Uses 10% sample for medium tables (1K-100K rows)
- Uses fixed 10K sample for large tables (> 100K rows)
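A minimal sketch of the sample-size selection these tiers imply; the real implementation may differ in details:

fn adaptive_sample_size(row_count: usize) -> usize {
    if row_count < 1_000 {
        row_count        // small table: full scan
    } else if row_count <= 100_000 {
        row_count / 10   // medium table: 10% sample
    } else {
        10_000           // large table: fixed 10K sample
    }
}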
pub fn compute_full_featured(rows: &[Row], schema: &TableSchema) -> Self
Compute statistics with both sampling and histograms enabled
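For comparison with compute_sampled above, a brief usage sketch:

// Adaptive sampling only.
let quick = TableStatistics::compute_sampled(&rows, &schema);
// Adaptive sampling plus histograms.
let rich = TableStatistics::compute_full_featured(&rows, &schema);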
pub fn estimate_from_row_count(row_count: usize) -> Self
Create estimated statistics from table metadata without full ANALYZE
This provides a fallback for cost estimation when detailed statistics aren’t available (i.e., ANALYZE hasn’t been run). It uses the table’s row count and provides conservative defaults for other fields.
§Use Cases
- DML cost estimation when ANALYZE hasn’t been run
- Quick cost comparisons before detailed statistics are available
§Limitations
- No per-column statistics (empty columns map)
- No histogram data
- Marked as stale to indicate these are estimates
Example
let table_stats = table.get_statistics()
.cloned()
.unwrap_or_else(|| TableStatistics::estimate_from_row_count(table.row_count()));
pub fn mark_stale(&mut self)
Mark statistics as stale after significant data changes
pub fn needs_refresh(&self) -> bool
Check if statistics should be recomputed
Returns true if stats are marked stale or too old
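Together with mark_stale, this supports a lazy refresh pattern; a sketch assuming stats is mutable and rows/schema are available:

// After a large batch of DML:
stats.mark_stale();
// Later, before using the stats for planning:
if stats.needs_refresh() {
    stats = TableStatistics::compute_sampled(&rows, &schema);
}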
Trait Implementations
impl Clone for TableStatistics

fn clone(&self) -> TableStatistics

fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.