v6.2.2 — selectivity estimation over per-column statistics.
Each selectivity function returns a fraction in [0.0, 1.0] —
the planner multiplies these against row_count to get
estimated input cardinality for each operator. v6.2.3 JOIN
reorder consumes these estimates; v6.2.4 EXPLAIN ANALYZE
surfaces them alongside the actual-rows count.
Defaults follow PG’s “no-stats” guesses so a freshly-loaded table without a prior ANALYZE still gets a plausible plan:
- DEFAULT_EQ = 0.005: PG’s DEFAULT_EQ_SEL
- DEFAULT_RANGE = 0.333: PG’s DEFAULT_INEQ_SEL
- DEFAULT_BETWEEN = 0.005: narrower than range; matches PG for BETWEEN x AND y without stats
- DEFAULT_LIKE = 0.005: PG’s DEFAULT_MATCH_SEL
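As a rough illustration of how the planner might turn one of these fractions into a row estimate, here is a minimal sketch. The constant values are copied from the list above; their `f64` type and the `estimate_rows` helper are assumptions made for this example, not part of the documented API.

```rust
// No-stats defaults, mirrored from the list above (the f64 type is assumed).
pub const DEFAULT_EQ: f64 = 0.005;
pub const DEFAULT_RANGE: f64 = 0.333;
pub const DEFAULT_BETWEEN: f64 = 0.005;
pub const DEFAULT_LIKE: f64 = 0.005;

/// Hypothetical helper: estimated rows = row_count * selectivity.
/// The selectivity is clamped to [0.0, 1.0] first, so a misbehaving
/// estimator can never yield more rows than the table holds.
pub fn estimate_rows(row_count: u64, selectivity: f64) -> f64 {
    row_count as f64 * selectivity.clamp(0.0, 1.0)
}
```

Under these assumptions, a 10,000-row table with no statistics would be estimated at about 50 rows for an equality predicate and about 3,330 rows for a one-sided range predicate.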
Histogram walks use a binary-search-based “fraction ≤ value”
primitive (fraction_le_value), giving us range estimation in
O(log n_buckets) per call. Equality keys off n_distinct
when the value lands inside the histogram range; out-of-range
values get a scaled-down, nonzero extrapolated estimate so that
out-of-range predicates don’t collapse to zero (which would make
the planner pick degenerate plans like cross-products).
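The “fraction ≤ value” primitive could be sketched roughly as follows. This is not the module's actual code: it assumes an equi-depth histogram stored as a sorted slice of `f64` boundaries (n_buckets + 1 entries) and linear interpolation inside a bucket. The signature shown here and the `range_sketch` helper are hypothetical.

```rust
/// Fraction of non-null rows with value <= `v`.
/// Assumes `bounds` is sorted ascending and describes equi-depth buckets.
/// Returns None when there is no usable histogram (or `v` is NaN), so the
/// caller can fall back to a no-stats default.
pub fn fraction_le_value(bounds: &[f64], v: f64) -> Option<f64> {
    if bounds.len() < 2 || v.is_nan() {
        return None;
    }
    let last = bounds.len() - 1;
    if v < bounds[0] {
        return Some(0.0);
    }
    if v >= bounds[last] {
        return Some(1.0);
    }
    // Binary search: index of the first boundary strictly greater than v.
    // The two checks above guarantee 1 <= hi <= last.
    let hi = bounds.partition_point(|b| *b <= v);
    let lo = hi - 1;
    // bounds[hi] > v >= bounds[lo], so the bucket width is positive.
    let within = (v - bounds[lo]) / (bounds[hi] - bounds[lo]);
    Some((lo as f64 + within) / last as f64)
}

/// Hypothetical range estimate built on the primitive:
/// fraction_le(high) - fraction_le(low), with open sides at -inf / +inf.
pub fn range_sketch(bounds: &[f64], low: Option<f64>, high: Option<f64>) -> Option<f64> {
    let upper = match high {
        Some(h) => fraction_le_value(bounds, h)?,
        None => 1.0,
    };
    let lower = match low {
        Some(l) => fraction_le_value(bounds, l)?,
        None => 0.0,
    };
    Some((upper - lower).clamp(0.0, 1.0))
}
```

Each call does one `partition_point` binary search, which is where the O(log n_buckets) cost comes from. Boundary inclusivity is ignored in this sketch.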
Constants
- DEFAULT_BETWEEN: PG’s default for col BETWEEN a AND b without stats.
- DEFAULT_EQ: PG’s default selectivity for col = constant when no histogram is available. v6.2.x can re-tune.
- DEFAULT_LIKE: PG’s default for col LIKE 'prefix%' without stats.
- DEFAULT_RANGE: PG’s default for col <= / < / >= / > constant without stats.
Functions
- between: col BETWEEN low AND high. Convenience for the inclusive double-bounded shape. Equivalent to range with both bounds set and inclusive.
- equal: col = value. With stats, returns (1 / n_distinct) × (1 - null_frac) when value lies in the histogram range, else scales down by an order of magnitude for out-of-range extrapolation. Without stats, returns DEFAULT_EQ.
- in_list: col IN (v1, v2, …). Sums per-value equality selectivities, clamped at 1.0. Without stats, returns DEFAULT_EQ × len(values) (also clamped), the same shape PG would produce.
- like_prefix: col LIKE 'prefix%' (or any single-prefix anchored pattern). With stats, estimates as range(prefix, prefix + "\u{FFFF}") on the assumption the column’s natural ordering is a prefix order (Text lex). Without stats, DEFAULT_LIKE.
- range: col >= low AND col <= high (with both bounds optional). When low is None the lower side is open at −∞; same for high and +∞. lo_incl / hi_incl control whether the boundary itself is included (currently a near-no-op since selectivity estimation is approximate at the boundary, but kept in the signature so the planner can pass the parser’s intent through).
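The equality, IN-list, and prefix rules described above could look roughly like the sketch below. The `ColumnStats` struct, its field names, the `f64` value type, and the function signatures are all assumptions for illustration. Dividing by 10 is one reading of “scales down by an order of magnitude”, and the real module may differ.

```rust
pub const DEFAULT_EQ: f64 = 0.005;

/// Hypothetical per-column statistics; field names are assumed.
pub struct ColumnStats {
    pub n_distinct: f64, // number of distinct non-null values
    pub null_frac: f64,  // fraction of rows that are NULL
    pub min: f64,        // lowest histogram boundary
    pub max: f64,        // highest histogram boundary
}

/// col = value: (1 / n_distinct) * (1 - null_frac) inside the histogram
/// range, one tenth of that outside it, DEFAULT_EQ without stats.
pub fn equal(stats: Option<&ColumnStats>, value: f64) -> f64 {
    let s = match stats {
        Some(s) if s.n_distinct > 0.0 => s,
        _ => return DEFAULT_EQ,
    };
    let base = (1.0 / s.n_distinct) * (1.0 - s.null_frac);
    let in_range = value >= s.min && value <= s.max;
    // Out-of-range values stay nonzero so the plan does not degenerate.
    let sel = if in_range { base } else { base / 10.0 };
    sel.clamp(0.0, 1.0)
}

/// col IN (v1, v2, ...): sum of per-value equality estimates, clamped at 1.0.
/// Without stats this reduces to DEFAULT_EQ * values.len(), also clamped.
pub fn in_list(stats: Option<&ColumnStats>, values: &[f64]) -> f64 {
    let total: f64 = values.iter().map(|v| equal(stats, *v)).sum();
    total.min(1.0)
}

/// Bounds used to treat col LIKE 'prefix%' as a range predicate:
/// [prefix, prefix + U+FFFF] under lexicographic text ordering.
pub fn like_prefix_bounds(prefix: &str) -> (String, String) {
    (prefix.to_string(), format!("{}\u{FFFF}", prefix))
}
```

One caveat on the prefix bounds: under byte-wise UTF-8 ordering, a string whose next character lies above U+FFFF sorts after the upper bound, so this range is an approximation for such data.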