Module substrate

The storage substrate (spec.md#substrate): pond’s one seam to Lance, generic over consumers.

Structs

ConflictExhausted
Sentinel that pond attaches to the anyhow error chain when retry_lance exhausts its attempts against an OCC commit-conflict failure (spec.md#protocol). The wire layer downcasts to this type to classify the outcome as conflict rather than the generic storage_unavailable.
DataLiveness
Bytes under data/ on disk versus bytes the latest manifest references; the gap is superseded versions waiting out the cleanup retention window.
Handle
IndexIntent
Declarative description of one index pond keeps on a table. Created when its trigger fires; folded forward by pond index optimize.
IndexStatus
MaintenancePolicy
Resolved per-call inputs to the storage-maintenance pass. Built from [maintenance] (and any per-invocation CLI override) at the entry point; threaded down to optimize_table_compact so the substrate never re-reads Config itself.
ResolvedStorage
A storage address with its options assembled and secrets materialized - everything Store::open_with_options needs, plus the binding for display.
RuntimeCaps
Lance cache caps in bytes. None lets the substrate pick the backend-aware default (local FS gets a tighter cap; object stores stay near Lance’s defaults). Wired through Store::open_with_options from [runtime].
ScanOpts
Read-side options for Handle::scan: optional prefilter predicate and optional projection. Default = no filter, all columns.
StorageUrl
A parsed pond storage address. The fat-URL grammar (s3+https://host/bucket/prefix) folds the endpoint into the address so it can never desync from the bucket (the litestream out-of-band-endpoint failure class); parsing splits it back into the URL Lance opens plus the object_store options the endpoint implies.
TableOptimizeOutcome
What Handle::optimize_table did for one table.
TableSizes
On-disk byte totals for the three session datasets, plus everything else under the data-dir root. Sized by listing through Lance’s object-store layer (spec.md#lance-chokepoints-storage) so file:// and s3:// behave alike.
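To make the fat-URL split described under StorageUrl concrete, here is a minimal sketch of how an address such as s3+https://host/bucket/prefix could be separated into the URL that gets opened and the endpoint it implies. It is not pond's parser: the type `ParsedStorageUrl`, its field names, and the function `parse_storage_url` are all hypothetical, and real parsing would also have to handle ports, credentials, and percent-encoding.

```rust
/// Hypothetical result of splitting a fat URL; field names are illustrative.
#[derive(Debug, PartialEq)]
struct ParsedStorageUrl {
    /// URL handed to the storage layer, e.g. "s3://bucket/prefix".
    open_url: String,
    /// Endpoint implied by the address, e.g. Some("https://host").
    endpoint: Option<String>,
}

/// Split `store+transport://host/bucket/prefix` into an open URL plus endpoint.
/// Addresses without a `+` in the scheme are passed through unchanged.
fn parse_storage_url(input: &str) -> Result<ParsedStorageUrl, String> {
    let (scheme, rest) = input
        .split_once("://")
        .ok_or_else(|| format!("missing scheme in {input:?}"))?;

    match scheme.split_once('+') {
        // Plain scheme (file://, s3://): nothing to fold back out.
        None => Ok(ParsedStorageUrl {
            open_url: input.to_string(),
            endpoint: None,
        }),
        // Fat scheme: the first path segment after the host is the bucket.
        Some((store, transport)) => {
            let (host, path) = rest
                .split_once('/')
                .ok_or_else(|| format!("missing bucket in {input:?}"))?;
            if host.is_empty() || path.is_empty() {
                return Err(format!("empty host or bucket in {input:?}"));
            }
            Ok(ParsedStorageUrl {
                open_url: format!("{store}://{path}"),
                endpoint: Some(format!("{transport}://{host}")),
            })
        }
    }
}
```

Under these assumptions, `parse_storage_url("s3+https://minio.example.com/bucket/prefix")` yields the open URL `s3://bucket/prefix` with the endpoint `https://minio.example.com`, so the endpoint travels with the address instead of living in a separate setting.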

Enums

BindVia
How a creds set got bound to a URL - surfaced in binding lines so a wrong match is visible before any auth error.
CheckFailure
pond storage check failure classes, each with its own exit code at the CLI so cron and CI can branch on them. Display carries only the fix-naming lead; the underlying error is exposed separately through CheckFailure::concise_cause so surfaces stay one readable line instead of trailing the upstream chain (Lance flattens its inner errors into each level’s Display, so the raw chain prints the same failure several times over).
CredsBinding
IndexParamsKind
The lance-native shape of an IndexIntent’s params, dispatched to the right IndexParams at create time.
IndexTrigger
When an IndexIntent should exist on disk.
OptimizeEvent
Boundary event during one Handle::optimize_table pass. The CLI binds a progress callback to render a live spinner; library callers pass None.
OptimizePhase
PhaseOutcome
Per-phase result for one table’s pass through Handle::optimize_table. spec.md#substrate 3.7 (lance-index-maintenance): the indices phase and the compaction phase get independent retry budgets and independent commits, so a hot writer that starves the Rewrite cannot abort the index Update.
Predicate
ScalarValue
Table
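The PhaseOutcome entry above says the indices phase and the compaction phase get independent retry budgets and independent commits. A minimal sketch of that idea follows. The names `PhaseResult`, `run_phase`, and `optimize_pass` are hypothetical, the closures stand in for real commits, and running indices before compaction is an assumption made for the example rather than something this page states.

```rust
/// Hypothetical per-phase result; variant names are illustrative.
#[derive(Debug, PartialEq)]
enum PhaseResult {
    /// The phase committed on the given attempt (1-based).
    Committed { attempts: u32 },
    /// Every attempt in the phase's budget hit a conflict.
    Exhausted { attempts: u32 },
}

/// Run one phase against its own retry budget.
/// `op` returns Ok(()) on commit and Err(()) on a retryable conflict.
fn run_phase<F>(budget: u32, mut op: F) -> PhaseResult
where
    F: FnMut() -> Result<(), ()>,
{
    for attempt in 1..=budget {
        if op().is_ok() {
            return PhaseResult::Committed { attempts: attempt };
        }
    }
    PhaseResult::Exhausted { attempts: budget }
}

/// Run both phases; neither outcome short-circuits the other.
fn optimize_pass<I, C>(budget: u32, indices: I, compaction: C) -> (PhaseResult, PhaseResult)
where
    I: FnMut() -> Result<(), ()>,
    C: FnMut() -> Result<(), ()>,
{
    let index_outcome = run_phase(budget, indices);
    // Compaction gets a fresh budget; its failure leaves the index result intact.
    let compaction_outcome = run_phase(budget, compaction);
    (index_outcome, compaction_outcome)
}
```

Because each phase reports its own result, a compaction that never wins its commit race shows up as `Exhausted` next to an index phase that still reports `Committed`.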

Constants

COMPACTION_ABSORB_FACTOR
Keep a task only when the merged-in remainder is >= largest / COMPACTION_ABSORB_FACTOR: size-tiered amortization, O(log n) lifetime rewrites per row.
DEFAULT_COMPACTION_FRAGMENT_CAP
Per-task fragment-count backstop: tasks this wide always run, bounding manifest growth even when the amplification veto would skip them. When passed as the policy cap, 0 disables the veto (used by tests).
DEFAULT_INDEX_LAG_THRESHOLD
Default minimum unindexed-fragment count required before a per-intent append/rebuild step is admitted into optimize_table_indices. Lower values make each commit smaller and more frequent (bad on remote stores); higher values let fragments accumulate behind the brute-force fallback. 4 is the floor of the documented 4-8 band.
TARGET_FRAGMENT_BYTES
Fragments are sized by bytes, not Lance’s 1M-row default: with rows averaging kilobytes, a row-count target permits multi-GiB fragments, which compaction then rewrites wholesale to absorb tiny appends (~190 GiB/day of churn).
VECTOR_INDEX_ACTIVATION_ROWS
Embedded-row count at which pond builds the IVF_PQ vector index on messages.vector (spec.md#search). Below it, vector search runs a brute-force flat scan - exact and fast at small and medium scale, and IVF_PQ cannot train well on fewer vectors anyway.
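The two compaction constants above combine into a keep-or-skip decision per task. The sketch below shows one way that decision could read, assuming a task is described by the byte sizes of the fragments it would merge. The function `keep_task` and its parameters are hypothetical, and the factor and cap values passed in are up to the caller because this page does not show the real constants' values.

```rust
/// Decide whether a compaction task should run.
///
/// `fragment_bytes`: sizes of the fragments the task would merge (assumed input shape).
/// `absorb_factor`: stand-in for COMPACTION_ABSORB_FACTOR.
/// `fragment_cap`: stand-in for the policy cap (DEFAULT_COMPACTION_FRAGMENT_CAP by default).
fn keep_task(fragment_bytes: &[u64], absorb_factor: u64, fragment_cap: usize) -> bool {
    // Backstop: tasks at least this wide always run. A cap of 0 makes this
    // true for every task, which is how 0 disables the veto.
    if fragment_bytes.len() >= fragment_cap {
        return true;
    }

    let largest = fragment_bytes.iter().copied().max().unwrap_or(0);
    let total: u64 = fragment_bytes.iter().sum();
    // Everything that would be merged into the largest fragment.
    let remainder = total - largest;

    // Amplification veto: rewriting `largest` is only worth it when the
    // remainder is at least largest / absorb_factor.
    remainder >= largest / absorb_factor.max(1)
}
```

With a factor of 4, a task of 1000, 50, and 50 bytes is skipped (remainder 100 is below 250), while 1000, 200, and 100 bytes is kept (remainder 300). Either task runs regardless once its fragment count reaches the cap.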

Functions

default_cleanup_older_than
Default manifest-retention window for the safe cleanup pass. Matches LanceDB’s recommended OSS-operator practice (lancedb docs: performance.mdx, tables/update.mdx). With delete_unverified=false, Lance’s 7-day in-progress guard still protects unverified files regardless of this value (UNVERIFIED_THRESHOLD_DAYS in lance/dataset/cleanup.rs).
index_lag_threshold
init_index_lag_threshold
Seed the process-wide index-lag threshold from [maintenance].index_lag_threshold. First call wins (mirrors embed::init_model_id / sessions::init_embedding_dim).
is_commit_conflict
True when the chain root is one of Lance’s commit-conflict variants (CommitConflict, RetryableCommitConflict, TooMuchWriteContention). Everything else (timeouts, IAM denials, disk errors) is not a conflict.
storage_check
Probe a resolved storage destination end-to-end (spec.md#substrate): a conditional PutMode::Create pair proving the If-None-Match -> 412 OCC primitive Lance’s commit handler relies on, then read-back and delete of the synthetic key.
unmatched_creds_sets
Names of defined creds sets that bound to none of this invocation’s URLs (spec.md#creds-scope-match: misbinding must never be silent). Empty when the invocation touched no credential-taking URL - a local-only command must not nag about sets kept for remote work.
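The chain-root rule described for is_commit_conflict can be illustrated with the standard library alone. In the sketch below, `StoreError` is a stand-in enum that borrows the three variant names listed above and is not the real Lance error type. `Context`, `chain_root`, and `is_commit_conflict_sketch` are hypothetical names, and pond's own implementation works on an anyhow chain rather than on `std::error::Error` directly.

```rust
use std::error::Error;
use std::fmt;

/// Stand-in for the relevant error variants; NOT the real Lance error type.
#[derive(Debug)]
enum StoreError {
    CommitConflict,
    RetryableCommitConflict,
    TooMuchWriteContention,
    Timeout,
}

impl fmt::Display for StoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{self:?}")
    }
}

impl Error for StoreError {}

/// Hypothetical context layer, like one level of an error chain.
#[derive(Debug)]
struct Context {
    msg: &'static str,
    source: Box<dyn Error + 'static>,
}

impl fmt::Display for Context {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.msg)
    }
}

impl Error for Context {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(self.source.as_ref())
    }
}

/// Follow `source()` links until the innermost error is reached.
fn chain_root<'a>(err: &'a (dyn Error + 'static)) -> &'a (dyn Error + 'static) {
    let mut cur = err;
    while let Some(next) = cur.source() {
        cur = next;
    }
    cur
}

/// True only when the chain root is one of the three conflict variants.
fn is_commit_conflict_sketch(err: &(dyn Error + 'static)) -> bool {
    matches!(
        chain_root(err).downcast_ref::<StoreError>(),
        Some(
            StoreError::CommitConflict
                | StoreError::RetryableCommitConflict
                | StoreError::TooMuchWriteContention
        )
    )
}
```

Classifying on the root rather than on any intermediate layer means a timeout wrapped in "commit failed" context still reads as a non-conflict.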

Type Aliases

OptimizeProgressFn
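The OptimizeEvent entry notes that the CLI binds a progress callback while library callers pass None. The sketch below shows that optional-callback shape in general terms. The definition of OptimizeProgressFn is not shown on this page, so the `Event` enum, its variants, the callback signature, and `optimize_with_progress` are all assumptions made for illustration.

```rust
/// Hypothetical boundary events; the real OptimizeEvent variants are not listed here.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    PhaseStarted(&'static str),
    PhaseFinished(&'static str),
}

/// Walk two placeholder phases, reporting boundaries to the callback if one is bound.
fn optimize_with_progress(mut progress: Option<&mut dyn FnMut(Event)>) {
    for phase in ["indices", "compaction"] {
        if let Some(cb) = progress.as_deref_mut() {
            cb(Event::PhaseStarted(phase));
        }
        // The phase's real work would happen here.
        if let Some(cb) = progress.as_deref_mut() {
            cb(Event::PhaseFinished(phase));
        }
    }
}
```

A CLI-style caller would pass `Some(&mut callback)` and redraw a spinner on each event. A library caller passes `None` and pays only for the two `Option` checks per phase.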