Struct Store 

Source
pub struct Store { /* private fields */ }

Implementations§

Source§

impl Store

Source

pub async fn open(location: &Url) -> Result<Self>

Open a store at a local filesystem URL, or at a remote URL when the caller has no extra options to pass (environment variables suffice). CLI verbs that load [storage] from config should call Store::open_with_options instead so the same options flow into every dataset open and write.

Source

pub async fn open_with_options( location: &Url, storage_options: HashMap<String, String>, caps: RuntimeCaps, ) -> Result<Self>

Open with object-store options (S3 creds, region, endpoint, …) threaded through Lance verbatim. Keys are the standard object_store config names; pond does not parse them. Passing empty options with default caps is equivalent to calling Store::open. Cache caps come from the [runtime] config block via crate::substrate::RuntimeCaps.
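As an illustration of the option map described above, here is a minimal sketch that only builds the HashMap. The helper name s3_storage_options is hypothetical and the values are placeholders. The keys aws_region and aws_endpoint are object_store configuration names. How RuntimeCaps is constructed is not shown on this page, so the call to Store::open_with_options itself is left out.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of pond): build the option map for an
/// S3-compatible backend. Keys are `object_store` config names; pond is
/// documented as passing them through to Lance without parsing them.
fn s3_storage_options(region: &str, endpoint: Option<&str>) -> HashMap<String, String> {
    let mut options = HashMap::new();
    options.insert("aws_region".to_string(), region.to_string());
    // A custom endpoint is only needed for S3-compatible services.
    if let Some(endpoint) = endpoint {
        options.insert("aws_endpoint".to_string(), endpoint.to_string());
    }
    options
}
```

The resulting map would be passed as the storage_options argument; an empty map corresponds to the Store::open behaviour.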

Source

pub async fn open_local(path: impl AsRef<Path>) -> Result<Self>

Convenience for tests and CLI verbs holding a &Path: wraps the path in a file://... URL via config::url_for_path before opening. Routes through Store::open_with_options so the production policy is applied; tests get the backend-aware local-FS defaults.

Source

pub async fn export_clean_lance_datasets( &self, dest: &Path, ) -> Result<LanceArchiveExport>

Export clean, index-free Lance datasets into dest.

This rewrites the visible rows of each table instead of copying the dataset roots. The resulting manifests therefore contain no references to the source store’s _indices, while messages.vector and messages.embedding_model remain ordinary data columns and are preserved.

Source

pub async fn import_clean_lance_datasets( &self, source: &Path, ) -> Result<LanceArchiveImport>

Source

pub async fn upsert_sessions( &self, sessions: &[Session], ) -> Result<Vec<UpsertStatus>>

Source

pub async fn upsert_messages( &self, session: &Session, messages: &[MessageWrite<'_>], ) -> Result<Vec<UpsertStatus>>

Source

pub async fn upsert_parts(&self, parts: &[Part]) -> Result<Vec<UpsertStatus>>

Source

pub async fn get_session( &self, session_id: &str, ) -> Result<Option<SessionWithMessages>>

Source

pub async fn session_ids(&self) -> Result<Vec<String>>

Every session id currently in the store, unsorted.

Source

pub async fn child_sessions( &self, parent_session_id: &str, ) -> Result<Vec<Session>>

Source

pub async fn session_last_ingested_at( &self, ) -> Result<HashMap<String, DateTime<Utc>>>

Maps each session_id to the wall-clock commit time of the Lance manifest version that last wrote its row, for the per-session staleness skip (spec.md#adapter-integrity-event-ordering). Reads Lance’s _row_last_updated_at_version system column (available because pond enables stable row ids per spec.md#lance-table-creation-stable-row-ids) and joins it against Dataset::versions() for commit timestamps.
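The join described above can be pictured with plain maps. This is a sketch under assumptions: the function name, the use of Unix seconds instead of DateTime<Utc>, and the choice to skip rows whose version has no known commit time are all illustrative, not pond's actual behaviour.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory join. `row_versions` pairs each session id with the
/// manifest version that last updated its row; `version_times` maps a version
/// number to its commit time (Unix seconds here, for simplicity).
fn last_ingested_at(
    row_versions: &[(String, u64)],
    version_times: &HashMap<u64, i64>,
) -> HashMap<String, i64> {
    row_versions
        .iter()
        .filter_map(|(session_id, version)| {
            // Assumption: a row whose version is unknown is skipped.
            version_times
                .get(version)
                .map(|ts| (session_id.clone(), *ts))
        })
        .collect()
}
```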

Source

pub async fn session_view( &self, session_id: &str, params: SessionViewParams<'_>, ) -> Result<GetLookup<SessionPage>>

Whole-session view for pond_get session mode (spec.md#protocol). Conversational filters to search_text IS NOT NULL; Complete and Verbatim scan every message. Every mode attaches compact part summaries; Verbatim additionally inlines full parts. after_id is an exclusive lower bound (a message id); the page is bounded by limit and a byte budget and never cuts mid-message.
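The paging rule (exclusive after_id cursor, bounded by limit and a byte budget, never cutting mid-message) can be sketched over an in-memory slice. PageMsg and page are hypothetical names. Two behaviours are assumptions, since the page above does not state them: the first message of a page is always included even if it alone exceeds the budget, and an unknown cursor yields an empty page.

```rust
/// Hypothetical stand-in for a stored message; only what the sketch needs.
struct PageMsg {
    id: String,
    body: String,
}

/// Whole messages strictly after `after_id`, stopping at `limit` messages or
/// when the next whole message would push the page past `byte_budget`.
fn page<'a>(
    messages: &'a [PageMsg],
    after_id: Option<&str>,
    limit: usize,
    byte_budget: usize,
) -> Vec<&'a PageMsg> {
    // Exclusive lower bound: start one past the cursor message.
    let start = match after_id {
        Some(cursor) => messages
            .iter()
            .position(|m| m.id == cursor)
            .map_or(messages.len(), |i| i + 1),
        None => 0,
    };
    let mut out = Vec::new();
    let mut used = 0usize;
    for msg in &messages[start..] {
        if out.len() == limit {
            break;
        }
        let size = msg.body.len();
        // Assumption: the first message always fits so the page makes progress.
        if !out.is_empty() && used + size > byte_budget {
            break; // stop at a message boundary, never emit a partial message
        }
        used += size;
        out.push(msg);
    }
    out
}
```

A caller would pass the id of the last message it received as the next after_id.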

Source

pub async fn message_view( &self, message_id: &str, params: MessageViewParams<'_>, ) -> Result<GetLookup<MessagePage>>

Message-scope retrieval for pond_get message mode (spec.md#protocol): the target with its full parts (paginated by after_id over part ordinals, then budget) plus up to 2*context_depth siblings around it. None when no stored message carries message_id. Sibling parts are carried for summarizing; the target’s parts ride target_parts.
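To make the "up to 2*context_depth siblings" bound concrete, here is a small sketch of the index arithmetic. It assumes a symmetric window (context_depth on each side, clipped at the session boundaries); the function name sibling_window is hypothetical and the real selection logic may differ.

```rust
use std::ops::Range;

/// Index ranges of the siblings before and after `target` in a session of
/// `len` ordered messages. Assumption: `context_depth` messages on each side,
/// clipped at the ends, so at most 2 * context_depth siblings in total.
fn sibling_window(
    len: usize,
    target: usize,
    context_depth: usize,
) -> (Range<usize>, Range<usize>) {
    assert!(target < len, "target index must be in bounds");
    let before = target.saturating_sub(context_depth)..target;
    let after_end = target
        .saturating_add(context_depth)
        .saturating_add(1)
        .min(len);
    let after = (target + 1)..after_end;
    (before, after)
}
```

Near the start or end of a session the window shrinks rather than shifting, which is why the bound is stated as "up to".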

Source

pub async fn scan_conversational_messages( &self, session_id: &str, ) -> Result<Vec<ConversationalRow>>

Conversational scan over one session: rows ordered by (timestamp, id), IsNotNull("search_text") pushed down at the read seam (spec.md#search-prefilter-pushdown).
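The semantics of the scan (keep rows with non-null search_text, order by (timestamp, id)) can be shown in memory. In pond the filter is pushed down into the read; the sketch below applies it after the fact purely to illustrate the result. ConvRow and conversational_scan are hypothetical names.

```rust
/// Hypothetical row shape; the real ConversationalRow fields are not shown here.
#[derive(Debug, Clone, PartialEq)]
struct ConvRow {
    id: String,
    timestamp: i64,
    search_text: Option<String>,
}

/// Keep rows whose `search_text` is set and order them by (timestamp, id).
fn conversational_scan(rows: &[ConvRow]) -> Vec<ConvRow> {
    let mut kept: Vec<ConvRow> = rows
        .iter()
        .filter(|r| r.search_text.is_some())
        .cloned()
        .collect();
    // The id breaks ties between messages that share a timestamp.
    kept.sort_by(|a, b| (a.timestamp, &a.id).cmp(&(b.timestamp, &b.id)));
    kept
}
```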

Source

pub async fn session_id_for_message( &self, message_id: &str, ) -> Result<Option<String>>

Locate the session id for a stored message. Cheap when only the routing hint is needed - callers that need the messages use scan_all_messages.

Source

pub async fn row_counts(&self) -> Result<(usize, usize, usize)>

Source

pub async fn corpus_stats(&self, include_subagents: bool) -> Result<CorpusStats>

Compute the per-adapter / per-project rollup that drives pond status. One scan over messages projecting the three columns the rollup keys on (source_agent, project, session_id), aggregated in-memory. Bounded by the cross product of adapters and projects, which stays small on real corpora.
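A rollup of this shape might look like the following sketch. The Rollup struct and its two counters are assumptions (the fields of CorpusStats are not shown on this page), and the include_subagents flag is not modelled.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical per-(adapter, project) totals.
#[derive(Debug, Default, PartialEq)]
struct Rollup {
    sessions: usize,
    messages: usize,
}

/// Aggregate (source_agent, project, session_id) triples, one per message row.
fn rollup(rows: &[(&str, &str, &str)]) -> BTreeMap<(String, String), Rollup> {
    // Session ids already counted for each key, so sessions are counted once.
    let mut seen: BTreeMap<(String, String), BTreeSet<String>> = BTreeMap::new();
    let mut out: BTreeMap<(String, String), Rollup> = BTreeMap::new();
    for &(agent, project, session_id) in rows {
        let key = (agent.to_string(), project.to_string());
        let entry = out.entry(key.clone()).or_default();
        entry.messages += 1;
        if seen.entry(key).or_default().insert(session_id.to_string()) {
            entry.sessions += 1;
        }
    }
    out
}
```

The output map has one entry per (adapter, project) pair that actually occurs, which is the bound the description refers to.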

Source

pub async fn write_embeddings(&self, rows: &[EmbeddedMessage]) -> Result<()>

Write a batch of embeddings into messages: set vector and embedding_model on each row by (session_id, id) (spec.md#session-embed-from-canonical). The column update goes through the write seam and lands as a new manifest version (append-only).

Source

pub fn pending_embedding_messages( &self, ) -> impl Stream<Item = Result<PendingMessage>> + '_

Stream the backlog of messages needing embedding: rows with search_text set whose vector is null (spec.md#session-embed-from-canonical).

Source

pub fn pending_or_stale_messages( &self, ) -> impl Stream<Item = Result<PendingMessage>> + '_

Stream messages that are either never embedded or stale under the current model. pond embed --force feeds this to the same unconditional merge_update as the normal backlog; the filter makes that semantically equivalent to the conditional update in spec.md#session-embed-from-canonical.
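The two stream filters can be read as row predicates. The sketch below is an in-memory illustration only: EmbedRow is a hypothetical row shape, and "stale" is taken to mean a populated vector whose embedding_model differs from the current model, following the wording of stale_embedding_count.

```rust
/// Hypothetical projection of a messages row.
struct EmbedRow {
    search_text: Option<String>,
    vector: Option<Vec<f32>>,
    embedding_model: Option<String>,
}

/// Backlog filter: has text to embed, but no vector yet.
fn is_pending(row: &EmbedRow) -> bool {
    row.search_text.is_some() && row.vector.is_none()
}

/// Assumption: stale means "embedded under a model other than the current one".
fn is_stale(row: &EmbedRow, current_model: &str) -> bool {
    row.vector.is_some() && row.embedding_model.as_deref() != Some(current_model)
}

/// Filter for the forced re-embed path: either never embedded or stale.
fn is_pending_or_stale(row: &EmbedRow, current_model: &str) -> bool {
    is_pending(row) || is_stale(row, current_model)
}
```

Rows already embedded under the current model fail both predicates, which is what makes an unconditional update over the filtered stream safe to repeat.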

BM25 full-text retriever over messages.search_text.

Source

pub async fn has_embeddings(&self) -> Result<bool>

Whether any messages row carries a vector (spec.md#search) - the signal that flips search from FTS-only to hybrid. The single-active-model invariant (see MESSAGE_SCALAR_INDICES) means any non-null vector belongs to the current model.

Vector kNN retriever over messages.vector, prefiltered by the caller’s scalar predicate (spec.md#search-prefilter-pushdown). Combines the caller’s filter with vector IS NOT NULL to exclude un-embedded rows from the scan; the brute-force kNN path requires this (the IVF_PQ path would skip them anyway). The single-active-model invariant lets pond drop the per-row model filter: every non-null vector belongs to the current model.

Source

pub async fn explain_vector_plan( &self, query: &[f32], limit: usize, filter: &Predicate, search: Option<&SearchConfig>, ) -> Result<String>

The DataFusion plan string for a filtered vector scan - the search-prefilter-pushdown regression guard reads it.

Source

pub async fn explain_fts_plan( &self, query: &str, limit: usize, filter: &Predicate, ) -> Result<String>

Source

pub async fn message_metas_by_keys( &self, keys: &[MessageKey], ) -> Result<Vec<MessageMeta>>

Hydrate search hits: fetch message metadata for (session_id, message_id) keys.

Source

pub async fn session_message_counts( &self, session_ids: &[String], ) -> Result<BTreeMap<String, usize>>

Total message count per session, for search session summaries.

Source

pub async fn unindexed_message_backlog(&self) -> Result<usize>

Rows appended to messages since the FTS index was last optimized. A missing index reports the whole table; the query is manifest-only.

Source

pub async fn unindexed_vector_backlog(&self) -> Result<usize>

Rows added or rewritten in messages since the IVF_PQ vector index was last optimized. Below VECTOR_INDEX_ACTIVATION_ROWS no index exists yet, so the caller must read embedding_progress too and distinguish “index not built yet” from “index trails data”.
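One way a caller might combine the backlog with embedding progress is sketched below. Everything here is an assumption for illustration: the enum, the function name, comparing the embedded row count against the activation threshold, and passing the threshold as a parameter (the value of VECTOR_INDEX_ACTIVATION_ROWS is not given on this page).

```rust
/// Hypothetical classification of the vector index state.
#[derive(Debug, PartialEq)]
enum VectorIndexState {
    /// Too few embedded rows for an index to exist yet.
    NotBuiltYet,
    /// An index exists but `backlog` rows are not folded into it.
    TrailsData { backlog: usize },
    UpToDate,
}

/// `embedded_rows` would come from embedding progress, `backlog` from the
/// unindexed vector backlog, `activation_rows` from the activation constant.
fn classify(embedded_rows: usize, backlog: usize, activation_rows: usize) -> VectorIndexState {
    if embedded_rows < activation_rows {
        VectorIndexState::NotBuiltYet
    } else if backlog > 0 {
        VectorIndexState::TrailsData { backlog }
    } else {
        VectorIndexState::UpToDate
    }
}
```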

Source

pub async fn embedding_progress(&self) -> Result<EmbeddingProgress>

Embedding coverage: how many messages rows carry a vector and how many are still eligible. Drives the pond status embeddings line and the pond embed progress bar’s known total. embedded reads the vector IS NOT NULL count directly - the single-active-model invariant (see MESSAGE_SCALAR_INDICES) means there is no need to scope by the embedding_model column.

Source

pub async fn stale_embedding_count(&self) -> Result<usize>

Count rows whose embedding_model is not the currently configured model AND whose vector is still populated - the signal pond embed uses to detect a model swap and require --force.

Source

pub async fn optimize_indices( &self, progress: Option<OptimizeProgressFn>, cleanup: Option<CleanupConfig>, ) -> Result<OptimizeOutcome>

Run the per-table maintenance cycle (compact + indices) across every table, never short-circuiting. spec.md#lance-index-maintenance: indices and compaction commit independently, so a hot writer that starves compaction on one table does not abort the index work the operator asked for on other tables (or even on the same table).
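The "never short-circuiting" behaviour amounts to recording each step's result instead of propagating the first error with `?`. A minimal sketch, with hypothetical names (TableOutcome, maintain_all) and string errors standing in for the crate's error type:

```rust
/// Hypothetical per-table record; the real OptimizeOutcome is not shown here.
#[derive(Debug, PartialEq)]
struct TableOutcome {
    table: String,
    compaction: Result<(), String>,
    indices: Result<(), String>,
}

/// Run both maintenance steps for every table and keep every result.
fn maintain_all(
    tables: &[&str],
    mut compact: impl FnMut(&str) -> Result<(), String>,
    mut index: impl FnMut(&str) -> Result<(), String>,
) -> Vec<TableOutcome> {
    tables
        .iter()
        .map(|&table| TableOutcome {
            table: table.to_string(),
            // A compaction failure is recorded, not returned early, so the
            // index step below still runs for this table and the next ones.
            compaction: compact(table),
            indices: index(table),
        })
        .collect()
}
```

The caller can then report failures per table and per step after the whole cycle has run.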

Source

pub async fn build_indices_only( &self, progress: Option<OptimizeProgressFn>, ) -> Result<OptimizeOutcome>

Fold trailing fragments into existing indices across every table, without running compaction. Used by pond embed’s tail so newly written vectors land in the FTS / IVF_PQ / btree / bitmap indices without paying the compaction retry budget while embed itself may still be writing in a sibling process.

Source

pub async fn rebuild_indices(&self, intent_name: Option<&str>) -> Result<()>

Source

pub async fn index_status(&self) -> Result<Vec<IndexStatus>>

Source

pub async fn drop_vector_index(&self) -> Result<()>

Drop the IVF_PQ index on messages.vector. Used by pond embed --force before re-bootstrapping under a different model. Silent when the index does not exist.

Source

pub async fn table_sizes(&self) -> Result<TableSizes>

On-disk byte totals per dataset, sized through Lance’s object store (spec.md#lance-chokepoints-storage) so pond status works on any backend.

Source

pub async fn text_script_histogram( &self, max_messages: usize, ) -> Result<Vec<(String, usize)>>

Histogram of Unicode script classes in messages.search_text, computed from a sample of up to max_messages non-null rows. Returned classes are sorted descending by character count. Lets pond status tell an agent whether the corpus is monolingual or mixed - the agent then knows whether bilingual querying is worth attempting (cross-lingual recall is a caller-layer concern; pond does not translate internally).
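A histogram of this kind can be approximated with the standard library alone. The sketch below classifies characters by coarse Unicode block ranges, which is an assumption made for brevity; the classes pond actually reports, and how it detects scripts, are not specified on this page.

```rust
use std::collections::HashMap;

/// Coarse, block-based script guess. Non-alphabetic characters and blocks
/// not listed here are ignored.
fn script_class(c: char) -> Option<&'static str> {
    if !c.is_alphabetic() {
        return None;
    }
    match c as u32 {
        0x0041..=0x005A | 0x0061..=0x007A | 0x00C0..=0x024F => Some("Latin"),
        0x0370..=0x03FF => Some("Greek"),
        0x0400..=0x04FF => Some("Cyrillic"),
        0x0590..=0x05FF => Some("Hebrew"),
        0x0600..=0x06FF => Some("Arabic"),
        0x3040..=0x309F => Some("Hiragana"),
        0x30A0..=0x30FF => Some("Katakana"),
        0x4E00..=0x9FFF => Some("Han"),
        0xAC00..=0xD7AF => Some("Hangul"),
        _ => None,
    }
}

/// Count characters per class over at most `max_messages` texts, sorted
/// descending by count (ties broken by class name for a stable order).
fn script_histogram(texts: &[&str], max_messages: usize) -> Vec<(String, usize)> {
    let mut counts: HashMap<&'static str, usize> = HashMap::new();
    for text in texts.iter().take(max_messages) {
        for class in text.chars().filter_map(script_class) {
            *counts.entry(class).or_insert(0) += 1;
        }
    }
    let mut out: Vec<(String, usize)> = counts
        .into_iter()
        .map(|(class, n)| (class.to_string(), n))
        .collect();
    out.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    out
}
```

A result dominated by a single class suggests a monolingual corpus; several classes with comparable counts suggest a mixed one.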

Source

pub async fn message_vector_by_id( &self, message_id: &str, ) -> Result<Option<Vec<f32>>>

Fetch the stored vector for one message, for similar_to-style retrieval. Returns Ok(None) when the message exists but is not yet embedded, or when no message has that id. Vectors are stored Float16 (embedding_vector_type); the search path takes &[f32], so we widen here. Cheap rare op - one predicate scan, no index.
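The widening step can be illustrated at the bit level. This is a sketch that decodes IEEE 754 binary16 values held as raw u16 bit patterns using only the standard library; pond's actual conversion (for example through an Arrow or half-precision crate) is not shown on this page, and the function names are hypothetical.

```rust
/// Convert one IEEE 754 binary16 value, given as raw bits, to f32.
/// Every binary16 value is exactly representable in f32, so this is lossless.
fn f16_bits_to_f32(bits: u16) -> f32 {
    let sign = u32::from(bits >> 15) << 31;
    let exp = u32::from((bits >> 10) & 0x1F);
    let frac = u32::from(bits & 0x03FF);
    match exp {
        0 => {
            // Zero or subnormal: value = frac * 2^-24.
            let magnitude = frac as f32 / 16_777_216.0;
            if sign != 0 { -magnitude } else { magnitude }
        }
        // Infinity or NaN: all-ones exponent, payload shifted into place.
        0x1F => f32::from_bits(sign | 0x7F80_0000 | (frac << 13)),
        // Normal number: rebias the exponent (127 - 15 = 112).
        _ => f32::from_bits(sign | ((exp + 112) << 23) | (frac << 13)),
    }
}

/// Widen a stored half-precision vector for use as a `&[f32]` query.
fn widen(stored: &[u16]) -> Vec<f32> {
    stored.iter().copied().map(f16_bits_to_f32).collect()
}
```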

Source

pub async fn parts_for_messages( &self, session_id: &str, message_ids: &[String], ) -> Result<BTreeMap<(String, String), Vec<Part>>>

Every part of these messages, full fidelity (file blobs included). The canonical read primitive - restore/export, verbatim mode, and the message-mode target all need the complete set.

Source

pub async fn summary_parts_for_messages( &self, session_id: &str, message_ids: &[String], ) -> Result<BTreeMap<(String, String), Vec<Part>>>

Only the parts that yield a [PartSummary] (SUMMARY_PART_TYPES), skipping text/reasoning (and their blobs) that would summarize to nothing. For the summary-only reads (conversational/complete session views, search hits) - it never feeds restore/export.

Trait Implementations§

Source§

impl Debug for Store

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

§

impl !Freeze for Store

§

impl !RefUnwindSafe for Store

§

impl Send for Store

§

impl Sync for Store

§

impl Unpin for Store

§

impl UnsafeUnpin for Store

§

impl !UnwindSafe for Store

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> Conv for T

Source§

fn conv<T>(self) -> T
where Self: Into<T>,

Converts self into T using Into<T>.
Source§

impl<T> DropFlavorWrapper<T> for T

Source§

type Flavor = MayDrop

The DropFlavor that wraps T into Self
Source§

impl<T> FmtForward for T

Source§

fn fmt_binary(self) -> FmtBinary<Self>
where Self: Binary,

Causes self to use its Binary implementation when Debug-formatted.
Source§

fn fmt_display(self) -> FmtDisplay<Self>
where Self: Display,

Causes self to use its Display implementation when Debug-formatted.
Source§

fn fmt_lower_exp(self) -> FmtLowerExp<Self>
where Self: LowerExp,

Causes self to use its LowerExp implementation when Debug-formatted.
Source§

fn fmt_lower_hex(self) -> FmtLowerHex<Self>
where Self: LowerHex,

Causes self to use its LowerHex implementation when Debug-formatted.
Source§

fn fmt_octal(self) -> FmtOctal<Self>
where Self: Octal,

Causes self to use its Octal implementation when Debug-formatted.
Source§

fn fmt_pointer(self) -> FmtPointer<Self>
where Self: Pointer,

Causes self to use its Pointer implementation when Debug-formatted.
Source§

fn fmt_upper_exp(self) -> FmtUpperExp<Self>
where Self: UpperExp,

Causes self to use its UpperExp implementation when Debug-formatted.
Source§

fn fmt_upper_hex(self) -> FmtUpperHex<Self>
where Self: UpperHex,

Causes self to use its UpperHex implementation when Debug-formatted.
Source§

fn fmt_list(self) -> FmtList<Self>
where for<'a> &'a Self: IntoIterator,

Formats each item in a sequence.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, W> HasTypeWitness<W> for T
where W: MakeTypeWitness<Arg = T>, T: ?Sized,

Source§

const WITNESS: W = W::MAKE

A constant of the type witness
Source§

impl<T> Identity for T
where T: ?Sized,

Source§

const TYPE_EQ: TypeEq<T, <T as Identity>::Type> = TypeEq::NEW

Proof that Self is the same type as Self::Type, provides methods for casting between Self and Self::Type.
Source§

type Type = T

The same type as Self, used to emulate type equality bounds (T == U) with associated type equality constraints (T: Identity<Type = U>).
Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

impl<Unshared, Shared> IntoShared<Shared> for Unshared
where Shared: FromUnshared<Unshared>,

Source§

fn into_shared(self) -> Shared

Creates a shared type from an unshared type.
Source§

impl<T> Pipe for T
where T: ?Sized,

Source§

fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> R
where Self: Sized,

Pipes by value. This is generally the method you want to use.
Source§

fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> R
where R: 'a,

Borrows self and passes that borrow into the pipe function.
Source§

fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> R
where R: 'a,

Mutably borrows self and passes that borrow into the pipe function.
Source§

fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
where Self: Borrow<B>, B: 'a + ?Sized, R: 'a,

Borrows self, then passes self.borrow() into the pipe function.
Source§

fn pipe_borrow_mut<'a, B, R>( &'a mut self, func: impl FnOnce(&'a mut B) -> R, ) -> R
where Self: BorrowMut<B>, B: 'a + ?Sized, R: 'a,

Mutably borrows self, then passes self.borrow_mut() into the pipe function.
Source§

fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
where Self: AsRef<U>, U: 'a + ?Sized, R: 'a,

Borrows self, then passes self.as_ref() into the pipe function.
Source§

fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
where Self: AsMut<U>, U: 'a + ?Sized, R: 'a,

Mutably borrows self, then passes self.as_mut() into the pipe function.
Source§

fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
where Self: Deref<Target = T>, T: 'a + ?Sized, R: 'a,

Borrows self, then passes self.deref() into the pipe function.
Source§

fn pipe_deref_mut<'a, T, R>( &'a mut self, func: impl FnOnce(&'a mut T) -> R, ) -> R
where Self: DerefMut<Target = T> + Deref, T: 'a + ?Sized, R: 'a,

Mutably borrows self, then passes self.deref_mut() into the pipe function.
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.
Source§

impl<T> PolicyExt for T
where T: ?Sized,

Source§

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.
Source§

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T> Tap for T

Source§

fn tap(self, func: impl FnOnce(&Self)) -> Self

Immutable access to a value.
Source§

fn tap_mut(self, func: impl FnOnce(&mut Self)) -> Self

Mutable access to a value.
Source§

fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
where Self: Borrow<B>, B: ?Sized,

Immutable access to the Borrow<B> of a value.
Source§

fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
where Self: BorrowMut<B>, B: ?Sized,

Mutable access to the BorrowMut<B> of a value.
Source§

fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
where Self: AsRef<R>, R: ?Sized,

Immutable access to the AsRef<R> view of a value.
Source§

fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
where Self: AsMut<R>, R: ?Sized,

Mutable access to the AsMut<R> view of a value.
Source§

fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
where Self: Deref<Target = T>, T: ?Sized,

Immutable access to the Deref::Target of a value.
Source§

fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
where Self: DerefMut<Target = T> + Deref, T: ?Sized,

Mutable access to the Deref::Target of a value.
Source§

fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self

Calls .tap() only in debug builds, and is erased in release builds.
Source§

fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self

Calls .tap_mut() only in debug builds, and is erased in release builds.
Source§

fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
where Self: Borrow<B>, B: ?Sized,

Calls .tap_borrow() only in debug builds, and is erased in release builds.
Source§

fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
where Self: BorrowMut<B>, B: ?Sized,

Calls .tap_borrow_mut() only in debug builds, and is erased in release builds.
Source§

fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
where Self: AsRef<R>, R: ?Sized,

Calls .tap_ref() only in debug builds, and is erased in release builds.
Source§

fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
where Self: AsMut<R>, R: ?Sized,

Calls .tap_ref_mut() only in debug builds, and is erased in release builds.
Source§

fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
where Self: Deref<Target = T>, T: ?Sized,

Calls .tap_deref() only in debug builds, and is erased in release builds.
Source§

fn tap_deref_mut_dbg<T>(self, func: impl FnOnce(&mut T)) -> Self
where Self: DerefMut<Target = T> + Deref, T: ?Sized,

Calls .tap_deref_mut() only in debug builds, and is erased in release builds.
Source§

impl<T> TryConv for T

Source§

fn try_conv<T>(self) -> Result<T, Self::Error>
where Self: TryInto<T>,

Attempts to convert self into T using TryInto<T>.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.
Source§

impl<G1, G2> Within<G2> for G1
where G2: Contains<G1>,

Source§

fn is_within(&self, b: &G2) -> bool

Source§

impl<T> ErasedDestructor for T
where T: 'static,

Source§

impl<T> MaybeSend for T
where T: Send,

Source§

impl<T> MaybeSend for T
where T: Send,

Source§

impl<E> ResultError for E
where E: Send + Debug + Sync,