

Trait DataSource 

pub trait DataSource: Send + Sync {
    // Required methods
    fn id(&self) -> &str;
    fn refresh(
        &self,
        config: &SamplerConfig,
        cursor: Option<&SourceCursor>,
        limit: Option<usize>,
    ) -> Result<SourceSnapshot, SamplerError>;
    fn reported_record_count(
        &self,
        config: &SamplerConfig,
    ) -> Result<u128, SamplerError>;

    // Provided method
    fn default_triplet_recipes(&self) -> Vec<TripletRecipe> { ... }
}

Sampler-facing data source interface.

Implementations may be streaming or index-backed. For a fixed dataset state and cursor, refresh output should be deterministic.

Required Methods


fn id(&self) -> &str

Stable source identifier used in records, metrics, and persistence state.


fn refresh(&self, config: &SamplerConfig, cursor: Option<&SourceCursor>, limit: Option<usize>) -> Result<SourceSnapshot, SamplerError>

Fetch up to limit records starting from cursor state.

Return the next cursor position in SourceSnapshot.cursor.
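
The cursor contract can be illustrated with a small in-memory source. This is only a sketch: the trait is a local, trimmed copy so the snippet compiles on its own, and everything about the stand-in types except the existence of SourceSnapshot.cursor is an assumption (the `offset` and `records` fields, the unit-struct config, the tuple-struct error). The names `VecSource` and `drain` are hypothetical. How the real crate treats a `None` cursor or a `None` limit is not stated here, so the sketch picks "start from the beginning" and "return everything left".

```rust
// Stand-ins so the sketch compiles alone. The real types come from the crate;
// every field below except `SourceSnapshot::cursor` is an assumption.
pub struct SamplerConfig;

#[derive(Debug, Clone, PartialEq)]
pub struct SourceCursor {
    pub offset: usize, // hypothetical: index of the next unread record
}

#[derive(Debug)]
pub struct SourceSnapshot {
    pub records: Vec<String>, // hypothetical payload type
    pub cursor: SourceCursor, // next position, as the docs describe
}

#[derive(Debug)]
pub struct SamplerError(pub String);

// Local copy of the trait, trimmed to the two methods used here.
pub trait DataSource: Send + Sync {
    fn id(&self) -> &str;
    fn refresh(
        &self,
        config: &SamplerConfig,
        cursor: Option<&SourceCursor>,
        limit: Option<usize>,
    ) -> Result<SourceSnapshot, SamplerError>;
}

/// Hypothetical index-backed source over an in-memory vector.
pub struct VecSource {
    pub rows: Vec<String>,
}

impl DataSource for VecSource {
    fn id(&self) -> &str {
        "vec_source"
    }

    fn refresh(
        &self,
        _config: &SamplerConfig,
        cursor: Option<&SourceCursor>,
        limit: Option<usize>,
    ) -> Result<SourceSnapshot, SamplerError> {
        // No cursor means "start from the beginning" in this sketch.
        let start = cursor.map_or(0, |c| c.offset);
        if start > self.rows.len() {
            return Err(SamplerError(format!("cursor offset {} out of range", start)));
        }
        // No limit means "everything that is left" in this sketch.
        let end = match limit {
            Some(n) => start.saturating_add(n).min(self.rows.len()),
            None => self.rows.len(),
        };
        Ok(SourceSnapshot {
            records: self.rows[start..end].to_vec(),
            cursor: SourceCursor { offset: end },
        })
    }
}

/// Reads the whole source page by page, feeding each returned cursor back in.
pub fn drain(
    source: &dyn DataSource,
    config: &SamplerConfig,
    page: usize,
) -> Result<Vec<String>, SamplerError> {
    let mut out = Vec::new();
    let mut cursor: Option<SourceCursor> = None;
    loop {
        let snapshot = source.refresh(config, cursor.as_ref(), Some(page))?;
        if snapshot.records.is_empty() {
            break; // nothing left to read (or page == 0)
        }
        out.extend(snapshot.records);
        cursor = Some(snapshot.cursor);
    }
    Ok(out)
}
```

Because `refresh` takes `&self` and derives its output only from the stored rows and the cursor, calling it twice with the same cursor yields the same page, which is the determinism property the trait description asks for.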


fn reported_record_count(&self, config: &SamplerConfig) -> Result<u128, SamplerError>

Exact metadata record count reported by the source.

This is intended for estimators that must avoid iterating records. Implementations should return Ok(count) only when the count is exact for the source scope. Return Err when exact counting is not possible or the source is unavailable.

Keep this consistent with refresh by using the same backend scope, filtering, and logical corpus definition.
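
One way to keep the count and refresh aligned is to route both through a single scope helper. The sketch below assumes a hypothetical sharded source whose manifest may carry an exact per-shard count; `Shard`, `ShardedSource`, `in_scope`, `scoped_rows`, and the `split` config field are all invented for illustration, and the methods are inherent methods that only mirror the trait signature.

```rust
#[derive(Debug)]
pub struct SamplerError(pub String); // stand-in for the crate's error type

/// Hypothetical config: `split` selects the logical corpus.
pub struct SamplerConfig {
    pub split: String,
}

/// Hypothetical shard descriptor, as it might appear in a manifest.
pub struct Shard {
    pub split: String,
    pub rows: Vec<String>,
    /// Exact row count from metadata, or None when the manifest omits it.
    pub manifest_count: Option<u64>,
}

pub struct ShardedSource {
    pub shards: Vec<Shard>,
}

impl ShardedSource {
    /// Single definition of scope, shared by counting and reading.
    fn in_scope(&self, config: &SamplerConfig) -> Vec<&Shard> {
        self.shards
            .iter()
            .filter(|s| s.split == config.split)
            .collect()
    }

    /// Mirrors `reported_record_count`: reads metadata only, never the rows.
    pub fn reported_record_count(&self, config: &SamplerConfig) -> Result<u128, SamplerError> {
        let mut total: u128 = 0;
        for shard in self.in_scope(config) {
            match shard.manifest_count {
                Some(n) => total += u128::from(n),
                // An inexact or missing count is an error, not a guess.
                None => {
                    return Err(SamplerError(
                        "a shard in scope has no exact count".to_string(),
                    ))
                }
            }
        }
        Ok(total)
    }

    /// Stand-in for the records `refresh` would walk over the same scope.
    pub fn scoped_rows(&self, config: &SamplerConfig) -> Vec<&str> {
        self.in_scope(config)
            .into_iter()
            .flat_map(|s| s.rows.iter().map(|r| r.as_str()))
            .collect()
    }
}
```

With a shared helper, a change to the filtering rules cannot update one path and silently miss the other, so an estimator relying on the count stays in step with what refresh actually returns.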

Provided Methods


fn default_triplet_recipes(&self) -> Vec<TripletRecipe>

Optional source-provided default triplet recipes.

Used when sampler config does not provide explicit recipes.
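
The fallback described above might look roughly like the following. Everything here is a sketch under assumptions: the trait is a trimmed local copy, the provided method's real body is elided on this page (an empty vector is assumed), `TripletRecipe` is given a hypothetical `name` field, `SamplerConfig` a hypothetical `recipes` field, and `resolve_recipes` is an invented helper, not the sampler's actual resolution logic.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct TripletRecipe {
    pub name: String, // hypothetical field
}

/// Hypothetical config: an empty list means "no explicit recipes".
pub struct SamplerConfig {
    pub recipes: Vec<TripletRecipe>,
}

// Trimmed local copy of the trait.
pub trait DataSource: Send + Sync {
    fn id(&self) -> &str;

    // Assumed default: the real body is not shown in these docs.
    fn default_triplet_recipes(&self) -> Vec<TripletRecipe> {
        Vec::new()
    }
}

/// A source that ships its own defaults by overriding the provided method.
pub struct QaSource;

impl DataSource for QaSource {
    fn id(&self) -> &str {
        "qa"
    }

    fn default_triplet_recipes(&self) -> Vec<TripletRecipe> {
        vec![TripletRecipe {
            name: "question_answer".to_string(),
        }]
    }
}

/// A source that keeps the provided default.
pub struct PlainSource;

impl DataSource for PlainSource {
    fn id(&self) -> &str {
        "plain"
    }
}

/// Hypothetical resolution: explicit config recipes win, otherwise the
/// source's defaults are used.
pub fn resolve_recipes(config: &SamplerConfig, source: &dyn DataSource) -> Vec<TripletRecipe> {
    if config.recipes.is_empty() {
        source.default_triplet_recipes()
    } else {
        config.recipes.clone()
    }
}
```

Implementors that have no sensible defaults can simply skip the override, as `PlainSource` does.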

Implementors