pub trait DataSource: Send + Sync {
// Required methods
fn id(&self) -> &str;
fn refresh(
&self,
config: &SamplerConfig,
cursor: Option<&SourceCursor>,
limit: Option<usize>,
) -> Result<SourceSnapshot, SamplerError>;
fn reported_record_count(
&self,
config: &SamplerConfig,
) -> Result<u128, SamplerError>;
// Provided method
fn default_triplet_recipes(&self) -> Vec<TripletRecipe> { ... }
}Expand description
Sampler-facing data source interface.
Implementations may be streaming or index-backed. For a fixed dataset state and cursor, refresh output should be deterministic.
Required Methods§
Sourcefn refresh(
&self,
config: &SamplerConfig,
cursor: Option<&SourceCursor>,
limit: Option<usize>,
) -> Result<SourceSnapshot, SamplerError>
fn refresh( &self, config: &SamplerConfig, cursor: Option<&SourceCursor>, limit: Option<usize>, ) -> Result<SourceSnapshot, SamplerError>
Fetch up to limit records starting from cursor state.
Return the next cursor position in SourceSnapshot.cursor.
Sourcefn reported_record_count(
&self,
config: &SamplerConfig,
) -> Result<u128, SamplerError>
fn reported_record_count( &self, config: &SamplerConfig, ) -> Result<u128, SamplerError>
Exact metadata record count reported by the source.
This is intended for estimators that must avoid iterating records.
Implementations should return Ok(count) only when the count is
exact for the source scope. Return Err when exact counting is not
possible or the source is unavailable.
Keep this consistent with refresh by using the same backend scope,
filtering, and logical corpus definition.
Provided Methods§
Sourcefn default_triplet_recipes(&self) -> Vec<TripletRecipe>
fn default_triplet_recipes(&self) -> Vec<TripletRecipe>
Optional source-provided default triplet recipes.
Used when sampler config does not provide explicit recipes.