Skip to main content

DistributedQueryEngine

Struct DistributedQueryEngine 

Source
pub struct DistributedQueryEngine { /* private fields */ }
Expand description

Distributed query engine implementing two-phase TF-IDF.

This engine executes queries across multiple shards by:

  1. First collecting term frequencies from all shards
  2. Computing global document frequencies
  3. Re-executing queries with the global DF for accurate scoring
  4. Merging and normalizing results across shards

Implementations§

Source§

impl DistributedQueryEngine

Source

pub fn new(config: DistributedHybridConfig) -> Self

Create a new distributed query engine with the given configuration.

Source

pub fn with_defaults() -> Self

Create a query engine with default configuration.

Source

pub fn config(&self) -> &DistributedHybridConfig

Get the configuration.

Source

pub fn get_local_term_frequencies( &self, shard: &ShardedColony, terms: &[String], ) -> HashMap<String, u64>

Phase 1: Get term frequencies from a shard.

Collects how many documents in this shard contain each query term. This is used to compute local document frequencies.

Source

pub fn aggregate_global_df( &self, local_dfs: Vec<HashMap<String, u64>>, ) -> HashMap<String, u64>

Phase 2: Aggregate global document frequencies.

Combines local document frequencies from all shards to compute the global DF for each term across the entire distributed graph.

Source

pub fn execute_local_query( &self, shard: &ShardedColony, request: &LocalQueryRequest, ) -> LocalQueryResult

Phase 3: Execute local query with global DF.

Computes TF-IDF scores for nodes in a single shard using the global document frequencies for accurate IDF computation.

Source

pub fn merge_results(&self, results: Vec<LocalQueryResult>) -> Vec<ScoredNode>

Phase 4: Merge results from all shards.

Combines results from multiple shards, normalizes scores across shards, sorts by score, and returns the top-k results.

Source

pub fn distributed_query( &self, shards: &[&ShardedColony], query_text: &str, ) -> Vec<ScoredNode>

Execute a full distributed query across multiple shards.

This is the main entry point for distributed queries. It coordinates all four phases of the two-phase TF-IDF algorithm:

  1. Collects local term frequencies from each shard
  2. Aggregates them into global document frequencies
  3. Executes local queries on each shard with global DF
  4. Merges and normalizes results
§Arguments
  • shards - Slice of shard references to query
  • query_text - The raw query text to search for
§Returns

A vector of scored nodes, sorted by relevance (highest first).

Source

pub fn local_query( &self, shard: &ShardedColony, query_text: &str, ) -> Vec<ScoredNode>

Execute a query on a single shard (for non-distributed use).

This is useful for testing or when the data resides in a single shard.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> FutureExt for T

Source§

fn with_context(self, otel_cx: Context) -> WithContext<Self>

Attaches the provided Context to this type, returning a WithContext wrapper. Read more
Source§

fn with_current_context(self) -> WithContext<Self>

Attaches the current Context to this type, returning a WithContext wrapper. Read more
Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more