Distributed Search Infrastructure
This module provides primitives for distributed semantic search across multiple nodes/shards. It handles:
- Sharding: Partitioning data across nodes
- Query routing: Fanning out queries to relevant shards
- Result aggregation: Merging results from multiple shards
- Topology management: Tracking available nodes
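The routing and aggregation steps listed above can be sketched with the standard library alone. This is a toy model, not the crate's implementation: `Hit`, `ToyShard`, `sort_hits`, and `fan_out_query` are hypothetical names, and dense `f32` vectors scored by dot product stand in for whatever vector type and similarity measure the real shards use. Each shard computes a local top-k, and the coordinator merges those lists into a global top-k.

```rust
// Toy model only: none of these names come from embeddenator_retrieval.

/// One scored hit, tagged with the shard that produced it.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    doc_id: u64,
    shard: u32,
    score: f32,
}

/// Hypothetical in-memory shard holding dense vectors keyed by document id.
struct ToyShard {
    id: u32,
    docs: Vec<(u64, Vec<f32>)>,
}

/// Dot product, used here as a stand-in similarity score.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Highest score first; ties broken by ascending document id for determinism.
fn sort_hits(hits: &mut [Hit]) {
    hits.sort_by(|a, b| b.score.total_cmp(&a.score).then(a.doc_id.cmp(&b.doc_id)));
}

impl ToyShard {
    /// Local top-k for this shard only.
    fn top_k(&self, query: &[f32], k: usize) -> Vec<Hit> {
        let mut hits: Vec<Hit> = self
            .docs
            .iter()
            .map(|(doc_id, v)| Hit { doc_id: *doc_id, shard: self.id, score: dot(query, v) })
            .collect();
        sort_hits(&mut hits);
        hits.truncate(k);
        hits
    }
}

/// Fan out to every shard, then merge the per-shard lists into a global top-k.
fn fan_out_query(shards: &[ToyShard], query: &[f32], k: usize) -> Vec<Hit> {
    let mut merged: Vec<Hit> = shards.iter().flat_map(|s| s.top_k(query, k)).collect();
    sort_hits(&mut merged);
    merged.truncate(k);
    merged
}

fn main() {
    let shards = vec![
        ToyShard { id: 0, docs: vec![(1, vec![1.0, 0.0]), (2, vec![0.0, 1.0])] },
        ToyShard { id: 1, docs: vec![(3, vec![1.0, 1.0]), (4, vec![0.5, 0.0])] },
    ];
    for h in fan_out_query(&shards, &[1.0, 0.0], 3) {
        println!("doc {} from shard {} scored {:.2}", h.doc_id, h.shard, h.score);
    }
}
```

Asking each shard for its own top-k before merging keeps the amount of data sent to the coordinator bounded by `k` per shard, and the merged list is still the exact global top-k, because any document in the global top-k must also be in the top-k of its own shard.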
§Architecture
┌─────────────────────────────────────────────────────────────┐
│                     Distributed Search                      │
│  ┌─────────────┐                                            │
│  │    Query    │                                            │
│  └──────┬──────┘                                            │
│         │                                                   │
│         ▼                                                   │
│  ┌─────────────┐    ┌──────────────────────────────────┐    │
│  │   Router    │───▶│          Shard Cluster           │    │
│  └──────┬──────┘    │  ┌─────┐ ┌─────┐ ┌─────┐         │    │
│         │           │  │ S0  │ │ S1  │ │ S2  │ ...     │    │
│         ▼           │  └──┬──┘ └──┬──┘ └──┬──┘         │    │
│  ┌─────────────┐    └─────│───────│───────│────────────┘    │
│  │ Aggregator  │◀─────────┴───────┴───────┘                 │
│  └──────┬──────┘                                            │
│         │                                                   │
│         ▼                                                   │
│  ┌─────────────┐                                            │
│  │   Results   │                                            │
│  └─────────────┘                                            │
└─────────────────────────────────────────────────────────────┘
§Example
use embeddenator_retrieval::distributed::{
    DistributedSearch, Shard, ShardId, DistributedConfig,
};
use embeddenator_vsa::SparseVec;

// Create shards (each could be on a different node)
let mut shard0 = Shard::new(ShardId(0));
shard0.add(1, SparseVec::from_data(b"document one"));
shard0.finalize();

let mut shard1 = Shard::new(ShardId(1));
shard1.add(2, SparseVec::from_data(b"document two"));
shard1.finalize();

// Create distributed search coordinator
let mut search = DistributedSearch::new(DistributedConfig::default());
search.add_shard(shard0);
search.add_shard(shard1);

// Execute distributed query
let query = SparseVec::from_data(b"document");
let (results, stats) = search.query(&query, 10)?;

Structs§
- DistributedConfig - Configuration for distributed search
- DistributedResult - Aggregated result from distributed query
- DistributedSearch - Distributed search coordinator
- DistributedSearchBuilder - Builder for creating a distributed search cluster
- QueryStats - Statistics from a distributed query
- Shard - A single shard containing a partition of the search corpus
- ShardAssigner - Shard assignment helper
- ShardId - Unique identifier for a shard
- ShardResult - Result from a single shard query
Enums§
- DistributedError - Error type for distributed operations
- ShardStatus - Shard status
- ShardingStrategy - Sharding strategy for partitioning data
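
The variants of the sharding strategy are not listed on this page, so the following is only an illustration of what partitioning data across shards can look like. `ToyStrategy` and `ToyAssigner` are made-up names, and the two strategies shown (round-robin and document id modulo shard count) are assumptions chosen because they are common, not a description of the crate's own enum or assignment helper.

```rust
// Illustration only: ToyStrategy and ToyAssigner are made-up names,
// not the crate's types.

#[derive(Debug, Clone, Copy)]
enum ToyStrategy {
    /// Cycle through shards in order, ignoring the document id.
    RoundRobin,
    /// Pick the shard from the document id, so the same id always maps
    /// to the same shard.
    IdModulo,
}

struct ToyAssigner {
    strategy: ToyStrategy,
    num_shards: u32,
    next: u32,
}

impl ToyAssigner {
    fn new(strategy: ToyStrategy, num_shards: u32) -> Self {
        assert!(num_shards > 0, "need at least one shard");
        Self { strategy, num_shards, next: 0 }
    }

    /// Returns the index of the shard that should store `doc_id`.
    fn assign(&mut self, doc_id: u64) -> u32 {
        match self.strategy {
            ToyStrategy::RoundRobin => {
                let shard = self.next;
                self.next = (self.next + 1) % self.num_shards;
                shard
            }
            ToyStrategy::IdModulo => (doc_id % u64::from(self.num_shards)) as u32,
        }
    }
}

fn main() {
    let mut rr = ToyAssigner::new(ToyStrategy::RoundRobin, 3);
    let mut by_id = ToyAssigner::new(ToyStrategy::IdModulo, 3);
    for doc_id in [10u64, 11, 12, 13] {
        println!(
            "doc {doc_id}: round-robin -> {}, id-modulo -> {}",
            rr.assign(doc_id),
            by_id.assign(doc_id)
        );
    }
}
```

Round-robin balances shard sizes regardless of how ids are distributed, while an id-derived assignment lets a router locate a document's shard without a lookup table.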