Module distributed

Distributed Search Infrastructure

This module provides primitives for distributed semantic search across multiple nodes and shards. It handles:

  • Sharding: partitioning data across nodes
  • Query routing: fanning queries out to the relevant shards
  • Result aggregation: merging results from multiple shards
  • Topology management: tracking available nodes
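
To make the sharding responsibility concrete, here is a minimal standalone sketch of hash-based partitioning. It does not use this module's API: `assign_shard` and `partition` are hypothetical helpers, and modulo hashing is only one possible strategy, not necessarily the one `ShardAssigner` or `ShardingStrategy` implements.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper (not part of this crate): maps a document ID
/// to one of `num_shards` shards by hashing the ID.
fn assign_shard(doc_id: u64, num_shards: u32) -> u32 {
    assert!(num_shards > 0, "at least one shard is required");
    let mut hasher = DefaultHasher::new();
    doc_id.hash(&mut hasher);
    // The remainder is always smaller than `num_shards`, so the cast is lossless.
    (hasher.finish() % u64::from(num_shards)) as u32
}

/// Hypothetical helper: groups document IDs by their assigned shard.
/// The returned vector has exactly `num_shards` entries.
fn partition(doc_ids: &[u64], num_shards: u32) -> Vec<Vec<u64>> {
    let mut shards: Vec<Vec<u64>> = vec![Vec::new(); num_shards as usize];
    for &id in doc_ids {
        shards[assign_shard(id, num_shards) as usize].push(id);
    }
    shards
}
```

Each ID lands in exactly one shard, and the same ID always maps to the same shard for a fixed shard count. Changing the shard count remaps most IDs, which is the usual trade-off of plain modulo hashing.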

§Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Distributed Search                        │
│  ┌─────────────┐                                            │
│  │   Query     │                                            │
│  └──────┬──────┘                                            │
│         │                                                    │
│         ▼                                                    │
│  ┌─────────────┐    ┌──────────────────────────────────┐   │
│  │   Router    │───▶│         Shard Cluster            │   │
│  └──────┬──────┘    │  ┌─────┐ ┌─────┐ ┌─────┐        │   │
│         │           │  │ S0  │ │ S1  │ │ S2  │ ...    │   │
│         ▼           │  └──┬──┘ └──┬──┘ └──┬──┘        │   │
│  ┌─────────────┐    └────│───────│───────│───────────┘   │
│  │ Aggregator  │◀────────┴───────┴───────┘                │
│  └──────┬──────┘                                            │
│         │                                                    │
│         ▼                                                    │
│  ┌─────────────┐                                            │
│  │   Results   │                                            │
│  └─────────────┘                                            │
└─────────────────────────────────────────────────────────────┘
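
The router and aggregator stages in the diagram can be sketched as a fan-out followed by a top-k merge. This is an illustrative standalone sketch, not the crate's implementation: the `Hit` alias, `merge_top_k`, and `fan_out` are hypothetical names, and scores are assumed to be "higher is better".

```rust
/// Hypothetical per-shard hit: (document ID, similarity score).
type Hit = (u64, f64);

/// Aggregator sketch: merges per-shard result lists and keeps the
/// `k` highest-scoring hits.
fn merge_top_k(per_shard: Vec<Vec<Hit>>, k: usize) -> Vec<Hit> {
    let mut all: Vec<Hit> = per_shard.into_iter().flatten().collect();
    // Sort by descending score; break ties by ascending ID for determinism.
    all.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    all.truncate(k);
    all
}

/// Router sketch: runs `search_shard` against every shard sequentially,
/// then hands the per-shard results to the aggregator.
fn fan_out<S>(
    shards: &[S],
    k: usize,
    search_shard: impl Fn(&S, usize) -> Vec<Hit>,
) -> Vec<Hit> {
    let per_shard: Vec<Vec<Hit>> = shards.iter().map(|s| search_shard(s, k)).collect();
    merge_top_k(per_shard, k)
}
```

Asking every shard for its own top `k` is enough to recover the global top `k`, since no hit outside a shard's local top `k` can appear in the merged top `k`. A real coordinator would likely query shards concurrently and handle unavailable shards; this sketch omits both.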

§Example

use embeddenator_retrieval::distributed::{
    DistributedSearch, Shard, ShardId, DistributedConfig,
};
use embeddenator_vsa::SparseVec;

// Create shards (each could be on a different node)
let mut shard0 = Shard::new(ShardId(0));
shard0.add(1, SparseVec::from_data(b"document one"));
shard0.finalize();

let mut shard1 = Shard::new(ShardId(1));
shard1.add(2, SparseVec::from_data(b"document two"));
shard1.finalize();

// Create distributed search coordinator
let mut search = DistributedSearch::new(DistributedConfig::default());
search.add_shard(shard0);
search.add_shard(shard1);

// Execute a distributed query for the top 10 matches.
// Note: `?` requires an enclosing function that returns a `Result`.
let query = SparseVec::from_data(b"document");
let (results, stats) = search.query(&query, 10)?;

Structs§

DistributedConfig
Configuration for distributed search
DistributedResult
Aggregated result from distributed query
DistributedSearch
Distributed search coordinator
DistributedSearchBuilder
Builder for creating a distributed search cluster
QueryStats
Statistics from a distributed query
Shard
A single shard containing a partition of the search corpus
ShardAssigner
Shard assignment helper
ShardId
Unique identifier for a shard
ShardResult
Result from a single shard query

Enums§

DistributedError
Error type for distributed operations
ShardStatus
Shard status
ShardingStrategy
Sharding strategy for partitioning data