Crate tensor_store


TensorStore - Unified Storage Layer for Neumann

A thread-safe, sharded key-value store optimized for tensor data:

  • Dense and sparse vector storage with HNSW indexing
  • Relational tables with SIMD-accelerated filtering
  • Graph structures with CSR-optimized traversal
  • Automatic hot/cold tiering with mmap backing

§Architecture

TensorStore
  +-- SlabRouter (key classification and routing)
  |     +-- MetadataSlab (arbitrary key-value)
  |     +-- EmbeddingSlab (dense embeddings)
  |     +-- RelationalSlab (columnar tables)
  |     +-- GraphTensor (nodes and edges)
  +-- HNSWIndex (similarity search)
  +-- EntityIndex (string <-> ID mapping)
  +-- CacheRing (LRU/LFU eviction)
  +-- TieredStore (hot/cold storage)

§Quick Start

use tensor_store::{TensorStore, TensorData, TensorValue, ScalarValue};

let store = TensorStore::new();

// Store a tensor entity
let mut data = TensorData::new();
data.set("name", TensorValue::Scalar(ScalarValue::String("example".into())));
data.set("embedding", TensorValue::Vector(vec![0.1, 0.2, 0.3]));
store.put("entity:1", data).unwrap();

// Retrieve it
let retrieved = store.get("entity:1").unwrap();
assert!(retrieved.has("name"));
assert!(retrieved.has("embedding"));

§Thread Safety

All types use parking_lot locks (no lock poisoning) and sharded designs for high concurrent throughput. Typical performance:

  • PUT: ~3.2M ops/sec
  • GET: ~5M ops/sec
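
The sharding idea above can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: `ShardedMap`, its methods, and the hash-modulo shard selection are hypothetical names and choices. It uses `std::sync::RwLock` rather than `parking_lot`, so the `unwrap()` calls handle exactly the lock poisoning that `parking_lot` avoids.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

/// Hypothetical sharded map: each shard has its own lock, so writers
/// touching different shards do not block each other.
pub struct ShardedMap {
    shards: Vec<RwLock<HashMap<String, Vec<f32>>>>,
}

impl ShardedMap {
    pub fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "need at least one shard");
        let shards = (0..shard_count)
            .map(|_| RwLock::new(HashMap::new()))
            .collect();
        Self { shards }
    }

    /// Pick a shard by hashing the key (an assumed routing rule).
    fn shard_for(&self, key: &str) -> &RwLock<HashMap<String, Vec<f32>>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let idx = (hasher.finish() as usize) % self.shards.len();
        &self.shards[idx]
    }

    pub fn put(&self, key: &str, value: Vec<f32>) {
        // std locks can be poisoned; parking_lot locks cannot.
        let mut guard = self.shard_for(key).write().unwrap();
        guard.insert(key.to_string(), value);
    }

    pub fn get(&self, key: &str) -> Option<Vec<f32>> {
        let guard = self.shard_for(key).read().unwrap();
        guard.get(key).cloned()
    }
}
```

Because `put` and `get` take `&self`, a `ShardedMap` wrapped in an `Arc` can be shared across threads without an outer lock.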

§Module Overview

Module           Purpose
slab_router      Key routing and WAL durability
hnsw             Hierarchical Navigable Small World index
sparse_vector    Memory-efficient sparse embeddings
delta_vector     Archetype-based delta compression
relational_slab  Column-oriented table storage
graph_tensor     CSR graph with BFS/shortest path
cache_ring       Fixed-size eviction cache
tiered           Hot/cold storage with auto-migration
mmap             Memory-mapped cold storage
consistent_hash  Partition routing with virtual nodes
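
To make the "CSR graph with BFS" entry concrete, here is a generic sketch of compressed sparse row (CSR) adjacency with a breadth-first search over it. `CsrGraph`, `from_edges`, `neighbors`, and `bfs_distances` are hypothetical names for illustration and are not taken from the `graph_tensor` API.

```rust
use std::collections::VecDeque;

/// Hypothetical CSR graph: the neighbors of node `n` are
/// `targets[offsets[n]..offsets[n + 1]]`.
pub struct CsrGraph {
    offsets: Vec<usize>, // length = node_count + 1
    targets: Vec<usize>, // length = edge count
}

impl CsrGraph {
    /// Build from directed edges. Panics if an endpoint is >= node_count.
    pub fn from_edges(node_count: usize, edges: &[(usize, usize)]) -> Self {
        // Count the out-degree of each source node.
        let mut offsets = vec![0usize; node_count + 1];
        for &(src, _) in edges {
            offsets[src + 1] += 1;
        }
        // A prefix sum turns the counts into start offsets.
        for i in 0..node_count {
            offsets[i + 1] += offsets[i];
        }
        // Scatter each edge target into its source's slot range.
        let mut cursor = offsets.clone();
        let mut targets = vec![0usize; edges.len()];
        for &(src, dst) in edges {
            targets[cursor[src]] = dst;
            cursor[src] += 1;
        }
        Self { offsets, targets }
    }

    pub fn neighbors(&self, node: usize) -> &[usize] {
        &self.targets[self.offsets[node]..self.offsets[node + 1]]
    }

    /// Hop distance from `start` to every node; `None` if unreachable.
    pub fn bfs_distances(&self, start: usize) -> Vec<Option<usize>> {
        let mut dist = vec![None; self.offsets.len() - 1];
        let mut queue = VecDeque::new();
        dist[start] = Some(0);
        queue.push_back(start);
        while let Some(node) = queue.pop_front() {
            let d = dist[node].unwrap(); // set before the node was queued
            for &next in self.neighbors(node) {
                if dist[next].is_none() {
                    dist[next] = Some(d + 1);
                    queue.push_back(next);
                }
            }
        }
        dist
    }
}
```

The layout keeps every adjacency list in one contiguous array, which is why traversal touches memory sequentially instead of chasing per-node allocations.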

Re-exports§

pub use binary_quantization::BinaryThreshold;
pub use binary_quantization::BinaryVector;
pub use blob_log::BlobLog;
pub use blob_log::BlobLogSnapshot;
pub use blob_log::ChunkHash;
pub use cache_ring::CacheRing;
pub use cache_ring::CacheRingSnapshot;
pub use cache_ring::CacheStats;
pub use cache_ring::EvictionScorer;
pub use cache_ring::EvictionStrategy;
pub use consistent_hash::ConsistentHashConfig;
pub use consistent_hash::ConsistentHashPartitioner;
pub use consistent_hash::ConsistentHashStats;
pub use delta_vector::ArchetypeRegistry;
pub use delta_vector::CoverageStats;
pub use delta_vector::DeltaVector;
pub use delta_vector::DeltaVectorError;
pub use delta_vector::KMeans;
pub use delta_vector::KMeansConfig;
pub use delta_vector::KMeansInit;
pub use delta_vector::MAX_DIMENSION as DELTA_MAX_DIMENSION;
pub use distance::DistanceMetric;
pub use distance::GeometricConfig;
pub use durable_blob_log::BlobWalRecord;
pub use durable_blob_log::ChunkLocation;
pub use durable_blob_log::DurableBlobLog;
pub use durable_blob_log::DurableBlobLogConfig;
pub use durable_blob_log::DurableBlobLogError;
pub use durable_blob_log::DurableChunkHash;
pub use embedding_slab::CompressedEmbedding;
pub use embedding_slab::EmbeddingError;
pub use embedding_slab::EmbeddingSlab;
pub use embedding_slab::EmbeddingSlabSnapshot;
pub use embedding_slab::EmbeddingSlot;
pub use entity_index::EntityId;
pub use entity_index::EntityIndex;
pub use entity_index::EntityIndexConfig;
pub use entity_index::EntityIndexError;
pub use entity_index::EntityIndexSnapshot;
pub use entity_index::DEFAULT_MAX_ENTITIES;
pub use graph_tensor::EdgeId;
pub use graph_tensor::GraphTensor;
pub use graph_tensor::GraphTensorSnapshot;
pub use hnsw::EmbeddingStorage;
pub use hnsw::EmbeddingStorageError;
pub use hnsw::HNSWConfig;
pub use hnsw::HNSWDistanceMetric;
pub use hnsw::HNSWIndex;
pub use hnsw::HNSWMemoryStats;
pub use hnsw::ScalarQuantizedVector;
pub use instrumentation::HNSWAccessStats;
pub use instrumentation::HNSWStatsSnapshot;
pub use instrumentation::ShardAccessSnapshot;
pub use instrumentation::ShardAccessTracker;
pub use instrumentation::ShardStatsSnapshot;
pub use ivf::IVFConfig;
pub use ivf::IVFIndex;
pub use ivf::IVFIndexState;
pub use ivf::IVFStorage;
pub use metadata_slab::MetadataSlab;
pub use metadata_slab::MetadataSlabSnapshot;
pub use mmap::MmapError;
pub use mmap::MmapStore;
pub use mmap::MmapStoreBuilder;
pub use mmap::MmapStoreMut;
pub use mmap_regional::CompactionStats;
pub use mmap_regional::RegionalMmapConfig;
pub use mmap_regional::RegionalMmapError;
pub use mmap_regional::RegionalMmapStore;
pub use mmap_regional::SortedRunBuilder;
pub use partitioned::PartitionedError;
pub use partitioned::PartitionedGet;
pub use partitioned::PartitionedPut;
pub use partitioned::PartitionedResult;
pub use partitioned::PartitionedStore;
pub use partitioner::PartitionId;
pub use partitioner::PartitionResult;
pub use partitioner::Partitioner;
pub use partitioner::PhysicalNodeId;
pub use pq::ADCTable;
pub use pq::PQCodebook;
pub use pq::PQConfig;
pub use pq::PQVector;
pub use relational_slab::ColumnDef;
pub use relational_slab::ColumnType;
pub use relational_slab::ColumnValue;
pub use relational_slab::RangeOp;
pub use relational_slab::RelationalError;
pub use relational_slab::RelationalSlab;
pub use relational_slab::RelationalSlabSnapshot;
pub use relational_slab::Row;
pub use relational_slab::RowId;
pub use relational_slab::TableSchema;
pub use semantic_partitioner::EncodedEmbedding;
pub use semantic_partitioner::RoutingMethod;
pub use semantic_partitioner::SemanticPartitionResult;
pub use semantic_partitioner::SemanticPartitioner;
pub use semantic_partitioner::SemanticPartitionerConfig;
pub use semantic_partitioner::SemanticPartitionerStats;
pub use slab_router::SlabRouter;
pub use slab_router::SlabRouterConfig;
pub use slab_router::SlabRouterError;
pub use slab_router::SlabRouterSnapshot;
pub use snapshot::detect_version as snapshot_detect_version;
pub use snapshot::load as snapshot_load;
pub use snapshot::migrate_v2_to_v3 as snapshot_migrate;
pub use snapshot::save_v3 as snapshot_save;
pub use snapshot::HNSWNodeSnapshot;
pub use snapshot::HNSWSnapshot;
pub use snapshot::SnapshotFormatError;
pub use snapshot::SnapshotHeader;
pub use snapshot::SnapshotVersion;
pub use snapshot::V3Snapshot;
pub use snapshot::VoronoiPartitionerConfigSnapshot;
pub use snapshot::VoronoiSnapshot;
pub use sparse_vector::SparseVector;
pub use sparse_vector::SparseVectorError;
pub use sparse_vector::MAX_DIMENSION as SPARSE_MAX_DIMENSION;
pub use tiered::MigrationStrategy;
pub use tiered::TieredConfig;
pub use tiered::TieredError;
pub use tiered::TieredStats;
pub use tiered::TieredStore;
pub use voronoi::LocalityKey;
pub use voronoi::LocalityKeyGenerator;
pub use voronoi::VoronoiPartitioner;
pub use voronoi::VoronoiPartitionerConfig;
pub use voronoi::VoronoiRegion;
pub use wal::SyncMode;
pub use wal::TensorWal;
pub use wal::WalConfig;
pub use wal::WalEntry;
pub use wal::WalError;
pub use wal::WalRecovery;
pub use wal::WalResult;
pub use wal::WalStatus;

Modules§

binary_quantization
Binary Quantization for extreme vector compression.
blob_log
Append-only blob log with segment management.
cache_ring
Fixed-size cache ring with configurable eviction strategies.
consistent_hash
Consistent hash ring partitioner with virtual nodes.
delta_vector
Delta-encoded vectors for efficient storage of clustered embeddings.
distance
Distance metrics for geometric vector operations.
durable_blob_log
Durable blob log with WAL-based crash recovery.
embedding_slab
Dense embedding storage with chunked allocation.
entity_index
Vocabulary-based entity index for O(log n) lookup with stable IDs.
fields
Reserved field prefixes for unified entity storage.
graph_tensor
CSR-based graph storage with append log.
hnsw
HNSW (Hierarchical Navigable Small World) index for approximate nearest neighbor search.
instrumentation
Memory instrumentation for tracking shard and node access patterns.
ivf
IVF (Inverted File Index) for large-scale partitioned search.
metadata_slab
Sharded BTreeMap-based metadata storage slab.
mmap
Memory-mapped cold storage for tensor data.
mmap_regional
Region-aware memory-mapped storage for geometric locality.
partitioned
Partition-aware store wrapper for distributed operations.
partitioner
Data partitioning traits for distributed storage.
pq
Product Quantization for memory-efficient vector storage.
relational_slab
Columnar storage for relational data.
semantic_partitioner
Semantic partitioner for embedding-based data distribution.
slab_router
Slab router for directing operations to specialized storage backends.
snapshot
Snapshot format v2/v3 with backward compatibility.
sparse_vector
Sparse Vector - Storage where zero doesn’t exist
tiered
Two-tier hot/cold storage with automatic data migration.
voronoi
Voronoi partitioner with explicit geometric region boundaries.
wal
Write-Ahead Log for crash recovery.

Structs§

BloomFilter
Thread-safe Bloom filter for fast negative lookups.
EntityStore
Unified entity store that provides a shared storage layer for all engines.
TensorData
An entity that can hold scalar properties, vector embeddings, and pointers to other tensors.
TensorStore
Thread-safe key-value store backed by SlabRouter.
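
As a rough picture of what a thread-safe Bloom filter for negative lookups involves, the sketch below sets bits with atomic operations so that inserts need no lock. It is an assumption-laden illustration: `SketchBloom`, its constructor parameters, and the double-hashing scheme are hypothetical and do not describe the crate's `BloomFilter`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical lock-free Bloom filter over string keys.
pub struct SketchBloom {
    bits: Vec<AtomicU64>,
    bit_count: u64,
    hash_count: u64,
}

impl SketchBloom {
    pub fn new(bit_count: u64, hash_count: u64) -> Self {
        assert!(bit_count > 0 && hash_count > 0);
        let words = ((bit_count + 63) / 64) as usize;
        let bits = (0..words).map(|_| AtomicU64::new(0)).collect();
        Self { bits, bit_count, hash_count }
    }

    /// Two base hashes; probe i uses a + i * b (double hashing).
    fn hash_pair(key: &str) -> (u64, u64) {
        let mut h1 = DefaultHasher::new();
        key.hash(&mut h1);
        let mut h2 = DefaultHasher::new();
        (key, 0x9e37_79b9_u32).hash(&mut h2); // salted second hash
        (h1.finish(), h2.finish() | 1) // odd step avoids a zero stride
    }

    /// Word index and bit mask for probe `i`.
    fn bit_index(&self, a: u64, b: u64, i: u64) -> (usize, u64) {
        let pos = a.wrapping_add(i.wrapping_mul(b)) % self.bit_count;
        ((pos / 64) as usize, 1u64 << (pos % 64))
    }

    pub fn insert(&self, key: &str) {
        let (a, b) = Self::hash_pair(key);
        for i in 0..self.hash_count {
            let (word, mask) = self.bit_index(a, b, i);
            self.bits[word].fetch_or(mask, Ordering::Relaxed);
        }
    }

    /// `false` means the key was never inserted; `true` means "maybe".
    pub fn might_contain(&self, key: &str) -> bool {
        let (a, b) = Self::hash_pair(key);
        (0..self.hash_count).all(|i| {
            let (word, mask) = self.bit_index(a, b, i);
            (self.bits[word].load(Ordering::Relaxed) & mask) != 0
        })
    }
}
```

A `false` result lets a caller skip the underlying store entirely, which is the "fast negative lookup" case; a `true` result still requires a real lookup because false positives are possible.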

Enums§

ScalarValue
Scalar value types for entity properties.
SnapshotError
Errors that can occur during snapshot operations.
TensorStoreError
Errors that can occur during tensor store operations.
TensorValue
Represents the different types of values a tensor can hold.

Constants§

DEFAULT_SPARSITY_THRESHOLD
Default sparsity threshold for auto-sparsification (70%).
DEFAULT_VALUE_THRESHOLD
Default value threshold for pruning small values.
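
One plausible reading of how these two thresholds interact is sketched below: entries whose magnitude is at or below the value threshold count as zero, and a vector is stored sparsely once the fraction of such entries reaches the sparsity threshold. Everything here is an assumption for illustration: `Stored`, `sparsity`, and `auto_sparsify` are hypothetical names, the `1e-6` value threshold is a placeholder (the page does not state the real value), and whether the crate's comparisons are inclusive is unknown.

```rust
/// Mirrors the documented 70% default.
const SPARSITY_THRESHOLD: f32 = 0.7;
/// Placeholder: the real DEFAULT_VALUE_THRESHOLD is not stated on this page.
const VALUE_THRESHOLD: f32 = 1e-6;

/// Hypothetical storage choice for one vector.
#[derive(Debug, PartialEq)]
pub enum Stored {
    Dense(Vec<f32>),
    Sparse { dimension: usize, indices: Vec<u32>, values: Vec<f32> },
}

/// Fraction of entries whose magnitude is at or below `value_threshold`.
pub fn sparsity(values: &[f32], value_threshold: f32) -> f32 {
    if values.is_empty() {
        return 0.0;
    }
    let near_zero = values
        .iter()
        .filter(|v| v.abs() <= value_threshold)
        .count();
    near_zero as f32 / values.len() as f32
}

/// Keep the vector dense unless enough entries are near zero; otherwise
/// prune the near-zero entries and keep (index, value) pairs.
pub fn auto_sparsify(values: Vec<f32>) -> Stored {
    if sparsity(&values, VALUE_THRESHOLD) < SPARSITY_THRESHOLD {
        return Stored::Dense(values);
    }
    let mut indices = Vec::new();
    let mut kept = Vec::new();
    for (i, v) in values.iter().enumerate() {
        if v.abs() > VALUE_THRESHOLD {
            indices.push(i as u32);
            kept.push(*v);
        }
    }
    Stored::Sparse { dimension: values.len(), indices, values: kept }
}
```

With these placeholder values, a 10-element vector with two non-zero entries (80% zeros) is converted to the sparse form, while a 3-element vector with one zero stays dense.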

Type Aliases§

Result
Result type for tensor store operations.
SnapshotResult
Result type for snapshot operations.