TensorStore - Unified Storage Layer for Neumann
A thread-safe, sharded key-value store optimized for tensor data:
- Dense and sparse vector storage with HNSW indexing
- Relational tables with SIMD-accelerated filtering
- Graph structures with CSR-optimized traversal
- Automatic hot/cold tiering with mmap backing
§Architecture
TensorStore
+-- SlabRouter (key classification and routing)
|   +-- MetadataSlab (arbitrary key-value)
|   +-- EmbeddingSlab (dense embeddings)
|   +-- RelationalSlab (columnar tables)
|   +-- GraphTensor (nodes and edges)
+-- HNSWIndex (similarity search)
+-- EntityIndex (string <-> ID mapping)
+-- CacheRing (LRU/LFU eviction)
+-- TieredStore (hot/cold storage)
§Quick Start
use tensor_store::{TensorStore, TensorData, TensorValue, ScalarValue};
let store = TensorStore::new();
// Store a tensor entity
let mut data = TensorData::new();
data.set("name", TensorValue::Scalar(ScalarValue::String("example".into())));
data.set("embedding", TensorValue::Vector(vec![0.1, 0.2, 0.3]));
store.put("entity:1", data).unwrap();
// Retrieve it
let retrieved = store.get("entity:1").unwrap();
assert!(retrieved.has("name"));
assert!(retrieved.has("embedding"));
§Thread Safety
All types use parking_lot locks (no lock poisoning) and sharded designs
for high concurrent throughput. Typical performance:
- PUT: ~3.2M ops/sec
- GET: ~5M ops/sec
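The sharding idea described above can be illustrated with a minimal sketch. It is not the crate's implementation: `ShardedMap` and its methods are hypothetical names, and it uses `std::sync::RwLock` in place of `parking_lot` so that it compiles with the standard library alone. Each key hashes to one shard, so writers touching different shards do not contend on the same lock.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

/// One shard: a map guarded by its own lock.
type Shard = RwLock<HashMap<String, Vec<f32>>>;

/// Hypothetical sharded map, for illustration only.
pub struct ShardedMap {
    shards: Vec<Shard>,
}

impl ShardedMap {
    /// Creates a map with `shard_count` independently locked shards.
    pub fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "need at least one shard");
        let shards: Vec<Shard> = (0..shard_count)
            .map(|_| RwLock::new(HashMap::new()))
            .collect();
        Self { shards }
    }

    /// Picks the shard that owns `key` by hashing the key.
    fn shard_for(&self, key: &str) -> &Shard {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let idx = (hasher.finish() as usize) % self.shards.len();
        &self.shards[idx]
    }

    /// Inserts or overwrites a value; only one shard is write-locked.
    pub fn put(&self, key: &str, value: Vec<f32>) {
        // std locks can be poisoned, hence the unwrap. With parking_lot
        // (as the crate docs describe) there is no poisoning to handle.
        self.shard_for(key)
            .write()
            .unwrap()
            .insert(key.to_string(), value);
    }

    /// Returns a clone of the stored value, if any.
    pub fn get(&self, key: &str) -> Option<Vec<f32>> {
        self.shard_for(key).read().unwrap().get(key).cloned()
    }
}
```

Wrapped in an `Arc`, a value like this can be shared across threads and written through `&self`, which is the usage pattern the throughput figures above refer to.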
§Module Overview
| Module | Purpose |
|---|---|
| slab_router | Key routing and WAL durability |
| hnsw | Hierarchical Navigable Small World index |
| sparse_vector | Memory-efficient sparse embeddings |
| delta_vector | Archetype-based delta compression |
| relational_slab | Column-oriented table storage |
| graph_tensor | CSR graph with BFS/shortest path |
| cache_ring | Fixed-size eviction cache |
| tiered | Hot/cold storage with auto-migration |
| mmap | Memory-mapped cold storage |
| consistent_hash | Partition routing with virtual nodes |
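For readers unfamiliar with the CSR (compressed sparse row) layout that the `graph_tensor` row mentions, the following is a rough sketch of the general technique. `CsrGraph` and its methods are hypothetical names chosen for illustration and are not the `GraphTensor` API. All outgoing edges are stored in one flat array, with a per-node offset array marking where each node's neighbors begin and end.

```rust
use std::collections::VecDeque;

/// Hypothetical CSR graph; field and method names are illustrative.
pub struct CsrGraph {
    /// Neighbors of node v live in targets[offsets[v]..offsets[v + 1]].
    offsets: Vec<usize>,
    targets: Vec<usize>,
}

impl CsrGraph {
    /// Builds the CSR arrays from a directed edge list over nodes
    /// 0..node_count. Panics if an edge references a node out of range.
    pub fn from_edges(node_count: usize, edges: &[(usize, usize)]) -> Self {
        // Count out-degrees, shifted by one so a prefix sum yields offsets.
        let mut offsets = vec![0usize; node_count + 1];
        for &(src, _) in edges {
            offsets[src + 1] += 1;
        }
        for v in 0..node_count {
            offsets[v + 1] += offsets[v];
        }
        // Scatter each edge target into its source node's slice.
        let mut cursor = offsets.clone();
        let mut targets = vec![0usize; edges.len()];
        for &(src, dst) in edges {
            assert!(dst < node_count, "edge target out of range");
            targets[cursor[src]] = dst;
            cursor[src] += 1;
        }
        Self { offsets, targets }
    }

    /// Returns the contiguous neighbor slice of node `v`.
    pub fn neighbors(&self, v: usize) -> &[usize] {
        &self.targets[self.offsets[v]..self.offsets[v + 1]]
    }

    /// Breadth-first search: hop count from `start` to every node,
    /// or None for nodes that cannot be reached.
    pub fn bfs_distances(&self, start: usize) -> Vec<Option<usize>> {
        let node_count = self.offsets.len() - 1;
        let mut dist: Vec<Option<usize>> = vec![None; node_count];
        let mut queue = VecDeque::new();
        dist[start] = Some(0);
        queue.push_back(start);
        while let Some(v) = queue.pop_front() {
            let next = dist[v].unwrap() + 1;
            for &n in self.neighbors(v) {
                if dist[n].is_none() {
                    dist[n] = Some(next);
                    queue.push_back(n);
                }
            }
        }
        dist
    }
}
```

Because each neighbor list is a contiguous slice, traversal reads memory sequentially, which is the usual motivation for choosing CSR over per-node adjacency vectors.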
Re-exports§
pub use binary_quantization::BinaryThreshold;
pub use binary_quantization::BinaryVector;
pub use blob_log::BlobLog;
pub use blob_log::BlobLogSnapshot;
pub use blob_log::ChunkHash;
pub use cache_ring::CacheRing;
pub use cache_ring::CacheRingSnapshot;
pub use cache_ring::CacheStats;
pub use cache_ring::EvictionScorer;
pub use cache_ring::EvictionStrategy;
pub use consistent_hash::ConsistentHashConfig;
pub use consistent_hash::ConsistentHashPartitioner;
pub use consistent_hash::ConsistentHashStats;
pub use delta_vector::ArchetypeRegistry;
pub use delta_vector::CoverageStats;
pub use delta_vector::DeltaVector;
pub use delta_vector::DeltaVectorError;
pub use delta_vector::KMeans;
pub use delta_vector::KMeansConfig;
pub use delta_vector::KMeansInit;
pub use delta_vector::MAX_DIMENSION as DELTA_MAX_DIMENSION;
pub use distance::DistanceMetric;
pub use distance::GeometricConfig;
pub use durable_blob_log::BlobWalRecord;
pub use durable_blob_log::ChunkLocation;
pub use durable_blob_log::DurableBlobLog;
pub use durable_blob_log::DurableBlobLogConfig;
pub use durable_blob_log::DurableBlobLogError;
pub use durable_blob_log::DurableChunkHash;
pub use embedding_slab::CompressedEmbedding;
pub use embedding_slab::EmbeddingError;
pub use embedding_slab::EmbeddingSlab;
pub use embedding_slab::EmbeddingSlabSnapshot;
pub use embedding_slab::EmbeddingSlot;
pub use entity_index::EntityId;
pub use entity_index::EntityIndex;
pub use entity_index::EntityIndexConfig;
pub use entity_index::EntityIndexError;
pub use entity_index::EntityIndexSnapshot;
pub use entity_index::DEFAULT_MAX_ENTITIES;
pub use graph_tensor::EdgeId;
pub use graph_tensor::GraphTensor;
pub use graph_tensor::GraphTensorSnapshot;
pub use hnsw::EmbeddingStorage;
pub use hnsw::EmbeddingStorageError;
pub use hnsw::HNSWConfig;
pub use hnsw::HNSWDistanceMetric;
pub use hnsw::HNSWIndex;
pub use hnsw::HNSWMemoryStats;
pub use hnsw::ScalarQuantizedVector;
pub use instrumentation::HNSWAccessStats;
pub use instrumentation::HNSWStatsSnapshot;
pub use instrumentation::ShardAccessSnapshot;
pub use instrumentation::ShardAccessTracker;
pub use instrumentation::ShardStatsSnapshot;
pub use ivf::IVFConfig;
pub use ivf::IVFIndex;
pub use ivf::IVFIndexState;
pub use ivf::IVFStorage;
pub use metadata_slab::MetadataSlab;
pub use metadata_slab::MetadataSlabSnapshot;
pub use mmap::MmapError;
pub use mmap::MmapStore;
pub use mmap::MmapStoreBuilder;
pub use mmap::MmapStoreMut;
pub use mmap_regional::CompactionStats;
pub use mmap_regional::RegionalMmapConfig;
pub use mmap_regional::RegionalMmapError;
pub use mmap_regional::RegionalMmapStore;
pub use mmap_regional::SortedRunBuilder;
pub use partitioned::PartitionedError;
pub use partitioned::PartitionedGet;
pub use partitioned::PartitionedPut;
pub use partitioned::PartitionedResult;
pub use partitioned::PartitionedStore;
pub use partitioner::PartitionId;
pub use partitioner::PartitionResult;
pub use partitioner::Partitioner;
pub use partitioner::PhysicalNodeId;
pub use pq::ADCTable;
pub use pq::PQCodebook;
pub use pq::PQConfig;
pub use pq::PQVector;
pub use relational_slab::ColumnDef;
pub use relational_slab::ColumnType;
pub use relational_slab::ColumnValue;
pub use relational_slab::RangeOp;
pub use relational_slab::RelationalError;
pub use relational_slab::RelationalSlab;
pub use relational_slab::RelationalSlabSnapshot;
pub use relational_slab::Row;
pub use relational_slab::RowId;
pub use relational_slab::TableSchema;
pub use semantic_partitioner::EncodedEmbedding;
pub use semantic_partitioner::RoutingMethod;
pub use semantic_partitioner::SemanticPartitionResult;
pub use semantic_partitioner::SemanticPartitioner;
pub use semantic_partitioner::SemanticPartitionerConfig;
pub use semantic_partitioner::SemanticPartitionerStats;
pub use slab_router::SlabRouter;
pub use slab_router::SlabRouterConfig;
pub use slab_router::SlabRouterError;
pub use slab_router::SlabRouterSnapshot;
pub use snapshot::detect_version as snapshot_detect_version;
pub use snapshot::load as snapshot_load;
pub use snapshot::migrate_v2_to_v3 as snapshot_migrate;
pub use snapshot::save_v3 as snapshot_save;
pub use snapshot::HNSWNodeSnapshot;
pub use snapshot::HNSWSnapshot;
pub use snapshot::SnapshotFormatError;
pub use snapshot::SnapshotHeader;
pub use snapshot::SnapshotVersion;
pub use snapshot::V3Snapshot;
pub use snapshot::VoronoiPartitionerConfigSnapshot;
pub use snapshot::VoronoiSnapshot;
pub use sparse_vector::SparseVector;
pub use sparse_vector::SparseVectorError;
pub use sparse_vector::MAX_DIMENSION as SPARSE_MAX_DIMENSION;
pub use tiered::MigrationStrategy;
pub use tiered::TieredConfig;
pub use tiered::TieredError;
pub use tiered::TieredStats;
pub use tiered::TieredStore;
pub use voronoi::LocalityKey;
pub use voronoi::LocalityKeyGenerator;
pub use voronoi::VoronoiPartitioner;
pub use voronoi::VoronoiPartitionerConfig;
pub use voronoi::VoronoiRegion;
pub use wal::SyncMode;
pub use wal::TensorWal;
pub use wal::WalConfig;
pub use wal::WalEntry;
pub use wal::WalError;
pub use wal::WalRecovery;
pub use wal::WalResult;
pub use wal::WalStatus;
Modules§
- binary_quantization - Binary Quantization for extreme vector compression.
- blob_log - Append-only blob log with segment management.
- cache_ring - Fixed-size cache ring with configurable eviction strategies.
- consistent_hash - Consistent hash ring partitioner with virtual nodes.
- delta_vector - Delta-encoded vectors for efficient storage of clustered embeddings.
- distance - Distance metrics for geometric vector operations.
- durable_blob_log - Durable blob log with WAL-based crash recovery.
- embedding_slab - Dense embedding storage with chunked allocation.
- entity_index - Vocabulary-based entity index for O(log n) lookup with stable IDs.
- fields - Reserved field prefixes for unified entity storage.
- graph_tensor - CSR-based graph storage with append log.
- hnsw - HNSW (Hierarchical Navigable Small World) index for approximate nearest neighbor search.
- instrumentation - Memory instrumentation for tracking shard and node access patterns.
- ivf - IVF (Inverted File Index) for large-scale partitioned search.
- metadata_slab - Sharded BTreeMap-based metadata storage slab.
- mmap - Memory-mapped cold storage for tensor data.
- mmap_regional - Region-aware memory-mapped storage for geometric locality.
- partitioned - Partition-aware store wrapper for distributed operations.
- partitioner - Data partitioning traits for distributed storage.
- pq - Product Quantization for memory-efficient vector storage.
- relational_slab - Columnar storage for relational data.
- semantic_partitioner - Semantic partitioner for embedding-based data distribution.
- slab_router - Slab router for directing operations to specialized storage backends.
- snapshot - Snapshot format v2/v3 with backward compatibility.
- sparse_vector - Sparse Vector - Storage where zero doesn’t exist
- tiered - Two-tier hot/cold storage with automatic data migration.
- voronoi - Voronoi partitioner with explicit geometric region boundaries.
- wal - Write-Ahead Log for crash recovery.
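The "consistent hash ring partitioner with virtual nodes" listed above refers to a well-known technique, which can be sketched as follows. This is an illustration under stated assumptions, not the `ConsistentHashPartitioner` API: `HashRing`, `hash_of`, and the method names are hypothetical, and `DefaultHasher` stands in for whatever hash function the crate actually uses. Each physical node is placed on the ring many times, and a key is owned by the first ring position at or after the key's hash, wrapping around at the end.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hashes any hashable value to a position on the ring.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical hash ring; names are illustrative only.
pub struct HashRing {
    /// Ring position -> owning physical node.
    ring: BTreeMap<u64, String>,
    /// How many ring positions each physical node receives.
    virtual_nodes: usize,
}

impl HashRing {
    pub fn new(virtual_nodes: usize) -> Self {
        Self { ring: BTreeMap::new(), virtual_nodes }
    }

    /// Places `virtual_nodes` positions for `node` on the ring.
    pub fn add_node(&mut self, node: &str) {
        for replica in 0..self.virtual_nodes {
            self.ring.insert(hash_of(&(node, replica)), node.to_string());
        }
    }

    /// Removes every ring position owned by `node`.
    pub fn remove_node(&mut self, node: &str) {
        self.ring.retain(|_, owner| owner.as_str() != node);
    }

    /// Finds the owner of `key`: the first position clockwise from the
    /// key's hash, wrapping to the start of the ring if needed.
    pub fn node_for(&self, key: &str) -> Option<&str> {
        let position = hash_of(&key);
        self.ring
            .range(position..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, owner)| owner.as_str())
    }
}
```

The property that makes this useful for partition routing is that removing a node only reassigns the keys that node owned; every other key keeps its owner.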
Structs§
- BloomFilter - Thread-safe Bloom filter for fast negative lookups.
- EntityStore - Unified entity store that provides a shared storage layer for all engines.
- TensorData - An entity that can hold scalar properties, vector embeddings, and pointers to other tensors.
- TensorStore - Thread-safe key-value store backed by SlabRouter.
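To show what "fast negative lookups" means for a Bloom filter, here is a small sketch of the general data structure. It is not the crate's `BloomFilter`: `SimpleBloom`, its constructor parameters, and the double-hashing scheme are assumptions made for illustration. Atomic words let `insert` take `&self`, which is one common way to make such a filter usable from several threads.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical Bloom filter; names and parameters are illustrative.
pub struct SimpleBloom {
    bits: Vec<AtomicU64>,
    num_bits: u64,
    num_hashes: u32,
}

impl SimpleBloom {
    pub fn new(num_bits: u64, num_hashes: u32) -> Self {
        assert!(num_bits > 0 && num_hashes > 0);
        let words = ((num_bits + 63) / 64) as usize;
        let bits = (0..words).map(|_| AtomicU64::new(0)).collect();
        Self { bits, num_bits, num_hashes }
    }

    /// Two base hashes; probe i uses a + i * b (double hashing).
    fn hash_pair(key: &str) -> (u64, u64) {
        let mut first = DefaultHasher::new();
        key.hash(&mut first);
        let mut second = DefaultHasher::new();
        (key, 1u8).hash(&mut second); // salted so it differs from `first`
        (first.finish(), second.finish() | 1)
    }

    fn bit_index(&self, (a, b): (u64, u64), i: u64) -> u64 {
        a.wrapping_add(i.wrapping_mul(b)) % self.num_bits
    }

    /// Sets every probe bit for `key`.
    pub fn insert(&self, key: &str) {
        let pair = Self::hash_pair(key);
        for i in 0..u64::from(self.num_hashes) {
            let bit = self.bit_index(pair, i);
            self.bits[(bit / 64) as usize].fetch_or(1u64 << (bit % 64), Ordering::Relaxed);
        }
    }

    /// false means the key was definitely never inserted;
    /// true means it may have been (false positives are possible).
    pub fn might_contain(&self, key: &str) -> bool {
        let pair = Self::hash_pair(key);
        (0..u64::from(self.num_hashes)).all(|i| {
            let bit = self.bit_index(pair, i);
            let word = self.bits[(bit / 64) as usize].load(Ordering::Relaxed);
            (word & (1u64 << (bit % 64))) != 0
        })
    }
}
```

A store can consult such a filter before touching slower storage: a `false` answer skips the lookup entirely, while a `true` answer still requires the real lookup.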
Enums§
- ScalarValue - Scalar value types for entity properties.
- SnapshotError - Errors that can occur during snapshot operations.
- TensorStoreError - Errors that can occur during tensor store operations.
- TensorValue - Represents different types of values a tensor can hold.
Constants§
- DEFAULT_SPARSITY_THRESHOLD - Default sparsity threshold for auto-sparsification (70%)
- DEFAULT_VALUE_THRESHOLD - Default value threshold for pruning small values
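One plausible reading of how these two thresholds interact is sketched below. Everything here is an assumption except the 70% figure stated above: the `1e-6` value threshold, the `>=` comparison, and the `Stored` enum and `auto_sparsify` function are hypothetical and do not reflect the crate's actual types or constants. The idea is that entries whose magnitude is at or below the value threshold count as zero, and a vector is stored sparsely once the fraction of such entries reaches the sparsity threshold.

```rust
/// 70% as stated in the docs; the comparison used below is an assumption.
const SPARSITY_THRESHOLD: f32 = 0.7;
/// Assumed value for illustration; the real default is not given here.
const VALUE_THRESHOLD: f32 = 1e-6;

/// Hypothetical storage representation.
#[derive(Debug, PartialEq)]
pub enum Stored {
    Dense(Vec<f32>),
    Sparse {
        dimension: usize,
        indices: Vec<u32>,
        values: Vec<f32>,
    },
}

/// Converts a dense vector to sparse form when enough entries are near zero.
pub fn auto_sparsify(dense: Vec<f32>) -> Stored {
    if dense.is_empty() {
        return Stored::Dense(dense);
    }
    // Fraction of entries small enough to be pruned.
    let near_zero = dense.iter().filter(|v| v.abs() <= VALUE_THRESHOLD).count();
    let sparsity = near_zero as f32 / dense.len() as f32;
    if sparsity < SPARSITY_THRESHOLD {
        return Stored::Dense(dense);
    }
    // Keep only the significant entries, remembering their positions.
    let mut indices = Vec::new();
    let mut values = Vec::new();
    for (i, &v) in dense.iter().enumerate() {
        if v.abs() > VALUE_THRESHOLD {
            indices.push(i as u32);
            values.push(v);
        }
    }
    Stored::Sparse { dimension: dense.len(), indices, values }
}
```

With these assumed values, a 10-element vector holding two significant entries is stored as two index/value pairs plus its dimension, while a fully populated vector is left dense.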
Type Aliases§
- Result - Result type for tensor store operations.
- SnapshotResult - Result type for snapshot operations.