Vectorizer Rust SDK
High-performance Rust SDK for Vectorizer vector database.
Package: vectorizer-sdk
Version: 3.2.0 (RPC-first; HTTP fallback retained)
v3.2 — backpressure-aware HTTP client (HTTP 429 + Retry-After)
The HTTP transport honors the server-side bulk-upsert backpressure
shipped in Vectorizer 3.2.0 (#263). On HTTP 429 Too Many Requests the
client parses Retry-After (seconds form, 1 s default, 30 s cap),
sleeps, and retries up to 3 times before surfacing
VectorizerError::RateLimit. Pre-3.2.0 clients surfaced 429s as a
generic 5xx error, so the retry budget was lost. The vectorizer-sdk
parses Retry-After via parse_retry_after_secs in
src/http_transport.rs; regression tests live in
tests/retry_after_parse.rs.
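The retry behavior described above can be sketched with the standard library alone. This is an illustration, not the SDK's code: retry_after_secs, Attempt, and with_backpressure are hypothetical names, and the real parse_retry_after_secs may have a different signature. Only the 1 s default, the 30 s cap, and the 3-retry budget come from the text.

```rust
use std::time::Duration;

/// Wait used when `Retry-After` is missing or unparseable (value from the text above).
const DEFAULT_RETRY_SECS: u64 = 1;
/// Upper bound on a single wait (value from the text above).
const MAX_RETRY_SECS: u64 = 30;
/// Retries attempted after the first call (value from the text above).
const MAX_RETRIES: u32 = 3;

/// Hypothetical stand-in for the SDK parser: handles the seconds form only.
pub fn retry_after_secs(header: Option<&str>) -> u64 {
    header
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_RETRY_SECS)
        .min(MAX_RETRY_SECS)
}

/// Outcome of one request attempt, as seen by the retry loop.
pub enum Attempt<T> {
    Done(T),
    /// HTTP 429, carrying the raw `Retry-After` header value if present.
    RateLimited(Option<String>),
}

/// Calls `call` once, then retries up to `MAX_RETRIES` times on 429.
/// `sleep` is injected so the loop can run in tests without real waiting.
pub fn with_backpressure<T>(
    mut call: impl FnMut() -> Attempt<T>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, String> {
    for attempt in 0..=MAX_RETRIES {
        match call() {
            Attempt::Done(value) => return Ok(value),
            Attempt::RateLimited(header) if attempt < MAX_RETRIES => {
                sleep(Duration::from_secs(retry_after_secs(header.as_deref())));
            }
            // Retry budget exhausted: stop and report the rate limit.
            Attempt::RateLimited(_) => break,
        }
    }
    Err("rate limited: retry budget exhausted".to_string())
}
```

Injecting the sleep function keeps the loop deterministic under test; a real client would pass its async runtime's sleep instead.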
v3.1 — /insert_vectors + stable client-id upserts
- VectorizerClient::insert_vectors(...): bulk-insert pre-computed embeddings with caller-supplied vector ids. Skips the embedding pipeline entirely.
- insert / insert_texts: the request id is now used verbatim as the stored Vector.id (non-chunked) or as <id>#<chunk_index> (chunked). Re-running the same payload upserts in place.
- Chunked vectors expose a flat payload layout ({content, file_path, chunk_index, parent_id, ...user_metadata}); legacy nested payloads from ≤ 3.0.x stay readable during the deprecation window.
Client-id contract: non-empty, length ≤ 256, no leading/trailing
whitespace, must not contain #.
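A caller can check the client-id contract before sending a request. The sketch below is a local illustration: validate_client_id and chunk_vector_id are hypothetical helpers, not SDK functions, and the length limit is counted in bytes, which is an assumption (the contract does not say whether it means bytes or characters).

```rust
/// Maximum id length from the contract above.
/// Counted in bytes here, which is an assumption.
const MAX_ID_LEN: usize = 256;

/// Hypothetical client-side check mirroring the documented contract.
pub fn validate_client_id(id: &str) -> Result<(), &'static str> {
    if id.is_empty() {
        return Err("id must not be empty");
    }
    if id.len() > MAX_ID_LEN {
        return Err("id must be at most 256 bytes");
    }
    if id.trim() != id {
        return Err("id must not have leading or trailing whitespace");
    }
    if id.contains('#') {
        return Err("id must not contain '#'");
    }
    Ok(())
}

/// Builds the stored id of one chunk, following the `<id>#<chunk_index>` layout.
pub fn chunk_vector_id(id: &str, chunk_index: usize) -> String {
    format!("{id}#{chunk_index}")
}
```

Reserving # for the chunk separator is what makes the chunked form unambiguous: a stored id splits at its only # into the caller's id and the chunk index.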
✅ Status: v3.0.0 — VectorizerRPC default transport
v3.x ships with VectorizerRPC — length-prefixed MessagePack over
raw TCP — as the recommended primary transport. The HTTP path that
shipped in 2.x stays available behind the http Cargo feature
(default-on for backward compat). Pick the constructor that matches
the URL scheme you have:
| URL | Constructor | Transport |
|---|---|---|
| vectorizer://host:15503 | RpcClient::connect_url(url) | Binary RPC (recommended) |
| vectorizer://host | RpcClient::connect_url(url) | RPC on default port 15503 |
| host:15503 (no scheme) | RpcClient::connect_url(url) or RpcClient::connect("host:port") | RPC |
| http://host:15002 | VectorizerClient (HTTP path below) | REST (legacy) |
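The table's mapping can be expressed as a small classifier. This sketch is illustrative only: Endpoint and classify_url are hypothetical names, the handling of https:// is an assumption (the table lists only http://), and IPv6 literals are not handled.

```rust
/// Default RPC port from the table above.
const DEFAULT_RPC_PORT: u16 = 15503;

/// Hypothetical result type: which transport a URL selects.
#[derive(Debug, PartialEq)]
pub enum Endpoint {
    Rpc { host: String, port: u16 },
    Http { url: String },
}

/// Maps a URL to a transport following the table above.
pub fn classify_url(url: &str) -> Result<Endpoint, String> {
    // REST path. Accepting https:// is an assumption.
    if url.starts_with("http://") || url.starts_with("https://") {
        return Ok(Endpoint::Http { url: url.to_string() });
    }
    // `vectorizer://host[:port]` and bare `host[:port]` both select RPC.
    let authority = url.strip_prefix("vectorizer://").unwrap_or(url);
    if authority.is_empty() || authority.contains("://") {
        return Err(format!("unsupported URL: {url}"));
    }
    match authority.rsplit_once(':') {
        Some((host, port)) if !host.is_empty() => {
            let port = port
                .parse::<u16>()
                .map_err(|e| format!("bad port in {url}: {e}"))?;
            Ok(Endpoint::Rpc { host: host.to_string(), port })
        }
        Some(_) => Err(format!("missing host in {url}")),
        None => Ok(Endpoint::Rpc {
            host: authority.to_string(),
            port: DEFAULT_RPC_PORT,
        }),
    }
}
```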
Quick Start (RPC, recommended)
[dependencies]
vectorizer-sdk = "3.2"
tokio = { version = "1", features = ["full"] }
use ;
async
See examples/rpc_quickstart.rs for the runnable version. Wire spec:
docs/specs/VECTORIZER_RPC.md.
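To make "length-prefixed" concrete, here is a generic framing sketch over any Read/Write stream. It assumes a 4-byte big-endian length prefix and treats the payload as opaque bytes; the actual prefix width, byte order, and MessagePack message shapes are defined by the wire spec and may differ from this illustration.

```rust
use std::io::{self, Read, Write};

/// Writes one frame: a 4-byte big-endian length, then the payload.
/// The prefix width and byte order are assumptions, not taken from the wire spec.
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    let len = u32::try_from(payload.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "frame too large"))?;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

/// Reads one frame written by `write_frame`.
/// A production reader would also cap `len` before allocating.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

The prefix lets the reader know exactly how many bytes belong to the next message, so several requests can share one TCP stream without delimiters.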
Connection pooling
use ;
let pool = new;
let conn = pool.acquire.await?;
let collections = conn.client.list_collections.await?;
// `conn` returns to the pool on Drop.
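The return-on-Drop behavior noted in the last comment can be illustrated with a small synchronous pool. This is a sketch of the general pattern, not the SDK's pool: Pool, Pooled, and idle_count are hypothetical names, and the real pool is async and manages live RPC connections.

```rust
use std::ops::Deref;
use std::sync::{Arc, Mutex};

/// Hypothetical pool holding idle items of any type.
pub struct Pool<T> {
    idle: Arc<Mutex<Vec<T>>>,
}

/// Guard handed out by `acquire`; gives the item back when dropped.
pub struct Pooled<T> {
    item: Option<T>,
    idle: Arc<Mutex<Vec<T>>>,
}

impl<T> Pool<T> {
    pub fn new(items: Vec<T>) -> Self {
        Self { idle: Arc::new(Mutex::new(items)) }
    }

    /// Takes an idle item, or returns `None` when the pool is empty.
    pub fn acquire(&self) -> Option<Pooled<T>> {
        let item = self.idle.lock().unwrap().pop()?;
        Some(Pooled { item: Some(item), idle: Arc::clone(&self.idle) })
    }

    pub fn idle_count(&self) -> usize {
        self.idle.lock().unwrap().len()
    }
}

impl<T> Deref for Pooled<T> {
    type Target = T;
    fn deref(&self) -> &T {
        self.item.as_ref().expect("item is present until drop")
    }
}

impl<T> Drop for Pooled<T> {
    fn drop(&mut self) {
        // Hand the item back; skip silently if the mutex was poisoned.
        if let (Some(item), Ok(mut idle)) = (self.item.take(), self.idle.lock()) {
            idle.push(item);
        }
    }
}
```

Tying the return to Drop means an early return or a ? in the caller cannot leak a connection.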
Error handling
RpcClient returns Result<T, RpcClientError>. The variants:
- Io(std::io::Error): TCP-level failure.
- Server(String): server returned Err(message).
- ConnectionClosed: the background reader task exited (peer closed, or write failure mid-call).
- NotAuthenticated: local guard against issuing a data-plane command before HELLO succeeded; saves an unnecessary round-trip.
- Encode(rmp_serde::encode::Error): should be unreachable for v1 shapes (every type derives Serialize).
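One way a caller might act on these variants is sketched below. The enum is a local mirror written for illustration (Encode carries a String in place of rmp_serde::encode::Error so the sketch needs no dependencies), and Recovery and recovery_for are hypothetical; the policy shown is one possible choice, not SDK behavior.

```rust
use std::io;

/// Local mirror of the documented variants, for illustration only.
#[derive(Debug)]
pub enum RpcClientError {
    Io(io::Error),
    Server(String),
    ConnectionClosed,
    NotAuthenticated,
    /// Stand-in for `rmp_serde::encode::Error`.
    Encode(String),
}

/// Hypothetical follow-up actions a caller could take.
#[derive(Debug, PartialEq)]
pub enum Recovery {
    Reconnect,
    Authenticate,
    ReportToCaller,
}

/// One possible policy for mapping an error to a follow-up action.
pub fn recovery_for(err: &RpcClientError) -> Recovery {
    match err {
        // The socket is unusable, so a fresh connection is needed.
        RpcClientError::Io(_) | RpcClientError::ConnectionClosed => Recovery::Reconnect,
        // The connection is fine but the handshake has not completed.
        RpcClientError::NotAuthenticated => Recovery::Authenticate,
        // Retrying the same call would not change the outcome.
        RpcClientError::Server(_) | RpcClientError::Encode(_) => Recovery::ReportToCaller,
    }
}
```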
Quick Start (HTTP, legacy)
The 2.x VectorizerClient is preserved unchanged. The flat
1,989-line client.rs was split into a per-surface module tree in
the phase4_split-sdk-rust-client refactor — every public method
keeps its name and signature, but the implementation now lives
next to the surface it belongs to:
sdks/rust/src/
├── transport.rs # Transport trait (impl by HttpTransport, RpcTransport, ...)
├── http_transport.rs # REST backend
├── rpc/ # RPC backend (default in v3.x)
└── client/ # REST facade, split per API surface
├── mod.rs # struct VectorizerClient + ctors + with_transport()
├── core.rs # health_check
├── collections.rs # list/create/get/delete collection
├── vectors.rs # get_vector, insert_texts, embed_text
├── search.rs # search_vectors, intelligent/semantic/contextual/hybrid/multi
├── discovery.rs # discover, filter/score/expand
├── files.rs # 10 file-ops + upload + upload_config
├── graph.rs # 10 graph ops (nodes, edges, path, discovery)
└── qdrant.rs # 25 Qdrant-compatible /qdrant/* endpoints
Rust permits multiple inherent impl blocks for the same struct
anywhere in the crate that defines it, so every per-surface file just
adds an impl VectorizerClient { ... } block. The struct definition,
constructors, transport selection (get_read_transport /
get_write_transport), and the make_request helper live in
client/mod.rs; per-surface files only contain the user-facing
methods.
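The layout can be reproduced in miniature with inline modules standing in for the per-surface files. This is a simplified sketch: the method bodies only build a URL string, the endpoint paths are placeholders, and the real methods are async and return typed results.

```rust
pub mod client {
    /// Struct, constructor, and shared helper live in the parent module
    /// (client/mod.rs in the real tree).
    pub struct VectorizerClient {
        base_url: String,
    }

    impl VectorizerClient {
        pub fn new(base_url: &str) -> Self {
            Self { base_url: base_url.to_string() }
        }

        /// Private helper; child modules can call it because private items
        /// are visible to descendants of the defining module.
        fn make_request(&self, path: &str) -> String {
            format!("{}{}", self.base_url, path)
        }
    }

    /// Stands in for client/collections.rs.
    mod collections {
        use super::VectorizerClient;

        impl VectorizerClient {
            pub fn list_collections(&self) -> String {
                self.make_request("/collections") // placeholder path
            }
        }
    }

    /// Stands in for client/search.rs.
    mod search {
        use super::VectorizerClient;

        impl VectorizerClient {
            pub fn search_vectors(&self) -> String {
                self.make_request("/search") // placeholder path
            }
        }
    }
}
```

Callers see a single type with all methods attached; the split is invisible outside the crate, which is why the refactor could keep every public name and signature.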
RPC-readiness regression guard
VectorizerClient::with_transport(Arc<dyn Transport>, base_url) is
exposed as the test-only entry point that builds the client from
any Transport implementation. The
tests/mock_transport_regression.rs integration test exercises one
method from each of the eight per-surface modules through an
in-memory mock, proving the surface modules don't hard-code
HttpTransport. When phase6_sdk-rust-rpc's RpcTransport lands,
it satisfies the same Transport trait — every per-surface call
routes through it without a single per-method edit.
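The idea behind the regression guard can be shown with a reduced, synchronous Transport trait. Everything here is a simplified assumption: the real trait is async and has a richer surface, the real with_transport also takes a base_url, and MockTransport and Client are hypothetical names.

```rust
use std::sync::{Arc, Mutex};

/// Reduced stand-in for the SDK's Transport trait (sync here for brevity).
pub trait Transport: Send + Sync {
    fn get(&self, path: &str) -> Result<String, String>;
}

/// In-memory mock that records every path it is asked for.
pub struct MockTransport {
    pub seen: Mutex<Vec<String>>,
}

impl Transport for MockTransport {
    fn get(&self, path: &str) -> Result<String, String> {
        self.seen.lock().unwrap().push(path.to_string());
        Ok("{\"status\":\"ok\"}".to_string())
    }
}

/// Client that only knows about the trait object, never a concrete transport.
pub struct Client {
    transport: Arc<dyn Transport>,
}

impl Client {
    pub fn with_transport(transport: Arc<dyn Transport>) -> Self {
        Self { transport }
    }

    pub fn health_check(&self) -> Result<String, String> {
        self.transport.get("/health") // placeholder path
    }
}
```

If a surface method reached for a concrete HTTP type instead of the trait object, a test built on the mock would fail to record the call, which is the property the regression test relies on.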
To opt into a slim build with RPC only:
[dependencies]
vectorizer-sdk = { version = "3.0", default-features = false, features = ["rpc"] }
To use the HTTP client:
use *;
async
Features
- 🚀 High Performance: Optimized async transport layer
- 🔄 Async/Await: Full async/await support with Tokio
- 📡 Multiple Protocols: HTTP/HTTPS and UMICP support
- 🔍 Semantic Search: Vector similarity search with multiple metrics
- 🧠 Intelligent Search: Advanced multi-query search with domain expansion
- 🎯 Contextual Search: Context-aware search with metadata filtering
- 🔗 Multi-Collection Search: Cross-collection search with intelligent aggregation
- 📦 Batch Operations: Efficient bulk text insertion
- 🛡️ Type Safety: Strongly typed API with comprehensive error handling
- 🔧 Easy Setup: Simple client creation with sensible defaults
- 📊 Health Monitoring: Built-in health checks and statistics
Installation
HTTP Transport (Default)
Add to Cargo.toml:
[]
= "2.2.0"
= { = "1.35", = ["full"] }
= "1.0"
UMICP Transport (High Performance)
Enable the UMICP feature for high-performance protocol support:
[]
= { = "2.1.0", = ["umicp"] }
= { = "1.35", = ["full"] }
= "1.0"
Configuration
HTTP Configuration (Default)
use ;
// Default configuration
let client = new_default?;
// Custom URL
let client = new_with_url?;
// With API key
let client = new_with_api_key?;
// Advanced configuration
let client = new?;
UMICP Configuration (High Performance)
UMICP (Universal Messaging and Inter-process Communication Protocol) is an optional transport with lower protocol overhead than HTTP, optimized for large payloads.
Using Connection String
use VectorizerClient;
let client = from_connection_string?;
println!;
Using Explicit Configuration
use ;
let client = new?;
When to Use UMICP
Use UMICP when:
- Large Payloads: Inserting or searching large batches of vectors
- High Throughput: Need maximum performance for production workloads
- Low Latency: Need minimal protocol overhead
Use HTTP when:
- Development: Quick testing and debugging
- Firewall Restrictions: Only HTTP/HTTPS allowed
- Simple Deployments: No need for custom protocol setup
Protocol Comparison
| Feature | HTTP/HTTPS | UMICP |
|---|---|---|
| Transport | reqwest (standard HTTP) | umicp-core crate |
| Performance | Standard | Optimized for large payloads |
| Latency | Standard | Lower overhead |
| Firewall | Widely supported | May require configuration |
| Build Time | Fast | Requires UMICP feature |
Master/Replica Configuration (Read/Write Separation)
Vectorizer supports master-replica replication for high availability and read scaling. The SDK routes operations automatically: writes go to the master, and reads are distributed across replicas.
Basic Setup
use ;
// Configure with master and replicas - SDK handles routing automatically
let client = builder
.master
.replica
.replica
.api_key
.read_preference
.build?;
// Writes automatically go to master
client.create_collection.await?;
client.insert_texts.await?;
// Reads automatically go to replicas (load balanced)
let results = client.search_vectors.await?;
let collections = client.list_collections.await?;
Read Preferences
| Preference | Description | Use Case |
|---|---|---|
| ReadPreference::Replica | Route reads to replicas (round-robin) | Default for high read throughput |
| ReadPreference::Master | Route all reads to master | When you need read-your-writes consistency |
| ReadPreference::Nearest | Route to the node with lowest latency | Geo-distributed deployments |
Read-Your-Writes Consistency
For operations that need to immediately read what was just written:
// Option 1: Override read preference for specific operation
client.insert_texts.await?;
let result = client.get_vector_with_preference.await?;
// Option 2: Use a scoped master context
client.with_master.await?;
Automatic Operation Routing
The SDK automatically classifies operations:
| Operation Type | Routed To | Methods |
|---|---|---|
| Writes | Always Master | insert_texts, insert_vectors, update_vector, delete_vector, create_collection, delete_collection |
| Reads | Based on ReadPreference | search_vectors, get_vector, list_collections, intelligent_search, semantic_search, hybrid_search |
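The classification and routing rules in the two tables above can be sketched as follows. Router, OpKind, and classify_method are hypothetical names, and the SDK's internals may differ. In particular, Nearest here compares only replica latencies supplied by the caller, which is an assumption about how latency is obtained and which nodes are considered.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ReadPreference { Replica, Master, Nearest }

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum OpKind { Write, Read }

/// Classifies a method name using the lists in the table above.
pub fn classify_method(method: &str) -> Option<OpKind> {
    match method {
        "insert_texts" | "insert_vectors" | "update_vector" | "delete_vector"
        | "create_collection" | "delete_collection" => Some(OpKind::Write),
        "search_vectors" | "get_vector" | "list_collections" | "intelligent_search"
        | "semantic_search" | "hybrid_search" => Some(OpKind::Read),
        _ => None,
    }
}

/// Hypothetical router: picks a node URL for each operation.
pub struct Router {
    master: String,
    /// (url, latency in ms). Latency values are supplied by the caller here.
    replicas: Vec<(String, u64)>,
    preference: ReadPreference,
    next: AtomicUsize,
}

impl Router {
    pub fn new(master: &str, replicas: Vec<(&str, u64)>, preference: ReadPreference) -> Self {
        Self {
            master: master.to_string(),
            replicas: replicas.into_iter().map(|(u, ms)| (u.to_string(), ms)).collect(),
            preference,
            next: AtomicUsize::new(0),
        }
    }

    pub fn route(&self, kind: OpKind) -> &str {
        // Writes always go to the master; so do reads when no replica exists.
        if kind == OpKind::Write || self.replicas.is_empty() {
            return self.master.as_str();
        }
        match self.preference {
            ReadPreference::Master => self.master.as_str(),
            ReadPreference::Replica => {
                // Round-robin over replicas.
                let i = self.next.fetch_add(1, Ordering::Relaxed) % self.replicas.len();
                self.replicas[i].0.as_str()
            }
            ReadPreference::Nearest => self
                .replicas
                .iter()
                .min_by_key(|(_, ms)| *ms)
                .map(|(url, _)| url.as_str())
                .unwrap_or(self.master.as_str()),
        }
    }
}
```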
Standalone Mode (Single Node)
For development or single-node deployments:
// Single node - no replication
let client = new_with_api_key?;
API Endpoints
✅ Health & Monitoring
- health_check() - Server health and statistics
- list_collections() - List all available collections
✅ Collection Management
- create_collection() - Create new vector collection
- get_collection_info() - Get collection details (limited support)
- delete_collection() - Delete collection (limited support)
✅ Vector Operations
- search_vectors() - Semantic search with text queries
- insert_texts() - Batch text insertion (limited support)
- get_vector() - Retrieve individual vectors (limited support)
✅ Embedding (Future)
- embed_text() - Generate embeddings (endpoint not available)
Tier demotion (issue #265)
Three methods cover the cortex consolidation-tier pruner pattern: delete one vector, batch-delete by id, and move vectors between collections without re-embedding.
use VectorizerClient;
let client = new;
// Single delete.
client.delete_vector.await?;
// Batch delete with per-id status.
let report = client
.delete_vectors
.await?;
println!;
// Tier demotion: move aged vectors hot → warm without re-embedding.
let aged: = collect_aged_ids.await?;
let mv = client
.move_to_collection
.await?;
for row in mv.results.iter.filter
The move_to_collection server endpoint inserts into dst BEFORE
deleting from src. A mid-batch crash leaves a recoverable duplicate
(never data loss). Per-id outcomes (ok | missing_in_src |
dst_insert_failed | src_delete_failed) populate MoveReport.results
without aborting the batch.
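The insert-before-delete ordering can be modeled in memory with two maps standing in for the collections. This is an illustration of the ordering guarantee only: the function and MoveStatus type are hypothetical, the real operation runs on the server, and the two failure outcomes (dst_insert_failed, src_delete_failed) are omitted because in-memory map operations cannot fail.

```rust
use std::collections::HashMap;

/// Subset of the documented per-id outcomes that an in-memory model can produce.
#[derive(Debug, PartialEq)]
pub enum MoveStatus {
    Ok,
    MissingInSrc,
}

/// In-memory model of a move: copy into `dst` first, delete from `src` second.
/// A missing id is reported and the batch continues.
pub fn move_to_collection(
    src: &mut HashMap<String, Vec<f32>>,
    dst: &mut HashMap<String, Vec<f32>>,
    ids: &[&str],
) -> Vec<(String, MoveStatus)> {
    ids.iter()
        .map(|id| {
            let status = match src.get(*id).cloned() {
                None => MoveStatus::MissingInSrc,
                Some(vector) => {
                    // Step 1: the vector exists in dst before src is touched.
                    dst.insert((*id).to_string(), vector);
                    // Step 2: a crash before this line leaves a duplicate, not a loss.
                    src.remove(*id);
                    MoveStatus::Ok
                }
            };
            ((*id).to_string(), status)
        })
        .collect()
}
```

Because the stored embedding is copied as-is, no re-embedding happens, and the window between the two steps is the only point where a vector can exist in both collections.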
Control surface (3.4)
Admin / observability
use VectorizerClient;
async
Auth
use VectorizerClient;
async
Replication
use VectorizerClient;
async
Discovery pipeline
The discovery pipeline chains six stages from broad search to final LLM-ready prompt:
use VectorizerClient;
async
Hub backups
use VectorizerClient;
async
Examples
Run the examples to see the SDK in action:
# Basic usage example
# Comprehensive test suite (9/9 tests passing)
Testing
The SDK includes comprehensive tests that verify:
- ✅ Client creation and configuration
- ✅ Health check functionality
- ✅ Collection listing and information
- ✅ Vector search operations
- ✅ Collection creation
- ✅ Error handling and edge cases
Test Results: 9/9 endpoints functional (100% success rate)
Compatibility
- Rust: 1.90.0+ (Rust 2024 edition)
- Vectorizer Server: v0.20.0+
- HTTP: REST API with JSON payloads
- UMICP: Optional feature (enable with --features umicp)
- Async Runtime: Tokio 1.35+
Building
HTTP Only (Default)
With UMICP Support
Run Tests
# HTTP tests only
# UMICP tests
# Specific test
Run Examples
# HTTP example
# UMICP example (requires feature)
Error Handling
The SDK provides comprehensive error types:
use ;
match client.search_vectors.await
Qdrant Feature Parity
The SDK provides full compatibility with the Qdrant 1.14.x REST API:
Snapshots API
// List collection snapshots
let snapshots = client.qdrant_list_collection_snapshots.await?;
// Create snapshot
let snapshot = client.qdrant_create_collection_snapshot.await?;
// Delete snapshot
client.qdrant_delete_collection_snapshot.await?;
// Recover from snapshot
client.qdrant_recover_collection_snapshot.await?;
// Full snapshot (all collections)
let full_snapshot = client.qdrant_create_full_snapshot.await?;
Sharding API
// List shard keys
let shard_keys = client.qdrant_list_shard_keys.await?;
// Create shard key
let shard_config = json!;
client.qdrant_create_shard_key.await?;
// Delete shard key
client.qdrant_delete_shard_key.await?;
Cluster Management API
// Get cluster status
let status = client.qdrant_get_cluster_status.await?;
// Recover current peer
client.qdrant_cluster_recover.await?;
// Remove peer
client.qdrant_remove_peer.await?;
// Metadata operations
let metadata_keys = client.qdrant_list_metadata_keys.await?;
let key_value = client.qdrant_get_metadata_key.await?;
let value = json!;
client.qdrant_update_metadata_key.await?;
Query API
// Basic query
let query_request = json!;
let results = client.qdrant_query_points.await?;
// Query with prefetch (multi-stage retrieval)
let prefetch_request = json!;
let results = client.qdrant_query_points.await?;
// Batch query
let batch_request = json!;
let results = client.qdrant_batch_query_points.await?;
// Query groups
let groups_request = json!;
let results = client.qdrant_query_points_groups.await?;
Search Groups & Matrix API
// Search groups
let search_groups_request = json!;
let groups = client.qdrant_search_points_groups.await?;
// Search matrix pairs (pairwise similarity)
let matrix_request = json!;
let pairs = client.qdrant_search_matrix_pairs.await?;
// Search matrix offsets (compact format)
let offsets = client.qdrant_search_matrix_offsets.await?;
Contributing
This SDK is ready for production use. All endpoints have been tested and verified functional.