Ruvector CLI
Command-line interface and MCP server for high-performance vector database operations.
Professional CLI tools for managing Ruvector vector databases with sub-millisecond query performance, batch operations, and MCP integration.
🌟 Overview
The Ruvector CLI provides a comprehensive command-line interface for:
- Database Management: Create and configure vector databases
- Data Operations: Insert, search, and export vector data
- Performance Benchmarking: Test query performance and throughput
- Format Support: JSON, CSV, and NumPy array formats
- MCP Server: Model Context Protocol server for AI integrations
- Batch Processing: Efficient bulk operations with progress tracking
⚡ Quick Start
Installation
Install via Cargo:
Or build from source:
# Clone repository
# Build CLI
# Install locally
Basic Usage
# Create a new database
# Insert vectors from JSON
# Search for similar vectors
# Show database information
# Run performance benchmark
📋 Command Reference
Global Options
All commands support these global options:
Commands
create - Create a New Database
Create a new vector database with specified dimensions.
Examples:
# Create database for 384-dimensional embeddings (e.g., MiniLM)
# Create database with custom path
# Create for large embeddings (e.g., text-embedding-3-large)
insert - Insert Vectors from File
Bulk insert vectors from JSON, CSV, or NumPy files.
Input Formats:
JSON (array of vector entries):
CSV (id, vector_json, metadata_json):
id,vector,metadata
doc_1,"[0.1, 0.2, 0.3]","{\"title\": \"Document 1\"}"
doc_2,"[0.4, 0.5, 0.6]","{\"title\": \"Document 2\"}"
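One way to sanity-check a CSV file in this layout before inserting it is to parse it yourself. The sketch below is not part of Ruvector: it uses Python's standard `csv` module and assumes that quotes inside the JSON columns are backslash-escaped, as in the sample rows. The function name `parse_vector_csv` is hypothetical.

```python
import csv
import io
import json

# Sample rows copied from the CSV layout shown above.
SAMPLE_CSV = r'''id,vector,metadata
doc_1,"[0.1, 0.2, 0.3]","{\"title\": \"Document 1\"}"
doc_2,"[0.4, 0.5, 0.6]","{\"title\": \"Document 2\"}"
'''


def parse_vector_csv(text):
    """Parse CSV text into a list of (id, vector, metadata) tuples.

    Assumes backslash-escaped quotes inside quoted fields, as in the sample.
    """
    reader = csv.reader(io.StringIO(text), escapechar="\\", doublequote=False)
    header = next(reader)
    if header != ["id", "vector", "metadata"]:
        raise ValueError(f"unexpected header: {header}")
    rows = []
    for vec_id, vector_json, metadata_json in reader:
        # The vector and metadata columns each hold a JSON document.
        rows.append((vec_id, json.loads(vector_json), json.loads(metadata_json)))
    return rows


if __name__ == "__main__":
    for vec_id, vector, metadata in parse_vector_csv(SAMPLE_CSV):
        print(vec_id, len(vector), metadata["title"])
```

If a file was written with doubled quotes (`""`) instead of backslash escapes, drop the `escapechar` and `doublequote` arguments so the reader falls back to the module defaults.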
NumPy (.npy file with 2D array):
Examples:
# Insert from JSON file
# Insert from CSV with progress
# Insert from NumPy array
# Batch insert without progress bar
search - Search for Similar Vectors
Find k-nearest neighbors for a query vector.
Query Formats:
# Comma-separated floats
# JSON array
# From file (using shell)
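When the query vector comes from a script, it helps to produce either of the two textual forms named above (comma-separated floats or a JSON array) programmatically. The following is a minimal client-side sketch with hypothetical helper names; it does not show how the CLI itself parses its arguments.

```python
import json


def to_comma_separated(vector):
    """Render a vector as comma-separated floats, e.g. '0.1,0.2,0.3'."""
    return ",".join(repr(float(x)) for x in vector)


def to_json_array(vector):
    """Render a vector as a JSON array string, e.g. '[0.1, 0.2, 0.3]'."""
    return json.dumps([float(x) for x in vector])


def parse_query(text):
    """Accept either textual form and return a list of floats."""
    text = text.strip()
    if text.startswith("["):
        return [float(x) for x in json.loads(text)]
    return [float(part) for part in text.split(",")]


if __name__ == "__main__":
    query = [0.1, 0.2, 0.3]
    print(to_comma_separated(query))
    print(to_json_array(query))
```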
Examples:
# Search for top 10 similar vectors
# Search with full vector output
# Search for top 50 results
Output:
🔍 Search Results (top 10)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
#1 doc_42 similarity: 0.9876
#2 doc_128 similarity: 0.9543
#3 doc_89 similarity: 0.9321
...
Search completed in 0.48ms
info - Show Database Information
Display database statistics and configuration.
Examples:
# Show default database info
# Show custom database info
Output:
📊 Database Statistics
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Total vectors: 1,234,567
Dimensions: 384
Distance metric: Cosine
HNSW Configuration:
M: 16
ef_construction: 200
ef_search: 100
benchmark - Run Performance Benchmark
Test query performance with random vectors.
Examples:
# Quick benchmark (1000 queries)
# Extended benchmark (10,000 queries)
# Benchmark specific database
Output:
Running benchmark...
Queries: 1000
Dimensions: 384
Benchmark Results:
Total time: 0.48s
Queries per second: 2083
Average latency: 0.48ms
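The three figures in the sample output are related by simple arithmetic. The sketch below assumes the queries run one after another, so that throughput is the query count divided by total time and average latency is its reciprocal; the sample numbers are consistent with that assumption, but the benchmark's actual method is not described here.

```python
def summarize_benchmark(num_queries, total_seconds):
    """Return (queries_per_second, average_latency_ms) for a sequential run."""
    qps = num_queries / total_seconds
    avg_latency_ms = (total_seconds / num_queries) * 1000.0
    return qps, avg_latency_ms


if __name__ == "__main__":
    # Values taken from the sample output above.
    qps, latency_ms = summarize_benchmark(1000, 0.48)
    print(f"Queries per second: {int(qps)}")       # prints 2083
    print(f"Average latency: {latency_ms:.2f}ms")  # prints 0.48ms
```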
export - Export Database to File
Export vector data to JSON or CSV format.
Examples:
# Export to JSON
# Export to CSV
# Export with custom database
Note: Export functionality requires the VectorDB::all_ids() method. This feature is planned for a future release.
import - Import from Other Vector Databases
Import vectors from external vector database formats.
Examples:
# Import from FAISS index
# Import from Pinecone export
# Import from Weaviate backup
Note: Import functionality for external databases is planned for future releases.
🔧 Configuration
Configuration File
Create a ruvector.toml configuration file for default settings:
[]
= "./ruvector.db"
= 384
= "Cosine" # Cosine, Euclidean, DotProduct, Manhattan
[]
= 16
= 200
= 100
[]
= "Scalar" # Scalar, Product, or None
[]
= true
= true
= 1000
[]
= "127.0.0.1"
= 3000
= true
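The four metric names listed in the configuration (Cosine, Euclidean, DotProduct, Manhattan) correspond to standard definitions. As a reference for choosing between them, here is a plain-Python sketch of those textbook formulas. It illustrates the math only and makes no claim about how Ruvector computes or normalizes its scores internally.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product of the two vectors after scaling each to unit length."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / norm


def euclidean_distance(a, b):
    """Straight-line (L2) distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1); smaller means more similar."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

Cosine ignores vector length, which is why it is a common choice for text embeddings; DotProduct gives the same ranking as Cosine only when all vectors are normalized to unit length.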
Configuration Locations
The CLI searches for configuration files in this order:
- Path specified via the --config flag
- ./ruvector.toml (current directory)
- ./.ruvector.toml (current directory, hidden)
- ~/.config/ruvector/config.toml (user config)
- /etc/ruvector/config.toml (system config)
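The lookup order above amounts to a first-match search. The sketch below mirrors it in Python to make the precedence explicit. The function name `find_config` is hypothetical, and treating a missing `--config` path as a reason to fall through to the next location is an assumption, not documented behavior.

```python
import os

# Default locations, in the order listed above.
DEFAULT_CANDIDATES = [
    "./ruvector.toml",
    "./.ruvector.toml",
    "~/.config/ruvector/config.toml",
    "/etc/ruvector/config.toml",
]


def find_config(explicit_path=None, exists=os.path.isfile):
    """Return the first existing config path, or None if none is found.

    `explicit_path` stands in for the value of the --config flag.
    `exists` is injectable so the search order can be tested without files.
    """
    candidates = ([explicit_path] if explicit_path else []) + DEFAULT_CANDIDATES
    for candidate in candidates:
        path = os.path.expanduser(candidate)  # resolve a leading "~"
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    print(find_config() or "no configuration file found")
```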
Environment Variables
Override configuration with environment variables:
# Database settings
# MCP server settings
# Run with environment overrides
🔌 MCP Server
The Ruvector CLI includes a Model Context Protocol (MCP) server for AI agent integration.
Start MCP Server
STDIO Transport (for local AI tools):
SSE Transport (for web-based AI tools):
With Configuration:
MCP Integration Examples
Claude Desktop Integration (claude_desktop_config.json):
HTTP/SSE Client:
const evtSource = ;
evtSource.;
// Send search request
;
📊 Common Workflows
RAG System Setup
Build a retrieval-augmented generation (RAG) system:
# 1. Create database for your embedding model
# 2. Generate embeddings and save to JSON
# (Use your preferred embedding model)
# 3. Insert embeddings
# 4. Query for relevant context
# 5. Start MCP server for AI agent access
Semantic Search Engine
Build a semantic search system:
# Create database
# Batch insert documents
# Benchmark performance
# Search interface via MCP
Migration from Other Databases
Migrate from existing vector databases:
# 1. Export from source database
# (Use source database's export tools)
# 2. Create Ruvector database
# 3. Import data (planned feature)
# 4. Verify migration
Performance Testing
Test vector database performance:
# Create test database
# Generate synthetic test data
# Insert test data
# Run comprehensive benchmark
# Test search performance
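For the synthetic test data step, a short script is usually enough. The sketch below generates reproducible random vectors with the standard library only. The entry fields (`id`, `vector`, `metadata`) mirror the CSV columns documented for insert and are an assumption about the JSON layout; the output file name `test_vectors.json` is likewise hypothetical.

```python
import json
import random


def make_synthetic_entries(count, dimensions, seed=0):
    """Return `count` entries with uniformly random vectors in [-1, 1]."""
    rng = random.Random(seed)  # fixed seed keeps benchmark runs comparable
    return [
        {
            "id": f"test_{i}",
            "vector": [rng.uniform(-1.0, 1.0) for _ in range(dimensions)],
            "metadata": {"source": "synthetic"},
        }
        for i in range(count)
    ]


if __name__ == "__main__":
    entries = make_synthetic_entries(count=1000, dimensions=384)
    with open("test_vectors.json", "w") as handle:
        json.dump(entries, handle)
```

Random vectors exercise throughput but say little about recall on real data, so results from this kind of test are best read as an upper-level smoke check.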
🎯 Shell Completion
Generate shell completion scripts for faster command entry:
Bash
# Generate completion script
# Or add to ~/.bashrc
Zsh
# Add to ~/.zshrc
Fish
# Generate and save completion
⚙️ Performance Tips
Optimize Insertion
# Use larger batch sizes for bulk inserts (set in config)
# Disable progress bar for maximum speed
Optimize Search
Configure HNSW parameters for your use case:
- M = 32: higher M gives better recall but uses more memory
- ef_construction = 400: higher ef_construction gives better index quality but slower builds
- ef_search = 200: higher ef_search gives better recall but slower queries
Memory Optimization
Enable quantization to reduce memory usage:
For example, the "Product" quantization setting gives a 4-8x memory reduction.
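To see what a 4-8x reduction means in practice, it helps to estimate the raw vector footprint first. The sketch below assumes vectors are stored as 32-bit floats (4 bytes per component) and ignores index and metadata overhead; both are simplifying assumptions rather than documented Ruvector internals.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_component=4):
    """Size of the unquantized vector data alone, in bytes."""
    return num_vectors * dimensions * bytes_per_component


def quantized_range_bytes(raw_bytes, min_factor=4, max_factor=8):
    """Return (best_case, worst_case) sizes for a min..max reduction factor."""
    return raw_bytes / max_factor, raw_bytes / min_factor


if __name__ == "__main__":
    # Vector count and dimensions taken from the sample `info` output.
    raw = raw_vector_bytes(1_234_567, 384)
    low, high = quantized_range_bytes(raw)
    gib = 1024 ** 3
    print(f"raw: {raw / gib:.2f} GiB")
    print(f"quantized: {low / gib:.2f} to {high / gib:.2f} GiB")
```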
Benchmarking Tips
# Run warm-up queries first
# Then benchmark
# Test different k values
🔗 Related Documentation
- Rust API Reference - Core Ruvector API
- Getting Started Guide - Complete tutorial
- Performance Tuning - Optimization guide
- Main README - Project overview
🐛 Troubleshooting
Common Issues
Database file not found:
# Ensure database exists
# Or create it first
Dimension mismatch:
# Error: "Vector dimension mismatch"
# Solution: Ensure all vectors match database dimensions
# Check database dimensions
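A dimension mismatch is easiest to track down before insertion. The following sketch scans a list of vectors and reports which ones differ from the expected length; the helper name `find_dimension_mismatches` is hypothetical and the check runs entirely on your side, independent of the CLI.

```python
def find_dimension_mismatches(vectors, expected_dimensions):
    """Return (index, actual_length) for every vector of the wrong length."""
    return [
        (index, len(vector))
        for index, vector in enumerate(vectors)
        if len(vector) != expected_dimensions
    ]


if __name__ == "__main__":
    vectors = [[0.1, 0.2, 0.3], [0.4, 0.5], [0.6, 0.7, 0.8]]
    for index, actual in find_dimension_mismatches(vectors, expected_dimensions=3):
        print(f"vector {index} has {actual} dimensions, expected 3")
```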
Invalid query format:
# Use proper JSON or comma-separated format
MCP server connection issues:
# Check if port is available
# Try different port
# Enable debug logging
🤝 Contributing
Contributions welcome! Please see the Contributing Guidelines.
Development Setup
# Clone repository
# Run tests
# Check formatting
# Run clippy
# Build release
📜 License
MIT License - see LICENSE for details.
🙏 Acknowledgments
Built with:
- clap - Command-line argument parsing
- tokio - Async runtime
- serde - Serialization framework
- indicatif - Progress bars and spinners
- colored - Terminal colors