# Ruvector CLI
[](https://opensource.org/licenses/MIT)
[](https://www.rust-lang.org)
**Command-line interface and MCP server for high-performance vector database operations.**
> Professional CLI tools for managing Ruvector vector databases with sub-millisecond query performance, batch operations, and MCP integration.
## 🌟 Overview
The Ruvector CLI provides a comprehensive command-line interface for:
- **Database Management**: Create and configure vector databases
- **Data Operations**: Insert, search, and export vector data
- **Performance Benchmarking**: Test query performance and throughput
- **Format Support**: JSON, CSV, and NumPy array formats
- **MCP Server**: Model Context Protocol server for AI integrations
- **Batch Processing**: Efficient bulk operations with progress tracking
## ⚡ Quick Start
### Installation
Install via Cargo:
```bash
cargo install ruvector-cli
```
Or build from source:
```bash
# Clone repository
git clone https://github.com/ruvnet/ruvector.git
cd ruvector
# Build CLI
cargo build --release -p ruvector-cli
# Install locally
cargo install --path crates/ruvector-cli
```
### Basic Usage
```bash
# Create a new database
ruvector create --dimensions 384 --path ./my-vectors.db
# Insert vectors from JSON
ruvector insert --db ./my-vectors.db --input vectors.json --format json
# Search for similar vectors
ruvector search --db ./my-vectors.db --query "[0.1, 0.2, 0.3, ...]" --top-k 10
# Show database information
ruvector info --db ./my-vectors.db
# Run performance benchmark
ruvector benchmark --db ./my-vectors.db --queries 1000
```
## 📋 Command Reference
### Global Options
All commands support these global options:
```bash
-c, --config <FILE> Configuration file path
-d, --debug Enable debug logging
--no-color Disable colored output
-h, --help Print help information
-V, --version Print version information
```
### Commands
#### `create` - Create a New Database
Create a new vector database with specified dimensions.
```bash
ruvector create [OPTIONS] --dimensions <DIMENSIONS>
Options:
-p, --path <PATH> Database file path [default: ./ruvector.db]
-d, --dimensions <DIMENSIONS> Vector dimensions (required)
```
**Examples:**
```bash
# Create database for 384-dimensional embeddings (e.g., MiniLM)
ruvector create --dimensions 384
# Create database with custom path
ruvector create --dimensions 1536 --path ./embeddings.db
# Create for large embeddings (e.g., text-embedding-3-large)
ruvector create --dimensions 3072 --path ./large-embeddings.db
```
#### `insert` - Insert Vectors from File
Bulk insert vectors from JSON, CSV, or NumPy files.
```bash
ruvector insert [OPTIONS] --input <FILE>
Options:
-d, --db <PATH> Database file path [default: ./ruvector.db]
-i, --input <FILE> Input file path (required)
-f, --format <FORMAT> Input format: json, csv, npy [default: json]
--no-progress Hide progress bar
```
**Input Formats:**
**JSON** (array of vector entries):
```json
[
{
"id": "doc_1",
"vector": [0.1, 0.2, 0.3, ...],
"metadata": {"title": "Document 1", "category": "tech"}
},
{
"id": "doc_2",
"vector": [0.4, 0.5, 0.6, ...],
"metadata": {"title": "Document 2", "category": "science"}
}
]
```
**CSV** (id, vector_json, metadata_json):
```csv
id,vector,metadata
doc_1,"[0.1, 0.2, 0.3]","{\"title\": \"Document 1\"}"
doc_2,"[0.4, 0.5, 0.6]","{\"title\": \"Document 2\"}"
```
**NumPy** (.npy file with 2D array):
```python
import numpy as np
vectors = np.random.randn(1000, 384).astype(np.float32)
np.save('vectors.npy', vectors)
```
**Examples:**
```bash
# Insert from JSON file
ruvector insert --input embeddings.json --format json
# Insert from CSV with progress
ruvector insert --input data.csv --format csv
# Insert from NumPy array
ruvector insert --input vectors.npy --format npy
# Batch insert without progress bar
ruvector insert --input large-dataset.json --no-progress
```
#### `search` - Search for Similar Vectors
Find k-nearest neighbors for a query vector.
```bash
ruvector search [OPTIONS] --query <VECTOR>
Options:
-d, --db <PATH> Database file path [default: ./ruvector.db]
-q, --query <VECTOR> Query vector (comma-separated or JSON array)
-k, --top-k <K> Number of results to return [default: 10]
--show-vectors Show full vectors in results
```
**Query Formats:**
```bash
# Comma-separated floats
ruvector search --query "0.1, 0.2, 0.3, 0.4, ..."
# JSON array
ruvector search --query "[0.1, 0.2, 0.3, 0.4, ...]"
# From file (using shell)
ruvector search --query "$(cat query.json)"
```
**Examples:**
```bash
# Search for top 10 similar vectors
ruvector search --query "[0.1, 0.2, 0.3, ...]" --top-k 10
# Search with full vector output
ruvector search --query "0.1, 0.2, 0.3, ..." --show-vectors
# Search for top 50 results
ruvector search --query "[0.1, 0.2, ...]" -k 50
```
**Output:**
```
🔍 Search Results (top 10)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
#1 doc_42 similarity: 0.9876
#2 doc_128 similarity: 0.9543
#3 doc_89 similarity: 0.9321
...
Search completed in 0.48ms
```
#### `info` - Show Database Information
Display database statistics and configuration.
```bash
ruvector info [OPTIONS]
Options:
-d, --db <PATH> Database file path [default: ./ruvector.db]
```
**Examples:**
```bash
# Show default database info
ruvector info
# Show custom database info
ruvector info --db ./embeddings.db
```
**Output:**
```
📊 Database Statistics
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Total vectors: 1,234,567
Dimensions: 384
Distance metric: Cosine
HNSW Configuration:
M: 16
ef_construction: 200
ef_search: 100
```
#### `benchmark` - Run Performance Benchmark
Test query performance with random vectors.
```bash
ruvector benchmark [OPTIONS]
Options:
-d, --db <PATH> Database file path [default: ./ruvector.db]
-n, --queries <N> Number of queries to run [default: 1000]
```
**Examples:**
```bash
# Quick benchmark (1000 queries)
ruvector benchmark
# Extended benchmark (10,000 queries)
ruvector benchmark --queries 10000
# Benchmark specific database
ruvector benchmark --db ./prod.db --queries 5000
```
**Output:**
```
Running benchmark...
Queries: 1000
Dimensions: 384
Benchmark Results:
Total time: 0.48s
Queries per second: 2083
Average latency: 0.48ms
```
#### `export` - Export Database to File
Export vector data to JSON or CSV format.
```bash
ruvector export [OPTIONS] --output <FILE>
Options:
-d, --db <PATH> Database file path [default: ./ruvector.db]
-o, --output <FILE> Output file path (required)
-f, --format <FORMAT> Output format: json, csv [default: json]
```
**Examples:**
```bash
# Export to JSON
ruvector export --output backup.json --format json
# Export to CSV
ruvector export --output export.csv --format csv
# Export with custom database
ruvector export --db ./prod.db --output prod-backup.json
```
> **Note**: Export functionality requires `VectorDB::all_ids()` method. This feature is planned for a future release.
#### `import` - Import from Other Vector Databases
Import vectors from external vector database formats.
```bash
ruvector import [OPTIONS] --source <TYPE> --source-path <PATH>
Options:
-d, --db <PATH> Database file path [default: ./ruvector.db]
-s, --source <TYPE> Source database type: faiss, pinecone, weaviate
-p, --source-path <PATH> Source file or connection path
```
**Examples:**
```bash
# Import from FAISS index
ruvector import --source faiss --source-path ./index.faiss
# Import from Pinecone export
ruvector import --source pinecone --source-path ./pinecone-export.json
# Import from Weaviate backup
ruvector import --source weaviate --source-path ./weaviate-backup.json
```
> **Note**: Import functionality for external databases is planned for future releases.
## 🔧 Configuration
### Configuration File
Create a `ruvector.toml` configuration file for default settings:
```toml
[database]
storage_path = "./ruvector.db"
dimensions = 384
distance_metric = "Cosine" # Cosine, Euclidean, DotProduct, Manhattan
[database.hnsw]
m = 16
ef_construction = 200
ef_search = 100
[database.quantization]
type = "Scalar" # Scalar, Product, or None
[cli]
progress = true
colors = true
batch_size = 1000
[mcp]
host = "127.0.0.1"
port = 3000
cors = true
```
### Configuration Locations
The CLI searches for configuration files in this order:
1. Path specified via `--config` flag
2. `./ruvector.toml` (current directory)
3. `./.ruvector.toml` (current directory, hidden)
4. `~/.config/ruvector/config.toml` (user config)
5. `/etc/ruvector/config.toml` (system config)
### Environment Variables
Override configuration with environment variables:
```bash
# Database settings
export RUVECTOR_STORAGE_PATH="./my-db.db"
export RUVECTOR_DIMENSIONS=384
export RUVECTOR_DISTANCE_METRIC="cosine"
# MCP server settings
export RUVECTOR_MCP_HOST="0.0.0.0"
export RUVECTOR_MCP_PORT=3000
# Run with environment overrides
ruvector info
```
## 🔌 MCP Server
The Ruvector CLI includes a **Model Context Protocol (MCP)** server for AI agent integration.
### Start MCP Server
**STDIO Transport** (for local AI tools):
```bash
ruvector-mcp --transport stdio
```
**SSE Transport** (for web-based AI tools):
```bash
ruvector-mcp --transport sse --host 0.0.0.0 --port 3000
```
**With Configuration:**
```bash
ruvector-mcp --config ./ruvector.toml --transport sse --debug
```
### MCP Integration Examples
**Claude Desktop Integration** (`claude_desktop_config.json`):
```json
{
"mcpServers": {
"ruvector": {
"command": "ruvector-mcp",
"args": ["--transport", "stdio"],
"env": {
"RUVECTOR_STORAGE_PATH": "/path/to/vectors.db"
}
}
}
}
```
**HTTP/SSE Client:**
```javascript
const evtSource = new EventSource('http://localhost:3000/sse');
evtSource.addEventListener('message', (event) => {
const data = JSON.parse(event.data);
console.log('MCP Response:', data);
});
// Send search request
fetch('http://localhost:3000/mcp', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
method: 'search',
params: {
query: [0.1, 0.2, 0.3],
k: 10
}
})
});
```
## 📊 Common Workflows
### RAG System Setup
Build a retrieval-augmented generation (RAG) system:
```bash
# 1. Create database for your embedding model
ruvector create --dimensions 384 --path ./rag-embeddings.db
# 2. Generate embeddings and save to JSON
# (Use your preferred embedding model)
# 3. Insert embeddings
ruvector insert --db ./rag-embeddings.db --input embeddings.json
# 4. Query for relevant context
ruvector search --db ./rag-embeddings.db \
--query "[0.123, 0.456, ...]" \
--top-k 5
# 5. Start MCP server for AI agent access
ruvector-mcp --transport stdio
```
### Semantic Search Engine
Build a semantic search system:
```bash
# Create database
ruvector create --dimensions 768 --path ./search-engine.db
# Batch insert documents
ruvector insert \
--db ./search-engine.db \
--input documents.json \
--format json
# Benchmark performance
ruvector benchmark --db ./search-engine.db --queries 10000
# Search interface via MCP
ruvector-mcp --transport sse --port 8080
```
### Migration from Other Databases
Migrate from existing vector databases:
```bash
# 1. Export from source database
# (Use source database's export tools)
# 2. Create Ruvector database
ruvector create --dimensions 1536 --path ./migrated.db
# 3. Import data (planned feature)
ruvector import \
--db ./migrated.db \
--source pinecone \
--source-path ./pinecone-export.json
# 4. Verify migration
ruvector info --db ./migrated.db
ruvector benchmark --db ./migrated.db
```
### Performance Testing
Test vector database performance:
```bash
# Create test database
ruvector create --dimensions 384 --path ./benchmark.db
# Generate synthetic test data
python generate_test_vectors.py --count 100000 --dims 384 --output test.npy
# Insert test data
ruvector insert --db ./benchmark.db --input test.npy --format npy
# Run comprehensive benchmark
ruvector benchmark --db ./benchmark.db --queries 10000
# Test search performance
time ruvector search --db ./benchmark.db --query "[0.1, 0.2, ...]" -k 100
```
## 🎯 Shell Completion
Generate shell completion scripts for faster command entry:
### Bash
```bash
# Generate completion script
ruvector --help > /dev/null # Trigger clap completion
complete -C ruvector ruvector
# Or add to ~/.bashrc
echo 'complete -C ruvector ruvector' >> ~/.bashrc
```
### Zsh
```bash
# Add to ~/.zshrc
autoload -U compinit && compinit
complete -o nospace -C ruvector ruvector
```
### Fish
```bash
# Generate and save completion
ruvector --help > /dev/null
complete -c ruvector -f
```
## ⚙️ Performance Tips
### Optimize Insertion
```bash
# Use larger batch sizes for bulk inserts (set in config)
[cli]
batch_size = 10000
# Disable progress bar for maximum speed
ruvector insert --input large-file.json --no-progress
```
### Optimize Search
Configure HNSW parameters for your use case:
```toml
[database.hnsw]
# Higher M = better recall, more memory
m = 32
# Higher ef_construction = better index quality, slower builds
ef_construction = 400
# Higher ef_search = better recall, slower queries
ef_search = 200
```
### Memory Optimization
Enable quantization to reduce memory usage:
```toml
[database.quantization]
type = "Product" # 4-8x memory reduction
```
### Benchmarking Tips
```bash
# Run warm-up queries first
ruvector search --query "[...]" -k 10
ruvector search --query "[...]" -k 10
# Then benchmark
ruvector benchmark --queries 10000
# Test different k values
for k in 10 50 100; do
time ruvector search --query "[...]" -k $k
done
```
## 🔗 Related Documentation
- **[Rust API Reference](../../docs/api/RUST_API.md)** - Core Ruvector API
- **[Getting Started Guide](../../docs/guide/GETTING_STARTED.md)** - Complete tutorial
- **[Performance Tuning](../../docs/optimization/PERFORMANCE_TUNING_GUIDE.md)** - Optimization guide
- **[Main README](../../README.md)** - Project overview
## 🐛 Troubleshooting
### Common Issues
**Database file not found:**
```bash
# Ensure database exists
ruvector info --db ./ruvector.db
# Or create it first
ruvector create --dimensions 384 --path ./ruvector.db
```
**Dimension mismatch:**
```bash
# Error: "Vector dimension mismatch"
# Solution: Ensure all vectors match database dimensions
# Check database dimensions
ruvector info --db ./ruvector.db
```
**Invalid query format:**
```bash
# Use proper JSON or comma-separated format
ruvector search --query "[0.1, 0.2, 0.3]" # JSON
ruvector search --query "0.1, 0.2, 0.3" # CSV
```
**MCP server connection issues:**
```bash
# Check if port is available
lsof -i :3000
# Try different port
ruvector-mcp --transport sse --port 8080
# Enable debug logging
ruvector-mcp --transport sse --debug
```
## 🤝 Contributing
Contributions welcome! Please see the [Contributing Guidelines](../../docs/development/CONTRIBUTING.md).
### Development Setup
```bash
# Clone repository
git clone https://github.com/ruvnet/ruvector.git
cd ruvector/crates/ruvector-cli
# Run tests
cargo test
# Check formatting
cargo fmt -- --check
# Run clippy
cargo clippy -- -D warnings
# Build release
cargo build --release
```
## 📜 License
MIT License - see [LICENSE](../../LICENSE) for details.
## 🙏 Acknowledgments
Built with:
- **clap** - Command-line argument parsing
- **tokio** - Async runtime
- **serde** - Serialization framework
- **indicatif** - Progress bars and spinners
- **colored** - Terminal colors
---
**Built by [rUv](https://ruv.io) • Part of the [Ruvector](https://github.com/ruvnet/ruvector) ecosystem**
[Main Documentation](../../README.md) • [API Reference](../../docs/api/RUST_API.md) • [GitHub](https://github.com/ruvnet/ruvector)