sql-rs 0.1.0

A SQL database with vector similarity search capabilities
# SQL-RS

A lightweight, embedded database written in Rust that combines traditional relational database features with vector database capabilities.

## Features

### Core Database Features
- **Traditional Database**: B-tree-based storage with ACID properties
- **Vector Database**: High-dimensional embeddings with similarity search
- **CLI**: Simple command-line interface for all operations
- **Single File**: Database stored in a single file
- **Rust Edition 2024**: Modern Rust with latest features

### Advanced Features
- **Write-Ahead Logging (WAL)**: Durability and crash recovery support
- **Transaction Support**: BEGIN, COMMIT, ROLLBACK operations
- **SQL Statements**: CREATE TABLE, INSERT, SELECT, UPDATE, DELETE, DROP TABLE
- **WHERE Clauses**: Filter data with comparison operators (=, !=, >, <, >=, <=)
- **ORDER BY**: Sort results in ascending or descending order
- **LIMIT/OFFSET**: Pagination support for query results
- **Multiple Data Types**: INTEGER, FLOAT, TEXT, BLOB, BOOLEAN, NULL
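
To make the WAL and transaction bullets concrete, here is a minimal in-memory sketch of the general write-ahead logging technique. It is not SQL-RS's actual implementation: `LogRecord`, `Wal`, and `recover` are hypothetical names, and a `Vec` stands in for the on-disk log file.

```rust
use std::collections::{BTreeMap, HashSet};

/// One entry in the write-ahead log (hypothetical layout, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum LogRecord {
    Begin { tx: u64 },
    Put { tx: u64, key: String, value: String },
    Commit { tx: u64 },
    Rollback { tx: u64 },
}

/// An append-only log. A real WAL would write each record to disk and flush it
/// before acknowledging a commit; here a Vec stands in for the file.
#[derive(Default)]
struct Wal {
    records: Vec<LogRecord>,
}

impl Wal {
    fn append(&mut self, record: LogRecord) {
        self.records.push(record);
    }
}

/// Rebuild the key/value state after a crash by replaying the log.
/// Only writes from transactions that reached `Commit` are applied, so
/// rolled-back and unfinished transactions leave no trace.
fn recover(wal: &Wal) -> BTreeMap<String, String> {
    // Pass 1: collect the IDs of committed transactions.
    let committed: HashSet<u64> = wal
        .records
        .iter()
        .filter_map(|r| match r {
            LogRecord::Commit { tx } => Some(*tx),
            _ => None,
        })
        .collect();

    // Pass 2: apply committed writes in log order.
    let mut state = BTreeMap::new();
    for record in &wal.records {
        if let LogRecord::Put { tx, key, value } = record {
            if committed.contains(tx) {
                state.insert(key.clone(), value.clone());
            }
        }
    }
    state
}
```

In this sketch, a transaction that was rolled back or was still open when the process stopped contributes nothing to the recovered state, which is the behavior BEGIN, COMMIT, and ROLLBACK are meant to guarantee.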

## Installation

### As a Library

Add this to your `Cargo.toml`:

```toml
[dependencies]
sql_rs = "0.1"
```

### Build from Source

```bash
cargo build --release
```

## Usage

### Create a Database

```bash
sql_rs create mydb.db
```

### Traditional Database Operations

#### Create a Table

```bash
sql_rs query mydb.db "CREATE TABLE users (id INTEGER, name TEXT, age INTEGER)"
```

#### Insert Data

```bash
sql_rs query mydb.db "INSERT INTO users VALUES (1, 'Alice', 30)"
sql_rs query mydb.db "INSERT INTO users VALUES (2, 'Bob', 25)"
```
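
For readers who want a mental model of typed values, the sketch below pictures a row such as `(1, 'Alice', 30)` as a list of tagged values and evaluates the WHERE comparison operators against it. The `Value` enum, `compare`, and `matches` are hypothetical names, not the crate's API. Treating NULL and mismatched types as "no match" is an assumption of this sketch, not documented SQL-RS behavior.

```rust
use std::cmp::Ordering;

/// Hypothetical value type covering the data types listed in the README.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
    Float(f64),
    Text(String),
    Blob(Vec<u8>),
    Boolean(bool),
    Null,
}

/// Compare two values. Returns None when the comparison is undefined in this
/// sketch: NULL on either side, mismatched types, or a NaN float.
fn compare(a: &Value, b: &Value) -> Option<Ordering> {
    match (a, b) {
        (Value::Integer(x), Value::Integer(y)) => Some(x.cmp(y)),
        (Value::Float(x), Value::Float(y)) => x.partial_cmp(y),
        // Mixed numeric comparison: widen the integer to a float.
        (Value::Integer(x), Value::Float(y)) => (*x as f64).partial_cmp(y),
        (Value::Float(x), Value::Integer(y)) => x.partial_cmp(&(*y as f64)),
        (Value::Text(x), Value::Text(y)) => Some(x.cmp(y)),
        (Value::Blob(x), Value::Blob(y)) => Some(x.cmp(y)),
        (Value::Boolean(x), Value::Boolean(y)) => Some(x.cmp(y)),
        _ => None,
    }
}

/// Evaluate `column <op> literal` for one of the six comparison operators.
/// Unknown operators and undefined comparisons evaluate to false.
fn matches(column: &Value, op: &str, literal: &Value) -> bool {
    match (compare(column, literal), op) {
        (Some(ord), "=") => ord == Ordering::Equal,
        (Some(ord), "!=") => ord != Ordering::Equal,
        (Some(ord), ">") => ord == Ordering::Greater,
        (Some(ord), "<") => ord == Ordering::Less,
        (Some(ord), ">=") => ord != Ordering::Less,
        (Some(ord), "<=") => ord != Ordering::Greater,
        _ => false,
    }
}
```

With the row for Alice built as `vec![Value::Integer(1), Value::Text("Alice".to_string()), Value::Integer(30)]`, the call `matches(&row[2], ">", &Value::Integer(25))` evaluates the predicate `age > 25`.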

#### Query Data

```bash
sql_rs query mydb.db "SELECT * FROM users"
sql_rs query mydb.db "SELECT * FROM users WHERE age > 25"
sql_rs query mydb.db "SELECT * FROM users ORDER BY age DESC"
sql_rs query mydb.db "SELECT * FROM users ORDER BY name ASC LIMIT 10"
sql_rs query mydb.db "SELECT * FROM users LIMIT 5 OFFSET 10"
```
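
The queries above combine filtering, sorting, and pagination. In standard SQL these clauses apply in a fixed order: WHERE first, then ORDER BY, then OFFSET and LIMIT. The following sketch mirrors that order over in-memory rows. `User`, `Order`, and `select_users` are hypothetical names chosen for illustration and say nothing about how the SQL-RS executor is written.

```rust
/// Hypothetical row type mirroring the `users` table from the examples above.
#[derive(Debug, Clone, PartialEq)]
struct User {
    id: i64,
    name: String,
    age: i64,
}

/// Sort direction for ORDER BY.
#[derive(Clone, Copy)]
enum Order {
    Asc,
    Desc,
}

/// Rough equivalent of:
/// SELECT * FROM users WHERE age > min_age ORDER BY age <order> LIMIT limit OFFSET offset
fn select_users(
    rows: &[User],
    min_age: i64,
    order: Order,
    limit: usize,
    offset: usize,
) -> Vec<User> {
    // WHERE: keep only the rows that satisfy the predicate.
    let mut matched: Vec<User> = rows.iter().filter(|u| u.age > min_age).cloned().collect();

    // ORDER BY: sort the filtered rows (stable sort, so ties keep input order).
    match order {
        Order::Asc => matched.sort_by_key(|u| u.age),
        Order::Desc => matched.sort_by_key(|u| std::cmp::Reverse(u.age)),
    }

    // OFFSET then LIMIT: skip the leading rows, then cap the page size.
    matched.into_iter().skip(offset).take(limit).collect()
}
```

Because pagination runs last, `LIMIT 5 OFFSET 10` returns rows 11 through 15 of the filtered, sorted result rather than of the raw table.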

#### Update Data

```bash
sql_rs query mydb.db "UPDATE users SET age = 31 WHERE name = 'Alice'"
```

#### Delete Data

```bash
sql_rs query mydb.db "DELETE FROM users WHERE age < 20"
```

#### Drop Table

```bash
sql_rs query mydb.db "DROP TABLE users"
```

### Vector Database Operations

#### Create a Vector Collection

```bash
sql_rs vector create mydb.db --collection embeddings --dimension 384 --metric cosine
```

#### Add Vectors

```bash
sql_rs vector add mydb.db \
  --collection embeddings \
  --id doc1 \
  --vector "[0.1, 0.2, 0.3, ...]" \
  --metadata '{"title": "Document 1", "category": "tech"}'
```

#### Search Similar Vectors

```bash
sql_rs vector search mydb.db \
  --collection embeddings \
  --vector "[0.15, 0.25, 0.35, ...]" \
  --top-k 10
```
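
As a reference point for what `--dimension` and `--top-k` mean, here is a sketch of an exact, brute-force search over a fixed-dimension collection. It scores every stored vector, whereas an HNSW index returns approximate results much faster on large collections. `Collection`, `Entry`, and `squared_euclidean` are hypothetical names, and squared Euclidean distance is used only to keep the example short.

```rust
/// A stored vector and its ID (hypothetical in-memory layout).
struct Entry {
    id: String,
    vector: Vec<f32>,
}

/// Hypothetical collection that enforces one fixed dimension for all vectors.
struct Collection {
    dimension: usize,
    entries: Vec<Entry>,
}

impl Collection {
    fn new(dimension: usize) -> Self {
        Collection { dimension, entries: Vec::new() }
    }

    /// Reject vectors whose length differs from the collection's dimension.
    fn add(&mut self, id: &str, vector: Vec<f32>) -> Result<(), String> {
        if vector.len() != self.dimension {
            return Err(format!(
                "expected {} dimensions, got {}",
                self.dimension,
                vector.len()
            ));
        }
        self.entries.push(Entry { id: id.to_string(), vector });
        Ok(())
    }

    /// Exact top-k search: score every entry, sort by ascending distance,
    /// and keep the `top_k` nearest. Assumes `query` has the right dimension.
    fn search(&self, query: &[f32], top_k: usize) -> Vec<(String, f32)> {
        let mut scored: Vec<(String, f32)> = self
            .entries
            .iter()
            .map(|e| (e.id.clone(), squared_euclidean(&e.vector, query)))
            .collect();
        scored.sort_by(|a, b| a.1.total_cmp(&b.1));
        scored.truncate(top_k);
        scored
    }
}

/// Sum of squared element-wise differences (smaller means closer).
fn squared_euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}
```

The brute-force version costs one distance computation per stored vector for every query, which is the work an approximate index is designed to avoid.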

### Database Info

```bash
sql_rs info mydb.db
```

## Architecture

SQL-RS follows a modular architecture:

- **Storage Layer**: Page-based B-tree storage with WAL and transaction support
- **Vector Layer**: HNSW index for approximate nearest neighbor search
- **Query Engine**: SQL parser and executor with full CRUD operations
- **Transaction Manager**: ACID-compliant transaction handling
- **CLI Layer**: Command-line interface using clap

## Distance Metrics

SQL-RS supports three distance metrics for vector similarity:

- **Cosine**: Measures the angle between vectors, ignoring their magnitudes (default)
- **Euclidean**: Measures the straight-line distance between vectors
- **Dot Product**: Sums the element-wise products, reflecting both alignment and magnitude
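
The three metrics can be written out directly from their textbook definitions. This is a plain-Rust sketch, not the crate's code: the function names are hypothetical, both slices are assumed to have the same length, and returning 0.0 for a zero-length vector in `cosine_similarity` is a convention chosen here to avoid dividing by zero.

```rust
/// Dot product: the sum of element-wise products.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean distance: the square root of the summed squared differences.
fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Cosine similarity: the dot product divided by the product of the norms.
/// 1.0 means same direction, 0.0 orthogonal, -1.0 opposite direction.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let norm_a = dot(a, a).sqrt();
    let norm_b = dot(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        // Convention for this sketch: a zero vector is similar to nothing.
        return 0.0;
    }
    dot(a, b) / (norm_a * norm_b)
}
```

Cosine similarity is higher for closer vectors while Euclidean distance is lower, so a search routine has to sort in the direction that matches the chosen metric.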

## Performance

- Insert: >10k rows/sec
- Query: <10ms for indexed lookups
- Vector search: <100ms for 1M vectors (approximate)
- Memory footprint: <50MB for typical workloads
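
Figures like these depend on hardware and workload, so it can be worth measuring on your own machine. The helper below is a minimal timing sketch using only the standard library. `ops_per_sec` is a hypothetical name, and the closure is a placeholder for whatever operation you want to measure, such as an insert.

```rust
use std::time::Instant;

/// Run `op` `n` times and return the observed operations per second.
/// The closure receives the iteration index so it can generate distinct rows.
fn ops_per_sec<F: FnMut(usize)>(n: usize, mut op: F) -> f64 {
    let start = Instant::now();
    for i in 0..n {
        op(i);
    }
    // Guard against a zero elapsed time on very fast runs.
    let secs = start.elapsed().as_secs_f64().max(f64::EPSILON);
    n as f64 / secs
}
```

For meaningful numbers, run a release build and repeat the measurement several times, since a single short run is dominated by noise.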

## Examples

See the `examples/` directory for complete usage examples.

## Testing

```bash
cargo test
```

## Development

Built with:
- **Rust Edition 2024**: Latest Rust features and improvements
- **clap**: Command-line interface parsing
- **serde/serde_json**: Serialization and deserialization
- **bincode**: Binary encoding for efficient storage
- **parking_lot**: High-performance synchronization primitives
- **memmap2**: Memory-mapped file I/O
- **thiserror/anyhow**: Robust error handling

### Project Structure

```
src/
├── lib.rs              # Main library entry point
├── main.rs             # CLI entry point
├── cli/                # Command-line interface
├── storage/            # B-tree, WAL, transactions
├── vector/             # HNSW index, similarity metrics
├── query/              # SQL parser and executor
└── types/              # Core data types and schemas
```

### Running Tests and Examples

```bash
cargo test                    # Run all tests
cargo test --test delete_drop_tests  # Run specific test suite
cargo run --example basic_usage      # Run basic example
cargo run --example comprehensive_demo  # Run comprehensive feature demo
```

### Examples

**Basic Usage** (`examples/basic_usage.rs`):
- Simple database operations
- Basic vector search

**Comprehensive Demo** (`examples/comprehensive_demo.rs`):
- All SQL operations (CREATE, INSERT, SELECT, UPDATE, DELETE, DROP)
- WHERE clause filtering with multiple operators
- Multiple data types (INTEGER, TEXT, FLOAT, BOOLEAN)
- Vector database with 384-dimensional embeddings
- Multiple distance metrics (Cosine, Euclidean, Dot Product)
- Vector search with metadata
- Performance benchmarking
- Complex data analysis

Run the comprehensive demo to see all features in action:
```bash
cargo run --example comprehensive_demo
```

### Building

```bash
cargo build                   # Debug build
cargo build --release         # Release build
```

## Roadmap

See `TODO.md` for planned features and improvements.

## Documentation

- **ARCHITECTURE.md**: Detailed system design and architecture
- **TODO.md**: Feature roadmap and planned improvements
- **SPEC.md**: Technical specifications