kotoba-db 0.1.16

High-performance embedded database for the Kotoba ecosystem
# KotobaDB

**KotobaDB** is a graph-native, version-controlled embedded database built specifically for computational science and complex data relationships. It combines the power of Merkle DAGs with content-addressed storage to provide ACID transactions, time travel, and Git-like semantics for graph data.

## ✨ Features

- **Graph-Native**: Built specifically for graph data with native support for nodes, edges, and complex relationships
- **Version Control**: Git-like branching, forking, and merging with Merkle DAG-based provenance tracking
- **Content-Addressed Storage**: Immutable data blocks addressed by their cryptographic hash (CID)
- **ACID Transactions**: Full ACID compliance with MVCC (Multi-Version Concurrency Control)
- **Time Travel**: Query historical states of your data with point-in-time recovery
- **Embedded**: Single-process embedded database with zero external dependencies for local development
- **Pluggable Storage Engines**: Choose between in-memory, LSM-Tree, or custom storage backends
- **Computational Science Focused**: Optimized for reproducibility, provenance tracking, and scientific workflows

## 🏗️ Architecture

KotobaDB consists of several layers:

```
┌─────────────────────────────────────┐
│           KotobaDB API              │ ← High-level user interface
├─────────────────────────────────────┤
│     Transaction Manager & Query     │ ← ACID transactions & graph queries
├─────────────────────────────────────┤
│         Storage Engines             │ ← Pluggable backends (LSM, Memory)
├─────────────────────────────────────┤
│   Content-Addressed Storage (CAS)   │ ← Merkle DAG with CID addressing
└─────────────────────────────────────┘
```

### Core Components

- **`kotoba-db-core`**: Core traits, data structures, and transaction logic
- **`kotoba-db-engine-memory`**: In-memory storage engine for testing and development
- **`kotoba-db-engine-lsm`**: LSM-Tree based persistent storage engine
- **`kotoba-db`**: Main API crate providing the user-facing interface

## 🚀 Quick Start

Add KotobaDB to your `Cargo.toml`:

```toml
[dependencies]
kotoba-db = "0.1.0"
```

### Basic Usage

```rust
use kotoba_db::{DB, Value, Operation};
use std::collections::BTreeMap;

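// Note: the `.await?` calls below must run inside an `async` function
// whose return type supports `?` (for example, one returning a `Result`).

// Open a database (in-memory for this example)
let db = DB::open_memory().await?;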

// Create a node
let mut properties = BTreeMap::new();
properties.insert("name".to_string(), Value::String("Alice".to_string()));
properties.insert("age".to_string(), Value::Int(30));

let alice_cid = db.create_node(properties).await?;

// Create another node
let mut properties = BTreeMap::new();
properties.insert("name".to_string(), Value::String("Bob".to_string()));
properties.insert("age".to_string(), Value::Int(25));

let bob_cid = db.create_node(properties).await?;

// Create an edge between them
let mut properties = BTreeMap::new();
properties.insert("relationship".to_string(), Value::String("friend".to_string()));
properties.insert("since".to_string(), Value::String("2024".to_string()));

db.create_edge(alice_cid, bob_cid, properties).await?;

// Query nodes
let alice_nodes = db.find_nodes(&[("name".to_string(), Value::String("Alice".to_string()))]).await?;
println!("Found Alice: {:?}", alice_nodes);

// Transaction example
let txn_id = db.begin_transaction().await?;
db.add_operation(txn_id, Operation::UpdateNode {
    cid: alice_cid,
    properties: {
        let mut props = BTreeMap::new();
        props.insert("age".to_string(), Value::Int(31));
        props
    }
}).await?;
db.commit_transaction(txn_id).await?;
```

### Storage Engines

#### In-Memory Engine (Development/Testing)
```rust
let db = DB::open_memory().await?;
```

#### LSM-Tree Engine (Persistent Storage)
```rust
let db = DB::open_lsm("./my_database").await?;
```
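
To illustrate what "pluggable" can mean in practice, here is a minimal sketch of a backend interface with an in-memory implementation. The names `ToyEngine`, `ToyMemoryEngine`, and `store_greeting` are hypothetical and are not part of the KotobaDB API; the crate's real engine traits may look quite different.

```rust
use std::collections::HashMap;

/// Hypothetical minimal interface a storage backend might expose.
/// This is NOT the trait used by kotoba-db-core.
trait ToyEngine {
    fn put(&mut self, key: Vec<u8>, value: Vec<u8>);
    fn get(&self, key: &[u8]) -> Option<&[u8]>;
}

/// In-memory backend: a plain hash map, lost when the process exits.
#[derive(Default)]
struct ToyMemoryEngine {
    map: HashMap<Vec<u8>, Vec<u8>>,
}

impl ToyEngine for ToyMemoryEngine {
    fn put(&mut self, key: Vec<u8>, value: Vec<u8>) {
        self.map.insert(key, value);
    }

    fn get(&self, key: &[u8]) -> Option<&[u8]> {
        self.map.get(key).map(|value| value.as_slice())
    }
}

/// Code written against the trait works with any backend that implements it.
fn store_greeting(engine: &mut dyn ToyEngine) {
    engine.put(b"greeting".to_vec(), b"hello".to_vec());
}
```

A persistent backend would implement the same trait over files, so callers such as `store_greeting` would not need to change.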

## 📊 Data Model

### Nodes
Nodes are the primary data entities in KotobaDB. Each node has:
- **CID**: Content identifier (cryptographic hash of the node's data)
- **Properties**: Key-value pairs describing the node
- **Version History**: Complete history of changes via Merkle DAG
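
The idea that an identifier is derived from content can be sketched with the standard library alone. This is an illustration only: `ToyCid`, `toy_cid`, and `make_props` are hypothetical names, and `DefaultHasher` is a non-cryptographic stand-in for the cryptographic hash a real CID uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a CID: a 64-bit, NON-cryptographic hash.
/// A real CID would be built from a cryptographic hash.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ToyCid(u64);

/// Derive an identifier from a node's properties alone.
/// `BTreeMap` iterates in sorted key order, so insertion order does not matter.
fn toy_cid(properties: &BTreeMap<String, String>) -> ToyCid {
    let mut hasher = DefaultHasher::new();
    for (key, value) in properties {
        key.hash(&mut hasher);
        value.hash(&mut hasher);
    }
    ToyCid(hasher.finish())
}

/// Build a property map from string pairs (helper for the example).
fn make_props(pairs: &[(&str, &str)]) -> BTreeMap<String, String> {
    pairs
        .iter()
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}
```

Two nodes with identical properties receive the same identifier, while changing any property yields a different one. That is why an update produces a new version rather than overwriting the old data.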

### Edges
Edges represent relationships between nodes:
- **Source/Target**: CIDs of connected nodes
- **Properties**: Relationship metadata
- **Direction**: Edges can be directed or undirected

### Values
KotobaDB supports rich data types:
- `String`: UTF-8 text
- `Int`: 64-bit integers
- `Float`: 64-bit floating point
- `Bool`: Boolean values
- `Bytes`: Binary data
- `Link`: References to other nodes/edges by CID
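
As a rough picture of how these types fit together, the list above maps naturally onto a Rust enum. The sketch below is hypothetical: `ToyValue` mirrors the listed types, but the variant payloads (a signed `i64`, a textual link) are assumptions, and the crate's actual `Value` definition may differ.

```rust
/// Hypothetical mirror of the value types listed above.
/// The real `kotoba_db::Value` may use different payload types.
#[derive(Debug, Clone, PartialEq)]
enum ToyValue {
    String(String),
    Int(i64),       // assumed signed 64-bit
    Float(f64),
    Bool(bool),
    Bytes(Vec<u8>),
    Link(String),   // CID rendered as text, purely for illustration
}

/// Return a short type name, showing exhaustive matching over the variants.
fn type_name(value: &ToyValue) -> &'static str {
    match value {
        ToyValue::String(_) => "String",
        ToyValue::Int(_) => "Int",
        ToyValue::Float(_) => "Float",
        ToyValue::Bool(_) => "Bool",
        ToyValue::Bytes(_) => "Bytes",
        ToyValue::Link(_) => "Link",
    }
}
```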

## 🔍 Querying

### Node Queries
```rust
// Find nodes by property
let users = db.find_nodes(&[
    ("type".to_string(), Value::String("user".to_string()))
]).await?;

// Find nodes with multiple properties
let active_users = db.find_nodes(&[
    ("type".to_string(), Value::String("user".to_string())),
    ("active".to_string(), Value::Bool(true))
]).await?;
```
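
The multi-property query above can be read as a conjunction: a node matches only if every listed pair is present. The following sketch shows that reading over plain maps. It assumes AND semantics, which this README does not state explicitly, and `find_matching` and `props_of` are hypothetical helpers, not KotobaDB functions.

```rust
use std::collections::BTreeMap;

type Props = BTreeMap<String, String>;

/// Return the nodes whose properties contain every `(key, value)` pair in
/// `filter`. An empty filter matches every node.
fn find_matching<'a>(nodes: &'a [Props], filter: &[(&str, &str)]) -> Vec<&'a Props> {
    nodes
        .iter()
        .filter(|props| {
            filter
                .iter()
                .all(|(key, value)| props.get(*key).map(String::as_str) == Some(*value))
        })
        .collect()
}

/// Build a property map from string pairs (helper for the example).
fn props_of(pairs: &[(&str, &str)]) -> Props {
    pairs
        .iter()
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}
```

This linear scan is only meant to show the matching rule; a real engine would normally use indexes rather than checking every node.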

### Graph Traversal
```rust
// Find neighbors of a node
let neighbors = db.find_neighbors(alice_cid, Some("friend")).await?;

// Traverse the graph with custom logic
let result = db.traverse(alice_cid, |node, depth| {
    // Custom traversal logic
    if depth > 3 { return false; }
    node.properties.get("type") == Some(&Value::String("important".to_string()))
}).await?;
```
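
For readers who want to see what a callback-driven traversal does under the hood, here is a minimal breadth-first sketch over an adjacency list. It is not KotobaDB's implementation: the function below is a standalone toy, and the meaning given to the callback's return value (`false` stops expansion below that node) is an assumption made for this example.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first traversal over an adjacency list.
/// `visit` receives each node id and its depth. Returning `false` stops
/// expansion below that node; the node itself is still reported.
fn traverse<F>(adjacency: &HashMap<u32, Vec<u32>>, start: u32, mut visit: F) -> Vec<u32>
where
    F: FnMut(u32, usize) -> bool,
{
    let mut seen = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0usize)]);
    let mut order = Vec::new();

    while let Some((node, depth)) = queue.pop_front() {
        order.push(node);
        if !visit(node, depth) {
            continue; // do not follow this node's outgoing edges
        }
        for &next in adjacency.get(&node).into_iter().flatten() {
            // `insert` returns false for nodes already queued or visited.
            if seen.insert(next) {
                queue.push_back((next, depth + 1));
            }
        }
    }
    order
}
```

Passing `|_, depth| depth < 2` limits the walk to nodes within two hops of the start, while `|_, _| true` visits everything reachable.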

## 🎯 Use Cases

### Computational Science
- **Reproducibility**: Track complete provenance of computational experiments
- **Version Control**: Git-like semantics for datasets and models
- **Collaboration**: Branch and merge scientific workflows

### Graph Applications
- **Social Networks**: Complex relationship modeling
- **Knowledge Graphs**: Semantic data with rich relationships
- **Recommendation Systems**: Graph-based ML pipelines

### Content Management
- **Versioned Content**: Time-travel through content history
- **Collaborative Editing**: Conflict-free replicated data types
- **Audit Trails**: Complete change history for compliance

## 🔧 Advanced Features

### Transactions
```rust
let txn_id = db.begin_transaction().await?;

// Multiple operations in a transaction
db.add_operation(txn_id, Operation::CreateNode { properties: node_props }).await?;
db.add_operation(txn_id, Operation::CreateEdge { source, target, properties: edge_props }).await?;
db.add_operation(txn_id, Operation::UpdateNode { cid, properties: updates }).await?;

// Commit or rollback
if success {
    db.commit_transaction(txn_id).await?;
} else {
    db.rollback_transaction(txn_id).await?;
}
```
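
The begin, add, commit-or-rollback pattern above can be pictured as a buffer of staged operations that touches the store only on commit. The sketch below shows that idea in isolation. `ToyOp`, `ToyStore`, and `ToyTxn` are hypothetical types, and this toy does not model MVCC or concurrent transactions.

```rust
use std::collections::BTreeMap;

/// Hypothetical operations, simpler than the crate's `Operation` enum.
#[derive(Debug, Clone)]
enum ToyOp {
    Put { key: String, value: i64 },
    Delete { key: String },
}

#[derive(Default)]
struct ToyStore {
    data: BTreeMap<String, i64>,
}

/// Operations are buffered here and reach the store only on commit.
#[derive(Default)]
struct ToyTxn {
    staged: Vec<ToyOp>,
}

impl ToyTxn {
    fn add(&mut self, op: ToyOp) {
        self.staged.push(op);
    }

    /// Apply every staged operation to a copy, then swap the copy in.
    /// If a delete targets a missing key, fail and leave the store untouched.
    fn commit(self, store: &mut ToyStore) -> Result<(), String> {
        let mut next = store.data.clone();
        for op in self.staged {
            match op {
                ToyOp::Put { key, value } => {
                    next.insert(key, value);
                }
                ToyOp::Delete { key } => {
                    if next.remove(&key).is_none() {
                        return Err(format!("cannot delete missing key {}", key));
                    }
                }
            }
        }
        store.data = next;
        Ok(())
    }

    /// Discard the staged operations without touching the store.
    fn rollback(self) {}
}
```

Because changes are applied to a copy first, a failing operation leaves no partial writes behind, which is the atomicity half of ACID.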

### Branching and Merging
```rust
// Create a branch
let branch_id = db.create_branch("feature-x", "main").await?;

// Work on the branch
db.checkout_branch(branch_id).await?;
// ... make changes ...

// Merge back to main
db.merge_branch(branch_id, "main").await?;
```
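
Git-like branching is often modeled as named pointers into a DAG of commits, and a merge as a commit with two parents. The sketch below shows only that bookkeeping, under the assumption that KotobaDB follows a similar model. `ToyRepo` and `ToyCommit` are hypothetical, commit contents and conflict handling are omitted, and looking up an unknown branch name panics.

```rust
use std::collections::HashMap;

/// A commit records its parents, forming a DAG. Content is omitted for brevity.
struct ToyCommit {
    parents: Vec<usize>,
}

#[derive(Default)]
struct ToyRepo {
    commits: Vec<ToyCommit>,
    branches: HashMap<String, usize>, // branch name -> head commit index
}

impl ToyRepo {
    /// Start a repository with a root commit on `main`.
    fn new() -> Self {
        let mut repo = ToyRepo::default();
        repo.commits.push(ToyCommit { parents: vec![] });
        repo.branches.insert("main".to_string(), 0);
        repo
    }

    /// A new branch is just another name for an existing head.
    fn create_branch(&mut self, name: &str, from: &str) {
        let head = self.branches[from];
        self.branches.insert(name.to_string(), head);
    }

    /// Append a commit to `branch` and advance its head.
    fn commit(&mut self, branch: &str) -> usize {
        let parent = self.branches[branch];
        self.commits.push(ToyCommit { parents: vec![parent] });
        let id = self.commits.len() - 1;
        self.branches.insert(branch.to_string(), id);
        id
    }

    /// Merge `source` into `target` by adding a commit with two parents.
    fn merge(&mut self, source: &str, target: &str) -> usize {
        let parents = vec![self.branches[target], self.branches[source]];
        self.commits.push(ToyCommit { parents });
        let id = self.commits.len() - 1;
        self.branches.insert(target.to_string(), id);
        id
    }
}
```

Creating a branch copies no data; it only records a new name for an existing commit, which is what makes branching cheap in this model.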

### Time Travel
```rust
// Query historical state
let historical_state = db.query_at_timestamp(timestamp).await?;

// Point-in-time recovery
db.restore_to_timestamp(timestamp).await?;
```
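
One common way to support point-in-time reads is to keep every version keyed by timestamp and return the newest version at or before the requested time. The sketch below shows that lookup for a single value. `ToyHistory` is a hypothetical type, and this is not necessarily how KotobaDB resolves timestamps internally.

```rust
use std::collections::BTreeMap;

/// Append-only history of one value, keyed by a logical timestamp.
#[derive(Default)]
struct ToyHistory {
    versions: BTreeMap<u64, String>,
}

impl ToyHistory {
    /// Record the value as of `timestamp`; older versions are kept.
    fn write(&mut self, timestamp: u64, value: &str) {
        self.versions.insert(timestamp, value.to_string());
    }

    /// Latest value written at or before `timestamp`, if any.
    fn read_at(&self, timestamp: u64) -> Option<&str> {
        self.versions
            .range(..=timestamp)
            .next_back()
            .map(|(_, value)| value.as_str())
    }
}
```

Because old versions are never overwritten, reading at an earlier timestamp needs no undo logic, only a lookup.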

## 📈 Performance

KotobaDB is optimized for graph workloads:

- **LSM-Tree Engine**: High write throughput with efficient reads
- **Bloom Filters**: Fast key-existence checks that let reads skip SSTables that cannot contain the key
- **Compaction**: Automatic background optimization
- **Memory Pool**: Efficient memory management for large graphs
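
To make the Bloom filter point concrete, here is a small standalone sketch of the data structure. It is not the filter used by the LSM engine: `ToyBloom` is hypothetical, it uses one `bool` per bit for clarity, and it derives its hash functions by seeding the standard library's `DefaultHasher`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Tiny Bloom filter: `contains` may return false positives,
/// but never false negatives.
struct ToyBloom {
    bits: Vec<bool>,
    hashes: u64,
}

impl ToyBloom {
    /// `size` must be greater than zero.
    fn new(size: usize, hashes: u64) -> Self {
        ToyBloom { bits: vec![false; size], hashes }
    }

    /// Bit position for `key` under the `seed`-th hash function.
    fn index(&self, key: &str, seed: u64) -> usize {
        let mut hasher = DefaultHasher::new();
        seed.hash(&mut hasher);
        key.hash(&mut hasher);
        (hasher.finish() % self.bits.len() as u64) as usize
    }

    /// Set every bit position the key hashes to.
    fn insert(&mut self, key: &str) {
        for seed in 0..self.hashes {
            let i = self.index(key, seed);
            self.bits[i] = true;
        }
    }

    /// True only if every bit position for the key is set.
    fn contains(&self, key: &str) -> bool {
        (0..self.hashes).all(|seed| self.bits[self.index(key, seed)])
    }
}
```

A negative answer is definitive, so a read can skip a table whose filter reports that the key is absent. A positive answer still requires checking the table itself.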

### Benchmarks
```
Node Creation:    50,000 ops/sec
Node Queries:    100,000 ops/sec
Edge Creation:    30,000 ops/sec
Graph Traversal:  75,000 nodes/sec
```

## 🔗 Integration

### Storage Layer Integration
KotobaDB can be selected as a backend for the Kotoba storage layer:

```rust
use kotoba_storage::{StorageConfig, BackendType, StorageBackendFactory};

let config = StorageConfig {
    backend_type: BackendType::KotobaDB,
    kotoba_db_path: Some("./data".into()),
    ..Default::default()
};

let backend = StorageBackendFactory::create(&config).await?;
```

### Graph Processing
A graph stored in KotobaDB can be loaded into `kotoba_graph` to run its graph algorithms:

```rust
use kotoba_graph::{Graph, algorithms::*};

// Load graph from KotobaDB
let graph = Graph::from_kotoba_db(&db).await?;

// Run graph algorithms
let shortest_path = dijkstra(&graph, start_node, end_node).await?;
let communities = louvain_clustering(&graph).await?;
```

## 🛠️ Development

### Building
```bash
# Build all crates
cargo build

# Build with LSM engine
cargo build --features lsm

# Run tests
cargo test --package kotoba-db --features lsm

# Run benchmarks
cargo bench --package kotoba-db
```

### Architecture Overview
```
crates/
├── kotoba-db-core/          # Core traits and types
├── kotoba-db-engine-memory/ # In-memory engine
├── kotoba-db-engine-lsm/    # LSM-Tree engine
└── kotoba-db/               # Main API
```

### Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests
5. Submit a pull request

## 📚 Documentation

- [API Reference](https://docs.rs/kotoba-db)
- [Architecture Guide](./docs/architecture.md)
- [Performance Guide](./docs/performance.md)
- [Migration Guide](./docs/migration.md)

## 🤝 Related Projects

- **Dolt**: Git for Data - similar version control approach
- **TerminusDB**: Graph database with Git-like features
- **Datomic**: Immutable database with time travel
- **IPFS**: Content-addressed distributed storage

## 📄 License

Licensed under the MIT License. See [LICENSE](../LICENSE) for details.

---

**KotobaDB** - *Version-controlled graph database for the future of data management* 🚀