KotaDB

A custom database for distributed human-AI cognition, built entirely by LLM agents.

KotaDB combines document storage, graph relationships, and semantic search
into a unified system designed for the way humans and AI think together.

Performance

Real-world benchmarks on Apple Silicon:

Operation	Latency	Throughput
B+ Tree Search	489 µs	2,000 queries/sec
Trigram Search	<10 ms	100+ queries/sec
Document Insert	277 µs	3,600 ops/sec
Bulk Operations	20 ms	50,000 ops/sec

Measured on a 10,000-document dataset on Apple Silicon M-series hardware.

Quick Start

# Clone and build
git clone https://github.com/jayminwest/kota-db.git
cd kota-db
cargo build

# Start HTTP server
cargo run --bin kotadb -- serve

# CLI examples
cargo run --bin kotadb -- insert /test/doc "My Document" "Document content"
cargo run --bin kotadb -- search "rust"     # Full-text search
cargo run --bin kotadb -- search "*"        # Wildcard search
cargo run --bin kotadb -- stats            # Database statistics

# Development tasks (requires the just command runner)
just dev              # Start with auto-reload
just test             # Run all tests
just check            # Format, lint, test
just bench            # Performance benchmarks

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Query Interface                           │
│              Natural Language + Structured                   │
├─────────────────────────────────────────────────────────────┤
│                    Query Router                              │
│         Automatic index selection based on query             │
├──────────────┬───────────────┬───────────────┬──────────────┤
│   Primary    │   Full-Text   │     Graph     │   Semantic   │
│   B+ Tree    │    Trigram    │  (Planned)    │     HNSW     │
├──────────────┴───────────────┴───────────────┴──────────────┤
│                    Storage Engine                            │
│        Pages + WAL + Compression + Memory Map                │
└─────────────────────────────────────────────────────────────┘
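To make the Query Router layer more concrete, here is a minimal sketch of rule-based index selection. The `Index` enum, the `route` function, and the routing rules themselves are hypothetical and chosen only for illustration; the actual router may use entirely different criteria.

```rust
/// Index families taken from the architecture diagram (Graph omitted, as it is planned).
#[derive(Debug, PartialEq)]
enum Index {
    Primary,
    FullText,
    Semantic,
}

/// Hypothetical routing rules, for illustration only.
fn route(query: &str) -> Index {
    let q = query.trim();
    if q == "*" || q.starts_with('/') {
        // Wildcard listings and path lookups go to the B+ tree.
        Index::Primary
    } else if q.split_whitespace().count() > 3 {
        // Longer natural-language phrasing goes to semantic search.
        Index::Semantic
    } else {
        // Short keyword queries go to the trigram index.
        Index::FullText
    }
}

fn main() {
    for q in ["*", "/test/doc", "rust", "meetings about rust programming last week"] {
        println!("{q:?} -> {:?}", route(q));
    }
}
```

The point of the sketch is only that the caller issues one query and the router decides which index serves it.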

Core Features

Storage

Native Format: Markdown files with YAML frontmatter
Git Compatible: Human-readable, diff-friendly
Crash-Safe: Write-ahead log (WAL) ensures data durability
Zero Database Dependencies: No external database required
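As an illustration of the native format, the sketch below separates YAML frontmatter from the Markdown body using only the standard library. The `split_frontmatter` helper is hypothetical and not part of KotaDB's API; it assumes the conventional `---` fences and leaves the YAML itself unparsed.

```rust
/// Split a Markdown document into its YAML frontmatter and body.
/// Returns `None` when the text does not start with a `---` line or no later
/// `---` line closes the block. This sketch also rejects an empty block.
fn split_frontmatter(text: &str) -> Option<(&str, &str)> {
    // The opening fence must be the very first line.
    let rest = text.strip_prefix("---\n")?;
    // The closing fence is the next line consisting only of `---`.
    let fence = "\n---\n";
    let end = rest.find(fence)?;
    let frontmatter = &rest[..end];
    let body = &rest[end + fence.len()..];
    Some((frontmatter, body))
}

fn main() {
    let doc = "---\ntitle: Advanced Rust Design Patterns\ntags: [rust, patterns]\n---\n# Advanced Rust Patterns\n";
    let (meta, body) = split_frontmatter(doc).expect("document has frontmatter");
    println!("frontmatter:\n{meta}\n");
    println!("body:\n{body}");
}
```

Because both parts stay plain text, the same file remains readable in an editor and produces clean diffs in Git.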

Indexing

B+ Tree: O(log n) path-based lookups
Trigram: Fuzzy-tolerant full-text search
Graph: Relationship traversal (available through MCP tools only; not fully implemented)
Vector: Semantic similarity with HNSW (in progress)
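To show why a trigram index tolerates typos, here is a minimal sketch that compares two strings by the overlap of their three-character windows. The `trigrams` and `similarity` functions are hypothetical, and Jaccard similarity is used here as a simple stand-in; KotaDB's actual ranking may differ.

```rust
use std::collections::HashSet;

/// Collect the set of lowercase character trigrams in `text`.
fn trigrams(text: &str) -> HashSet<String> {
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    chars
        .windows(3)
        .map(|w| w.iter().collect::<String>())
        .collect()
}

/// Jaccard similarity of two trigram sets: |A ∩ B| / |A ∪ B|.
fn similarity(a: &str, b: &str) -> f64 {
    let (ta, tb) = (trigrams(a), trigrams(b));
    let union = ta.union(&tb).count();
    if union == 0 {
        return 0.0;
    }
    ta.intersection(&tb).count() as f64 / union as f64
}

fn main() {
    // A one-letter typo still shares several trigrams with the original.
    println!("{:.3}", similarity("database", "databse"));
    // Unrelated words share none.
    println!("{:.3}", similarity("database", "graph"));
}
```

A misspelled query keeps a nonzero score against the intended term, which is what allows fuzzy matches to surface.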

Safety

Systematic Testing: 6-stage risk reduction methodology
Type Safety: Validated types at compile time
Observability: Distributed tracing on every operation
Resilience: Automatic retries with exponential backoff
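The resilience item can be pictured with a small retry helper. This is a sketch under assumptions: `backoff_delay` and `retry` are hypothetical names, the doubling factor and the cap are illustrative, and KotaDB's own retry wrapper may behave differently (for example by adding jitter).

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(max)
}

/// Run `op` at least once, retrying on error until `max_attempts` calls have been made.
fn retry<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                std::thread::sleep(backoff_delay(attempt, base, max));
                attempt += 1;
            }
        }
    }
}

fn main() {
    let mut calls = 0u32;
    // Simulated operation that fails twice before succeeding.
    let result: Result<u32, &str> = retry(
        5,
        Duration::from_millis(1),
        Duration::from_millis(8),
        || {
            calls += 1;
            if calls < 3 { Err("transient failure") } else { Ok(calls) }
        },
    );
    println!("{result:?} after {calls} calls");
}
```

Capping the delay keeps a long run of failures from producing unbounded waits.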

Code Example

// Assumption: anyhow supplies the Result alias; substitute your own error type if needed.
use anyhow::Result;
use kotadb::{create_file_storage, DocumentBuilder};

#[tokio::main]
async fn main() -> Result<()> {
    // Production-ready storage with all safety features.
    // Note: Rust does not expand "~" itself; use an absolute path if the library does not expand it.
    let mut storage = create_file_storage("~/.kota/db", Some(1000)).await?;

    // Type-safe document construction: each setter validates its input
    let doc = DocumentBuilder::new()
        .path("/knowledge/rust-patterns.md")?
        .title("Advanced Rust Design Patterns")?
        .content(b"# Advanced Rust Patterns\n\n...")?
        .build()?;

    // Automatically traced, validated, cached, with retries
    storage.insert(doc).await?;

    Ok(())
}

Query Language

Natural, intuitive queries designed for human-AI interaction:

// Natural language
"meetings about rust programming last week"

// Structured precision
{
  type: "semantic",
  query: "distributed systems",
  filter: { tags: { $contains: "architecture" } },
  limit: 10
}

// Graph traversal
GRAPH {
  start: "projects/kota-ai/README.md",
  follow: ["related", "references"],
  depth: 2
}
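Since graph queries are still in progress, the following is only a sketch of how a depth-limited traversal like the GRAPH query above could behave. The `Graph` type, the `traverse` function, and every document path other than the starting README are hypothetical.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Edges keyed by source document: (edge label, target document).
type Graph = HashMap<&'static str, Vec<(&'static str, &'static str)>>;

/// Breadth-first traversal from `start`, following only edges whose label is
/// in `follow`, and returning documents at most `depth` hops away.
fn traverse(graph: &Graph, start: &'static str, follow: &[&str], depth: usize) -> Vec<&'static str> {
    let mut seen = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0usize)]);
    let mut reached = Vec::new();
    while let Some((node, hops)) = queue.pop_front() {
        if hops == depth {
            continue; // do not expand beyond the depth limit
        }
        for &(label, target) in graph.get(node).into_iter().flatten() {
            // Skip edges with other labels and documents already visited.
            if follow.contains(&label) && seen.insert(target) {
                reached.push(target);
                queue.push_back((target, hops + 1));
            }
        }
    }
    reached
}

fn main() {
    let graph: Graph = HashMap::from([
        ("projects/kota-ai/README.md", vec![("related", "docs/a.md"), ("mentions", "docs/x.md")]),
        ("docs/a.md", vec![("references", "docs/b.md")]),
        ("docs/b.md", vec![("related", "docs/c.md")]),
    ]);
    let hits = traverse(&graph, "projects/kota-ai/README.md", &["related", "references"], 2);
    println!("{hits:?}");
}
```

With `depth: 2`, documents three hops away are not returned, and edges labeled anything other than the listed `follow` values are ignored.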

Project Status

Complete

Storage engine with WAL and compression
B+ tree primary index with persistence
Trigram full-text search with ranking
Intelligent query routing
CLI interface
Performance benchmarks

In Progress

Model Context Protocol (MCP) server
Python/TypeScript client libraries
Semantic vector search
Graph relationship queries

Documentation

Architecture • API Reference • Development Guide • Agent Guide

Installation

As a CLI Tool

cargo install --path .
kotadb serve                    # Start HTTP server
kotadb insert /path "Title" "Content"  # Insert document
kotadb search "query"           # Search documents

As a Library

[dependencies]
kotadb = { git = "https://github.com/jayminwest/kota-db" }

Docker

docker build -t kotadb .
docker run -p 8080:8080 kotadb serve

Benchmarks Detail

Operation	Size	Latency	Throughput
B+ Tree Insert	100	15.8 µs	63,300 ops/sec
B+ Tree Insert	1,000	325 µs	3,080 ops/sec
B+ Tree Insert	10,000	4.77 ms	210 ops/sec
B+ Tree Search	100	2.08 µs	482,000 queries/sec
B+ Tree Search	1,000	33.2 µs	30,100 queries/sec
B+ Tree Search	10,000	546 µs	1,830 queries/sec
Bulk Operations	1,000	25.4 ms	39,400 ops/sec
Bulk Operations	5,000	23.7 ms	211,000 ops/sec
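When reading the table, note that the Throughput column appears to be derived differently per row type: the tree rows are close to 1 / latency (one operation per measured interval), while the Bulk Operations rows are close to size / latency (the whole batch per measured interval). This is inferred from the numbers, not stated by the benchmarks. The hypothetical `throughput` helper below reproduces two rows as a check.

```rust
/// Throughput in operations per second for `ops` operations completed in `seconds`.
fn throughput(ops: f64, seconds: f64) -> f64 {
    ops / seconds
}

fn main() {
    // Tree search at size 10,000: one query per 546 µs.
    println!("{:.0} queries/sec", throughput(1.0, 546e-6));
    // Bulk Operations at size 5,000: 5,000 operations in 23.7 ms.
    println!("{:.0} ops/sec", throughput(5_000.0, 23.7e-3));
}
```

Both results round to the table's reported values of 1,830 queries/sec and 211,000 ops/sec.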

Contributing

This project is developed entirely by LLM agents. Human contributions follow the same process:

Open an issue describing the change
Agents will review and implement
Changes are validated through comprehensive testing
Documentation is automatically updated

See AGENT.md for the agent collaboration protocol.

License

MIT - See LICENSE for details.

Built for KOTA • Inspired by LevelDB, Tantivy, and FAISS

The best database is the one designed specifically for your problem.

kotadb 0.2.1