KotaDB
A custom database for distributed human-AI cognition, built entirely by LLM agents.
KotaDB combines document storage, graph relationships, and semantic search
into a unified system designed for the way humans and AI think together.
Performance
Real-world benchmarks on Apple Silicon:
| Operation | Latency | Throughput |
|---|---|---|
| B+ Tree Search | 489 µs | 2,000 queries/sec |
| Trigram Search | <10 ms | 100+ queries/sec |
| Document Insert | 277 µs | 3,600 ops/sec |
| Bulk Operations | 20 ms | 50,000 ops/sec |
Measured on a 10,000-document dataset, Apple Silicon M-series.
Quick Start
# Clone and build
# Start HTTP server
# CLI examples
Architecture
┌─────────────────────────────────────────────────────────────┐
│                       Query Interface                       │
│                Natural Language + Structured                │
├─────────────────────────────────────────────────────────────┤
│                        Query Router                         │
│          Automatic index selection based on query           │
├──────────────┬───────────────┬───────────────┬──────────────┤
│   Primary    │   Full-Text   │     Graph     │   Semantic   │
│   B+ Tree    │    Trigram    │   (Planned)   │     HNSW     │
├──────────────┴───────────────┴───────────────┴──────────────┤
│                       Storage Engine                        │
│           Pages + WAL + Compression + Memory Map            │
└─────────────────────────────────────────────────────────────┘
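The router layer in the diagram picks an index from the shape of the query. A minimal sketch of that idea follows. The `IndexKind` enum, the `route` function, and the "leading slash means path" heuristic are all assumptions made for illustration; they are not KotaDB's actual routing rules or API.

```rust
// Hypothetical sketch: none of these names come from KotaDB itself.

/// Which index a query would be sent to.
#[derive(Debug, PartialEq)]
enum IndexKind {
    /// B+ tree, exact path lookups.
    Primary,
    /// Trigram index, free-text search.
    FullText,
}

/// Pick an index from the shape of the query string.
/// Assumption: a query that starts with '/' and contains no whitespace is
/// treated as a document path; everything else is treated as free text.
fn route(query: &str) -> IndexKind {
    let q = query.trim();
    let looks_like_path = q.starts_with('/') && !q.contains(char::is_whitespace);
    if looks_like_path {
        IndexKind::Primary
    } else {
        IndexKind::FullText
    }
}

fn main() {
    for q in ["/notes/rust/meeting.md", "meetings about rust programming"] {
        println!("{:?} -> {:?}", q, route(q));
    }
}
```

A real router would likely inspect a parsed query rather than a raw string, but the dispatch shape (one enum variant per index) stays the same.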
Core Features
Storage
- Native Format: Markdown files with YAML frontmatter
- Git Compatible: Human-readable, diff-friendly
- Crash-Safe: WAL ensures data durability
- Zero Database Dependencies: No external database required
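To make the native format concrete, here is a rough sketch of separating YAML frontmatter from a Markdown body using only the standard library. The helper names `split_frontmatter` and `field` are hypothetical, the sample document is invented, and the `key: value` lookup is a naive line scan, not a YAML parser. It assumes the frontmatter opens on the first line and that both delimiters are lines containing only `---`.

```rust
// Hypothetical helpers for illustration; not part of KotaDB's API.

/// Split a document into (frontmatter, body).
/// Returns None when the document does not start with a `---` line or the
/// closing `---` line is missing.
fn split_frontmatter(doc: &str) -> Option<(&str, &str)> {
    const CLOSE: &str = "\n---\n";
    let rest = doc.strip_prefix("---\n")?;
    let end = rest.find(CLOSE)?;
    Some((&rest[..end], &rest[end + CLOSE.len()..]))
}

/// Naive lookup of a top-level `key: value` line in the frontmatter.
/// This is not a YAML parser: nested keys and multi-line values are ignored.
fn field<'a>(frontmatter: &'a str, key: &str) -> Option<&'a str> {
    frontmatter.lines().find_map(|line| {
        let (k, v) = line.split_once(':')?;
        if k.trim() == key {
            Some(v.trim())
        } else {
            None
        }
    })
}

fn main() {
    let doc = "---\ntitle: Weekly sync\ntags: [rust, meeting]\n---\n# Notes\nDiscussed the storage engine.\n";
    if let Some((frontmatter, body)) = split_frontmatter(doc) {
        println!("title = {:?}", field(frontmatter, "title"));
        println!("body  = {:?}", body);
    }
}
```

Because the file stays plain Markdown, the same document remains readable in any editor and diffs cleanly under Git.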
Indexing
- B+ Tree: O(log n) path-based lookups
- Trigram: Fuzzy-tolerant full-text search
- Graph: Relationship traversal (MCP tools only, not fully implemented)
- Vector: Semantic similarity with HNSW (in progress, see Project Status)
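Trigram search tolerates typos because a misspelled word still shares most of its three-character windows with the correct one. The sketch below extracts trigrams and scores two strings by Jaccard similarity. The function names are hypothetical, and Jaccard scoring is an assumption chosen for simplicity; KotaDB's actual ranking may differ.

```rust
use std::collections::HashSet;

// Hypothetical sketch of trigram matching; not KotaDB's index code.

/// Collect the set of lowercase character trigrams in `text`.
/// Strings shorter than three characters produce an empty set.
fn trigrams(text: &str) -> HashSet<String> {
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    chars
        .windows(3)
        .map(|w| w.iter().collect::<String>())
        .collect()
}

/// Jaccard similarity of the two trigram sets: |A ∩ B| / |A ∪ B|.
/// Returns 0.0 when neither string has any trigrams.
fn similarity(a: &str, b: &str) -> f64 {
    let (ta, tb) = (trigrams(a), trigrams(b));
    let union = ta.union(&tb).count();
    if union == 0 {
        return 0.0;
    }
    ta.intersection(&tb).count() as f64 / union as f64
}

fn main() {
    // "databse" drops one letter but keeps the trigrams dat, ata, tab.
    println!("{:.3}", similarity("database", "databse"));
    println!("{:.3}", similarity("database", "graph"));
}
```

An index built on this idea maps each trigram to the documents containing it, so a query only needs to score documents that share at least one trigram.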
Safety
- Systematic Testing: 6-stage risk-reduction methodology
- Type Safety: Validated types at compile time
- Observability: Distributed tracing on every operation
- Resilience: Automatic retries with exponential backoff
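The resilience bullet can be illustrated with a small retry helper. This is a sketch under stated assumptions: `backoff_delay` and `retry` are hypothetical names, the doubling factor and the cap are illustrative choices, and the blocking `sleep` stands in for whatever async timer the real code uses.

```rust
use std::thread::sleep;
use std::time::Duration;

// Hypothetical sketch of retry with exponential backoff; not KotaDB's API.

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `cap`.
fn backoff_delay(base: Duration, cap: Duration, attempt: u32) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(cap)
}

/// Call `op` until it succeeds or `max_attempts` calls have been made.
/// `op` is always called at least once; the last error is returned on failure.
fn retry<T, E>(
    max_attempts: u32,
    base: Duration,
    cap: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(base, cap, attempt));
                attempt += 1;
            }
        }
    }
}

fn main() {
    let mut calls = 0;
    // Fails twice with a made-up transient error, then succeeds.
    let result: Result<&str, &str> = retry(
        5,
        Duration::from_millis(10),
        Duration::from_millis(100),
        || {
            calls += 1;
            if calls < 3 { Err("transient failure") } else { Ok("stored") }
        },
    );
    println!("{:?} after {} calls", result, calls);
}
```

Capping the delay keeps a long outage from producing unbounded waits; production code would usually add random jitter as well so that many clients do not retry in lockstep.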
Code Example
use ;
async
Query Language
Natural, intuitive queries designed for human-AI interaction:
// Natural language
"meetings about rust programming last week"
// Structured precision
// Graph traversal
GRAPH
Project Status
Complete
- Storage engine with WAL and compression
- B+ tree primary index with persistence
- Trigram full-text search with ranking
- Intelligent query routing
- CLI interface
- Performance benchmarks
In Progress
- Model Context Protocol (MCP) server
- Python/TypeScript client libraries
- Semantic vector search
- Graph relationship queries
Documentation
Architecture • API Reference • Development Guide • Agent Guide
Installation
As a CLI Tool
As a Library
[]
= { = "https://github.com/jayminwest/kota-db" }
Docker
Benchmarks Detail
| Operation | Size | Latency | Throughput |
|---|---|---|---|
| BTree Insert | 100 | 15.8 µs | 63,300 ops/sec |
| BTree Insert | 1,000 | 325 µs | 3,080 ops/sec |
| BTree Insert | 10,000 | 4.77 ms | 210 ops/sec |
| BTree Search | 100 | 2.08 µs | 482,000 queries/sec |
| BTree Search | 1,000 | 33.2 µs | 30,100 queries/sec |
| BTree Search | 10,000 | 546 µs | 1,830 queries/sec |
| Bulk Operations | 1,000 | 25.4 ms | 39,400 ops/sec |
| Bulk Operations | 5,000 | 23.7 ms | 211,000 ops/sec |
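The throughput column appears to follow directly from the latency column: the BTree rows are consistent with one operation per measured latency, while the Bulk rows are consistent with the whole batch (the Size column) completing in the measured latency. That reading is inferred from the numbers, not stated by the benchmark suite. A short check, with a hypothetical `throughput` helper:

```rust
/// Operations per second when `ops` operations complete in `latency_secs` seconds.
fn throughput(ops: u64, latency_secs: f64) -> f64 {
    ops as f64 / latency_secs
}

fn main() {
    // Single-operation rows: one op per measured latency.
    println!("BTree Insert, size 100:    {:.0} ops/sec", throughput(1, 15.8e-6));
    println!("BTree Search, size 10,000: {:.0} queries/sec", throughput(1, 546e-6));

    // Bulk rows: the whole batch completes within the measured latency.
    println!("Bulk, 1,000 ops: {:.0} ops/sec", throughput(1_000, 25.4e-3));
    println!("Bulk, 5,000 ops: {:.0} ops/sec", throughput(5_000, 23.7e-3));
}
```

Rounded to three significant figures, these values match the table, which is why the 5,000-operation bulk run shows higher throughput than the 1,000-operation run despite a similar latency.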
Contributing
This project is developed entirely by LLM agents. Human contributions follow the same process:
- Open an issue describing the change
- Agents will review and implement
- Changes are validated through comprehensive testing
- Documentation is automatically updated
See AGENT.md for the agent collaboration protocol.
License
MIT - See LICENSE for details.
Built for KOTA • Inspired by LevelDB, Tantivy, and FAISS
The best database is the one designed specifically for your problem.