leankg 0.12.0

Lightweight Knowledge Graph for AI-Assisted Development
# LeanKG High Level Design

**Version:** 2.0
**Date:** 2026-04-11
**Based on:** PRD v3.0-consolidated
**Status:** Active
**Codebase Version:** 0.11.1

**Changelog:**
- v2.0 - Full codebase audit rewrite:
  - Converted all Vietnamese content to English
  - Added 6 missing modules: orchestrator, compress, hooks, benchmark, registry, api
  - Updated C4 diagrams to include all 17 modules
  - Updated data model with ContextMetric, QueryCache, ApiKey tables
  - Added all 35 MCP tools to interface specifications
  - Added all 28+ CLI commands
  - Updated parser list to 13 languages (10 fully extracted)
  - Added RTK compression architecture
  - Added orchestrator architecture
  - Added git hooks architecture
  - Fixed MCP server tech from "Custom Rust" to "rmcp"
  - Updated relationship types to 10 (previously missing `tests` and `implementations`)
- v1.19 - Auto-Index on DB Write
- v1.18 - RTK Integration
- v1.13 - Terraform and CI/CD YAML Indexing
- v1.5 - MCP Server Self-Initialization
- v1.2.1 - Migrated from SurrealDB to CozoDB

---

## 1. Architecture Overview

### 1.1 Design Principles

| Principle | Description |
|-----------|-------------|
| **Local-first** | All data and processing runs locally, no cloud dependencies |
| **Single binary** | The application ships as a single binary |
| **Minimal dependencies** | No external services or database processes required |
| **Incremental** | Process only changed files instead of re-scanning the whole codebase |
| **MCP-native** | Designed from the ground up for MCP protocol |

### 1.2 System Overview

LeanKG is a local-first knowledge graph system providing codebase intelligence for AI coding tools. The system parses code, builds dependency graphs, and exposes interfaces via CLI, MCP server, REST API, and embedded Web UI.

---

## 2. C4 Models

### 2.1 Context Diagram (C4-1)

```mermaid
graph TB
    subgraph "User's Machine"
        subgraph "LeanKG System"
            KG[LeanKG Application]
            DB[(CozoDB<br/>Database)]
            KG --- DB
        end
        
        AI1[Cursor]
        AI2[OpenCode]
        AI3[Claude Code]
        AI4[Gemini CLI]
        AI5[Kilo Code]
        AI6[Codex]
        
        subgraph "CLI Users"
            Dev[Developer]
            CI[CI/CD Pipeline]
        end
        
        subgraph "Web UI Users"
            Browser[Browser]
        end
        
        subgraph "REST API Clients"
            APIClient[API Client]
        end
    end
    
    AI1 -->|MCP Protocol| KG
    AI2 -->|MCP Protocol| KG
    AI3 -->|MCP Protocol| KG
    AI4 -->|MCP Protocol| KG
    AI5 -->|MCP Protocol| KG
    AI6 -->|MCP Protocol| KG
    
    Dev -->|CLI Commands| KG
    CI -->|CLI Commands| KG
    
    Browser -->|HTTP| KG
    APIClient -->|REST API| KG
```

### 2.2 Container Diagram (C4-2)

```mermaid
graph TB
    subgraph "LeanKG Application"
        CLI[CLI Interface<br/>Clap]
        MCP[MCP Server<br/>rmcp]
        Web[Web UI Server<br/>Axum]
        API[REST API Server<br/>Axum]
        
        Indexer[Code Indexer<br/>tree-sitter]
        PipeIdx[Pipeline Indexer<br/>CI/CD Parsers]
        DocIdx[Doc Indexer<br/>Markdown Parser]
        Graph[Graph Engine<br/>Query Processor]
        DocGen[Doc Generator<br/>+ Wiki]
        Watcher[File Watcher<br/>notify]
        Impact[Impact Analyzer<br/>BFS Traversal]
        Trace[Traceability<br/>Analyzer]
        Qual[Code Quality<br/>Metrics]
        Compress[Compress<br/>RTK-style]
        Orch[Orchestrator<br/>Intent + Cache]
        Hooks[Git Hooks<br/>pre/post-commit]
        Bench[Benchmark<br/>Runner]
        Registry[Global Registry<br/>Multi-repo]

        DB[(CozoDB<br/>Database)]

        CLI --> Indexer
        CLI --> PipeIdx
        CLI --> DocIdx
        CLI --> Graph
        CLI --> DocGen
        CLI --> Impact
        CLI --> Trace
        CLI --> Qual
        CLI --> Compress
        CLI --> Hooks
        CLI --> Bench
        CLI --> Registry

        Indexer --> DB
        PipeIdx --> DB
        DocIdx --> DB
        Graph --> DB
        DocGen --> DB
        Impact --> DB
        Trace --> DB
        Qual --> DB
        Compress --> Orch
        Orch --> Graph
        Orch --> DB
        Hooks --> Indexer
        Watcher --> Indexer
        Bench --> Graph
        
        MCP --> Graph
        MCP --> DocGen
        MCP --> Impact
        MCP --> Trace
        MCP --> Compress
        MCP --> Orch
        
        Web --> Graph
        Web --> DocGen
        Web --> Qual
        
        API --> Graph
        API --> DB
    end
```

**Containers:**

| Container | Responsibility | Technology |
|-----------|---------------|------------|
| CLI Interface | Command-line interaction (28+ commands) | Clap (Rust) |
| MCP Server | MCP protocol communication (35 tools) | rmcp (Rust) |
| Web UI Server | HTTP server for graph visualization (20+ routes) | Axum (Rust) |
| REST API Server | REST API with auth | Axum + tower-http (Rust) |
| Code Indexer | Parse source code with tree-sitter (10 languages with full entity extraction) | tree-sitter (Rust) |
| Pipeline Indexer | Parse CI/CD configuration files | Custom YAML parsers (Rust) |
| Doc Indexer | Parse documentation files and extract code references | pulldown-cmark (Rust) |
| Graph Engine | Query and traverse knowledge graph | Rust + CozoDB Datalog |
| Doc Generator | Generate markdown documentation + wiki | Rust templates |
| File Watcher | Monitor file changes | notify (Rust) |
| Impact Analyzer | Calculate blast radius with confidence scores | Rust (BFS traversal) |
| Traceability Analyzer | Trace requirements to code via documentation | Rust |
| Code Quality | Detect large functions, code metrics | Rust |
| Compress | 8 read modes, response/shell/cargo/git compressors, entropy analysis | Rust |
| Orchestrator | Smart query routing with intent parsing and persistent cache | Rust |
| Git Hooks | pre-commit, post-commit, post-checkout hooks + GitWatcher | Rust (shell script generation) |
| Benchmark | Compare vs OpenCode, Gemini CLI, Kilo CLI | Rust |
| Global Registry | Multi-repo management | Rust + CozoDB |
| CozoDB | Persistent storage (per-project) | CozoDB 0.2 (embedded SQLite-backed) |

### 2.3 Component Diagram (C4-3)

```mermaid
graph TB
    subgraph "Code Indexer"
        Parser[Parser Manager<br/>13 Parsers]
        GoParser[Go Parser]
        TSParser[TS/JS Parser]
        PyParser[Python Parser]
        RsParser[Rust Parser]
        JavaParser[Java Parser]
        KtParser[Kotlin Parser]
        CppParser[C++ Parser]
        CsParser[C# Parser]
        RbParser[Ruby Parser]
        PhpParser[PHP Parser]
        Extractor[Entity Extractor]
        Builder[Graph Builder]
        
        Parser --> GoParser
        Parser --> TSParser
        Parser --> PyParser
        Parser --> RsParser
        Parser --> JavaParser
        Parser --> KtParser
        Parser --> CppParser
        Parser --> CsParser
        Parser --> RbParser
        Parser --> PhpParser
        
        GoParser --> Extractor
        TSParser --> Extractor
        PyParser --> Extractor
        RsParser --> Extractor
        JavaParser --> Extractor
        KtParser --> Extractor
        CppParser --> Extractor
        CsParser --> Extractor
        RbParser --> Extractor
        PhpParser --> Extractor
        
        Extractor --> Builder
    end
    
    subgraph "Graph Engine"
        Query[Query Processor]
        Search[Search Engine]
        Relate[Relationship Engine]
        Cache[Query Cache]
        Cluster[Community Detector]
        Export[Export: HTML/SVG/GraphML/Neo4j]
        
        Query --> Search
        Query --> Relate
        Search --> Cache
        Relate --> Cache
        Query --> Cluster
        Query --> Export
    end
    
    subgraph "Impact Analyzer"
        BFS[BFS Traversal]
        QualNorm[Qualified Name Normalizer]
        Radius[Radius Calculator]
        Context[Review Context Generator]
        ImpactCache[Impact Cache]
        
        BFS --> QualNorm
        QualNorm --> Radius
        Radius --> Context
        Radius --> ImpactCache
    end
    
    subgraph "Compress Module"
        FileReader[File Reader<br/>8 Read Modes]
        ResponseComp[Response Compressor]
        ShellComp[Shell Compressor]
        CargoComp[CargoTest Compressor]
        GitDiffComp[GitDiff Compressor]
        Entropy[Entropy Analyzer]
        
        FileReader --> Entropy
        ResponseComp --> Entropy
    end
    
    subgraph "Orchestrator"
        Intent[Intent Parser<br/>7 Query Types]
        OrchCache[Persistent Cache<br/>CozoDB-backed]
        Router[Query Router]
        
        Intent --> Router
        Router --> OrchCache
    end
    
    subgraph "Git Hooks"
        PreCommit[Pre-commit Hook<br/>Critical file blocking]
        PostCommit[Post-commit Hook<br/>Auto reindex]
        PostCheckout[Post-checkout Hook<br/>Branch switch reindex]
        GitWatch[GitWatcher<br/>Continuous freshness]
        
        PreCommit --> GitWatch
        PostCommit --> GitWatch
        PostCheckout --> GitWatch
    end
    
    subgraph "MCP Server"
        Protocol[MCP Protocol<br/>rmcp]
        Handler[Request Handler<br/>35 tools]
        Tools[Tool Registry]
        WriteTracker[Write Tracker<br/>Atomic dirty flag]
        
        Protocol --> Handler
        Handler --> Tools
        Handler --> WriteTracker
    end
    
    Builder -->|Writes| DB[(CozoDB)]
    Query -.->|Reads| DB
    Tools -.->|Queries| Graph
    BFS -.->|Reads| DB
    OrchCache -.->|Reads/Writes| DB
```

### 2.4 Deployment Diagram (C4-4)

```mermaid
graph TB
    subgraph "Developer Machine"
        subgraph "Operating System"
            subgraph "LeanKG Deployment"
                Binary[LeanKG Binary v0.11.1]
                
                subgraph "Runtime Components"
                    HTTP[HTTP Server<br/>:8080]
                    MCP[MCP Server<br/>stdio]
                    RESTAPI[REST API<br/>:8081]
                    Watch[File Watcher]
                end
                
                subgraph "Data Layer"
                    DB[(CozoDB<br/>.leankg/)]
                    PCache[Persistent Cache<br/>CozoDB query_cache]
                end
                
                subgraph "Processing"
                    Parser[Parser Pool<br/>tree-sitter 13 parsers]
                    Graph[Graph Engine]
                    Impact[Impact Analyzer]
                end
            end
            
            subgraph "File System"
                Code[Project Code]
                ProjectDB[.leankg/<br/>graph.db]
                ProjectConfig[.leankg/<br/>config.yaml]
                HooksDir[.git/hooks/<br/>leankg-*]
                Registry[~/.leankg/<br/>registry.yaml]
            end
        end
    end
    
    HTTP --> Binary
    MCP --> Binary
    RESTAPI --> Binary
    Watch --> Binary
    
    Binary --> Parser
    Binary --> Graph
    Binary --> Impact
    
    Parser --> Code
    Graph --> DB
    Impact --> DB
    PCache --> DB
```

**Deployment Scenarios:**

| Scenario | Environment | Resources |
|----------|-------------|-----------|
| macOS Intel | macOS x64 | < 100MB RAM, < 200MB disk |
| macOS Apple Silicon | macOS ARM64 | < 100MB RAM, < 200MB disk |
| Linux x64 | Linux x64 | < 100MB RAM, < 200MB disk |
| Linux ARM64 | Linux ARM64 | < 100MB RAM, < 200MB disk |

**Processes:**

| Process | Port | Description |
|---------|------|-------------|
| LeanKG Binary | - | Main application process |
| HTTP Server | 8080 | Web UI server (optional) |
| REST API Server | 8081 | REST API with auth (optional) |
| MCP Server | stdio | MCP protocol via stdin/stdout |
| File Watcher | - | Background notify process |
| CozoDB | - | Embedded SQLite-backed database |

---

## 3. Data Flow

### 3.1 Indexing Flow

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant CLI as CLI
    participant Watch as File Watcher
    participant Parse as Parser Manager
    participant Extract as Entity Extractor
    participant Build as Graph Builder
    participant DB as CozoDB

    Dev->>CLI: leankg index ./src
    CLI->>Parse: Parse directory (parallel via rayon)
    Parse->>Extract: Extract entities (10 languages)
    Extract->>Build: Build relationships (10 types)
    Build->>DB: Batch insert (5000 rows/batch)
    
    Note over Extract,Build: Call Edge Resolution Pass
    Build->>DB: resolve_call_edges() resolves __unresolved__ targets
    
    Note over Watch: File change detected
    Watch->>Parse: Re-parse changed file
    Parse->>Extract: Extract entities
    Extract->>Build: Update graph
    Build->>DB: Upsert elements
```
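
The batch insert step in the flow above can be pictured as chunking rows before each database write. This is a minimal sketch, not LeanKG's implementation: the `Row` type and the `write_batch` callback are hypothetical stand-ins for the real element type and CozoDB call. Only the 5000-row batch size comes from the diagram.

```rust
/// Batch size taken from the indexing flow above.
const BATCH_SIZE: usize = 5000;

/// Hypothetical stand-in for one code element row.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub qualified_name: String,
}

/// Splits `rows` into batches of at most `BATCH_SIZE` and hands each batch to
/// `write_batch` (assumed to wrap the actual database write).
/// Returns the number of batches written, or the first error encountered.
pub fn insert_in_batches<F>(rows: &[Row], mut write_batch: F) -> Result<usize, String>
where
    F: FnMut(&[Row]) -> Result<(), String>,
{
    let mut batches = 0;
    for batch in rows.chunks(BATCH_SIZE) {
        write_batch(batch)?;
        batches += 1;
    }
    Ok(batches)
}
```

Passing the write as a closure keeps the chunking logic testable without a database.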

### 3.2 Query Flow

```mermaid
sequenceDiagram
    participant AI as AI Tool
    participant MCP as MCP Server
    participant Orch as Orchestrator
    participant Query as Query Processor
    participant DB as CozoDB

    AI->>MCP: orchestrate("show impact of changing handler.rs")
    MCP->>Orch: Parse intent
    Orch->>Orch: Check persistent cache
    alt Cache hit
        Orch-->>MCP: Cached result
    else Cache miss
        Orch->>Query: Route to get_impact_radius
        Query->>DB: BFS traversal
        DB-->>Query: Affected elements
        Query-->>Orch: Result
        Orch->>DB: Store in query_cache
        Orch-->>MCP: Result with compression
    end
    MCP-->>AI: Context payload
```
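
The cache-hit and cache-miss branches above follow a cache-aside pattern. The sketch below illustrates that pattern with an in-memory map. The field names mirror the `query_cache` table in the data model, but the `QueryCache` type, its methods, and the expiry rule (`created_at + ttl_seconds`) are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// One cached result; fields mirror the `query_cache` table.
#[derive(Debug, Clone)]
pub struct CacheEntry {
    pub value_json: String,
    pub created_at: u64,  // seconds since the Unix epoch
    pub ttl_seconds: u64,
}

/// Hypothetical in-memory stand-in for the CozoDB-backed cache.
#[derive(Default)]
pub struct QueryCache {
    entries: HashMap<String, CacheEntry>,
}

impl QueryCache {
    /// Returns the cached value if it exists and has not expired at `now`.
    pub fn get(&self, key: &str, now: u64) -> Option<&str> {
        self.entries
            .get(key)
            .filter(|e| now < e.created_at.saturating_add(e.ttl_seconds))
            .map(|e| e.value_json.as_str())
    }

    /// Stores (or overwrites) a value under `key`.
    pub fn put(&mut self, key: &str, value_json: &str, now: u64, ttl_seconds: u64) {
        self.entries.insert(
            key.to_string(),
            CacheEntry { value_json: value_json.to_string(), created_at: now, ttl_seconds },
        );
    }

    /// Cache-aside lookup: return the cached value on a hit; on a miss,
    /// run `compute`, store its result, and return it.
    pub fn get_or_compute<F>(&mut self, key: &str, now: u64, ttl_seconds: u64, compute: F) -> String
    where
        F: FnOnce() -> String,
    {
        if let Some(hit) = self.get(key, now) {
            return hit.to_string();
        }
        let value = compute();
        self.put(key, &value, now, ttl_seconds);
        value
    }
}
```

Taking `now` as a parameter rather than reading the clock inside the cache keeps expiry behavior deterministic in tests.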

### 3.3 Impact Analysis Flow

```mermaid
sequenceDiagram
    participant AI as AI Tool
    participant MCP as MCP Server
    participant BFS as BFS Traversal
    participant DB as CozoDB

    AI->>MCP: get_impact_radius(file.rs, depth=3, compress_response=true)
    MCP->>BFS: Calculate blast radius
    BFS->>DB: Find reachable nodes
    DB-->>BFS: Affected elements
    BFS->>BFS: Assign confidence scores<br/>and severity classifications
    BFS->>MCP: Impact set with severity
    MCP->>MCP: ResponseCompressor compress
    MCP-->>AI: Compressed impact payload
```
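
The "find reachable nodes" step is a depth-limited breadth-first search. The following sketch shows the traversal over an in-memory adjacency map. The `impact_radius` function and the `dependents` map are hypothetical, since the real analyzer reads edges from CozoDB, and confidence scoring and severity classification are left out.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Returns every element reachable from `start` within `max_depth` hops,
/// paired with its hop distance, in BFS order.
/// `dependents` maps an element to the elements that depend on it.
pub fn impact_radius(
    dependents: &HashMap<String, Vec<String>>,
    start: &str,
    max_depth: usize,
) -> Vec<(String, usize)> {
    let mut visited: HashSet<String> = HashSet::new();
    visited.insert(start.to_string());

    let mut queue: VecDeque<(String, usize)> = VecDeque::new();
    queue.push_back((start.to_string(), 0));

    let mut affected = Vec::new();
    while let Some((node, depth)) = queue.pop_front() {
        // Nodes at the depth limit are reported but not expanded further.
        if depth == max_depth {
            continue;
        }
        if let Some(next) = dependents.get(&node) {
            for n in next {
                // `insert` returns false for already-seen nodes, which also
                // protects against cycles in the graph.
                if visited.insert(n.clone()) {
                    affected.push((n.clone(), depth + 1));
                    queue.push_back((n.clone(), depth + 1));
                }
            }
        }
    }
    affected
}
```

Because BFS visits nodes in order of distance, each element is recorded with its shortest hop count from the changed file.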

### 3.4 Pre-Commit Change Detection Flow

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant Hook as Pre-commit Hook
    participant CLI as leankg detect-changes
    participant DB as CozoDB

    Dev->>Hook: git commit
    Hook->>Hook: Check critical files<br/>(src/lib.rs, Cargo.toml, leankg.yaml)
    alt Critical file changed
        Hook-->>Dev: BLOCK commit
    else Non-critical
        Hook->>CLI: detect_changes(scope=staged)
        CLI->>DB: Query affected elements
        DB-->>CLI: Impact report
        CLI-->>Hook: Risk level (critical/high/medium/low)
        Hook-->>Dev: ALLOW with warning
    end
```
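
The first branch of the hook can be sketched as a check of staged paths against the critical-file list. The file list comes from the diagram. The `HookDecision` type, the `check_staged` function, and exact-path matching are assumptions; the generated hook is a shell script and may match paths differently.

```rust
/// Critical paths named in the pre-commit flow above.
const CRITICAL_FILES: [&str; 3] = ["src/lib.rs", "Cargo.toml", "leankg.yaml"];

/// Hypothetical outcome of the first hook stage.
#[derive(Debug, PartialEq)]
pub enum HookDecision {
    /// At least one critical file is staged; the commit is blocked.
    Block(Vec<String>),
    /// No critical file is staged; continue with change detection.
    RunDetectChanges,
}

/// Decides the hook outcome from the list of staged paths
/// (assumed to be relative to the repository root).
pub fn check_staged(staged: &[&str]) -> HookDecision {
    let hits: Vec<String> = staged
        .iter()
        .copied()
        .filter(|p| CRITICAL_FILES.contains(p))
        .map(String::from)
        .collect();

    if hits.is_empty() {
        HookDecision::RunDetectChanges
    } else {
        HookDecision::Block(hits)
    }
}
```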

---

## 4. Data Model

### 4.1 Entity Relationship Diagram

```mermaid
erDiagram
    CODE_ELEMENTS ||--o{ RELATIONSHIPS : has
    CODE_ELEMENTS ||--o| BUSINESS_LOGIC : describes
    CODE_ELEMENTS ||--o| DOCUMENTS : referenced_by
    USER_STORIES ||--o{ BUSINESS_LOGIC : mapped_to
    FEATURES ||--o{ BUSINESS_LOGIC : implements

    CODE_ELEMENTS {
        string qualified_name PK
        string type
        string name
        string file_path
        int line_start
        int line_end
        string language
        string parent_qualified FK
        string cluster_id
        string cluster_label
        json metadata
    }

    RELATIONSHIPS {
        string source_qualified FK
        string target_qualified FK
        string rel_type
        float confidence
        json metadata
    }

    BUSINESS_LOGIC {
        string element_qualified PK_FK
        string description
        string user_story_id FK
        string feature_id FK
    }

    DOCUMENTS {
        string id PK
        string title
        string content
        string file_path
        string generated_from
        int last_updated
    }

    CONTEXT_METRICS {
        string tool_name
        int timestamp
        string project_path
        int input_tokens
        int output_tokens
        int tokens_saved
        float savings_percent
    }

    QUERY_CACHE {
        string cache_key PK
        string value_json
        int created_at
        int ttl_seconds
        string tool_name
    }

    API_KEYS {
        string id PK
        string name
        string key_hash
        int created_at
    }
```

### 4.2 Schema Description

| Table | Description | Indexes |
|-------|-------------|---------|
| `code_elements` | All code/doc/pipeline elements. PK = qualified_name (`file_path::parent::name`) | PK: qualified_name |
| `relationships` | 10 relationship types: imports, calls, references, documented_by, tested_by, tests, contains, defines, implements, implementations | rel_type_index, target_qualified_index |
| `business_logic` | Business logic annotations per element | PK: element_qualified |
| `context_metrics` | 18-field metrics tracking per MCP tool call | tool_name_index, timestamp_index, project_path_index |
| `query_cache` | Persistent cache for orchestrator results | unique: cache_key, tool_name_index |

### 4.3 Element Types

| Element Type | qualified_name Format | Description |
|-------------|----------------------|-------------|
| `file` | `path/to/file.rs` | A source code file |
| `function` | `path/to/file.rs::function_name` | A function or method |
| `class` | `path/to/file.rs::ClassName` | A class, struct, or interface |
| `import` | `path/to/file.rs::import_alias` | An import statement |
| `export` | `path/to/file.rs::export_name` | An export |
| `document` | `docs/path/to/file.md` | A documentation file |
| `doc_section` | `docs/path/to/file.md::section_name` | A section within a document |
| `pipeline` | `path/to/ci.yml::pipeline_name` | A CI/CD pipeline |
| `pipeline_stage` | `path/to/ci.yml::pipeline::stage` | A pipeline stage |
| `pipeline_step` | `path/to/ci.yml::pipeline::stage::step` | A pipeline step |
| `terraform` | `path/to/file.tf::resource_type.name` | A Terraform resource |
| `cicd` | `path/to/ci.yml::job_name` | A CI/CD job |

### 4.4 Relationship Types

| Type | Source → Target | Description | Confidence |
|------|-----------------|-------------|------------|
| `imports` | file → file | Module A imports module B | 1.0 |
| `calls` | function → function | Function A calls function B | 0.5-1.0 |
| `references` | document → code_element | Doc references code element | 1.0 |
| `documented_by` | code_element → document | Code is documented by doc | 1.0 |
| `tested_by` | code_element → test_function | Code is tested by test | 1.0 |
| `tests` | test_function → code_element | Test tests code | 1.0 |
| `contains` | parent → child | Parent contains child (file→function, doc→section) | 1.0 |
| `defines` | file → element | File defines element | 1.0 |
| `implements` | struct → interface | Struct implements interface (Go) | 1.0 |
| `implementations` | interface → struct | Interface implementations | 1.0 |

---

## 5. Interface Specifications

### 5.1 CLI Commands

| Command | Flags | Description |
|---------|-------|-------------|
| `init` | `--path` | Initialize LeanKG project |
| `index` | `[path]`, `-i`, `-l`, `--exclude`, `-v` | Index codebase (parallel via rayon) |
| `query` | `<query>`, `--kind` | Query knowledge graph |
| `generate` | `-t/--template` | Generate documentation |
| `web` | `--port` | Start embedded web UI |
| `mcp-stdio` | `--watch`, `--project-path` | Start MCP server (stdio) |
| `impact` | `<file>`, `--depth` | Calculate impact radius |
| `install` | | Auto-install MCP config |
| `status` | | Show index status |
| `watch` | `--path` | Start file watcher |
| `quality` | `--min-lines`, `--lang` | Find oversized functions |
| `export` | `--format`, `--output`, `--file`, `--depth` | Export graph (json/dot/mermaid/html/svg/graphml/neo4j) |
| `annotate` | `<element>`, `-d`, `--user-story`, `--feature` | Add business logic annotation |
| `link` | `<element>`, `<id>`, `--kind` | Link element to story/feature |
| `search-annotations` | `<query>` | Search annotations |
| `show-annotations` | `<element>` | Show annotations |
| `trace` | `--feature`, `--user-story`, `-a` | Traceability chain |
| `find-by-domain` | `<domain>` | Find by business domain |
| `benchmark` | `--category`, `--cli` | Run benchmarks |
| `register` | `<name>` | Register repo in global registry |
| `unregister` | `<name>` | Unregister repo |
| `list` | | List registered repos |
| `status-repo` | `<name>` | Show repo status |
| `setup` | | Global MCP setup |
| `run` | `<command>`, `--compress` | Run shell command with compression |
| `detect-clusters` | `--path`, `--min-hub-edges` | Community detection |
| `api-serve` | `--port`, `--auth` | Start REST API |
| `api-key` | `create/list/revoke` | API key management |
| `hooks` | `install/uninstall/status/watch` | Git hooks management |
| `wiki` | `--output` | Generate wiki |
| `metrics` | `--since`, `--tool`, `--json`, `--reset`, `--seed` | Context metrics |
| `version` | | Show version |

### 5.2 MCP Tools (35 total)

See PRD Section 7 for complete list.

**Auto-Initialization Behavior:**
When the MCP server starts via `mcp-stdio` and finds no `.leankg/` directory, it automatically:
1. Runs init (creates .leankg/ and leankg.yaml)
2. Runs index on the current directory
3. Serves normally

**Auto-Indexing on Startup:**
When the MCP server starts in a project that is already initialized, it:
1. Checks staleness by comparing git HEAD commit time vs database file time
2. If stale, runs incremental indexing automatically

**Auto-Indexing on DB Write:**
1. **WriteTracker** - In-memory `Arc<AtomicBool>` dirty flag + `Arc<RwLock<Instant>>` last_write_time
2. **TrackingDb** - Wraps `CozoDb` to intercept `:put` and `:delete` operations
3. **Lazy Reindex** - On any MCP tool call, checks dirty flag; if set, triggers incremental reindex
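
The dirty-flag mechanism in the list above can be sketched as follows. The field types follow item 1. The method names (`mark_dirty`, `take_dirty`, `last_write`) and the memory orderings are assumptions, not the actual LeanKG API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, RwLock};
use std::time::Instant;

/// Shared write tracker; clones share the same flag and timestamp.
#[derive(Clone)]
pub struct WriteTracker {
    dirty: Arc<AtomicBool>,
    last_write_time: Arc<RwLock<Instant>>,
}

impl WriteTracker {
    pub fn new() -> Self {
        WriteTracker {
            dirty: Arc::new(AtomicBool::new(false)),
            last_write_time: Arc::new(RwLock::new(Instant::now())),
        }
    }

    /// Called by the database wrapper after a `:put` or `:delete`.
    pub fn mark_dirty(&self) {
        *self.last_write_time.write().unwrap() = Instant::now();
        self.dirty.store(true, Ordering::Release);
    }

    /// Atomically clears the flag and reports whether it was set.
    /// A `true` result means the caller should trigger a reindex.
    pub fn take_dirty(&self) -> bool {
        self.dirty.swap(false, Ordering::AcqRel)
    }

    /// Time of the most recent tracked write.
    pub fn last_write(&self) -> Instant {
        *self.last_write_time.read().unwrap()
    }
}
```

Using `swap` rather than a separate load and store means that two concurrent tool calls cannot both observe the flag as set, so at most one of them starts the reindex.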

### 5.3 REST API Routes

| Route | Method | Description |
|-------|--------|-------------|
| `/health` | GET | Health check (version + status) |
| `/api/v1/status` | GET | Element/relationship/annotation counts |
| `/api/v1/search?q=&limit=` | GET | Search code elements |

**Auth:** Bearer token via API key (stored as an Argon2 hash). The auth middleware is implemented but not yet wired into the routes.

### 5.4 Web UI Routes

| Route | Description |
|-------|-------------|
| `/` | Main dashboard |
| `/graph` | Interactive graph visualization (sigma.js) |
| `/browse` | Code browser |
| `/docs` | Documentation viewer |
| `/annotate` | Business logic annotation |
| `/quality` | Code quality metrics |
| `/export` | Generate HTML graph export |
| `/settings` | Configuration |
| `/project` | Project management |
| `/api/graph/data` | Graph data API |
| `/api/elements` | Elements CRUD |
| `/api/relationships` | Relationships CRUD |
| `/api/annotations` | Annotations CRUD |
| `/api/search` | Search API |
| `/api/query` | Query API |

---

## 6. Security Considerations

### 6.1 Datalog String Escaping

All user-provided strings in Datalog queries are escaped using `escape_datalog`:
```rust
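/// Escapes a user-provided string for embedding in a double-quoted Datalog literal.
/// Backslashes are escaped first so the backslashes added for quotes are not doubled.
fn escape_datalog(s: &str) -> String {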
    s.replace('\\', "\\\\").replace('"', "\\\"")
}
```
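
As a usage illustration, the sketch below interpolates an escaped value into a query string. `escape_datalog` is repeated so the snippet compiles on its own. The `name_filter_query` helper and the query text are hypothetical and only show where the escaped value lands.

```rust
fn escape_datalog(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Hypothetical helper: builds a query that filters code elements by name.
/// The user input is escaped before it is placed inside the quoted literal.
fn name_filter_query(user_input: &str) -> String {
    format!(
        "?[qualified_name] := *code_elements{{qualified_name, name}}, name = \"{}\"",
        escape_datalog(user_input)
    )
}
```

With this in place, a quote in the input stays inside the string literal as an escaped character and does not terminate it.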

### 6.2 API Key Security

| Concern | Mitigation |
|---------|------------|
| Key storage | Argon2 hash (no plaintext) |
| Key transport | Bearer token over HTTPS recommended |
| Key rotation | Revoke + create new key |

### 6.3 Local Security

| Concern | Mitigation |
|---------|------------|
| Data at rest | Database file stored locally |
| MCP access | stdio transport (local only) |
| HTTP exposure | Bind to localhost only by default |
| File access | Sandboxed to project directory |

---

## 7. Performance Targets

| Operation | Target | Notes |
|-----------|--------|-------|
| Cold start | < 2s | Binary initialization |
| Index speed | > 10K LOC/s | Parallel via rayon, batch inserts of 5000 rows |
| Query latency | < 100ms | Graph queries with cache |
| Memory idle | < 100MB | No active operations |
| Memory peak | < 500MB | During indexing |
| Disk footprint | < 50MB/100K LOC | Database size |

---

## 8. Configuration

```yaml
project:
  name: my-project
  root: ./src
  languages:
    - go
    - typescript
    - python
    - rust

indexer:
  exclude:
    - "**/node_modules/**"
    - "**/vendor/**"
  include:
    - "*.go"
    - "*.ts"
    - "*.py"

pipeline:
  enabled: true
  auto_detect: true

mcp:
  enabled: true
  auto_index_on_start: true
  auto_index_threshold_minutes: 5
  auto_index_on_db_write: true
  index_on_first_call: true

web:
  enabled: true
  port: 8080

documentation:
  output: ./docs
  templates:
    - agents
    - claude
```

---

## 9. Error Handling

| Category | Handling | User Feedback |
|----------|----------|---------------|
| Parse errors | Skip file, log warning | Warning in CLI output |
| Database errors | Retry with backoff | Error message |
| MCP errors | Return error response | MCP error payload |
| File system errors | Graceful degradation | Warning |
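
The "retry with backoff" handling for database errors could look like the sketch below. The `retry_with_backoff` function, the doubling delay, and the attempt limit are assumptions; this document does not specify the actual retry policy.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Runs `op` until it succeeds or `max_attempts` attempts have failed.
/// The delay starts at `base_delay` and doubles after each failure.
/// `op` always runs at least once.
pub fn retry_with_backoff<T, E, F>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: F,
) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
{
    let mut delay = base_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

A production version would likely retry only on transient errors and cap the maximum delay.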

---

## 10. Future Considerations

### 10.1 Near-term (Current Sprint)

- npm-based installation (US-14)
- Dart entity extraction (US-LANG-01)
- Swift entity extraction (US-LANG-02)
- REST API auth wiring + mutation endpoints

### 10.2 Phase 4 Features

- Vector embeddings for semantic search
- Cloud sync option
- Team features (shared knowledge graphs)
- Plugin system

---

## 11. Dependencies

### 11.1 Direct Dependencies

| Dependency | Version | Purpose |
|------------|---------|---------|
| cozo | 0.2 | Embedded SQLite-backed relational-graph database |
| rmcp | 1.2 | MCP protocol server |
| tree-sitter | 0.25 | Code parsing |
| clap | 4 | CLI framework |
| notify | 7 | File watching |
| axum | 0.7 | Web server + REST API |
| tower-http | 0.6 | CORS middleware |
| tokio | 1 | Async runtime |
| serde | 1 | Serialization |
| rayon | 1.10 | Parallel processing |
| pulldown-cmark | 0.12 | Markdown parsing |
| regex | 1 | Pattern matching |
| argon2 | 0.5 | API key hashing |
| uuid | 1 | Unique IDs |

### 11.2 tree-sitter Parsers

| Parser | Version | Language |
|--------|---------|----------|
| tree-sitter-go | 0.25 | Go |
| tree-sitter-typescript | 0.23 | TypeScript/JavaScript |
| tree-sitter-python | 0.25 | Python |
| tree-sitter-rust | 0.24 | Rust |
| tree-sitter-java | 0.23 | Java |
| tree-sitter-kotlin-ng | 1.1.0 | Kotlin |
| tree-sitter-cpp | 0.23 | C/C++ |
| tree-sitter-c-sharp | 0.23 | C# |
| tree-sitter-ruby | 0.23 | Ruby |
| tree-sitter-php | 0.23 | PHP |
| tree-sitter-dart | 0.1 | Dart (parser only) |
| tree-sitter-swift | 0.7 | Swift (parser only) |
| tree-sitter-xml | 0.7 | XML (parser only) |

---

## 12. Glossary

| Term | Definition |
|------|------------|
| Container | Executable process in C4 model |
| Component | Internal module of a container |
| Code element | A file, function, class, or import in the codebase |
| Context | Information provided to AI tool |
| Blast radius / Impact radius | Files affected by a change within N hops |
| Qualified name | Natural node identifier: `file_path::parent::name` |
| TESTED_BY | Relationship from a production code element to the test that covers it (inverse of `tests`) |
| Confidence Score | Float 0.0-1.0 indicating edge reliability |
| Severity Classification | WILL BREAK / LIKELY AFFECTED / MAY BE AFFECTED |
| Cluster | Functional community of code elements (Leiden algorithm) |
| Read Mode | File compression mode: adaptive, full, map, signatures, diff, aggressive, entropy, lines |
| Orchestrator | Smart query routing with intent parsing and persistent cache |
| GitWatcher | Component monitoring git events for index freshness |
| Global Registry | Multi-repo management for cross-project queries |
| RTK | Rust Token Killer - compression module reducing LLM token consumption |
| Entropy Analysis | Shannon entropy, Jaccard similarity, Kolmogorov adjustment for information density |

---

*Last updated: 2026-04-11 (v2.0, full codebase audit)*