knot-server 0.1.4

Distributed REST API server for knot codebase indexing. Manages Git repositories across a cluster with shared workspace coordination.
# knot-server

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Rust](https://img.shields.io/badge/rust-2024-brightgreen.svg)](https://www.rust-lang.org)

**knot-server** (v0.1.4) is a distributed REST API and background task scheduler for managing and indexing Git repositories across a cluster. It sits on top of the core [knot](https://github.com/raultov/knot) indexing engine, transforming it from a single-machine CLI tool into a highly available, cluster-aware enterprise service.

With `knot-server`, you can register Git repositories via a REST API, trigger automatic codebase indexing through webhooks (GitHub, GitLab, Bitbucket), and query the vector (Qdrant) and graph (Neo4j) databases, all while coordinating work safely across multiple server instances via NFS/EFS workspace locks.

---

## ✨ Key Features & API Endpoints

**knot-server** provides a comprehensive REST API to manage the lifecycle of your codebases.

### 📦 Repository Management
- **`POST /api/repos`**: Register a new Git repository. Accepts a JSON body with a URL, name, and optional authentication.
  ```json
  {
    "url": "https://github.com/raultov/knot.git",
    "name": "knot-core",
    "branch": "master",
    "webhook_secret": "your-secret-token",
    "auth": { "type": "none" }
  }
  ```
  | Field | Required | Description |
  |-------|----------|-------------|
  | `url` | Yes | Git repository URL (HTTPS, SSH, or local path) |
  | `name` | No | Display name (auto-derived from URL if omitted) |
  | `branch` | No | Branch to clone (defaults to `"main"`) |
  | `webhook_secret` | No | Shared secret for validating webhook signatures (HMAC-SHA256 or token). **Required to use the `/api/webhook` endpoint.** |
  | `auth` | No | Authentication method: `{"type": "ssh"}`, `{"type": "https", "token": "..."}`, or `{"type": "none"}` (default: `{"type": "ssh"}`) |
- **`GET /api/repos`**: List all registered repositories, along with their current status (`cloning`, `pulling`, `indexing`, `idle`, `error`) and last indexed timestamp.
- **`GET /api/repos/:id`**: Retrieve detailed information about a specific repository.
- **`DELETE /api/repos/:id`**: Remove a repository from the registry and delete its local workspace. (No request body required).
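
As a rough illustration of the request body described above, the following Python sketch assembles a registration payload and leaves optional fields out so that the documented server-side defaults apply. The helper name `build_register_payload` is hypothetical, and the check that an `https` auth entry carries a token is an assumption drawn from the table, not documented server behaviour.

```python
import json

VALID_AUTH_TYPES = {"ssh", "https", "none"}


def build_register_payload(url, name=None, branch=None,
                           webhook_secret=None, auth=None):
    """Build the JSON body for POST /api/repos (hypothetical helper).

    Optional fields are omitted when not given, so the server can apply
    its documented defaults (branch "main", auth {"type": "ssh"}).
    """
    if not url:
        raise ValueError("url is required")
    payload = {"url": url}
    if name is not None:
        payload["name"] = name
    if branch is not None:
        payload["branch"] = branch
    if webhook_secret is not None:
        payload["webhook_secret"] = webhook_secret
    if auth is not None:
        if auth.get("type") not in VALID_AUTH_TYPES:
            raise ValueError(f"unknown auth type: {auth.get('type')!r}")
        # Assumption: an https auth entry is only useful with a token.
        if auth["type"] == "https" and not auth.get("token"):
            raise ValueError("https auth requires a token")
        payload["auth"] = auth
    return payload


body = json.dumps(build_register_payload(
    "https://github.com/raultov/knot.git",
    name="knot-core",
    branch="master",
    auth={"type": "none"},
))
print(body)
```

The printed string can be passed to `curl -d` or sent with any HTTP client.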

### 🔄 Indexing & Webhooks
- **`POST /api/repos/:id/sync`**: Manually trigger an asynchronous sync and re-indexing job for a repository. (No request body required).
- **`POST /api/webhook/:id`**: Endpoint for Git provider webhooks (GitHub, GitLab, Bitbucket). Securely validates payload signatures (HMAC-SHA256) or tokens, triggering a fast, incremental background re-index on push events. The request body should be the standard JSON webhook payload sent by the Git provider.
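
The signature check mentioned above can be sketched in Python as follows. GitHub signs the raw request body with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex>`, while GitLab sends the shared secret itself in `X-Gitlab-Token`. The function names are hypothetical and the code is not taken from the `knot-server` source; it only illustrates the general technique, using a constant-time comparison.

```python
import hashlib
import hmac


def github_signature(secret: str, body: bytes) -> str:
    """Return the value GitHub would send in X-Hub-Signature-256."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_github(secret: str, body: bytes, header_value: str) -> bool:
    """Compare the received header against the expected signature."""
    expected = github_signature(secret, body)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected.encode(), (header_value or "").encode())


def verify_token(secret: str, header_value: str) -> bool:
    """Token-style check, e.g. for GitLab's X-Gitlab-Token header."""
    return hmac.compare_digest(secret.encode(), (header_value or "").encode())


body = b'{"ref": "refs/heads/master"}'
sig = github_signature("your-secret-token", body)
print(verify_github("your-secret-token", body, sig))         # matching body
print(verify_github("your-secret-token", body + b" ", sig))  # tampered body
```

The signature must be computed over the exact bytes received, before any JSON parsing, because re-serialising the payload can change whitespace and key order.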

### πŸ” Code Intelligence Search
- **`GET /api/repos/:id/search?q=...`**: Semantic + structural search. Find code by meaning, class name, method signature, or docstrings.
- **`GET /api/repos/:id/callers?entity=...`**: Reverse dependency lookup. Identify callers, dead code, and perform impact analysis.
- **`GET /api/repos/:id/explore?path=...`**: File anatomy inspection. Quickly see all classes, interfaces, methods, and functions in a specific file.
- **`GET /api/repos/:id/deps`**: View repository dependencies (transitive and reverse) across the indexed ecosystem.
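
For clients that build these URLs programmatically, a small Python sketch is shown below. The helper `endpoint_url` is hypothetical; it only percent-encodes the repository id and query parameters with the standard library, and the base URL assumes the default port of a local server.

```python
from urllib.parse import quote, urlencode


def endpoint_url(base_url, repo_id, endpoint, **params):
    """Build a URL such as /api/repos/<id>/search?q=... (hypothetical helper)."""
    route = f"/api/repos/{quote(repo_id, safe='')}/{endpoint}"
    # urlencode escapes reserved characters and turns spaces into "+".
    query = f"?{urlencode(params)}" if params else ""
    return base_url.rstrip("/") + route + query


base = "http://localhost:3000"
print(endpoint_url(base, "knot-core", "search", q="webhook validation"))
print(endpoint_url(base, "knot-core", "callers", entity="handleRequest"))
print(endpoint_url(base, "knot-core", "explore", path="src/file.rs"))
print(endpoint_url(base, "knot-core", "deps"))
```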

### βš™οΈ Cluster & Health
- **`GET /api/health`**: Check the health of the server, including connections to Qdrant and Neo4j, and view repository statistics.
- **Distributed Locking**: File-based locking (`.knot.lock`) allows multiple `knot-server` instances to share a single NFS/EFS workspace, ensuring only one instance indexes a given repository at a time.
- **Background Scheduler**: Automatically detects and cleans up stale locks, and periodically re-indexes repositories that haven't been synced recently.
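
The general idea behind a file-based lock with stale-lock detection can be sketched in Python as follows. This is not the `knot-server` implementation: the lock file contents, the use of the file's modification time for staleness, and all function names are assumptions. Exclusive creation is atomic on local filesystems; on NFS it depends on the protocol version and client, so treat this as an illustration only.

```python
import os
import tempfile
import time
from typing import Optional


def try_acquire(lock_path: str) -> bool:
    """Atomically create the lock file; return False if it already exists."""
    try:
        # O_CREAT | O_EXCL fails if the file exists, so only one caller wins.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        # Assumption: owner pid and timestamp are recorded for debugging only.
        f.write(f"{os.getpid()} {int(time.time())}\n")
    return True


def is_stale(lock_path: str, max_age_secs: float,
             now: Optional[float] = None) -> bool:
    """Treat a lock as stale when its mtime is older than max_age_secs."""
    now = time.time() if now is None else now
    try:
        return now - os.path.getmtime(lock_path) > max_age_secs
    except FileNotFoundError:
        return False


def release(lock_path: str) -> None:
    """Remove the lock file; ignore the case where it is already gone."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass


with tempfile.TemporaryDirectory() as repo_dir:
    lock = os.path.join(repo_dir, ".knot.lock")
    print(try_acquire(lock))  # first instance gets the lock
    print(try_acquire(lock))  # second instance is refused
    release(lock)
    print(try_acquire(lock))  # lock is free again
```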

---

## 🤖 Using with AI Assistants (Cursor, Copilot, Claude, Gemini)

`knot-server` transforms any LLM with terminal access (Cursor, GitHub Copilot,
Claude Code, Gemini CLI, opencode, Cline, Aider) into a codebase-aware engineer.
By teaching the LLM to call the REST API via `curl`, you give it **semantic
understanding** of your entire codebase, far beyond what `grep` or file embeddings
can provide.

The AI learns **four code intelligence skills** that replace traditional text search:

| # | Skill | Endpoint | Use Case |
|---|-------|----------|----------|
| 1 | **Semantic Search** | `/search?q=` | Find code by *meaning*, not exact text |
| 2 | **Callers Analysis** | `/callers?entity=` | Impact analysis: who uses this function? |
| 3 | **File Exploration** | `/explore?path=` | Get a file's structure without reading it |
| 4 | **Dependency Graph** | `/deps` | Cross-repo dependencies |

These skills teach the LLM to **always prefer knot-server `curl` calls over
`grep`/`find`/`rg`** for code exploration, dramatically improving accuracy
and reducing hallucinations.

### Quick Install: One Command Per IDE

Download the pre-built skill instructions directly into your project:

**Cursor** (writes to `.cursorrules`):
```bash
curl -sL https://raw.githubusercontent.com/raultov/knot-server/master/skills/cursor-rules.md >> .cursorrules
```

**GitHub Copilot** (writes to `.github/copilot-instructions.md`):
```bash
mkdir -p .github && curl -sL https://raw.githubusercontent.com/raultov/knot-server/master/skills/copilot-instructions.md >> .github/copilot-instructions.md
```

**Claude Code / Gemini CLI / opencode / Cline / Aider** (generic system prompt):
```bash
curl -sL https://raw.githubusercontent.com/raultov/knot-server/master/skills/system-prompt.md > knot-skills.md
# Then instruct your agent: "Read knot-skills.md and use those tools"
```

### How It Works

Each skill file injects a **system prompt** into the LLM that defines:

- **When** to use each endpoint (trigger phrases)
- **How** to construct the `curl` command (parameters, `jq` filters)
- **How** to interpret the JSON response (field meanings)

The LLM learns to:
1. Instead of `grep "authenticate"`, call `GET /api/repos/{id}/search?q=authentication+logic`
2. Instead of searching for callers manually, call `GET /api/repos/{id}/callers?entity=handleRequest`
3. Instead of `cat src/file.rs`, call `GET /api/repos/{id}/explore?path=src/file.rs` to get the outline first
4. Before breaking a shared library, call `GET /api/repos/{id}/deps` to see the impact

### Example: AI-Assisted Code Exploration

```
User: "Where is the password hashing logic?"
AI (via knot-server):
  curl "/api/repos/myproject/search?q=password+hashing" | jq
  → Found `hash_password` in `src/auth/crypto.rs:142`
  → Reads only lines 142-180 instead of entire file
```

---

## πŸ› οΈ Installation

### Prerequisites

| Component    | Version | Notes                         |
|--------------|---------|-------------------------------|
| Docker       | 20.10+  | For running Qdrant and Neo4j  |
| Qdrant       | 1.x     | Vector database (Docker)      |
| Neo4j        | 5.x     | Graph database (Docker)       |

### 🐳 Official Docker Image

The official Docker image is available on Docker Hub:
**[`raultov/knot-server:latest`](https://hub.docker.com/r/raultov/knot-server)**

This image is lightweight (`debian:trixie-slim` based) and comes pre-packaged
with the `knot-server` binary, `git`, and SSH clients, which is everything needed
to clone and index repositories. It is the recommended way to deploy `knot-server`
in containerized environments (Docker, Docker Compose, or Kubernetes).

### Option A: Quick Install (curl)

A single command that auto-detects your OS and architecture, with no `sudo` or
manual platform selection needed:

```bash
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/raultov/knot-server/releases/latest/download/knot-server-installer.sh | sh
```

For a specific version, replace `latest` with the version tag:
```bash
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/raultov/knot-server/releases/download/v0.1.4/knot-server-installer.sh | sh
```

### Option B: Docker Compose (Pre-built Image)

The easiest way to run `knot-server` with its dependencies. Just download the
`docker-compose.yml` file and run:

```bash
curl -O https://raw.githubusercontent.com/raultov/knot-server/master/docker-compose.yml
docker compose up
```

This pulls the pre-built [`raultov/knot-server`](https://hub.docker.com/r/raultov/knot-server)
image from Docker Hub along with Qdrant and Neo4j, so no compilation is needed.

```yaml
services:
  knot-server:
    image: raultov/knot-server:latest
    ports:
      - "3000:3000"
    environment:
      - KNOT_WORKSPACE_DIR=/var/lib/knot/repos
      - KNOT_SERVER_QDRANT_URL=http://qdrant:6334
      - KNOT_SERVER_NEO4J_URI=bolt://neo4j:7687
      - KNOT_SERVER_NEO4J_USER=neo4j
      - KNOT_NEO4J_PASSWORD=knotsecret
    volumes:
      - knot_workspace:/var/lib/knot/repos
      - ${HOME}/.ssh:/root/.ssh:ro
    depends_on:
      qdrant:
        condition: service_started
      neo4j:
        condition: service_started

  qdrant:
    image: qdrant/qdrant:v1.16.2
    volumes:
      - qdrant_data:/qdrant/storage

  neo4j:
    image: neo4j:5.26-community
    environment:
      - NEO4J_AUTH=neo4j/knotsecret
      - NEO4J_PLUGINS=["apoc"]
    volumes:
      - neo4j_data:/data

volumes:
  knot_workspace:
  qdrant_data:
  neo4j_data:
```

### Option C: Build from Source

```bash
git clone https://github.com/raultov/knot-server
cd knot-server
cargo build --release
```

---

## βš™οΈ Configuration

`knot-server` is configured entirely via environment variables or CLI flags.

| Environment Variable | Default Value | Description |
|----------------------|---------------|-------------|
| `KNOT_SERVER_PORT` | `3000` | Port the REST API binds to |
| `KNOT_SERVER_BIND_ADDR` | `0.0.0.0` | Address the server binds to |
| `KNOT_WORKSPACE_DIR` | `/var/lib/knot/repos` | Directory where Git repos are cloned & locks are managed. Ensure the user running the server has write access (e.g., `export KNOT_WORKSPACE_DIR=$HOME/.knot/repos`). |
| `KNOT_SERVER_QDRANT_URL` | `http://localhost:6334` | URL to the Qdrant instance |
| `KNOT_SERVER_QDRANT_COLLECTION`| `knot_entities` | Qdrant collection name |
| `KNOT_SERVER_NEO4J_URI` | `bolt://localhost:7687` | URI to the Neo4j instance |
| `KNOT_SERVER_NEO4J_USER` | `neo4j` | Neo4j username |
| `KNOT_NEO4J_PASSWORD` | *(required)* | Neo4j password |
| `KNOT_SERVER_EMBED_DIM` | `384` | Embedding dimension (must match the model) |
| `KNOT_SERVER_RAYON_THREADS`| *(logical cores - 1)* | Number of threads for parallel parsing |
| `KNOT_SERVER_POLL_INTERVAL_SECS` | `86400` (24h) | How often the background scheduler runs |
| `KNOT_SERVER_MAX_INDEX_AGE_SECS` | `86400` (24h) | Age before a repository is automatically re-indexed |
| `KNOT_SERVER_QUEUE_CAPACITY` | `16` | Maximum number of jobs in the background indexing queue. Returns `429 Too Many Requests` when full. |
| `RUST_LOG` | `info` | Log level (`debug`, `info`, `warn`, `error`) |
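
A deployment script may want to validate its environment against this table before starting the server. The Python sketch below mirrors a subset of the documented variables and defaults; it is not the server's own configuration parser, and `load_config` and `needs_reindex` are hypothetical names. The comparison in `needs_reindex` is one plausible reading of `KNOT_SERVER_MAX_INDEX_AGE_SECS`.

```python
import os


def load_config(env=None):
    """Read settings from a mapping, applying the documented defaults."""
    env = os.environ if env is None else env
    password = env.get("KNOT_NEO4J_PASSWORD")
    if not password:
        raise ValueError("KNOT_NEO4J_PASSWORD is required")
    cores = os.cpu_count() or 1
    return {
        "port": int(env.get("KNOT_SERVER_PORT", "3000")),
        "bind_addr": env.get("KNOT_SERVER_BIND_ADDR", "0.0.0.0"),
        "workspace_dir": env.get("KNOT_WORKSPACE_DIR", "/var/lib/knot/repos"),
        "qdrant_url": env.get("KNOT_SERVER_QDRANT_URL", "http://localhost:6334"),
        "neo4j_uri": env.get("KNOT_SERVER_NEO4J_URI", "bolt://localhost:7687"),
        "neo4j_user": env.get("KNOT_SERVER_NEO4J_USER", "neo4j"),
        "neo4j_password": password,
        "embed_dim": int(env.get("KNOT_SERVER_EMBED_DIM", "384")),
        # Documented default is logical cores minus one; the floor of 1
        # on single-core machines is an assumption.
        "rayon_threads": int(env.get("KNOT_SERVER_RAYON_THREADS", max(cores - 1, 1))),
        "max_index_age_secs": int(env.get("KNOT_SERVER_MAX_INDEX_AGE_SECS", "86400")),
        "queue_capacity": int(env.get("KNOT_SERVER_QUEUE_CAPACITY", "16")),
    }


def needs_reindex(last_indexed_ts, now_ts, max_age_secs):
    """True when a repo was never indexed or its index is older than max_age_secs."""
    return last_indexed_ts is None or now_ts - last_indexed_ts > max_age_secs


cfg = load_config({"KNOT_NEO4J_PASSWORD": "mysecret", "KNOT_SERVER_PORT": "8080"})
print(cfg["port"], cfg["queue_capacity"])
print(needs_reindex(0, 90000, cfg["max_index_age_secs"]))
```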

---

## 🔄 Example Workflow

Here is an end-to-end example of managing a repository with `knot-server` using `curl`:

**1. Start the server**
```bash
export KNOT_WORKSPACE_DIR=$HOME/.knot/repos
export KNOT_NEO4J_PASSWORD=mysecret
export KNOT_SERVER_QDRANT_URL=http://localhost:6334
export KNOT_SERVER_NEO4J_URI=bolt://localhost:7687
knot-server
```

**2. Register a repository**
```bash
curl -X POST http://localhost:3000/api/repos \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://github.com/raultov/knot.git",
    "name": "knot-core",
    "branch": "master",
    "webhook_secret": "my-webhook-secret"
  }'
```
*The server starts cloning the repository and queues it for indexing.*

**3. Check indexing status**
```bash
curl http://localhost:3000/api/repos/knot-core
```
*Wait until `"status": "idle"`.*
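
In a script, this wait can be automated with a small polling loop. The Python sketch below assumes the response is a JSON object with a top-level `"status"` field holding one of the states listed for `GET /api/repos`; the function names, interval, and timeout are illustrative choices.

```python
import json
import time
import urllib.request


def fetch_status(url):
    """GET the repository resource and return its "status" field."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["status"]


def wait_until_idle(get_status, interval=5.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns "idle"; raise on "error" or timeout."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == "idle":
            return status
        if status == "error":
            raise RuntimeError("repository entered the error state")
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout}s")
        sleep(interval)


# Offline demonstration with a canned status sequence instead of a live server.
statuses = iter(["cloning", "indexing", "idle"])
print(wait_until_idle(lambda: next(statuses), sleep=lambda _: None))
```

Against a running server, pass `lambda: fetch_status("http://localhost:3000/api/repos/knot-core")` as `get_status`.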

**4. Perform a semantic search**
```bash
curl "http://localhost:3000/api/repos/knot-core/search?q=webhook+validation"
```

**5. Trigger manual re-index (Sync)**
```bash
curl -X POST http://localhost:3000/api/repos/knot-core/sync
```
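
Because the indexing queue is bounded and the server answers `429 Too Many Requests` when it is full, a client may want to retry the sync request with a growing delay. The Python sketch below shows one way to do that; the function names, attempt count, and delays are assumptions, and the `202` in the demonstration is a stand-in success code, not a documented response.

```python
import time
import urllib.error
import urllib.request


def post_sync(url):
    """POST to the sync endpoint and return the HTTP status code."""
    req = urllib.request.Request(url, data=b"", method="POST")
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def sync_with_retry(post, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call post() until it stops returning 429, doubling the delay each time."""
    for attempt in range(max_attempts):
        status = post()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return 429


# Offline demonstration: the queue is "full" twice, then the job is accepted.
responses = iter([429, 429, 202])
delays = []
print(sync_with_retry(lambda: next(responses), sleep=delays.append), delays)
```

Against a running server, pass `lambda: post_sync("http://localhost:3000/api/repos/knot-core/sync")` as `post`.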

**6. Set up Git webhooks**

In your GitHub/GitLab repository settings, add a webhook pointing to:
`http://your-server.com/api/webhook/knot-core`

Set the **secret/token** to the same `webhook_secret` value you used when registering
the repository. Whenever a push occurs, `knot-server` validates the signature and
automatically performs a fast incremental update.

---

## 🚀 Cluster & High-Availability Deployment

`knot-server` is designed to run in horizontal scale-out clusters. Multiple instances
share a common workspace directory (NFS, EFS, or Kubernetes RWX PVC) and coordinate
via file-based locks, so no distributed consensus protocol is required.

### Docker Compose (Multi-Instance)

```yaml
services:
  knot-server:
    image: raultov/knot-server:latest
    environment:
      - KNOT_WORKSPACE_DIR=/var/lib/knot/repos
      - KNOT_SERVER_QDRANT_URL=http://qdrant:6334
      - KNOT_SERVER_NEO4J_URI=bolt://neo4j:7687
      - KNOT_SERVER_NEO4J_USER=neo4j
      - KNOT_NEO4J_PASSWORD=your-secure-password
    volumes:
      - knot_shared_workspace:/var/lib/knot/repos
      - ~/.ssh:/root/.ssh:ro
    deploy:
      replicas: 3
    depends_on:
      - qdrant
      - neo4j

  qdrant:
    image: qdrant/qdrant:latest
    volumes:
      - qdrant_data:/qdrant/storage

  neo4j:
    image: neo4j:5
    environment:
      - NEO4J_AUTH=neo4j/your-secure-password
    volumes:
      - neo4j_data:/data

volumes:
  knot_shared_workspace:
    driver: local
  qdrant_data:
  neo4j_data:
```

### Kubernetes

You can deploy the official `raultov/knot-server:latest` image to Kubernetes
with a standard `Deployment`.

> **Reference:** The included [`docker-compose.yml`](docker-compose.yml) file is
> the canonical reference for configuring `knot-server`. It documents the exact
> environment variables, service dependencies (Qdrant + Neo4j), and volume
> mounts you need to translate into Kubernetes Deployments, Services, and
> ConfigMaps.

In Kubernetes, the key requirement for horizontal scaling is a
`PersistentVolumeClaim` with **`accessModes: [ReadWriteMany]`** (RWX). This
allows all knot-server Pods to share the workspace and coordinate safely.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: knot-shared-workspace
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
  # storageClassName: nfs-client  # or efs-sc, cephfs, etc.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: knot-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: knot-server
  template:
    metadata:
      labels:
        app: knot-server
    spec:
      containers:
        - name: knot-server
          image: raultov/knot-server:latest
          ports:
            - containerPort: 3000
          env:
            - name: KNOT_WORKSPACE_DIR
              value: /var/lib/knot/repos
            - name: KNOT_SERVER_QDRANT_URL
              value: http://qdrant.default.svc.cluster.local:6334
            - name: KNOT_SERVER_NEO4J_URI
              value: bolt://neo4j.default.svc.cluster.local:7687
            - name: KNOT_SERVER_NEO4J_USER
              value: neo4j
            - name: KNOT_NEO4J_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: knot-secrets
                  key: neo4j-password
          volumeMounts:
            - name: shared-workspace
              mountPath: /var/lib/knot/repos
      volumes:
        - name: shared-workspace
          persistentVolumeClaim:
            claimName: knot-shared-workspace
```

Any Pod can receive webhook events or sync requests; the shared workspace
(`repos.json`, `.knot.lock` files) ensures that only one Pod processes a given
repository at a time.

---

## 📜 License

This project is licensed under the **MIT License**. See [LICENSE](LICENSE) for details.