graphrag-cli 0.1.0

Modern Terminal User Interface (TUI) for GraphRAG operations
# GraphRAG CLI - Workspace Management Commands

## Overview

The GraphRAG CLI includes workspace management commands for saving, loading, and managing multiple knowledge graphs. They let you:

- **Save your work**: Persist complete knowledge graphs to disk
- **Reuse graphs**: Load previously saved graphs instantly
- **Organize projects**: Manage multiple workspaces for different projects
- **Share graphs**: Move knowledge graphs between systems by copying workspace directories

## Workspace Commands

### 1. List Available Workspaces

```bash
/workspace list
# or shorthand:
/ws list
/ws ls
```

**Output:**
```
📁 Available Workspaces (2 total):

1. tom_sawyer (945.00 KB)
   Entities: 7, Relationships: 6, Documents: 1, Chunks: 435
   Created: 2025-10-16 13:38:41

2. symposium (234.50 KB)
   Entities: 12, Relationships: 15, Documents: 1, Chunks: 187
   Created: 2025-10-15 09:22:18
```

### 2. Save Current Graph to Workspace

```bash
/workspace save <name>
# or shorthand:
/ws save <name>
```

**Example:**
```bash
/workspace save my_project
```

**Requirements:**
- GraphRAG must be initialized (`/config` loaded)
- Knowledge graph must be built (`/load` executed)

**What Gets Saved:**
- All entities with mentions and metadata
- All relationships with context
- All documents with full content
- All chunks with full text
- Workspace metadata (timestamps, counts, version)

### 3. Load Graph from Workspace

```bash
/workspace <name>
# or shorthand:
/ws <name>
```

**Example:**
```bash
/workspace tom_sawyer
```

**Effect:**
- Replaces current knowledge graph with the loaded one
- Preserves configuration (LLM settings, embeddings, etc.)
- Updates statistics in the info panel
- Ready to query immediately

### 4. Delete a Workspace

```bash
/workspace delete <name>
# or shorthand:
/ws delete <name>
/ws del <name>
/ws rm <name>
```

**Example:**
```bash
/workspace delete old_project
```

**Warning:** This action is permanent and cannot be undone!

## Typical Workflow

### Creating a New Workspace

```bash
# 1. Load configuration
/config config/templates/semantic.graphrag.json5

# 2. Load documents and build graph
/load docs-example/The_Adventures_of_Tom_Sawyer.txt

# 3. Wait for graph to build...
# (Entities extracted, relationships created)

# 4. Save to workspace
/workspace save tom_sawyer

# 5. Query the graph
Who is Tom Sawyer's best friend?
```

### Loading an Existing Workspace

```bash
# 1. Load configuration (still needed for LLM/embeddings)
/config config/templates/semantic.graphrag.json5

# 2. List available workspaces
/workspace list

# 3. Load a workspace
/workspace tom_sawyer

# 4. Query immediately (no need to rebuild!)
What are the main themes in Tom Sawyer?
```

### Managing Multiple Projects

```bash
# Project 1: Tom Sawyer
/config config/templates/narrative_fiction.graphrag.json5
/load docs-example/The_Adventures_of_Tom_Sawyer.txt
/workspace save literature_tom_sawyer

# Project 2: Technical documentation
/config config/templates/technical_documentation.graphrag.json5
/load docs/API_Reference.md
/workspace save tech_api_docs

# Switch between projects
/workspace literature_tom_sawyer
# ... query Tom Sawyer ...

/workspace tech_api_docs
# ... query API docs ...
```

## Storage Location

Workspaces are stored in:
- **Linux/Mac**: `~/.local/share/graphrag/workspaces/`
- **Windows**: `%APPDATA%\graphrag\workspaces\`
- **Fallback**: `./workspaces/` (current directory)

Each workspace is a directory containing:
```
~/.local/share/graphrag/workspaces/
├── tom_sawyer/
│   ├── graph.json      # Complete knowledge graph
│   └── metadata.toml   # Workspace metadata
└── symposium/
    ├── graph.json
    └── metadata.toml
```
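The layout above can be inspected from outside the CLI. The following is a minimal Python sketch, not part of graphrag-cli: `find_workspaces` is a hypothetical helper that treats any subdirectory containing both `graph.json` and `metadata.toml` as a workspace. It assumes the Linux/Mac default path; adjust the root for Windows or the `./workspaces/` fallback.

```python
from pathlib import Path


def find_workspaces(root: Path) -> list[str]:
    """Return sorted names of subdirectories of `root` that look like workspaces.

    Assumption: a workspace is a directory holding both files shown in the
    layout above (graph.json and metadata.toml).
    """
    if not root.is_dir():
        return []
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir()
        and (entry / "graph.json").is_file()
        and (entry / "metadata.toml").is_file()
    )


if __name__ == "__main__":
    # Linux/Mac default location taken from this section.
    default_root = Path.home() / ".local" / "share" / "graphrag" / "workspaces"
    for name in find_workspaces(default_root):
        print(name)
```

This only reads directory entries, so it is safe to run while the CLI is open.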

## File Format

### graph.json

Complete JSON representation of the knowledge graph:
- **Entities**: All extracted entities with full details
- **Relationships**: All relationships between entities
- **Chunks**: Full text content of all chunks
- **Documents**: Full text content of all documents
- **Metadata**: Entity mentions, confidence scores, offsets

### metadata.toml

Workspace information:
```toml
name = "tom_sawyer"
created_at = "2025-10-16T13:38:41.934990424Z"
modified_at = "2025-10-16T13:38:41.951588339Z"
entity_count = 7
relationship_count = 6
document_count = 1
chunk_count = 435
format_version = "1.0"
```

## Performance

### Save Performance

- **Small graphs** (<1K entities): <50ms
- **Medium graphs** (1K-10K entities): 50-500ms
- **Large graphs** (10K-100K entities): 500ms-5s

### Load Performance

- **Small graphs**: <50ms
- **Medium graphs**: 50-500ms
- **Large graphs**: 500ms-5s

**Note:** Loading from a workspace skips entity extraction and LLM calls, so it is **much faster** than rebuilding the graph from scratch.

## Storage Requirements

Approximate sizes:
- **Entities**: ~500 bytes each (with mentions)
- **Relationships**: ~200 bytes each
- **Chunks**: ~1KB each (full content)
- **Documents**: Full content size + metadata

**Example:** Tom Sawyer workspace (434 KB source text)
- 7 entities, 6 relationships, 435 chunks
- Total size: 945 KB (2.2x source size)
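The per-item figures above can be combined into a rough size estimate. This is an illustrative Python sketch: `estimate_workspace_bytes` is a hypothetical helper, and treating "~1KB" as 1024 bytes is an assumption. For the Tom Sawyer numbers it yields roughly 874 KB against the reported 945 KB; the remainder is presumably JSON structure and metadata, which the approximate figures do not cover.

```python
# Approximate per-item sizes from the list above (assumption: 1 KB = 1024 bytes).
ENTITY_BYTES = 500
RELATIONSHIP_BYTES = 200
CHUNK_BYTES = 1024


def estimate_workspace_bytes(
    entities: int, relationships: int, chunks: int, document_bytes: int
) -> int:
    """Rough lower-bound estimate of workspace size on disk, in bytes."""
    return (
        entities * ENTITY_BYTES
        + relationships * RELATIONSHIP_BYTES
        + chunks * CHUNK_BYTES
        + document_bytes
    )


if __name__ == "__main__":
    # Tom Sawyer example: 7 entities, 6 relationships, 435 chunks, 434 KB source.
    estimate = estimate_workspace_bytes(7, 6, 435, 434 * 1024)
    print(f"{estimate / 1024:.0f} KB estimated")
```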

## Best Practices

### 1. Save After Building

Always save your graph after successfully building it:
```bash
/load large_document.txt
# ... wait for build to complete ...
/stats  # Verify graph is built
/workspace save my_workspace  # Save it!
```

### 2. Use Descriptive Names

Use clear, descriptive workspace names:
- ✅ Good: `literature_tom_sawyer`, `tech_api_v2`, `research_quantum_physics`
- ❌ Bad: `test`, `tmp`, `graph1`

### 3. List Before Loading

Always list workspaces to see what's available:
```bash
/workspace list
# Check available workspaces and their stats
/workspace tom_sawyer
```

### 4. Delete Old Workspaces

Clean up unused workspaces to save disk space:
```bash
/workspace list
/workspace delete old_experiment
```

### 5. Backup Important Workspaces

Workspaces are plain directories, so you can back them up or share them with standard tools:
```bash
# Backup
tar -czf tom_sawyer_backup.tar.gz ~/.local/share/graphrag/workspaces/tom_sawyer/

# Share
scp -r ~/.local/share/graphrag/workspaces/tom_sawyer/ user@remote:~/
```

## Limitations

### Current Limitations

1. **Embeddings Not Saved**
   - Vector embeddings are regenerated on load
   - This is temporary until LanceDB integration is unblocked
   - Impact: First query after load may be slightly slower

2. **Configuration Not Saved**
   - You must still load config with `/config`
   - Workspace only stores the graph data
   - Future: May include config snapshot

3. **JSON Format**
   - Currently uses JSON for maximum compatibility
   - Future: Will use Parquet for better compression
   - Current approach is human-readable for debugging

### Future Enhancements

1. **Compression** (Planned)
   - Add gzip compression for JSON
   - Expected 5x size reduction
   - Transparent to users

2. **Parquet Storage** (Blocked)
   - Columnar storage for entities/relationships
   - Much better compression (~10x)
   - Faster queries for large graphs
   - Waiting for upstream dependency fix

3. **LanceDB Vector Storage** (Blocked)
   - Persist embeddings for instant queries
   - Hybrid retrieval acceleration
   - Waiting for version conflict resolution

4. **Auto-Save** (Planned)
   - Automatic workspace saves
   - Configurable save intervals
   - Crash recovery

5. **Workspace Metadata** (Planned)
   - Descriptions and tags
   - Creation/modification history
   - Related workspaces linking
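To get a feel for what the planned gzip compression might save on your own data, you can measure the ratio yourself. The Python sketch below is not the planned implementation: `gzip_ratio` is a hypothetical helper, and the payload is synthetic, so the ratio it prints says nothing about the expected 5x figure. Pass the bytes of a real `graph.json` for a meaningful number.

```python
import gzip
import json


def gzip_ratio(payload: bytes) -> float:
    """Return original size divided by gzip-compressed size."""
    compressed = gzip.compress(payload)
    return len(payload) / len(compressed)


if __name__ == "__main__":
    # Synthetic stand-in for graph.json; real ratios depend on the actual data.
    fake_graph = {
        "chunks": [
            {"id": i, "text": "Tom and Huck went down to the river. " * 8}
            for i in range(200)
        ]
    }
    payload = json.dumps(fake_graph).encode("utf-8")
    print(f"{gzip_ratio(payload):.1f}x smaller")
```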

## Troubleshooting

### "GraphRAG not initialized"

**Problem:** Trying to save/load without loading config first

**Solution:**
```bash
/config config/templates/semantic.graphrag.json5
/workspace save my_workspace  # Now works!
```

### "No knowledge graph to save"

**Problem:** Trying to save before building the graph

**Solution:**
```bash
/load your_document.txt  # Build the graph first
/workspace save my_workspace  # Now works!
```

### "Workspace not found"

**Problem:** Trying to load a non-existent workspace

**Solution:**
```bash
/workspace list  # See available workspaces
/workspace <correct_name>
```

### "Failed to load workspace"

**Problem:** Corrupted workspace files or permission issues

**Solutions:**
1. Check file permissions
2. Verify JSON is valid
3. Check disk space
4. Delete and recreate the workspace
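The first two checks can be scripted. The following Python sketch is a hypothetical diagnostic, not a CLI feature: `check_workspace` assumes the two-file layout described under Storage Location and only confirms that both files exist, are readable, and that `graph.json` parses as JSON. It does not validate the graph schema.

```python
import json
import os
from pathlib import Path


def check_workspace(path: Path) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for filename in ("graph.json", "metadata.toml"):
        file_path = path / filename
        if not file_path.is_file():
            problems.append(f"{filename}: missing")
        elif not os.access(file_path, os.R_OK):
            problems.append(f"{filename}: not readable")

    graph = path / "graph.json"
    if graph.is_file() and os.access(graph, os.R_OK):
        try:
            with graph.open(encoding="utf-8") as handle:
                json.load(handle)
        except ValueError as error:  # covers JSON and UTF-8 decoding errors
            problems.append(f"graph.json: invalid JSON ({error})")
    return problems


if __name__ == "__main__":
    # Example path using the Linux/Mac default and the tom_sawyer workspace.
    workspace = Path.home() / ".local/share/graphrag/workspaces/tom_sawyer"
    for problem in check_workspace(workspace) or ["no problems found"]:
        print(problem)
```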

## Examples

### Example 1: Literature Analysis

```bash
# Setup
/config config/templates/narrative_fiction.graphrag.json5

# Load multiple books
/load books/Tom_Sawyer.txt
/workspace save literature_tom_sawyer

/clear
/load books/Huckleberry_Finn.txt --rebuild
/workspace save literature_huck_finn

# Compare analyses
/workspace literature_tom_sawyer
Who are the main characters?

/workspace literature_huck_finn
Who are the main characters?
```

### Example 2: Technical Documentation

```bash
# Setup
/config config/templates/technical_documentation.graphrag.json5

# Build knowledge base
/load docs/API_Reference.md
/load docs/User_Guide.md
/load docs/Architecture.md
/workspace save tech_docs_v1

# Later: Load and query
/config config/templates/technical_documentation.graphrag.json5
/workspace tech_docs_v1
How do I authenticate API requests?
```

### Example 3: Research Papers

```bash
# Setup
/config config/templates/academic_research.graphrag.json5

# Process research paper
/load papers/Quantum_Computing_Review.pdf
/workspace save research_quantum_2025

# Analyze
/workspace research_quantum_2025
What are the key challenges in quantum computing?
What algorithms were discussed?
```

## Summary

Workspace management in GraphRAG CLI provides:

- ✅ **Persistence**: Save complete knowledge graphs to disk
- ✅ **Fast Loading**: Load saved graphs without rebuilding them
- ✅ **Organization**: Manage multiple projects with ease
- ✅ **Portability**: Share and back up workspaces
- ✅ **Reliability**: Entities, relationships, documents, and chunks are saved with their full content

Use `/workspace list`, `/workspace save <name>`, and `/workspace <name>` to work efficiently with multiple knowledge graphs!