ggen 1.2.0

ggen is a deterministic, language-agnostic code generation framework that treats software artifacts as projections of knowledge graphs.
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents**

- [ggen Codebase Analysis: Patterns for Cookbook](#ggen-codebase-analysis-patterns-for-cookbook)
  - [Executive Summary](#executive-summary)
  - [1. PROJECT ARCHITECTURE](#1-project-architecture)
    - [Module Structure](#module-structure)
    - [Key Design Patterns](#key-design-patterns)
  - [2. TEMPLATE SYSTEM PATTERNS](#2-template-system-patterns)
    - [Frontmatter Structure](#frontmatter-structure)
    - [Template Features](#template-features)
    - [Injection Patterns](#injection-patterns)
  - [3. RDF/SPARQL PATTERNS](#3-rdfsparql-patterns)
    - [Graph Loading Strategies](#graph-loading-strategies)
    - [SPARQL Query Patterns](#sparql-query-patterns)
    - [Template Function Usage](#template-function-usage)
  - [4. GPACK (PACKAGE) SYSTEM](#4-gpack-package-system)
    - [Manifest Structure (`gpack.toml`)](#manifest-structure-gpacktoml)
    - [Convention-Based Discovery](#convention-based-discovery)
  - [5. CLI COMMAND PATTERNS](#5-cli-command-patterns)
    - [Core Commands](#core-commands)
    - [Template Reference Formats](#template-reference-formats)
  - [6. DETERMINISM & REPRODUCIBILITY](#6-determinism--reproducibility)
    - [Deterministic Features](#deterministic-features)
    - [Tracing & Observability](#tracing--observability)
  - [7. MULTI-LANGUAGE SUPPORT](#7-multi-language-support)
    - [Language-Specific Templates](#language-specific-templates)
    - [Tera Filters & Functions](#tera-filters--functions)
  - [8. TESTING PATTERNS](#8-testing-patterns)
    - [BDD Test Structure](#bdd-test-structure)
    - [Integration Test Patterns](#integration-test-patterns)
  - [9. COMMON WORKFLOWS](#9-common-workflows)
    - [Workflow 1: CLI Subcommand Generation](#workflow-1-cli-subcommand-generation)
    - [Workflow 2: API Endpoint Generation](#workflow-2-api-endpoint-generation)
    - [Workflow 3: Database Schema to Structs](#workflow-3-database-schema-to-structs)
  - [10. KEY COOKBOOK RECIPES TO DOCUMENT](#10-key-cookbook-recipes-to-document)
    - [1. **Quick Start: Your First Template**](#1-quick-start-your-first-template)
    - [2. **Template Injection Patterns**](#2-template-injection-patterns)
    - [3. **Working with RDF Graphs**](#3-working-with-rdf-graphs)
    - [4. **SPARQL-Driven Generation**](#4-sparql-driven-generation)
    - [5. **Creating a Gpack**](#5-creating-a-gpack)
    - [6. **Multi-Language Templates**](#6-multi-language-templates)
    - [7. **Deterministic Builds**](#7-deterministic-builds)
    - [8. **CLI Scaffolding Recipe**](#8-cli-scaffolding-recipe)
    - [9. **Code Generation from OpenAPI**](#9-code-generation-from-openapi)
    - [10. **Database-First Development**](#10-database-first-development)
  - [11. BEST PRACTICES IDENTIFIED](#11-best-practices-identified)
    - [Template Design](#template-design)
    - [RDF/SPARQL Usage](#rdfsparql-usage)
    - [Gpack Distribution](#gpack-distribution)
  - [12. ARCHITECTURAL INSIGHTS](#12-architectural-insights)
    - [Performance Optimizations](#performance-optimizations)
    - [Security Considerations](#security-considerations)
    - [Extensibility Points](#extensibility-points)
  - [CONCLUSION](#conclusion)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

# ggen Codebase Analysis: Patterns for Cookbook

**Date:** 2025-10-09
**Analyst:** CodebaseAnalyst Agent
**Purpose:** Identify key patterns, use cases, and best practices for ggen cookbook documentation

## Executive Summary

ggen is a **deterministic, language-agnostic code generation framework** that treats software artifacts as projections of knowledge graphs. Built in Rust, it combines:
- **Template processing** (Tera engine with YAML frontmatter)
- **RDF/SPARQL graph operations** (Oxigraph store)
- **Package management** (gpack registry system)
- **CLI tooling** (clap-based commands)

**Core Innovation:** Templates can embed RDF data and SPARQL queries to drive generation from semantic knowledge graphs.

---

## 1. PROJECT ARCHITECTURE

### Module Structure
```
ggen/
├── ggen-core/         # Core generation engine
│   ├── template.rs    # Frontmatter parsing & rendering
│   ├── graph.rs       # RDF store with caching
│   ├── pipeline.rs    # Rendering pipeline
│   ├── gpack.rs       # Package manifest handling
│   ├── registry.rs    # Marketplace client
│   ├── lockfile.rs    # Dependency locking
│   └── inject.rs      # Code injection utilities
├── cli/               # CLI commands
│   └── cmds/          # Individual command modules
├── utils/             # Shared utilities
└── tests/
    └── bdd/features/  # Cucumber BDD tests
```

### Key Design Patterns

**1. Pipeline Pattern**
- `PipelineBuilder` → `Pipeline` → `Plan` → `apply()`
- Immutable builder pattern for configuration
- Separation of rendering from execution
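
The builder, pipeline, plan, and apply chain can be sketched in plain Rust. This is only an illustration of the shape of the pattern, not ggen's actual API: the method names `with_var` and `render`, the struct fields, and the in-memory map standing in for the filesystem are all assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical builder: each setter takes `self` by value and returns the
/// updated builder, so configuration reads as one chained expression.
#[derive(Default)]
struct PipelineBuilder {
    vars: BTreeMap<String, String>,
}

impl PipelineBuilder {
    fn with_var(mut self, key: &str, value: &str) -> Self {
        self.vars.insert(key.to_string(), value.to_string());
        self
    }

    fn build(self) -> Pipeline {
        Pipeline { vars: self.vars }
    }
}

struct Pipeline {
    vars: BTreeMap<String, String>,
}

/// A plan records what would be written without touching any output.
struct Plan {
    output_path: String,
    content: String,
}

impl Pipeline {
    /// Rendering only substitutes `{{key}}` placeholders and returns a plan.
    fn render(&self, to: &str, body: &str) -> Plan {
        let mut output_path = to.to_string();
        let mut content = body.to_string();
        for (key, value) in &self.vars {
            let placeholder = format!("{{{{{key}}}}}");
            output_path = output_path.replace(&placeholder, value);
            content = content.replace(&placeholder, value);
        }
        Plan { output_path, content }
    }
}

impl Plan {
    /// Execution is a separate step; here it writes into an in-memory map
    /// that stands in for the filesystem.
    fn apply(self, files: &mut BTreeMap<String, String>) {
        files.insert(self.output_path, self.content);
    }
}
```

Keeping `render` free of side effects means a plan can be inspected (for a dry run or a test) before `apply` changes anything.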

**2. Template Processing Flow**
```
Parse → RenderFrontmatter → ProcessGraph → RenderBody → Plan
```

**3. Graph-Aware Generation**
- RDF data loaded from files or inline Turtle
- SPARQL queries extract variables for templates
- Cached query results for performance

---

## 2. TEMPLATE SYSTEM PATTERNS

### Frontmatter Structure

**Basic Template:**
```yaml
---
to: "src/cmds/{{cmd}}.rs"
vars:
  cmd: "hello"
  summary: "Print a greeting"
---
pub fn {{cmd}}() {
    println!("{{summary}}");
}
```
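
To make the frontmatter/body split concrete, here is a minimal sketch of how such a template could be separated into its two parts. It assumes the frontmatter is delimited by lines consisting solely of `---`, with the opening delimiter on the first line; the function name `split_frontmatter` is hypothetical, and ggen's own parser may behave differently.

```rust
/// Splits a template into `(frontmatter, body)`.
/// Returns `None` when the template does not start with `---` or when no
/// closing `---` line is found.
fn split_frontmatter(template: &str) -> Option<(&str, &str)> {
    const DELIM: &str = "---\n";
    let rest = template.strip_prefix(DELIM)?;
    // Locate the closing delimiter at the start of a line.
    let end = if rest.starts_with(DELIM) {
        0 // empty frontmatter
    } else {
        rest.find("\n---\n")? + 1
    };
    let frontmatter = &rest[..end];
    let body = &rest[end + DELIM.len()..];
    Some((frontmatter, body))
}
```

The frontmatter slice would then go to a YAML parser and the body to the template engine; both steps are left out of this sketch.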

**RDF-Enhanced Template:**
```yaml
---
to: "src/cmds/{{cmd}}.rs"
prefixes:
  cli: "urn:ggen:cli#"
base: "http://example.org/"
rdf:
  - "graphs/cli.ttl"
rdf_inline:
  - "@prefix cli: <urn:ggen:cli#> . cli:hello a cli:Command ."
sparql:
  commands: "SELECT ?cmd ?summary WHERE { ?cmd a cli:Command ; cli:summary ?summary }"
shape:
  - "graphs/shapes/cli.shacl.ttl"
determinism:
  seed: "cli-subcommand"
  sort_order: ["cmd", "summary"]
---
```

### Template Features

| Feature | Purpose | Example |
|---------|---------|---------|
| `to` | Output path (templated) | `src/{{name}}.rs` |
| `from` | Source file to use as body | `base-template.rs` |
| `vars` | Default variables | `{cmd: "hello"}` |
| `inject` | Code injection mode | `true` |
| `before/after` | Injection anchors | `"// IMPORTS"` |
| `skip_if` | Idempotency check | `"pub fn hello"` |
| `force` | Overwrite existing | `true` |
| `unless_exists` | Create only if missing | `true` |
| `sh_before/sh_after` | Shell hooks | `"cargo fmt"` |

### Injection Patterns

**Append to File:**
```yaml
---
to: "src/lib.rs"
inject: true
append: true
skip_if: "mod {{module_name}}"
---
pub mod {{module_name}};
```

**Insert After an Anchor Line:**
```yaml
---
to: "Cargo.toml"
inject: true
after: "[dependencies]"
skip_if: "serde"
---
serde = "1.0"
```

**Prepend to File:**
```yaml
---
to: "src/main.rs"
inject: true
prepend: true
skip_if: "use anyhow"
---
use anyhow::Result;
```
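
The three injection modes and the `skip_if` check can be sketched as a pure function over the existing file content. This is an illustration under stated assumptions, not ggen's implementation: the `Mode` enum and `inject` function are hypothetical, `skip_if` is treated as a plain substring test (the real check may use a pattern), and `after` inserts once, directly below the first line containing the anchor.

```rust
/// Where to place the snippet; mirrors the `append`, `prepend`, and `after`
/// frontmatter keys. The enum itself is hypothetical.
enum Mode<'a> {
    Append,
    Prepend,
    After(&'a str),
}

/// Returns the updated content, or the original content unchanged when
/// `skip_if` already occurs in it (the idempotency check).
fn inject(existing: &str, snippet: &str, mode: Mode<'_>, skip_if: Option<&str>) -> String {
    if let Some(pattern) = skip_if {
        if existing.contains(pattern) {
            return existing.to_string();
        }
    }
    match mode {
        Mode::Append => {
            let mut out = existing.to_string();
            if !out.is_empty() && !out.ends_with('\n') {
                out.push('\n');
            }
            out.push_str(snippet);
            out.push('\n');
            out
        }
        Mode::Prepend => format!("{snippet}\n{existing}"),
        Mode::After(anchor) => {
            let mut out = String::new();
            let mut inserted = false;
            for line in existing.lines() {
                out.push_str(line);
                out.push('\n');
                // Insert once, directly after the first matching line.
                if !inserted && line.contains(anchor) {
                    out.push_str(snippet);
                    out.push('\n');
                    inserted = true;
                }
            }
            out
        }
    }
}
```

Because the `skip_if` check runs first, applying the same injection twice leaves the file exactly as it was after the first run.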

---

## 3. RDF/SPARQL PATTERNS

### Graph Loading Strategies

**1. External RDF Files:**
```yaml
rdf:
  - "graphs/schema.ttl"
  - "graphs/entities.ttl"
```

**2. Inline Turtle:**
```yaml
rdf_inline:
  - |
    @prefix ex: <http://example.org/> .
    ex:{{entity}} a ex:{{type}} ;
      ex:name "{{name}}" .
```

**3. Prefix Management:**
```yaml
prefixes:
  ex: "http://example.org/"
  rdf: "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  rdfs: "http://www.w3.org/2000/01/rdf-schema#"
base: "http://example.org/{{namespace}}/"
```

### SPARQL Query Patterns

**Variable Extraction:**
```yaml
sparql:
  entities: |
    PREFIX ex: <http://example.org/>
    SELECT ?name ?type WHERE {
      ?entity a ex:{{entity_type}} ;
        ex:name ?name ;
        ex:type ?type .
    }
```

**Count Aggregation:**
```yaml
sparql:
  count: |
    SELECT (COUNT(?item) AS ?total) WHERE {
      ?item a ex:{{type}} .
    }
```

**Conditional Generation:**
```yaml
sparql:
  has_tests: |
    ASK { ?module ex:hasTests true }
```

### Template Function Usage

**In-template SPARQL:**
```jinja2
{% set results = sparql(query="SELECT ?name WHERE { ?x ex:name ?name }") %}
{% for row in results %}
  pub struct {{row.name}} {}
{% endfor %}
```

**Extract Single Value:**
```jinja2
{% set name = sparql(query="SELECT ?name WHERE { ... }", var="name") %}
```

---

## 4. GPACK (PACKAGE) SYSTEM

### Manifest Structure (`gpack.toml`)

```toml
[gpack]
id = "io.ggen.rust.cli-subcommand"
name = "Rust CLI subcommand"
version = "0.1.0"
description = "Generate clap subcommands"
license = "MIT"
ggen_compat = ">=0.1 <0.2"

[dependencies]
"io.ggen.macros.std" = "^0.1"

[templates]
patterns = ["cli/subcommand/*.tmpl"]
includes = ["macros/**/*.tera"]

[rdf]
base = "http://example.org/"
prefixes.ex = "http://example.org/"
patterns = ["templates/**/graphs/*.ttl"]

[queries]
patterns = ["../queries/*.rq"]
aliases.component_by_name = "../queries/component_by_name.rq"

[shapes]
patterns = ["../shapes/*.ttl"]

[preset]
vars = { author = "Acme", license = "MIT" }
```

### Convention-Based Discovery

**Default Patterns:**
- Templates: `templates/**/*.tmpl`, `templates/**/*.tera`
- RDF: `templates/**/graphs/*.ttl`
- Queries: `templates/**/queries/*.rq`
- Shapes: `templates/**/shapes/*.ttl`
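
As a rough illustration of these defaults, the sketch below classifies a path (relative to the gpack root) by its parent directory name and extension. It is a simplified stand-in for real glob matching, and the `Kind` enum and `classify` function are hypothetical names introduced for the example.

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
enum Kind {
    Template,
    Graph,
    Query,
    Shape,
}

/// Approximates the default patterns: everything must live under
/// `templates/`, and graphs, queries, and shapes are recognized by the name
/// of their immediate parent directory.
fn classify(path: &str) -> Option<Kind> {
    let p = Path::new(path);
    if !p.starts_with("templates") {
        return None;
    }
    let ext = p.extension()?.to_str()?;
    let parent = p.parent()?.file_name()?.to_str()?;
    match (parent, ext) {
        ("graphs", "ttl") => Some(Kind::Graph),
        ("queries", "rq") => Some(Kind::Query),
        ("shapes", "ttl") => Some(Kind::Shape),
        (_, "tmpl") | (_, "tera") => Some(Kind::Template),
        _ => None,
    }
}
```

A discovery pass built on this would walk the pack directory, classify each file, and sort the results so the order does not depend on the filesystem.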

---

## 5. CLI COMMAND PATTERNS

### Core Commands

| Command | Purpose | Example |
|---------|---------|---------|
| `ggen gen` | Generate from template | `ggen gen rust:main.tmpl -v name=app` |
| `ggen market search` | Search registry | `ggen market search rust cli` |
| `ggen market add` | Install gpack | `ggen market add io.ggen.rust.cli` |
| `ggen template lint` | Validate templates | `ggen template lint templates/` |
| `ggen graph export` | Export RDF data | `ggen graph export --format turtle` |
| `ggen completion` | Shell completions | `ggen completion bash` |

### Template Reference Formats

**Local Template:**
```bash
ggen gen templates/rust.tmpl -v name=hello
```

**Gpack Template:**
```bash
ggen gen io.ggen.rust.cli:cli/subcommand/rust.tmpl -v cmd=hello
```

**With Variables:**
```bash
ggen gen rust:main.tmpl \
  -v name=myapp \
  -v author="Sean Chatman" \
  -v version=0.1.0
```
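
The reference formats above suggest a simple rule: a `pack:path` form selects a gpack template, and anything else is a local path. The sketch below encodes that rule together with `key=value` parsing for `-v`. Both functions are hypothetical and only approximate what the CLI may do; in particular, this version does not handle Windows drive letters such as `C:\...`.

```rust
#[derive(Debug, PartialEq)]
enum TemplateRef<'a> {
    Local(&'a str),
    Pack { pack: &'a str, path: &'a str },
}

/// Treats `pack:path` as a gpack reference when the part before the first
/// colon is non-empty and contains no `/`; otherwise the input is a local path.
fn parse_template_ref(input: &str) -> TemplateRef<'_> {
    match input.split_once(':') {
        Some((pack, path)) if !pack.is_empty() && !pack.contains('/') => {
            TemplateRef::Pack { pack, path }
        }
        _ => TemplateRef::Local(input),
    }
}

/// Parses one `-v key=value` argument, splitting on the first `=`.
fn parse_var(arg: &str) -> Option<(&str, &str)> {
    let (key, value) = arg.split_once('=')?;
    if key.is_empty() {
        None
    } else {
        Some((key, value))
    }
}
```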

---

## 6. DETERMINISM & REPRODUCIBILITY

### Deterministic Features

**1. Lockfile Management (`ggen.lock`):**
```toml
[[pack]]
id = "io.ggen.rust.cli"
version = "0.1.0"
git_url = "https://github.com/..."
git_rev = "abc123..."
sha256 = "deadbeef..."
```

**2. Seed-Based Generation:**
```yaml
determinism:
  seed: "my-template-v1"
  sort_order: ["name", "type"]
```
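
How ggen interprets `sort_order` is not spelled out in this analysis. One plausible reading is that query result rows are sorted by the listed keys, in order, before rendering, so the output does not depend on the store's iteration order. The sketch below shows that reading only; the `Row` alias and `sort_rows` function are assumptions.

```rust
use std::cmp::Ordering;
use std::collections::BTreeMap;

/// One result row: variable name to value.
type Row = BTreeMap<String, String>;

/// Sorts rows by each key in `sort_order`, falling through to the next key
/// on ties. Rows missing a key sort before rows that have it.
fn sort_rows(rows: &mut [Row], sort_order: &[&str]) {
    rows.sort_by(|a, b| {
        sort_order
            .iter()
            .map(|key| a.get(*key).cmp(&b.get(*key)))
            .find(|ordering| ordering.is_ne())
            .unwrap_or(Ordering::Equal)
    });
}
```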

**3. SHA256 Verification:**
- All gpack downloads verified against registry SHA256
- Lockfile ensures reproducible builds

### Tracing & Observability

**Simple Tracing System:**
- Template processing metrics
- RDF loading statistics
- SPARQL query performance
- File operation tracking

---

## 7. MULTI-LANGUAGE SUPPORT

### Language-Specific Templates

**Rust:**
```yaml
---
to: "src/{{module}}.rs"
sh_after: "cargo fmt && cargo clippy"
---
```

**Python:**
```yaml
---
to: "{{package}}/{{module}}.py"
sh_after: "black {{to}} && isort {{to}}"
---
```

**Bash:**
```yaml
---
to: "scripts/{{name}}.sh"
sh_after: "chmod +x {{to}}"
---
#!/bin/bash
```

**TypeScript:**
```yaml
---
to: "src/{{name}}.ts"
sh_after: "prettier --write {{to}}"
---
```

### Tera Filters & Functions

**Built-in Filters:**
- `{{name | title}}` - Title case
- `{{name | upper}}` - Uppercase
- `{{name | lower}}` - Lowercase
- `{{name | snake_case}}` - Snake case
- `{{name | camel_case}}` - Camel case
- `{{name | pascal_case}}` - Pascal case

**Custom Functions:**
- `sparql()` - Execute SPARQL query
- `local()` - Generate local IRIs
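
Tera itself ships `title`, `upper`, and `lower`; the case-conversion filters are presumably registered by ggen on top of those. To show what such conversions typically do, here is a std-only sketch. The splitting rules (on `_`, `-`, spaces, and lowercase-to-uppercase boundaries) are an assumption and may differ from the real filters, for example on acronyms such as `HTTPServer`.

```rust
/// Splits an identifier into lowercase words on `_`, `-`, spaces, and
/// lowercase-to-uppercase boundaries.
fn words(input: &str) -> Vec<String> {
    let mut words = Vec::new();
    let mut current = String::new();
    let mut prev_lower = false;
    for ch in input.chars() {
        if ch == '_' || ch == '-' || ch == ' ' {
            if !current.is_empty() {
                words.push(std::mem::take(&mut current));
            }
            prev_lower = false;
            continue;
        }
        if ch.is_uppercase() && prev_lower && !current.is_empty() {
            words.push(std::mem::take(&mut current));
        }
        prev_lower = ch.is_lowercase() || ch.is_ascii_digit();
        current.extend(ch.to_lowercase());
    }
    if !current.is_empty() {
        words.push(current);
    }
    words
}

/// Uppercases the first character of a word.
fn capitalize(word: &str) -> String {
    let mut chars = word.chars();
    match chars.next() {
        Some(first) => first.to_uppercase().chain(chars).collect(),
        None => String::new(),
    }
}

fn snake_case(input: &str) -> String {
    words(input).join("_")
}

fn pascal_case(input: &str) -> String {
    words(input).iter().map(|w| capitalize(w)).collect()
}

fn camel_case(input: &str) -> String {
    words(input)
        .iter()
        .enumerate()
        .map(|(i, w)| if i == 0 { w.clone() } else { capitalize(w) })
        .collect()
}
```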

---

## 8. TESTING PATTERNS

### BDD Test Structure

**Feature File Example:**
```gherkin
Feature: Template Generation
  Scenario: Basic template with frontmatter
    Given I have a template with:
      """
      ---
      to: src/{{name}}.rs
      ---
      pub fn {{name}}() {}
      """
    When I run "ggen gen test-template"
    Then the file "src/hello.rs" should exist
```

### Integration Test Patterns

**Template Validation:**
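```rust
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};

// `Pipeline` (ggen-core) and `Result` must also be in scope; their import
// paths depend on the crate layout and are not shown here.
#[test]
fn test_template_render() -> Result<()> {
    // Placeholder path: point this at the template fixture under test.
    let template_path = Path::new("templates/test.tmpl");
    let mut pipeline = Pipeline::new()?;
    let vars = BTreeMap::from([
        ("name".to_string(), "test".to_string())
    ]);
    // Render to a plan only; nothing is written until the plan is applied.
    let plan = pipeline.render_file(template_path, &vars, false)?;
    assert_eq!(plan.output_path, PathBuf::from("src/test.rs"));
    Ok(())
}
```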

---

## 9. COMMON WORKFLOWS

### Workflow 1: CLI Subcommand Generation

**Goal:** Generate a new CLI subcommand with boilerplate

**Steps:**
1. Define RDF schema for CLI structure
2. Create template with SPARQL queries
3. Generate subcommand with metadata
4. Inject module registration

**Template Pattern:**
```yaml
---
to: "src/cmds/{{cmd}}.rs"
rdf_inline:
  - "@prefix cli: <urn:ggen:cli#> . cli:{{cmd}} a cli:Command ."
sparql:
  validate: "ASK { cli:{{cmd}} a cli:Command }"
---
use clap::Args;

#[derive(Args, Debug)]
pub struct {{cmd|title}}Args {
    // ...
}

pub fn run(args: &{{cmd|title}}Args) -> Result<()> {
    // ...
    Ok(())
}
```

### Workflow 2: API Endpoint Generation

**Goal:** Generate REST API endpoints from OpenAPI-like RDF schema

**Steps:**
1. Define API schema in Turtle
2. Query endpoints with SPARQL
3. Generate handlers and routes
4. Inject into main router

### Workflow 3: Database Schema to Structs

**Goal:** Generate Rust structs from RDF database schema

**Steps:**
1. Load database schema as RDF
2. Query for tables/columns
3. Generate Rust structs with derives
4. Add validation annotations
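
Step 3 can be sketched without any RDF machinery by starting from the rows a schema query might return. Everything here is illustrative: the `Column` struct, the SQL-to-Rust type mapping, and the chosen derives are assumptions, and in ggen this step would normally be expressed as a template rather than hand-written string building.

```rust
/// One column as it might come back from a schema query (hypothetical shape).
struct Column {
    name: &'static str,
    sql_type: &'static str,
    nullable: bool,
}

/// Maps a few SQL type names to Rust types; unknown types fall back to `String`.
fn rust_type(sql_type: &str) -> &'static str {
    match sql_type {
        "integer" => "i32",
        "bigint" => "i64",
        "boolean" => "bool",
        "double precision" => "f64",
        _ => "String",
    }
}

/// Emits the source of a struct with one public field per column, wrapping
/// nullable columns in `Option`.
fn generate_struct(name: &str, columns: &[Column]) -> String {
    let mut out = format!("#[derive(Debug, Clone)]\npub struct {name} {{\n");
    for col in columns {
        let base = rust_type(col.sql_type);
        let ty = if col.nullable {
            format!("Option<{base}>")
        } else {
            base.to_string()
        };
        out.push_str(&format!("    pub {}: {},\n", col.name, ty));
    }
    out.push_str("}\n");
    out
}
```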

---

## 10. KEY COOKBOOK RECIPES TO DOCUMENT

Based on this analysis, here are the **top 10 cookbook recipes** users would benefit from:

### 1. **Quick Start: Your First Template**
- Basic frontmatter structure
- Variable substitution
- Output path configuration

### 2. **Template Injection Patterns**
- Append/prepend to existing files
- Anchor-based injection (before/after)
- Idempotent generation with `skip_if`

### 3. **Working with RDF Graphs**
- Loading external Turtle files
- Inline RDF in frontmatter
- Prefix management and base IRIs

### 4. **SPARQL-Driven Generation**
- Variable extraction from queries
- Conditional generation with ASK
- Aggregation and filtering

### 5. **Creating a Gpack**
- Manifest structure (`gpack.toml`)
- Convention-based file discovery
- Publishing to marketplace

### 6. **Multi-Language Templates**
- Language-specific patterns
- Shell hook integration
- Formatter/linter automation

### 7. **Deterministic Builds**
- Lockfile usage
- Seed-based generation
- SHA256 verification

### 8. **CLI Scaffolding Recipe**
- End-to-end CLI subcommand generation
- Module registration patterns
- Integration testing

### 9. **Code Generation from OpenAPI**
- RDF mapping of OpenAPI schemas
- API endpoint generation
- Client SDK generation

### 10. **Database-First Development**
- Schema to struct generation
- Migration generation
- CRUD boilerplate

---

## 11. BEST PRACTICES IDENTIFIED

### Template Design
✅ **DO:**
- Use descriptive variable names
- Include default values in `vars`
- Document expected variables in comments
- Use `skip_if` for idempotency
- Normalize EOL for cross-platform compatibility

❌ **DON'T:**
- Hardcode paths or platform-specific values
- Skip validation with `force: true` unnecessarily
- Ignore shell hook exit codes
- Mix multiple concerns in one template

### RDF/SPARQL Usage
✅ **DO:**
- Define clear prefix namespaces
- Use SHACL shapes for validation
- Cache queries with semantic versioning
- Separate schema from instance data

❌ **DON'T:**
- Query in loops (use SPARQL aggregation)
- Store large binary data in RDF
- Skip graph clearing between renders

### Gpack Distribution
✅ **DO:**
- Version dependencies with semver
- Include comprehensive examples
- Document required variables
- Provide shape validation

❌ **DON'T:**
- Publish without testing
- Skip lockfile updates
- Use wildcard dependencies in production

---

## 12. ARCHITECTURAL INSIGHTS

### Performance Optimizations
1. **LRU Caching:** Query plans and results cached by epoch
2. **Parallel Graph Ops:** Thread-safe Arc-wrapped store
3. **Lazy Rendering:** Frontmatter rendered before body
4. **Deterministic Ordering:** Sorted file discovery
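
One way to read "cached by epoch" is that the store keeps a counter that increases on every mutation, and cached results are keyed by that counter so stale entries are never returned. The sketch below shows that idea only. It is not ggen's cache: the `CachedGraph` type is hypothetical, a predicate lookup stands in for a SPARQL query, and LRU eviction is omitted, so entries from old epochs are never removed.

```rust
use std::collections::HashMap;

struct CachedGraph {
    triples: Vec<(String, String, String)>,
    /// Incremented on every mutation.
    epoch: u64,
    /// Results keyed by (epoch, query), so a mutation invalidates old entries.
    cache: HashMap<(u64, String), Vec<String>>,
    /// Number of lookups that had to scan the triples.
    misses: u32,
}

impl CachedGraph {
    fn new() -> Self {
        CachedGraph { triples: Vec::new(), epoch: 0, cache: HashMap::new(), misses: 0 }
    }

    fn insert(&mut self, s: &str, p: &str, o: &str) {
        self.triples.push((s.to_string(), p.to_string(), o.to_string()));
        self.epoch += 1;
    }

    /// Returns all objects for a predicate, using the cache when the graph
    /// has not changed since the last identical lookup.
    fn objects_of(&mut self, predicate: &str) -> Vec<String> {
        let key = (self.epoch, predicate.to_string());
        if let Some(hit) = self.cache.get(&key) {
            return hit.clone();
        }
        self.misses += 1;
        let result: Vec<String> = self
            .triples
            .iter()
            .filter(|(_, p, _)| p.as_str() == predicate)
            .map(|(_, _, o)| o.clone())
            .collect();
        self.cache.insert(key, result.clone());
        result
    }
}
```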

### Security Considerations
1. **SHA256 Verification:** All downloads validated
2. **No Code Execution in Templates:** Only data substitution
3. **Shell Hooks:** Explicit opt-in with sandboxing
4. **Registry HTTPS:** Enforced in production

### Extensibility Points
1. **Tera Custom Functions:** Register new template functions
2. **Custom Filters:** Add domain-specific transformations
3. **RDF Formats:** Support N-Triples, RDF/XML, JSON-LD
4. **Shape Validation:** SHACL constraint checking

---

## CONCLUSION

ggen represents a **paradigm shift** from imperative code generation to **declarative, graph-aware generation**. Key innovations:

1. **Semantic Templates:** RDF/SPARQL elevates templates from string substitution to knowledge projection
2. **Deterministic by Design:** Lockfiles + SHA256 + seeds ensure reproducibility
3. **Language-Agnostic:** Tera templates work for any language
4. **Marketplace Ecosystem:** Gpack system enables template sharing

**Primary Use Cases:**
- **Scaffolding:** CLI tools, APIs, database schemas
- **Boilerplate Reduction:** DRY principle via template reuse
- **Knowledge-Driven:** Generate code from semantic models
- **Consistency Enforcement:** Templates as contracts

**Cookbook Priority:**
Focus on **practical recipes** that demonstrate the RDF/SPARQL advantage over traditional templating. Show how semantic graphs enable:
- Multi-target generation from single source
- Constraint-based validation
- Relationship-aware code generation

---

**Analysis Complete**
**Next Steps:** Use this analysis to structure cookbook chapters and select example projects.