capsula-capture-git-repo 0.9.0

A Capsula hook that captures Git repository information.
# Capsula

> [!WARNING]
> This project is in early development. The CLI interface and configuration format may change in future releases.

Capsula is a CLI tool that runs hooks before and after a command and captures their output. It automatically records the state of your project environment around each run, making your workflows reproducible and auditable.

> [!NOTE]
> The Python version of Capsula is deprecated and can be found on the `python` branch of this repository.

## Features

- 📸 **Context Capture**: Automatically capture git state, file contents, environment variables, and more
- 🔄 **Reproducible Runs**: Complete record of the execution context for debugging and auditing
- 🛡️ **Safety Checks**: Prevent execution on dirty repositories or other unsafe conditions
- 📊 **Structured Output**: JSON-formatted capture data for easy processing
- 🔧 **Extensible**: Multiple built-in hooks with clean error handling

## Installation

### Install from crates.io (recommended)

```bash
cargo install capsula-cli --locked
```

### Install from the GitHub repository

```bash
cargo install --git https://github.com/shunichironomura/capsula --branch rust --locked capsula-cli
```

## Quick Start

1. **Create a configuration file** (`capsula.toml`) in your project root:

```toml
[vault]
name = "my-project"

[[pre-run.hooks]]
id = "capture-git-repo"
name = "repo-name"
path = "."

[[pre-run.hooks]]
id = "capture-cwd"

[[pre-run.hooks]]
id = "capture-file"
glob = "config.json"
mode = "copy"
hash = "sha256"
```

2. **Run a command with hooks**:

```bash
capsula run python train_model.py
```


## Configuration

### Basic Structure

The `capsula.toml` configuration file defines:

- **Vault**: Where to store captured data
- **Phases**: Pre-run and post-run hooks

```toml
[vault]
name = "project-name"        # Vault identifier
path = ".capsula"           # Storage path (optional, defaults to .capsula/{name})

[pre-run]                 # Pre-execution hooks
[[pre-run.hooks]]
id = "capture-git-repo"
# ... hook configuration

[post-run]                # Post-execution hooks
[[post-run.hooks]]
id = "capture-file"
# ... hook configuration
```

### Available Hook Types

#### Git Hook

Captures the Git repository state, including the commit hash and whether the working tree is clean.

```toml
[[pre-run.hooks]]
id = "capture-git-repo"
name = "repo-name"          # Hook name
path = "."                  # Repository path
allow_dirty = false         # Allow uncommitted changes (default: false)
```

**Output Example:**

```json
{
  "__meta": {
    "config": {
      "name": "repo-name",
      "path": ".",
      "allow_dirty": false
    },
    "id": "capture-git-repo",
    "success": true
  },
  "working_dir": "/path/to/repo",
  "sha": "abc123...",
  "is_dirty": false,
  "abort_on_dirty": false
}
```

#### Current Working Directory

Captures the current working directory path.

```toml
[[pre-run.hooks]]
id = "capture-cwd"
```

**Output Example:**

```json
{
  "__meta": {
    "config": {},
    "id": "capture-cwd",
    "success": true
  },
  "cwd": "/current/working/directory"
}
```

#### File Hook

Captures file contents and/or metadata.

```toml
[[pre-run.hooks]]
id = "capture-file"
glob = "config.json"        # File pattern to capture
mode = "copy"               # Capture mode: "copy", "move", or "none" (default: "copy")
hash = "sha256"             # File hash: "sha256" or "none" (default: "sha256")
```

**Output Example:**

```json
{
  "__meta": {
    "config": {
      "glob": "config.json",
      "mode": "copy",
      "hash": "sha256"
    },
    "id": "capture-file",
    "success": true
  },
  "files": [
    {
      "path": "/path/to/config.json",
      "copied_path": "/vault/run-dir/config.json",
      "hash": "sha256:abc123..."
    }
  ]
}
```
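
The recorded hash can be used to check that a captured copy is intact. The following is a minimal sketch, assuming the `hash` field has the form `<algorithm>:<hex digest>` as the example above suggests; `verify_capture` is a hypothetical helper, not part of Capsula.

```python
import hashlib


def verify_capture(entry):
    """Recompute the digest of entry["copied_path"] and compare it to entry["hash"].

    Assumes the hash string looks like "sha256:<hex digest>".
    """
    algorithm, _, expected = entry["hash"].partition(":")
    digest = hashlib.new(algorithm)
    with open(entry["copied_path"], "rb") as f:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

Each element of the `files` array in the output can be passed to this function directly.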

#### Environment Variables Hook

Captures specified environment variables.

```toml
[[pre-run.hooks]]
id = "capture-env"
name = "HOME"                 # Variable name to capture
```

**Output Example:**

```json
{
  "__meta": {
    "config": {
      "name": "HOME"
    },
    "id": "capture-env",
    "success": true
  },
  "value": "/home/user"
}
```

#### Command Hook

Captures output of shell commands.

```toml
[[pre-run.hooks]]
id = "capture-command"
command = ["rustc", "--version"]
abort_on_failure = false   # Abort if command fails (default: false)
```

**Output Example:**

```json
{
  "__meta": {
    "config": {
      "command": ["rustc", "--version"],
      "abort_on_failure": false
    },
    "id": "capture-command",
    "success": true
  },
  "status": 0,
  "stdout": "rustc 1.91.0 (f8297e351 2025-10-28)\n",
  "stderr": "",
  "abort_requested": false
}
```

#### Machine Hook

Captures system information like CPU, memory, and OS details.

```toml
[[pre-run.hooks]]
id = "capture-machine"
```

**Output Example:**

```json
{
  "__meta": {
    "config": {},
    "id": "capture-machine",
    "success": true
  },
  "hostname": "hostname.local",
  "os": "Darwin",
  "os_version": "26.0.1",
  "kernel_version": "25.0.0",
  "architecture": "aarch64",
  "total_memory": 137438953472,
  "cpus": [
    {
      "name": "1",
      "brand": "Apple M3 Max",
      "vender_id": "Apple",
      "frequency_mhz": 4056
    }
  ]
}
```

## CLI Usage

### Commands

#### `capsula run <command>`

Execute a command with full hook capture.

```bash
# Run with default config
capsula run python script.py

# Run with custom config
capsula run --config my-config.toml python script.py

# Run with arguments
capsula run python train.py --epochs 100 --lr 0.01
```

**Behavior:**

1. Runs pre-run hooks and saves their outputs to the vault
2. Checks for abort conditions (e.g., a dirty Git repository)
3. Executes the command if it is safe to do so, and aborts otherwise
4. Runs post-run hooks and saves their outputs to the vault
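
The four steps above can be modelled in a few lines of Python. This is only an illustrative sketch of the control flow, not Capsula's implementation: the `Hook` type and `run_with_hooks` function are hypothetical, and the `abort_requested` field name is borrowed from the Command Hook output example.

```python
import subprocess
import sys
from typing import Callable

# Hypothetical hook type: a callable returning a dict of captured data.
# A hook requests an abort by setting "abort_requested" to a true value.
Hook = Callable[[], dict]


def run_with_hooks(command: list[str], pre_hooks: list[Hook], post_hooks: list[Hook]) -> dict:
    """Model the run phases: pre-run hooks, abort check, command, post-run hooks."""
    # Step 1: run every pre-run hook and keep its output.
    record = {"pre_run": [hook() for hook in pre_hooks]}
    # Step 2: stop before the command if any hook asked for an abort.
    if any(out.get("abort_requested") for out in record["pre_run"]):
        record["aborted"] = True
        return record
    # Step 3: execute the command and capture its results.
    result = subprocess.run(command, capture_output=True, text=True)
    record["command"] = {
        "exit_code": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }
    # Step 4: run post-run hooks after the command has finished.
    record["post_run"] = [hook() for hook in post_hooks]
    record["aborted"] = False
    return record


demo = run_with_hooks(
    [sys.executable, "-c", "print('hello')"],
    pre_hooks=[lambda: {"cwd": "."}],
    post_hooks=[],
)
print(demo["command"]["stdout"], end="")
```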

**Environment Variables:**

When executing a command with `capsula run`, the following environment variables are automatically set and available to your command:

| Variable | Description | Example |
|----------|-------------|---------|
| `CAPSULA_RUN_ID` | Unique ULID identifier for this run | `01K8WSYC91YAE21R7CWHQ4KYN2` |
| `CAPSULA_RUN_NAME` | Human-readable generated name | `chubby-back` |
| `CAPSULA_RUN_DIRECTORY` | Absolute path to the run directory in the vault | `/path/to/.capsula/vault-name/2025-10-31/093525-chubby-back` |
| `CAPSULA_RUN_TIMESTAMP` | ISO 8601 timestamp of when the run started | `2025-10-31T09:35:25.473+00:00` |
| `CAPSULA_RUN_COMMAND` | Shell-quoted string of the executed command | `python train.py --epochs 100` |
| `CAPSULA_PRE_RUN_OUTPUT_PATH` | Path to the pre-run output JSON file | `/path/to/.capsula/vault-name/.../pre-run.json` |

> [!CAUTION]
> While `CAPSULA_RUN_DIRECTORY` is available, it is **not recommended** to write files directly to this directory. Instead, output files to your project root and capture them using the `capture-file` hook in the post-run phase. This approach ensures files are properly tracked and managed by Capsula's file handling system.

These variables can be used within your scripts to access run metadata:

```python
# Example: Embed run name in matplotlib figures for traceability
import os
import matplotlib.pyplot as plt

run_name = os.environ.get('CAPSULA_RUN_NAME')

# Create your plot
plt.plot([1, 2, 3, 4], [1, 4, 2, 3])
plt.title('Experiment Results')

# Add run name to the figure - useful when copying plots to presentations
plt.figtext(0.99, 0.01, f'Run: {run_name}',
            ha='right', va='bottom', fontsize=8, alpha=0.7)

# Save to project root, then capture with post-run hook
plt.savefig('results.png')
```

```toml
# Configure a post-run hook to capture the output file
[[post-run.hooks]]
id = "capture-file"
glob = "results.png"
mode = "move"  # Move the file to the vault
```
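
A script can also read what the pre-run hooks captured through `CAPSULA_PRE_RUN_OUTPUT_PATH`. The sketch below assumes that file is a JSON array of hook outputs shaped like the examples in this README; the helper names `load_pre_run_outputs` and `find_hook_output` are hypothetical.

```python
import json
import os


def load_pre_run_outputs(path=None):
    """Return the list of pre-run hook outputs, or [] when run outside `capsula run`."""
    path = path or os.environ.get("CAPSULA_PRE_RUN_OUTPUT_PATH")
    if not path:
        return []
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_hook_output(outputs, hook_id):
    """Return the first output whose `__meta.id` equals hook_id, or None."""
    for output in outputs:
        if output.get("__meta", {}).get("id") == hook_id:
            return output
    return None


# Example: look up the commit hash recorded by the Git hook, if any.
git_output = find_hook_output(load_pre_run_outputs(), "capture-git-repo")
print(git_output["sha"] if git_output else "no git capture available")
```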

#### `capsula list`

List all captured runs in the vault.

```bash
# List runs with default config
capsula list

# List runs with custom config
capsula list --config my-config.toml
```

**Example Output:**

```
TIMESTAMP (UTC)      NAME                  COMMAND
---------------------------------------------------------------------------------------------
2025-10-31 09:35:29  kind-year             echo hello
2025-10-31 09:35:28  smelly-apparel        echo hello
2025-10-31 09:35:26  clear-waste           echo hello
2025-10-31 09:30:15  cheap-trip            echo this is a very long command with many argu...
```

The output shows:
- **Timestamp**: UTC time when the command was executed
- **Name**: Human-readable generated name for the run
- **Command**: The command that was executed (truncated if too long)


## Output Structure

### Metadata

Every hook output includes metadata for traceability:

```json
{
  "__meta": {
    "config": {},    // Configuration used for this hook
    "id": "capture-cwd",  // Hook ID from configuration
    "success": true  // Capture success status
  }
  // ... hook-specific data
}
```
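
Because every output carries the same `__meta` block, a phase file can be checked for failed hooks without knowing the hook-specific fields. A minimal sketch, assuming the outputs have already been loaded as a list of dictionaries (`failed_hook_ids` is a hypothetical helper):

```python
def failed_hook_ids(outputs):
    """Return the `__meta.id` of every hook output whose `success` flag is not true."""
    return [
        output["__meta"]["id"]
        for output in outputs
        if not output["__meta"].get("success", False)
    ]
```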

### Vault Structure

Captured data is organized in the vault:

```
.capsula/
└── vault-name/
    └── 2025-10-31/                    # Date-based directory (YYYY-MM-DD, UTC)
        └── 093525-chubby-back/        # Unique run directory (HHMMSS-run-name)
            ├── _capsula/              # Capsula metadata directory
            │   ├── metadata.json      # Run metadata (ID, name, command, timestamp)
            │   ├── pre-run.json       # Pre-phase hook outputs (array)
            │   ├── command.json       # Command execution results
            │   └── post-run.json      # Post-phase hook outputs (array)
            └── [captured files]       # Files copied by file hooks
```

**metadata.json** contains run information:

```json
{
  "id": "01K8WSYC91YAE21R7CWHQ4KYN2",
  "name": "chubby-back",
  "command": ["echo", "hello"],
  "timestamp": "2025-10-31T09:35:25.473+00:00",
  "run_dir": "/path/to/.capsula/vault-name/2025-10-31/093525-chubby-back"
}
```
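
Given this layout, past runs can be enumerated by reading each `metadata.json`. The following sketch assumes the directory structure shown above (`<vault>/<date>/<run>/_capsula/metadata.json`) and the ISO 8601 `timestamp` field; `list_runs` is a hypothetical helper, not a Capsula API.

```python
import json
from datetime import datetime
from pathlib import Path


def list_runs(vault_dir):
    """Return the metadata of every run under vault_dir, newest first."""
    runs = []
    for meta_path in Path(vault_dir).glob("*/*/_capsula/metadata.json"):
        with meta_path.open(encoding="utf-8") as f:
            runs.append(json.load(f))
    # Parse timestamps so ordering does not depend on string formatting.
    runs.sort(key=lambda meta: datetime.fromisoformat(meta["timestamp"]), reverse=True)
    return runs
```

For example, `list_runs(".capsula/vault-name")` would return one dictionary per run directory.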

**command.json** contains command execution results:

```json
{
  "exit_code": 0,
  "stdout": "hello\n",
  "stderr": "",
  "duration": {
    "secs": 0,
    "nanos": 1986042
  }
}
```