diamond-cli 0.1.1

Lightning-fast CLI for stacked pull requests
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
# Repository Guidelines

> **Core Philosophy**
> We are making an **enterprise-grade tool**. We do not want to overengineer it, but it must be **robust and ready for anything**. Reliability is achieved through simplicity and exhaustive testing, not complexity.

## ๐Ÿค– LLM & Agent Guidelines

**These instructions are specifically for AI agents working on this project. Read this section first.**

### 0. ๐Ÿ›‘ CRITICAL SAFETY RULE: NO EXPERIMENTING ON DIAMOND REPO
**ABSOLUTE RULE: When fixing bugs, reproducing issues, or testing changes, YOU MUST NOT COMMIT TO OR MODIFY THE DIAMOND REPOSITORY DIRECTLY.**

- **NEVER** run `git commit`, `git rebase`, or `dm` commands on the Diamond repository itself to test things.
- **ALWAYS** use `just playground` or `cd sandbox/` to reproduce bugs and verify fixes.
- **EXCEPTION**: You may modify source code (`src/`) and tests (`tests/`) as part of the fix, but you must NOT use the repository's git history as your testbed.
- Modifying the tool you are building while it is running on itself is dangerous and leads to repository corruption.

#### ๐Ÿ›ก๏ธ Safety Mechanisms

Diamond has three layers of protection to prevent unit tests from committing to the repository:

1. **Runtime Safety Checks** - `RefStore::new()` and `GitGateway::new()` panic in test mode if `TestRepoContext` is not set:
   ```rust
   // This will panic with helpful error message:
   let ref_store = RefStore::new()?;  // โŒ Missing TestRepoContext

   // Correct usage:
   let dir = tempdir()?;
   let _ctx = TestRepoContext::new(dir.path());  // โœ… Isolates test
   let ref_store = RefStore::new()?;
   ```

2. **Pre-commit Git Hook** - `.git/hooks/pre-commit` blocks commits with test author signatures:
   - Rejects commits from "Test User <test@example.com>" or similar test patterns
   - Prevents unit test accidents from reaching git history
   - Warns if test signatures appear in code changes

3. **TestRepoContext RAII Guard** - Thread-local test isolation (see `src/test_context.rs`):
   - Sets temporary repo path for the test
   - Automatically cleans up on drop (even on panic)
   - Prevents tests from accessing diamond repository

**If you write a unit test, you MUST use TestRepoContext:**
```rust
#[test]
fn test_something() -> Result<()> {
    let dir = tempdir()?;
    let _repo = init_test_repo(dir.path())?;
    let _ctx = TestRepoContext::new(dir.path());  // REQUIRED

    // Now safe to call RefStore::new(), GitGateway::new(), etc.
    Ok(())
}
```

**Without TestRepoContext, the test will panic immediately with a clear error message.**

### 1. โšก Efficiency & Token Usage
- **Don't guess, check**: Read `Cargo.toml` before suggesting new dependencies.
- **Error Analysis**: When you see an error, use `anyhow::Context` to explain *why* it happened, not just *what* failed.
- **Context Awareness**: You don't need to read every file. Trust `src/git_gateway.rs` for git operations and `src/state.rs` for metadata.

### 2. ๐Ÿฆ€ Rust Design Patterns
**Stop guessing patterns from other languages (Java/Python/TS). Think in Rust.**

- **Ownership First**: Before writing an API, decide who owns the data.
- **Borrowing**: Prefer passing references (`&str`, `&Path`) over cloning (`String`, `PathBuf`) unless ownership transfer is required.
- **Trait Bounds**: Use clear trait bounds.
    - โŒ *Bad*: `fn process(input: String)` (Forces allocation)
    - โœ… *Good*: `fn process<T: AsRef<str>>(input: T)` (Flexible, zero-cost)
- **Simplicity**: Do not over-engineer traits "just in case". implementation `Into<T>` is often better than a complex builder pattern.

### 3. ๐Ÿงช Testing Protocol
**We are burning tokens on full test suite runs. Stop it.**

- **Your Responsibility**: Run **ONLY** the specific unit/integration tests relevant to your changes.
    - `cargo test --lib commands::sync`
    - `cargo test --test integration_tests -- test_sync_basic`
- **Handoff Protocol**: When you are confident, **ask the user** to run the full suite:
    > "I have verified my changes with specific tests. Please run `just test` (or `cargo t`) to ensure no regressions."

### 4. ๐Ÿ”ด TDD Enforcement (Red-Green-Refactor)
**MANDATORY WORKFLOW - Do not skip steps:**

1. **STOP** - Before writing ANY implementation code, write the test first
2. **Show the Red** - Run the test and paste the failure output to prove it fails
3. **Implement** - Write the minimal code to make the test pass
4. **Show the Green** - Run the test again and confirm it passes
5. **Refactor** - Only if needed, keeping tests green

**If you catch yourself writing implementation before a failing test exists, STOP immediately and write the test first.**

## Project Structure & Module Organization

- `Cargo.toml` / `Cargo.lock`: Rust workspace metadata and dependency locks.
- `src/main.rs`: CLI entrypoint (`clap`) and subcommand routing.
- `src/commands/`: Subcommand implementations (e.g., `create`, `sync`, `restack`, `move`).
- `src/git_gateway.rs`: Unified interface for all `git2` operations.
- `src/state.rs`: Operation state persistence under `.git/diamond/operation_state.json`.
- `src/ref_store.rs`: Stack metadata persistence using git refs (`refs/diamond/parent/*`).
- `src/operation_log.rs`: Transaction history for undo/redo and crash recovery.
- `src/validation.rs`: Integrity checks for stack data and git state.
- `src/forge/`: Hosting provider integrations (e.g., GitHub).
- `agent_notes/`: Design/architecture notes (not part of the shipped binary).

## Build, Test, and Development Commands

### Quick Reference
- `just`: Show all available commands
- `just test` or `cargo t`: Run all tests (unit + integration)
- `just check`: Run fmt + clippy + tests (pre-commit validation)
- `just playground`: Create isolated test repo for manual testing (**USE THIS, NOT THE REAL REPO**)
- `cargo build`: Compile a debug build
- `cargo run -- --help`: Run the CLI and show available subcommands
- `cargo run -- log`: Launch the TUI stack view (`ratatui`/`crossterm`)
- `cargo fmt`: Format with `rustfmt` (run before opening a PR)
- `cargo clippy --all-targets --all-features -- -D warnings`: Lint; treat warnings as errors

### โš ๏ธ CRITICAL: Safe Testing Environments

**NEVER test features directly in the Diamond repository!**

This project provides safe, isolated environments for testing:

#### 1. Automated Integration Tests (`tests/`)
- All integration tests run in temporary directories that auto-cleanup
- Tests spawn the actual `dm` binary for end-to-end validation
- Run with: `just test-integration` or `cargo ti` or `cargo test --test '*'`
- **Structure**:
  - `basic_tests.rs`: Core command functionality
  - `stack_ops_tests.rs`: Complex stack manipulations
  - `sync_restack_tests.rs`: Rebase and sync logic
  - `battle_hardness_tests.rs`: Chaos testing and stress scenarios
  - `edge_cases_tests.rs`: Robustness against bad states

#### 2. Manual Testing Playground (`sandbox/`)
```bash
just playground          # Creates sandbox/test-repo with git initialized
cd sandbox/test-repo    # Enter the test environment
../../target/debug/dm create my-feature  # Test commands safely
```

The playground is git-ignored and disposable. When exploring features or debugging, **always use the playground**, never the main repository.

#### 3. Unit Tests (Automatic Isolation)
- All unit tests use `tempdir()` to create isolated temporary repos
- Tests use `std::env::set_current_dir()` when needed
- No test should ever modify the Diamond repository

## Coding Style & Naming Conventions

- Rust 2021 edition; 4-space indentation (standard `rustfmt` defaults).
- Modules/files: `snake_case.rs` under `src/` and `src/commands/`.
- Types/traits: `CamelCase`; functions/vars: `snake_case`; constants: `SCREAMING_SNAKE_CASE`.
- Prefer small, focused command modules; keep CLI parsing in `src/main.rs` and logic in `src/commands/*`.
- **Dynamic Program Name**: Use `program_name()` instead of hardcoding "dm" in help and error messages.

## ๐Ÿ–ฅ๏ธ Interactive Features & TUI Guidelines

**CRITICAL: All interactive code must handle non-TTY environments.**

### TTY Detection Requirement

Any code that:
- Launches a TUI (`enable_raw_mode`, `Terminal::new`, `EnterAlternateScreen`)
- Reads from stdin (`stdin().read_line()`)
- Waits for user input

**MUST check if running in a TTY before attempting interaction.**

### Implementation Pattern

```rust
use std::io::IsTerminal;

// For TUI commands (log, checkout)
if !std::io::IsTerminal::is_terminal(&std::io::stdout()) {
    // Fall back to non-interactive mode or return clear error
    anyhow::bail!("command requires interactive mode. Usage: dm command <args>");
}

// For stdin prompts (cleanup confirmation)
if !std::io::IsTerminal::is_terminal(&std::io::stdin()) {
    anyhow::bail!("command requires --force when running non-interactively");
}
```

### Why This Matters

**Without TTY detection:**
- โŒ Tests hang forever waiting for input that never comes
- โŒ CI/CD pipelines timeout or fail mysteriously
- โŒ Scripts/automation break without clear error messages
- โŒ Test suite takes 2+ minutes instead of seconds

**With TTY detection:**
- โœ… Tests complete immediately with clear pass/fail
- โœ… Non-interactive environments get actionable error messages
- โœ… Scripts know to pass required flags (e.g., `--force`)
- โœ… Full test suite runs in seconds

### Testing Interactive Features

When adding interactive features:
1. **Add TTY detection FIRST** before any interactive code
2. **Write integration test** that calls the command from a non-TTY context
3. **Verify timeout protection**: Test should complete in <1 second
4. **Test error messages**: Ensure non-TTY errors are clear and actionable

### Current Interactive Commands

| Command | TTY Check | Fallback Behavior |
|---------|-----------|-------------------|
| `dm log` | โœ… stdout | Falls back to `short` mode |
| `dm checkout` | โœ… stdout | Returns error asking for branch name |
| `dm cleanup` | โœ… stdin | Returns error asking for `--force` |

### The `--force` / `-f` Convention

**All commands that prompt for confirmation MUST accept `--force` (and `-f`) to bypass the prompt.**

This is a Diamond-wide convention. Do not use `--no-interactive`, `--yes`, `-y`, or other variants.

**Rationale:**
- **Unix tradition**: `rm -f`, `git push -f`, `docker rm -f` โ€” developers have muscle memory for `-f`
- **Simplicity**: One flag to remember: "if it prompts, add `--force`"
- **Internal consistency**: All Diamond commands behave the same way
- **Brevity**: `-f` is quick to type in scripts

**Implementation pattern:**
```rust
#[derive(Parser)]
struct Args {
    /// Skip confirmation prompt
    #[arg(short, long)]
    force: bool,
}

// In command logic:
if !args.force && std::io::stdin().is_terminal() {
    // Show interactive prompt
    print!("Are you sure? [y/N] ");
    // ... handle input
} else if !args.force {
    anyhow::bail!("This command requires confirmation. Use --force to skip.");
}
// Proceed with operation
```

**Error messages** should always tell users about `--force`:
- โœ… `"This command requires confirmation. Use --force to skip."`
- โŒ `"Cannot run in non-interactive mode."`

## ๐ŸŽฏ Simplicity Principle

**CRITICAL: Always choose the simplest solution that works.**

This is a CLI tool for managing git branches, not a framework. Avoid:
- โŒ Over-abstraction: Don't create layers "for future flexibility"
- โŒ Premature optimization: Don't optimize code that isn't proven slow
- โŒ Complex patterns: Don't use design patterns unless they solve a real problem
- โŒ Unnecessary generics: Don't make code generic "just in case"
- โŒ Feature creep: Don't add features that weren't requested

Do:
- โœ… **Write straightforward code**: If it's simple and works, ship it
- โœ… **Solve the actual problem**: Address the specific need, nothing more
- โœ… **Delete code when possible**: Less code = fewer bugs
- โœ… **Copy-paste over abstraction**: Three similar functions are better than one confusing abstraction
- โœ… **Use stdlib first**: Prefer standard library over external crates when reasonable

**Rule of thumb:** If you're debating between a simple solution and a "clever" one, choose simple every time. You can always refactor later when complexity is justified by real requirements.

### YAGNI: You Aren't Gonna Need It

**CRITICAL: Do not write code "just in case" or "for future use".**

- โŒ **No `#[allow(dead_code)]`**: If code isn't used, delete it. Don't preserve it for hypothetical future needs.
- โŒ **No speculative features**: Don't add parameters, flags, or methods that aren't needed by current requirements.
- โŒ **No "reserved for future use"**: Comments like "TODO: might need this later" are a code smell. Delete the code.

**Why this matters:**
- Dead code rots: It doesn't get tested, updated, or maintained
- It creates confusion: Future readers wonder if they're missing something
- It's trivially recoverable: Git history preserves everything - you can always bring it back

**When you're tempted to keep unused code:**
1. Delete it
2. If you need it later, `git log -S "function_name"` will find it
3. Rewriting from scratch with fresh context is often better anyway

## Test Driven Development & Testing

**TDD (Red-Green-Refactor) is mandatory for this project.**

1. **Red**: Write a failing test for the desired behavior *before* writing any implementation code.
2. **Green**: Write the minimal code to make the test pass.
3. **Refactor**: Improve code structure while ensuring tests remain green.

### Guidelines
- **Unit Tests**: Co-locate with code in `#[cfg(test)] mod tests { ... }`.
- **Integration Tests**: Place in `tests/` for end-to-end CLI behavior (e.g., `tests/log_command.rs`).
- **Deterministic**: Tests must not rely on the environment (e.g., user's global git config).
- **Bug Fixes**: Always start by writing a test that reproduces the bug.

## โš ๏ธ Code Quality & Reliability Requirements

**CRITICAL: This tool manipulates git repositories and stack metadata. Bugs can cause data loss, repository corruption, or orphaned branches. The company's codebase depends on this tool's reliability.**

### Mandatory Test Coverage Standards

**Target: 85-95% code coverage for all non-TUI code.**

Every module must have comprehensive test coverage before being considered complete:

1. **All Happy Paths**: Every successful code path must be tested with real git operations
2. **All Error Paths**: Every failure mode must be tested and verified to fail gracefully
3. **All Edge Cases**: Empty inputs, corrupted data, missing files, detached HEAD, orphaned branches
4. **All State Mutations**: Every operation that modifies stack metadata or git state must be verified

### What Must Be Tested

#### git_gateway.rs (Git Operations)
- โœ… Branch creation, checkout, and existence checks
- โœ… Error handling: duplicate branches, non-existent branches, detached HEAD
- โœ… Current branch detection in all states
- โœ… Repository operations on real temp repositories (not mocked)
- โœ… Backup reference management for undo operations

#### state.rs & validation.rs (Metadata & Integrity)
- โœ… Save/load cycles with round-trip verification
- โœ… Corrupted JSON handling (must fail gracefully, not panic)
- โœ… Missing file handling (return empty state, not error)
- โœ… Parent-child relationship integrity (ConsistencyValidator)
- โœ… Cycle detection (CycleValidator)
- โœ… Orphaned branch handling
- โœ… Complex tree structures (multi-level stacks)
- โœ… Branch removal with and without children

#### commands/* (Command Logic)
- โœ… Full command workflows (git + metadata changes)
- โœ… Parent tracking accuracy
- โœ… Error messages for user mistakes
- โœ… Idempotency where appropriate
- โœ… State consistency after operations

### Critical Test Patterns

**Use serial_test for directory-changing tests:**
```rust
#[test]
#[serial]
fn test_with_cwd_change() -> Result<()> {
    let dir = tempdir()?;
    std::env::set_current_dir(dir.path())?;
    // Test logic
}
```

**Use real git repositories:**
```rust
fn init_test_repo(path: &Path) -> Result<Repository> {
    let repo = Repository::init(path)?;
    // Make initial commit so HEAD is valid
    let sig = git2::Signature::now("Test", "test@example.com")?;
    let tree_id = repo.index()?.write_tree()?;
    let tree = repo.find_tree(tree_id)?;
    repo.commit(Some("HEAD"), &sig, &sig, "Initial", &tree, &[])?;
    drop(tree);
    Ok(repo)
}
```

**Test both success and failure:**
```rust
#[test]
fn test_duplicate_branch_fails() -> Result<()> {
    create_branch("feature")?;
    let result = create_branch("feature");
    assert!(result.is_err());
    assert!(result.unwrap_err().to_string().contains("already exists"));
    Ok(())
}
```

### When Adding New Features

**MANDATORY CHECKLIST:**

- [ ] Write tests BEFORE implementation (TDD)
- [ ] Test happy path with real git operations (in unit tests with `tempdir()`)
- [ ] Add integration test for end-to-end workflow (e.g., `tests/basic_tests.rs` or specialized suite)
- [ ] Test all error conditions
- [ ] Test edge cases (empty, null, corrupted, missing)
- [ ] Verify state consistency after operations
- [ ] Manual verification in playground: `just playground` then test the feature
- [ ] Run `just test` or `cargo t` - all tests must pass (unit + integration)
- [ ] Run `cargo clippy -- -D warnings` - zero warnings
- [ ] Coverage: Aim for 90%+ of new code paths

**NEVER manually test features in the Diamond repository - always use `just playground`**

### Consequences of Insufficient Testing

**Without comprehensive tests, this tool could:**
- Corrupt the `refs/diamond/*` refs, losing all stack metadata
- Create orphaned branches that appear "lost" to users
- Fail to detect detached HEAD state, corrupting parent relationships
- Accept invalid state, causing crashes in `dm log`
- Leave the git repository in an inconsistent state
- Cause data loss that affects the entire engineering team

**Rule of thumb:** If you wouldn't trust the code with your company's main repository without the test, the test coverage is insufficient.

### Current Test Coverage Status

**As of late 2025: Comprehensive test suite with >90% coverage**

| Test Type | Location | Purpose |
|-----------|----------|---------|
| Unit Tests | `src/**/*.rs` | Module-level testing (git gateway, state validation, operational log) |
| Basic Integration | `tests/basic_tests.rs` | Core command functionality |
| Stack Operations | `tests/stack_ops_tests.rs` | Complex parent/child manipulations |
| Sync & Restack | `tests/sync_restack_tests.rs` | Rebasing and syncing logic |
| Navigation | `tests/navigation_tests.rs` | Movement between branches |
| Battle Hardness | `tests/battle_hardness_tests.rs` | Stress testing and complex workflows |
| Edge Cases | `tests/edge_cases_tests.rs` | Robustness against invalid states |

**All tests must pass with zero warnings:**
```bash
$ just test  # or: cargo t
$ cargo clippy --all-targets --all-features -- -D warnings
```

## Commit & Pull Request Guidelines

- Git history is not established yet; use Conventional Commits (`feat:`, `fix:`, `docs:`, `refactor:`, `test:`) with an imperative subject.
- PRs should include: what/why, repro or verification steps, and (for TUI changes) a screenshot or short recording.
- Required before merge: `cargo fmt`, `cargo clippy ... -D warnings`, and `cargo test` passing locally.

## Configuration & Safety Notes

- Diamond writes stack state to `refs/diamond/*` (git refs); do not hand-edit unless you know the format.
- `log` uses an alternate screen; if the terminal looks "stuck", run `reset` after exiting.

## ๐Ÿงช Testing Reminder

**Before testing any feature manually:**
1. Run `just playground` to create an isolated test environment
2. Navigate to `sandbox/test-repo/`
3. Test your feature there, not in the Diamond repository

**Before committing code:**
1. Run `just check` to validate formatting, linting, and all tests
2. Ensure all 206 tests pass (200 unit + 6 integration)
3. Add integration tests for new user-facing features in `tests/integration_test.rs`

## ๐Ÿ”ด QA Agent Guidelines (Diamond-Specific)

When using the `qa-tester` agent on this project, these are the Diamond-specific testing concerns:

### Environment Setup
- **Always** run `just playground` first to create an isolated test environment
- Work in `sandbox/test-repo/`, **never** the main Diamond repository
- Use `../../target/debug/dm` or `../../target/release/dm` to run the CLI

### State to Monitor
- `refs/diamond/parent/*` - Branch parent relationships (verify consistency)
- `refs/diamond/config/trunk` - Trunk configuration
- `.git/diamond/operation_state.json` - In-progress operation state

### Git State Edge Cases
Test `dm` commands in these challenging git states:
- Detached HEAD
- During merge conflict
- During rebase
- With dirty working tree (uncommitted changes)
- With branches that have diverged from remote
- With orphaned branches (parent deleted)
- With circular parent references (should be impossible, but verify)

### Critical Paths to Stress Test
1. **Stack Creation**: `dm create` with special chars, unicode, spaces, reserved names
2. **State Corruption Recovery**: Corrupt refs/diamond/* refs, then run commands
3. **Concurrent Operations**: Rapid `dm create && dm sync && dm restack`
4. **Navigation**: `dm up`, `dm down`, `dm top`, `dm bottom` at stack boundaries
5. **Cleanup**: `dm cleanup` with PRs in various states (merged, closed, open)

### Expected Error Behaviors
- Commands should fail gracefully with helpful messages, never panic
- State should remain consistent after errors (no partial updates)
- Interactive commands (`dm log`, `dm checkout`) must detect non-TTY and fail fast