guardy 0.2.4

Fast, secure git hooks in Rust with secret scanning and protected file synchronization
# Git Module - High-Performance Pure Rust Git Operations 🦀

This module provides **microsecond-level Git operations** in pure Rust via
[gitoxide (gix)](https://github.com/Byron/gitoxide). It handles all Git repository operations and
file discovery for Guardy, with typical queries completing in well under a millisecond.

## 🚀 Performance Benefits

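Our pure gix implementation is **6-175x faster** than spawning the external `git` command, and
roughly 3x faster than libgit2 for staged-file discovery. Speedups in the table are relative to
the external command: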

| Operation                              | External Git Command | libgit2 (C) | **gix (Pure Rust)** | **Speedup**      |
| -------------------------------------- | -------------------- | ----------- | ------------------- | ---------------- |
| Staged Files Discovery                 | 7.31ms               | 140μs       | **42μs**            | **175x faster**  |
| Repository Discovery                   | 2.5ms                | N/A         | **350μs**           | **7x faster**    |
| All Files Listing                      | 3-5ms                | ~500μs      | **100-500μs**       | **6-50x faster** |

**Real-world impact in Guardy hooks:**

- **Before**: 20.9ms total hook execution
- **After**: 16.5ms total hook execution (**21% faster**)
- Git operations: **~1ms total** (was ~6ms)

## 🏗️ Architecture & Caching

### Intelligent Multi-Level Caching

```rust
// Application-wide repository caching
static GIT_REPO: LazyLock<Option<Arc<GitRepo>>> = LazyLock::new(|| {
    GitRepo::discover().ok().map(Arc::new)
});
// Per-hook execution caching
pub struct HookFileCache {
    staged_files: OnceLock<Vec<PathBuf>>,    // Cached on first access
    all_files: OnceLock<Vec<PathBuf>>,       // Cached on first access
    push_files: OnceLock<Vec<PathBuf>>,      // Cached on first access
}
```
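
To make the two cache levels concrete, here is a minimal std-only sketch. The `GitRepo` below is a hypothetical stand-in that only counts how often its query runs, and `demo_query_count` is an illustrative helper, not part of Guardy's API. It assumes the real cache holds a shared repository handle in a similar way.

```rust
use std::path::PathBuf;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, LazyLock, OnceLock};

/// Hypothetical stand-in for the real `GitRepo`; it only counts how often
/// the (pretend) expensive staged-files query runs.
pub struct GitRepo {
    queries: AtomicUsize,
}

impl GitRepo {
    fn discover() -> Result<Self, String> {
        Ok(GitRepo { queries: AtomicUsize::new(0) })
    }

    fn get_staged_files(&self) -> Vec<PathBuf> {
        self.queries.fetch_add(1, Ordering::SeqCst);
        vec![PathBuf::from("src/main.rs")]
    }
}

// Level 1: the repository is discovered at most once per process, on first access.
static GIT_REPO: LazyLock<Option<Arc<GitRepo>>> =
    LazyLock::new(|| GitRepo::discover().ok().map(Arc::new));

// Level 2: one cache per hook run; the list is computed on first access only.
pub struct HookFileCache {
    repo: Arc<GitRepo>,
    staged_files: OnceLock<Vec<PathBuf>>,
}

impl HookFileCache {
    pub fn new(repo: Arc<GitRepo>) -> Self {
        Self { repo, staged_files: OnceLock::new() }
    }

    pub fn get_staged_files(&self) -> &[PathBuf] {
        self.staged_files.get_or_init(|| self.repo.get_staged_files())
    }
}

/// Reads the staged files three times through the cache and reports how many
/// times the underlying repository was actually queried.
pub fn demo_query_count() -> usize {
    let repo = (*GIT_REPO).clone().expect("stand-in discovery never fails");
    let before = repo.queries.load(Ordering::SeqCst);
    let cache = HookFileCache::new(Arc::clone(&repo));
    for _ in 0..3 {
        assert_eq!(cache.get_staged_files().len(), 1);
    }
    repo.queries.load(Ordering::SeqCst) - before
}

fn main() {
    println!("underlying queries: {}", demo_query_count());
}
```

Returning a borrowed slice from `get_or_init` means repeated reads neither re-run the query nor clone the list.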

### Thread-Safe Design

- Uses `gix::ThreadSafeRepository` for concurrent access
- `Arc<GitRepo>` for safe sharing across threads
- Optimized for parallel hook execution
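
The sharing pattern in the list above can be sketched with the standard library alone. `GitRepo` here is a hypothetical read-only stand-in (the real type wraps `gix::ThreadSafeRepository`), and `count_with_extension`, `run_checks_in_parallel`, and `sample_repo` are illustrative names.

```rust
use std::path::PathBuf;
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for `GitRepo`: read-only data that is `Send + Sync`.
struct GitRepo {
    tracked: Vec<PathBuf>,
}

impl GitRepo {
    fn count_with_extension(&self, ext: &str) -> usize {
        self.tracked
            .iter()
            .filter(|p| p.extension().and_then(|e| e.to_str()) == Some(ext))
            .count()
    }
}

/// Runs one check per extension on its own thread, all sharing one repo handle.
fn run_checks_in_parallel(repo: &Arc<GitRepo>, extensions: &[&'static str]) -> Vec<usize> {
    let handles: Vec<thread::JoinHandle<usize>> = extensions
        .iter()
        .map(|&ext| {
            let repo = Arc::clone(repo); // cheap: only bumps a reference count
            thread::spawn(move || repo.count_with_extension(ext))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("check thread panicked"))
        .collect()
}

fn sample_repo() -> Arc<GitRepo> {
    Arc::new(GitRepo {
        tracked: vec![
            PathBuf::from("src/lib.rs"),
            PathBuf::from("src/main.rs"),
            PathBuf::from("README.md"),
        ],
    })
}

fn main() {
    let counts = run_checks_in_parallel(&sample_repo(), &["rs", "md"]);
    println!("{counts:?}");
}
```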

## 📁 Module Structure

### Core Implementation

- **`mod.rs`** - Main `GitRepo` struct with gix integration
- **`operations.rs`** - High-performance file discovery operations
- **`remote.rs`** - Remote repository operations

### Future Extensions

- **`hooks.rs`** - Git hook installation and management (planned)
- **`commit.rs`** - Commit-related operations (planned)

## 🔧 Current API

### GitRepo (mod.rs)

```rust
impl GitRepo {
    pub fn discover() -> Result<Self>              // Find repo from current directory
    pub fn open(path: &Path) -> Result<Self>       // Open repo at specific path
    pub fn current_branch(&self) -> Result<String> // Get current branch name
    pub fn git_dir(&self) -> PathBuf               // Get .git directory path
}
```

### High-Performance Operations (operations.rs)

```rust
impl GitRepo {
    // 🚀 42μs - Primary use case for pre-commit hooks
    pub fn get_staged_files(&self) -> Result<Vec<PathBuf>>
    // 🚀 100-500μs - All tracked files in repository
    pub fn get_all_files(&self) -> Result<Vec<PathBuf>>
    // For pre-push hooks (TODO: implement proper diff)
    pub fn get_push_files(&self, remote: &str, branch: &str) -> Result<Vec<PathBuf>>
    // TODO: Implement with gix status API
    pub fn get_modified_files(&self) -> Result<Vec<PathBuf>>
}
```

## ⚡ Implementation Details

### Staged Files Detection Algorithm

Our implementation compares the Git index directly with the HEAD tree for maximum performance:

```rust
// Simplified excerpt: error conversions are elided, and exact gix signatures
// vary between gix releases.
pub fn get_staged_files(&self) -> Result<Vec<PathBuf>> {
    let repo = self.gix_repo();
    let index = repo.index()?;
    let mut staged_files = Vec::new();
    // Handle initial commit case: with no HEAD tree, every index entry is staged
    let head_tree = match repo.head_tree_id() {
        Ok(tree_id) => Some(repo.find_tree(tree_id)?),
        Err(_) => None,
    };
    // Compare index entries with HEAD tree entries
    for entry in index.entries() {
        let path = gix::path::from_bstr(entry.path(&index)).into_owned();
        let Some(tree) = head_tree.as_ref() else {
            staged_files.push(path);
            continue;
        };
        match tree.lookup_entry_by_path(&path) {
            Ok(Some(tree_entry)) => {
                if entry.id != tree_entry.object_id() {
                    // File content differs = staged change
                    staged_files.push(path);
                }
            },
            Ok(None) | Err(_) => {
                // File not in HEAD = newly added (staged)
                staged_files.push(path);
            }
        }
    }
    Ok(staged_files)
}
```
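
The comparison logic can be tried without gix. The sketch below is a std-only model, not the module's code: `Snapshot`, `ObjectId`, `staged_paths`, and `snapshot` are hypothetical names, and both the index and the HEAD tree are flattened into path-to-id maps.

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Stand-in for a Git object id (real ids are SHA-1 or SHA-256 hashes).
type ObjectId = u64;
/// Stand-in for both the index and a flattened HEAD tree: path -> blob id.
type Snapshot = BTreeMap<PathBuf, ObjectId>;

/// Returns index paths whose blob id differs from HEAD or that HEAD lacks.
/// `head` is `None` before the first commit, so every entry counts as staged.
fn staged_paths(index: &Snapshot, head: Option<&Snapshot>) -> Vec<PathBuf> {
    index
        .iter()
        .filter(|(path, id)| match head.and_then(|tree| tree.get(*path)) {
            Some(head_id) => head_id != *id, // in HEAD: staged only if the content changed
            None => true,                    // not in HEAD (or no HEAD yet): newly added
        })
        .map(|(path, _)| path.clone())
        .collect()
}

/// Builds a snapshot from `(path, id)` pairs.
fn snapshot(entries: &[(&str, ObjectId)]) -> Snapshot {
    entries
        .iter()
        .map(|(path, id)| (PathBuf::from(*path), *id))
        .collect()
}

fn main() {
    let head = snapshot(&[("a.rs", 1), ("b.rs", 2)]);
    let index = snapshot(&[("a.rs", 1), ("b.rs", 9), ("c.rs", 3)]);
    println!("{:?}", staged_paths(&index, Some(&head)));
}
```

Like the excerpt above, this loop only visits index entries, so a path removed from the index but still present in HEAD is not reported.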

### Why Not External Commands?

- **Process spawning overhead**: 3-7ms per `git` command
- **Parsing overhead**: String parsing and validation
- **I/O bottlenecks**: Multiple filesystem operations
- **Error handling complexity**: Process exit codes and stderr parsing
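
For contrast, here is a rough sketch of the external-command approach these points describe. It assumes the command being replaced is `git diff --cached --name-only -z`, which the benchmark may or may not use. The function names are illustrative.

```rust
use std::path::PathBuf;
use std::process::Command;

/// Splits NUL-separated output (as produced with `-z`) into paths.
/// Non-UTF-8 bytes are replaced lossily, which a real implementation
/// would want to avoid.
fn parse_nul_separated(stdout: &[u8]) -> Vec<PathBuf> {
    stdout
        .split(|&b| b == 0)
        .filter(|chunk| !chunk.is_empty())
        .map(|chunk| PathBuf::from(String::from_utf8_lossy(chunk).into_owned()))
        .collect()
}

/// Spawns `git`, checks the exit status, and parses stdout.
fn staged_files_via_command() -> Result<Vec<PathBuf>, String> {
    let output = Command::new("git")
        .args(["diff", "--cached", "--name-only", "-z"])
        .output()
        .map_err(|e| format!("failed to spawn git: {e}"))?;
    if !output.status.success() {
        return Err(String::from_utf8_lossy(&output.stderr).trim().to_string());
    }
    Ok(parse_nul_separated(&output.stdout))
}

fn main() {
    match staged_files_via_command() {
        Ok(files) => println!("{} staged file(s)", files.len()),
        Err(err) => eprintln!("git failed: {err}"),
    }
}
```

Even this small version has to spawn a process, inspect the exit status, read stderr, and parse bytes back into paths.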

### Why gix Over libgit2?

- **About 3.3x faster** than libgit2 for staged files (42μs vs 140μs)
- **Pure Rust** - no C dependencies or FFI overhead
- **Memory safe** - no risk of C memory issues
- **Better error handling** - Rust's Result type
- **Thread safety** - Built-in thread-safe operations
- **Maintenance** - Active pure-Rust development

## 🧪 Performance Benchmarking

Run the included benchmark to compare all approaches:

```bash
# Run performance comparison
cargo run --example git_timing_test --release
# Expected output:
# Command:  7.307ms
# libgit2:  140.213μs  (52x faster)
# gix:      41.77μs    (175x faster) ← Winner!
```

For comprehensive benchmarks:

```bash
cargo bench --bench git_performance_comparison
```
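
For a quick ad-hoc measurement outside those benchmarks, a timing helper along these lines may be enough. This is a hypothetical sketch, not the code behind `git_timing_test`, and `mean_duration` is an illustrative name.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `op` `iterations` times and returns the mean wall-clock time per call.
fn mean_duration<T>(iterations: u32, mut op: impl FnMut() -> T) -> Duration {
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(op()); // keep the optimizer from discarding the result
    }
    start.elapsed() / iterations
}

fn main() {
    // Placeholder workload; swap in the Git operation under test.
    let mean = mean_duration(1_000, || (0..1_000u64).sum::<u64>());
    println!("mean per call: {mean:?}");
}
```

Averaging over many iterations in a release build matters at this scale, since a single microsecond-range sample is dominated by noise.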

## 🎯 Design Principles

1. **Microsecond Performance** - Sub-millisecond operations for all Git queries
2. **Pure Rust** - Zero C dependencies, maximum safety and performance
3. **Smart Caching** - Multi-level caching for optimal repeated access
4. **Thread Safety** - Designed for concurrent hook execution
5. **Correctness First** - Proper Git semantics, not shortcuts
6. **Commercial Grade** - Production-ready error handling and logging

## 🔄 Integration Patterns

### Recommended Usage

```rust
// Get cached repository (fast - uses LazyLock)
let repo = get_cached_git_repo()?;
// Create per-hook cache (enables OnceLock caching)
let cache = HookFileCache::new(repo, "pre-commit");
cache.precompute(); // Start background loading
// Get files (42μs first call, ~0μs subsequent calls)
let staged_files = cache.get_staged_files();
```
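
One way `precompute()` could overlap loading with other hook work is sketched below. This `HookFileCache` is a hypothetical std-only stand-in, the `loads` counter exists only for illustration, and the real method may work differently.

```rust
use std::path::PathBuf;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};
use std::thread;

/// Hypothetical cache; `loads` counts how often the slow path runs.
#[derive(Default)]
struct HookFileCache {
    staged_files: OnceLock<Vec<PathBuf>>,
    loads: AtomicUsize,
}

impl HookFileCache {
    /// Pretend-expensive query standing in for the real staged-files lookup.
    fn load_staged_files(&self) -> Vec<PathBuf> {
        self.loads.fetch_add(1, Ordering::SeqCst);
        vec![PathBuf::from("Cargo.toml")]
    }

    fn get_staged_files(&self) -> &[PathBuf] {
        self.staged_files.get_or_init(|| self.load_staged_files())
    }

    /// Starts filling the cache on a background thread.
    fn precompute(cache: &Arc<Self>) -> thread::JoinHandle<()> {
        let cache = Arc::clone(cache);
        thread::spawn(move || {
            cache.get_staged_files();
        })
    }
}

/// Races a background precompute against a foreground read and reports how
/// many times the slow path ran.
fn demo_loads() -> usize {
    let cache = Arc::new(HookFileCache::default());
    let handle = HookFileCache::precompute(&cache);
    let first = cache.get_staged_files().len();
    handle.join().expect("precompute thread panicked");
    let second = cache.get_staged_files().len();
    assert_eq!(first, second);
    cache.loads.load(Ordering::SeqCst)
}

fn main() {
    println!("slow path ran {} time(s)", demo_loads());
}
```

`OnceLock::get_or_init` runs only one initializer even when several threads call it at once, so the foreground read and the background thread cannot both do the work.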

### Anti-Patterns to Avoid

```rust
// ❌ DON'T: Create new GitRepo instances repeatedly
let repo = GitRepo::discover()?; // 350μs each time
// ❌ DON'T: Call get_staged_files() multiple times without caching
for _ in 0..10 {
    let files = repo.get_staged_files()?; // 42μs × 10 = 420μs
}
// ✅ DO: Use the caching layer
let cache = HookFileCache::new(repo, "pre-commit");
let files = cache.get_staged_files(); // 42μs once, then cached
```

## 🚧 Future Enhancements

### Planned Features

- **Modified files detection** using gix status API
- **Proper push files diff** between local and remote branches
- **Git hooks management** for programmatic hook installation
- **Commit operations** for automated commit workflows
- **Branch operations** for advanced Git workflows

### Performance Targets

- [ ] Sub-10μs staged file detection for small repositories
- [ ] Sub-100μs operations for repositories with 10k+ files
- [ ] Parallel file discovery for large monorepos
- [ ] Memory-mapped index reading for extreme performance

## 🔍 When to Extend This Module

### Add to `mod.rs` when:

- Adding fundamental repository operations
- Extending GitRepo with new core capabilities
- Adding repository discovery/initialization logic

### Add to `operations.rs` when:

- Adding new file discovery operations
- Implementing Git status/diff functionality
- Creating file listing optimizations

### Add to `remote.rs` when:

- Adding remote repository operations
- Implementing push/pull functionality
- Managing remote branch comparisons

---

**This module powers Guardy's commercial-grade Git operations with microsecond-level performance.
The pure Rust implementation ensures memory safety, thread safety, and exceptional speed for
production workloads.** 🚀