Git repository cache management with worktree-based parallel operations
This module provides a caching system for Git repositories that enables safe parallel resource installation through Git worktrees. The cache was redesigned in AGPM v0.3.0 for better concurrency, a simpler architecture, and improved performance.
§Architecture Overview
The cache system implements a multi-layered architecture:
- Cache struct: Core repository management and worktree orchestration
- CacheLock: File-based locking for process-safe concurrent access
- WorktreeState: Instance-level caching for worktree lifecycle management
- Bare repositories: Optimized Git storage for efficient worktree creation
§Platform-Specific Cache Locations
The cache follows platform conventions for optimal performance:
- Linux/macOS: ~/.agpm/cache/ (following XDG standards)
- Windows: %LOCALAPPDATA%\agpm\cache\ (using Windows cache directory)
- Environment Override: Set AGPM_CACHE_DIR for custom locations
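The lookup order above can be sketched as a small pure function. This is only an illustration: `resolve_cache_dir` and its parameters are hypothetical names, not part of the `agpm_cli` API, and the real implementation may resolve directories differently.

```rust
use std::path::PathBuf;

/// Resolve a cache directory from an optional override, a home directory,
/// and an optional Windows-style local app data directory.
/// Hypothetical helper for illustration; not the crate's actual API.
pub fn resolve_cache_dir(
    override_dir: Option<&str>,
    home_dir: &str,
    local_app_data: Option<&str>,
) -> PathBuf {
    // A non-empty override (for example the value of AGPM_CACHE_DIR) wins.
    if let Some(dir) = override_dir.filter(|d| !d.is_empty()) {
        return PathBuf::from(dir);
    }
    // Windows-style location when a local app data directory is supplied.
    if let Some(base) = local_app_data {
        return PathBuf::from(base).join("agpm").join("cache");
    }
    // Linux/macOS default under the home directory.
    PathBuf::from(home_dir).join(".agpm").join("cache")
}
```

A caller would typically pass `std::env::var("AGPM_CACHE_DIR").ok().as_deref()` as the override; keeping the function free of direct environment access makes it easy to test.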
§Cache Directory Structure
The cache is organized for optimal parallel access patterns:
~/.agpm/cache/
├── sources/ # Bare repositories optimized for worktrees
│ ├── github_owner_repo.git/ # Bare repo with all Git objects
│ └── gitlab_org_project.git/ # URL-parsed directory naming
├── worktrees/ # SHA-based worktrees for maximum deduplication
│ ├── github_owner_repo_abc12345/ # First 8 chars of commit SHA
│ ├── github_owner_repo_def67890/ # Each unique commit gets one worktree
│ ├── .state.json # Persistent worktree registry
│ └── github_owner_repo_456789ab/ # Multiple refs to same SHA share worktree
└── .locks/ # Fine-grained locking infrastructure
    ├── github_owner_repo.lock # Repository-level locks
    └── worktree-owner_repo-v1.lock # Worktree creation locks
§Enhanced Concurrency Architecture
The v0.3.2+ cache implements SHA-based worktree optimization with advanced concurrency:
- SHA-based deduplication: Worktrees keyed by commit SHA, not version reference
- Centralized resolution: VersionResolver handles batch SHA resolution upfront
- Maximum reuse: Multiple tags/branches pointing to same commit share one worktree
- Instance-level caching: WorktreeState tracks creation across threads
- Per-worktree file locking: Fine-grained locks prevent creation conflicts
- Direct parallelism control: --max-parallel flag controls concurrency
- Command-instance fetch caching: Single fetch per repository per command
- Atomic state transitions: Pending → Ready state coordination
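To make the SHA-based deduplication and the Pending → Ready transition concrete, here is a minimal sketch. The `Registry` type, `worktree_key`, and the `State` enum are hypothetical stand-ins for illustration; they are not the crate's `WorktreeState` and say nothing about its real fields or methods.

```rust
use std::collections::HashMap;

/// Lifecycle of a worktree entry in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Pending,
    Ready,
}

/// Build a worktree directory name from a source name and the first
/// eight characters of the commit SHA, as in the directory layout above.
pub fn worktree_key(source: &str, sha: &str) -> String {
    let short: String = sha.chars().take(8).collect();
    format!("{source}_{short}")
}

/// Hypothetical in-memory registry keyed by commit SHA rather than by ref name.
#[derive(Default)]
pub struct Registry {
    entries: HashMap<String, State>,
}

impl Registry {
    /// Returns true if the caller is the first to claim this commit and
    /// should create the worktree; later claims for the same SHA reuse it.
    pub fn claim(&mut self, source: &str, sha: &str) -> bool {
        let key = worktree_key(source, sha);
        if self.entries.contains_key(&key) {
            return false;
        }
        self.entries.insert(key, State::Pending);
        true
    }

    /// Mark the worktree as fully created.
    pub fn mark_ready(&mut self, source: &str, sha: &str) {
        self.entries.insert(worktree_key(source, sha), State::Ready);
    }

    pub fn state(&self, source: &str, sha: &str) -> Option<State> {
        self.entries.get(&worktree_key(source, sha)).copied()
    }

    pub fn worktree_count(&self) -> usize {
        self.entries.len()
    }
}
```

Because the key is derived from the resolved SHA, a tag and a branch that point at the same commit map to the same entry, so only the first claim triggers worktree creation.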
§Locking Strategy
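As a rough illustration of the per-source behaviour shown in the diagram in this section, the sketch below uses exclusive lock-file creation from the standard library. It is an assumption-laden stand-in: `SourceLock` is a hypothetical type, it fails fast instead of blocking, and the crate's `CacheLock` may well rely on OS-level advisory locks instead.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical guard that owns a lock file for one source.
pub struct SourceLock {
    path: PathBuf,
}

impl SourceLock {
    /// Try to take the lock for `source`. Returns `Ok(None)` when another
    /// holder already owns it; a blocking variant would retry with a delay.
    pub fn try_acquire(locks_dir: &Path, source: &str) -> io::Result<Option<SourceLock>> {
        fs::create_dir_all(locks_dir)?;
        let path = locks_dir.join(format!("{source}.lock"));
        // `create_new` fails with AlreadyExists if the file is present,
        // which makes creation an atomic test-and-set.
        match OpenOptions::new().write(true).create_new(true).open(&path) {
            Ok(_) => Ok(Some(SourceLock { path })),
            Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(None),
            Err(e) => Err(e),
        }
    }
}

impl Drop for SourceLock {
    fn drop(&mut self) {
        // Release the lock by removing the file; errors are ignored here.
        let _ = fs::remove_file(&self.path);
    }
}
```

One trade-off of this approach is that a crashed process leaves its lock file behind, which is one reason real implementations often prefer advisory file locks that the operating system releases automatically.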
Process A: acquire("source1") ───┐
├─── BLOCKS: same source
Process B: acquire("source1") ───┘
Process C: acquire("source2") ───── CONCURRENT: different source
§Cache Operations
§Repository Management
- Clone: Initial repository cloning from remote URLs
- Update: Fetch latest changes from remote (git fetch)
- Checkout: Switch to specific versions (tags, branches, commits)
- Cleanup: Remove unused repositories to reclaim disk space
§Resource Installation
- Copy-based: Files copied from cache to project directories
- Path resolution: Handles relative paths within repositories
- Directory creation: Automatically creates parent directories
- Overwrite safety: Replaces existing files atomically
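The installation steps above (resolve the path inside the repository, create parent directories, replace the destination atomically) might look roughly like this. `install_file` is a hypothetical helper written against the standard library only; it is not the crate's `copy_resource` and does not claim to match its behaviour.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Copy `relative` from `repo_root` to `dest`, creating parent directories
/// and replacing any existing file via a temporary file plus rename.
/// Hypothetical helper for illustration only.
pub fn install_file(repo_root: &Path, relative: &str, dest: &Path) -> io::Result<()> {
    let source = repo_root.join(relative);

    // Make sure the destination directory exists.
    if let Some(parent) = dest.parent() {
        fs::create_dir_all(parent)?;
    }

    // Write to a sibling temp file so the rename stays on one filesystem.
    let mut tmp_name = dest
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "destination has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp");
    let tmp = dest.with_file_name(tmp_name);

    fs::copy(&source, &tmp)?;
    // Rename replaces an existing destination in a single step.
    fs::rename(&tmp, dest)
}
```

Copying to a temporary sibling first means a reader of the destination sees either the old file or the complete new one, never a partially written file.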
§Performance Characteristics
The cache is optimized for common AGPM workflows:
- First install: Clone repository once, reuse for all resources
- Subsequent installs: Copy from local cache (fast file operations)
- Version switching: Git checkout within cached repository
- Parallel operations: Multiple sources can be processed concurrently
§Disk Space Management
- Size calculation: Recursive directory size calculation
- Unused cleanup: Remove repositories no longer referenced
- Complete cleanup: Clear entire cache when needed
- Selective removal: Keep active sources, remove only unused ones
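A recursive size calculation of the kind described above can be sketched with the standard library alone. `dir_size` is a hypothetical synchronous helper; the crate's `get_cache_size` is async and may count entries differently (for example, how it treats symlinks).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Sum the sizes of all regular files under `path`, descending into
/// subdirectories. Symlinks are skipped in this sketch.
pub fn dir_size(path: &Path) -> io::Result<u64> {
    let mut total = 0u64;
    for entry in fs::read_dir(path)? {
        let entry = entry?;
        // `file_type` does not follow symlinks, so linked trees are not counted.
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            total += dir_size(&entry.path())?;
        } else if file_type.is_file() {
            total += entry.metadata()?.len();
        }
    }
    Ok(total)
}
```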
§Error Handling and Recovery
The cache provides comprehensive error handling:
- Lock timeouts: Graceful handling of concurrent access
- Clone failures: Network and authentication error reporting
- Version errors: Clear messages for invalid tags/branches/commits
- File system errors: Detailed context for permission and space issues
§Security Considerations
- Path validation: Prevents directory traversal attacks
- Lock file isolation: Prevents lock file manipulation
- Safe file operations: Atomic operations prevent corruption
- Permission handling: Respects file system permissions
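One common way to implement the path validation mentioned above is to reject any resource path that is absolute or contains a parent-directory component. The function below is a conservative sketch under that assumption; `is_safe_relative_path` is a hypothetical name and the crate's actual checks are not shown in this documentation.

```rust
use std::path::{Component, Path};

/// Accept only non-empty relative paths made of normal components
/// (and `.`), rejecting `..`, root, and drive prefixes.
/// Hypothetical check for illustration only.
pub fn is_safe_relative_path(path: &str) -> bool {
    !path.is_empty()
        && Path::new(path)
            .components()
            .all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}
```

This check is deliberately strict: a path such as `a/../b` stays inside the repository but is still rejected, which avoids having to normalize paths before validating them.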
§Usage Examples
§Basic Cache Operations
use agpm_cli::cache::Cache;
use std::path::PathBuf;
// Initialize cache with default location
let cache = Cache::new()?;
// Get or clone a source repository
let repo_path = cache.get_or_clone_source(
    "community",
    "https://github.com/example/agpm-community.git",
    Some("v1.0.0") // Specific version
).await?;
// Copy a resource from cache to project
cache.copy_resource(
    &repo_path,
    "agents/helper.md", // Source path in repository
    &PathBuf::from("./agents/helper.md") // Destination in project
).await?;
§Cache Maintenance
use agpm_cli::cache::Cache;
let cache = Cache::new()?;
// Check cache size
let size_bytes = cache.get_cache_size().await?;
println!("Cache size: {} MB", size_bytes / 1024 / 1024);
// Clean unused repositories
let active_sources = vec!["community".to_string(), "work".to_string()];
let removed_count = cache.clean_unused(&active_sources).await?;
println!("Removed {} unused repositories", removed_count);
// Complete cache cleanup
cache.clear_all().await?;
§Custom Cache Location
use agpm_cli::cache::Cache;
use std::path::PathBuf;
// Use custom cache directory (useful for testing or special setups)
let custom_dir = PathBuf::from("/tmp/my-agpm-cache");
let cache = Cache::with_dir(custom_dir)?;
println!("Using cache at: {}", cache.get_cache_location().display());
§Integration with AGPM Workflow
The cache module integrates seamlessly with AGPM’s dependency management:
- Manifest parsing: Source URLs extracted from agpm.toml
- Dependency resolution: Version constraints resolved to specific commits
- Cache population: Repositories cloned and checked out as needed
- Resource installation: Files copied from cache to project directories
- Lockfile generation: Installed resources tracked in agpm.lock
See crate::manifest for manifest handling and crate::lockfile for
lockfile management.
Re-exports§
pub use lock::CacheLock;
Modules§
- lock: File-based locking mechanism for cache operations
Structs§
- Cache: Git repository cache for efficient resource management