# Architecture Documentation
This document provides a comprehensive overview of the Vibe Coding Tracker's architecture, design patterns, and system organization.
## Table of Contents
- [System Overview](#system-overview)
- [Module Architecture](#module-architecture)
- [Data Flow](#data-flow)
- [Key Components](#key-components)
- [Design Patterns](#design-patterns)
- [External Dependencies](#external-dependencies)
- [Performance Considerations](#performance-considerations)
## System Overview
Vibe Coding Tracker is a Rust-based CLI tool designed to analyze AI coding assistant usage across multiple platforms (Claude Code, Codex, and Gemini). The system employs a modular architecture with clear separation of concerns:
```
┌───────────────────────┐
│ Startup Check │
│ - update notif. │
│ - 24h cache │
│ - silent fail │
└───────────┬───────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ CLI Layer │
│ (clap-based command routing) │
└────────┬────────────────────────────────┬──────────┬────────┘
│ │ │
▼ ▼ ▼
┌────────────────┐ ┌────────────────────┐ ┌──────────────┐
│ Usage Analysis │ │ Conversation │ │ Update │
│ - calculator │ │ Analysis │ │ - github.rs │
│ - display │ │ - analyzer.rs │ │ - platform │
└────────┬───────┘ └────────┬───────────┘ │ - archive │
│ │ └──────────────┘
│ ┌─────────────┼────────┐
│ │ │ │
▼ ▼ ▼ ▼
┌─────────────────┐ ┌──────────────────────────┐
│ Pricing System │ │ Format Detection │
│ - fetch_model_ │ │ - Claude/Codex/Gemini │
│ pricing() │ │ - detector.rs │
│ - fuzzy match │ │ - claude_analyzer.rs │
│ - caching │ │ - codex_analyzer.rs │
└─────────────────┘ │ - gemini_analyzer.rs │
│ └──────────────────────────┘
└────────┬────────────┘
▼
┌───────────────────────┐
│ Data Models │
│ - DateUsageResult │
│ - CodeAnalysis │
│ - Claude/Codex/ │
│ Gemini │
└───────────────────────┘
│
▼
┌───────────────────────┐
│ Output Layer │
│ - TUI (Ratatui) │
│ - Table (comfy) │
│ - JSON/Text │
└───────────────────────┘
```
## Module Architecture
### Core Modules
#### 1. Entry Points (`main.rs`, `lib.rs`, `cli.rs`)
**main.rs**
- Application entry point
- Command routing to usage/analysis pipelines
- Error handling and exit codes
**lib.rs**
- Library exports for external use
- Version information (`get_version()`)
- Public API surface
**cli.rs**
- Clap-based CLI definitions
- Commands: `usage`, `analysis`, `version`
- Argument parsing and validation
#### 2. Models Module (`src/models/`)
Defines core data structures with serde serialization:
**usage.rs**
- `DateUsageResult`: Date-indexed usage map (maps date → model → usage data)
**analysis.rs**
- `CodeAnalysis`: Comprehensive conversation metadata
- User info (username, hostname, machineId)
- Git metadata (remote URL, commit hash)
- Token usage breakdown
- Tool call statistics
- File operation counts
**claude.rs**
- `ClaudeCodeLog`: Claude Code session log format
**codex.rs**
- `CodexMessage`: OpenAI/Codex message format
- `CompletionResponse`: Response with usage data
- `ReasoningTokens`: Reasoning output token tracking
**gemini.rs**
- `GeminiMessage`: Gemini message format
- `GeminiUsage`: Token usage tracking for Gemini
- `GeminiContent`: Content types for Gemini API
#### 3. Pricing System (`pricing.rs`)
**Responsibilities:**
- Fetch LiteLLM pricing data from GitHub
- Cache pricing with 24-hour TTL (daily cache files)
- Fuzzy model name matching (Jaro-Winkler algorithm)
**Matching Strategy (Priority Order):**
1. **Exact match**: Direct string equality
2. **Normalized match**: Strip version suffixes (e.g., `-20250514`)
3. **Substring match**: Check if model name contains pricing key
4. **Fuzzy match**: Jaro-Winkler similarity ≥ 0.7 threshold
5. **Fallback**: Return $0.00 if no match found
**Cache Location:**
```
~/.vibe_coding_tracker/model_pricing_YYYY-MM-DD.json
```
**Cost Calculation:**
```rust
cost = (input × input_cost_per_token) +
(output × output_cost_per_token) +
(cache_read × cache_read_cost_per_token) +
(cache_creation × cache_creation_cost_per_token)
```
#### 4. Usage Analysis Module (`src/usage/`)
**calculator.rs**
- `get_usage_from_directories()`: Main aggregation function
- Scans `~/.claude/projects/*.jsonl`
- Scans `~/.codex/sessions/*.jsonl`
- Scans `~/.gemini/tmp/<project_hash>/chats/*.json`
- Aggregates tokens by date and model
- Calculates costs via pricing system
**display.rs**
- `display_usage_interactive()`: Ratatui-based TUI with 1s refresh
- `display_usage_table()`: Static comfy-table output (UTF8_FULL preset)
- `display_usage_text()`: Plain text format (`Date > model: cost`)
- `display_usage_json()`: Full-precision JSON output
#### 5. Analysis Module (`src/analysis/`)
**analyzer.rs**
- `analyze_jsonl_file()`: Single file analysis entry point
- Routes to Claude/Codex analyzers based on detection
**batch_analyzer.rs**
- `analyze_all_sessions()`: Batch processing for all sessions
- Aggregates metrics by date and model:
- Edit/Read/Write line counts
- Tool call counts (Bash, Edit, Read, TodoWrite, Write)
**detector.rs**
- `detect_format()`: Determines Claude, Codex, or Gemini format
- Detection logic: Checks for `parentUuid` field (Claude-specific)
**claude_analyzer.rs**
- Parses Claude Code JSONL format
- Extracts:
- Tool usage (tool_use blocks in content)
- File operations (Edit, Read, Write line counts)
- Git metadata
- Conversation metrics
**codex_analyzer.rs**
- Parses Codex/OpenAI JSONL format
- Extracts:
- Token usage from `completion_response.usage`
- Reasoning tokens
- Total token usage
**gemini_analyzer.rs**
- Parses Gemini JSON format
- Extracts:
- Token usage from session messages
- Message content and metadata
- Session-level statistics
**display.rs**
- `display_analysis_interactive()`: Ratatui TUI for batch analysis
- Columns: Date, Model, Edit Lines, Read Lines, Write Lines, Tool Counts
- Auto-refresh, totals row, memory usage
- `display_analysis_table()`: Static table output
#### 6. Update Module (`src/update/`)
**mod.rs**
- `update_interactive()`: Interactive update with user confirmation
- `check_update()`: Check for updates without installing
- `perform_update()`: Execute the update process
- Version comparison using `semver` crate
**github.rs**
- `fetch_latest_release()`: Fetch release info from GitHub API
- `download_file()`: Download binary archives
- Uses `reqwest` with rustls-tls
**platform.rs**
- `get_asset_pattern()`: Determine platform-specific asset name
- `perform_update_unix()`: Unix update logic (rename + replace)
- `perform_update_windows()`: Windows update logic (batch script)
**archive.rs**
- `extract_targz()`: Extract `.tar.gz` archives (Unix)
- `extract_zip()`: Extract `.zip` archives (Windows)
**installation.rs**
- `detect_installation_method()`: Analyze executable path
- `InstallationMethod` enum: Npm, Pip, Cargo, Manual
- Path pattern matching for each installation method
**startup_check.rs**
- `check_update_on_startup()`: Automatic startup check
- Cache management (24-hour TTL)
- Display update notifications with installation-specific commands
- Silent failure on network errors
#### 7. Utilities Module (`src/utils/`)
**file.rs**
- `read_jsonl()`: Line-by-line JSONL parsing
- `save_json_pretty()`: Pretty-printed JSON output
**paths.rs**
- `resolve_paths()`: Resolves Claude, Codex, and Gemini session directories
- Handles `~` expansion via `home` crate
**git.rs**
- `get_git_remote()`: Extracts Git remote URL from repository
- Uses `git config --get remote.origin.url`
**time.rs**
- `format_timestamp()`: ISO 8601 date formatting
- Date aggregation utilities
#### 8. TUI Components (`src/usage/display.rs`, `src/analysis/display.rs`)
Terminal UI components built with Ratatui:
- Widgets: Tables, borders, styled text
- Layout: Constraints-based responsive design
- Event handling: Keyboard input (q, Esc, Ctrl+C)
- Refresh loop: 1-second intervals
## Data Flow
### Usage Command Flow
```
User runs: vct usage [--table|--text|--json]
│
▼
cli.rs parses args
│
▼
main.rs::Commands::Usage
│
▼
usage/calculator.rs::get_usage_from_directories()
│
├─> utils/paths.rs::resolve_paths()
│ └─> Returns ~/.claude/projects, ~/.codex/sessions, ~/.gemini/tmp
│
├─> walkdir scans *.jsonl files
│
├─> For each file:
│ ├─> utils/file.rs::read_jsonl()
│ ├─> analysis/detector.rs::detect_format()
│ └─> Aggregate tokens by (date, model)
│
▼
pricing.rs::fetch_model_pricing()
│
├─> Check cache: ~/.vibe_coding_tracker/model_pricing_YYYY-MM-DD.json
├─> If expired: fetch from GitHub
└─> Fuzzy match model names
│
▼
Calculate costs for each (date, model)
│
▼
usage/display.rs::display_usage_*()
│
├─> Interactive: Ratatui TUI loop
├─> Table: comfy-table render
├─> Text: Date > model: cost
└─> JSON: Full precision output
```
### Analysis Command Flow (Single File)
```
User runs: vct analysis --path <file.jsonl> [--output <out.json>]
│
▼
cli.rs parses args
│
▼
main.rs::Commands::Analysis
│
▼
analysis/analyzer.rs::analyze_jsonl_file()
│
├─> utils/file.rs::read_jsonl()
│
├─> analysis/detector.rs::detect_format()
│ └─> Identify Gemini (sessionId/projectHash/messages) or Claude (`parentUuid`)
│
├─> Route to analyzer:
│ ├─> Claude: claude_analyzer.rs
│ │ ├─> Parse ClaudeMessage structs
│ │ ├─> Extract tool_use blocks
│ │ ├─> Count file operations (Edit/Read/Write lines)
│ │ ├─> Extract Git info
│ │ └─> Aggregate tool call counts
│ │
│ ├─> Codex: codex_analyzer.rs
│ ├─> Parse CodexMessage structs
│ ├─> Extract completion_response.usage
│ └─> Handle reasoning tokens
│
│ └─> Gemini: gemini_analyzer.rs
│ └─> Parse session JSON (messages, tokens)
│
▼
Build CodeAnalysis struct
│
▼
Output JSON (if --output specified)
```
### Analysis Command Flow (Batch)
```
User runs: vct analysis [--output <out.json>]
│
▼
main.rs::Commands::Analysis
│
▼
analysis/batch_analyzer.rs::analyze_all_sessions()
│
├─> utils/paths.rs::resolve_paths()
│
├─> For each *.jsonl file:
│ └─> analysis/analyzer.rs::analyze_jsonl_file()
│
├─> Aggregate by (date, model):
│ ├─> Sum edit/read/write lines
│ └─> Sum tool call counts
│
▼
analysis/display.rs::display_analysis_*()
│
├─> Interactive: Ratatui TUI (default)
│ ├─> Show totals row
│ ├─> Memory usage stats
│ └─> 1s refresh loop
│
└─> JSON: Save aggregated array
```
## Key Components
### 1. Format Detection System
**Location:** `analysis/detector.rs`
**Logic:**
```rust
fn detect_format(records: &[Value]) -> FileFormat {
if records.is_empty() {
return FileFormat::Codex;
}
if records.len() == 1 {
if let Some(obj) = records[0].as_object() {
if obj.contains_key("sessionId")
&& obj.contains_key("projectHash")
&& obj.contains_key("messages")
{
return FileFormat::Gemini;
}
}
}
for record in records {
if let Some(obj) = record.as_object() {
if obj.contains_key("parentUuid") {
return FileFormat::Claude;
}
}
}
FileFormat::Codex
}
```
**Rationale:** Gemini exports wrap a session in a single JSON object with `sessionId`/`projectHash`, Claude Code records contain `parentUuid`, and Codex defaults to the OpenAI event log structure.
### 2. Pricing Cache System
**Design Goals:**
- Minimize network requests
- Daily pricing updates
- Offline capability (stale cache acceptable)
**Implementation:**
```rust
// Cache file naming: model_pricing_YYYY-MM-DD.json
let cache_path = format!("{}/model_pricing_{}.json", cache_dir, today);
if cache_path.exists() {
// Use cached data
} else {
// Fetch from GitHub, save to cache
}
```
### 3. Interactive TUI Architecture
**Framework:** Ratatui (formerly tui-rs)
**Event Loop:**
```rust
loop {
terminal.draw(|f| {
// Render UI
})?;
if event::poll(Duration::from_secs(1))? {
if let Event::Key(key) = event::read()? {
match key.code {
KeyCode::Char('q') | KeyCode::Esc => break,
_ => {}
}
}
}
// Refresh data every iteration
}
```
**Benefits:**
- Real-time updates
- Keyboard navigation
- Responsive layout
- Memory-efficient (no history buffering)
### 4. Fuzzy Model Matching
**Library:** `strsim` (Jaro-Winkler algorithm)
**Threshold:** 0.7 similarity score
**Example Matches:**
```
User model: "claude-sonnet-4-20250514"
Pricing DB: "claude-sonnet-4"
Match: Normalized (strip version suffix)
User model: "gpt-4-turbo-2024-04-09"
Pricing DB: "gpt-4-turbo"
Match: Normalized
User model: "anthropic.claude-3-sonnet"
Pricing DB: "claude-3-sonnet"
Match: Fuzzy (0.85 similarity)
```
### 5. Automatic Update System
**Location:** `src/update/`
**Components:**
**startup_check.rs:**
- Runs on every application startup (before command execution)
- 24-hour cache to minimize GitHub API requests
- Cache location: `~/.vibe_coding_tracker/update_check.json`
- Silent failure (network errors don't disrupt main functionality)
- Displays colorful box notification when update available
**installation.rs:**
- Detects installation method by analyzing executable path
- Detection patterns:
- npm: `/npm/`, `/.npm`, `/node_modules/`
- pip: `/site-packages/`, `/python`, `/Scripts/`, `/.local/bin/` (with Python context)
- cargo: `/.cargo/bin/`
- manual: All other paths (curl, PowerShell, source build)
- Provides installation-specific update commands
**Cache Structure:**
```json
{
"last_check": "2025-10-09T12:34:56.789Z",
"latest_version": "v0.1.7",
"has_update": true
}
```
**User Experience Flow:**
```
Application Start
│
▼
Load cache (~/.vibe_coding_tracker/update_check.json)
│
├─> Cache valid (<24h)
│ ├─> has_update = true
│ │ └─> Display notification
│ └─> has_update = false
│ └─> Silent (no output)
│
└─> Cache invalid or missing
└─> Check GitHub API
├─> Update available
│ ├─> Detect installation method
│ ├─> Display notification with update command
│ └─> Save cache
└─> No update
└─> Save cache (has_update = false)
```
**Benefits:**
- **Prevents version conflicts**: Users update via same method they installed
- **Reduces support burden**: Correct update commands shown automatically
- **Non-disruptive**: Silent failure on errors, doesn't block main commands
- **Performance**: 24-hour cache minimizes network requests
- **User-friendly**: Clear, actionable update instructions
## Design Patterns
### 1. Pipeline Architecture
Each analysis flow follows a clear pipeline:
```
Input → Detection → Parsing → Aggregation → Display → Output
```
### 2. Strategy Pattern (Display Modes)
Multiple display strategies for same data:
- `display_usage_interactive()`
- `display_usage_table()`
- `display_usage_text()`
- `display_usage_json()`
### 3. Repository Pattern (Data Access)
Centralized file access through `utils/file.rs`:
- `read_jsonl()`: Streaming JSONL reader
- `save_json_pretty()`: Formatted JSON writer
### 4. Factory Pattern (Format Detection)
`detector.rs` acts as factory, routing to appropriate analyzer:
```rust
match detect_format(line)? {
FileFormat::Claude => claude_analyzer::analyze(),
FileFormat::Codex => codex_analyzer::analyze(),
FileFormat::Gemini => gemini_analyzer::analyze(),
}
```
### 5. Error Handling Strategy
**Libraries:**
- `anyhow`: Application-level errors (main.rs, high-level logic)
- `thiserror`: Library-level errors (custom error types)
**Approach:**
- Propagate errors with `?` operator
- Context-aware error messages
- Graceful degradation (e.g., $0.00 for unknown models)
## External Dependencies
### Critical Dependencies
**CLI & Serialization:**
- `clap` (4.x): Derive-based CLI parsing
- `serde` + `serde_json`: Serialization framework
**TUI:**
- `ratatui`: Terminal UI framework
- `crossterm`: Cross-platform terminal control
- `comfy-table`: Static table rendering
**HTTP & Data:**
- `reqwest` (rustls-tls): Async HTTP client
- `chrono`: Date/time manipulation
**File System:**
- `walkdir`: Recursive directory traversal
- `home`: Platform-agnostic home directory
**Algorithms:**
- `strsim`: Fuzzy string matching
- `regex`: Pattern matching
**System:**
- `sysinfo`: Memory and system stats
- `hostname`: Machine hostname retrieval
### Dependency Rationale
**Why Ratatui over alternatives?**
- Active maintenance
- Rich widget library
- Flexible layout system
- Efficient rendering
**Why rustls-tls for reqwest?**
- Smaller binary size than native-tls
- Pure Rust implementation
- Better cross-compilation
**Why comfy-table?**
- UTF-8 box drawing
- Flexible styling
- Column auto-sizing
## Performance Considerations
### 1. JSONL Streaming
**Approach:** Line-by-line parsing instead of loading entire file
**Benefits:**
- Constant memory usage
- Early error detection
- Handles large files (>100MB)
**Implementation:**
```rust
for line in BufReader::new(file).lines() {
let line = line?;
// Parse single line
}
```
### 2. Caching Strategy
**Pricing Cache:**
- Daily TTL reduces network requests
- Local file cache (no database overhead)
**No Session Cache:**
- Always read fresh data (session files change frequently)
- Acceptable trade-off for CLI tool
### 3. Aggregation Efficiency
**Data Structures:**
```rust
type DateUsageResult = HashMap<String, HashMap<String, serde_json::Value>>;
// Outer key: Date (YYYY-MM-DD)
// Inner key: Model name
// Value: Token usage data (varies by extension type)
```
**Complexity:** O(1) insertion, O(n) iteration for display
### 4. TUI Refresh Rate
**Interval:** 1 second
**Rationale:**
- Balance between responsiveness and CPU usage
- Session files updated infrequently (minutes/hours)
- Minimal overhead for re-aggregation
### 5. Binary Size Optimization
**Release Profile:**
```toml
[profile.release]
lto = "thin" # Link-time optimization
codegen-units = 1 # Better optimization
strip = true # Remove debug symbols
```
**Result:** ~3-5 MB binary
## File Format Specifications
### Claude Code JSONL Format
```json
{
"parentUuid": "conv-123",
"type": "apiResponse",
"message": {
"model": "claude-sonnet-4-20250514",
"usage": {
"input_tokens": 1000,
"output_tokens": 500,
"cache_read_input_tokens": 2000,
"cache_creation_input_tokens": 500
},
"content": [
{
"type": "text",
"text": "..."
},
{
"type": "tool_use",
"name": "Edit",
"input": {
"old_string": "...",
"new_string": "..."
}
}
]
}
}
```
### Codex JSONL Format
```json
{
"completion_response": {
"usage": {
"prompt_tokens": 1000,
"completion_tokens": 500,
"total_tokens": 1500
}
},
"reasoning_output_tokens": 100,
"total_token_usage": 1600
}
```
## Directory Structure
```
~/.claude/
└── projects/
├── project-a.jsonl
└── project-b.jsonl
~/.codex/
└── sessions/
├── session-1.jsonl
└── session-2.jsonl
~/.gemini/
└── tmp/
└── <project-hash>/
└── chats/
├── session-1.json
└── session-2.json
~/.vibe_coding_tracker/
├── model_pricing_2025-10-05.json # Daily pricing cache
└── update_check.json # Update notification cache (24h TTL)
```
## Extension Points
### Adding New AI Platforms
1. **Add model to `src/models/`**:
```rust
#[derive(Deserialize)]
pub struct NewPlatformMessage { ... }
```
2. **Update detector**:
```rust
pub enum FileFormat {
Claude,
Codex,
Gemini,
NewPlatform, }
```
3. **Implement analyzer**:
```rust
pub fn analyze_newplatform(lines: Vec<String>) -> Result<CodeAnalysis>
```
4. **Update router**:
```rust
match format {
FileFormat::NewPlatform => newplatform_analyzer::analyze(lines),
...
}
```
### Adding New Display Formats
Implement trait-based abstraction:
```rust
trait DisplayFormatter {
fn format(&self, data: &DateUsageResult) -> String;
}
struct TableFormatter;
struct JsonFormatter;
// Add custom formatters
```
### Adding New Metrics
1. Update `CodeAnalysis` struct in `models/analysis.rs`
2. Update analyzers to extract new metrics
3. Update display modules to render new columns
## Security Considerations
### File Access
- Only reads from known directories (`~/.claude`, `~/.codex`, `~/.gemini`)
- No arbitrary file write (output only to user-specified paths)
- No command execution in parsed data
### Network Requests
- HTTPS only (rustls)
- Single endpoint: GitHub raw content
- No authentication required
- No user data transmitted
### Data Privacy
- All processing local
- No telemetry or analytics
- Session data never leaves machine
## Testing Strategy
### Unit Tests
- Model deserialization
- Pricing calculation
- Fuzzy matching logic
- Date formatting
### Integration Tests
Located in `tests/`:
- `test_integration_usage.rs`: End-to-end usage command
- `test_integration_analysis.rs`: End-to-end analysis command
**Test Fixtures:**
- `examples/test_conversation.jsonl`: Claude format
- `examples/test_conversation_oai.jsonl`: Codex format
### Manual Testing
```bash
# Test usage command
cargo run -- usage --table
# Test analysis command
cargo run -- analysis --path examples/test_conversation.jsonl
# Test batch analysis
cargo run -- analysis
# Test error handling
cargo run -- usage --invalid-flag
```
## Future Architecture Considerations
### Potential Improvements
1. **Plugin System**: Load analyzers dynamically for new platforms
2. **Database Backend**: SQLite for historical tracking
3. **Web Dashboard**: Ratatui → WebAssembly UI
4. **Distributed Analysis**: Analyze across multiple machines
5. **Real-time Monitoring**: Watch session files for changes
6. **Export Formats**: CSV, Excel, PDF reports
### Scalability
**Current Limits:**
- File count: ~1000 sessions (tested)
- File size: Up to 1GB per file (streaming parser)
- Memory: \<50MB typical usage
**Bottlenecks:**
- Aggregation is O(n) where n = total lines across all files
- No indexing or incremental updates
**Solutions for Scale:**
- Incremental aggregation (track last-processed line)
- SQLite index on (date, model)
- Parallel file processing (Rayon)
---
**Document Version:** 1.0
**Last Updated:** 2025-10-05
**Maintainer:** Vibe Coding Tracker Team