cai-ingest 0.1.0

Data ingestion parsers for Coding Agent Insights
Documentation
# cai-ingest

Data ingestion from AI coding platforms.

## Overview

`cai-ingest` provides parsers and scanners for extracting AI coding interactions from various sources including Claude Code conversations, Codex CLI history, and Git commit history.

## Key Features

- **Claude Code parser** - Parse conversation JSON files
- **Codex CLI parser** - Parse Codex command history
- **Git scanner** - Extract commits as entries
- **Unified API** - Single `Ingestor` for all sources
- **Extensible** - Easy to add new parsers

## Usage

### Using Ingestor

```rust
use cai_ingest::{Ingestor, IngestConfig};
use cai_storage::MemoryStorage;

#[tokio::main]
async fn main() -> cai_core::Result<()> {
    let storage = MemoryStorage::new();

    // Default configuration (Claude + Codex)
    let ingestor = Ingestor::with_defaults();
    let count = ingestor.ingest_all(&storage).await?;

    println!("Ingested {} entries", count);
    Ok(())
}
```

### Individual Parsers

```rust
use cai_ingest::{ClaudeParser, CodexParser, GitScanner};

// Parse Claude conversations
let claude_parser = ClaudeParser::with_default_path()?;
let entries = claude_parser.parse_all()?;

// Parse Codex history
let codex_parser = CodexParser::with_default_path()?;
let entries = codex_parser.parse_all()?;

// Scan Git repository
let git_scanner = GitScanner::new("~/my-project");
let entries = git_scanner.scan()?;
```

## Data Format

Each parser produces `cai_core::Entry` instances:

```rust
use cai_core::{Entry, Source, Metadata};
use chrono::Utc;

Entry {
    id: "unique-id".to_string(),
    source: Source::Claude,
    timestamp: Utc::now(),
    prompt: "Original prompt".to_string(),
    response: "AI response".to_string(),
    metadata: Metadata {
        file_path: Some("src/main.rs".to_string()),
        repo_url: Some("https://github.com/user/repo".to_string()),
        commit_hash: Some("abc123".to_string()),
        language: Some("Rust".to_string()),
        extra: std::collections::HashMap::new(),
    },
}
```

## Usage

Add to your `Cargo.toml`:

```toml
[dependencies]
cai-ingest = { path = "../cai-ingest" }
```

## Design Decisions

- **Streaming**: Support large files without loading entirely into memory
- **Error recovery**: Continue parsing on malformed entries
- **Extensible**: Easy to add new source parsers

## Testing

```bash
cargo test -p cai-ingest
```

## License

MIT OR Apache-2.0