sql-cli 1.73.1 - Docs.rs

# Hybrid Cache Proxy Architecture

## Core Principle
Keep sql-cli simple and stateless, add intelligence via optional proxy service.

## Architecture Options

### Option 1: Transparent Cache Proxy (Recommended)
```
┌─────────────┐     ┌──────────────────┐     ┌─────────────┐
│ nvim Plugin │────▶│  Cache Proxy     │────▶│   sql-cli   │
└─────────────┘     │  (.NET/Rust)     │     │  (simple)   │
                    └──────────────────┘     └─────────────┘
                             │
                    ┌────────▼─────────┐
                    │   Cache Store    │
                    │  (Parquet files) │
                    └──────────────────┘
```

**How it works:**
1. nvim sends query to proxy (if running) or directly to sql-cli
2. Proxy checks cache for Web CTE results
3. If cached: rewrites query to use local files
4. If not: passes through, caches response
5. sql-cli stays completely stateless

### Option 2: Smart Rewriter Proxy
```
Original Query:
WITH WEB trades AS (
    URL 'https://prod/trades'
    BODY '{"source": "Barclays", "date": "2025-09-01"}'
)
SELECT * FROM trades WHERE amount > 1000000

↓ Proxy Rewrites To ↓

WITH trades AS (
    SELECT * FROM '/cache/2025-09-01/barclays_trades.parquet'
)
SELECT * FROM trades WHERE amount > 1000000
```

## Implementation: .NET Cache Proxy

```csharp
// Simple .NET Core service
public class SqlCliCacheProxy
{
    private readonly ICache cache;
    private readonly HttpClient httpClient;

    public async Task<QueryResult> ExecuteQuery(string query)
    {
        // 1. Parse for Web CTEs
        var webCtes = ParseWebCTEs(query);

        // 2. Check cache for each
        var rewrites = new Dictionary<string, string>();
        foreach (var cte in webCtes)
        {
            var cacheKey = GenerateCacheKey(cte);
            var cachedFile = await cache.GetFilePathAsync(cacheKey);

            if (cachedFile != null)
            {
                // Rewrite to local file
                rewrites[cte.Name] = $"SELECT * FROM '{cachedFile}'";
            }
            else
            {
                // Fetch and cache
                var data = await FetchWebData(cte);
                cachedFile = await cache.StoreAsync(cacheKey, data);
                rewrites[cte.Name] = $"SELECT * FROM '{cachedFile}'";
            }
        }

        // 3. Rewrite query
        var rewrittenQuery = ApplyRewrites(query, rewrites);

        // 4. Execute with sql-cli
        return await ExecuteSqlCli(rewrittenQuery);
    }
}
```

## Simpler Option: Shell Script Proxy

```bash
#!/bin/bash
# sql-cli-cached

CACHE_DIR=~/.cache/sql-cli/tables
QUERY="$1"

# Hash the query
CACHE_KEY=$(echo "$QUERY" | sha256sum | cut -d' ' -f1)
CACHE_FILE="$CACHE_DIR/$CACHE_KEY.parquet"

# Check cache
if [[ -f "$CACHE_FILE" ]]; then
    # Rewrite query to use cached file
    REWRITTEN=$(echo "$QUERY" | sed "s|URL.*)|SELECT * FROM '$CACHE_FILE'|")
    sql-cli -q "$REWRITTEN"
else
    # Execute and cache
    sql-cli -q "$QUERY" -o parquet > "$CACHE_FILE"
    cat "$CACHE_FILE" | sql-cli -f parquet
fi
```

## Rust-based Lightweight Proxy

```rust
// Minimal proxy that just handles caching
// Could be part of sql-cli with --proxy flag

use axum::{Router, Json};
use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
struct QueryRequest {
    query: String,
    cache_hint: Option<String>,
}

async fn execute_query(Json(req): Json<QueryRequest>) -> Json<QueryResult> {
    // 1. Check if query has Web CTEs
    if let Some(web_ctes) = extract_web_ctes(&req.query) {
        // 2. Check cache
        for cte in web_ctes {
            if let Some(cached_path) = cache.get(&cte.cache_key()) {
                // Rewrite to use file
                rewrite_cte_to_file(&mut query, &cte.name, &cached_path);
            } else {
                // Mark for caching after execution
                pending_cache.insert(cte.cache_key(), cte);
            }
        }
    }

    // 3. Execute (with sql-cli or embedded engine)
    let result = sql_cli_execute(&query)?;

    // 4. Cache any pending Web CTE results
    for (key, cte) in pending_cache {
        cache.store(key, &result.get_cte_data(&cte.name));
    }

    Json(result)
}

#[tokio::main]
async fn main() {
    let app = Router::new()
        .route("/query", post(execute_query))
        .route("/cache/status", get(cache_status))
        .route("/cache/clear", post(clear_cache));

    axum::Server::bind(&"127.0.0.1:7890")
        .serve(app.into_make_service())
        .await
        .unwrap();
}
```

## Progressive Implementation Path

### Phase 1: File-based cache in sql-cli (Simple)
```bash
# Add cache flag to sql-cli
sql-cli query.sql --cache-dir ~/.cache/sql-cli

# sql-cli checks cache before fetching
# Minimal changes to existing code
```

### Phase 2: Optional Proxy Mode
```bash
# Start proxy (could be part of sql-cli binary)
sql-cli --proxy-mode --port 7890 &

# nvim plugin detects proxy and uses it
# Falls back to direct sql-cli if not running
```

### Phase 3: Intelligent Proxy
- Smarter cache invalidation
- Background refresh
- Query rewriting optimization
- Multi-user cache sharing

## Why This Architecture?

1. **sql-cli stays simple**: No daemon, no state, no complexity
2. **Optional complexity**: Proxy only when needed
3. **Technology flexibility**: Proxy can be .NET, Rust, Python, whatever
4. **Graceful degradation**: Works without proxy
5. **Clear separation**: sql-cli = execution, proxy = caching

## Quick Win Implementation

Start with the simplest thing that works:

```rust
// In sql-cli: Add cache check for Web CTEs only
impl WebDataFetcher {
    pub fn fetch_with_cache(&self, spec: &WebCTESpec) -> Result<DataTable> {
        // Check for cached parquet file
        if let Some(cache_path) = self.check_cache(spec) {
            return load_parquet(cache_path);
        }

        // Fetch from network
        let data = self.fetch(spec)?;

        // Save to cache
        if let Some(cache_path) = self.get_cache_path(spec) {
            save_as_parquet(&data, cache_path)?;
        }

        Ok(data)
    }
}
```

This gives you:
- Immediate caching benefit
- No architectural changes
- Can evolve to proxy later

## Your Workflow Optimized

```sql
-- First run: fetches and caches (30s)
WITH WEB barclays_trades AS (
    URL 'https://prod/trades'
    BODY '{"source": "Barclays", "date": "2025-09-01"}'
    CACHE '24h'  -- Yesterday's trades never change
)
SELECT * FROM barclays_trades WHERE amount > 1000000;

-- Subsequent runs: instant (cached)
-- Cache key includes date, so different days cached separately
```

## Decision Framework

Choose based on:

1. **Simple file cache in sql-cli** if:
   - Single user
   - Don't mind cache per invocation
   - Want minimal changes

2. **Lightweight Rust proxy** if:
   - Want persistent cache
   - Multiple nvim sessions
   - Prefer staying in Rust ecosystem

3. **.NET proxy service** if:
   - Need complex business logic
   - Integration with other .NET services
   - Team already uses .NET

4. **Full server mode** if:
   - Multiple users
   - Need query result sharing
   - Want maximum performance

My recommendation: Start with simple file cache in sql-cli, evolve to lightweight proxy when needed.