sql-cli 1.73.1

SQL query tool for CSV/JSON with both interactive TUI and non-interactive CLI modes - perfect for exploration and automation
Documentation
# Prime Number Implementation Strategy

## Core Design: Hybrid Approach

### 1. Pre-computed Prime Table
- Store first 100,000 primes (up to 1,299,709) as a static array
- Compile-time generation using `build.rs` or `include_bytes!`
- Instant O(1) lookup for common cases
- ~400KB memory footprint (100K × 4 bytes)

### 2. On-Demand Generation
- For primes beyond the table, use optimized sieve
- Cache results in runtime HashMap
- Miller-Rabin test for very large numbers

## Implementation Architecture

```rust
// src/sql/functions/mathematics/primes.rs

use once_cell::sync::Lazy;
use std::collections::HashSet;

// Pre-computed at compile time
const PRIMES_10K: &[u32] = &include!("../../../data/primes_10k.rs");

// Lazy-loaded larger table
static PRIMES_100K: Lazy<Vec<u32>> = Lazy::new(|| {
    include_bytes!("../../../data/primes_100k.bin")
        .chunks(4)
        .map(|chunk| u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
        .collect()
});

// HashSet for O(1) primality testing
static PRIME_SET: Lazy<HashSet<u32>> = Lazy::new(|| {
    PRIMES_100K.iter().cloned().collect()
});

pub struct PrimeEngine {
    // Runtime cache for primes beyond pre-computed range
    extended_cache: std::sync::RwLock<Vec<u64>>,
}

impl PrimeEngine {
    /// Get the nth prime (1-indexed)
    pub fn nth_prime(&self, n: usize) -> Result<u64, String> {
        if n == 0 {
            return Err("Prime index must be >= 1".to_string());
        }
        
        // Fast path: pre-computed
        if n <= PRIMES_100K.len() {
            return Ok(PRIMES_100K[n - 1] as u64);
        }
        
        // Slow path: generate on demand
        self.generate_nth_prime(n)
    }
    
    /// Check if a number is prime
    pub fn is_prime(&self, n: u64) -> bool {
        if n < 2 { return false; }
        if n == 2 { return true; }
        if n % 2 == 0 { return false; }
        
        // Fast path: check pre-computed set
        if n <= 1_299_709 {
            return PRIME_SET.contains(&(n as u32));
        }
        
        // Medium numbers: trial division with prime list
        if n < 1_000_000_000 {
            let sqrt_n = (n as f64).sqrt() as u32;
            for &p in PRIMES_100K.iter() {
                if p > sqrt_n { return true; }
                if n % (p as u64) == 0 { return false; }
            }
            return true;
        }
        
        // Large numbers: Miller-Rabin test
        miller_rabin(n)
    }
    
    /// Count primes up to n (prime counting function π(n))
    pub fn prime_count(&self, n: u64) -> usize {
        if n < 2 { return 0; }
        
        // Fast path: binary search in pre-computed list
        if n <= 1_299_709 {
            match PRIMES_100K.binary_search(&(n as u32)) {
                Ok(idx) => idx + 1,
                Err(idx) => idx,
            }
        } else {
            // Use approximation or sieve for large n
            self.count_primes_sieve(n)
        }
    }
    
    /// Find next prime >= n
    pub fn next_prime(&self, n: u64) -> u64 {
        if n <= 2 { return 2; }
        
        let mut candidate = if n % 2 == 0 { n + 1 } else { n };
        while !self.is_prime(candidate) {
            candidate += 2;
        }
        candidate
    }
    
    /// Get prime factorization
    pub fn factor(&self, mut n: u64) -> Vec<(u64, u32)> {
        let mut factors = Vec::new();
        
        // Trial division with small primes
        for &p in PRIMES_10K.iter() {
            if p as u64 * p as u64 > n { break; }
            
            let mut count = 0;
            while n % (p as u64) == 0 {
                n /= p as u64;
                count += 1;
            }
            
            if count > 0 {
                factors.push((p as u64, count));
            }
        }
        
        // If n > 1, it's a prime factor
        if n > 1 {
            factors.push((n, 1));
        }
        
        factors
    }
}
```

## Build-Time Prime Generation

```rust
// build.rs
use std::env;
use std::fs::File;
use std::io::Write;
use std::path::Path;

fn sieve_of_eratosthenes(limit: usize) -> Vec<u32> {
    let mut is_prime = vec![true; limit + 1];
    is_prime[0] = false;
    is_prime[1] = false;
    
    for i in 2..=((limit as f64).sqrt() as usize) {
        if is_prime[i] {
            for j in ((i * i)..=limit).step_by(i) {
                is_prime[j] = false;
            }
        }
    }
    
    is_prime.iter()
        .enumerate()
        .filter(|(_, &prime)| prime)
        .map(|(num, _)| num as u32)
        .collect()
}

fn main() {
    let out_dir = env::var("OUT_DIR").unwrap();
    
    // Generate first 100,000 primes
    let mut primes = Vec::with_capacity(100_000);
    let mut candidate = 2u32;
    
    while primes.len() < 100_000 {
        if is_prime_simple(candidate) {
            primes.push(candidate);
        }
        candidate += if candidate == 2 { 1 } else { 2 };
    }
    
    // Write as Rust array (for small sets)
    let dest_path = Path::new(&out_dir).join("primes_10k.rs");
    let mut f = File::create(&dest_path).unwrap();
    write!(f, "[").unwrap();
    for (i, p) in primes[..10_000].iter().enumerate() {
        if i > 0 { write!(f, ",").unwrap(); }
        if i % 20 == 0 { write!(f, "\n    ").unwrap(); }
        write!(f, "{}", p).unwrap();
    }
    write!(f, "\n]").unwrap();
    
    // Write as binary file (for large sets)
    let dest_path = Path::new(&out_dir).join("primes_100k.bin");
    let mut f = File::create(&dest_path).unwrap();
    for p in primes.iter() {
        f.write_all(&p.to_le_bytes()).unwrap();
    }
    
    println!("cargo:rerun-if-changed=build.rs");
}
```

## SQL Functions

```sql
-- Basic prime functions
SELECT PRIME(100) as p100;           -- 100th prime = 541
SELECT IS_PRIME(17) as check;        -- true
SELECT PRIME_COUNT(100) as pi_100;   -- 25 primes <= 100
SELECT NEXT_PRIME(100) as next;      -- 101
SELECT PREV_PRIME(100) as prev;      -- 97

-- Prime factorization
SELECT FACTOR(60) as factors;        -- [(2,2), (3,1), (5,1)]
SELECT FACTOR_STRING(60) as str;     -- "2^2 × 3 × 5"

-- Prime gaps
SELECT PRIME_GAP(100) as gap;        -- PRIME(101) - PRIME(100)
SELECT MAX_GAP(1000) as max_gap;     -- Largest gap in first 1000 primes

-- Twin primes
SELECT IS_TWIN_PRIME(11) as twin;    -- true (11 and 13)
SELECT TWIN_PRIME(5) as fifth_twin;  -- 5th twin prime pair

-- Prime testing utilities
SELECT IS_MERSENNE(31) as mersenne;  -- true (2^31-1 is prime)
SELECT IS_SOPHIE_GERMAIN(11) as sg;  -- true (11 and 23 are both prime)
```

## Memory/Performance Trade-offs

### Option 1: Minimal (10K primes, 40KB)
- Covers up to 104,729
- Good for most common cases
- Very fast load time

### Option 2: Standard (100K primes, 400KB)
- Covers up to 1,299,709  
- Handles 99% of use cases
- Still instant load with Lazy

### Option 3: Extended (1M primes, 4MB)
- Covers up to 15,485,863
- For heavy prime computation
- Could load on first use

### Option 4: Compressed (100K primes, 200KB)
- Use bit compression techniques
- Slightly slower access
- Better memory usage

## Optimization Techniques

### 1. Wheel Factorization
```rust
// Skip multiples of 2, 3, 5
const WHEEL: [u8; 8] = [1, 7, 11, 13, 17, 19, 23, 29];
```

### 2. Segmented Sieve for Ranges
```rust
pub fn primes_in_range(start: u64, end: u64) -> Vec<u64> {
    // Use segmented sieve for memory efficiency
}
```

### 3. Prime Number Theorem Approximations
```rust
pub fn approx_nth_prime(n: usize) -> u64 {
    // n * ln(n) + n * ln(ln(n)) for large n
    let n_f = n as f64;
    let ln_n = n_f.ln();
    (n_f * (ln_n + ln_n.ln())) as u64
}
```

## Integration with SQL CLI

```rust
// src/sql/functions/mod.rs
impl FunctionRegistry {
    pub fn register_prime_functions(&mut self) {
        let engine = Arc::new(PrimeEngine::new());
        
        // Register all prime functions
        self.register_fn("PRIME", move |args| {
            let n = args[0].as_integer()?;
            Ok(DataValue::Integer(engine.nth_prime(n as usize)? as i64))
        });
        
        self.register_fn("IS_PRIME", move |args| {
            let n = args[0].as_integer()?;
            Ok(DataValue::Boolean(engine.is_prime(n as u64)))
        });
        
        // ... more functions
    }
}
```

## Usage Examples

### Find Goldbach Pairs
```sql
-- Verify Goldbach's conjecture for even numbers
WITH evens AS (
    SELECT n * 2 as even_num FROM generate_series(2, 100) as n
)
SELECT 
    even_num,
    p1.val as prime1,
    even_num - p1.val as prime2
FROM evens
CROSS JOIN LATERAL (
    SELECT PRIME(i) as val 
    FROM generate_series(1, PRIME_COUNT(even_num/2)) as i
) p1
WHERE IS_PRIME(even_num - p1.val)
  AND p1.val <= even_num / 2
LIMIT 1;
```

### Analyze Prime Gaps
```sql
SELECT 
    n,
    PRIME(n) as p,
    PRIME(n+1) - PRIME(n) as gap,
    ROUND(PRIME(n+1) / PRIME(n), 6) as ratio
FROM generate_series(1, 1000) as n
WHERE PRIME(n+1) - PRIME(n) > 10
ORDER BY gap DESC;
```

### Prime Factorization
```sql
SELECT 
    number,
    FACTOR_STRING(number) as factorization,
    IS_PRIME(number) as is_prime
FROM (VALUES (60), (97), (1001), (65537)) as t(number);
```

## Performance Characteristics

| Operation | Range | Time Complexity | Actual Time |
|-----------|-------|----------------|-------------|
| PRIME(n) | n ≤ 100K | O(1) | < 1μs |
| PRIME(n) | n > 100K | O(n log n) | ~10ms per 1K |
| IS_PRIME(n) | n ≤ 1.3M | O(1) | < 1μs |
| IS_PRIME(n) | n < 10^9 | O(√n/ln n) | < 1ms |
| IS_PRIME(n) | n > 10^9 | O(k log³n) | < 10ms |
| PRIME_COUNT(n) | n ≤ 1.3M | O(log n) | < 10μs |
| FACTOR(n) | n < 10^12 | O(√n/ln n) | < 100ms |

## Summary

This approach gives us:
1. **Instant access** to first 100K primes
2. **Fast primality testing** up to 10^9
3. **Efficient factorization** for most numbers
4. **Extensible design** for additional prime functions
5. **Minimal memory footprint** (400KB for massive capability)