sql-cli 1.69.1

SQL query tool for CSV/JSON with both interactive TUI and non-interactive CLI modes - perfect for exploration and automation
Documentation
# SQL Math Functions - Current Implementation Status

## Overview
Updated status of mathematical functions implementation based on the SQL_MATH_EXTENSIONS.md roadmap.

## ✅ Phase 1: Basic Arithmetic - COMPLETED
**Status: FULLY IMPLEMENTED**

### Implemented Operations
- ✅ Addition: `quantity + 10`
- ✅ Subtraction: `price - discount` 
- ✅ Multiplication: `quantity * price`
- ✅ Division: `total / quantity`
- ✅ Modulo: `id % 10` (via MOD function)
- ✅ Parentheses: `(price - cost) * quantity`

**Evidence:** All basic arithmetic works in complex expressions like:
```sql
SELECT 
    price * quantity * (1 - discount/100) as net_amount,
    ROUND((selling_price - cost_basis) / cost_basis * 100, 2) as profit_margin_pct
FROM trade_data
```

## ✅ Phase 2: Mathematical Functions - COMPLETED
**Status: FULLY IMPLEMENTED**

### Implemented Functions
- `ROUND(value, decimals)` - Round to N decimal places
-`FLOOR(value)` - Round down
-`CEIL(value)` / `CEILING(value)` - Round up (both aliases work)
-`ABS(value)` - Absolute value
-`POWER(base, exponent)` / `POW(base, exponent)` - Power (both aliases work)
-`SQRT(value)` - Square root
-`MOD(a, b)` - Modulo operation
-`QUOTIENT(a, b)` - Integer division

### Advanced Functions (Bonus - Not in Original Plan)
- `PI()` - Pi constant
-`EXP(x)` - e^x (exponential)
-`LN(x)` - Natural logarithm
-`LOG(x)` / `LOG10(x)` - Base 10 logarithm
-`LOG(base, x)` - Custom base logarithm

### Missing from Phase 2 Plan
- `GREATEST(a, b, ...)` - Maximum value
-`LEAST(a, b, ...)` - Minimum value

**Evidence:** All functions work in complex nested expressions:
```sql
SELECT 
    ROUND(SQRT(POWER(leg1, 2) + POWER(leg2, 2)), 3) as hypotenuse,
    CEIL(LOG10(ABS(value) + 1)) as log_scale
FROM data
```

## ❌ Phase 3: Aggregate Functions - NOT IMPLEMENTED
**Status: NOT STARTED**

### Missing Core Aggregates
- `SUM(expression)` - Sum of values
-`AVG(expression)` - Average
-`COUNT(*)` / `COUNT(column)` - Count rows
-`MIN(expression)` - Minimum value
-`MAX(expression)` - Maximum value
-`STDDEV(expression)` - Standard deviation
-`VARIANCE(expression)` - Variance

### Missing GROUP BY Support
- ❌ GROUP BY clause parsing
- ❌ GROUP BY execution
- ❌ HAVING clause

## 🎯 New Features Since Original Plan

### CASE WHEN Expressions - ✅ COMPLETED!
- **Parser Complete** - Full CASE WHEN syntax support
-**AST Structure** - Proper CaseExpression with WhenBranch
-**Complex Nesting** - CASE within CASE supported
-**Evaluator Complete** - ArithmeticEvaluator and RecursiveWhereEvaluator support
-**Comparison Operators** - All operators (>, <, >=, <=, =, !=, <>) with type coercion
-**Comprehensive Testing** - 10 test scenarios covering all working functionality
-**Production Ready** - Documented limitations and workarounds

Example that works in production:
```sql
SELECT id, a,
    CASE 
        WHEN a > 10 THEN 'Big'
        WHEN a > 5 THEN 
            CASE 
                WHEN MOD(a, 2) = 0 THEN 'Medium Even' 
                ELSE 'Medium Odd' 
            END 
        ELSE 'Small' 
    END as category
FROM test_simple_math
```

### Date Functions (Bonus) - ✅ COMPLETED!
- `DATEDIFF('unit', date1, date2)` - Date difference
-`DATEADD('unit', amount, date)` - Date arithmetic  
-`NOW()` - Current timestamp
-`TODAY()` - Current date

### String Functions (Bonus) - ✅ COMPLETED!
- `TEXTJOIN(delimiter, ignore_empty, val1, val2, ...)` - Join values
- ✅ Method calls: `name.Trim()`, `name.Length()`, `email.Contains('@')`
- ✅ String methods: `StartsWith()`, `EndsWith()`, `IndexOf()`
-`MID(string, start, length)` - Substring extraction
-`UPPER(string)` - Convert to uppercase
-`LOWER(string)` - Convert to lowercase
-`TRIM(string)` - Remove leading/trailing spaces

### Physics Constants - ✅ COMPLETED! (v1.35.0)
Over 27 physics and mathematical constants as zero-argument functions:
- **Fundamental**: `C()`, `G()`, `PLANCK()`, `HBAR()`, `K()`, `NA()`
-**Electromagnetic**: `E0()`, `MU0()`, `QE()`, `ALPHA()`
-**Particle Masses**: `ME()`, `MP()`, `MN()`, `AMU()`
-**Mathematical**: `PI()`, `EULER()`, `TAU()`, `PHI()`, `SQRT2()`
-**Other**: `R()`, `SIGMA()`, `RY()`

### Unit Conversions - ✅ COMPLETED! (v1.35.0)
- `CONVERT(value, from_unit, to_unit)` - Convert between 150+ units
-**8 Categories**: Length, Mass, Temperature, Volume, Time, Area, Speed, Pressure
-**Case-insensitive** with multiple aliases
-**SI normalization** for accuracy

### SQL Comments - ✅ COMPLETED! (v1.36.0)
- **Single-line comments** with `--`
-**Block comments** with `/* */`
-**Works everywhere** - inline, multi-line, nested
-**Perfect for documentation** in query files

### DUAL Table Support - ✅ COMPLETED! (v1.35.0)
- **Oracle-compatible** single-row table
-**No FROM clause needed** - implicit DUAL
-**Scientific calculator** functionality

## 📊 Overall Progress

| Phase | Status | Functions | Completion |
|-------|--------|-----------|------------|
| **Phase 1**: Basic Arithmetic | ✅ Complete | 6/6 | 100% |
| **Phase 2**: Math Functions | ✅ Complete | 10/12 | 83% |
| **Phase 3**: Aggregates | ❌ Not Started | 0/7 | 0% |
| **Bonus**: CASE WHEN | ✅ Complete | 1/1 | 100% |
| **Bonus**: Date Functions | ✅ Complete | 4/4 | 100% |
| **Bonus**: String Functions | ✅ Complete | 6+/6+ | 100% |

## 🚀 Development Roadmap (Priority Order)

### ⭐ PHASE 1: Basic Aggregations WITHOUT GROUP BY (Next Sprint)
**HIGH PRIORITY - Foundation for data analysis**

#### Core Aggregates (Day 1-2)
- `COUNT(*)` - Count all rows
- `COUNT(column)` - Count non-null values  
- `COUNT(DISTINCT column)` - Count unique values
- `SUM(column)` - Sum of values
- `AVG(column)` - Average value
- `MIN(column)` - Minimum value
- `MAX(column)` - Maximum value

#### Statistical Aggregates (Day 3)
- `STDDEV(column)` / `STDEV(column)` - Standard deviation
- `VARIANCE(column)` / `VAR(column)` - Variance  
- `MEDIAN(column)` - Median value
- `MODE(column)` - Most frequent value

**Example Usage:**
```sql
-- Works with WHERE clauses
SELECT 
    COUNT(*) as total_trades,
    AVG(price) as avg_price,
    STDDEV(price) as price_volatility,
    MIN(timestamp) as first_trade,
    MAX(timestamp) as last_trade
FROM trades
WHERE category = 'EQUITY' 
  AND price > 100
```

### ⭐ PHASE 2: Comparison & Utility Functions (Quick Wins)
**MEDIUM PRIORITY - High value, easy implementation**

#### Comparison Functions (Day 4)
- `GREATEST(a, b, c, ...)` - Return largest value
- `LEAST(a, b, c, ...)` - Return smallest value
- `COALESCE(a, b, c, ...)` - First non-null value
- `NULLIF(a, b)` - Return NULL if a = b

#### Number Theory Functions (Day 5)
- `ISPRIME(n)` - Check if number is prime
- `PRIME(n)` - Return nth prime number
- `NEXTPRIME(n)` - Next prime after n
- `GCD(a, b)` - Greatest common divisor
- `LCM(a, b)` - Least common multiple
- `FACTORIAL(n)` - n!

**Example Usage:**
```sql
SELECT 
    GREATEST(bid, ask, last) as high_price,
    COALESCE(preferred_price, market_price, 0) as price,
    PRIME(10) as tenth_prime,  -- Returns 29
    GCD(48, 18) as common_divisor  -- Returns 6
FROM data
```

### ⭐ PHASE 3: Financial Functions (Domain-Specific)
**MEDIUM PRIORITY - High value for financial users**

#### Time Value of Money (Week 2)
- `PV(rate, nper, pmt, [fv], [type])` - Present value
- `FV(rate, nper, pmt, [pv], [type])` - Future value  
- `PMT(rate, nper, pv, [fv], [type])` - Payment amount
- `RATE(nper, pmt, pv, [fv], [type])` - Interest rate
- `NPER(rate, pmt, pv, [fv], [type])` - Number of periods

#### Investment Analysis
- `NPV(rate, cashflow1, cashflow2, ...)` - Net present value
- `IRR(cashflow1, cashflow2, ...)` - Internal rate of return
- `XIRR(values, dates)` - IRR for irregular periods
- `COMPOUND(principal, rate, periods)` - Compound interest

**Example Usage:**
```sql
SELECT 
    -- Monthly payment for $200,000 loan at 5% for 30 years
    PMT(0.05/12, 30*12, -200000) as monthly_payment,
    
    -- Future value of $100/month for 10 years at 7% annual
    FV(0.07/12, 10*12, -100) as savings_value
FROM DUAL
```

### ⭐ PHASE 4: Single-Column GROUP BY
**HIGH COMPLEXITY - Major feature addition**

#### Basic Implementation (Week 3)
- Parse GROUP BY clause
- Build groups in memory
- Apply aggregates per group
- Support HAVING clause

**Example Usage:**
```sql
SELECT 
    category,
    COUNT(*) as count,
    AVG(price) as avg_price,
    SUM(quantity) as total_volume
FROM trades
WHERE date >= '2024-01-01'
GROUP BY category
HAVING COUNT(*) > 10
```

### ⭐ PHASE 5: Advanced Statistical Functions
**LOWER PRIORITY - Nice to have**

#### Distribution Functions
- `PERCENTILE(column, p)` - Percentile value
- `QUARTILE(column, q)` - Quartile (1, 2, or 3)
- `CORR(col1, col2)` - Correlation coefficient
- `COVAR(col1, col2)` - Covariance
- `SKEW(column)` - Skewness
- `KURTOSIS(column)` - Kurtosis

#### Window Functions (Future)
- `ROW_NUMBER()`, `RANK()`, `DENSE_RANK()`
- `LAG()`, `LEAD()`
- `FIRST_VALUE()`, `LAST_VALUE()`

## 🎯 Recommendation: Add GREATEST/LEAST Functions

**Rationale:**
1. **Build on existing foundation** - Math function infrastructure already exists
2. **Complete Phase 2** - Only 2 functions missing for 100% completion  
3. **High user value** - GREATEST/LEAST are commonly used in data analysis
4. **Low complexity** - Simple min/max logic across multiple values
5. **Quick implementation** - Should take 1-2 hours maximum

**Implementation Example:**
```rust
// Add to ArithmeticEvaluator function evaluation
"GREATEST" => {
    let mut max_value: Option<DataValue> = None;
    for arg_value in &arg_values {
        match &max_value {
            None => max_value = Some(arg_value.clone()),
            Some(current_max) => {
                if self.compare_values(arg_value, current_max, |a, b| a > b)?.to_bool()? {
                    max_value = Some(arg_value.clone());
                }
            }
        }
    }
    max_value.unwrap_or(DataValue::Null)
}
```

This would enable queries like:
```sql
SELECT 
    id,
    GREATEST(price1, price2, price3) as best_price,
    LEAST(cost1, cost2, cost3) as lowest_cost
FROM product_comparison
```

## 📈 Success Metrics Met

From the original SQL_MATH_EXTENSIONS.md goals:

- **Can perform basic calculations without external tools**
-**Handles NULL and type mismatches gracefully**  
-**Clear error messages for invalid expressions**
-**Compatible with existing SQL features**
- 🔄 **Performance within 20% of native calculations** (needs testing)

The SQL CLI has already exceeded the original Phase 1 and 2 goals and added significant bonus functionality!