# SQL Expression Parser & Evaluator
A Rust library for parsing and evaluating SQL-like boolean expressions with full support for comparisons, arithmetic, pattern matching, and logical operators. Also see *CLAUDE.md* and *SqlExprParser-EBNF-Final.ebnf* for related documentation.
**NOTE:** This project has been superseded by [sqlexpr-congo-rust](https://github.com/richcar58/sqlexpr-congo-rust), which was built using CongoCC-generated Rust code and delivers significantly improved performance.
Two Java SqlExpr parser/evaluator implementations accept essentially the same language as this Rust parser. See [sqlexpr-javacc](https://github.com/richcar58/sqlexpr-javacc) for a parser built using the [JavaCC](https://javacc.github.io/javacc/) parser generator; see [sqlexpr-congocc](https://github.com/richcar58/sqlexpr-congocc) for a parser built using the [CongoCC](https://parsers.org/) parser generator.
## Related Project Catalog
This project is part of a suite of production-ready and experimental projects that provide SQL-based boolean expression parsers and evaluators, each using different technologies in its construction and output. The projects are summarized below.
| <big>**Project Name**</big> | <big>**Description**</big> |
|:---|:---|
| >>> **Parsers and Evaluators** <<< | |
| [sqlexpr-javacc](https://github.com/richcar58/sqlexpr-javacc) | Java parser and evaluator. JavaCC generated parser that incorporates Apache ActiveMQ JMS evaluator. |
| [sqlexpr-congocc](https://github.com/richcar58/sqlexpr-congocc) | Java parser and evaluator. CongoCC generated parser that incorporates Apache ActiveMQ JMS evaluator. |
| [sqlexpr-rust](https://github.com/richcar58/sqlexpr-rust) | Rust parser and evaluator generated from EBNF specification by Claude (see NOTE below). |
| [sqlexpr-congo-rust](https://github.com/richcar58/sqlexpr-congo-rust) | Rust parser, evaluator and tests generated using the Enhanced CongoCC generator (*congo-parser-generator*) and Claude. |
| >>> **Testers** <<< | |
| [sqlexpr-gen](https://github.com/richcar58/sqlexpr-gen) | Rust test data generator for sqlexpr-* parsers and evaluators. |
| [sqlexpr-load-congocc](https://github.com/richcar58/sqlexpr-load-congocc) | Java load tester for sqlexpr-congocc. |
| [sqlexpr-load-rust](https://github.com/richcar58/sqlexpr-load-rust) | Rust load tester for sqlexpr-rust. |
| [sqlexpr-load-congo-rust](https://github.com/richcar58/sqlexpr-load-congo-rust) | Rust load tester for sqlexpr-congo-rust. |
| [sqlexpr-javacc-test](https://github.com/richcar58/sqlexpr-javacc-test) | Java integration tests for sqlexpr-javacc parser. |
| [sqlexpr-congocc-test](https://github.com/richcar58/sqlexpr-congocc-test) | Java integration tests for sqlexpr-congocc parser. |
| >>> **CongoCC Rust Extension** <<< | |
| [congo-parser-generator](https://github.com/richcar58/congo-parser-generator) | Experimental version of CongoCC enhanced by Claude to output Rust parsers. |
**NOTE:** The first Rust parser/evaluator, **sqlexpr-rust**, was generated by Claude directly from an EBNF grammar. Load testing with *sqlexpr-load-rust* showed that this implementation is actually **slower** than the Java parser generated by CongoCC (**sqlexpr-congocc**). The second Rust parser/evaluator, **sqlexpr-congo-rust**, was generated using a version of CongoCC enhanced to output Rust parsers directly. Load testing with *sqlexpr-load-congo-rust* showed an improvement of approximately 2.5% over the CongoCC-generated Java parser (in non-scientific testing). Further performance gains can probably be wrung from the improved Rust parser, but it's clear that a well-designed, well-written Java parser is quite competitive in certain testing scenarios.
## Features
### Parser
- **Grammar-enforced type safety**: All top-level expressions must be boolean-valued
- **Comprehensive operators**:
  - Logical: `AND`, `OR`, `NOT`
  - Comparison: `>`, `>=`, `<`, `<=`, `=`, `<>`, `!=`
  - Pattern matching: `LIKE`, `NOT LIKE` (with `%`, `_` wildcards and `ESCAPE`)
  - Range: `BETWEEN`, `NOT BETWEEN`
  - Membership: `IN`, `NOT IN`
  - Null testing: `IS NULL`, `IS NOT NULL`
  - Arithmetic: `+`, `-`, `*`, `/`, `%` (modulo)
  - Unary: `+`, `-`
- **Rich literals**:
  - Integers: decimal (`42`), hexadecimal (`0xFF`), octal (`0755`)
  - Floats: standard (`3.14`), scientific notation (`1.5e-10`)
  - Strings: single-quoted with escape sequences (`'hello\'world'`)
  - Booleans: `TRUE`, `FALSE`
  - Null: `NULL`
- **Comments**: Line comments (`--`) and block comments (`/* */`)
- **Case-insensitive keywords**: `AND`, `and`, `And` all work
- **Detailed error messages**: Parse errors include position and context
The grammar cleanly separates boolean expressions from value expressions, so most type errors are caught during parsing rather than during evaluation.
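The three integer literal forms listed under "Rich literals" can be told apart by their prefix. The following is a minimal sketch using only the standard library; `parse_int_literal` is a hypothetical helper, not the crate's lexer, and treating a leading `0` as an octal marker is an assumption based on the `0755` example.

```rust
/// Hypothetical helper (not part of the sqlexpr-rust API): converts the text
/// of an integer literal to an i64, choosing the radix from its prefix.
fn parse_int_literal(text: &str) -> Option<i64> {
    if let Some(hex) = text.strip_prefix("0x").or_else(|| text.strip_prefix("0X")) {
        // Hexadecimal, e.g. 0xFF
        i64::from_str_radix(hex, 16).ok()
    } else if text.len() > 1 && text.starts_with('0') {
        // Octal: a leading zero followed by more digits, e.g. 0755
        i64::from_str_radix(&text[1..], 8).ok()
    } else {
        // Decimal, e.g. 42 (a lone "0" also lands here)
        text.parse::<i64>().ok()
    }
}
```

Under these assumptions `parse_int_literal("0xFF")` yields `Some(255)`, while a malformed octal literal such as `089` yields `None`.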
### Evaluator
- **Variable substitution**: Bind runtime values to variables
- **Type system**: Integer, Float, String, Boolean, Null
- **Automatic type coercion**: Mixed int/float arithmetic automatically promotes to float
- **Division semantics**: Always returns float (e.g., `7/2 = 3.5`)
- **Null handling**: `NULL` is rejected in arithmetic and comparisons; test for it with `IS NULL` or `IS NOT NULL`
- **Short-circuit evaluation**: `AND` and `OR` operators evaluate efficiently
- **Pattern matching**: Full LIKE implementation with wildcards and escape sequences
- **Comprehensive error reporting**: Type errors, null violations, division by zero, etc.
## Quick Start
Add to your `Cargo.toml`:
```toml
[dependencies]
sqlexpr-rust = "0.1.0"
```
### Parsing Expressions
```rust
use sqlexpr_rust::parse;

fn main() {
    // Parse a simple comparison
    let ast = parse("age >= 18 AND status = 'active'").unwrap();
    println!("Parsed: {}", ast);

    // Parse complex expressions
    let expr = parse("(price * quantity) > 1000 AND customer_type IN ('gold', 'platinum')").unwrap();

    // Parse with LIKE pattern matching
    let pattern = parse("email LIKE '%@example.com' AND NOT deleted").unwrap();
}
```
### Evaluating Expressions
```rust
use std::collections::HashMap;
use sqlexpr_rust::{evaluate, RuntimeValue};

fn main() {
    // Create variable bindings
    let mut bindings = HashMap::new();
    bindings.insert("age".to_string(), RuntimeValue::Integer(25));
    bindings.insert("status".to_string(), RuntimeValue::String("active".to_string()));
    bindings.insert("premium".to_string(), RuntimeValue::Boolean(true));

    // Evaluate the expression
    let result = evaluate("age >= 18 AND status = 'active' AND premium", &bindings).unwrap();
    assert_eq!(result, true);

    // Arithmetic evaluation
    bindings.insert("price".to_string(), RuntimeValue::Float(99.99));
    bindings.insert("quantity".to_string(), RuntimeValue::Integer(10));
    let result = evaluate("(price * quantity) > 500", &bindings).unwrap();
    assert_eq!(result, true);

    // Pattern matching
    bindings.insert("email".to_string(), RuntimeValue::String("user@example.com".to_string()));
    let result = evaluate("email LIKE '%@example.com'", &bindings).unwrap();
    assert_eq!(result, true);
}
```
### Error Handling
```rust
use sqlexpr_rust::{evaluate, RuntimeValue, EvalError};
use std::collections::HashMap;

fn main() {
    let mut bindings = HashMap::new();
    bindings.insert("x".to_string(), RuntimeValue::String("hello".to_string()));

    // Type error: string in arithmetic
    let result = evaluate("(x + 10) > 0", &bindings);
    match result {
        Err(EvalError::TypeError { operation, expected, actual, context }) => {
            println!("Type error in {}: expected {}, got {} ({})",
                     operation, expected, actual, context);
        }
        _ => {}
    }

    // Type error: incompatible IN list
    bindings.insert("y".to_string(), RuntimeValue::Integer(42));
    let result = evaluate("y IN ('a', 'b', 'c')", &bindings);
    assert!(matches!(result, Err(EvalError::TypeError { .. })));
}
```
## Project Layout
```
sqlexpr-rust/
├── src/
│   ├── lib.rs                            # Public API and re-exports
│   ├── lexer.rs                          # Tokenization
│   ├── parser.rs                         # Recursive descent parser
│   ├── ast.rs                            # Abstract Syntax Tree definitions
│   └── evaluator.rs                      # Expression evaluation engine
├── tests/
│   ├── parser_tests.rs                   # Parser test suite (155 tests)
│   ├── parser_type_checking_tests.rs     # Parser type test suite (97 tests)
│   └── evaluator_tests.rs                # Evaluator test suite (111 tests)
├── examples/
│   ├── showcase.rs                       # Feature demonstration
│   └── ...                               # Additional examples
├── docs/
│   ├── EVALUATION_DESIGN.md              # Design alternatives
│   ├── EVALUATOR_IMPLEMENTATION_PLAN.md  # Implementation roadmap
│   └── command_prompts.md                # Development notes
├── Cargo.toml
├── CLAUDE.md
├── LICENSE
├── README.md
└── SqlExprParser-EBNF-Final.ebnf         # Formal grammar specification
```
## Core Components
### Lexer (`src/lexer.rs`)
Tokenizes input strings into a stream of tokens. Handles:
- Keywords (case-insensitive)
- Identifiers and variables
- Numeric literals (int, float, hex, octal, scientific)
- String literals with escapes
- Operators and punctuation
- Comments (line and block)
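To make the lexer's responsibilities concrete, here is a minimal tokenizer sketch for a small subset of the language (case-insensitive keywords, identifiers, decimal integers, unescaped strings, operators, and `--` comments). The `Token` enum and `tokenize` function are hypothetical and written only for illustration; the real definitions in `src/lexer.rs` may look quite different.

```rust
// Illustrative token type; not the crate's actual Token definition.
#[derive(Debug, PartialEq)]
enum Token {
    Keyword(String), // stored upper-cased so AND, and, And compare equal
    Ident(String),
    Int(i64),
    Str(String),
    Op(String),
}

fn tokenize(input: &str) -> Result<Vec<Token>, String> {
    const KEYWORDS: [&str; 5] = ["AND", "OR", "NOT", "TRUE", "FALSE"];
    let chars: Vec<char> = input.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c == '-' && chars.get(i + 1) == Some(&'-') {
            // Line comment: skip to the end of the line.
            while i < chars.len() && chars[i] != '\n' {
                i += 1;
            }
        } else if c.is_ascii_alphabetic() || c == '_' {
            let start = i;
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let word: String = chars[start..i].iter().collect();
            let upper = word.to_ascii_uppercase();
            if KEYWORDS.contains(&upper.as_str()) {
                tokens.push(Token::Keyword(upper));
            } else {
                tokens.push(Token::Ident(word));
            }
        } else if c.is_ascii_digit() {
            let start = i;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            let digits: String = chars[start..i].iter().collect();
            tokens.push(Token::Int(digits.parse::<i64>().map_err(|e| e.to_string())?));
        } else if c == '\'' {
            // String literal; escape sequences are omitted in this sketch.
            let start = i + 1;
            i = start;
            while i < chars.len() && chars[i] != '\'' {
                i += 1;
            }
            if i == chars.len() {
                return Err(format!("unterminated string starting at position {}", start - 1));
            }
            tokens.push(Token::Str(chars[start..i].iter().collect()));
            i += 1; // skip the closing quote
        } else {
            // Try two-character operators before single-character ones.
            let two: String = chars[i..chars.len().min(i + 2)].iter().collect();
            if [">=", "<=", "<>", "!="].contains(&two.as_str()) {
                tokens.push(Token::Op(two));
                i += 2;
            } else if "<>=+-*/%()".contains(c) {
                tokens.push(Token::Op(c.to_string()));
                i += 1;
            } else {
                return Err(format!("unexpected character '{}' at position {}", c, i));
            }
        }
    }
    Ok(tokens)
}
```

Normalizing keywords to upper case at this stage means later stages never need to repeat the case-insensitive comparison.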
### Parser (`src/parser.rs`)
Recursive descent parser implementing the EBNF grammar. Features:
- Operator precedence handling
- Type safety at grammar level
- Lookahead for disambiguation
- Detailed error messages with position info
### AST (`src/ast.rs`)
Hierarchical AST structure:
- `BooleanExpr`: AND, OR, NOT, literals, variables, relational expressions
- `RelationalExpr`: Comparisons, LIKE, BETWEEN, IN, IS NULL
- `ValueExpr`: Arithmetic operations, literals, variables
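The three-layer hierarchy can be pictured with a reduced set of enums. The type names below come from the list above, but the variants and their shapes are assumptions made for this sketch; consult `src/ast.rs` for the real definitions.

```rust
// Illustrative types only; the variants are assumptions, not the crate's AST.
#[derive(Debug, Clone, PartialEq)]
enum ValueExpr {
    Integer(i64),
    Variable(String),
    Multiply(Box<ValueExpr>, Box<ValueExpr>),
}

#[derive(Debug, Clone, PartialEq)]
enum RelationalExpr {
    GreaterThan(ValueExpr, ValueExpr),
    IsNull(ValueExpr),
}

#[derive(Debug, Clone, PartialEq)]
enum BooleanExpr {
    And(Box<BooleanExpr>, Box<BooleanExpr>),
    Not(Box<BooleanExpr>),
    Variable(String),
    Relational(RelationalExpr),
}

/// Builds the tree for `(price * quantity) > 1000 AND NOT deleted` by hand.
fn sample_ast() -> BooleanExpr {
    let product = ValueExpr::Multiply(
        Box::new(ValueExpr::Variable("price".to_string())),
        Box::new(ValueExpr::Variable("quantity".to_string())),
    );
    BooleanExpr::And(
        Box::new(BooleanExpr::Relational(RelationalExpr::GreaterThan(
            product,
            ValueExpr::Integer(1000),
        ))),
        Box::new(BooleanExpr::Not(Box::new(BooleanExpr::Variable(
            "deleted".to_string(),
        )))),
    )
}

/// Collects variable names in left-to-right order across all three layers.
fn variables(expr: &BooleanExpr, out: &mut Vec<String>) {
    match expr {
        BooleanExpr::And(l, r) => {
            variables(l, out);
            variables(r, out);
        }
        BooleanExpr::Not(inner) => variables(inner, out),
        BooleanExpr::Variable(name) => out.push(name.clone()),
        BooleanExpr::Relational(RelationalExpr::GreaterThan(l, r)) => {
            value_variables(l, out);
            value_variables(r, out);
        }
        BooleanExpr::Relational(RelationalExpr::IsNull(v)) => value_variables(v, out),
    }
}

fn value_variables(expr: &ValueExpr, out: &mut Vec<String>) {
    match expr {
        ValueExpr::Integer(_) => {}
        ValueExpr::Variable(name) => out.push(name.clone()),
        ValueExpr::Multiply(l, r) => {
            value_variables(l, out);
            value_variables(r, out);
        }
    }
}
```

Because a `ValueExpr` can appear only inside a `RelationalExpr`, a bare arithmetic expression such as `price * quantity` cannot be constructed as a top-level `BooleanExpr` in this sketch, which mirrors the grammar-level type safety described earlier.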
### Evaluator (`src/evaluator.rs`)
Evaluation engine with:
- Variable binding resolution
- Type checking and coercion
- Short-circuit boolean logic
- Pattern matching for LIKE
- Comprehensive error handling
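Short-circuit boolean logic combined with variable resolution can be sketched as follows. `BoolExpr` and `eval` are hypothetical names for this illustration, and the bindings are simplified to plain `bool` values; how the real evaluator reports unbound variables is not specified here.

```rust
use std::collections::HashMap;

// Simplified boolean-only expression tree for illustration.
enum BoolExpr {
    Literal(bool),
    Variable(String),
    Not(Box<BoolExpr>),
    And(Box<BoolExpr>, Box<BoolExpr>),
    Or(Box<BoolExpr>, Box<BoolExpr>),
}

fn eval(expr: &BoolExpr, bindings: &HashMap<String, bool>) -> Result<bool, String> {
    match expr {
        BoolExpr::Literal(b) => Ok(*b),
        BoolExpr::Variable(name) => bindings
            .get(name)
            .copied()
            .ok_or_else(|| format!("unbound variable: {}", name)),
        BoolExpr::Not(inner) => Ok(!eval(inner, bindings)?),
        // `&&` skips the right operand (and any error it would raise)
        // when the left operand is already false.
        BoolExpr::And(l, r) => Ok(eval(l, bindings)? && eval(r, bindings)?),
        // `||` skips the right operand when the left operand is already true.
        BoolExpr::Or(l, r) => Ok(eval(l, bindings)? || eval(r, bindings)?),
    }
}
```

In this sketch, `FALSE AND missing` evaluates to `Ok(false)` even when `missing` is unbound, because the right operand is never visited, whereas `TRUE AND missing` returns an error.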
## Grammar Overview
The grammar enforces type safety at parse time:
```ebnf
BooleanExpression = BooleanOrExpression ;
BooleanOrExpression = BooleanAndExpression { "OR" BooleanAndExpression } ;
BooleanAndExpression = BooleanTerm { "AND" BooleanTerm } ;
BooleanTerm = "NOT" BooleanTerm
            | "(" BooleanExpression ")"
            | BooleanLiteral
            | Variable
            | RelationalExpression ;

RelationalExpression = ValueExpression ComparisonOp ValueExpression
                     | ValueExpression "LIKE" Pattern
                     | ValueExpression "BETWEEN" ValueExpression "AND" ValueExpression
                     | ValueExpression "IN" "(" ValueList ")"
                     | ValueExpression "IS" ["NOT"] "NULL" ;

ValueExpression = AdditiveExpression ;
AdditiveExpression = MultiplicativeExpression { ("+" | "-") MultiplicativeExpression } ;
MultiplicativeExpression = UnaryExpression { ("*" | "/" | "%") UnaryExpression } ;
UnaryExpression = ["+" | "-"] PrimaryExpression ;
PrimaryExpression = Literal | Variable | "(" ValueExpression ")" ;
```
See `SqlExprParser-EBNF-Final.ebnf` for the complete formal grammar.
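Each grammar rule in the value-expression layer maps naturally onto one function in a recursive descent parser, and the nesting of the calls is what produces operator precedence. The sketch below is an assumption-laden illustration: it evaluates integers directly instead of building an AST, expects whitespace-separated tokens, and leaves out `/` and variables. `Parser` and `eval_arithmetic` are hypothetical names.

```rust
/// Minimal recursive-descent evaluator mirroring the
/// Additive -> Multiplicative -> Unary -> Primary layering.
struct Parser<'a> {
    tokens: Vec<&'a str>,
    pos: usize,
}

impl<'a> Parser<'a> {
    fn new(input: &'a str) -> Self {
        Parser { tokens: input.split_whitespace().collect(), pos: 0 }
    }

    fn peek(&self) -> Option<&'a str> {
        self.tokens.get(self.pos).copied()
    }

    // AdditiveExpression = MultiplicativeExpression { ("+" | "-") MultiplicativeExpression } ;
    fn additive(&mut self) -> Result<i64, String> {
        let mut left = self.multiplicative()?;
        while let Some(op) = self.peek() {
            if op != "+" && op != "-" {
                break;
            }
            self.pos += 1;
            let right = self.multiplicative()?;
            left = if op == "+" { left + right } else { left - right };
        }
        Ok(left)
    }

    // MultiplicativeExpression = UnaryExpression { ("*" | "%") UnaryExpression } ;
    fn multiplicative(&mut self) -> Result<i64, String> {
        let mut left = self.unary()?;
        while let Some(op) = self.peek() {
            if op != "*" && op != "%" {
                break;
            }
            self.pos += 1;
            let right = self.unary()?;
            if op == "%" && right == 0 {
                return Err("modulo by zero".to_string());
            }
            left = if op == "*" { left * right } else { left % right };
        }
        Ok(left)
    }

    // UnaryExpression = ["+" | "-"] PrimaryExpression ;
    fn unary(&mut self) -> Result<i64, String> {
        match self.peek() {
            Some("-") => {
                self.pos += 1;
                Ok(-self.primary()?)
            }
            Some("+") => {
                self.pos += 1;
                self.primary()
            }
            _ => self.primary(),
        }
    }

    // PrimaryExpression = Literal | "(" ValueExpression ")" ;
    fn primary(&mut self) -> Result<i64, String> {
        match self.peek() {
            Some("(") => {
                self.pos += 1;
                let value = self.additive()?;
                if self.peek() != Some(")") {
                    return Err(format!("expected ')' at token {}", self.pos));
                }
                self.pos += 1;
                Ok(value)
            }
            Some(tok) => {
                self.pos += 1;
                tok.parse::<i64>().map_err(|_| format!("unexpected token '{}'", tok))
            }
            None => Err("unexpected end of input".to_string()),
        }
    }
}

fn eval_arithmetic(input: &str) -> Result<i64, String> {
    let mut parser = Parser::new(input);
    let value = parser.additive()?;
    match parser.peek() {
        Some(tok) => Err(format!("unexpected trailing token '{}'", tok)),
        None => Ok(value),
    }
}
```

Because `additive` calls `multiplicative` for each operand, `2 + 3 * 4` groups as `2 + (3 * 4)` without any explicit precedence table.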
## Type System
### RuntimeValue Types
- `Integer(i64)`: 64-bit signed integers
- `Float(f64)`: 64-bit floating point
- `String(String)`: UTF-8 strings
- `Boolean(bool)`: true/false
- `Null`: SQL NULL value
### Type Coercion Rules
1. **Arithmetic**: Int + Int → Int, Float + Float → Float
2. **Mixed arithmetic**: Int + Float → Float (automatic promotion)
3. **Division**: Always returns Float (e.g., `7 / 2 = 3.5`)
4. **Comparisons**: Same types compared directly; Int/Float mixing allowed
5. **NULL handling**: NULL in arithmetic/comparisons raises error; use `IS NULL`
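The arithmetic rules above can be sketched with a small stand-in value type. `Value`, `add`, and `divide` are hypothetical names for this illustration (the crate's own type is `RuntimeValue`), and the `String` error type is a simplification.

```rust
// Stand-in for the numeric and null cases of RuntimeValue.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
    Float(f64),
    Null,
}

/// Rules 1, 2, and 5: same-type addition keeps the type, mixed addition
/// promotes to Float, and NULL is rejected.
fn add(l: &Value, r: &Value) -> Result<Value, String> {
    match (l, r) {
        (Value::Null, _) | (_, Value::Null) => Err("NULL in arithmetic".to_string()),
        (Value::Integer(a), Value::Integer(b)) => Ok(Value::Integer(a + b)),
        (Value::Float(a), Value::Float(b)) => Ok(Value::Float(a + b)),
        (Value::Integer(a), Value::Float(b)) | (Value::Float(b), Value::Integer(a)) => {
            Ok(Value::Float(*a as f64 + b))
        }
    }
}

fn to_f64(v: &Value) -> Option<f64> {
    match v {
        Value::Integer(i) => Some(*i as f64),
        Value::Float(f) => Some(*f),
        Value::Null => None,
    }
}

/// Rule 3: division always yields a Float, so 7 / 2 is 3.5 rather than 3.
fn divide(l: &Value, r: &Value) -> Result<Value, String> {
    match (to_f64(l), to_f64(r)) {
        (Some(_), Some(b)) if b == 0.0 => Err("division by zero".to_string()),
        (Some(a), Some(b)) => Ok(Value::Float(a / b)),
        _ => Err("NULL in arithmetic".to_string()),
    }
}
```

Returning a Float from every division avoids the truncation surprise of integer division, at the cost of an `Integer` result never coming back from `/`.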
## Examples
### Boolean Logic
```sql
TRUE AND FALSE -- false
age >= 18 AND status = 'active' -- depends on bindings
(x > 10 OR y > 10) AND NOT deleted -- compound condition
```
### Arithmetic
```sql
(price * quantity) > 1000 -- arithmetic in comparison
(revenue - cost) / revenue >= 0.2 -- percentage calculation
amount % 100 = 0 -- check divisibility
```
### Pattern Matching
```sql
email LIKE '%@example.com' -- domain match
name LIKE 'J%n' -- starts with J, ends with n
code LIKE 'A___B' -- A + 3 chars + B
text LIKE '50\%' ESCAPE '\' -- literal % character
```
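The wildcard semantics above (`%` for any run of characters, `_` for exactly one, and an optional escape character) can be sketched with a small backtracking matcher. Note that this is a swapped-in technique chosen to keep the example dependency-free: according to the Performance Considerations section, the library itself implements `LIKE` with the `regex` crate. The names `Pat`, `compile`, `match_here`, and `like` are hypothetical.

```rust
// One element of a compiled LIKE pattern.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Pat {
    Any,       // %
    One,       // _
    Lit(char), // any other character, or an escaped wildcard
}

fn compile(pattern: &str, escape: Option<char>) -> Vec<Pat> {
    let mut out = Vec::new();
    let mut chars = pattern.chars();
    while let Some(c) = chars.next() {
        if Some(c) == escape {
            // The character after the escape is taken literally;
            // a trailing escape character matches itself (sketch assumption).
            out.push(Pat::Lit(chars.next().unwrap_or(c)));
        } else if c == '%' {
            out.push(Pat::Any);
        } else if c == '_' {
            out.push(Pat::One);
        } else {
            out.push(Pat::Lit(c));
        }
    }
    out
}

fn match_here(text: &[char], pat: &[Pat]) -> bool {
    match pat.split_first() {
        None => text.is_empty(),
        // % tries every possible split point, including the empty match.
        Some((Pat::Any, rest)) => (0..=text.len()).any(|skip| match_here(&text[skip..], rest)),
        Some((Pat::One, rest)) => !text.is_empty() && match_here(&text[1..], rest),
        Some((Pat::Lit(c), rest)) => text.first() == Some(c) && match_here(&text[1..], rest),
    }
}

/// Case-sensitive LIKE over the whole input string.
fn like(text: &str, pattern: &str, escape: Option<char>) -> bool {
    let chars: Vec<char> = text.chars().collect();
    match_here(&chars, &compile(pattern, escape))
}
```

Backtracking keeps the sketch short but can be slow for patterns with many `%` wildcards, which is one reason a regex-based implementation is a reasonable choice in practice.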
### Range and Membership
```sql
age BETWEEN 18 AND 65 -- inclusive range
status IN ('active', 'pending') -- membership test
score NOT BETWEEN 0 AND 59 -- exclusion
role NOT IN ('admin', 'moderator') -- negative membership
```
### Null Handling
```sql
middle_name IS NULL -- null check
email IS NOT NULL -- non-null check
-- x + NULL would raise NullInOperation error
-- x > NULL would raise NullInOperation error
```
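The distinction between comparing against NULL and testing for NULL can be sketched as follows. `Option<i64>` stands in for a nullable value, the `NullInOperation` variant name is taken from the comments above, and its `operation` field is an assumption made for this illustration.

```rust
#[derive(Debug, PartialEq)]
enum EvalError {
    // Variant name from the examples above; the field is hypothetical.
    NullInOperation { operation: &'static str },
}

/// `None` plays the role of SQL NULL in this sketch.
fn greater_than(l: Option<i64>, r: Option<i64>) -> Result<bool, EvalError> {
    match (l, r) {
        (Some(a), Some(b)) => Ok(a > b),
        // Any NULL operand is an error rather than a silent false or unknown.
        _ => Err(EvalError::NullInOperation { operation: "GreaterThan" }),
    }
}

/// IS NULL is the only place NULL is accepted, and it never fails.
fn is_null(v: Option<i64>) -> bool {
    v.is_none()
}
```

Raising an error instead of applying SQL's three-valued logic keeps the result type a plain `bool`.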
## Running Examples
```bash
# Run the feature showcase
cargo run --example showcase
# Enable pretty-printing of AST
SQLEXPR_PRETTY=true cargo run --example showcase
# Run all tests
cargo test
# Run specific test suite
cargo test --test parser_tests
cargo test --test evaluator_tests
# Build documentation
cargo doc --open
```
## Testing
The project includes comprehensive test coverage:
- **Parser tests** (`tests/parser_tests.rs`): 155 tests covering all grammar features
- **Evaluator tests** (`tests/evaluator_tests.rs`): 111 tests covering all operations
- **Unit tests** (`src/lib.rs`, modules): 13 embedded tests
- **Doc tests**: 1 documentation example test
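Total: **280 tests** (155 + 111 + 13 + 1). This figure does not include the 97 tests in `tests/parser_type_checking_tests.rs` listed under Project Layout.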
Run tests with:
```bash
cargo test # All tests
cargo test --verbose # Verbose Cargo output (add `-- --nocapture` to see test stdout)
cargo test <pattern> # Only tests whose names contain <pattern>
```
### Viewing Abstract Syntax Trees (ASTs)
Set the *SQLEXPR_PRETTY* environment variable to have the parser pretty-print the ASTs of parsed expressions. For example, the following commands dump the ASTs generated by the *parser_tests* and *evaluator_tests* programs. Run them from the top-level project directory. For easy reference, the output files from these test programs are shipped with the source code.
```bash
SQLEXPR_PRETTY=true cargo test --test parser_tests -- --nocapture --test-threads=1 > examples/output/parser_tests.out
SQLEXPR_PRETTY=true cargo test --test evaluator_tests -- --nocapture --test-threads=1 > examples/output/evaluator_tests.out
```
## Error Messages
The library provides detailed error messages:
### Parse Errors
```
Parse error: Unexpected token ')' near position 15 in:
(x > 5 AND y < )
```
### Evaluation Errors
```
Type error in addition: expected numeric types, got string and integer
(context: arithmetic operation)
NULL value in GreaterThan operation (context: cannot compare NULL).
NULL is only allowed in IS NULL/IS NOT NULL
Division by zero in expression: x / 0 > 5
```
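A parse error that carries its position and the offending input can be modeled as a small type implementing `Display`. The `ParseError` struct below, its fields, and the exact line layout of the message are assumptions for illustration; the crate's own error type may differ.

```rust
use std::fmt;

// Hypothetical error type; field names are assumptions for this sketch.
#[derive(Debug)]
struct ParseError {
    message: String,
    position: usize, // zero-based character offset into `input`
    input: String,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "Parse error: {} near position {} in:\n{}",
            self.message, self.position, self.input
        )
    }
}

// Implementing std::error::Error lets the type work with `?` and Box<dyn Error>.
impl std::error::Error for ParseError {}
```

Storing the position as data rather than only inside the message string lets callers highlight the offending token themselves.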
## Performance Considerations
- **Parser**: Single-pass recursive descent, O(n) complexity
- **Lexer**: Single-pass tokenization, O(n) complexity
- **Evaluator**: Direct evaluation without intermediate representation
- **Short-circuit**: AND/OR operators short-circuit for efficiency
- **Pattern matching**: Regex-based LIKE uses Rust's `regex` crate
## Limitations
1. **No subqueries**: Only standalone boolean expressions
2. **No aggregate functions**: No `SUM`, `COUNT`, etc.
3. **No date/time types**: Only basic types (int, float, string, bool, null)
4. **Case-sensitive strings**: String comparisons are case-sensitive
5. **No COLLATE**: String ordering uses Rust's string comparison
## License
See `LICENSE` file for details.
## Contributing
Contributions are welcome! Please ensure:
1. All tests pass: `cargo test`
2. Code follows Rust conventions: `cargo fmt`
3. No warnings: `cargo clippy`
4. Add tests for new features
## Documentation
- **Grammar**: See `SqlExprParser-EBNF-Final.ebnf`
- **API docs**: Run `cargo doc --open`
- **Design docs**: See `docs/` directory
- **Examples**: See `examples/` directory
## Acknowledgments
Anthropic's Claude Sonnet 4.5 was used to generate most of the code and documentation in this project.