lialoonk-sql-query-parser 0.1.0

A comprehensive SQL query parser and analyzer built with Rust, providing detailed metadata extraction and JSON serialization of parsed SQL statements.
Documentation
# SQL Query Parser


A comprehensive SQL query parser and analyzer built with Rust, providing detailed metadata extraction and JSON serialization of parsed SQL statements.

## Overview


This project implements a complete SQL parser that can analyze SELECT, INSERT, UPDATE, and DELETE statements. The parser extracts valuable metadata including table names, column references, function calls, aliases, and JOIN operations, making it useful for SQL analysis, optimization, and code generation tools.

## Technical Description


### Parsing Process


The SQL parser uses the Pest parser generator to define a comprehensive grammar for SQL syntax. The parsing process follows these steps:

1. **Lexical Analysis**: Input SQL text is tokenized according to the grammar rules defined in `grammar/grammar.pest`
2. **Syntax Analysis**: Tokens are parsed into an Abstract Syntax Tree (AST) using recursive descent parsing
3. **Semantic Analysis**: The AST is traversed to extract metadata about the query structure
4. **Serialization**: Extracted metadata can be serialized to JSON for further processing

### What is Being Parsed


The parser supports the following SQL constructs:
- **SELECT statements** with projections, FROM clauses, JOINs, WHERE conditions
- **INSERT statements** with table names and value lists
- **UPDATE statements** with SET clauses and WHERE conditions
- **DELETE statements** with table references and WHERE conditions
- **Complex expressions** including arithmetic, comparison, and logical operators
- **Function calls** and aggregate functions (SUM, COUNT, AVG, MIN, MAX)
- **Table aliases** and column references
- **JOIN operations** (INNER, LEFT, RIGHT, FULL)

## Features


- Full SQL syntax parsing (SELECT, INSERT, UPDATE, DELETE)
- JOIN operations support (INNER, LEFT, RIGHT, FULL)
- Metadata extraction (tables, columns, functions, aliases)
- JSON serialization of analysis results
- File and stdin input support
- Comprehensive error handling
- Extensive unit test coverage

## Usage


### Command Line Interface


```bash
# Parse SQL from command line

lialoonk-sql-query-parser parse --query "SELECT * FROM users"

# Analyze query metadata

lialoonk-sql-query-parser parse --query "SELECT SUM(price) FROM orders" --format analyze

# Parse from file

lialoonk-sql-query-parser parse --file query.sql --format json

# Display help

lialoonk-sql-query-parser help

# Display project credits

lialoonk-sql-query-parser credits
```

### Library Usage


```rust
use lialoonk_sql_query_parser::{parse_sql, analyze_sql, analyze_sql_json};

let sql = "SELECT id, name FROM users WHERE id = 1";

// Parse to AST
let ast = parse_sql(sql)?;

// Extract metadata
let metadata = analyze_sql(sql)?;

// Serialize to JSON
let json = analyze_sql_json(sql)?;
```

## Grammar Rules


The parser uses a comprehensive grammar defined in `grammar/grammar.pest` with over 60 rules covering all major SQL constructs. Each rule is thoroughly documented and tested.

### Grammar Examples


Here are some key grammar rules with examples:

#### SELECT Statement

```pest
select_stmt = { SELECT_KEY ~ projection ~ (FROM_KEY ~ from_item ~ join_clause?)? ~ where_clause? }
```

Parses queries like:
```sql
SELECT id, name FROM users WHERE id = 1
SELECT * FROM users u JOIN posts p ON u.id = p.user_id
```

#### Expressions

```pest
expr = { or_expr }
or_expr = { and_expr ~ (OR_KEY ~ and_expr)* }
and_expr = { not_expr ~ (AND_KEY ~ not_expr)* }
comparison = { addition ~ comparison_suffix* }
```

Supports complex expressions:
```sql
WHERE (price > 100 AND category = 'electronics') OR status = 'active'
WHERE age BETWEEN 18 AND 65
```

#### Functions and Aggregates

```pest
function_call = { identifier ~ "(" ~ (expr ~ ("," ~ expr)*)? ~ ")" }
```

Handles both regular and aggregate functions:
```sql
SELECT COUNT(*), SUM(price), AVG(rating) FROM products
SELECT UPPER(name), CONCAT(first_name, ' ', last_name) FROM users
```

## Technology Stack


### Core Technologies


- **Rust**: High-performance systems programming language with memory safety guarantees
- **Pest**: Fast parser generator for Rust with expressive grammar definitions
- **Serde**: Serialization framework for JSON output and structured data handling

### Architecture


The parser follows a modular architecture:

1. **Grammar Layer** (`grammar/grammar.pest`): Defines SQL syntax rules using Pest's PEG grammar
2. **Parser Layer** (`src/lib.rs`): Core parsing logic with AST construction
3. **Analysis Layer** (`src/lib.rs`): Metadata extraction and semantic analysis
4. **CLI Layer** (`src/main.rs`): Command-line interface using Clap
5. **Test Layer** (`tests/parser_rules.rs`): Comprehensive unit test coverage

### Key Components


- **SqlParser**: Main parser struct generated by Pest derive macro
- **QueryMetadata**: Data structure containing extracted query information
- **JoinInfo**: Structure representing JOIN operation details
- **CLI Commands**: Parse, help, and credits subcommands

## Analysis Examples


The parser extracts comprehensive metadata from SQL queries. Here are examples of analysis output:

### Simple SELECT Query


**Input:**
```sql
SELECT id, name FROM users WHERE active = true
```

**Analysis Output:**
```json
{
  "tables": ["users"],
  "columns": ["id", "name", "active"],
  "aliases": {},
  "functions": [],
  "aggregates": [],
  "joins": []
}
```

### Complex Query with JOINs and Aggregates


**Input:**
```sql
SELECT u.name, COUNT(o.id) as order_count, SUM(o.total) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2023-01-01'
GROUP BY u.id, u.name
HAVING COUNT(o.id) > 0
```

**Analysis Output:**
```json
{
  "tables": ["users", "orders"],
  "columns": ["u.name", "o.id", "o.total", "u.created_at", "u.id"],
  "aliases": {
    "u": "users",
    "o": "orders"
  },
  "functions": ["COUNT", "SUM"],
  "aggregates": ["COUNT", "SUM"],
  "joins": [
    {
      "join_type": "LEFT",
      "table": "orders",
      "alias": "o",
      "condition": "u.id = o.user_id"
    }
  ]
}
```

### INSERT Statement Analysis


**Input:**
```sql
INSERT INTO products (name, price, category) VALUES ('Laptop', 999.99, 'Electronics')
```

**Analysis Output:**
```json
{
  "tables": ["products"],
  "columns": ["name", "price", "category"],
  "aliases": {},
  "functions": [],
  "aggregates": [],
  "joins": []
}
```

## Repository


[https://github.com/Lialoonk/sql-query-parser](https://github.com/Lialoonk/sql-query-parser)