sql_query_analyzer 0.1.0

Static analysis tool for SQL queries with 18 built-in rules for performance, security, and style
sql_query_analyzer-0.1.0 is not a library.
Visit the last successful build: sql_query_analyzer-0.5.2

SQL Query Analyzer

Crates.io Docs.rs CI License: MIT

Static analysis and LLM-powered optimization for SQL queries.

A comprehensive SQL analysis tool that combines fast, deterministic static analysis with optional AI-powered insights. Identifies performance issues, style violations, and security vulnerabilities in your SQL queries.

Highlights

  • 18 Built-in Rules — Performance, style, and security checks run instantly without API calls
  • Schema-Aware Analysis — Validates queries against your database schema, suggests missing indexes
  • Multiple Output Formats — Text, JSON, YAML, and SARIF for CI/CD integration
  • Parallel Execution — Rules execute concurrently using rayon
  • Optional LLM Analysis — Deep semantic analysis via OpenAI, Anthropic, or local Ollama
  • Configurable — Disable rules, override severity levels, customize via TOML

Installation

From source

cargo install --path .

Pre-built binaries

Download from Releases.

Quick Start

# Run static analysis (no API key required)
sql-query-analyzer analyze -s schema.sql -q queries.sql

# Output as SARIF for CI/CD
sql-query-analyzer analyze -s schema.sql -q queries.sql -f sarif > results.sarif

# Pipe queries from stdin
echo "SELECT * FROM users" | sql-query-analyzer analyze -s schema.sql -q -

# Enable LLM analysis
export LLM_API_KEY="sk-..."
sql-query-analyzer analyze -s schema.sql -q queries.sql --provider openai

Rules

Performance Rules

ID Rule Severity Description
PERF001 Select star without limit Warning SELECT * without LIMIT can return unbounded rows
PERF002 Leading wildcard Warning LIKE '%value' prevents index usage
PERF003 OR instead of IN Info Multiple OR conditions can be simplified to IN
PERF004 Large offset Warning OFFSET > 1000 causes performance degradation
PERF005 Missing join condition Error Cartesian product detected
PERF006 Distinct with order by Info Potentially redundant operations
PERF007 Scalar subquery Warning N+1 query pattern detected
PERF008 Function on column Warning Function calls prevent index usage
PERF009 NOT IN with subquery Warning Can cause unexpected NULL behavior
PERF010 UNION without ALL Info Unnecessary deduplication overhead
PERF011 Select without where Info Full table scan on large tables

Style Rules

ID Rule Severity Description
STYLE001 Select star Info Explicit column list preferred
STYLE002 Missing table alias Info Multi-table queries should use aliases

Security Rules

ID Rule Severity Description
SEC001 Missing WHERE in UPDATE Error Potentially dangerous bulk update
SEC002 Missing WHERE in DELETE Error Potentially dangerous bulk delete

Schema-Aware Rules

ID Rule Severity Description
SCHEMA001 Missing index on filter Warning WHERE/JOIN column lacks index
SCHEMA002 Column not in schema Warning Referenced column doesn't exist
SCHEMA003 Index suggestion Info ORDER BY column could benefit from index

Configuration

Configuration is loaded from (in order of precedence):

  1. Command-line arguments
  2. Environment variables
  3. .sql-analyzer.toml in current directory
  4. ~/.config/sql-analyzer/config.toml

Example Configuration

[rules]
# Disable specific rules by ID
disabled = ["STYLE001", "PERF011"]

# Override default severity levels
[rules.severity]
PERF001 = "error"      # Promote to error
SCHEMA001 = "info"     # Demote to info

[llm]
provider = "ollama"
model = "codellama"
ollama_url = "http://localhost:11434"

[retry]
max_retries = 3
initial_delay_ms = 1000
max_delay_ms = 30000
backoff_factor = 2.0

Environment Variables

Variable Description
LLM_API_KEY API key for OpenAI/Anthropic
LLM_PROVIDER Provider name (openai, anthropic, ollama)
LLM_MODEL Model identifier
OLLAMA_URL Ollama base URL

CLI Reference

sql-query-analyzer analyze [OPTIONS] -s <SCHEMA> -q <QUERIES>

Options

Flag Description Default
-s, --schema <FILE> Path to SQL schema file required
-q, --queries <FILE> Path to SQL queries file (use - for stdin) required
-p, --provider <PROVIDER> LLM provider: openai, anthropic, ollama ollama
-a, --api-key <KEY> API key (or use LLM_API_KEY env) -
-m, --model <MODEL> Model name provider default
--ollama-url <URL> Ollama base URL http://localhost:11434
--dialect <DIALECT> SQL dialect: generic, mysql, postgresql, sqlite generic
-f, --output-format <FMT> Output: text, json, yaml, sarif text
-v, --verbose Show complexity scores false
--dry-run Show what would be sent to LLM false
--no-color Disable colored output false

Exit Codes

Code Meaning
0 Success, no issues or only informational
1 Warnings found
2 Errors found

Example

schema.sql:

CREATE TABLE users (
    id INT PRIMARY KEY,
    email VARCHAR(255) NOT NULL,
    created_at TIMESTAMP
);

CREATE TABLE orders (
    id INT PRIMARY KEY,
    user_id INT NOT NULL,
    total DECIMAL(10,2),
    status VARCHAR(20)
);

CREATE INDEX idx_orders_user ON orders(user_id);

queries.sql:

SELECT * FROM users WHERE email = 'test@example.com';
SELECT * FROM orders WHERE user_id = 1 ORDER BY created_at DESC;
DELETE FROM users;

Output:

=== Static Analysis ===
Found 1 error(s), 2 warning(s), 1 info

Query #1:
  [ERROR] SEC002: DELETE without WHERE clause is dangerous
         → Add WHERE clause to limit affected rows
  [ WARN] SCHEMA001: Column 'email' in WHERE clause has no index
         → Consider adding index on 'email'

Query #2:
  [ WARN] SCHEMA001: Column 'created_at' in ORDER BY has no index
         → Consider adding index on 'created_at'
  [ INFO] SCHEMA003: ORDER BY column 'created_at' could benefit from index
         → CREATE INDEX idx_created_at ON table(created_at)

CI/CD Integration

GitHub Actions

name: SQL Analysis

on:
  pull_request:
    paths:
      - '**/*.sql'

jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install sql-query-analyzer
        run: cargo install sql-query-analyzer

      - name: Analyze SQL
        run: |
          sql-query-analyzer analyze \
            -s db/schema.sql \
            -q db/queries.sql \
            -f sarif > results.sarif

      - name: Upload SARIF
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: results.sarif

GitLab CI

sql-analysis:
  stage: test
  script:
    - cargo install sql-query-analyzer
    - sql-query-analyzer analyze -s schema.sql -q queries.sql -f sarif > gl-sast-report.json
  artifacts:
    reports:
      sast: gl-sast-report.json

Pre-commit Hook

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: sql-analyzer
        name: SQL Query Analyzer
        entry: sql-query-analyzer analyze -s schema.sql -q
        language: system
        files: \.sql$

LLM Providers

Provider Model Examples Notes
OpenAI gpt-4, gpt-3.5-turbo Requires API key
Anthropic claude-sonnet-4-20250514 Requires API key
Ollama llama3.2, codellama, mistral Local, no API key

Using Ollama (Recommended for Development)

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama3.2

# Run analysis
sql-query-analyzer analyze -s schema.sql -q queries.sql

Architecture

┌─────────────────────────────────────────────────────┐
│                    CLI Interface                     │
└─────────────────────┬───────────────────────────────┘
                      │
         ┌────────────┴────────────┐
         ▼                         ▼
┌─────────────────┐       ┌─────────────────┐
│  SQL Parser     │       │  Schema Parser  │
│  (sqlparser)    │       │  (sqlparser)    │
└────────┬────────┘       └────────┬────────┘
         │                         │
         └────────────┬────────────┘
                      ▼
         ┌────────────────────────┐
         │    Static Analysis     │
         │  (18 rules, parallel)  │
         └────────────┬───────────┘
                      │
                      ▼
         ┌────────────────────────┐
         │   LLM Analysis (opt)   │
         │  OpenAI/Anthropic/     │
         │  Ollama                │
         └────────────┬───────────┘
                      │
                      ▼
         ┌────────────────────────┐
         │   Output Formatter     │
         │  Text/JSON/YAML/SARIF  │
         └────────────────────────┘

Performance

  • Parallel rule execution via rayon
  • Query caching to avoid re-parsing identical queries
  • Lazy evaluation for complexity scoring
  • Memory-efficient string storage with CompactString

Typical performance: ~1000 queries analyzed in <100ms (static analysis only).

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Development

# Run tests
cargo test

# Run with all checks
cargo clippy --all-targets -- -D warnings

# Generate docs
cargo doc --open

# Format code
cargo fmt

Acknowledgements

The idea for this tool came from Yegor Bugayenko:

It would be great to have a tool that takes two inputs: 1) the entire database schema in SQL, and 2) all SQL queries that my web app issues to the database during unit testing. The tool should use an LLM to analyze the queries and identify which ones are suboptimal, especially with respect to the existing indexes.

License

MIT © 2025