sql_query_analyzer-0.1.0 is not a library.

Visit the last successful build: sql_query_analyzer-0.5.2

SQL Query Analyzer

Static analysis and LLM-powered optimization for SQL queries.

A comprehensive SQL analysis tool that combines fast, deterministic static analysis with optional AI-powered insights. Identifies performance issues, style violations, and security vulnerabilities in your SQL queries.

Highlights

18 Built-in Rules — Performance, style, and security checks run instantly without API calls
Schema-Aware Analysis — Validates queries against your database schema, suggests missing indexes
Multiple Output Formats — Text, JSON, YAML, and SARIF for CI/CD integration
Parallel Execution — Rules execute concurrently using rayon
Optional LLM Analysis — Deep semantic analysis via OpenAI, Anthropic, or local Ollama
Configurable — Disable rules, override severity levels, customize via TOML

Installation

From source

cargo install --path .

Pre-built binaries

Download from Releases.

Quick Start

# Run static analysis (no API key required)
sql-query-analyzer analyze -s schema.sql -q queries.sql

# Output as SARIF for CI/CD
sql-query-analyzer analyze -s schema.sql -q queries.sql -f sarif > results.sarif

# Pipe queries from stdin
echo "SELECT * FROM users" | sql-query-analyzer analyze -s schema.sql -q -

# Enable LLM analysis
export LLM_API_KEY="sk-..."
sql-query-analyzer analyze -s schema.sql -q queries.sql --provider openai

Rules

Performance Rules

ID	Rule	Severity	Description
`PERF001`	Select star without limit	Warning	`SELECT *` without `LIMIT` can return unbounded rows
`PERF002`	Leading wildcard	Warning	`LIKE '%value'` prevents index usage
`PERF003`	OR instead of IN	Info	Multiple `OR` conditions can be simplified to `IN`
`PERF004`	Large offset	Warning	`OFFSET > 1000` causes performance degradation
`PERF005`	Missing join condition	Error	Cartesian product detected
`PERF006`	Distinct with order by	Info	Potentially redundant operations
`PERF007`	Scalar subquery	Warning	N+1 query pattern detected
`PERF008`	Function on column	Warning	Function calls prevent index usage
`PERF009`	NOT IN with subquery	Warning	Can cause unexpected NULL behavior
`PERF010`	UNION without ALL	Info	Unnecessary deduplication overhead
`PERF011`	Select without where	Info	Full table scan on large tables

Style Rules

ID	Rule	Severity	Description
`STYLE001`	Select star	Info	Explicit column list preferred
`STYLE002`	Missing table alias	Info	Multi-table queries should use aliases

Security Rules

ID	Rule	Severity	Description
`SEC001`	Missing WHERE in UPDATE	Error	Potentially dangerous bulk update
`SEC002`	Missing WHERE in DELETE	Error	Potentially dangerous bulk delete

Schema-Aware Rules

ID	Rule	Severity	Description
`SCHEMA001`	Missing index on filter	Warning	WHERE/JOIN column lacks index
`SCHEMA002`	Column not in schema	Warning	Referenced column doesn't exist
`SCHEMA003`	Index suggestion	Info	ORDER BY column could benefit from index

Configuration

Configuration is loaded from (in order of precedence):

Command-line arguments
Environment variables
.sql-analyzer.toml in current directory
~/.config/sql-analyzer/config.toml

Example Configuration

[rules]
# Disable specific rules by ID
disabled = ["STYLE001", "PERF011"]

# Override default severity levels
[rules.severity]
PERF001 = "error"      # Promote to error
SCHEMA001 = "info"     # Demote to info

[llm]
provider = "ollama"
model = "codellama"
ollama_url = "http://localhost:11434"

[retry]
max_retries = 3
initial_delay_ms = 1000
max_delay_ms = 30000
backoff_factor = 2.0

Environment Variables

Variable	Description
`LLM_API_KEY`	API key for OpenAI/Anthropic
`LLM_PROVIDER`	Provider name (openai, anthropic, ollama)
`LLM_MODEL`	Model identifier
`OLLAMA_URL`	Ollama base URL

CLI Reference

sql-query-analyzer analyze [OPTIONS] -s <SCHEMA> -q <QUERIES>

Options

Flag	Description	Default
`-s, --schema <FILE>`	Path to SQL schema file	required
`-q, --queries <FILE>`	Path to SQL queries file (use `-` for stdin)	required
`-p, --provider <PROVIDER>`	LLM provider: `openai`, `anthropic`, `ollama`	`ollama`
`-a, --api-key <KEY>`	API key (or use `LLM_API_KEY` env)	-
`-m, --model <MODEL>`	Model name	provider default
`--ollama-url <URL>`	Ollama base URL	`http://localhost:11434`
`--dialect <DIALECT>`	SQL dialect: `generic`, `mysql`, `postgresql`, `sqlite`	`generic`
`-f, --output-format <FMT>`	Output: `text`, `json`, `yaml`, `sarif`	`text`
`-v, --verbose`	Show complexity scores	false
`--dry-run`	Show what would be sent to LLM	false
`--no-color`	Disable colored output	false

Exit Codes

Code	Meaning
`0`	Success, no issues or only informational
`1`	Warnings found
`2`	Errors found

Example

schema.sql:

CREATE TABLE users (
    id INT PRIMARY KEY,
    email VARCHAR(255) NOT NULL,
    created_at TIMESTAMP
);

CREATE TABLE orders (
    id INT PRIMARY KEY,
    user_id INT NOT NULL,
    total DECIMAL(10,2),
    status VARCHAR(20)
);

CREATE INDEX idx_orders_user ON orders(user_id);

queries.sql:

SELECT * FROM users WHERE email = 'test@example.com';
SELECT * FROM orders WHERE user_id = 1 ORDER BY created_at DESC;
DELETE FROM users;

Output:

=== Static Analysis ===
Found 1 error(s), 2 warning(s), 1 info

Query #1:
  [ERROR] SEC002: DELETE without WHERE clause is dangerous
         → Add WHERE clause to limit affected rows
  [ WARN] SCHEMA001: Column 'email' in WHERE clause has no index
         → Consider adding index on 'email'

Query #2:
  [ WARN] SCHEMA001: Column 'created_at' in ORDER BY has no index
         → Consider adding index on 'created_at'
  [ INFO] SCHEMA003: ORDER BY column 'created_at' could benefit from index
         → CREATE INDEX idx_created_at ON table(created_at)

CI/CD Integration

GitHub Actions

name: SQL Analysis

on:
  pull_request:
    paths:
      - '**/*.sql'

jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install sql-query-analyzer
        run: cargo install sql-query-analyzer

      - name: Analyze SQL
        run: |
          sql-query-analyzer analyze \
            -s db/schema.sql \
            -q db/queries.sql \
            -f sarif > results.sarif

      - name: Upload SARIF
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: results.sarif

GitLab CI

sql-analysis:
  stage: test
  script:
    - cargo install sql-query-analyzer
    - sql-query-analyzer analyze -s schema.sql -q queries.sql -f sarif > gl-sast-report.json
  artifacts:
    reports:
      sast: gl-sast-report.json

Pre-commit Hook

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: sql-analyzer
        name: SQL Query Analyzer
        entry: sql-query-analyzer analyze -s schema.sql -q
        language: system
        files: \.sql$

LLM Providers

Provider	Model Examples	Notes
OpenAI	`gpt-4`, `gpt-3.5-turbo`	Requires API key
Anthropic	`claude-sonnet-4-20250514`	Requires API key
Ollama	`llama3.2`, `codellama`, `mistral`	Local, no API key

Using Ollama (Recommended for Development)

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama3.2

# Run analysis
sql-query-analyzer analyze -s schema.sql -q queries.sql

Architecture

┌─────────────────────────────────────────────────────┐
│                    CLI Interface                     │
└─────────────────────┬───────────────────────────────┘
                      │
         ┌────────────┴────────────┐
         ▼                         ▼
┌─────────────────┐       ┌─────────────────┐
│  SQL Parser     │       │  Schema Parser  │
│  (sqlparser)    │       │  (sqlparser)    │
└────────┬────────┘       └────────┬────────┘
         │                         │
         └────────────┬────────────┘
                      ▼
         ┌────────────────────────┐
         │    Static Analysis     │
         │  (18 rules, parallel)  │
         └────────────┬───────────┘
                      │
                      ▼
         ┌────────────────────────┐
         │   LLM Analysis (opt)   │
         │  OpenAI/Anthropic/     │
         │  Ollama                │
         └────────────┬───────────┘
                      │
                      ▼
         ┌────────────────────────┐
         │   Output Formatter     │
         │  Text/JSON/YAML/SARIF  │
         └────────────────────────┘

Performance

Parallel rule execution via rayon
Query caching to avoid re-parsing identical queries
Lazy evaluation for complexity scoring
Memory-efficient string storage with CompactString

Typical performance: ~1000 queries analyzed in <100ms (static analysis only).

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Development

# Run tests
cargo test

# Run with all checks
cargo clippy --all-targets -- -D warnings

# Generate docs
cargo doc --open

# Format code
cargo fmt

Acknowledgements

The idea for this tool came from Yegor Bugayenko:

It would be great to have a tool that takes two inputs: 1) the entire database schema in SQL, and 2) all SQL queries that my web app issues to the database during unit testing. The tool should use an LLM to analyze the queries and identify which ones are suboptimal, especially with respect to the existing indexes.

sql_query_analyzer 0.1.0