oxigdal-query 0.1.2

SQL-like query language and cost-based optimizer for geospatial data

oxigdal-query is a high-performance SQL-like query language and cost-based optimizer for geospatial data. The crate provides a complete query engine covering parsing, optimization, parallel execution, and result caching, aimed at efficient data filtering and transformation.

Features

  • SQL-like Query Language: Parse and execute SQL queries with full support for SELECT, WHERE, JOIN, GROUP BY, ORDER BY, LIMIT, and OFFSET clauses
  • Cost-Based Query Optimizer: Intelligent query optimization using cost models for join reordering, predicate pushdown, and constant folding
  • Parallel Query Execution: Built-in support for parallelized query execution using Rayon with efficient data batching
  • Result Caching: Configurable query result caching with TTL and size limits using DashMap for concurrent access
  • Index Selection: Automatic index selection for optimized query execution
  • Geospatial Support: Full support for spatial predicates including ST_Intersects and other PostGIS-compatible functions
  • Async/Await: Non-blocking asynchronous query execution with Tokio integration
  • Error Handling: Comprehensive error types with detailed position information for parsing errors
  • Pure Rust: 100% Pure Rust implementation with no C/Fortran dependencies
  • Performance: Benchmarks cover the query parser, optimizer, and executor to establish baseline performance metrics

Installation

Add to your Cargo.toml:

[dependencies]
oxigdal-query = "0.1.2"
oxigdal-core = "0.1.2"
tokio = { version = "1", features = ["full"] }

Quick Start

Basic Query Parsing and Optimization

use oxigdal_query::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse SQL query
    let sql = "SELECT id, name FROM users WHERE age > 18";
    let statement = parser::sql::parse_sql(sql)?;

    // Optimize query
    let optimizer = optimizer::Optimizer::new();
    let optimized = optimizer.optimize(statement)?;

    println!("Original cost: {}", optimized.original_cost.total());
    println!("Optimized cost: {}", optimized.optimized_cost.total());
    Ok(())
}

Query Engine with Execution

use oxigdal_query::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create query engine
    let mut engine = QueryEngine::new();

    // Register data sources
    // (Implement DataSource trait for your data)
    // engine.register_data_source("users".to_string(), your_data_source);

    // Execute SQL query
    let sql = "SELECT id, name FROM users WHERE age > 18 ORDER BY name LIMIT 10";
    let results = engine.execute_sql(sql).await?;

    println!("Results: {} rows", results.iter().map(|b| b.num_rows).sum::<usize>());
    Ok(())
}
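
To make the semantics of the query above concrete, here is a minimal std-only sketch of what `WHERE age > 18 ORDER BY name LIMIT n` computes, written with plain iterators. It does not use oxigdal-query at all; the `User` struct, `adults_by_name`, and `sample_users` are hypothetical names introduced only for illustration.

```rust
/// Hypothetical row type standing in for a record from a `users` source.
#[derive(Debug, Clone, PartialEq)]
struct User {
    id: u32,
    name: String,
    age: u32,
}

/// Mirrors `SELECT id, name FROM users WHERE age > 18 ORDER BY name LIMIT n`.
fn adults_by_name(users: &[User], limit: usize) -> Vec<(u32, String)> {
    // WHERE age > 18
    let mut filtered: Vec<&User> = users.iter().filter(|u| u.age > 18).collect();
    // ORDER BY name (ascending; the sort is stable, so ties keep input order)
    filtered.sort_by(|a, b| a.name.cmp(&b.name));
    // LIMIT n, followed by the `SELECT id, name` projection
    filtered
        .into_iter()
        .take(limit)
        .map(|u| (u.id, u.name.clone()))
        .collect()
}

/// Illustrative data only.
fn sample_users() -> Vec<User> {
    vec![
        User { id: 1, name: "Mika".to_string(), age: 34 },
        User { id: 2, name: "Aoi".to_string(), age: 17 },
        User { id: 3, name: "Ken".to_string(), age: 52 },
        User { id: 4, name: "Emi".to_string(), age: 19 },
    ]
}

fn main() {
    for (id, name) in adults_by_name(&sample_users(), 2) {
        println!("{id}: {name}");
    }
}
```

A real engine would typically avoid materializing and sorting every matching row when a LIMIT is present; the sketch favors clarity over efficiency.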

Query Explanation

use oxigdal_query::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let engine = QueryEngine::new();

    // Get query execution plan
    let sql = "SELECT COUNT(*), AVG(age) FROM users GROUP BY country";
    let explain = engine.explain_sql(sql)?;

    println!("Query Plan:");
    for node in &explain.nodes {
        println!("  - {}: {}", node.node_type, node.description);
    }
    println!("Total Cost: {}", explain.total_cost.total());
    Ok(())
}
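
The `GROUP BY country` query in the plan above boils down to keeping a small running state per group. The following std-only sketch shows one way such an aggregate could work; it is not the crate's executor, and `Agg` and `group_by_country` are hypothetical names. A `BTreeMap` is used here so the groups come out in a deterministic order.

```rust
use std::collections::BTreeMap;

/// Per-group running state for COUNT(*) and AVG(age).
#[derive(Debug, Default, Clone, Copy)]
struct Agg {
    count: u64,
    age_sum: u64,
}

impl Agg {
    /// AVG(age) for the group; `None` if the group holds no rows.
    fn avg_age(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.age_sum as f64 / self.count as f64)
        }
    }
}

/// Groups `(country, age)` rows by country in a single pass.
fn group_by_country(rows: &[(&str, u32)]) -> BTreeMap<String, Agg> {
    let mut groups: BTreeMap<String, Agg> = BTreeMap::new();
    for (country, age) in rows {
        let agg = groups.entry((*country).to_string()).or_default();
        agg.count += 1;
        agg.age_sum += u64::from(*age);
    }
    groups
}

fn main() {
    let rows = [("JP", 30), ("JP", 40), ("DE", 20)];
    for (country, agg) in &group_by_country(&rows) {
        if let Some(avg) = agg.avg_age() {
            println!("{country}: count = {}, avg age = {avg:.1}", agg.count);
        }
    }
}
```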

Usage

Basic Query Execution

use oxigdal_query::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut engine = QueryEngine::new();

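    // Parse, optimize, and execute a simple query.
    // Assumes a data source named "users" has been registered (see Quick Start).
    let results = engine.execute_sql("SELECT * FROM users").await?;
    println!("Results: {} rows", results.iter().map(|b| b.num_rows).sum::<usize>());
    Ok(())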
}

With Custom Optimizer Configuration

use oxigdal_query::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create optimizer with custom config
    let config = OptimizerConfig {
        max_passes: 5,
        enable_predicate_pushdown: true,
        enable_join_reordering: true,
        enable_constant_folding: true,
        enable_cse: true,
        enable_filter_fusion: true,
        enable_projection_pushdown: true,
    };

    let optimizer = Optimizer::with_config(config);
    let sql = "SELECT * FROM users WHERE age > 18 AND status = 'active'";
    let statement = parser::sql::parse_sql(sql)?;

    let optimized = optimizer.optimize(statement)?;
    println!("Cost reduction: {:.2}%",
        (1.0 - optimized.optimized_cost.total() / optimized.original_cost.total()) * 100.0);
    Ok(())
}
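
The cost-reduction percentage printed above divides by the original cost, so a zero-cost plan would yield NaN or infinity. A small guard can make that case explicit. This sketch assumes the cost totals are `f64` values (the README's arithmetic suggests so, but the actual return type of `total()` is not confirmed here); `cost_reduction_percent` is a hypothetical helper, not part of the crate.

```rust
/// Percentage by which `optimized` undercuts `original`.
///
/// Returns `None` when the inputs are not finite or `original` is not
/// positive, since the ratio is meaningless in those cases.
fn cost_reduction_percent(original: f64, optimized: f64) -> Option<f64> {
    if !original.is_finite() || !optimized.is_finite() || original <= 0.0 {
        return None;
    }
    Some((1.0 - optimized / original) * 100.0)
}

fn main() {
    match cost_reduction_percent(200.0, 150.0) {
        Some(pct) => println!("Cost reduction: {pct:.2}%"),
        None => println!("Cost reduction: not applicable"),
    }
}
```

A negative result means the optimized plan was estimated to be more expensive than the original.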

Query Result Caching

use oxigdal_query::*;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create cache with custom config
    let cache_config = CacheConfig {
        max_size_bytes: 2 * 1024 * 1024 * 1024, // 2 GB
        ttl: Duration::from_secs(600), // 10 minutes
        enabled: true,
    };

    let optimizer_config = OptimizerConfig::default();
    let mut engine = QueryEngine::with_config(optimizer_config, cache_config);

    // First execution - cache miss
    let sql = "SELECT COUNT(*) FROM large_table";
    let result1 = engine.execute_sql(sql).await?;

    // Second execution - cache hit
    let result2 = engine.execute_sql(sql).await?;

    let stats = engine.cache_statistics();
    println!("Cache hits: {}", stats.hits);
    println!("Cache misses: {}", stats.misses);
    Ok(())
}
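
To illustrate how a TTL and a byte budget interact, here is a deliberately simplified, single-threaded cache sketch. It is not the crate's implementation: the README states the real cache uses DashMap for concurrent access, whereas this version swaps in a plain `HashMap`, stores raw bytes, and rejects oversized inserts instead of evicting. `TtlCache` and `Entry` are hypothetical names. The current time is passed in explicitly so expiry is easy to exercise.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct Entry {
    value: Vec<u8>,
    inserted_at: Instant,
}

/// Single-threaded TTL + size-bounded cache keyed by SQL text.
struct TtlCache {
    ttl: Duration,
    max_size_bytes: usize,
    used_bytes: usize,
    entries: HashMap<String, Entry>,
    hits: u64,
    misses: u64,
}

impl TtlCache {
    fn new(ttl: Duration, max_size_bytes: usize) -> Self {
        Self { ttl, max_size_bytes, used_bytes: 0, entries: HashMap::new(), hits: 0, misses: 0 }
    }

    /// Returns the cached bytes if present and younger than the TTL.
    fn get(&mut self, key: &str, now: Instant) -> Option<&[u8]> {
        let expired = match self.entries.get(key) {
            Some(entry) => now.duration_since(entry.inserted_at) >= self.ttl,
            None => {
                self.misses += 1;
                return None;
            }
        };
        if expired {
            // Drop the stale entry and give its bytes back to the budget.
            if let Some(old) = self.entries.remove(key) {
                self.used_bytes -= old.value.len();
            }
            self.misses += 1;
            return None;
        }
        self.hits += 1;
        self.entries.get(key).map(|entry| entry.value.as_slice())
    }

    /// Stores `value`; returns `false` if it would exceed the byte budget.
    fn put(&mut self, key: &str, value: Vec<u8>, now: Instant) -> bool {
        if let Some(old) = self.entries.remove(key) {
            self.used_bytes -= old.value.len();
        }
        if self.used_bytes + value.len() > self.max_size_bytes {
            return false;
        }
        self.used_bytes += value.len();
        self.entries.insert(key.to_string(), Entry { value, inserted_at: now });
        true
    }
}

fn main() {
    let start = Instant::now();
    let mut cache = TtlCache::new(Duration::from_secs(600), 1024);
    cache.put("SELECT COUNT(*) FROM large_table", vec![42], start);
    let _ = cache.get("SELECT COUNT(*) FROM large_table", start + Duration::from_secs(5));
    let _ = cache.get("SELECT COUNT(*) FROM large_table", start + Duration::from_secs(700));
    println!("hits = {}, misses = {}", cache.hits, cache.misses);
}
```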

Geospatial Queries

use oxigdal_query::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let sql = "SELECT geom, name FROM buildings WHERE ST_Intersects(geom, ST_MakeEnvelope(0, 0, 100, 100))";
    let statement = parser::sql::parse_sql(sql)?;

    let optimizer = Optimizer::new();
    let optimized = optimizer.optimize(statement)?;
    println!("Spatial query optimized successfully");
    Ok(())
}
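
For intuition about the predicate in the query above: when both arguments are axis-aligned rectangles, such as the result of `ST_MakeEnvelope`, an intersection test reduces to comparing intervals on each axis. The sketch below shows that check in isolation. `Envelope` is a hypothetical type, not one exported by the crate, and for general geometries a bounding-box test like this is only a pre-filter before an exact test.

```rust
/// Axis-aligned bounding box with normalized corners.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Envelope {
    min_x: f64,
    min_y: f64,
    max_x: f64,
    max_y: f64,
}

impl Envelope {
    /// Builds an envelope from two opposite corners in any order.
    fn new(x1: f64, y1: f64, x2: f64, y2: f64) -> Self {
        Self {
            min_x: x1.min(x2),
            min_y: y1.min(y2),
            max_x: x1.max(x2),
            max_y: y1.max(y2),
        }
    }

    /// True if the two boxes share at least one point.
    /// Touching edges count, matching the usual ST_Intersects semantics.
    fn intersects(&self, other: &Envelope) -> bool {
        self.min_x <= other.max_x
            && other.min_x <= self.max_x
            && self.min_y <= other.max_y
            && other.min_y <= self.max_y
    }
}

fn main() {
    let query_window = Envelope::new(0.0, 0.0, 100.0, 100.0);
    let building = Envelope::new(50.0, 50.0, 150.0, 150.0);
    println!("intersects: {}", query_window.intersects(&building));
}
```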

API Overview

Module     Description
parser     SQL query parsing with AST generation - supports standard SQL SELECT statements
optimizer  Cost-based query optimization with configurable rules and heuristics
executor   Query execution engine with support for scans, filters, joins, aggregations, and sorting
cache      Query result caching with TTL and size management
index      Index selection and management for optimized data access
explain    Query plan explanation and visualization with cost estimates
parallel   Parallel execution utilities for distributed query processing
error      Comprehensive error types with detailed diagnostics
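
As a rough picture of the batch-level parallelism that the parallel module and the Rayon-based feature describe, the sketch below splits rows into batches, evaluates a filter on each batch in its own thread, and sums the partial counts. It uses `std::thread::scope` from the standard library instead of Rayon so that it has no dependencies; `parallel_count_over` is a hypothetical function, not part of the crate's API.

```rust
use std::thread;

/// Counts values greater than `min_age`, processing one batch per scoped thread.
///
/// Returns `Err` with the panic payload if any worker thread panicked.
fn parallel_count_over(ages: &[u32], min_age: u32, batch_size: usize) -> thread::Result<usize> {
    // `chunks` panics on a zero size, so clamp to at least one row per batch.
    let batch_size = batch_size.max(1);
    thread::scope(|scope| {
        // Spawn every worker first so the batches actually run concurrently.
        let handles: Vec<_> = ages
            .chunks(batch_size)
            .map(|batch| scope.spawn(move || batch.iter().filter(|&&age| age > min_age).count()))
            .collect();
        // Join the workers and add up the per-batch counts.
        handles
            .into_iter()
            .map(|handle| handle.join())
            .sum::<thread::Result<usize>>()
    })
}

fn main() {
    let ages = [10, 20, 30, 40, 50];
    match parallel_count_over(&ages, 18, 2) {
        Ok(count) => println!("rows with age > 18: {count}"),
        Err(_) => eprintln!("a worker thread panicked"),
    }
}
```

Spawning one thread per batch is only reasonable for a handful of batches; a work-stealing pool such as Rayon's reuses a fixed set of threads instead.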

Optimization Rules

The optimizer implements the following optimization strategies:

  • Predicate Pushdown: Push filter predicates down to reduce data earlier
  • Join Reordering: Reorder joins for optimal execution
  • Constant Folding: Pre-compute constant expressions
  • Common Subexpression Elimination: Eliminate redundant computations
  • Filter Fusion: Combine multiple filters into single operations
  • Projection Pushdown: Push column selections down the tree
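
As an illustration of one rule from the list above, the following sketch applies constant folding to a tiny expression tree: sub-expressions made only of literals are evaluated once at planning time, so `age > 6 * 3` becomes `age > 18`. The `Expr` enum and the `fold` function are hypothetical and far smaller than any real SQL AST; they are not the crate's types.

```rust
/// Minimal expression tree for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int(i64),
    Column(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Gt(Box<Expr>, Box<Expr>),
}

/// Folds both operands, then evaluates the operator if both are literals.
/// On arithmetic overflow the node is rebuilt unchanged instead of folded.
fn fold_binary(
    left: Expr,
    right: Expr,
    op: fn(i64, i64) -> Option<i64>,
    rebuild: fn(Box<Expr>, Box<Expr>) -> Expr,
) -> Expr {
    let (left, right) = (fold(left), fold(right));
    if let (Expr::Int(a), Expr::Int(b)) = (&left, &right) {
        if let Some(value) = op(*a, *b) {
            return Expr::Int(value);
        }
    }
    rebuild(Box::new(left), Box::new(right))
}

/// Recursively replaces constant arithmetic with its computed value.
fn fold(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => fold_binary(*l, *r, i64::checked_add, Expr::Add),
        Expr::Mul(l, r) => fold_binary(*l, *r, i64::checked_mul, Expr::Mul),
        // Comparisons are kept, but their operands may still simplify.
        Expr::Gt(l, r) => Expr::Gt(Box::new(fold(*l)), Box::new(fold(*r))),
        leaf => leaf,
    }
}

fn main() {
    // age > 6 * 3
    let predicate = Expr::Gt(
        Box::new(Expr::Column("age".to_string())),
        Box::new(Expr::Mul(Box::new(Expr::Int(6)), Box::new(Expr::Int(3)))),
    );
    println!("{:?}", fold(predicate));
}
```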

Performance

Approximate timings from the crate's benchmarks (the hardware is not specified, so treat these as indicative only):

Operation            Dataset Size  Time
Parse simple SELECT  N/A           ~10 µs
Parse complex query  N/A           ~50 µs
Optimize query       N/A           ~100 µs
Full table scan      100k rows     ~1 ms
Filtered scan        100k rows     ~2 ms
Aggregate query      100k rows     ~5 ms

Run benchmarks locally:

cargo bench --bench query_bench

Examples

See the tests directory for comprehensive examples:

  • parser_test.rs - SQL parsing examples (simple select, joins, aggregates, spatial queries)
  • executor_test.rs - Query execution patterns
  • optimizer_test.rs - Optimization strategies
  • end_to_end_test.rs - Complete workflows

Error Handling

This library follows a "no unwrap" policy: every fallible operation returns Result<T, QueryError>, and the error variants identify what went wrong:

use oxigdal_query::*;

fn main() {
    let sql = "INVALID SQL";
    match parser::sql::parse_sql(sql) {
        Ok(statement) => println!("Parsed: {:?}", statement),
        Err(QueryError::ParseError { message, line, column }) => {
            eprintln!("Parse error at {}:{}: {}", line, column, message);
        }
        Err(QueryError::SemanticError(msg)) => eprintln!("Semantic error: {}", msg),
        Err(QueryError::ExecutionError(msg)) => eprintln!("Execution error: {}", msg),
        Err(e) => eprintln!("Error: {}", e),
    }
}
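
Parse errors like the one matched above carry a line and a column. One common way to derive them, sketched here under the assumption that the parser tracks a byte offset into the SQL text, is to count newlines before the offset. `line_column` is a hypothetical helper and not part of the crate; both values are 1-based and the column counts characters, not bytes.

```rust
/// Converts a byte offset into a 1-based (line, column) pair.
///
/// Returns `None` if the offset is past the end of `source` or falls
/// inside a multi-byte character.
fn line_column(source: &str, byte_offset: usize) -> Option<(usize, usize)> {
    if !source.is_char_boundary(byte_offset) {
        return None;
    }
    let prefix = &source[..byte_offset];
    // Each newline before the offset starts a new line.
    let line = prefix.matches('\n').count() + 1;
    // The current line begins right after the last newline, if there is one.
    let line_start = prefix.rfind('\n').map_or(0, |index| index + 1);
    let column = prefix[line_start..].chars().count() + 1;
    Some((line, column))
}

fn main() {
    let sql = "SELECT id\nFROM users\nWHERE age >> 18";
    if let Some(offset) = sql.find(">>") {
        if let Some((line, column)) = line_column(sql, offset) {
            eprintln!("Parse error at {line}:{column}: unexpected token");
        }
    }
}
```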

Documentation

Full documentation is available at docs.rs.

Generate and view local documentation:

cargo doc --open

Pure Rust

This library is 100% Pure Rust with no C/Fortran dependencies. All functionality works out of the box without external system libraries.

Contributing

Contributions are welcome! Please ensure:

  • No use of unwrap() - use Result<T, E> instead
  • All tests pass: cargo test --all-features
  • No clippy warnings: cargo clippy
  • Code follows COOLJAPAN ecosystem policies

License

Licensed under Apache-2.0.

See LICENSE file for details.


Part of the COOLJAPAN ecosystem for high-performance geospatial and scientific computing in Pure Rust.