sqlite_wasm_reader
A pure Rust SQLite reader library designed for WASI (WebAssembly System Interface) environments. This library provides read-only access to SQLite databases without any C dependencies, making it perfect for WebAssembly applications running in WasmTime or other WASI-compatible runtimes.
Version 0.3.1
This version introduces comprehensive SQL query support with enhanced WHERE clause capabilities including logical operators (AND
, OR
, NOT
), null checks (IS NULL
, IS NOT NULL
), membership testing (IN
), range queries (BETWEEN
), pattern matching (LIKE
), and complex expressions with parentheses.
See CHANGELOG.md for detailed release information.
Features
- Pure Rust Implementation: No C dependencies, fully written in Rust
- WASI Compatible: Designed to work in WebAssembly environments with WASI support
- Read-Only Access: Focused on reading SQLite databases (no write operations)
- Minimal Dependencies: Only depends on
byteorder
andthiserror
- Configurable Logging: Built-in logging system with multiple levels (Error, Warn, Info, Debug, Trace)
- Robust B-tree Traversal: Proper in-order traversal with cycle detection
- Memory Efficient: Designed to handle large databases with limited memory constraints
- Simple API: Easy-to-use interface for reading tables and data
- Deterministic Results: Consistent query results across different runs and environments
- Robust Data Handling: Graceful handling of edge cases like NaN values without runtime panics
- Performance Optimized: Minimized memory allocations and optimized page parsing for better performance
Why Read-Only?
This library is intentionally designed as read-only for several important reasons:
1. WASI Environment Constraints
- Sandboxed Execution: WASI environments are designed for security and isolation, often with restricted file system access
- No Native Dependencies: Avoiding C bindings eliminates compatibility issues in WebAssembly runtimes
- Deterministic Behavior: Read-only operations are more predictable and safer in sandboxed environments
2. Security and Safety
- Immutable Data: Prevents accidental data corruption or malicious modifications
- Audit Trail: Read-only access ensures data integrity for forensic and compliance purposes
- No Lock Contention: Eliminates complex locking mechanisms required for concurrent writes
3. Performance and Reliability
- Simplified Architecture: Read-only design allows for optimized, streamlined code paths
- Reduced Complexity: No need to handle transaction management, rollbacks, or write-ahead logging
- Memory Efficiency: Lower memory footprint without write buffers and transaction logs
4. Use Case Alignment
- Data Analysis: Perfect for reading and analyzing existing SQLite databases
- Reporting: Generate reports from production databases without modification risk
- Audit and Compliance: Safe access to sensitive data for regulatory requirements
- Data Migration: Read data from SQLite for migration to other systems
Target Use Cases
This library is specifically designed for the following scenarios:
1. WebAssembly Data Processing
// Process SQLite data in a WASI environment
use ;
use SelectQuery;
2. Sandboxed Analytics
- Security Scanning: Analyze file metadata and content in isolated environments
- Malware Detection: Read database files for threat analysis without execution risk
- Content Analysis: Process user-generated content in secure containers
3. Edge Computing and IoT
- Local Data Access: Read configuration databases on edge devices
- Offline Analytics: Process data when network connectivity is limited
- Resource-Constrained Environments: Lightweight database access for embedded systems
4. Data Pipeline Integration
// Extract data from SQLite for ETL processes
5. Forensic and Compliance
- Data Auditing: Safely examine databases for compliance verification
- Digital Forensics: Read evidence databases without contamination
- Regulatory Reporting: Generate reports from production systems
6. Development and Testing
- Test Data Access: Read test databases in CI/CD pipelines
- Development Tools: Build tools that analyze database schemas and content
- Debugging: Examine database state during development
When to Use This Library
Use this library when you need to:
- ✅ Read SQLite databases in WASI/WebAssembly environments
- ✅ Analyze data without modifying the source database
- ✅ Work in sandboxed or restricted environments
- ✅ Build lightweight, dependency-free applications
- ✅ Process large databases with memory constraints
- ✅ Integrate SQLite reading into data pipelines
Why Writing from WASM Sandboxes is Problematic
This library is intentionally read-only because writing to SQLite from WebAssembly sandboxes presents significant risks:
Data Corruption from Concurrent Writes
- Multiple WASM Instances: When multiple WebAssembly instances write to the same SQLite database simultaneously, they can corrupt the database structure
- No File Locking: WASI environments often lack proper file locking mechanisms that SQLite relies on for write safety
Technical Limitations
- WAL Mode Issues: SQLite's Write-Ahead Logging requires coordination of multiple files that may not be available in sandboxed environments
- Shared Memory Problems: SQLite's locking mechanisms rely on shared memory regions that may not be properly isolated in WASM
- Partial Writes: If a WASM instance crashes during a write operation, the database can be left in an inconsistent state
Installation
Add this to your Cargo.toml
:
[]
= "0.3.1"
Quick Start
use ;
Logging
The library includes a configurable logging system to help with debugging and monitoring:
use ;
// Initialize with custom log level
init_logger;
// Change log level at runtime
set_log_level;
// Log levels available:
// - Error: Critical errors that prevent operation
// - Warn: Important warnings and errors
// - Info: General information about operations (default)
// - Debug: Detailed debugging information
// - Trace: Very detailed tracing information
API Reference
Database Operations
// Open a database
let mut db = open?;
// List all tables
let tables = db.tables?;
// Execute a query using indexes
let query = parse?;
let rows = db.execute_query?;
// Count rows in a table efficiently
let count = db.count_table_rows?;
Query Builder Helpers
For programmatic construction of SELECT
queries without writing raw SQL, use the fluent helper API:
use ;
let query = new
.select_columns
.with_where
.with_order_by
.with_limit;
let rows = db.execute_query?;
SQL Query Support
sqlite_wasm_reader
lets you query data either by parsing raw SQL or by constructing SelectQuery
objects directly and executing them with Database::execute_query()
.
use ;
use SelectQuery;
use ;
use Value;
Both paths end in a call to execute_query
, which accepts any SelectQuery
(parsed or manually constructed). This method uses intelligent query processing:
- Index Acceleration: Automatically uses available indexes for exact equality matches when suitable indexes exist
- Table Scan Fallback: Seamlessly falls back to full table scans when no suitable index is found, ensuring all queries work
- WHERE filtering with logical operators (
AND
,OR
,NOT
),LIKE
,IN
,BETWEEN
,IS NULL
/IS NOT NULL
, and parentheses - Column projection (
SELECT *
or explicit columns) ORDER BY
andLIMIT
processing in memory
Use whichever style (raw SQL vs builder) best fits your workflow.
Value Types
The library supports all basic SQLite types:
use Value;
// NULL values
Null
// Integer values
Integer
// Floating point values
Real
// Text values
Text
// BLOB values
Blob
Row Access
Rows are represented as HashMap<String, Value>
:
for row in rows
Building for WASI
To build this crate for WASI target:
# Add the WASI target
# Build the project
Running with WasmTime
# Run with wasmtime
# Run with file access
Examples
The library includes several examples demonstrating different use cases:
Basic Database Reading
use ;
Efficient Row Counting
use ;
use SelectQuery;
Logging and Debugging
use ;
Running the Examples
# Basic database reading
# Logging example with custom log level
# Efficient row counting
# WASI-compatible example
Limitations
- Read-Only: This library only supports reading SQLite databases, not writing
- Basic SQL Types: Supports NULL, INTEGER, REAL, TEXT, and BLOB types
- Partial Index Support: Uses indexes for exact equality matches when available, falls back to table scans for complex queries or when no suitable index exists
- Simple Schema Parsing: Basic CREATE TABLE parsing for column names
- Memory Constraints: Executing
SELECT *
on very large tables can be memory-intensive. Prefer filtering with WHERE clauses and/or fetching data in smaller chunks usingLIMIT
/OFFSET
whenever possible.
Architecture
The library is structured into several modules:
format
: SQLite file format constants and structurespage
: Page reading and parsingbtree
: B-tree traversal for table data with cycle detectionrecord
: SQLite record parsingvalue
: Value types (NULL, INTEGER, REAL, TEXT, BLOB)database
: Main database interfacelogging
: Configurable logging systemerror
: Error types and handling
Performance Considerations
- Memory Usage: For huge datasets, process data in pages via repeated queries with
LIMIT
/OFFSET
, or add selective WHERE conditions to minimize the rows materialized at once. - B-tree Traversal: The library uses efficient in-order traversal with cycle detection
- Logging Overhead: Set appropriate log levels to minimize performance impact
- WASI Environment: Optimized for WebAssembly environments with limited resources
- Row Counting: Use
count_table_rows()
for efficient row counting without loading data
Error Handling
The library provides comprehensive error handling:
use ;
use SelectQuery;
match open
Robustness Features
The library is designed for production use with several robustness features:
Deterministic Query Results
- Index-based queries return results in consistent, sorted order
- No unpredictable row ordering that could cause flaky tests or inconsistent behavior
- Reliable for applications that depend on consistent data presentation
Robust Data Handling
- Graceful handling of NaN values in floating-point comparisons
- No runtime panics on valid SQLite data
- Proper error recovery for malformed records
- Safe handling of edge cases and corrupted data
Memory Safety
- Optimized memory usage with minimal allocations
- Cycle detection in B-tree traversal to prevent infinite loops
- Bounds checking to prevent buffer overflows
- Safe handling of large databases with limited memory constraints
Error Recovery
- Comprehensive error types for different failure scenarios
- Graceful degradation when encountering problematic data
- Detailed logging for debugging and monitoring
- Safe fallbacks when optimal paths fail
License
This project is licensed under Apache License, Version 2.0, (LICENSE or http://www.apache.org/licenses/LICENSE-2.0)
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.