sqlite_wasm_reader 0.3.1

A pure Rust SQLite reader library for WASI environments
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
# sqlite_wasm_reader

A pure Rust SQLite reader library designed for WASI (WebAssembly System Interface) environments. This library provides read-only access to SQLite databases without any C dependencies, making it perfect for WebAssembly applications running in WasmTime or other WASI-compatible runtimes.

## Version 0.3.1

This version introduces comprehensive SQL query support with enhanced WHERE clause capabilities including logical operators (`AND`, `OR`, `NOT`), null checks (`IS NULL`, `IS NOT NULL`), membership testing (`IN`), range queries (`BETWEEN`), pattern matching (`LIKE`), and complex expressions with parentheses.

See [CHANGELOG.md](CHANGELOG.md) for detailed release information.

## Features

- **Pure Rust Implementation**: No C dependencies, fully written in Rust
- **WASI Compatible**: Designed to work in WebAssembly environments with WASI support
- **Read-Only Access**: Focused on reading SQLite databases (no write operations)
- **Minimal Dependencies**: Only depends on `byteorder` and `thiserror`
- **Configurable Logging**: Built-in logging system with multiple levels (Error, Warn, Info, Debug, Trace)
- **Robust B-tree Traversal**: Proper in-order traversal with cycle detection
- **Memory Efficient**: Designed to handle large databases with limited memory constraints
- **Simple API**: Easy-to-use interface for reading tables and data
- **Deterministic Results**: Consistent query results across different runs and environments
- **Robust Data Handling**: Graceful handling of edge cases like NaN values without runtime panics
- **Performance Optimized**: Minimized memory allocations and optimized page parsing for better performance

## Why Read-Only?

This library is intentionally designed as **read-only** for several important reasons:

### 1. **WASI Environment Constraints**
- **Sandboxed Execution**: WASI environments are designed for security and isolation, often with restricted file system access
- **No Native Dependencies**: Avoiding C bindings eliminates compatibility issues in WebAssembly runtimes
- **Deterministic Behavior**: Read-only operations are more predictable and safer in sandboxed environments

### 2. **Security and Safety**
- **Immutable Data**: Prevents accidental data corruption or malicious modifications
- **Audit Trail**: Read-only access ensures data integrity for forensic and compliance purposes
- **No Lock Contention**: Eliminates complex locking mechanisms required for concurrent writes

### 3. **Performance and Reliability**
- **Simplified Architecture**: Read-only design allows for optimized, streamlined code paths
- **Reduced Complexity**: No need to handle transaction management, rollbacks, or write-ahead logging
- **Memory Efficiency**: Lower memory footprint without write buffers and transaction logs

### 4. **Use Case Alignment**
- **Data Analysis**: Perfect for reading and analyzing existing SQLite databases
- **Reporting**: Generate reports from production databases without modification risk
- **Audit and Compliance**: Safe access to sensitive data for regulatory requirements
- **Data Migration**: Read data from SQLite for migration to other systems

## Target Use Cases

This library is specifically designed for the following scenarios:

### 1. **WebAssembly Data Processing**
```rust
// Process SQLite data in a WASI environment
use sqlite_wasm_reader::{Database, Error};
use sqlite_wasm_reader::query::SelectQuery;

fn analyze_user_data(db_path: &str) -> Result<(), Error> {
    let mut db = Database::open(db_path)?;
    let users = db.execute_query(&SelectQuery::parse("SELECT * FROM users")?)?;
    
    // Perform analysis without modifying the database
    for user in users {
        // Analyze user data...
    }
    Ok(())
}
```

### 2. **Sandboxed Analytics**
- **Security Scanning**: Analyze file metadata and content in isolated environments
- **Malware Detection**: Read database files for threat analysis without execution risk
- **Content Analysis**: Process user-generated content in secure containers

### 3. **Edge Computing and IoT**
- **Local Data Access**: Read configuration databases on edge devices
- **Offline Analytics**: Process data when network connectivity is limited
- **Resource-Constrained Environments**: Lightweight database access for embedded systems

### 4. **Data Pipeline Integration**
```rust
// Extract data from SQLite for ETL processes
fn extract_table_data(db_path: &str, table_name: &str) -> Result<Vec<Row>, Error> {
    let mut db = Database::open(db_path)?;
    db.execute_query(&SelectQuery::parse(&format!("SELECT * FROM {}", table_name))?)
}
```

### 5. **Forensic and Compliance**
- **Data Auditing**: Safely examine databases for compliance verification
- **Digital Forensics**: Read evidence databases without contamination
- **Regulatory Reporting**: Generate reports from production systems

### 6. **Development and Testing**
- **Test Data Access**: Read test databases in CI/CD pipelines
- **Development Tools**: Build tools that analyze database schemas and content
- **Debugging**: Examine database state during development

## When to Use This Library

**Use this library when you need to:**
- ✅ Read SQLite databases in WASI/WebAssembly environments
- ✅ Analyze data without modifying the source database
- ✅ Work in sandboxed or restricted environments
- ✅ Build lightweight, dependency-free applications
- ✅ Process large databases with memory constraints
- ✅ Integrate SQLite reading into data pipelines

## Why Writing from WASM Sandboxes is Problematic

This library is intentionally read-only because writing to SQLite from WebAssembly sandboxes presents significant risks:

### **Data Corruption from Concurrent Writes**
- **Multiple WASM Instances**: When multiple WebAssembly instances write to the same SQLite database simultaneously, they can corrupt the database structure
- **No File Locking**: WASI environments often lack proper file locking mechanisms that SQLite relies on for write safety

### **Technical Limitations**
- **WAL Mode Issues**: SQLite's Write-Ahead Logging requires coordination of multiple files that may not be available in sandboxed environments
- **Shared Memory Problems**: SQLite's locking mechanisms rely on shared memory regions that may not be properly isolated in WASM
- **Partial Writes**: If a WASM instance crashes during a write operation, the database can be left in an inconsistent state

## Installation

Add this to your `Cargo.toml`:

```toml
[dependencies]
sqlite_wasm_reader = "0.3.1"
```

## Quick Start

```rust
use sqlite_wasm_reader::{Database, Error, LogLevel, init_default_logger};

fn main() -> Result<(), Error> {
    // Initialize logging (optional, defaults to Info level)
    init_default_logger();
    
    // Open a SQLite database
    let mut db = Database::open("example.db")?;
    
    // List all tables
    let tables = db.tables()?;
    for table in tables {
        println!("Table: {}", table);
    }
    
    // Execute a query using indexes
    let query = SelectQuery::parse("SELECT * FROM users WHERE id = 1")?;
    let rows = db.execute_query(&query)?;
    for row in rows {
        println!("{:?}", row);
    }
    
    Ok(())
}
```

## Logging

The library includes a configurable logging system to help with debugging and monitoring:

```rust
use sqlite_wasm_reader::{LogLevel, init_logger, set_log_level};

// Initialize with custom log level
init_logger(LogLevel::Debug);

// Change log level at runtime
set_log_level(LogLevel::Trace);

// Log levels available:
// - Error: Critical errors that prevent operation
// - Warn: Important warnings and errors
// - Info: General information about operations (default)
// - Debug: Detailed debugging information
// - Trace: Very detailed tracing information
```

## API Reference

### Database Operations

```rust
// Open a database
let mut db = Database::open("path/to/database.db")?;

// List all tables
let tables = db.tables()?;

// Execute a query using indexes
let query = SelectQuery::parse("SELECT * FROM table_name WHERE column = 'value'")?;
let rows = db.execute_query(&query)?;

// Count rows in a table efficiently
let count = db.count_table_rows("table_name")?;
```

### Query Builder Helpers

For programmatic construction of `SELECT` queries without writing raw SQL, use the fluent helper API:

```rust
use sqlite_wasm_reader::{query::{SelectQuery, Expr}, Value};

let query = SelectQuery::new("users")
    .select_columns(vec!["id".into(), "name".into()])
    .with_where(
        Expr::eq("status", Value::Text("active".into()))
            .and(Expr::between("age", Value::Integer(18), Value::Integer(65)))
            .or(Expr::is_null("deleted_at"))
    )
    .with_order_by("name", true)
    .with_limit(100);

let rows = db.execute_query(&query)?;
```

### SQL Query Support

`sqlite_wasm_reader` lets you query data either by parsing raw SQL _or_ by constructing `SelectQuery` objects directly and executing them with `Database::execute_query()`.

```rust
use sqlite_wasm_reader::{Database, Error};
use sqlite_wasm_reader::query::SelectQuery;
use sqlite_wasm_reader::query::{SelectQuery, Expr};
use sqlite_wasm_reader::value::Value;

fn complex_report(db: &mut Database) -> Result<(), Error> {
    // Option 1. Parse raw SQL, then execute
    let raw = "SELECT * FROM users WHERE age > 18 AND status = 'active' ORDER BY name LIMIT 10";
    let parsed = SelectQuery::parse(raw)?;
    let rows = db.execute_query(&parsed)?;
    println!("{} rows (raw SQL): {}", rows.len(), raw);

    // Option 2. Build programmatically using helpers
    let builder = SelectQuery::new("users")
        .select_columns(vec!["id".into(), "name".into(), "age".into()])
        .with_where(
            Expr::gt("age", Value::Integer(18))
                .and(Expr::eq("status", Value::Text("active".into())))
        )
        .with_order_by("name", true)
        .with_limit(10);

    let rows = db.execute_query(&builder)?;
    println!("{} rows (builder API)", rows.len());

    Ok(())
}
```

Both paths end in a call to `execute_query`, which accepts any `SelectQuery` (parsed or manually constructed). This method uses intelligent query processing:

* **Index Acceleration**: Automatically uses available indexes for exact equality matches when suitable indexes exist
* **Table Scan Fallback**: Seamlessly falls back to full table scans when no suitable index is found, ensuring all queries work
* **WHERE filtering** with logical operators (`AND`, `OR`, `NOT`), `LIKE`, `IN`, `BETWEEN`, `IS NULL` / `IS NOT NULL`, and parentheses
* **Column projection** (`SELECT *` or explicit columns)
* **`ORDER BY` and `LIMIT`** processing in memory

Use whichever style (raw SQL vs builder) best fits your workflow.

### Value Types

The library supports all basic SQLite types:

```rust
use sqlite_wasm_reader::Value;

// NULL values
Value::Null

// Integer values
Value::Integer(42)

// Floating point values
Value::Real(3.14)

// Text values
Value::Text("hello".to_string())

// BLOB values
Value::Blob(vec![0x01, 0x02, 0x03])
```

### Row Access

Rows are represented as `HashMap<String, Value>`:

```rust
for row in rows {
    // Access by column name
    if let Some(id) = row.get("id") {
        match id {
            Value::Integer(i) => println!("ID: {}", i),
            Value::Text(s) => println!("ID: {}", s),
            _ => println!("Unexpected ID type"),
        }
    }
    
    // Check if column exists
    if row.contains_key("name") {
        println!("Has name column");
    }
}
```

## Building for WASI

To build this crate for WASI target:

```bash
# Add the WASI target
rustup target add wasm32-wasip1

# Build the project
cargo build --target wasm32-wasip1 --release
```

## Running with WasmTime

```bash
# Run with wasmtime
wasmtime run --dir=. target/wasm32-wasip1/release/your_app.wasm

# Run with file access
wasmtime run --dir=. --mapdir /data:./data target/wasm32-wasip1/release/your_app.wasm
```

## Examples

The library includes several examples demonstrating different use cases:

### Basic Database Reading

```rust
use sqlite_wasm_reader::{Database, Error, Value};

fn main() -> Result<(), Error> {
    let mut db = Database::open("users.db")?;
    
    // Read user table
    let users = db.execute_query(&SelectQuery::parse("SELECT * FROM users")?)?;
    
    for user in users {
        let name = user.get("name").unwrap_or(&Value::Null);
        let email = user.get("email").unwrap_or(&Value::Null);
        
        println!("User: {} <{}>", name, email);
    }
    
    Ok(())
}
```

### Efficient Row Counting

```rust
use sqlite_wasm_reader::{Database, Error};
use sqlite_wasm_reader::query::SelectQuery;

fn main() -> Result<(), Error> {
    let mut db = Database::open("database.db")?;
    
    // Count rows without loading all data into memory
    let user_count = db.count_table_rows("users")?;
    let order_count = db.count_table_rows("orders")?;
    
    println!("Users: {}, Orders: {}", user_count, order_count);
    
    Ok(())
}
```

### Logging and Debugging

```rust
use sqlite_wasm_reader::{Database, Error, LogLevel, init_default_logger, set_log_level, log_debug};

fn main() -> Result<(), Error> {
    // Initialize logging with debug level
    init_default_logger();
    set_log_level(LogLevel::Debug);
    
    let mut db = Database::open("database.db")?;
    
    // Enable debug logging for troubleshooting
    log_debug("Starting database analysis");
    
    let tables = db.tables()?;
    log_debug(&format!("Found {} tables", tables.len()));
    
    for table in tables {
        let count = db.count_table_rows(&table)?;
        log_debug(&format!("Table {} has {} rows", table, count));
    }
    
    Ok(())
}
```

### Running the Examples

```bash
# Basic database reading
cargo run --example read_db -- database.db

# Logging example with custom log level
cargo run --example logging_example -- database.db debug

# Efficient row counting
cargo run --example count_rows -- database.db

# WASI-compatible example
cargo build --example wasi_example --target wasm32-wasip1
wasmtime run --dir=. target/wasm32-wasip1/debug/examples/wasi_example.wasm -- database.db
```

## Limitations

- **Read-Only**: This library only supports reading SQLite databases, not writing
- **Basic SQL Types**: Supports NULL, INTEGER, REAL, TEXT, and BLOB types
- **Partial Index Support**: Uses indexes for exact equality matches when available, falls back to table scans for complex queries or when no suitable index exists
- **Simple Schema Parsing**: Basic CREATE TABLE parsing for column names
- **Memory Constraints**: Executing `SELECT *` on very large tables can be memory-intensive. Prefer filtering with WHERE clauses and/or fetching data in smaller chunks using `LIMIT` / `OFFSET` whenever possible.

## Architecture

The library is structured into several modules:

- `format`: SQLite file format constants and structures
- `page`: Page reading and parsing
- `btree`: B-tree traversal for table data with cycle detection
- `record`: SQLite record parsing
- `value`: Value types (NULL, INTEGER, REAL, TEXT, BLOB)
- `database`: Main database interface
- `logging`: Configurable logging system
- `error`: Error types and handling

## Performance Considerations

- **Memory Usage**: For huge datasets, process data in pages via repeated queries with `LIMIT` / `OFFSET`, or add selective WHERE conditions to minimize the rows materialized at once.
- **B-tree Traversal**: The library uses efficient in-order traversal with cycle detection
- **Logging Overhead**: Set appropriate log levels to minimize performance impact
- **WASI Environment**: Optimized for WebAssembly environments with limited resources
- **Row Counting**: Use `count_table_rows()` for efficient row counting without loading data

## Error Handling

The library provides comprehensive error handling:

```rust
use sqlite_wasm_reader::{Database, Error};
use sqlite_wasm_reader::query::SelectQuery;

match Database::open("database.db") {
    Ok(mut db) => {
        // Database opened successfully
    }
    Err(Error::Io(e)) => {
        eprintln!("IO error: {}", e);
    }
    Err(Error::InvalidFormat(msg)) => {
        eprintln!("Invalid SQLite format: {}", msg);
    }
    Err(Error::TableNotFound(table)) => {
        eprintln!("Table not found: {}", table);
    }
    Err(e) => {
        eprintln!("Other error: {}", e);
    }
}
```

## Robustness Features

The library is designed for production use with several robustness features:

### **Deterministic Query Results**
- Index-based queries return results in consistent, sorted order
- No unpredictable row ordering that could cause flaky tests or inconsistent behavior
- Reliable for applications that depend on consistent data presentation

### **Robust Data Handling**
- Graceful handling of NaN values in floating-point comparisons
- No runtime panics on valid SQLite data
- Proper error recovery for malformed records
- Safe handling of edge cases and corrupted data

### **Memory Safety**
- Optimized memory usage with minimal allocations
- Cycle detection in B-tree traversal to prevent infinite loops
- Bounds checking to prevent buffer overflows
- Safe handling of large databases with limited memory constraints

### **Error Recovery**
- Comprehensive error types for different failure scenarios
- Graceful degradation when encountering problematic data
- Detailed logging for debugging and monitoring
- Safe fallbacks when optimal paths fail

## License

This project is licensed under Apache License, Version 2.0, ([LICENSE](LICENSE) or http://www.apache.org/licenses/LICENSE-2.0)

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.