stoolap 0.4.0

High-performance embedded SQL database with MVCC, time-travel queries, and full ACID compliance
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
---
layout: doc
title: Python Driver
category: Drivers
order: 2
icon: python
---

# Python Driver

High-performance Python driver for Stoolap. Built with [PyO3](https://pyo3.rs) for native performance with both sync and async APIs. All operations release the GIL for true concurrency.

## Installation

```bash
pip install stoolap-python
```

Requires Python >= 3.9. Supported versions: 3.9, 3.10, 3.11, 3.12, 3.13.

## Quick Start

```python
from stoolap import Database

db = Database.open(':memory:')

db.exec("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        email TEXT
    )
""")

# Insert with positional parameters ($1, $2, ...)
db.execute(
    'INSERT INTO users (id, name, email) VALUES ($1, $2, $3)',
    [1, 'Alice', 'alice@example.com']
)

# Insert with named parameters (:key)
db.execute(
    'INSERT INTO users (id, name, email) VALUES (:id, :name, :email)',
    {'id': 2, 'name': 'Bob', 'email': 'bob@example.com'}
)

# Query rows as dicts
users = db.query('SELECT * FROM users ORDER BY id')
# [{'id': 1, 'name': 'Alice', 'email': 'alice@example.com'}, ...]

# Query single row
user = db.query_one('SELECT * FROM users WHERE id = $1', [1])
# {'id': 1, 'name': 'Alice', 'email': 'alice@example.com'}

# Query in raw columnar format (faster)
raw = db.query_raw('SELECT id, name FROM users ORDER BY id')
# {'columns': ['id', 'name'], 'rows': [[1, 'Alice'], [2, 'Bob']]}

db.close()
```

## Opening a Database

```python
# In-memory
db = Database.open(':memory:')
db = Database.open('')
db = Database.open('memory://')

# File-based (data persists across restarts)
db = Database.open('./mydata')
db = Database.open('file:///absolute/path/to/db')
```

## Methods

| Method | Returns | Description |
|--------|---------|-------------|
| `execute(sql, params?)` | `int` | Execute DML statement, return rows affected |
| `exec(sql)` | `None` | Execute one or more statements (no parameters) |
| `query(sql, params?)` | `list[dict]` | Query rows as dicts |
| `query_one(sql, params?)` | `dict \| None` | Query single row |
| `query_raw(sql, params?)` | `dict` | Query in columnar format |
| `execute_batch(sql, params_list)` | `int` | Execute with multiple param sets |
| `prepare(sql)` | `PreparedStatement` | Create a prepared statement |
| `begin()` | `Transaction` | Begin a transaction |
| `close()` | `None` | Close the database |

## Persistence

File-based databases persist data to disk using WAL (Write-Ahead Logging) and immutable cold volumes. A background checkpoint cycle seals hot rows into columnar volume files, compacts them, and truncates the WAL. Data survives process restarts.

```python
db = Database.open('./mydata')

db.exec('CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)')
db.execute('INSERT INTO kv VALUES ($1, $2)', ['hello', 'world'])
db.close()

# Reopen: data is still there
db2 = Database.open('./mydata')
row = db2.query_one('SELECT * FROM kv WHERE key = $1', ['hello'])
# {'key': 'hello', 'value': 'world'}
db2.close()
```

### Configuration

Pass configuration as query parameters in the path:

```python
# Maximum durability (fsync on every WAL write)
db = Database.open('./mydata?sync=full')

# High throughput (no fsync, larger buffers)
db = Database.open('./mydata?sync=none&wal_buffer_size=131072')

# Custom checkpoint interval with compression
db = Database.open('./mydata?checkpoint_interval=60&compression=on')

# Multiple options
db = Database.open(
    './mydata?sync=full&checkpoint_interval=120&compact_threshold=8&wal_max_size=134217728'
)
```

### Sync Modes

Controls the durability vs. performance trade-off:

| Mode | Value | Description |
|------|-------|-------------|
| `none` | `sync=none` | No fsync. Fastest, but data may be lost on crash |
| `normal` | `sync=normal` | Fsync on commit batches. Good balance (default) |
| `full` | `sync=full` | Fsync on every WAL write. Slowest, maximum durability |

### All Configuration Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `sync` | `normal` | Sync mode: `none`, `normal`, or `full` |
| `checkpoint_interval` | `60` | Seconds between automatic checkpoint cycles |
| `compact_threshold` | `4` | Sub-target volumes per table before merging |
| `keep_snapshots` | `3` | Backup snapshots to retain per table |
| `wal_flush_trigger` | `32768` | WAL flush trigger size in bytes (32 KB) |
| `wal_buffer_size` | `65536` | WAL buffer size in bytes (64 KB) |
| `wal_max_size` | `67108864` | Max WAL file size before rotation (64 MB) |
| `commit_batch_size` | `100` | Commits to batch before syncing (normal mode) |
| `sync_interval_ms` | `1000` | Minimum ms between syncs (normal mode) |
| `wal_compression` | `on` | LZ4 compression for WAL entries |
| `volume_compression` | `on` | LZ4 compression for cold volume files |
| `compression` | -- | Alias that sets both `wal_compression` and `volume_compression` |
| `compression_threshold` | `64` | Minimum bytes before compressing an entry |
| `checkpoint_on_close` | `on` | Seal all hot rows to volumes on clean shutdown |
| `target_volume_rows` | `1048576` | Target rows per cold volume (min 65536) |

## Raw Query Format

`query_raw` returns `{"columns": [...], "rows": [[...], ...]}` instead of a list of dicts. Faster when you don't need named keys.

```python
raw = db.query_raw('SELECT id, name, email FROM users ORDER BY id')
print(raw['columns'])  # ['id', 'name', 'email']
print(raw['rows'])     # [[1, 'Alice', 'alice@example.com'], [2, 'Bob', 'bob@example.com']]
```

## Batch Execution

Execute the same SQL with multiple parameter sets in a single call. Automatically wraps in a transaction.

```python
changes = db.execute_batch(
    'INSERT INTO users VALUES ($1, $2, $3)',
    [
        [1, 'Alice', 'alice@example.com'],
        [2, 'Bob', 'bob@example.com'],
        [3, 'Charlie', 'charlie@example.com'],
    ]
)
print(changes)  # 3
```

## Prepared Statements

Prepared statements parse SQL once and reuse the cached execution plan on every call. No parsing or cache lookup overhead per execution.

```python
insert = db.prepare('INSERT INTO users VALUES ($1, $2, $3)')
insert.execute([1, 'Alice', 'alice@example.com'])
insert.execute([2, 'Bob', 'bob@example.com'])

lookup = db.prepare('SELECT * FROM users WHERE id = $1')
user = lookup.query_one([1])
# {'id': 1, 'name': 'Alice', 'email': 'alice@example.com'}
```

### Methods

All methods mirror `Database` but without the `sql` parameter (it's bound at prepare time).

| Method | Returns | Description |
|--------|---------|-------------|
| `execute(params?)` | `int` | Execute DML statement |
| `query(params?)` | `list[dict]` | Query rows as dicts |
| `query_one(params?)` | `dict \| None` | Query single row |
| `query_raw(params?)` | `dict` | Query in columnar format |
| `execute_batch(params_list)` | `int` | Execute with multiple param sets |

Property: `sql` returns the SQL text of this prepared statement.

### Batch with Prepared Statement

```python
insert = db.prepare('INSERT INTO users VALUES ($1, $2, $3)')
changes = insert.execute_batch([
    [1, 'Alice', 'alice@example.com'],
    [2, 'Bob', 'bob@example.com'],
    [3, 'Charlie', 'charlie@example.com'],
])
print(changes)  # 3
```

## Transactions

### Using Context Manager

```python
with db.begin() as tx:
    tx.execute('INSERT INTO users VALUES ($1, $2, $3)', [1, 'Alice', 'alice@example.com'])
    tx.execute('INSERT INTO users VALUES ($1, $2, $3)', [2, 'Bob', 'bob@example.com'])

    # Read within the transaction (sees uncommitted changes)
    rows = tx.query('SELECT * FROM users')
    one = tx.query_one('SELECT * FROM users WHERE id = $1', [1])
    raw = tx.query_raw('SELECT id, name FROM users')

    # Auto-commits on clean exit, auto-rollbacks on exception
```

### Manual Control

```python
tx = db.begin()
try:
    tx.execute('INSERT INTO users VALUES ($1, $2, $3)', [1, 'Alice', 'alice@example.com'])
    tx.execute('INSERT INTO users VALUES ($1, $2, $3)', [2, 'Bob', 'bob@example.com'])
    tx.commit()
except:
    tx.rollback()
    raise
```

### Transaction Methods

| Method | Returns | Description |
|--------|---------|-------------|
| `execute(sql, params?)` | `int` | Execute DML statement |
| `query(sql, params?)` | `list[dict]` | Query rows as dicts |
| `query_one(sql, params?)` | `dict \| None` | Query single row |
| `query_raw(sql, params?)` | `dict` | Query in columnar format |
| `execute_batch(sql, params_list)` | `int` | Execute with multiple param sets |
| `commit()` | `None` | Commit the transaction |
| `rollback()` | `None` | Rollback the transaction |

### Batch in Transaction

```python
with db.begin() as tx:
    changes = tx.execute_batch(
        'INSERT INTO users VALUES ($1, $2, $3)',
        [
            [1, 'Alice', 'alice@example.com'],
            [2, 'Bob', 'bob@example.com'],
        ]
    )
    print(changes)  # 2
```

## Async API

Async wrappers use `asyncio.to_thread()` for non-blocking operations. All methods release the GIL.

```python
import asyncio
from stoolap import AsyncDatabase

async def main():
    db = await AsyncDatabase.open(':memory:')

    await db.exec("""
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            email TEXT
        )
    """)

    await db.execute(
        'INSERT INTO users VALUES ($1, $2, $3)',
        [1, 'Alice', 'alice@example.com']
    )

    users = await db.query('SELECT * FROM users')
    print(users)

    await db.close()

asyncio.run(main())
```

### Async Transaction

```python
async with await db.begin() as tx:
    await tx.execute('INSERT INTO users VALUES ($1, $2, $3)', [1, 'Alice', 'alice@example.com'])
    await tx.execute('INSERT INTO users VALUES ($1, $2, $3)', [2, 'Bob', 'bob@example.com'])
    # Auto-commits on clean exit, auto-rollbacks on exception
```

### Async Prepared Statement

```python
stmt = db.prepare('SELECT * FROM users WHERE id = $1')
user = await stmt.query_one([1])
rows = await stmt.query([1])
raw = await stmt.query_raw([1])
```

### Async Methods

`AsyncDatabase`, `AsyncTransaction`, and `AsyncPreparedStatement` mirror all sync methods as coroutines with the same names, parameters, and return types. Just `await` the call. The only exception is `prepare()`, which is synchronous (no `await` needed).

## Parameters

Both positional and named parameters are supported across all methods:

```python
# Positional ($1, $2, ...)
db.query('SELECT * FROM users WHERE id = $1 AND name = $2', [1, 'Alice'])

# Named (:key)
db.query(
    'SELECT * FROM users WHERE id = :id AND name = :name',
    {'id': 1, 'name': 'Alice'}
)
```

Named parameter keys can include an optional prefix:

```python
# All equivalent
db.query('SELECT * FROM users WHERE id = :id', {'id': 1})
db.query('SELECT * FROM users WHERE id = :id', {':id': 1})
db.query('SELECT * FROM users WHERE id = :id', {'@id': 1})
db.query('SELECT * FROM users WHERE id = :id', {'$id': 1})
```

## Error Handling

All methods raise `StoolapError` on errors (invalid SQL, constraint violations, etc.):

```python
from stoolap import Database, StoolapError

try:
    db.execute('INSERT INTO users VALUES ($1, $2)', [1, None])  # NOT NULL violation
except StoolapError as e:
    print(f'Database error: {e}')

# Invalid SQL raises at prepare time
try:
    db.prepare('INVALID SQL HERE')
except StoolapError as e:
    print(f'Parse error: {e}')
```

`StoolapError` inherits from `RuntimeError`.

## Type Mapping

| Python | Stoolap | Notes |
|--------|---------|-------|
| `int` | `INTEGER` | 64-bit signed |
| `float` | `FLOAT` | 64-bit double |
| `str` | `TEXT` | UTF-8 encoded |
| `bool` | `BOOLEAN` | Checked before `int` (bool is a subclass of int in Python) |
| `None` | `NULL` | Any type |
| `datetime.datetime` | `TIMESTAMP` | Timezone-aware converted to UTC; naive treated as UTC |
| `dict` | `JSON` | Serialized via `json.dumps()` |
| `list` | `JSON` | Serialized via `json.dumps()` |

## Building from Source

Requires:
- [Rust]https://rustup.rs (stable)
- [Python]https://python.org >= 3.9
- [maturin]https://www.maturin.rs (`pip install maturin`)

```bash
git clone https://github.com/stoolap/stoolap-python.git
cd stoolap-python
python -m venv .venv && source .venv/bin/activate
pip install maturin pytest pytest-asyncio
maturin develop --release
pytest
```