stoolap 0.4.0

High-performance embedded SQL database with MVCC, time-travel queries, and full ACID compliance
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
---
layout: doc
title: ROLLUP, CUBE, and GROUPING SETS
category: SQL Features
order: 5
---

# ROLLUP, CUBE, and GROUPING SETS

ROLLUP, CUBE, and GROUPING SETS are extensions to GROUP BY that generate multiple levels of aggregation in a single query. They're useful for generating reports with subtotals and grand totals.

## ROLLUP

ROLLUP creates a hierarchical set of subtotals, rolling up from the most detailed level to a grand total.

### Syntax

```sql
SELECT columns, aggregate_functions
FROM table
GROUP BY ROLLUP(column1, column2, ...);
```

### Example

```sql
CREATE TABLE sales (
    id INTEGER PRIMARY KEY,
    region TEXT,
    category TEXT,
    amount FLOAT
);

INSERT INTO sales VALUES
(1, 'East', 'Electronics', 100),
(2, 'East', 'Electronics', 150),
(3, 'East', 'Clothing', 50),
(4, 'West', 'Electronics', 200),
(5, 'West', 'Clothing', 75),
(6, 'West', 'Clothing', 60);

SELECT region, category, SUM(amount) as total
FROM sales
GROUP BY ROLLUP(region, category);
```

Result:
```
region | category    | total
-------+-------------+-------
East   | Clothing    | 50.0
East   | Electronics | 250.0
West   | Electronics | 200.0
West   | Clothing    | 135.0
East   | NULL        | 300.0    -- Subtotal for East
West   | NULL        | 335.0    -- Subtotal for West
NULL   | NULL        | 635.0    -- Grand total
```

### How ROLLUP Works

For `ROLLUP(region, category)`, it produces:
1. Detail rows: `(region, category)` - each combination
2. Region subtotals: `(region, NULL)` - totals per region
3. Grand total: `(NULL, NULL)` - overall total

The NULL values indicate the aggregation level.

### Column Order Matters

ROLLUP aggregates from right to left:

```sql
-- ROLLUP(a, b, c) produces:
-- (a, b, c) - detail
-- (a, b, NULL) - subtotal by a, b
-- (a, NULL, NULL) - subtotal by a
-- (NULL, NULL, NULL) - grand total
```

## CUBE

CUBE generates all possible combinations of grouping columns, providing a complete multi-dimensional analysis.

### Syntax

```sql
SELECT columns, aggregate_functions
FROM table
GROUP BY CUBE(column1, column2, ...);
```

### Example

```sql
SELECT region, category, SUM(amount) as total
FROM sales
GROUP BY CUBE(region, category);
```

Result:
```
region | category    | total
-------+-------------+-------
East   | Clothing    | 50.0
East   | Electronics | 250.0
West   | Electronics | 200.0
West   | Clothing    | 135.0
East   | NULL        | 300.0    -- Subtotal for East
West   | NULL        | 335.0    -- Subtotal for West
NULL   | Clothing    | 185.0    -- Subtotal for Clothing
NULL   | Electronics | 450.0    -- Subtotal for Electronics
NULL   | NULL        | 635.0    -- Grand total
```

### How CUBE Works

For `CUBE(region, category)`, it produces all 2^n combinations:
1. `(region, category)` - detail rows
2. `(region, NULL)` - totals by region
3. `(NULL, category)` - totals by category
4. `(NULL, NULL)` - grand total

## GROUPING SETS

GROUPING SETS provides explicit control over which grouping combinations to generate, allowing you to specify exactly which aggregation levels you need.

### Syntax

```sql
SELECT columns, aggregate_functions
FROM table
GROUP BY GROUPING SETS ((columns1), (columns2), ...);
```

### Example

```sql
-- Specify exact grouping combinations
SELECT region, category, SUM(amount) as total
FROM sales
GROUP BY GROUPING SETS ((region, category), (region), ());
```

Result:
```
region | category    | total
-------+-------------+-------
East   | Clothing    | 50.0
East   | Electronics | 250.0
West   | Electronics | 200.0
West   | Clothing    | 135.0
East   | NULL        | 300.0    -- region subtotal
West   | NULL        | 335.0    -- region subtotal
NULL   | NULL        | 635.0    -- grand total
```

### How GROUPING SETS Works

Each inner parentheses defines a grouping:
- `(region, category)` - group by both columns (detail rows)
- `(region)` - group by region only (region subtotals)
- `()` - empty set produces grand total

### Equivalence with ROLLUP and CUBE

GROUPING SETS can express any ROLLUP or CUBE:

```sql
-- These are equivalent:
GROUP BY ROLLUP(a, b)
GROUP BY GROUPING SETS ((a, b), (a), ())

-- These are equivalent:
GROUP BY CUBE(a, b)
GROUP BY GROUPING SETS ((a, b), (a), (b), ())
```

### Selective Subtotals

Unlike ROLLUP and CUBE, GROUPING SETS lets you pick specific combinations:

```sql
-- Only region and category subtotals, no detail rows
SELECT region, category, SUM(amount) as total
FROM sales
GROUP BY GROUPING SETS ((region), (category));
```

Result:
```
region | category    | total
-------+-------------+-------
East   | NULL        | 300.0    -- region subtotal
West   | NULL        | 335.0    -- region subtotal
NULL   | Clothing    | 185.0    -- category subtotal
NULL   | Electronics | 450.0    -- category subtotal
```

## GROUPING() Function

The GROUPING() function identifies whether a NULL value in the result represents an actual NULL in the data or indicates a super-aggregate row (subtotal/grand total).

### Syntax

```sql
GROUPING(column)
```

Returns:
- `0` if the column is part of the current grouping (normal row)
- `1` if the column is aggregated (super-aggregate row)

### Example

```sql
SELECT
    region,
    category,
    SUM(amount) as total,
    GROUPING(region) as is_region_aggregated,
    GROUPING(category) as is_category_aggregated
FROM sales
GROUP BY GROUPING SETS ((region, category), (region), ());
```

Result:
```
region | category    | total | is_region_aggregated | is_category_aggregated
-------+-------------+-------+----------------------+-----------------------
East   | Clothing    | 50.0  | 0                    | 0
East   | Electronics | 250.0 | 0                    | 0
West   | Electronics | 200.0 | 0                    | 0
West   | Clothing    | 135.0 | 0                    | 0
East   | NULL        | 300.0 | 0                    | 1   -- category aggregated
West   | NULL        | 335.0 | 0                    | 1   -- category aggregated
NULL   | NULL        | 635.0 | 1                    | 1   -- both aggregated
```

### Distinguishing NULL Values

Use GROUPING() to handle NULL values properly:

```sql
SELECT
    CASE WHEN GROUPING(region) = 1 THEN '(All Regions)' ELSE COALESCE(region, 'Unknown') END as region,
    CASE WHEN GROUPING(category) = 1 THEN '(All Categories)' ELSE COALESCE(category, 'Unknown') END as category,
    SUM(amount) as total
FROM sales
GROUP BY ROLLUP(region, category);
```

## Comparison: ROLLUP vs CUBE vs GROUPING SETS

| Feature | ROLLUP | CUBE | GROUPING SETS |
|---------|--------|------|---------------|
| Subtotals | Hierarchical only | All combinations | User-defined |
| Groupings | n + 1 | 2^n | As specified |
| Use case | Hierarchical reports | Cross-tabulation | Custom reports |
| Flexibility | Low | Medium | High |

### Grouping Count Example

For 3 columns (a, b, c):

**ROLLUP(a, b, c)** produces 4 groupings:
- (a, b, c)
- (a, b, NULL)
- (a, NULL, NULL)
- (NULL, NULL, NULL)

**CUBE(a, b, c)** produces 8 groupings:
- (a, b, c)
- (a, b, NULL)
- (a, NULL, c)
- (NULL, b, c)
- (a, NULL, NULL)
- (NULL, b, NULL)
- (NULL, NULL, c)
- (NULL, NULL, NULL)

## Use Cases

### ROLLUP for Hierarchical Reports

Time-based hierarchies (year > quarter > month):

```sql
SELECT
    EXTRACT(YEAR FROM order_date) as year,
    EXTRACT(QUARTER FROM order_date) as quarter,
    SUM(amount) as total
FROM orders
GROUP BY ROLLUP(
    EXTRACT(YEAR FROM order_date),
    EXTRACT(QUARTER FROM order_date)
);
```

Geographic hierarchies (country > region > city):

```sql
SELECT country, region, city, SUM(sales) as total
FROM stores
GROUP BY ROLLUP(country, region, city);
```

### CUBE for Cross-Tabulation

Analyze sales by multiple dimensions:

```sql
SELECT product_type, customer_segment, SUM(revenue) as total
FROM sales
GROUP BY CUBE(product_type, customer_segment);
```

This gives totals for:
- Each product_type + customer_segment combination
- Each product_type (all segments)
- Each customer_segment (all products)
- Grand total

## Working with NULL Markers

NULL in the result indicates an aggregation level. To distinguish from actual NULL data:

```sql
SELECT
    COALESCE(region, '(All Regions)') as region,
    COALESCE(category, '(All Categories)') as category,
    SUM(amount) as total
FROM sales
GROUP BY ROLLUP(region, category);
```

Result:
```
region         | category          | total
---------------+-------------------+-------
East           | Clothing          | 50.0
East           | Electronics       | 250.0
West           | Electronics       | 200.0
West           | Clothing          | 135.0
East           | (All Categories)  | 300.0
West           | (All Categories)  | 335.0
(All Regions)  | (All Categories)  | 635.0
```

## Performance Considerations

- **CUBE** generates 2^n groupings, which can be expensive for many columns
- **ROLLUP** is more efficient for hierarchical data (n+1 groupings)
- Consider adding a WHERE clause to reduce input data
- Indexes don't help with ROLLUP/CUBE aggregations

### Recommended Limits

| Columns | ROLLUP Groupings | CUBE Groupings |
|---------|------------------|----------------|
| 2 | 3 | 4 |
| 3 | 4 | 8 |
| 4 | 5 | 16 |
| 5 | 6 | 32 |
| 6 | 7 | 64 |

For more than 4-5 columns, CUBE may produce too many rows.

## Complete Example

```sql
-- Sales analysis with ROLLUP
CREATE TABLE quarterly_sales (
    id INTEGER PRIMARY KEY,
    year INTEGER,
    quarter INTEGER,
    product TEXT,
    revenue FLOAT
);

INSERT INTO quarterly_sales VALUES
(1, 2024, 1, 'Widget', 10000),
(2, 2024, 1, 'Gadget', 15000),
(3, 2024, 2, 'Widget', 12000),
(4, 2024, 2, 'Gadget', 18000),
(5, 2024, 3, 'Widget', 11000),
(6, 2024, 3, 'Gadget', 16000);

-- Hierarchical report: Year > Quarter > Product
SELECT
    year,
    quarter,
    product,
    SUM(revenue) as total_revenue,
    COUNT(*) as transactions
FROM quarterly_sales
GROUP BY ROLLUP(year, quarter, product)
ORDER BY year, quarter, product;
```

This produces a report with:
- Detail rows per product per quarter
- Quarterly totals (all products)
- Yearly totals (all quarters, all products)
- Grand total