pg_tviews 0.1.0-beta.12

Transactional materialized views with incremental refresh for PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
# Monitoring Guide

Comprehensive monitoring setup for pg_tviews in production environments.

**Version**: 0.1.0-beta.1 • **Last Updated**: December 11, 2025

## Overview

This guide covers monitoring pg_tviews performance, health, and operational metrics. Proper monitoring ensures your TVIEWs stay performant and consistent with your base tables.

## Health Checks

### Basic Health Check

Implement a basic health check function:

```sql
-- Create health check function
CREATE OR REPLACE FUNCTION pg_tviews_health_check()
RETURNS TABLE(
    check_name text,
    status text,
    message text,
    details jsonb
) AS $$
DECLARE
    ext_version text;
    ivm_available boolean;
    trigger_count int;
    tview_count int;
BEGIN
    -- Check extension version
    SELECT pg_tviews_version() INTO ext_version;

    RETURN QUERY SELECT
        'extension_version'::text,
        'OK'::text,
        'pg_tviews extension loaded'::text,
        jsonb_build_object('version', ext_version);

    -- Check jsonb_delta availability
    SELECT pg_tviews_check_jsonb_delta() INTO ivm_available;

    RETURN QUERY SELECT
        'jsonb_delta'::text,
        CASE WHEN ivm_available THEN 'OK' ELSE 'WARNING' END,
        CASE WHEN ivm_available
             THEN 'jsonb_delta extension available'
             ELSE 'jsonb_delta extension not available - reduced performance'
        END,
        jsonb_build_object('available', ivm_available);

    -- Check trigger count
    SELECT COUNT(*)::int INTO trigger_count
    FROM pg_trigger
    WHERE tgname LIKE 'tview%';

    RETURN QUERY SELECT
        'triggers'::text,
        CASE WHEN trigger_count > 0 THEN 'OK' ELSE 'ERROR' END,
        format('%s TVIEW triggers found', trigger_count),
        jsonb_build_object('count', trigger_count);

    -- Check TVIEW count
    SELECT COUNT(*)::int INTO tview_count
    FROM pg_tview_meta;

    RETURN QUERY SELECT
        'tviews'::text,
        CASE WHEN tview_count > 0 THEN 'OK' ELSE 'WARNING' END,
        format('%s TVIEWs registered', tview_count),
        jsonb_build_object('count', tview_count);

END;
$$ LANGUAGE plpgsql;
```

### Usage

```sql
-- Run health check
SELECT * FROM pg_tviews_health_check();

-- Check for issues
SELECT * FROM pg_tviews_health_check()
WHERE status IN ('ERROR', 'WARNING');
```

## Performance Metrics

### Queue Statistics

Monitor TVIEW refresh queue performance:

```sql
-- Get current queue statistics
SELECT pg_tviews_queue_stats();
```

Returns JSONB with:
```json
{
  "queue_size": 5,
  "total_refreshes": 23,
  "total_iterations": 2,
  "max_iterations": 3,
  "total_timing_ms": 45.2,
  "graph_cache_hit_rate": 0.85,
  "table_cache_hit_rate": 0.92,
  "graph_cache_hits": 12,
  "graph_cache_misses": 2,
  "table_cache_hits": 18,
  "table_cache_misses": 2
}
```

### Key Metrics to Monitor

```sql
-- Queue size (should be low)
SELECT (pg_tviews_queue_stats()->>'queue_size')::int as queue_size;

-- Refresh timing (should be fast)
SELECT (pg_tviews_queue_stats()->>'total_timing_ms')::float as total_timing_ms;

-- Cache hit rates (should be > 80%)
SELECT
    (pg_tviews_queue_stats()->>'graph_cache_hit_rate')::float as graph_hit_rate,
    (pg_tviews_queue_stats()->>'table_cache_hit_rate')::float as table_hit_rate;
```

### Performance Summary

Track historical performance:

```sql
-- Create performance tracking table
CREATE TABLE tview_performance_history (
    id BIGSERIAL PRIMARY KEY,
    collected_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    queue_stats jsonb,
    active_tviews jsonb
);

-- Collect performance data
CREATE OR REPLACE FUNCTION collect_tview_performance()
RETURNS void AS $$
BEGIN
    INSERT INTO tview_performance_history (queue_stats, active_tviews)
    SELECT
        pg_tviews_queue_stats(),
        jsonb_object_agg(entity, jsonb_build_object(
            'table_size', pg_total_relation_size(table_oid),
            'last_refresh', now()
        ))
    FROM pg_tview_meta;
END;
$$ LANGUAGE plpgsql;
```

## Cache Monitoring

### Cache Hit Rates

Monitor caching effectiveness:

```sql
-- Graph cache performance
SELECT
    (pg_tviews_queue_stats()->>'graph_cache_hits')::int as hits,
    (pg_tviews_queue_stats()->>'graph_cache_misses')::int as misses,
    CASE
        WHEN (pg_tviews_queue_stats()->>'graph_cache_hits')::int +
             (pg_tviews_queue_stats()->>'graph_cache_misses')::int > 0
        THEN ROUND(
            (pg_tviews_queue_stats()->>'graph_cache_hits')::float /
            ((pg_tviews_queue_stats()->>'graph_cache_hits')::float +
             (pg_tviews_queue_stats()->>'graph_cache_misses')::float) * 100, 2
        )
        ELSE 0
    END as hit_rate_percentage;

-- Table cache performance
SELECT
    (pg_tviews_queue_stats()->>'table_cache_hits')::int as hits,
    (pg_tviews_queue_stats()->>'table_cache_misses')::int as misses,
    CASE
        WHEN (pg_tviews_queue_stats()->>'table_cache_hits')::int +
             (pg_tviews_queue_stats()->>'table_cache_misses')::int > 0
        THEN ROUND(
            (pg_tviews_queue_stats()->>'table_cache_hits')::float /
            ((pg_tviews_queue_stats()->>'table_cache_hits')::float +
             (pg_tviews_queue_stats()->>'table_cache_misses')::float) * 100, 2
        )
        ELSE 0
    END as hit_rate_percentage;
```

### Cache Size Monitoring

Track cache memory usage:

```sql
-- Estimate cache sizes
SELECT
    'graph_cache' as cache_type,
    COUNT(*) as entries,
    pg_size_pretty(SUM(octet_length(entity::text))) as estimated_size
FROM pg_tview_meta;

-- Monitor for cache bloat
SELECT
    schemaname,
    tablename,
    n_tup_ins, n_tup_upd, n_tup_del,
    n_live_tup, n_dead_tup,
    ROUND(n_dead_tup::float / GREATEST(n_live_tup, 1) * 100, 2) as bloat_ratio
FROM pg_stat_user_tables
WHERE schemaname = 'public' AND tablename LIKE 'tv_%'
ORDER BY bloat_ratio DESC;
```

## TVIEW Consistency Monitoring

### Count Consistency

Monitor TVIEW vs base table counts:

```sql
-- Check for count mismatches
CREATE OR REPLACE FUNCTION check_tview_consistency()
RETURNS TABLE(
    tview_name text,
    base_count bigint,
    tview_count bigint,
    difference bigint,
    status text
) AS $$
DECLARE
    rec record;
BEGIN
    FOR rec IN
        SELECT
            m.entity,
            m.table_oid,
            (SELECT COUNT(*) FROM pg_class c WHERE c.oid = m.table_oid) as tview_count,
            CASE
                WHEN m.entity = 'post' THEN (SELECT COUNT(*) FROM tb_post)
                WHEN m.entity = 'user' THEN (SELECT COUNT(*) FROM tb_user)
                WHEN m.entity = 'comment' THEN (SELECT COUNT(*) FROM tb_comment)
                ELSE 0
            END as base_count
        FROM pg_tview_meta m
    LOOP
        RETURN QUERY SELECT
            'tv_' || rec.entity,
            rec.base_count,
            rec.tview_count,
            rec.base_count - rec.tview_count,
            CASE
                WHEN rec.base_count = rec.tview_count THEN 'OK'
                WHEN ABS(rec.base_count - rec.tview_count) < (rec.base_count * 0.01) THEN 'WARNING'
                ELSE 'ERROR'
            END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

### Data Consistency

Spot check data consistency:

```sql
-- Random sampling of TVIEW consistency
CREATE OR REPLACE FUNCTION sample_tview_consistency(sample_size int DEFAULT 10)
RETURNS TABLE(
    entity text,
    id uuid,
    base_data jsonb,
    tview_data jsonb,
    matches boolean
) AS $$
DECLARE
    rec record;
BEGIN
    -- Sample posts
    FOR rec IN
        SELECT
            p.id,
            jsonb_build_object('title', p.title, 'content', p.content) as base_data,
            (SELECT data FROM tv_post WHERE id = p.id) as tview_data
        FROM tb_post p
        ORDER BY RANDOM()
        LIMIT sample_size
    LOOP
        RETURN QUERY SELECT
            'post'::text,
            rec.id,
            rec.base_data,
            rec.tview_data,
            rec.base_data = (rec.tview_data - 'id' - 'author') as matches;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

## Alerting Setup

### Nagios/Icinga Checks

```bash
#!/bin/bash
# Nagios check for pg_tviews health

PSQL="psql -h $PGHOST -U $PGUSER -d $PGDATABASE -t -c"

# Check extension
VERSION=$($PSQL "SELECT pg_tviews_version()")
if [ $? -ne 0 ]; then
    echo "CRITICAL: Cannot connect to pg_tviews"
    exit 2
fi

# Check queue size
QUEUE_SIZE=$($PSQL "SELECT (pg_tviews_queue_stats()->>'queue_size')::int")
if [ $QUEUE_SIZE -gt 1000 ]; then
    echo "CRITICAL: TVIEW queue size $QUEUE_SIZE > 1000"
    exit 2
elif [ $QUEUE_SIZE -gt 100 ]; then
    echo "WARNING: TVIEW queue size $QUEUE_SIZE > 100"
    exit 1
fi

# Check cache hit rates
GRAPH_RATE=$($PSQL "SELECT (pg_tviews_queue_stats()->>'graph_cache_hit_rate')::float * 100")
if [ $(echo "$GRAPH_RATE < 80" | bc -l) -eq 1 ]; then
    echo "WARNING: Graph cache hit rate ${GRAPH_RATE}% < 80%"
    exit 1
fi

echo "OK: pg_tviews healthy - queue: $QUEUE_SIZE, cache: ${GRAPH_RATE}%"
exit 0
```

### Prometheus Metrics

```sql
-- Create Prometheus metrics endpoint
CREATE OR REPLACE FUNCTION prometheus_metrics()
RETURNS text AS $$
DECLARE
    result text := '';
    rec record;
BEGIN
    -- Queue metrics
    SELECT * INTO rec FROM pg_tviews_queue_stats();

    result := result || '# HELP pgtviews_queue_size Current refresh queue size' || E'\n';
    result := result || '# TYPE pgtviews_queue_size gauge' || E'\n';
    result := result || 'pgtviews_queue_size ' || (rec->>'queue_size') || E'\n';

    result := result || '# HELP pgtviews_refresh_timing_ms Total refresh timing in milliseconds' || E'\n';
    result := result || '# TYPE pgtviews_refresh_timing_ms gauge' || E'\n';
    result := result || 'pgtviews_refresh_timing_ms ' || (rec->>'total_timing_ms') || E'\n';

    result := result || '# HELP pgtviews_cache_hit_rate Cache hit rate (0.0-1.0)' || E'\n';
    result := result || '# TYPE pgtviews_cache_hit_rate gauge' || E'\n';
    result := result || 'pgtviews_cache_hit_rate{type="graph"} ' || (rec->>'graph_cache_hit_rate') || E'\n';
    result := result || 'pgtviews_cache_hit_rate{type="table"} ' || (rec->>'table_cache_hit_rate') || E'\n';

    -- TVIEW metrics
    FOR rec IN SELECT entity, table_oid FROM pg_tview_meta LOOP
        result := result || '# HELP pgtviews_tview_size_bytes TVIEW table size in bytes' || E'\n';
        result := result || '# TYPE pgtviews_tview_size_bytes gauge' || E'\n';
        result := result || 'pgtviews_tview_size_bytes{tview="' || rec.entity || '"} ' ||
                  (SELECT pg_total_relation_size(rec.table_oid)) || E'\n';
    END LOOP;

    RETURN result;
END;
$$ LANGUAGE plpgsql;
```

### Grafana Dashboards

Create Grafana dashboard with these panels:

1. **Queue Size**: Line graph of queue size over time
2. **Refresh Timing**: Average refresh time in milliseconds
3. **Cache Hit Rates**: Graph and table cache hit percentages
4. **TVIEW Sizes**: Table sizes for each TVIEW
5. **Consistency Checks**: Count differences between TVIEWs and base tables
6. **Error Rates**: Failed refresh operations

## Logging and Tracing

### PostgreSQL Log Configuration

```sql
-- Enable detailed logging
ALTER SYSTEM SET log_statement = 'ddl';
ALTER SYSTEM SET log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h ';
ALTER SYSTEM SET log_min_duration_statement = 1000;  -- Log slow queries

-- Log TVIEW operations
ALTER SYSTEM SET log_min_messages = 'info';
```

### TVIEW Operation Logging

```sql
-- Create audit log table
CREATE TABLE tview_audit_log (
    id BIGSERIAL PRIMARY KEY,
    logged_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    operation text NOT NULL,
    entity text,
    pk_value bigint,
    timing_ms float,
    success boolean,
    error_message text
);

-- Log TVIEW operations (requires custom triggers or application logging)
CREATE OR REPLACE FUNCTION log_tview_operation(
    op text,
    ent text,
    pk_val bigint,
    timing float DEFAULT NULL,
    success bool DEFAULT true,
    error_msg text DEFAULT NULL
) RETURNS void AS $$
BEGIN
    INSERT INTO tview_audit_log (operation, entity, pk_value, timing_ms, success, error_message)
    VALUES (op, ent, pk_val, timing, success, error_msg);
END;
$$ LANGUAGE plpgsql;
```

### Application-Level Monitoring

```javascript
// Application monitoring example
const monitorTviewRefresh = async (operation, entity, pkValue) => {
  const startTime = Date.now();

  try {
    // Perform operation
    await performDatabaseOperation(operation, entity, pkValue);

    // Log success
    await logTviewOperation('refresh', entity, pkValue,
                          Date.now() - startTime, true);

  } catch (error) {
    // Log failure
    await logTviewOperation('refresh', entity, pkValue,
                          Date.now() - startTime, false, error.message);

    // Alert on failures
    alertSystem.sendAlert('TVIEW_REFRESH_FAILED', {
      entity, pkValue, error: error.message
    });
  }
};
```

## Operational Dashboards

### Real-time Dashboard

```sql
-- Real-time metrics view
CREATE OR REPLACE VIEW tview_realtime_metrics AS
SELECT
    now() as collected_at,
    (pg_tviews_queue_stats()->>'queue_size')::int as queue_size,
    (pg_tviews_queue_stats()->>'total_timing_ms')::float as total_timing_ms,
    (pg_tviews_queue_stats()->>'graph_cache_hit_rate')::float as graph_cache_hit_rate,
    (pg_tviews_queue_stats()->>'table_cache_hit_rate')::float as table_cache_hit_rate,
    (SELECT COUNT(*) FROM pg_tview_meta) as tview_count,
    (SELECT COUNT(*) FROM pg_trigger WHERE tgname LIKE 'tview%') as trigger_count
;
```

### Historical Trends

```sql
-- Daily summary view
CREATE OR REPLACE VIEW tview_daily_summary AS
SELECT
    DATE(collected_at) as date,
    AVG((queue_stats->>'queue_size')::int) as avg_queue_size,
    MAX((queue_stats->>'queue_size')::int) as max_queue_size,
    AVG((queue_stats->>'total_timing_ms')::float) as avg_timing_ms,
    MIN((queue_stats->>'graph_cache_hit_rate')::float) as min_graph_hit_rate,
    COUNT(*) as samples_count
FROM tview_performance_history
WHERE collected_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY DATE(collected_at)
ORDER BY date DESC;
```

## Troubleshooting with Monitoring

### High Queue Size

**Symptoms**: Queue size > 100
**Investigation**:
```sql
-- Check what's in the queue
SELECT pg_tviews_debug_queue();

-- Check for slow operations
SELECT * FROM pg_stat_activity
WHERE query LIKE '%pg_tviews%' AND state = 'active';

-- Check cascade depth
SELECT COUNT(*) FROM pg_tview_meta;  -- Many TVIEWs = deep cascades
```

**Solutions**:
- Enable statement-level triggers
- Reduce cascade depth
- Optimize slow TVIEW definitions

### Low Cache Hit Rates

**Symptoms**: Cache hit rate < 80%
**Investigation**:
```sql
-- Check cache statistics
SELECT pg_tviews_queue_stats();

-- Check for cache invalidation
SELECT * FROM pg_tview_meta;  -- Many entities = more cache misses
```

**Solutions**:
- Increase shared_buffers
- Reduce number of entities
- Optimize TVIEW dependencies

### Performance Degradation

**Symptoms**: Slow refresh times
**Investigation**:
```sql
-- Check system resources
SELECT * FROM pg_stat_bgwriter;
SELECT * FROM pg_stat_database;

-- Check TVIEW sizes
SELECT schemaname, tablename, pg_total_relation_size(oid) as size
FROM pg_class
WHERE relname LIKE 'tv_%'
ORDER BY size DESC;
```

**Solutions**:
- Add indexes on TVIEWs
- Partition large TVIEWs
- Optimize TVIEW definitions

## Best Practices

### Monitoring Setup

1. **Automate Health Checks**: Run every 5 minutes
2. **Set Up Alerts**: Critical for queue size > 1000, timing > 5 seconds
3. **Monitor Trends**: Track performance over time
4. **Log Everything**: Enable comprehensive logging

### Alert Thresholds

- **Queue Size**: Warning > 100, Critical > 1000
- **Refresh Timing**: Warning > 1 second, Critical > 5 seconds
- **Cache Hit Rate**: Warning < 80%, Critical < 50%
- **Count Mismatch**: Any difference between TVIEW and base table

### Dashboard Organization

1. **Real-time Panel**: Current queue size, timing, hit rates
2. **Historical Trends**: 24-hour and 7-day views
3. **TVIEW Health**: Size, count consistency, error rates
4. **System Resources**: CPU, memory, disk I/O correlation

## See Also

- [Operator Guide]../user-guides/operators.md - Production deployment
- [Troubleshooting Guide]troubleshooting.md - Issue resolution
- [Performance Tuning]performance-tuning.md - Optimization strategies