pg_exporter 0.11.1

PostgreSQL metric exporter for Prometheus
Documentation
# pg_stat_statements Collector

The `statements` collector tracks query performance metrics from PostgreSQL's `pg_stat_statements` extension. It's one of the most powerful tools for identifying and optimizing slow queries in production.

`pg_exporter` supports PostgreSQL 14 and newer, so all metrics documented here assume PostgreSQL 14+.

## Why This Matters

- **Find slow queries during incidents** - "What query is causing high load?"
- **Detect N+1 query problems** - Before they scale and impact production
- **Identify performance regressions** - After deployments or configuration changes
- **Optimize based on real data** - Use actual production query patterns, not guesses
- **Track resource-intensive queries** - I/O, WAL generation, temp files

This collector complements other collectors:
- **`default`** - System-wide metrics (cache hit ratio, checkpoints, connections)
- **`stat.user_tables`** - Table-level metrics (bloat, vacuum, DML rates)
- **`statements`** - Query-level metrics (execution time, frequency, I/O)

Together, they provide complete visibility from system → table → query level.

## Prerequisites

### Enable the Extension

```sql
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
```

### Configure PostgreSQL

Add to `postgresql.conf`:

```ini
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
pg_stat_statements.max = 10000
```

Restart PostgreSQL after modifying `postgresql.conf`.

## Usage

Enable the collector:

```bash
pg_exporter --dsn postgresql:///postgres?user=postgres_exporter --collector.statements
```

By default, it tracks the **top 25 queries** by total execution time.

Configure the number of queries to track:

```bash
# Track top 10 queries
pg_exporter --dsn postgresql://... --collector.statements --statements.top-n=10

# Environment variable form
PG_EXPORTER_STATEMENTS_TOP_N=50 pg_exporter --dsn postgresql://... --collector.statements
```

## Key Metrics

### Execution Time
- `pg_stat_statements_total_exec_time_seconds` - Total time spent in this query
- `pg_stat_statements_mean_exec_time_seconds` - Average time per execution
- `pg_stat_statements_max_exec_time_seconds` - Slowest execution
- `pg_stat_statements_stddev_exec_time_seconds` - Execution time variance

### Frequency
- `pg_stat_statements_calls_total` - How many times the query was executed
- `pg_stat_statements_rows_total` - Total rows returned/affected

### I/O Metrics
- `pg_stat_statements_shared_blks_hit_total` - Cache hits (fast)
- `pg_stat_statements_shared_blks_read_total` - Disk reads (slow)
- `pg_stat_statements_temp_blks_written_total` - Queries spilling to disk
- `pg_stat_statements_cache_hit_ratio` - Query cache effectiveness (0.0-1.0)

### Resource Usage
- `pg_stat_statements_wal_bytes_total` - WAL generation

## Use Cases

### 1. Finding Slow Queries

PromQL query to find queries with highest total time:

```promql
topk(10, 
  pg_stat_statements_total_exec_time_seconds
)
```

### 2. Identifying N+1 Problems

Queries executed many times with low row counts:

```promql
pg_stat_statements_calls_total > 1000
and
pg_stat_statements_rows_total / pg_stat_statements_calls_total < 10
```

### 3. Cache Miss Detection

Queries with poor cache hit ratios:

```promql
pg_stat_statements_cache_hit_ratio < 0.9
and
pg_stat_statements_calls_total > 100
```

### 4. Temp File Usage

Queries writing to disk (needs more work_mem):

```promql
rate(pg_stat_statements_temp_blks_written_total[5m]) > 0
```

### 5. Performance Regression Detection

Alert on queries getting slower:

```promql
increase(pg_stat_statements_mean_exec_time_seconds[1h]) > 0.5
```

## Labels

All metrics include these labels:

- `queryid` - Unique query identifier
- `datname` - Database name
- `usename` - User/role name
- `query_short` - First 80 characters of the query (or `<utility>` for VACUUM/ANALYZE)

`query_short` is intentionally capped at 80 characters to keep Prometheus label
cardinality and label size under control. It is meant for fast identification in
Prometheus and Grafana, not as a full SQL text export.

When you need the full normalized statement text, use the `queryid` label from
the metric and query `pg_stat_statements` directly.

Example:

```sql
SELECT
    queryid::text,
    d.datname,
    r.rolname,
    s.query
FROM pg_stat_statements s
JOIN pg_database d ON d.oid = s.dbid
LEFT JOIN pg_roles r ON r.oid = s.userid
WHERE queryid::text = '<queryid-from-metric>';
```

If you want to narrow the search further, also filter by `datname`:

```sql
SELECT
    queryid::text,
    d.datname,
    r.rolname,
    s.query
FROM pg_stat_statements s
JOIN pg_database d ON d.oid = s.dbid
LEFT JOIN pg_roles r ON r.oid = s.userid
WHERE queryid::text = '<queryid-from-metric>'
  AND d.datname = '<database-from-metric>';
```

## Important Notes

### Query Text Normalization

PostgreSQL normalizes queries by replacing constants with placeholders:

```sql
-- These are the same query:
SELECT * FROM users WHERE id = 1;
SELECT * FROM users WHERE id = 2;

-- Tracked as:
SELECT * FROM users WHERE id = $1;
```

### Utility Statements

Utility statements (VACUUM, ANALYZE, CREATE INDEX, etc.) may appear as `<utility>` in the `query_short` label since PostgreSQL doesn't always track their full text.

### Top N Queries

The collector tracks the top N queries **by total execution time**. This means:
- Long-running infrequent queries appear at the top
- Fast but frequent queries also appear if their total time is high
- Adjust `--statements.top-n` based on your query diversity and scrape budget

### Performance Impact

`pg_stat_statements` has minimal overhead (typically <1% CPU). However:
- Higher `pg_stat_statements.max` values use more memory
- The collector queries `pg_stat_statements` on each scrape
- For high-traffic databases, consider longer scrape intervals

## Troubleshooting

### Extension Not Found

```
ERROR: extension "pg_stat_statements" is not available
```

**Solution**: Install the extension package:
```bash
# Debian/Ubuntu
apt-get install postgresql-contrib

# RHEL/CentOS
yum install postgresql-contrib
```

### No Metrics Appear

**Possible causes**:
1. Extension not loaded - Check `SHOW shared_preload_libraries;`
2. Extension not created - Run `CREATE EXTENSION pg_stat_statements;`
3. No queries executed yet - Run some queries to populate stats
4. Collector not enabled - Use `--collector.statements`

### Query Text Shows as NULL or `<utility>`

This is normal for:
- Utility statements (VACUUM, ANALYZE, etc.)
- Queries from other monitoring tools
- Internal PostgreSQL operations

## Best Practices

1. **Enable in production** - Query-level insights are essential for troubleshooting
2. **Monitor the top 50-100 queries** - Balance coverage vs cardinality
3. **Reset stats after major changes** - `SELECT pg_stat_statements_reset();` after schema migrations
4. **Set appropriate scrape intervals** - 30-60 seconds is usually sufficient
5. **Combine with other collectors** - Use `activity`, `stat`, and `vacuum` collectors together for complete visibility

## Example Grafana Dashboard

Track your slowest queries:

```promql
# Panel 1: Top 10 Slowest Queries (by total time)
topk(10, pg_stat_statements_total_exec_time_seconds)

# Panel 2: Most Called Queries
topk(10, rate(pg_stat_statements_calls_total[5m]))

# Panel 3: Cache Hit Ratio Heatmap
pg_stat_statements_cache_hit_ratio

# Panel 4: Queries Writing Temp Files
sum by (query_short) (
  rate(pg_stat_statements_temp_blks_written_total[5m])
)
```

## References

- [PostgreSQL pg_stat_statements documentation]https://www.postgresql.org/docs/current/pgstatstatements.html
- [Source code]pg_statements.rs
- [Integration tests]../../tests/collectors/statements/pg_statements.rs