chopin-pg 0.5.20

A high-performance, asynchronous PostgreSQL driver for the Chopin framework.
Documentation

chopin-pg

Build status Crates.io Downloads License Rust

High-fidelity engineering for the modern virtuoso.

chopin-pg is a high‑performance, zero‑dependency PostgreSQL driver for the Chopin suite. Built for thread‑per‑core architectures with synchronous non‑blocking I/O, per‑worker connection pools, and zero external runtime dependencies (only libc).

Features

  • Zero external dependencies — all crypto (SCRAM-SHA-256), codec, and protocol are hand-written
  • Thread-per-core — each worker owns its connections and pool; no Arc, no Mutex
  • Synchronous non-blocking I/O — sockets in NB mode with poll-based application-level timeouts
  • Extended Query Protocol — prepared statements with binary parameter encoding
  • Statement cache — FNV-1a hash-based with LRU eviction and configurable capacity
  • Connection poolPgPool with checkout timeout, idle/max lifetime, test-on-checkout, auto-reconnect
  • COPY protocol — bulk COPY IN/COPY OUT with streaming CopyWriter/CopyReader
  • LISTEN/NOTIFY — async notification support with buffered delivery
  • Transactionsbegin/commit/rollback, savepoints, nested transactions, closure-based API
  • 22 PostgreSQL types — Bool, Int2/4/8, Float4/8, Text, Bytes, Json, Jsonb, Uuid, Date, Time, Timestamp, Timestamptz, Interval, Inet, Numeric, MacAddr, Point, Range, Array
  • Binary wire format — per-parameter format codes with binary result decoding
  • SCRAM-SHA-256 auth — zero-dep implementation; cleartext password also supported
  • Unix domain socketsPgConfig.socket_dir or ?host= URL parameter
  • Error classificationErrorClass::Transient/Permanent/Client/Pool with SQLSTATE mapping
  • Retry helperretry(max_retries, || { ... }) with transient error detection
  • Production hardening — broken connection flag, TCP_NODELAY, zero-copy writes, Rc<ColumnDesc> sharing

🚀 Benchmarks

chopin-pg is 1.5–3x faster than async drivers on query throughput due to its synchronous non-blocking architecture and zero external dependencies.

Real-World Performance (localhost PostgreSQL, 100K iterations for simple queries, 10K for CRUD)

Benchmark results from bench_compare — actual run, single connection per driver:

Traditional CRUD

Workload chopin-pg sqlx (tokio) tokio-postgres vs sqlx vs tokio-pg
SELECT 1 50,044 req/s 17,902 req/s 22,105 req/s 2.80x 2.26x
Parameterized Query 53,405 req/s 17,840 req/s 20,283 req/s 2.99x 2.63x
CRUD SELECT 47,026 req/s 17,315 req/s 18,780 req/s 2.72x 2.50x
CRUD UPDATE 15,230 req/s 9,579 req/s 9,583 req/s 1.59x 1.59x
CRUD INSERT 14,083 req/s 9,394 req/s 10,254 req/s 1.50x 1.37x

TFB Multi-Query (500 requests per N)

N chopin-pg sqlx tokio-postgres vs sqlx vs tokio-pg
1 45,616 req/s 17,564 req/s 18,105 req/s 2.60x 2.52x
5 9,506 req/s 3,514 req/s 3,725 req/s 2.71x 2.55x
10 4,475 req/s 1,763 req/s 1,769 req/s 2.54x 2.53x
15 3,317 req/s 1,140 req/s 1,243 req/s 2.91x 2.67x
20 2,325 req/s 869 req/s 918 req/s 2.68x 2.53x

TFB Database Updates (500 requests per N, SELECT+UPDATE each row)

N chopin-pg sqlx tokio-postgres vs sqlx vs tokio-pg
1 12,068 req/s 6,443 req/s 6,370 req/s 1.87x 1.89x
5 2,137 req/s 1,272 req/s 1,228 req/s 1.68x 1.74x
10 1,047 req/s 605 req/s 654 req/s 1.73x 1.60x
15 624 req/s 410 req/s 398 req/s 1.52x 1.57x
20 525 req/s 286 req/s 318 req/s 1.84x 1.65x

Benchmark Configuration:

  • 100K iterations for simple queries (SELECT 1, parameterized queries)
  • 10K iterations for CRUD operations (SELECT, UPDATE, INSERT)
  • 500 requests per TFB Multi-Query count (N=1,5,10,15,20)
  • Single connection per driver (no connection pooling overhead)
  • Localhost PostgreSQL on port 5432

Why Faster?

  1. No async runtime overhead — Synchronous poll()-based I/O eliminates Tokio task scheduler, Future polling, and context switching
  2. Shared-nothing per-worker pools — No lock contention; each worker owns its connections
  3. Zero external dependencies — Only libc; no 50+ transitive deps from tokio/sqlx
  4. Hand-tuned protocol — Custom SCRAM-SHA-256, statement cache, CompactBytes inline storage (≤24 bytes)
  5. CPU affinity — Workers pinned to cores; no task migration

Trade-off: Synchronous API vs. async/await ergonomics. Chopin excels for high-throughput backends (REST APIs, database proxies); less suitable for general async applications.

For detailed benchmark setup, running your own comparisons, and profiling instructions, see:

�🛠️ Quick Start

use chopin_pg::{PgConfig, PgConnection, PgResult};

fn main() -> PgResult<()> {
    let config = PgConfig::from_url("postgres://user:pass@localhost:5432/db")?;
    let mut conn = PgConnection::connect(&config)?;

    // Simple query (no parameters)
    let rows = conn.query_simple("SELECT current_database()")?;
    println!("Database: {:?}", rows[0].get(0)?);

    // Prepared statement with binary parameters
    let rows = conn.query(
        "SELECT id, name FROM users WHERE id = $1",
        &[&42i32],
    )?;
    for row in &rows {
        let id: i32 = row.get_typed(0)?;
        let name: String = row.get_typed(1)?;
        println!("User {}: {}", id, name);
    }

    // Execute (returns affected row count)
    let affected = conn.execute(
        "UPDATE users SET active = $1 WHERE id = $2",
        &[&true, &42i32],
    )?;
    println!("Updated {} rows", affected);

    Ok(())
}

🔗 Connection Pool

use chopin_pg::{PgConfig, PgPool, PgPoolConfig};
use std::time::Duration;

let config = PgConfig::from_url("postgres://user:pass@localhost:5432/db")?;

// Simple pool
let mut pool = PgPool::connect(config.clone(), 10)?;

// Advanced pool with configuration
let pool_config = PgPoolConfig::new()
    .max_size(25)
    .min_size(5)
    .checkout_timeout(Duration::from_secs(5))
    .idle_timeout(Duration::from_secs(300))
    .max_lifetime(Duration::from_secs(3600))
    .test_on_checkout(true);

let mut pool = PgPool::connect_with_config(config, pool_config)?;

// Get a connection (auto-returned on drop)
let mut conn = pool.get()?;
conn.query_simple("SELECT 1")?;

// Monitor pool health
println!("Active: {}, Idle: {}, Total: {}",
    pool.active_connections(), pool.idle_connections(), pool.total_connections());
let stats = pool.stats();
println!("Checkouts: {}, Created: {}", stats.total_checkouts, stats.total_connections_created);

📋 COPY Protocol (Bulk Operations)

// Bulk COPY IN
let mut writer = conn.copy_in("COPY users (name, email) FROM STDIN WITH (FORMAT csv)")?;
writer.write_row(&["Alice", "alice@example.com"])?;
writer.write_row(&["Bob", "bob@example.com"])?;
let rows_copied = writer.finish()?;
println!("Copied {} rows", rows_copied);

// COPY OUT
let mut reader = conn.copy_out("COPY users TO STDOUT WITH (FORMAT csv)")?;
let all_data = reader.read_all()?;
println!("Export: {}", String::from_utf8_lossy(&all_data));

🔔 LISTEN/NOTIFY

conn.listen("events")?;
conn.notify("events", "hello world")?;

// Poll for notifications
if let Some(notif) = conn.poll_notification()? {
    println!("Channel: {}, Payload: {}", notif.channel, notif.payload);
}

// Drain all buffered notifications
for notif in conn.drain_notifications() {
    println!("{}: {}", notif.channel, notif.payload);
}

conn.unlisten("events")?;

🔄 Transactions

// Closure-based (auto-commit on Ok, auto-rollback on Err)
conn.transaction(|tx| {
    tx.execute("INSERT INTO users (name) VALUES ($1)", &[&"Alice"])?;
    tx.execute("INSERT INTO users (name) VALUES ($1)", &[&"Bob"])?;
    Ok(())
})?;

// Manual control
conn.begin()?;
conn.execute("INSERT INTO users (name) VALUES ($1)", &[&"Charlie"])?;
conn.commit()?;

// Savepoints
conn.begin()?;
conn.savepoint("sp1")?;
conn.execute("INSERT INTO users (name) VALUES ($1)", &[&"Dave"])?;
conn.rollback_to("sp1")?;  // undo Dave
conn.commit()?;

// Nested transactions
conn.transaction(|tx| {
    tx.execute("INSERT INTO users (name) VALUES ($1)", &[&"Eve"])?;
    tx.transaction(|nested_tx| {
        nested_tx.execute("INSERT INTO users (name) VALUES ($1)", &[&"Frank"])?;
        Ok(())
    })?;
    Ok(())
})?;

📊 Supported PostgreSQL Types

PgValue Variant PostgreSQL Type Rust ToSql/FromSql
Bool BOOLEAN bool
Int2 SMALLINT i16
Int4 INTEGER i32
Int8 BIGINT i64
Float4 REAL f32
Float8 DOUBLE PRECISION f64
Text TEXT, VARCHAR String, &str
Bytes BYTEA Vec<u8>, &[u8]
Json JSON String
Jsonb JSONB Vec<u8>
Uuid UUID [u8; 16]
Date DATE i32 (PG epoch days)
Time TIME i64 (microseconds)
Timestamp TIMESTAMP i64 (microseconds)
Timestamptz TIMESTAMPTZ i64 (microseconds)
Interval INTERVAL {months, days, microseconds}
Inet INET, CIDR IpAddr, Ipv4Addr, Ipv6Addr
Numeric NUMERIC String (lossless precision)
MacAddr MACADDR [u8; 6]
Point POINT (f64, f64)
Range INT4RANGE, INT8RANGE, etc. String
Array ARRAY types Vec<T> for scalar T

🔐 Authentication

  • SCRAM-SHA-256 — fully implemented with zero external dependencies
  • Cleartext password — supported
  • MD5 — recognized but returns an error (not implemented)

🔌 Connection Pool Sizing for High Concurrency

When handling high concurrency (e.g., 512+ concurrent connections), proper connection pool sizing is critical. Understanding the relationship between HTTP concurrency and database pool size is essential to avoid connection starvation and timeouts.

Why Pool Size Matters

A common mistake is assuming a 1:1 ratio between concurrent HTTP connections and database pool size. This fails because:

  • Not all incoming requests hit the database simultaneously. At any moment, only 30-40% of HTTP connections are actively waiting on DB queries. The rest are parsing requests, serializing responses, or executing in middleware.
  • Database connections are expensive. Each connection consumes memory and resources on both the client and server. Creating a connection for every possible concurrent request wastes resources.
  • Connection starvation causes cascading failures. If all pool connections are busy and a new request arrives, it must wait. If many requests queue, timeouts increase exponentially.

The Right Formula

Pool Size per Worker = (Total Concurrent Connections / Number of Workers) × Connection Ratio

Connection Ratio (typical): 0.3 to 0.5 (or 2:1 to 5:1 HTTP:DB ratio)

Example: 512 Concurrent Connections

Assuming an 8-core system with 8 workers:

512 connections ÷ 8 workers = 64 connections per worker

❌ Pool size 64 per worker:  64:64 = 1:1 ratio (FAILS - connection starvation)
✅ Pool size 25 per worker:  64:25 = 2.5:1 ratio (RECOMMENDED)
✅ Pool size 20 per worker:  64:20 = 3.2:1 ratio (CONSERVATIVE)
✅ Pool size 32 per worker:  64:32 = 2:1 ratio (IF READ-HEAVY)

Why 64 failed: A 1:1 ratio means every HTTP connection needs its own DB connection. Since DB operations are fast, the pool becomes the bottleneck instead of the database. Requests queue up waiting for available connections, leading to timeouts.

Configuration

Set pool size when initializing the connection pool:

use chopin_pg::{PgConfig, PgPool};

let config = PgConfig::from_url("postgres://user:pass@localhost:5432/db")?;

// For 512 concurrent with 8 workers, use 25 per worker
let pool = PgPool::new(config, 25);  // ← Recommended starting point

Load Testing Recommendations

After configuring pool size, validate under realistic load:

# Load test with 512 concurrent clients, 8 threads, 30 seconds
wrk -t 8 -c 512 -d 30s http://localhost:8080/api/endpoint

# Monitor for:
# - Connection pool timeouts
# - Response latency increases
# - "All connections busy" errors in logs

# Database connection stats (in psql):
SELECT count(*) FROM pg_stat_activity;  -- Current active connections
SHOW max_connections;                    -- PostgreSQL server limit (default: 100)

Tuning Guidelines

Load Pattern Suggested Pool Size Ratio Notes
Read-heavy (80%+ reads) 30-35 per worker 2:1 Queries are fast; can support higher concurrency
Balanced (50/50) 20-25 per worker 2.5-3.2:1 Starting point for most workloads
Write-heavy (80%+ writes) 15-20 per worker 4-5:1 Queries are slower; queue requests instead
Microservices + API calls 25-40 per worker 2-3:1 External latency means more waiting connections

Monitoring & Alerts

Set up monitoring for pool exhaustion:

// Desired: Log when pool utilization > 80%
// If pool_size=25 and active_connections > 20, investigate

// Symptoms of undersized pool:
// - Increasing avg response time under sustained load
// - Queries queued in pg_stat_activity
// - Application logs: "Pool connection timeout"
// - Database slow query log fills up

Summary

  • Never use 1:1 ratio of HTTP connections to DB pool size
  • Start with 2.5:1 ratio (20-25 pool size for 512 concurrent / 8 workers)
  • Load test under realistic conditions before production deployment
  • Monitor pool utilization and adjust based on actual behavior

For 512 concurrent connections, a well-tuned pool of 25 connections per worker will handle typical API workloads efficiently while preventing resource exhaustion.