pg_ripple_http 0.128.0

SPARQL 1.1 Protocol HTTP endpoint for pg_ripple: connects a PostgreSQL 18 RDF triple store to the web

pg_ripple_http

Standalone HTTP service that exposes a W3C SPARQL 1.1 Protocol endpoint for pg_ripple. Any standard SPARQL client — YASGUI, SPARQLWrapper, Jena, or plain curl — can query pg_ripple without a PostgreSQL driver.

Build

cargo build --release -p pg_ripple_http

The binary is placed at target/release/pg_ripple_http.

Requirements: Rust 1.88+ and a running PostgreSQL 18 instance with the pg_ripple extension installed.

Run

./target/release/pg_ripple_http

On startup, the service connects to PostgreSQL, verifies that pg_ripple is available, and logs the connection details:

INFO pg_ripple_http: connected to postgresql://localhost/postgres (port 7878), triple store contains 12345 triples
INFO pg_ripple_http: pg_ripple_http listening on http://0.0.0.0:7878

Configuration

All configuration is via environment variables:

| Variable | Default | Description |
|---|---|---|
| PG_RIPPLE_HTTP_PG_URL | postgresql://localhost/postgres | PostgreSQL connection URL |
| PG_RIPPLE_HTTP_PORT | 7878 | HTTP listening port |
| PG_RIPPLE_HTTP_POOL_SIZE | 16 | Database connection pool size |
| PG_RIPPLE_HTTP_AUTH_TOKEN | (unset) | If set, requests must include Authorization: Bearer <token> |
| PG_RIPPLE_HTTP_AUTH_REALM | pg_ripple | Value used in the Bearer realm= field of WWW-Authenticate response headers (L16-06, v0.117.0) |
| PG_RIPPLE_HTTP_METRICS_TOKEN | (unset) | If set, GET /metrics requires Authorization: Bearer <token> (M16-22) |
| PG_RIPPLE_HTTP_RATE_LIMIT | 0 | Max requests/sec per client IP (0 = disabled) |
| PG_RIPPLE_HTTP_CORS_ORIGINS | * | Comma-separated allowed origins, or * for all |

Example:

export PG_RIPPLE_HTTP_PG_URL="postgresql://user:password@db-host:5432/mydb"
export PG_RIPPLE_HTTP_PORT=8080
export PG_RIPPLE_HTTP_AUTH_TOKEN="my-secret-token"
./target/release/pg_ripple_http
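
For wrapper scripts or tests that launch the service, it can help to compute the settings a launch would see. The following is a minimal sketch that overlays the environment on the defaults from the table above. The names `DEFAULTS`, `effective_config`, and `base_url` are hypothetical helpers, and nothing here reflects how the service itself parses its environment.

```python
import os

# Defaults copied from the configuration table above.
DEFAULTS = {
    "PG_RIPPLE_HTTP_PG_URL": "postgresql://localhost/postgres",
    "PG_RIPPLE_HTTP_PORT": "7878",
    "PG_RIPPLE_HTTP_POOL_SIZE": "16",
    "PG_RIPPLE_HTTP_AUTH_REALM": "pg_ripple",
    "PG_RIPPLE_HTTP_RATE_LIMIT": "0",
    "PG_RIPPLE_HTTP_CORS_ORIGINS": "*",
}

# These variables have no default; they stay absent when unset.
TOKEN_VARS = ("PG_RIPPLE_HTTP_AUTH_TOKEN", "PG_RIPPLE_HTTP_METRICS_TOKEN")


def effective_config(env=None):
    """Return table defaults overlaid with values found in env."""
    env = os.environ if env is None else env
    config = dict(DEFAULTS)
    for name in DEFAULTS:
        if name in env:
            config[name] = env[name]
    for name in TOKEN_VARS:
        if env.get(name):
            config[name] = env[name]
    return config


def base_url(config, host="localhost"):
    """Client-side base URL for the configured port (host is an assumption)."""
    return f"http://{host}:{int(config['PG_RIPPLE_HTTP_PORT'])}"
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise without touching the real environment.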

Endpoints

GET /health (liveness)

Returns 200 OK when the process is alive and the database is reachable. Use this for Kubernetes liveness probes — a non-200 response means the container should be restarted.

curl http://localhost:7878/health

GET /ready (readiness)

Returns 200 OK once the server has completed start-up and successfully obtained a connection from the pool, meaning it is ready to serve traffic. Use this for Kubernetes readiness probes — a non-200 response removes the pod from the service endpoints until recovery.

curl http://localhost:7878/ready

M16-09 (v0.115.0): Kubernetes livenessProbe should point at /health and readinessProbe at /ready. Using exec: pg_isready for both probes is deprecated from v0.115.0 onward.
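
Outside Kubernetes, the same two endpoints can drive a simple supervisor script. The sketch below polls a URL with the Python standard library and maps the two results to the actions described above. The function names and the action labels are hypothetical, and the timeout is an arbitrary choice.

```python
import urllib.error
import urllib.request


def probe(url, timeout=2.0):
    """Return the HTTP status for url, or None if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with a non-2xx status.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def probe_decision(health_status, ready_status):
    """Map /health and /ready results to the action the text describes."""
    if health_status != 200:
        return "restart"              # liveness failed
    if ready_status != 200:
        return "remove-from-service"  # alive, but not ready for traffic
    return "serve"
```

A caller would combine them as `probe_decision(probe(base + "/health"), probe(base + "/ready"))`, where `base` is the service URL.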

GET /metrics

Prometheus-compatible metrics.

curl http://localhost:7878/metrics
# With optional bearer-token protection (M16-22, v0.115.0):
curl -H "Authorization: Bearer $PG_RIPPLE_HTTP_METRICS_TOKEN" http://localhost:7878/metrics

Security (M16-22, v0.115.0): Set PG_RIPPLE_HTTP_METRICS_TOKEN to require a bearer token on the /metrics endpoint. Requests without a valid token receive 401 Unauthorized with a WWW-Authenticate: Bearer challenge. Even without this token you should still restrict the metrics port to your Prometheus scraper IP via a reverse-proxy ACL.

See Security → Metrics Port Isolation for details.

GET /sparql?query=…

Run a SPARQL query via URL parameter.

curl -G http://localhost:7878/sparql \
  --data-urlencode "query=SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

POST /sparql

Run a SPARQL query or update via request body.

| Content-Type | Body |
|---|---|
| application/sparql-query | Raw SPARQL SELECT/ASK/CONSTRUCT/DESCRIBE |
| application/sparql-update | Raw SPARQL INSERT/DELETE |
| application/x-www-form-urlencoded | query=… or update=… |

# SELECT
curl -X POST http://localhost:7878/sparql \
  -H "Content-Type: application/sparql-query" \
  -d "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

# Update
curl -X POST http://localhost:7878/sparql \
  -H "Content-Type: application/sparql-update" \
  -d 'INSERT DATA { <http://example.org/alice> <http://example.org/name> "Alice" }'
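
The same GET and POST requests can be built without curl. This is a minimal sketch using only the Python standard library; `ENDPOINT`, `sparql_get`, and `sparql_post` are hypothetical names, and the URL assumes the default port. The functions only construct the request, so nothing is sent until it is passed to `urllib.request.urlopen`.

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://localhost:7878/sparql"  # assumes the default port


def sparql_get(query, endpoint=ENDPOINT):
    """Build GET /sparql?query=... with the query percent-encoded."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, method="GET")


def sparql_post(text, update=False, endpoint=ENDPOINT):
    """Build a POST carrying raw query or update text in the body."""
    content_type = (
        "application/sparql-update" if update else "application/sparql-query"
    )
    return urllib.request.Request(
        endpoint,
        data=text.encode("utf-8"),
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Choosing the content type from a single `update` flag mirrors the table above: queries and updates share one URL and differ only in the declared body type.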

Content negotiation

Set the Accept header to control the response format:

| Accept | Format |
|---|---|
| application/sparql-results+json (default for SELECT/ASK) | SPARQL Results JSON |
| application/sparql-results+xml | SPARQL Results XML |
| text/csv | CSV |
| text/tab-separated-values | TSV |
| text/turtle (default for CONSTRUCT/DESCRIBE) | Turtle |
| application/n-triples | N-Triples |
| application/ld+json | JSON-LD |

curl -G http://localhost:7878/sparql \
  -H "Accept: text/csv" \
  --data-urlencode "query=SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
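
When the default `application/sparql-results+json` format is returned, the body follows the W3C SPARQL 1.1 Query Results JSON Format. The sketch below flattens such a document into plain rows; `rows_from_sparql_json` is a hypothetical helper, and it keeps only each term's `value`, discarding type, datatype, and language information.

```python
import json


def rows_from_sparql_json(text):
    """Flatten a SPARQL Results JSON document.

    Returns a bool for ASK results, otherwise a list of dicts that map
    each variable to its value (None when the variable is unbound).
    """
    doc = json.loads(text)
    if "boolean" in doc:
        return doc["boolean"]
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # Unbound variables are simply absent from a binding object.
        rows.append({
            v: (binding[v]["value"] if v in binding else None)
            for v in variables
        })
    return rows
```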

Authentication

If PG_RIPPLE_HTTP_AUTH_TOKEN is set, every request must include the token:

curl -G http://localhost:7878/sparql \
  -H "Authorization: Bearer my-secret-token" \
  --data-urlencode "query=SELECT * WHERE { ?s ?p ?o } LIMIT 5"

Both Authorization: Bearer <token> and Authorization: Basic <token> are accepted.

Docker Compose

The root docker-compose.yml runs both PostgreSQL and pg_ripple_http together:

docker compose up

Services:

| Service | Port | Description |
|---|---|---|
| postgres | 5432 | PostgreSQL 18 + pg_ripple |
| sparql | 7878 | SPARQL HTTP endpoint |

Datalog API

Since v0.39.0, pg_ripple_http also exposes a /datalog REST namespace that lets any HTTP client manage Datalog rule sets, trigger inference, run goal-directed queries, check integrity constraints, and inspect monitoring statistics — without a PostgreSQL driver.

All Datalog endpoints accept and return application/json (unless specified otherwise). Rule text uses text/x-datalog or text/plain.

Authentication

The same PG_RIPPLE_HTTP_AUTH_TOKEN bearer token covers all /datalog/* endpoints. Optionally, set PG_RIPPLE_HTTP_DATALOG_WRITE_TOKEN to restrict mutating endpoints (POST, PUT, DELETE) to a separate token while keeping read endpoints (inference triggers, monitoring, GET) covered by the main token.

Phase 1 — Rule management

POST /datalog/rules/{rule_set}

Load rules from Datalog text. Body: text/x-datalog.

curl -X POST http://localhost:7878/datalog/rules/my-ontology \
  -H "Content-Type: text/x-datalog" \
  -d 'ancestor(?x, ?y) :- parent(?x, ?y).
ancestor(?x, ?z) :- parent(?x, ?y), ancestor(?y, ?z).'
# → {"rule_set": "my-ontology", "rules_loaded": 2}
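
For scripted rule loading, the request above can be assembled programmatically. The following sketch builds the POST with the standard library; `BASE` and `load_rules_request` are hypothetical names, the newline-joined body simply mirrors the curl example, and the optional bearer header follows the Authentication section.

```python
import urllib.parse
import urllib.request

BASE = "http://localhost:7878"  # assumes the default port


def load_rules_request(rule_set, rules, token=None, base=BASE):
    """Build POST /datalog/rules/{rule_set} carrying Datalog text."""
    # Percent-encode the rule set name so it is safe as a path segment.
    path = "/datalog/rules/" + urllib.parse.quote(rule_set, safe="")
    headers = {"Content-Type": "text/x-datalog"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    body = "\n".join(rules).encode("utf-8")
    return urllib.request.Request(
        base + path, data=body, headers=headers, method="POST"
    )
```

Sending the result with `urllib.request.urlopen` performs the upload.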

POST /datalog/rules/{rule_set}/builtin

Load a built-in rule set (rdfs, owl-rl).

curl -X POST http://localhost:7878/datalog/rules/rdfs/builtin
# → {"rule_set": "rdfs", "rules_loaded": 13}

GET /datalog/rules

List all rule sets and their rules.

curl http://localhost:7878/datalog/rules
# → [{"id": 1, "rule_set": "my-ontology", "rule_text": "…", "active": true}, …]

DELETE /datalog/rules/{rule_set}

Delete all rules in a rule set.

curl -X DELETE http://localhost:7878/datalog/rules/my-ontology
# → {"deleted": 2}

POST /datalog/rules/{rule_set}/add

Add a single rule to an existing rule set. Body: text/x-datalog.

curl -X POST http://localhost:7878/datalog/rules/my-ontology/add \
  -H "Content-Type: text/x-datalog" \
  -d 'sibling(?x, ?y) :- parent(?p, ?x), parent(?p, ?y).'
# → {"rule_set": "my-ontology", "rule_id": 3}

DELETE /datalog/rules/{rule_set}/{rule_id}

Remove a single rule by ID (triggers DRed retraction).

curl -X DELETE http://localhost:7878/datalog/rules/my-ontology/3
# → {"removed": 1}

PUT /datalog/rules/{rule_set}/enable

Activate a rule set.

curl -X PUT http://localhost:7878/datalog/rules/my-ontology/enable
# → {"rule_set": "my-ontology", "enabled": true}

PUT /datalog/rules/{rule_set}/disable

Deactivate a rule set.

curl -X PUT http://localhost:7878/datalog/rules/my-ontology/disable
# → {"rule_set": "my-ontology", "enabled": false}

Phase 2 — Inference

POST /datalog/infer/{rule_set}

Materialize derived triples (semi-naive evaluation).

curl -X POST http://localhost:7878/datalog/infer/my-ontology
# → {"derived": 42}

POST /datalog/infer/{rule_set}/stats

Infer with detailed per-stratum statistics.

curl -X POST http://localhost:7878/datalog/infer/my-ontology/stats
# → {"derived": 42, "iterations": 3, "eliminated_rules": 0, "parallel_groups": 2, …}

POST /datalog/infer/{rule_set}/agg

Aggregate-aware inference (Datalog^agg).

curl -X POST http://localhost:7878/datalog/infer/my-ontology/agg
# → {"derived": 12}

POST /datalog/infer/{rule_set}/wfs

Well-Founded Semantics inference (three-valued).

curl -X POST http://localhost:7878/datalog/infer/my-ontology/wfs
# → {"derived": 8}

POST /datalog/infer/{rule_set}/demand

Demand-transformed (goal-directed) inference. Body: JSON.

curl -X POST http://localhost:7878/datalog/infer/my-ontology/demand \
  -H "Content-Type: application/json" \
  -d '{"demands": [{"predicate": "ancestor", "bound": [0]}]}'
# → {"derived": 12, "iterations": 2, "demand_predicates": ["ancestor_bf"]}
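
The request body and the `ancestor_bf` name in the sample response fit the usual magic-sets adornment notation, where `b` marks a bound argument and `f` a free one. The sketch below builds the JSON body and derives an adornment string. It assumes that `bound` lists zero-based argument positions, which is inferred from the example rather than stated in this document; both function names are hypothetical.

```python
import json


def demand_body(demands):
    """Build the JSON body from (predicate, bound_positions) pairs."""
    return json.dumps({
        "demands": [
            {"predicate": predicate, "bound": sorted(bound)}
            for predicate, bound in demands
        ]
    })


def adornment(arity, bound):
    """Adornment string: 'b' for bound positions, 'f' for free ones.

    Assumes zero-based positions, e.g. arity 2 with bound [0] gives 'bf'.
    """
    return "".join("b" if i in bound else "f" for i in range(arity))
```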

POST /datalog/infer/{rule_set}/lattice

Lattice-based inference (Datalog^L). Body: JSON.

curl -X POST http://localhost:7878/datalog/infer/my-ontology/lattice \
  -H "Content-Type: application/json" \
  -d '{"lattice": "min"}'
# → {"derived": 5}

Phase 3 — Query & constraints

POST /datalog/query/{rule_set}

Goal-directed query via magic sets. Body: Datalog goal text.

curl -X POST http://localhost:7878/datalog/query/my-ontology \
  -H "Content-Type: text/x-datalog" \
  -d 'ancestor(ex:alice, ?y).'
# → {"derived": 5, "iterations": 2, "matching": [{"y": "http://example.org/bob"}, …]}

GET /datalog/constraints

Check all constraint rules; return violations.

curl http://localhost:7878/datalog/constraints
# → [{"rule": "no_self_parent", "violated": false}, …]
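
In a CI job, the constraint report is typically reduced to a pass or fail signal. This is a small sketch that assumes the response is a JSON array shaped like the sample above, with `rule` and `violated` fields; `violations` is a hypothetical helper name.

```python
import json


def violations(response_text):
    """Return the names of constraint rules reported as violated."""
    return [
        entry["rule"]
        for entry in json.loads(response_text)
        if entry.get("violated")
    ]
```

A pipeline step could then fail the build whenever the returned list is non-empty.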

GET /datalog/constraints/{rule_set}

Check constraints for a specific rule set.

curl http://localhost:7878/datalog/constraints/my-ontology

Phase 4 — Admin & monitoring

GET /datalog/stats/cache

Rule plan cache statistics.

curl http://localhost:7878/datalog/stats/cache
# → {"size": 12, "hits": 340, "misses": 8, …}
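
A hit ratio is often easier to alert on than raw counters. Assuming `hits` and `misses` are cumulative counts, as the sample response suggests, the ratio is hits / (hits + misses). For the sample values that is 340 / 348, or about 0.977. The helper name below is hypothetical.

```python
def hit_ratio(stats):
    """Fraction of lookups served from the cache; 0.0 when there were none."""
    total = stats["hits"] + stats["misses"]
    return stats["hits"] / total if total else 0.0
```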

GET /datalog/stats/tabling

Tabling/memoization cache statistics.

curl http://localhost:7878/datalog/stats/tabling
# → {"entries": 100, "hit_rate": 0.82, …}

GET /datalog/lattices

List registered lattice types.

curl http://localhost:7878/datalog/lattices
# → [{"name": "min", "join_fn": "LEAST", "bottom": "Infinity"}, …]

POST /datalog/lattices

Register a new lattice type. Body: JSON.

curl -X POST http://localhost:7878/datalog/lattices \
  -H "Content-Type: application/json" \
  -d '{"name": "my_min", "join_fn": "my_schema.my_min", "bottom": "9999"}'
# → {"created": "my_min"}

GET /datalog/views

List all Datalog materialized views.

curl http://localhost:7878/datalog/views
# → [{"name": "ancestor_view", "goal": "ancestor(?x, ?y).", …}, …]

POST /datalog/views

Create a Datalog materialized view. Body: JSON.

curl -X POST http://localhost:7878/datalog/views \
  -H "Content-Type: application/json" \
  -d '{"name": "ancestor_view", "goal": "ancestor(?x, ?y).", "rule_set": "my-ontology"}'
# → {"created": "ancestor_view"}

DELETE /datalog/views/{name}

Drop a Datalog materialized view.

curl -X DELETE http://localhost:7878/datalog/views/ancestor_view
# → {"dropped": "ancestor_view"}

Error codes

| HTTP Status | error field | Trigger |
|---|---|---|
| 400 | datalog_parse_error | Malformed Datalog rule text |
| 400 | datalog_goal_error | Invalid goal pattern |
| 400 | invalid_request | Missing body, wrong content-type, non-numeric rule_id |
| 404 | rule_set_not_found | Infer/drop on a nonexistent rule set |
| 401 | | Missing or invalid Authorization token |
| 503 | service_unavailable | Connection pool exhausted |

All error responses include a trace_id field for log correlation.
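
A client can turn these responses into exceptions that carry the `trace_id` for log correlation. The sketch below assumes the error body is a JSON object with `error` and `trace_id` fields, as described above; any other fields are unknown and ignored. `RippleError`, `error_from_response`, and `is_retryable` are hypothetical names, and treating only 503 as retryable is a judgment call, not documented behaviour.

```python
import json


class RippleError(Exception):
    """Client-side wrapper for an error response."""

    def __init__(self, status, code, trace_id):
        super().__init__(
            f"HTTP {status}: {code or 'unknown'} (trace_id={trace_id})"
        )
        self.status = status
        self.code = code
        self.trace_id = trace_id


def error_from_response(status, body):
    """Build a RippleError from a status code and a response body."""
    try:
        doc = json.loads(body)
    except ValueError:
        doc = {}
    if not isinstance(doc, dict):
        doc = {}
    return RippleError(status, doc.get("error"), doc.get("trace_id"))


def is_retryable(err):
    """Pool exhaustion (503) is transient, so a retry with backoff is reasonable."""
    return err.status == 503
```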

Read-Replica Routing (v0.120.0)

pg_ripple_http can route read-only SPARQL queries to a PostgreSQL standby replica, reducing load on the primary.

Configuration

Set the PG_RIPPLE_HTTP_REPLICA_DSN environment variable to a PostgreSQL DSN pointing to a read-replica:

export PG_RIPPLE_HTTP_REPLICA_DSN="host=replica.db.example.com dbname=ripple user=app password=..."

When this variable is set, a separate connection pool is created for the replica.

Usage

Add ?replica=ok to any SPARQL SELECT, CONSTRUCT, or ASK request to route it to the replica:

# GET
curl "http://localhost:7878/sparql?query=SELECT+%2A+WHERE+%7B+%3Fs+%3Fp+%3Fo+%7D+LIMIT+10&replica=ok"

# POST (application/sparql-query)
curl -X POST "http://localhost:7878/sparql?replica=ok" \
  -H "Content-Type: application/sparql-query" \
  -d "SELECT * WHERE { ?s ?p ?o } LIMIT 10"

# POST (form-encoded)
curl -X POST "http://localhost:7878/sparql?replica=ok" \
  --data-urlencode "query=SELECT * WHERE { ?s ?p ?o } LIMIT 10"

- If the replica is unavailable, the request falls back to the primary and a
  `WARN`-level message is logged.
- Routing decisions are logged at `DEBUG` level.
- The `?replica=ok` parameter is silently ignored if `PG_RIPPLE_HTTP_REPLICA_DSN`
  is not configured.
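
The routing rules in this section can be summarised as a small decision function. This is an illustrative Python model of the documented behaviour, not the service's actual code. It assumes that only the SELECT, CONSTRUCT, and ASK forms named above are eligible and that everything else stays on the primary; `REPLICA_FORMS` and `choose_pool` are hypothetical names.

```python
# Query forms the Usage text names as routable to the replica.
REPLICA_FORMS = {"SELECT", "CONSTRUCT", "ASK"}


def choose_pool(replica_dsn, replica_param, query_form, replica_available):
    """Return (pool_name, warn) for one request.

    replica_dsn: value of PG_RIPPLE_HTTP_REPLICA_DSN, or None when unset.
    replica_param: value of the ?replica= query parameter, or None.
    query_form: "SELECT", "CONSTRUCT", "ASK", or any other request type.
    replica_available: whether a replica connection could be obtained.
    """
    eligible = (
        bool(replica_dsn)            # parameter is ignored without a DSN
        and replica_param == "ok"
        and query_form in REPLICA_FORMS
    )
    if not eligible:
        return ("primary", False)
    if replica_available:
        return ("replica", False)
    # Fallback to the primary, reported at WARN level.
    return ("primary", True)
```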