# pg_ripple_http
Standalone HTTP service that exposes a [W3C SPARQL 1.1 Protocol](https://www.w3.org/TR/sparql11-protocol/) endpoint for [pg_ripple](../README.md). Any standard SPARQL client — YASGUI, SPARQLWrapper, Jena, or plain `curl` — can query pg_ripple without a PostgreSQL driver.
## Build
```bash
cargo build --release -p pg_ripple_http
```
The binary is placed at `target/release/pg_ripple_http`.
**Requirements:** Rust 1.88+, and a running PostgreSQL 18 instance with the `pg_ripple` extension installed.
## Run
```bash
./target/release/pg_ripple_http
```
On startup, the service connects to PostgreSQL, verifies that pg_ripple is available, and logs the connection details:
```
INFO pg_ripple_http: connected to postgresql://localhost/postgres (port 7878), triple store contains 12345 triples
INFO pg_ripple_http: pg_ripple_http listening on http://0.0.0.0:7878
```
## Configuration
All configuration is via environment variables:
| `PG_RIPPLE_HTTP_PG_URL` | `postgresql://localhost/postgres` | PostgreSQL connection URL |
| `PG_RIPPLE_HTTP_PORT` | `7878` | HTTP listening port |
| `PG_RIPPLE_HTTP_POOL_SIZE` | `16` | Database connection pool size |
| `PG_RIPPLE_HTTP_AUTH_TOKEN` | (unset) | If set, requests must include `Authorization: Bearer <token>` |
| `PG_RIPPLE_HTTP_AUTH_REALM` | `pg_ripple` | Value used in the `Bearer realm=` field of `WWW-Authenticate` response headers (L16-06, v0.117.0) |
| `PG_RIPPLE_HTTP_METRICS_TOKEN` | (unset) | If set, `GET /metrics` requires `Authorization: Bearer <token>` (M16-22) |
| `PG_RIPPLE_HTTP_RATE_LIMIT` | `0` | Max requests/sec per client IP (0 = disabled) |
| `PG_RIPPLE_HTTP_CORS_ORIGINS` | `*` | Comma-separated allowed origins, or `*` for all |
Example:
```bash
export PG_RIPPLE_HTTP_PG_URL="postgresql://user:password@db-host:5432/mydb"
export PG_RIPPLE_HTTP_PORT=8080
export PG_RIPPLE_HTTP_AUTH_TOKEN="my-secret-token"
./target/release/pg_ripple_http
```
## Endpoints
### `GET /health` (liveness)
Returns `200 OK` when the process is alive **and** the database is reachable.
Use this for Kubernetes **liveness** probes — a non-200 response means the
container should be restarted.
```bash
curl http://localhost:7878/health
```
### `GET /ready` (readiness)
Returns `200 OK` once the server has completed start-up and successfully
obtained a connection from the pool, meaning it is ready to serve traffic.
Use this for Kubernetes **readiness** probes — a non-200 response removes
the pod from the service endpoints until recovery.
```bash
curl http://localhost:7878/ready
```
> **M16-09 (v0.115.0):** Kubernetes `livenessProbe` should point at `/health`
> and `readinessProbe` at `/ready`. Using `exec: pg_isready` for both probes
> is deprecated from v0.115.0 onward.
### `GET /metrics`
Prometheus-compatible metrics.
```bash
curl http://localhost:7878/metrics
# With optional bearer-token protection (M16-22, v0.115.0):
curl -H "Authorization: Bearer $PG_RIPPLE_HTTP_METRICS_TOKEN" http://localhost:7878/metrics
```
> **Security (M16-22, v0.115.0):** Set `PG_RIPPLE_HTTP_METRICS_TOKEN` to
> require a bearer token on the `/metrics` endpoint. Requests without a
> valid token receive `401 Unauthorized` with a `WWW-Authenticate: Bearer`
> challenge. Even without this token you should still restrict the metrics
> port to your Prometheus scraper IP via a reverse-proxy ACL.
>
> See [Security → Metrics Port Isolation](../docs/src/operations/security.md) for details.
### `GET /sparql?query=…`
Run a SPARQL query via URL parameter.
```bash
curl -G http://localhost:7878/sparql \
--data-urlencode "query=SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
```
### `POST /sparql`
Run a SPARQL query or update via request body.
| `application/sparql-query` | Raw SPARQL SELECT/ASK/CONSTRUCT/DESCRIBE |
| `application/sparql-update` | Raw SPARQL INSERT/DELETE |
| `application/x-www-form-urlencoded` | `query=…` or `update=…` |
```bash
# SELECT
curl -X POST http://localhost:7878/sparql \
-H "Content-Type: application/sparql-query" \
-d "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
# Update
curl -X POST http://localhost:7878/sparql \
-H "Content-Type: application/sparql-update" \
-d 'INSERT DATA { <http://example.org/alice> <http://example.org/name> "Alice" }'
```
## Content negotiation
Set the `Accept` header to control the response format:
| `application/sparql-results+json` *(default for SELECT/ASK)* | SPARQL Results JSON |
| `application/sparql-results+xml` | SPARQL Results XML |
| `text/csv` | CSV |
| `text/tab-separated-values` | TSV |
| `text/turtle` *(default for CONSTRUCT/DESCRIBE)* | Turtle |
| `application/n-triples` | N-Triples |
| `application/ld+json` | JSON-LD |
```bash
curl -G http://localhost:7878/sparql \
-H "Accept: text/csv" \
--data-urlencode "query=SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
```
## Authentication
If `PG_RIPPLE_HTTP_AUTH_TOKEN` is set, every request must include the token:
```bash
curl -G http://localhost:7878/sparql \
-H "Authorization: Bearer my-secret-token" \
--data-urlencode "query=SELECT * WHERE { ?s ?p ?o } LIMIT 5"
```
Both `Authorization: Bearer <token>` and `Authorization: Basic <token>` are accepted.
## Docker Compose
The root `docker-compose.yml` runs both PostgreSQL and `pg_ripple_http` together:
```bash
docker compose up
```
Services:
| `postgres` | 5432 | PostgreSQL 18 + pg_ripple |
| `sparql` | 7878 | SPARQL HTTP endpoint |
---
## Datalog API
Since v0.39.0, `pg_ripple_http` also exposes a `/datalog` REST namespace that lets any HTTP client manage Datalog rule sets, trigger inference, run goal-directed queries, check integrity constraints, and inspect monitoring statistics — without a PostgreSQL driver.
All Datalog endpoints accept and return `application/json` (unless specified otherwise). Rule text uses `text/x-datalog` or `text/plain`.
### Authentication
The same `PG_RIPPLE_HTTP_AUTH_TOKEN` bearer token covers all `/datalog/*` endpoints. Optionally, set `PG_RIPPLE_HTTP_DATALOG_WRITE_TOKEN` to restrict mutating endpoints (`POST`, `PUT`, `DELETE`) to a separate token while keeping read endpoints (inference triggers, monitoring, GET) covered by the main token.
### Phase 1 — Rule management
#### `POST /datalog/rules/{rule_set}`
Load rules from Datalog text. Body: `text/x-datalog`.
```bash
curl -X POST http://localhost:7878/datalog/rules/my-ontology \
-H "Content-Type: text/x-datalog" \
-d 'ancestor(?x, ?y) :- parent(?x, ?y).
ancestor(?x, ?z) :- parent(?x, ?y), ancestor(?y, ?z).'
# → {"rule_set": "my-ontology", "rules_loaded": 2}
```
#### `POST /datalog/rules/{rule_set}/builtin`
Load a built-in rule set (`rdfs`, `owl-rl`).
```bash
curl -X POST http://localhost:7878/datalog/rules/rdfs/builtin
# → {"rule_set": "rdfs", "rules_loaded": 13}
```
#### `GET /datalog/rules`
List all rule sets and their rules.
```bash
curl http://localhost:7878/datalog/rules
# → [{"id": 1, "rule_set": "my-ontology", "rule_text": "…", "active": true}, …]
```
#### `DELETE /datalog/rules/{rule_set}`
Delete all rules in a rule set.
```bash
curl -X DELETE http://localhost:7878/datalog/rules/my-ontology
# → {"deleted": 2}
```
#### `POST /datalog/rules/{rule_set}/add`
Add a single rule to an existing rule set. Body: `text/x-datalog`.
```bash
curl -X POST http://localhost:7878/datalog/rules/my-ontology/add \
-H "Content-Type: text/x-datalog" \
-d 'sibling(?x, ?y) :- parent(?p, ?x), parent(?p, ?y).'
# → {"rule_set": "my-ontology", "rule_id": 3}
```
#### `DELETE /datalog/rules/{rule_set}/{rule_id}`
Remove a single rule by ID (triggers DRed retraction).
```bash
curl -X DELETE http://localhost:7878/datalog/rules/my-ontology/3
# → {"removed": 1}
```
#### `PUT /datalog/rules/{rule_set}/enable`
Activate a rule set.
```bash
curl -X PUT http://localhost:7878/datalog/rules/my-ontology/enable
# → {"rule_set": "my-ontology", "enabled": true}
```
#### `PUT /datalog/rules/{rule_set}/disable`
Deactivate a rule set.
```bash
curl -X PUT http://localhost:7878/datalog/rules/my-ontology/disable
# → {"rule_set": "my-ontology", "enabled": false}
```
### Phase 2 — Inference
#### `POST /datalog/infer/{rule_set}`
Materialize derived triples (semi-naive evaluation).
```bash
curl -X POST http://localhost:7878/datalog/infer/my-ontology
# → {"derived": 42}
```
#### `POST /datalog/infer/{rule_set}/stats`
Infer with detailed per-stratum statistics.
```bash
curl -X POST http://localhost:7878/datalog/infer/my-ontology/stats
# → {"derived": 42, "iterations": 3, "eliminated_rules": 0, "parallel_groups": 2, …}
```
#### `POST /datalog/infer/{rule_set}/agg`
Aggregate-aware inference (Datalog^agg).
```bash
curl -X POST http://localhost:7878/datalog/infer/my-ontology/agg
# → {"derived": 12}
```
#### `POST /datalog/infer/{rule_set}/wfs`
Well-Founded Semantics inference (three-valued).
```bash
curl -X POST http://localhost:7878/datalog/infer/my-ontology/wfs
# → {"derived": 8}
```
#### `POST /datalog/infer/{rule_set}/demand`
Demand-transformed (goal-directed) inference. Body: JSON.
```bash
curl -X POST http://localhost:7878/datalog/infer/my-ontology/demand \
-H "Content-Type: application/json" \
-d '{"demands": [{"predicate": "ancestor", "bound": [0]}]}'
# → {"derived": 12, "iterations": 2, "demand_predicates": ["ancestor_bf"]}
```
#### `POST /datalog/infer/{rule_set}/lattice`
Lattice-based inference (Datalog^L). Body: JSON.
```bash
curl -X POST http://localhost:7878/datalog/infer/my-ontology/lattice \
-H "Content-Type: application/json" \
-d '{"lattice": "min"}'
# → {"derived": 5}
```
### Phase 3 — Query & constraints
#### `POST /datalog/query/{rule_set}`
Goal-directed query via magic sets. Body: Datalog goal text.
```bash
curl -X POST http://localhost:7878/datalog/query/my-ontology \
-H "Content-Type: text/x-datalog" \
-d 'ancestor(ex:alice, ?y).'
# → {"derived": 5, "iterations": 2, "matching": [{"y": "http://example.org/bob"}, …]}
```
#### `GET /datalog/constraints`
Check all constraint rules; return violations.
```bash
curl http://localhost:7878/datalog/constraints
# → [{"rule": "no_self_parent", "violated": false}, …]
```
#### `GET /datalog/constraints/{rule_set}`
Check constraints for a specific rule set.
```bash
curl http://localhost:7878/datalog/constraints/my-ontology
```
### Phase 4 — Admin & monitoring
#### `GET /datalog/stats/cache`
Rule plan cache statistics.
```bash
curl http://localhost:7878/datalog/stats/cache
# → {"size": 12, "hits": 340, "misses": 8, …}
```
#### `GET /datalog/stats/tabling`
Tabling/memoization cache statistics.
```bash
curl http://localhost:7878/datalog/stats/tabling
# → {"entries": 100, "hit_rate": 0.82, …}
```
#### `GET /datalog/lattices`
List registered lattice types.
```bash
curl http://localhost:7878/datalog/lattices
# → [{"name": "min", "join_fn": "LEAST", "bottom": "Infinity"}, …]
```
#### `POST /datalog/lattices`
Register a new lattice type. Body: JSON.
```bash
curl -X POST http://localhost:7878/datalog/lattices \
-H "Content-Type: application/json" \
-d '{"name": "my_min", "join_fn": "my_schema.my_min", "bottom": "9999"}'
# → {"created": "my_min"}
```
#### `GET /datalog/views`
List all Datalog materialized views.
```bash
curl http://localhost:7878/datalog/views
# → [{"name": "ancestor_view", "goal": "ancestor(?x, ?y).", …}, …]
```
#### `POST /datalog/views`
Create a Datalog materialized view. Body: JSON.
```bash
curl -X POST http://localhost:7878/datalog/views \
-H "Content-Type: application/json" \
-d '{"name": "ancestor_view", "goal": "ancestor(?x, ?y).", "rule_set": "my-ontology"}'
# → {"created": "ancestor_view"}
```
#### `DELETE /datalog/views/{name}`
Drop a Datalog materialized view.
```bash
curl -X DELETE http://localhost:7878/datalog/views/ancestor_view
# → {"dropped": "ancestor_view"}
```
### Error codes
| `400` | `datalog_parse_error` | Malformed Datalog rule text |
| `400` | `datalog_goal_error` | Invalid goal pattern |
| `400` | `invalid_request` | Missing body, wrong content-type, non-numeric `rule_id` |
| `404` | `rule_set_not_found` | Infer/drop on a nonexistent rule set |
| `401` | — | Missing or invalid `Authorization` token |
| `503` | `service_unavailable` | Connection pool exhausted |
All error responses include a `trace_id` field for log correlation.