docspec-http 1.5.0

HTTP API server for DocSpec document conversion

docspec-http

HTTP API server that converts markdown or HTML to BlockNote JSON (the default), HTML, or oxa.dev JSON, with the output format selected by the Accept header.

Send markdown (Content-Type: text/markdown) or HTML (Content-Type: text/html) and receive BlockNote JSON (default), HTML (Accept: text/html), or oxa.dev JSON (Accept: application/vnd.oxa+json). The underlying DocSpec pipeline is streaming, but this v1 HTTP wrapper buffers the request body and the conversion output in memory before responding. End-to-end streaming over HTTP is planned for a future version. For now, the largest document the server can convert is bounded by available memory.

HTML is paragraph-only. The HTML reader currently parses <p> elements only, and the HTML writer currently emits only paragraph events. Other HTML input elements and non-paragraph output events (headings, lists, tables, formatting, etc.) are silently dropped. See docspec-html-reader and docspec-html-writer.
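
Because non-paragraph elements are dropped silently, it can help to inspect HTML input before sending it. The following is a minimal client-side sketch, not part of docspec-http; NonParagraphTagFinder and tags_likely_dropped are hypothetical names, and the check only lists tags other than <p>. It does not reproduce the reader's actual parsing.

```python
from html.parser import HTMLParser


class NonParagraphTagFinder(HTMLParser):
    """Collects every start tag other than <p> seen in the input."""

    def __init__(self):
        super().__init__()
        self.dropped = []  # tag names in first-seen order, no duplicates

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this hook.
        if tag != "p" and tag not in self.dropped:
            self.dropped.append(tag)


def tags_likely_dropped(html_text):
    """Return the non-<p> tag names found in html_text."""
    finder = NonParagraphTagFinder()
    finder.feed(html_text)
    finder.close()
    return finder.dropped
```

An empty result means the input contains only <p> elements. Any other result names the tags whose content is at risk of being dropped.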

Quick Start

cargo build -p docspec-http --bin docspec-http --release
./target/release/docspec-http --port 3000

The default host is 127.0.0.1 and the default port is 3000. To bind to another address or port, pass --host and --port:

./target/release/docspec-http --host 0.0.0.0 --port 8080

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /conversion | Convert markdown or HTML to BlockNote (default), HTML, or oxa.dev JSON |
| OPTIONS | /conversion | Preflight / allowed methods |
| GET | /health | Liveness check |
| HEAD | /health | Liveness check (no body) |
| OPTIONS | /health | Allowed methods |

curl Examples

# Convert markdown to BlockNote JSON (default)
curl -X POST \
     -H 'Content-Type: text/markdown' \
     --data '# Hello World' \
     http://localhost:3000/conversion

# Convert HTML to BlockNote JSON
curl -X POST \
     -H 'Content-Type: text/html' \
     --data '<p>Hello World</p>' \
     http://localhost:3000/conversion

# Convert markdown to HTML
curl -X POST \
     -H 'Content-Type: text/markdown' \
     -H 'Accept: text/html' \
     --data 'Hello World' \
     http://localhost:3000/conversion

# Convert markdown to oxa.dev JSON (opt-in via Accept)
curl -X POST \
     -H 'Content-Type: text/markdown' \
     -H 'Accept: application/vnd.oxa+json' \
     --data 'Hello World' \
     http://localhost:3000/conversion

# Convert HTML to oxa.dev JSON
curl -X POST \
     -H 'Content-Type: text/html' \
     -H 'Accept: application/vnd.oxa+json' \
     --data '<p>Hello World</p>' \
     http://localhost:3000/conversion

# Check server health
curl http://localhost:3000/health

# HEAD health check (no body in response)
curl -I http://localhost:3000/health

# OPTIONS — see allowed methods
curl -X OPTIONS -i http://localhost:3000/conversion
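
The conversion requests above can also be sketched with the Python standard library. This is an illustrative client, not something shipped with docspec-http: BASE_URL, build_conversion_request, and convert are hypothetical names, and the 30-second timeout is an arbitrary client-side choice.

```python
import urllib.request

BASE_URL = "http://localhost:3000"  # assumption: server started as in Quick Start


def build_conversion_request(document, content_type="text/markdown", accept=None):
    """Build (but do not send) a POST /conversion request."""
    headers = {"Content-Type": content_type}
    if accept is not None:
        # Omitting Accept selects the default BlockNote JSON output.
        headers["Accept"] = accept
    return urllib.request.Request(
        BASE_URL + "/conversion",
        data=document.encode("utf-8"),
        headers=headers,
        method="POST",
    )


def convert(document, content_type="text/markdown", accept=None, timeout=30):
    """Send the request and return (status, content_type, body_bytes)."""
    request = build_conversion_request(document, content_type, accept)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.headers.get("Content-Type"), response.read()
```

For example, convert("# Hello World", accept="text/html") corresponds to the "Convert markdown to HTML" curl call. urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, and the error body can be read from that exception with .read().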

Request / Response Headers

X-Request-ID: Generated (UUID v4) if the request omits it. Echoed back unchanged if present.

X-Trace-ID: Echoed back if present. Never generated by the server.

Cache-Control: max-age=0, private, must-revalidate on every response, including errors.
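
The three header rules above can be summarized in a short sketch. This is not the server's implementation; standard_response_headers is a hypothetical name, and treating header names case-insensitively is an assumption based on normal HTTP semantics.

```python
import uuid

CACHE_CONTROL = "max-age=0, private, must-revalidate"


def standard_response_headers(request_headers):
    """Return the headers every API response carries, per the rules above."""
    # Header names are case-insensitive, so normalise the keys first.
    incoming = {name.lower(): value for name, value in request_headers.items()}
    response = {"Cache-Control": CACHE_CONTROL}
    if "x-request-id" in incoming:
        # Echoed back unchanged when the client supplied one.
        response["X-Request-ID"] = incoming["x-request-id"]
    else:
        # Otherwise a fresh UUID v4 is generated.
        response["X-Request-ID"] = str(uuid.uuid4())
    if "x-trace-id" in incoming:
        # Echoed only; never generated.
        response["X-Trace-ID"] = incoming["x-trace-id"]
    return response
```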

Error Responses

All errors use RFC 7807 Problem Details JSON (application/problem+json; charset=utf-8).

| Code | Meaning |
| --- | --- |
| 400 | Empty body or invalid UTF-8 |
| 404 | Unknown path |
| 405 | Wrong method (response includes Allow header) |
| 406 | Accept header excludes all supported output types |
| 415 | Content-Type must be text/markdown or text/html |
| 422 | Input parse error (malformed markdown or HTML) |
| 500 | Internal conversion error |
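
A client can turn a Problem Details body into a readable message along these lines. The sketch relies only on the member names RFC 7807 defines (title, status, detail); which members docspec-http actually populates is not stated here, so each one is read defensively. summarize_problem is a hypothetical helper name.

```python
import json


def summarize_problem(body_bytes):
    """Turn an application/problem+json body into a one-line summary."""
    problem = json.loads(body_bytes.decode("utf-8"))
    # RFC 7807 makes every member optional, so fall back when one is missing.
    status = problem.get("status", "?")
    title = problem.get("title", "Unknown problem")
    detail = problem.get("detail")
    summary = f"{status} {title}"
    return f"{summary}: {detail}" if detail else summary
```

Given an invented body such as {"title": "Unsupported Media Type", "status": 415}, the helper returns "415 Unsupported Media Type".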

Supported Accept values for /conversion: text/html (HTML), application/vnd.oxa+json (oxa.dev), application/vnd.docspec.blocknote+json (BlockNote), application/vnd.blocknote+json (BlockNote alias), application/*, or */*. Wildcards and a missing Accept header default to BlockNote for backward compatibility. Anything else returns 406.
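
The Accept rules above amount to a small lookup. The sketch below is one possible reading, not the server's code: negotiate_output is a hypothetical name, and it takes the first supported media range in header order, ignoring q-values. How the server ranks several acceptable types is not documented here.

```python
BLOCKNOTE = "application/vnd.docspec.blocknote+json"
OXA = "application/vnd.oxa+json"
HTML = "text/html"

# Media ranges from the list above, mapped to the output format they select.
SUPPORTED = {
    HTML: HTML,
    OXA: OXA,
    BLOCKNOTE: BLOCKNOTE,
    "application/vnd.blocknote+json": BLOCKNOTE,  # alias
    "application/*": BLOCKNOTE,  # wildcards fall back to the default
    "*/*": BLOCKNOTE,
}


def negotiate_output(accept_header):
    """Return the output media type, or None when the answer is 406."""
    # Assumption: an empty header is treated like a missing one.
    if accept_header is None or not accept_header.strip():
        return BLOCKNOTE
    for item in accept_header.split(","):
        # Drop parameters such as ";q=0.8" (a simplification of this sketch).
        media_range = item.split(";", 1)[0].strip().lower()
        if media_range in SUPPORTED:
            return SUPPORTED[media_range]
    return None
```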

Deployment Notes

TLS: Use a reverse proxy (nginx, Caddy). The server speaks plain HTTP.

CORS: Use a reverse proxy. No CORS headers are added.

Auth: Use a reverse proxy or upstream gateway.

Body size: No limit. Large documents are accepted. DoS risk is accepted. Both the request body and the conversion output are held in memory for the duration of the request.

Request timeout: No timeout. Slow clients can hang a connection indefinitely.

Logging

Logs go to stderr at INFO level in pretty format. There are no flags to change the log level or format.

Observability

docspec-http integrates with Sentry for error reporting. Activation is fully opt-in via environment variables — the binary has zero Sentry overhead when no DSN is configured.

Activation

Set ONE of the following to enable Sentry:

  • DOCSPEC_SENTRY_DSN — docspec-specific override (preferred)
  • SENTRY_DSN — Sentry's standard convention (fallback)

If both are set, DOCSPEC_SENTRY_DSN wins. An empty string or malformed DSN is treated as "not set" — the server starts normally and logs a warning to stderr.
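
The precedence rule can be sketched as follows. Two points are assumptions rather than documented behaviour: the shape check in looks_like_dsn (scheme, public key, host, and project path) is a loose stand-in for real DSN validation, and an empty or malformed DOCSPEC_SENTRY_DSN is taken to fall through to SENTRY_DSN. Both function names are hypothetical.

```python
from urllib.parse import urlparse


def looks_like_dsn(value):
    """Loose shape check: scheme://public_key@host/project_id."""
    try:
        parsed = urlparse(value)
        return (
            parsed.scheme in ("http", "https")
            and bool(parsed.username)
            and bool(parsed.hostname)
            and parsed.path.strip("/") != ""
        )
    except ValueError:
        # urlparse rejects some malformed inputs outright.
        return False


def resolve_sentry_dsn(env):
    """Return the DSN to use, or None when Sentry stays disabled."""
    # DOCSPEC_SENTRY_DSN is checked first, so it wins when both are set.
    for name in ("DOCSPEC_SENTRY_DSN", "SENTRY_DSN"):
        value = env.get(name, "").strip()
        if value and looks_like_dsn(value):
            return value
    return None
```

Passing os.environ as env mirrors the runtime lookup. In keeping with the Privacy section, the returned value should not be logged.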

Configuration (all optional)

These follow Sentry's standard conventions:

  • SENTRY_ENVIRONMENT — environment name (default: production)
  • SENTRY_RELEASE — release identifier (default: auto, docspec-http@<version>)
  • SENTRY_SAMPLE_RATE — error sample rate [0.0, 1.0] (default: 1.0)
  • SENTRY_TRACES_SAMPLE_RATE — performance trace sample rate [0.0, 1.0] (default: 0.0, traces disabled)
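
The defaults listed above can be captured in a small configuration reader. This is a sketch under stated assumptions: how docspec-http treats unparsable or out-of-range rates is not documented here, so this version falls back to the default in both cases. read_sample_rate and read_sentry_config are hypothetical names.

```python
def read_sample_rate(env, name, default):
    """Parse a rate in [0.0, 1.0] from env, falling back to default."""
    raw = env.get(name)
    if raw is None:
        return default
    try:
        rate = float(raw)
    except ValueError:
        return default  # assumption: unparsable values are ignored
    # Written as "not (a <= x <= b)" so that NaN is rejected as well.
    if not 0.0 <= rate <= 1.0:
        return default  # assumption: out-of-range values are ignored
    return rate


def read_sentry_config(env, version):
    """Collect the optional Sentry settings with their documented defaults."""
    return {
        "environment": env.get("SENTRY_ENVIRONMENT", "production"),
        "release": env.get("SENTRY_RELEASE", f"docspec-http@{version}"),
        "sample_rate": read_sample_rate(env, "SENTRY_SAMPLE_RATE", 1.0),
        "traces_sample_rate": read_sample_rate(env, "SENTRY_TRACES_SAMPLE_RATE", 0.0),
    }
```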

What is captured

| Signal | Captured? |
| --- | --- |
| 500 Internal Server Error (HttpError::Internal) | yes (event) |
| 422 Unprocessable Entity (HttpError::Unprocessable) | yes (event) |
| Other 4xx responses | no |
| Panics | yes (event) |
| tracing::error! calls | yes (event) |
| tracing::warn! calls | yes (breadcrumb) |
| tracing::info!/debug! calls | yes (breadcrumb) |
| Performance transactions | only if SENTRY_TRACES_SAMPLE_RATE > 0 |

Privacy

docspec-http does NOT send the following to Sentry:

  • Request bodies (markdown or HTML documents)
  • Response bodies (BlockNote JSON, HTML, or oxa.dev JSON)
  • PII (Sentry default: send_default_pii = false)
  • DSN values (never logged or echoed)

Sentry's default header redaction (Authorization, Cookie, etc.) is preserved.

Each captured event is tagged with request_id (UUID v4) and trace_id (X-Trace-ID header value, if present) for correlation with logs.

Wire Contract

Mirrors github.com/docspecio/api v3.0.2 where feasible: same endpoint path, RFC 7807 errors, X-Request-ID/X-Trace-ID header handling. Diverges in supported conversions.

Graceful Shutdown

The server handles SIGINT and SIGTERM. In-flight requests complete before the process exits.

Docker

Build

DOCKER_BUILDKIT=1 docker build \
  --build-arg IMAGE_VERSION=0.1.0 \
  --build-arg IMAGE_REVISION=$(git rev-parse HEAD) \
  -t docspec-http:local .

Supply IMAGE_VERSION and IMAGE_REVISION at build time to populate the OCI labels. If omitted, they default to 0.1.0 and unknown, respectively.

Run

docker run --rm -p 3000:3000 ghcr.io/docspec/api:0.1.0

The default CMD passes --host 0.0.0.0 --port 3000. Override it entirely to change the bind address or port:

docker run --rm -p 8080:8080 ghcr.io/docspec/api:0.1.0 --host 0.0.0.0 --port 8080

Healthcheck

The image ships a built-in HEALTHCHECK that probes GET http://127.0.0.1:3000/health every 30 seconds using busybox wget --spider. Docker reports the container status in docker ps and Compose surfaces it via healthcheck:.

The probe port is hardcoded to 3000 inside the image. If you override CMD to bind a different --port, the built-in healthcheck will keep probing 3000 and report the container as unhealthy even though the server is fine. To run on a non-default port, either:

  • Keep the in-container port at 3000 and only remap the host port (-p 8080:3000), or
  • Override the healthcheck at runtime, e.g. docker run --health-cmd='wget --no-verbose --tries=1 --spider http://127.0.0.1:8080/health || exit 1' …, or
  • Disable it with docker run --no-healthcheck … and rely on an external probe.
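
For the external-probe option, a probe can be as small as the sketch below. It is illustrative only: is_healthy is a hypothetical name, the 2-second timeout is arbitrary, and treating any 2xx status as healthy is an assumption, since the exact /health status code is not stated here.

```python
import urllib.error
import urllib.request


def is_healthy(base_url, timeout=2.0):
    """Return True when HEAD <base_url>/health answers with a 2xx status."""
    request = urllib.request.Request(base_url.rstrip("/") + "/health", method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an error status all count as unhealthy.
        return False
```

Calling is_healthy("http://127.0.0.1:8080") probes a server started with --port 8080, which sidesteps the port hardcoded in the image's own healthcheck.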

Kubernetes does not run the Docker HEALTHCHECK, so Kubernetes users should configure a Pod-level httpGet liveness probe on /health, port 3000, instead.

Image tags

Images are published to ghcr.io/docspec/api by the release workflow (managed by release-please). The following tags are maintained:

| Tag | Meaning |
| --- | --- |
| 0.1.0 | Exact version |
| 0.1 | Latest patch of 0.1 |
| 0 | Latest minor of 0 |
| latest | Most recent released version |

latest follows the most recent GitHub release, not the main branch. The publish workflow is a documented contract; it is not implemented in this repository.
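
The tagging scheme can be expressed as a function of the release version. This is a sketch of the scheme in the table above, not the release workflow itself; image_tags is a hypothetical name, and it assumes the given version is the newest release, since only then do the floating tags move to it.

```python
def image_tags(version):
    """Expand "MAJOR.MINOR.PATCH" into the tags from the table above."""
    major, minor, patch = version.split(".")
    return [
        f"{major}.{minor}.{patch}",  # exact version
        f"{major}.{minor}",  # latest patch of this minor line
        major,  # latest minor of this major line
        "latest",  # most recent released version
    ]
```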

Architecture

The image is built for linux/amd64 only. No multi-platform manifest is published.

User

The container runs as non-root UID/GID 10001 (user docspec). No capabilities are required.

Reverse proxy

TLS termination, CORS headers, authentication, and rate limiting are intentionally absent from the binary. Place a reverse proxy (nginx, Caddy, etc.) in front of the container for these concerns. See Deployment Notes for details.

Metrics

docspec-http exposes a Prometheus metrics endpoint on the same port as the main API.

Endpoint: GET /metrics

Format: Prometheus exposition format 0.0.4 (text/plain; version=0.0.4; charset=utf-8)

Auth: None. The endpoint is internal-only. See Security below.

Metric Catalog

| Name | Type | Labels | Description | Buckets |
| --- | --- | --- | --- | --- |
| docspec_http_requests_total | counter | method, path, status | Total HTTP requests received | |
| docspec_http_request_duration_seconds | histogram | method, path, status | HTTP request latency in seconds | 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0 |
| docspec_http_request_body_bytes | histogram | input_mime_type | HTTP request body size in bytes, labeled by input MIME type | 100, 200, 400, 800, 1600, 3200, 6400, 12800, 25600, 51200, 102400, 204800 |
| docspec_conversions_total | counter | result, error_class, input_mime_type, output_mime_type | Total document conversions, labeled by result, error class, and input/output MIME type | |
| docspec_conversion_duration_seconds | histogram | result, input_mime_type, output_mime_type | Document conversion duration in seconds, labeled by result and input/output MIME type | 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0 |
| docspec_conversion_output_bytes (NEW) | histogram | input_mime_type, output_mime_type | Document conversion output size in bytes (success only) | 100, 200, 400, 800, 1600, 3200, 6400, 12800, 25600, 51200, 102400, 204800 |
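
The byte-size buckets follow a doubling pattern starting at 100. The sketch below regenerates them and shows which bound a given size first falls under; BYTE_BUCKETS and bucket_upper_bound are hypothetical names used for illustration only.

```python
import bisect
import math

# 12 byte-size buckets: 100 doubled eleven times (100 ... 204800).
BYTE_BUCKETS = [100 * 2**i for i in range(12)]


def bucket_upper_bound(size_bytes):
    """Return the smallest bucket bound ("le") that is >= size_bytes."""
    index = bisect.bisect_left(BYTE_BUCKETS, size_bytes)
    if index == len(BYTE_BUCKETS):
        return math.inf  # counted only in the implicit +Inf bucket
    return BYTE_BUCKETS[index]
```

Prometheus buckets are cumulative, so an observation is also counted in every larger bucket. Sizes above 204800 bytes land only in +Inf, which means quantile estimates for these histograms cannot exceed 204800.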

Label Values

result: success, client_error, server_error

error_class: body_not_utf8, empty_body, internal, method_not_allowed, not_acceptable, not_found, unprocessable, unsupported_media_type, none (only when result=success)

input_mime_type: text/markdown (the request's Content-Type matched the markdown reader), text/html (the request's Content-Type matched the HTML reader), unsupported (Content-Type header present but not a supported input format), none (Content-Type header absent).

output_mime_type: application/vnd.docspec.blocknote+json (conversion succeeded; output produced by the BlockNote writer), text/html (conversion succeeded; output produced by the HTML writer), application/vnd.oxa+json (conversion succeeded; output produced by the oxa.dev writer), none (no output produced — any error path).

path: matched route template (/conversion, /health) or unknown for fallback handlers

status: numeric HTTP status code as a string (e.g., "200", "422")

method: HTTP method as a string (e.g., "GET", "POST")
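
As an illustration of how a raw header collapses into a bounded label, the input_mime_type rules can be sketched like this. It is not the server's code: input_mime_label is a hypothetical name, and ignoring parameters such as charset and comparing case-insensitively are assumptions.

```python
def input_mime_label(content_type):
    """Map a raw Content-Type header to one of the four label values."""
    if content_type is None:
        return "none"  # header absent
    # Assumption: parameters such as "; charset=utf-8" are ignored and the
    # comparison is case-insensitive.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in ("text/markdown", "text/html"):
        return media_type
    # Any other value is folded into one label, never used verbatim.
    return "unsupported"
```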

Cardinality Guarantees

path is bounded to {"/conversion", "/health", "unknown"}, error_class to 9 values, and result to 3 values. input_mime_type is bounded to 4 values (text/markdown, text/html, unsupported, none), as is output_mime_type (application/vnd.docspec.blocknote+json, text/html, application/vnd.oxa+json, none). Both MIME-type labels come from a fixed set of &'static str constants in the source, never from raw header values. Per-request identifiers (X-Request-ID, X-Trace-ID) are never used as labels.
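
These bounds give a loose ceiling on the number of label combinations per metric. The worked example below multiplies the stated value counts; LABEL_SIZES and max_label_combinations are hypothetical names, and method and status are left out because no bound is stated for them here.

```python
import math

# Distinct values per label, taken from the bounds stated above.
LABEL_SIZES = {
    "path": 3,
    "result": 3,
    "error_class": 9,
    "input_mime_type": 4,
    "output_mime_type": 4,
}


def max_label_combinations(labels):
    """Loose upper bound: the product of the per-label value counts."""
    return math.prod(LABEL_SIZES[name] for name in labels)
```

For docspec_conversions_total (result, error_class, input_mime_type, output_mime_type) the ceiling is 3 × 9 × 4 × 4 = 432. The real count is lower, because many combinations cannot occur: error_class is none only when result is success, for example.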

Scrape Model

Each pod maintains its own in-memory metrics. Prometheus scrapes each pod independently. No inter-pod communication is required. Aggregate across pods using PromQL.

Upkeep runs every 5 seconds, keeping histogram internal state bounded.

The /metrics route is mounted outside the API middleware stack, so it does not include the global Cache-Control header used by API responses.

The body-size histogram (docspec_http_request_body_bytes) only records bodies that passed Content-Type and Accept validation. Rejected requests are not counted.

The output-bytes histogram (docspec_conversion_output_bytes) only records observations for successful conversions. Failed conversions do not produce output, so no observation is recorded.

Example PromQL Queries

Per-pod request rate:

rate(docspec_http_requests_total[5m])

Aggregate p99 latency across all pods:

histogram_quantile(0.99, sum by (le) (rate(docspec_http_request_duration_seconds_bucket[5m])))

Error rate broken down by error class:

sum by (error_class) (rate(docspec_conversions_total{result!="success"}[5m]))

Body-size p95:

histogram_quantile(0.95, sum by (le) (rate(docspec_http_request_body_bytes_bucket[5m])))

Body-size p95 by input format:

histogram_quantile(0.95, sum by (le, input_mime_type) (rate(docspec_http_request_body_bytes_bucket[5m])))

Conversion success rate by input format:

sum by (input_mime_type) (rate(docspec_conversions_total{result="success"}[5m]))
  / sum by (input_mime_type) (rate(docspec_conversions_total[5m]))
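
To make the p95 and p99 queries above easier to interpret, here is a simplified sketch of the linear interpolation that histogram_quantile applies to classic cumulative buckets. It is an approximation of the idea, not Prometheus's implementation: it takes plain counts instead of per-second rates, assumes the first bucket starts at 0, and omits most edge cases.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) pairs.

    buckets must be sorted by upper bound and end with (math.inf, total).
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, nothing to estimate
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank lands in +Inf: report the highest finite bound.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return prev_bound
            # Linear interpolation inside the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound
```

With cumulative counts of 10, 30, and 40 at bounds 100, 200, and 400, the median falls halfway through the second bucket and the estimate is 150. The result is only as precise as the bucket layout, so values near a bucket boundary should be read as estimates.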

Security

/metrics has no authentication. It's intended for internal scraping only. Deploy behind a private overlay network or a Kubernetes NetworkPolicy that restricts access to your Prometheus pods.