docspec-http 1.0.1

HTTP API server for DocSpec document conversion

docspec-http

HTTP API server for DocSpec markdown to BlockNote JSON conversion.

Send markdown, receive BlockNote JSON. The underlying DocSpec pipeline is streaming, but this v1 HTTP wrapper buffers the request body and the conversion output in memory before responding. End-to-end streaming over HTTP is planned for a future version. For now, memory use grows with request size, so the largest document the server can convert is bounded by available memory.

Quick Start

cargo build -p docspec-http --bin docspec-http --release
./target/release/docspec-http --port 3000

The default host is 127.0.0.1 and the default port is 3000. To bind a different address or port, pass --host and --port:

./target/release/docspec-http --host 0.0.0.0 --port 8080

Endpoints

Method   Path         Description
POST     /conversion  Convert markdown to BlockNote JSON
OPTIONS  /conversion  Preflight / allowed methods
GET      /health      Liveness check
HEAD     /health      Liveness check (no body)
OPTIONS  /health      Allowed methods

curl Examples

# Convert markdown to BlockNote JSON
curl -X POST \
     -H 'Content-Type: text/markdown' \
     --data '# Hello World' \
     http://localhost:3000/conversion

# Check server health
curl http://localhost:3000/health

# HEAD health check (no body in response)
curl -I http://localhost:3000/health

# OPTIONS — see allowed methods
curl -X OPTIONS -i http://localhost:3000/conversion
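
The same conversion call can be made from code. The following is a minimal Python sketch using only the standard library. The function names build_conversion_request and convert are illustrative and not part of docspec-http, and the base URL assumes the Quick Start defaults.

```python
import json
import urllib.request


def build_conversion_request(
    markdown: str, base_url: str = "http://localhost:3000"
) -> urllib.request.Request:
    """Build (but do not send) a POST /conversion request."""
    return urllib.request.Request(
        url=f"{base_url}/conversion",
        data=markdown.encode("utf-8"),  # the server rejects invalid UTF-8 with 400
        method="POST",
        headers={
            "Content-Type": "text/markdown",
            # One of the output types the server lists as supported.
            "Accept": "application/vnd.docspec.blocknote+json",
        },
    )


def convert(markdown: str, base_url: str = "http://localhost:3000"):
    """Send the request and decode the JSON response body.

    Non-2xx responses raise urllib.error.HTTPError.
    """
    with urllib.request.urlopen(build_conversion_request(markdown, base_url)) as resp:
        return json.load(resp)
```

With a server running locally, convert("# Hello World") should return the decoded BlockNote JSON document.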

Request / Response Headers

X-Request-ID: Generated (UUID v4) if the request omits it. Echoed back unchanged if present.

X-Trace-ID: Echoed back if present. Never generated by the server.

Cache-Control: Set to max-age=0, private, must-revalidate on every API response, including errors. The /metrics endpoint is the exception; it is mounted outside the API middleware stack.
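
The X-Request-ID and X-Trace-ID rules above can be sketched as a small function. This is an illustration of the documented behavior, not the server's actual code; correlation_headers is a hypothetical name, and treating header names case-insensitively is an assumption.

```python
import uuid


def correlation_headers(request_headers: dict) -> dict:
    """Return the correlation headers a response would carry."""
    # Assumption: header names are matched case-insensitively.
    lowered = {name.lower(): value for name, value in request_headers.items()}

    response_headers = {}
    if "x-request-id" in lowered:
        # Present: echoed back unchanged.
        response_headers["X-Request-ID"] = lowered["x-request-id"]
    else:
        # Omitted: a UUID v4 is generated.
        response_headers["X-Request-ID"] = str(uuid.uuid4())

    if "x-trace-id" in lowered:
        # Echoed only when the client sent it; never generated.
        response_headers["X-Trace-ID"] = lowered["x-trace-id"]
    return response_headers
```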

Error Responses

All errors use RFC 7807 Problem Details JSON (application/problem+json; charset=utf-8).

Code  Meaning
400   Empty body or invalid UTF-8
404   Unknown path
405   Wrong method (response includes Allow header)
406   Accept header excludes all supported output types
415   Content-Type must be text/markdown
422   Markdown parse error
500   Internal conversion error
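
A client can recognize these errors by their media type and read the standard RFC 7807 members. The sketch below is client-side Python under stated assumptions: Problem and parse_problem are hypothetical names, and this README does not say which members docspec-http populates, so every member other than type is treated as optional.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Problem:
    status: Optional[int]
    title: Optional[str]
    detail: Optional[str]
    type: str = "about:blank"  # RFC 7807 default when "type" is absent


def parse_problem(content_type: str, body: bytes) -> Optional[Problem]:
    """Return a Problem if the response is application/problem+json, else None."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/problem+json":
        return None
    document = json.loads(body.decode("utf-8"))
    return Problem(
        status=document.get("status"),
        title=document.get("title"),
        detail=document.get("detail"),
        type=document.get("type", "about:blank"),
    )
```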

Supported Accept values for /conversion: */*, application/*, application/vnd.docspec.blocknote+json, application/vnd.blocknote+json. Anything else returns 406.
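
The Accept rule can be approximated as a membership check over the listed media ranges. This is a hedged sketch, not the server's negotiation logic: is_acceptable is a hypothetical helper, it ignores parameters such as q-values, and it does not model a request with no Accept header, which this README does not describe.

```python
SUPPORTED_ACCEPT = frozenset(
    {
        "*/*",
        "application/*",
        "application/vnd.docspec.blocknote+json",
        "application/vnd.blocknote+json",
    }
)


def is_acceptable(accept_header: str) -> bool:
    """True if any media range in the header is one the server lists as supported."""
    for part in accept_header.split(","):
        # Assumption: parameters (e.g. ";q=0.8") are ignored for matching.
        media_range = part.split(";", 1)[0].strip().lower()
        if media_range in SUPPORTED_ACCEPT:
            return True
    return False  # the server would answer 406
```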

Deployment Notes

TLS: Use a reverse proxy (nginx, Caddy). The server speaks plain HTTP.

CORS: Use a reverse proxy. No CORS headers are added.

Auth: Use a reverse proxy or upstream gateway.

Body size: No limit is enforced, so arbitrarily large documents are accepted. This is a known and accepted DoS risk: both the request body and the conversion output are held in memory for the duration of the request.

Request timeout: No timeout. A slow client can hold a connection open indefinitely.

Logging

Logs go to stderr at INFO level in pretty format. There are no flags to change the log level or format.

Observability

docspec-http integrates with Sentry for error reporting. Activation is fully opt-in via environment variables — the binary has zero Sentry overhead when no DSN is configured.

Activation

Set ONE of the following to enable Sentry:

  • DOCSPEC_SENTRY_DSN — docspec-specific override (preferred)
  • SENTRY_DSN — Sentry's standard convention (fallback)

If both are set, DOCSPEC_SENTRY_DSN wins. An empty string or malformed DSN is treated as "not set" — the server starts normally and logs a warning to stderr.
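
The precedence between the two variables can be sketched as follows. This is an illustration only: resolve_dsn is a hypothetical name, the sketch assumes an empty DOCSPEC_SENTRY_DSN falls through to SENTRY_DSN (one reading of "treated as not set"), and it does not attempt the malformed-DSN check.

```python
from typing import Mapping, Optional


def resolve_dsn(env: Mapping[str, str]) -> Optional[str]:
    """Pick the DSN to use, or None when Sentry stays disabled."""
    # DOCSPEC_SENTRY_DSN is checked first, so it wins when both are set.
    for name in ("DOCSPEC_SENTRY_DSN", "SENTRY_DSN"):
        value = env.get(name, "")
        if value:  # an empty string counts as "not set"
            return value
    return None
```

In a real process the mapping would be os.environ.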

Configuration (all optional)

These follow Sentry's standard conventions:

  • SENTRY_ENVIRONMENT — environment name (default: production)
  • SENTRY_RELEASE — release identifier (default: generated automatically as docspec-http@<version>)
  • SENTRY_SAMPLE_RATE — error sample rate [0.0, 1.0] (default: 1.0)
  • SENTRY_TRACES_SAMPLE_RATE — performance trace sample rate [0.0, 1.0] (default: 0.0, traces disabled)

What is captured

Signal                                               Captured?
500 Internal Server Error (HttpError::Internal)      yes (event)
422 Unprocessable Entity (HttpError::Unprocessable)  yes (event)
Other 4xx responses                                  no
Panics                                               yes (event)
tracing::error! calls                                yes (event)
tracing::warn! calls                                 yes (breadcrumb)
tracing::info!/debug! calls                          yes (breadcrumb)
Performance transactions                             only if SENTRY_TRACES_SAMPLE_RATE > 0

Privacy

docspec-http does NOT send the following to Sentry:

  • Request bodies (markdown documents)
  • Response bodies (BlockNote JSON)
  • PII (Sentry default: send_default_pii = false)
  • DSN values (never logged or echoed)

Sentry's default header redaction (Authorization, Cookie, etc.) is preserved.

Each captured event is tagged with request_id (UUID v4) and trace_id (X-Trace-ID header value, if present) for correlation with logs.

Wire Contract

Mirrors github.com/docspecio/api v3.0.2 where feasible: same endpoint path, RFC 7807 errors, X-Request-ID/X-Trace-ID header handling. Diverges in input MIME (text/markdown vs DOCX) and adds Cache-Control on all responses.

Graceful Shutdown

The server handles SIGINT and SIGTERM. In-flight requests complete before the process exits.

Docker

Build

DOCKER_BUILDKIT=1 docker build \
  --build-arg IMAGE_VERSION=0.1.0 \
  --build-arg IMAGE_REVISION=$(git rev-parse HEAD) \
  -t docspec-http:local .

Supply IMAGE_VERSION and IMAGE_REVISION at build time to populate the OCI labels. If omitted, they default to 0.1.0 and unknown, respectively.

Run

docker run --rm -p 3000:3000 ghcr.io/docspec/api:0.1.0

The default CMD passes --host 0.0.0.0 --port 3000. Override it entirely to change the bind address or port:

docker run --rm -p 8080:8080 ghcr.io/docspec/api:0.1.0 --host 0.0.0.0 --port 8080

Healthcheck

The image ships a built-in HEALTHCHECK that probes GET http://127.0.0.1:3000/health every 30 seconds using busybox wget --spider. Docker reports the container status in docker ps and Compose surfaces it via healthcheck:.

The probe port is hardcoded to 3000 inside the image. If you override CMD to bind a different --port, the built-in healthcheck will keep probing 3000 and report the container as unhealthy even though the server is fine. To run on a non-default port, either:

  • Keep the in-container port at 3000 and only remap the host port (-p 8080:3000), or
  • Override the healthcheck at runtime, e.g. docker run --health-cmd='wget --no-verbose --tries=1 --spider http://127.0.0.1:8080/health || exit 1' …, or
  • Disable it with docker run --no-healthcheck … and rely on an external probe.

Kubernetes does not run the image's Docker HEALTHCHECK, so Kubernetes users should configure a Pod-level httpGet liveness probe for /health on the container port (3000 by default) instead.
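
If the built-in healthcheck is disabled, an external probe can issue the same request. The following is a minimal sketch, not a supported tool: is_healthy is a hypothetical name, and it assumes a healthy server answers HEAD /health with a 2xx status, which this README implies but does not state.

```python
import http.client
import urllib.request


def is_healthy(base_url: str = "http://127.0.0.1:3000", timeout: float = 2.0) -> bool:
    """Probe HEAD /health and report whether the server answered with 2xx."""
    request = urllib.request.Request(f"{base_url}/health", method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and non-2xx statuses
        # (urllib raises HTTPError, an OSError subclass, for those).
        return False
```

Unlike the image's built-in probe, the port here is a parameter, so the check follows whatever --port the server was started with.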

Image tags

Images are published to ghcr.io/docspec/api by the release workflow (managed by release-please). The following tags are maintained:

Tag     Meaning
0.1.0   Exact version
0.1     Latest patch of 0.1
0       Latest minor of 0
latest  Most recent released version

latest follows the most recent GitHub release, not the main branch. The publish workflow is a documented contract; it is not implemented in this repository.
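
The tag scheme in the table can be derived mechanically from a release version. The sketch below only illustrates the scheme and is not the release workflow itself: image_tags is a hypothetical name, and it handles plain MAJOR.MINOR.PATCH versions only.

```python
def image_tags(version: str) -> list:
    """Expand a MAJOR.MINOR.PATCH version into the tags listed in the table."""
    major, minor, patch = version.split(".")  # raises ValueError on other shapes
    return [
        f"{major}.{minor}.{patch}",  # exact version
        f"{major}.{minor}",          # latest patch of this minor
        major,                       # latest minor of this major
        "latest",                    # most recent released version
    ]
```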

Architecture

The image is built for linux/amd64 only. No multi-platform manifest is published.

User

The container runs as non-root UID/GID 10001 (user docspec). No capabilities are required.

Reverse proxy

TLS termination, CORS headers, authentication, and rate limiting are intentionally absent from the binary. Place a reverse proxy (nginx, Caddy, etc.) in front of the container for these concerns. See Deployment Notes for details.

Metrics

docspec-http exposes a Prometheus metrics endpoint on the same port as the main API.

Endpoint: GET /metrics

Format: Prometheus exposition format 0.0.4 (text/plain; version=0.0.4; charset=utf-8)

Auth: None. The endpoint is internal-only. See Security below.

Metric Catalog

docspec_http_requests_total
  Type: counter
  Labels: method, path, status
  Description: Total HTTP requests received

docspec_http_request_duration_seconds
  Type: histogram
  Labels: method, path, status
  Description: HTTP request latency in seconds
  Buckets: 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0

docspec_http_request_body_bytes
  Type: histogram
  Labels: input_mime_type
  Description: HTTP request body size in bytes, labeled by input MIME type
  Buckets: 100, 200, 400, 800, 1600, 3200, 6400, 12800, 25600, 51200, 102400, 204800

docspec_conversions_total
  Type: counter
  Labels: result, error_class, input_mime_type, output_mime_type
  Description: Total document conversions, labeled by result, error class, and input/output MIME type

docspec_conversion_duration_seconds
  Type: histogram
  Labels: result, input_mime_type, output_mime_type
  Description: Document conversion duration in seconds, labeled by result and input/output MIME type
  Buckets: 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0

docspec_conversion_output_bytes (NEW)
  Type: histogram
  Labels: input_mime_type, output_mime_type
  Description: Document conversion output size in bytes (success only)
  Buckets: 100, 200, 400, 800, 1600, 3200, 6400, 12800, 25600, 51200, 102400, 204800
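
The byte-size buckets in the catalog form a doubling series from 100 to 204800. The sketch below reproduces that series and shows which bucket boundary an observation first falls under. The helper names are hypothetical and nothing here is taken from the docspec-http source.

```python
import bisect
import math


def exponential_buckets(start: int, factor: int, count: int) -> list:
    """Generate `count` bucket upper bounds, each `factor` times the previous."""
    return [start * factor**i for i in range(count)]


# 100, 200, 400, ... 204800: the boundaries listed for the byte histograms.
BYTE_BUCKETS = exponential_buckets(100, 2, 12)


def bucket_upper_bound(size_bytes: int) -> float:
    """Smallest `le` boundary that includes the observation (bounds are inclusive)."""
    index = bisect.bisect_left(BYTE_BUCKETS, size_bytes)
    return BYTE_BUCKETS[index] if index < len(BYTE_BUCKETS) else math.inf
```

Any body larger than 204800 bytes lands only in the implicit +Inf bucket, so quantile estimates for large documents cannot exceed the top boundary.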

Label Values

result: success, client_error, server_error

error_class: body_not_utf8, empty_body, internal, method_not_allowed, not_acceptable, not_found, unprocessable, unsupported_media_type, none (only when result=success)

input_mime_type: text/markdown (the request's Content-Type matched the markdown reader), unsupported (Content-Type header present but not a supported input format), none (Content-Type header absent).

output_mime_type: application/vnd.docspec.blocknote+json (conversion succeeded; output produced by the BlockNote writer), none (no output produced — any error path).

path: matched route template (/conversion, /health) or unknown for fallback handlers

status: numeric HTTP status code as a string (e.g., "200", "422")

method: HTTP method as a string (e.g., "GET", "POST")
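
The label values above can be related to the Error Responses table. The sketch below is an inference, not taken from the docspec-http source: the status for each error_class is guessed by matching names against that table, and result_label assumes result simply follows the status class.

```python
# Assumed mapping, inferred from the names in the Error Responses table.
ERROR_CLASS_STATUS = {
    "body_not_utf8": 400,
    "empty_body": 400,
    "not_found": 404,
    "method_not_allowed": 405,
    "not_acceptable": 406,
    "unsupported_media_type": 415,
    "unprocessable": 422,
    "internal": 500,
}


def result_label(status: int) -> str:
    """Map an HTTP status to a result label (assumed: 4xx client, 5xx server)."""
    if status >= 500:
        return "server_error"
    if status >= 400:
        return "client_error"
    return "success"
```

Together with none for successful conversions, the eight classes in the mapping account for the nine documented error_class values.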

Cardinality Guarantees

path is bounded to {"/conversion", "/health", "unknown"}. error_class is bounded to 9 values and result to 3 values. input_mime_type is bounded to 3 values and output_mime_type to 2 values; both come from a fixed set of &'static str constants in the source, never from raw header values. Per-request identifiers (X-Request-ID, X-Trace-ID) are never used as labels.
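
The bounding can be pictured as mapping raw request data onto fixed constants before it reaches a label. This Python sketch shows the idea only; the server itself is written in Rust, the function names are hypothetical, and stripping Content-Type parameters before matching is an assumption.

```python
from typing import Optional

KNOWN_PATHS = ("/conversion", "/health")


def path_label(request_path: str) -> str:
    """Collapse any unrecognized path into the single value "unknown"."""
    return request_path if request_path in KNOWN_PATHS else "unknown"


def input_mime_label(content_type: Optional[str]) -> str:
    """Map a raw Content-Type header onto one of three fixed label values."""
    if content_type is None:
        return "none"  # header absent
    # Assumption: parameters such as "; charset=utf-8" are ignored.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return "text/markdown" if media_type == "text/markdown" else "unsupported"
```

Because the raw header or path string is never returned, a client cannot create new time series by sending arbitrary values.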

Scrape Model

Each pod maintains its own in-memory metrics. Prometheus scrapes each pod independently. No inter-pod communication is required. Aggregate across pods using PromQL.

Upkeep runs every 5 seconds, keeping histogram internal state bounded.

The /metrics route is mounted outside the API middleware stack, so it does not include the global Cache-Control header used by API responses.

The body-size histogram (docspec_http_request_body_bytes) only records bodies that passed Content-Type and Accept validation. Rejected requests are not counted.

The output-bytes histogram (docspec_conversion_output_bytes) only records observations for successful conversions. Failed conversions do not produce output, so no observation is recorded.

Example PromQL Queries

Per-pod request rate:

rate(docspec_http_requests_total[5m])

Aggregate p99 latency across all pods:

histogram_quantile(0.99, sum by (le) (rate(docspec_http_request_duration_seconds_bucket[5m])))

Error rate broken down by error class:

sum by (error_class) (rate(docspec_conversions_total{result!="success"}[5m]))

Body-size p95:

histogram_quantile(0.95, sum by (le) (rate(docspec_http_request_body_bytes_bucket[5m])))

Body-size p95 by input format:

histogram_quantile(0.95, sum by (le, input_mime_type) (rate(docspec_http_request_body_bytes_bucket[5m])))

Conversion success rate by input format:

sum by (input_mime_type) (rate(docspec_conversions_total{result="success"}[5m]))
  / sum by (input_mime_type) (rate(docspec_conversions_total[5m]))
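
For a quick check without a Prometheus server, a similar ratio can be computed from a single scrape of /metrics. The sketch below is a simplified parser written for illustration: success_ratio_by_input is a hypothetical name, the regular expressions cover only simple sample lines, and the result is a ratio of cumulative counters since process start rather than a 5-minute rate.

```python
import re
from collections import defaultdict

# Simplified patterns for lines like: name{label="value",...} 42
SAMPLE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)')
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def success_ratio_by_input(exposition_text: str) -> dict:
    """Return successful/total conversions per input_mime_type from one scrape."""
    succeeded = defaultdict(float)
    total = defaultdict(float)
    for line in exposition_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE.match(line)
        if match is None or match["name"] != "docspec_conversions_total":
            continue
        labels = dict(LABEL.findall(match["labels"] or ""))
        mime = labels.get("input_mime_type", "none")
        value = float(match["value"])
        total[mime] += value
        if labels.get("result") == "success":
            succeeded[mime] += value
    return {mime: succeeded[mime] / count for mime, count in total.items() if count > 0}
```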

Security

/metrics has no authentication. It's intended for internal scraping only. Deploy behind a private overlay network or a Kubernetes NetworkPolicy that restricts access to your Prometheus pods.