docspec-http
HTTP API server for DocSpec markdown to BlockNote JSON conversion (oxa.dev JSON opt-in via Accept).
Send markdown, receive BlockNote JSON (default) or oxa.dev JSON (Accept: application/vnd.oxa+json). The underlying DocSpec pipeline is streaming, but this v1 HTTP wrapper buffers the request body and the conversion output in memory before responding. End-to-end streaming over HTTP is planned for a future version. Until then, memory use grows with request size, so the largest document the server can handle is bounded by available memory.
Quick Start
Default host is 127.0.0.1. Default port is 3000.
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /conversion | Convert markdown to BlockNote (default) or oxa.dev JSON |
| OPTIONS | /conversion | Preflight / allowed methods |
| GET | /health | Liveness check |
| HEAD | /health | Liveness check (no body) |
| OPTIONS | /health | Allowed methods |
curl Examples
# Convert markdown to BlockNote JSON (default)
# Convert markdown to oxa.dev JSON (opt-in via Accept)
# Check server health
# HEAD health check (no body in response)
# OPTIONS — see allowed methods
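For callers that prefer Python over curl, the conversion request can be sketched with the standard library alone. This is a minimal illustration, not part of docspec-http: the helper names `build_conversion_request` and `convert` are hypothetical, and `BASE_URL` assumes the default host and port listed under Quick Start.

```python
import urllib.request
from typing import Optional

# Assumption: the server is running locally on its default host and port.
BASE_URL = "http://127.0.0.1:3000"


def build_conversion_request(markdown: str, accept: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /conversion request.

    Content-Type is text/markdown, as the endpoint requires. The Accept
    header is only set when the caller opts in to a specific output type.
    """
    headers = {"Content-Type": "text/markdown"}
    if accept is not None:
        headers["Accept"] = accept
    return urllib.request.Request(
        BASE_URL + "/conversion",
        data=markdown.encode("utf-8"),
        headers=headers,
        method="POST",
    )


def convert(markdown: str, accept: Optional[str] = None) -> bytes:
    """Send the request and return the raw JSON response body."""
    with urllib.request.urlopen(build_conversion_request(markdown, accept)) as resp:
        return resp.read()
```

Calling `convert("# Title")` would request BlockNote JSON; passing `accept="application/vnd.oxa+json"` opts in to oxa.dev JSON. Error statuses surface as `urllib.error.HTTPError`.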
Request / Response Headers
X-Request-ID: Generated (UUID v4) if the request omits it. Echoed back unchanged if present.
X-Trace-ID: Echoed back if present. Never generated by the server.
Cache-Control: max-age=0, private, must-revalidate on every response, including errors.
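The header rules above can be modeled as a small pure function. This is a sketch of the documented behavior, not the server's actual code; `correlation_headers` is a hypothetical name, and matching header names case-insensitively is an assumption based on how HTTP treats them.

```python
import uuid


def correlation_headers(request_headers: dict) -> dict:
    """Return the response headers implied by the rules above."""
    # Assumption: header names are compared case-insensitively.
    lowered = {name.lower(): value for name, value in request_headers.items()}

    response = {}
    if "x-request-id" in lowered:
        # Present: echoed back unchanged.
        response["X-Request-ID"] = lowered["x-request-id"]
    else:
        # Absent: a fresh UUID v4 is generated.
        response["X-Request-ID"] = str(uuid.uuid4())

    # X-Trace-ID is only ever echoed, never generated.
    if "x-trace-id" in lowered:
        response["X-Trace-ID"] = lowered["x-trace-id"]

    # Sent on every response, including errors.
    response["Cache-Control"] = "max-age=0, private, must-revalidate"
    return response
```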
Error Responses
All errors use RFC 7807 Problem Details JSON (application/problem+json; charset=utf-8).
| Code | Meaning |
|---|---|
| 400 | Empty body or invalid UTF-8 |
| 404 | Unknown path |
| 405 | Wrong method (response includes Allow header) |
| 406 | Accept header excludes all supported output types |
| 415 | Content-Type must be text/markdown |
| 422 | Markdown parse error |
| 500 | Internal conversion error |
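A client can detect and unpack these error bodies along the following lines. The sketch assumes only what RFC 7807 itself defines (the optional members type, title, status, detail, and instance); which of them docspec-http actually populates is not stated here, so absent members come back as `None`. The function name `parse_problem` is hypothetical.

```python
import json
from typing import Optional

PROBLEM_MEDIA_TYPE = "application/problem+json"
# Standard members defined by RFC 7807; all are optional in the spec.
PROBLEM_MEMBERS = ("type", "title", "status", "detail", "instance")


def parse_problem(content_type: str, body: bytes) -> Optional[dict]:
    """Return the RFC 7807 members of an error body, or None if the
    response is not a Problem Details document."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != PROBLEM_MEDIA_TYPE:
        return None
    document = json.loads(body.decode("utf-8"))
    return {member: document.get(member) for member in PROBLEM_MEMBERS}
```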
Accepted Accept values for /conversion: application/vnd.oxa+json (oxa.dev), application/vnd.docspec.blocknote+json, application/vnd.blocknote+json (BlockNote alias), application/*, or */*. Wildcards and missing Accept default to BlockNote for back-compat. Anything else returns 406.
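The negotiation rule can be illustrated with a lookup table. This is a simplified sketch rather than the server's implementation: it assumes the first recognized entry in a comma-separated Accept header wins, ignores q-values, and treats an empty header like a missing one. None of those details are specified above, and `negotiate_output` is a hypothetical name.

```python
from typing import Optional

BLOCKNOTE = "application/vnd.docspec.blocknote+json"
OXA = "application/vnd.oxa+json"

# Accepted Accept values mapped to the output type they select.
ACCEPT_TO_OUTPUT = {
    OXA: OXA,
    BLOCKNOTE: BLOCKNOTE,
    "application/vnd.blocknote+json": BLOCKNOTE,  # BlockNote alias
    "application/*": BLOCKNOTE,                   # wildcards default to BlockNote
    "*/*": BLOCKNOTE,
}


def negotiate_output(accept: Optional[str]) -> Optional[str]:
    """Return the output media type, or None when the caller should answer 406."""
    if accept is None or not accept.strip():
        return BLOCKNOTE  # missing Accept defaults to BlockNote
    for item in accept.split(","):
        # Assumption: parameters such as ";q=0.8" are ignored.
        media_type = item.split(";", 1)[0].strip().lower()
        if media_type in ACCEPT_TO_OUTPUT:
            return ACCEPT_TO_OUTPUT[media_type]
    return None
```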
Deployment Notes
TLS: Use a reverse proxy (nginx, Caddy). The server speaks plain HTTP.
CORS: Use a reverse proxy. No CORS headers are added.
Auth: Use a reverse proxy or upstream gateway.
Body size: No limit. Large documents are accepted. DoS risk is accepted. Both the request body and the conversion output are held in memory for the duration of the request.
Request timeout: No timeout. Slow clients can hang a connection indefinitely.
Logging
Logs go to stderr at INFO level in pretty format. There are no flags to change the log level or format.
Observability
docspec-http integrates with Sentry for error reporting.
Activation is fully opt-in via environment variables — the binary has zero Sentry
overhead when no DSN is configured.
Activation
Set ONE of the following to enable Sentry:
- DOCSPEC_SENTRY_DSN: docspec-specific override (preferred)
- SENTRY_DSN: Sentry's standard convention (fallback)
If both are set, DOCSPEC_SENTRY_DSN wins. An empty string or malformed DSN is
treated as "not set" — the server starts normally and logs a warning to stderr.
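The precedence rule reads naturally as a short lookup. The sketch below is illustrative only and makes two assumptions that the text does not settle: an empty DOCSPEC_SENTRY_DSN falls through to SENTRY_DSN, and DSN syntax validation (the "malformed" case) is left out. `resolve_sentry_dsn` is a hypothetical name.

```python
from typing import Mapping, Optional


def resolve_sentry_dsn(env: Mapping[str, str]) -> Optional[str]:
    """Pick the DSN to use, or None when Sentry stays disabled."""
    # Checked in priority order: the docspec-specific variable wins.
    for name in ("DOCSPEC_SENTRY_DSN", "SENTRY_DSN"):
        value = env.get(name, "")
        if value:  # an empty string counts as "not set"
            return value
    return None
```

In a real process the mapping would be `os.environ`; passing a plain dict keeps the rule easy to test.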
Configuration (all optional)
These follow Sentry's standard conventions:
- SENTRY_ENVIRONMENT: environment name (default: production)
- SENTRY_RELEASE: release identifier (default: auto, docspec-http@<version>)
- SENTRY_SAMPLE_RATE: error sample rate in [0.0, 1.0] (default: 1.0)
- SENTRY_TRACES_SAMPLE_RATE: performance trace sample rate in [0.0, 1.0] (default: 0.0, traces disabled)
What is captured
| Signal | Captured? |
|---|---|
| 500 Internal Server Error (HttpError::Internal) | yes (event) |
| 422 Unprocessable Entity (HttpError::Unprocessable) | yes (event) |
| Other 4xx responses | no |
| Panics | yes (event) |
| tracing::error! calls | yes (event) |
| tracing::warn! calls | yes (breadcrumb) |
| tracing::info!/debug! calls | yes (breadcrumb) |
| Performance transactions | only if SENTRY_TRACES_SAMPLE_RATE > 0 |
Privacy
docspec-http does NOT send the following to Sentry:
- Request bodies (markdown documents)
- Response bodies (BlockNote JSON)
- PII (Sentry default: send_default_pii = false)
- DSN values (never logged or echoed)
Sentry's default header redaction (Authorization, Cookie, etc.) is preserved.
Each captured event is tagged with request_id (UUID v4) and trace_id
(X-Trace-ID header value, if present) for correlation with logs.
Wire Contract
Mirrors github.com/docspecio/api v3.0.2 where feasible: same endpoint path, RFC 7807 errors, X-Request-ID/X-Trace-ID header handling. Diverges in input MIME (text/markdown vs DOCX) and adds Cache-Control on all responses.
Graceful Shutdown
The server handles SIGINT and SIGTERM. In-flight requests complete before the process exits.
Docker
Build
DOCKER_BUILDKIT=1
Supply IMAGE_VERSION and IMAGE_REVISION at build time to populate the OCI labels. If omitted, they default to 0.1.0 and unknown, respectively.
Run
The default CMD passes --host 0.0.0.0 --port 3000. Override it entirely to change the bind address or port:
Healthcheck
The image ships a built-in HEALTHCHECK that probes GET http://127.0.0.1:3000/health every 30 seconds using busybox wget --spider. Docker reports the container status in docker ps and Compose surfaces it via healthcheck:.
The probe port is hardcoded to 3000 inside the image. If you override CMD to bind a different --port, the built-in healthcheck will keep probing 3000 and report the container as unhealthy even though the server is fine. To run on a non-default port, either:
- Keep the in-container port at 3000 and only remap the host port (-p 8080:3000), or
- Override the healthcheck at runtime, e.g. docker run --health-cmd='wget --no-verbose --tries=1 --spider http://127.0.0.1:8080/health || exit 1' …, or
- Disable it with docker run --no-healthcheck … and rely on an external probe.
Kubernetes users should configure a Pod-level httpGet liveness probe on /health port 3000 instead of relying on the Docker HEALTHCHECK.
Image tags
Images are published to ghcr.io/docspec/api by the release workflow (managed by release-please). The following tags are maintained:
| Tag | Meaning |
|---|---|
| 0.1.0 | Exact version |
| 0.1 | Latest patch of 0.1 |
| 0 | Latest minor of 0 |
| latest | Most recent released version |
latest follows the most recent GitHub release, not the main branch. The publish workflow is a documented contract; it is not implemented in this repository.
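The tag scheme in the table can be derived mechanically from a release version. The helper below is a hypothetical sketch, not the publish workflow itself; it assumes a plain MAJOR.MINOR.PATCH version string and that the release being tagged is the newest one, so that it also receives the moving tags.

```python
from typing import List


def image_tags(version: str) -> List[str]:
    """Expand a MAJOR.MINOR.PATCH version into the tags listed above."""
    major, minor, _patch = version.split(".")
    return [
        version,                 # exact version
        major + "." + minor,     # latest patch of this minor line
        major,                   # latest minor of this major line
        "latest",                # most recent released version
    ]
```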
Architecture
The image is built for linux/amd64 only. No multi-platform manifest is published.
User
The container runs as non-root UID/GID 10001 (user docspec). No capabilities are required.
Reverse proxy
TLS termination, CORS headers, authentication, and rate limiting are intentionally absent from the binary. Place a reverse proxy (nginx, Caddy, etc.) in front of the container for these concerns. See Deployment Notes for details.
Metrics
docspec-http exposes a Prometheus metrics endpoint on the same port as the main API.
Endpoint: GET /metrics
Format: Prometheus exposition format 0.0.4 (text/plain; version=0.0.4; charset=utf-8)
Auth: None. The endpoint is internal-only. See Security below.
Metric Catalog
| Name | Type | Labels | Description | Buckets |
|---|---|---|---|---|
| docspec_http_requests_total | counter | method, path, status | Total HTTP requests received | n/a |
| docspec_http_request_duration_seconds | histogram | method, path, status | HTTP request latency in seconds | 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0 |
| docspec_http_request_body_bytes | histogram | input_mime_type | HTTP request body size in bytes, labeled by input MIME type | 100, 200, 400, 800, 1600, 3200, 6400, 12800, 25600, 51200, 102400, 204800 |
| docspec_conversions_total | counter | result, error_class, input_mime_type, output_mime_type | Total document conversions, labeled by result, error class, and input/output MIME type | n/a |
| docspec_conversion_duration_seconds | histogram | result, input_mime_type, output_mime_type | Document conversion duration in seconds, labeled by result and input/output MIME type | 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0 |
| docspec_conversion_output_bytes (NEW) | histogram | input_mime_type, output_mime_type | Document conversion output size in bytes (success only) | 100, 200, 400, 800, 1600, 3200, 6400, 12800, 25600, 51200, 102400, 204800 |
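The byte-size buckets in the catalog form a doubling series starting at 100 bytes. The sketch below reproduces that series and shows which bucket boundary a given size first falls under, using the usual Prometheus convention that a bucket counts observations less than or equal to its upper bound. The names `BYTE_BUCKETS` and `bucket_upper_bound` are illustrative and do not come from the source.

```python
import bisect

# 100, 200, 400, ..., 204800: twelve boundaries, each double the previous.
BYTE_BUCKETS = [100 * 2 ** i for i in range(12)]


def bucket_upper_bound(size_bytes: int) -> float:
    """Return the smallest bucket boundary that is >= size_bytes,
    or +Inf when the size exceeds the largest finite boundary."""
    index = bisect.bisect_left(BYTE_BUCKETS, size_bytes)
    if index < len(BYTE_BUCKETS):
        return float(BYTE_BUCKETS[index])
    return float("inf")
```

Anything larger than 204800 bytes is only visible in the implicit +Inf bucket, so quantile estimates cannot resolve sizes beyond that boundary.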
Label Values
result: success, client_error, server_error
error_class: body_not_utf8, empty_body, internal, method_not_allowed, not_acceptable, not_found, unprocessable, unsupported_media_type, none (only when result=success)
input_mime_type: text/markdown (the request's Content-Type matched the markdown reader), unsupported (Content-Type header present but not a supported input format), none (Content-Type header absent).
output_mime_type: application/vnd.docspec.blocknote+json (conversion succeeded; output produced by the BlockNote writer), application/vnd.oxa+json (conversion succeeded; output produced by the oxa.dev writer), none (no output produced — any error path).
path: matched route template (/conversion, /health) or unknown for fallback handlers
status: numeric HTTP status code as a string (e.g., "200", "422")
method: HTTP method as a string (e.g., "GET", "POST")
Cardinality Guarantees
path is bounded to {"/conversion", "/health", "unknown"}. error_class is bounded to 9 values. result is bounded to 3 values. Per-request identifiers (X-Request-ID, X-Trace-ID) are never used as labels. input_mime_type is bounded to 3 values. output_mime_type is bounded to 3 values. Both come from a fixed set of &'static str constants in the source — never from raw header values.
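Bounding a label means mapping arbitrary request data onto a fixed set before it reaches the metrics registry. The following sketch shows the idea for the path and input_mime_type labels. It is not the server's code (which, per the text above, uses constants in the source); the function names are hypothetical, and ignoring Content-Type parameters such as charset is an assumption.

```python
from typing import Optional

KNOWN_PATHS = ("/conversion", "/health")


def path_label(request_path: str) -> str:
    """Collapse any unrecognized path into the single value 'unknown'."""
    return request_path if request_path in KNOWN_PATHS else "unknown"


def input_mime_label(content_type: Optional[str]) -> str:
    """Map a raw Content-Type header onto one of three fixed label values."""
    if content_type is None:
        return "none"  # header absent
    # Assumption: parameters (e.g. "; charset=utf-8") are ignored.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/markdown":
        return "text/markdown"
    return "unsupported"  # header present but not a supported input format
```

Because both functions can only return values from a fixed set, a client sending random paths or Content-Type values cannot create new time series.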
Scrape Model
Each pod maintains its own in-memory metrics. Prometheus scrapes each pod independently. No inter-pod communication is required. Aggregate across pods using PromQL.
Upkeep runs every 5 seconds, keeping histogram internal state bounded.
The /metrics route is mounted outside the API middleware stack, so it does not include the global Cache-Control header used by API responses.
The body-size histogram (docspec_http_request_body_bytes) only records bodies that passed Content-Type and Accept validation. Rejected requests are not counted.
The output-bytes histogram (docspec_conversion_output_bytes) only records observations for successful conversions. Failed conversions do not produce output, so no observation is recorded.
Example PromQL Queries
Per-pod request rate:
rate(docspec_http_requests_total[5m])
Aggregate p99 latency across all pods:
histogram_quantile(0.99, sum by (le) (rate(docspec_http_request_duration_seconds_bucket[5m])))
Error rate broken down by error class:
sum by (error_class) (rate(docspec_conversions_total{result!="success"}[5m]))
Body-size p95:
histogram_quantile(0.95, sum by (le) (rate(docspec_http_request_body_bytes_bucket[5m])))
Body-size p95 by input format:
histogram_quantile(0.95, sum by (le, input_mime_type) (rate(docspec_http_request_body_bytes_bucket[5m])))
Conversion success rate by input format:
sum by (input_mime_type) (rate(docspec_conversions_total{result="success"}[5m]))
/ sum by (input_mime_type) (rate(docspec_conversions_total[5m]))
Security
/metrics has no authentication. It's intended for internal scraping only. Deploy behind a private overlay network or a Kubernetes NetworkPolicy that restricts access to your Prometheus pods.