Module operations

§Operating tsoracle

§Sizing window_ahead

The default is 3 seconds. Each window extension costs one persist_high_water round-trip; for the file driver that is write + fsync + rename + dir-fsync, roughly 1-5 ms on a modern SSD. With a 3-second window_ahead, the extension rate stays well under 1/sec in steady state. Lower values trade more frequent fsyncs for tighter bounds on stale-window timestamps after a clock skip.

Do not run window_ahead below 100ms with the file driver. The fsync rate dominates throughput at that point. If you need tighter window bounds, use a consensus driver with batched log appends instead.
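As a rough back-of-envelope check on these numbers, the sketch below estimates how often the file driver would persist and how much wall-clock time that takes. It assumes one extension per window_ahead interval, which is a simplification (the server's actual extension policy is not described here). Both function names are illustrative and not part of tsoracle.

```rust
use std::time::Duration;

/// Extensions per second, ASSUMING one extension per `window_ahead` interval.
/// (Hypothetical helper; the real extension policy may differ.)
fn extensions_per_sec(window_ahead: Duration) -> f64 {
    1.0 / window_ahead.as_secs_f64()
}

/// Milliseconds spent inside persist_high_water per second of wall-clock time,
/// given an assumed per-call latency.
fn persist_ms_per_sec(window_ahead: Duration, persist_latency: Duration) -> f64 {
    extensions_per_sec(window_ahead) * persist_latency.as_secs_f64() * 1000.0
}
```

Under that assumption, a 3-second window with a 5 ms persist works out to about 0.33 extensions per second and under 2 ms of persist time per second, while a 100 ms window gives 10 extensions per second and about 50 ms of persist time per second.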

§Sizing failover_advance

The default is 1 second. On leadership gain, the new leader first computes serving_floor = max(prior_max + 1, now_ms) and then persists requested = serving_floor + failover_advance. The +1 is mandatory because prior_max is an inclusive high-water: the prior leader could have served timestamps up to and including (prior_max, LOGICAL_MAX), that is, physical value prior_max with the logical counter at its maximum. Larger failover_advance values give more headroom against clock skew between old and new leaders; smaller values reduce timestamp “jumps” visible to clients. 1 second is appropriate for most deployments; consider 5-10 seconds if your nodes’ clocks may differ by more than a second.
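The fence arithmetic can be sketched as a small pure function. This is a minimal illustration of the two formulas above, not the server's implementation: the function name, the use of u64 milliseconds, and the choice to return None on overflow are all assumptions.

```rust
/// Returns (serving_floor, requested) in physical milliseconds, or None if
/// the arithmetic would overflow. Hypothetical helper mirroring the formulas
/// serving_floor = max(prior_max + 1, now_ms) and
/// requested = serving_floor + failover_advance.
fn failover_fence(prior_max: u64, now_ms: u64, failover_advance_ms: u64) -> Option<(u64, u64)> {
    // prior_max is inclusive, so the first physical value the prior leader
    // cannot have served is prior_max + 1.
    let serving_floor = prior_max.checked_add(1)?.max(now_ms);
    // The new high-water to persist before serving any timestamp.
    let requested = serving_floor.checked_add(failover_advance_ms)?;
    Some((serving_floor, requested))
}
```

When the new leader's clock is behind the persisted high-water (for example prior_max = 10_000 and now_ms = 9_500), the floor comes from prior_max + 1 rather than from the clock, which is what keeps timestamps monotonic across the failover.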

§Migrating from a prior timestamp system

Running tsoracle serve against an empty state directory starts at high-water 0. If you are migrating from any prior timestamp source (a previous TSO, a snapshot of max-observed commit timestamps in your data, etc.), seed the state file once:

tsoracle init --seed-physical-ms <MAX_OBSERVED_MILLIS> --state-dir ./tsoracle-data
tsoracle serve --state-dir ./tsoracle-data

init refuses to overwrite an existing state file, so accidental rollback is prevented. Pick MAX_OBSERVED_MILLIS to be the largest physical_ms you have ever served from the prior system, plus a safety margin to account for any timestamps you may have issued but not yet checkpointed. The seed must fit the timestamp layout’s 46-bit physical field (<= PHYSICAL_MS_MAX).
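One way to pick and sanity-check the seed before passing it to --seed-physical-ms is sketched below. The constant is an assumption derived from the 46-bit field width stated above (2^46 - 1); the crate's own PHYSICAL_MS_MAX is authoritative. The helper name and the margin parameter are illustrative.

```rust
/// ASSUMPTION: the largest value of a 46-bit field. Prefer the constant
/// exported by the crate if one is available.
const PHYSICAL_MS_MAX: u64 = (1u64 << 46) - 1;

/// Adds a safety margin to the largest physical_ms observed in the prior
/// system and returns the seed, or None if it overflows or does not fit
/// the 46-bit physical field. Hypothetical helper, not part of tsoracle.
fn seed_physical_ms(max_observed_ms: u64, safety_margin_ms: u64) -> Option<u64> {
    let seed = max_observed_ms.checked_add(safety_margin_ms)?;
    (seed <= PHYSICAL_MS_MAX).then_some(seed)
}
```

For instance, a maximum observed value of 1_700_000_000_000 ms with a 60-second margin yields a seed of 1_700_000_060_000, which is well within the field.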

§Monitoring hooks

The server emits the following signals through the metrics crate facade. Emission is gated behind the metrics Cargo feature on tsoracle-server (off by default so the dependency stays opt-in for embedders who do not want it):

tsoracle.get_ts.total — total GetTs RPCs handled (counter)
tsoracle.get_ts.timestamps_issued — sum of count across all GetTs responses (counter)
tsoracle.window.extensions.total — number of persist_high_water calls (counter)
tsoracle.window.extension_latency — duration of persist_high_water (histogram, seconds)
tsoracle.leader_transition.total — leader-watch saw a state change (counter)
tsoracle.leader_transition.fence_latency — duration of the failover fence (histogram, seconds)
tsoracle.not_leader.total — RPCs rejected with NOT_LEADER (counter)

The library is exporter-agnostic: embedders install whichever recorder they want (metrics-exporter-prometheus, metrics-exporter-influx, a custom sink) before constructing the Server. The example below wires Prometheus over an HTTP listener:

[dependencies]
tsoracle-server             = { version = "0.1", features = ["metrics"] }
metrics-exporter-prometheus = "0.16"

use metrics_exporter_prometheus::PrometheusBuilder;

PrometheusBuilder::new()
    .with_http_listener(([0, 0, 0, 0], 9100))
    .install()
    .expect("install Prometheus recorder");

// Build and serve `tsoracle_server::Server` as usual; emissions now flow
// through the installed recorder.

§Client retry behavior

The client gives FAILED_PRECONDITION special handling: it parses the tsoracle-leader-hint-bin trailer and moves the hinted leader to the front of the current retry worklist. Other gRPC errors, including UNAVAILABLE and INTERNAL, are recorded and the client continues through the configured endpoints once for that call. Configure endpoints with all known servers so cold-start works even when the cached leader is unreachable.
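The worklist reordering can be pictured with the sketch below. It is an illustration of the idea only, not the client's code: the function name is hypothetical, endpoints are modeled as plain strings, and inserting a hinted leader that is not already in the list is an assumption about behavior the text above does not specify.

```rust
/// Moves the hinted leader to the front of the retry worklist.
/// ASSUMPTION: a hint for an endpoint not already in the list is inserted
/// at the front; the real client may handle that case differently.
fn promote_hint(worklist: &mut Vec<String>, hint: &str) {
    if let Some(pos) = worklist.iter().position(|e| e.as_str() == hint) {
        // Already known: pull it out and try it next.
        let endpoint = worklist.remove(pos);
        worklist.insert(0, endpoint);
    } else {
        // Unknown endpoint learned from the trailer.
        worklist.insert(0, hint.to_string());
    }
}
```

The relative order of the remaining endpoints is preserved, so the fallback sequence for the rest of the call is unchanged apart from the promoted leader.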

§Advertised endpoints in multi-node deployments

The consensus driver owns the mapping from consensus leader identity to tsoracle endpoint. The source of that mapping is the driver’s choice — explicit configuration, consensus membership metadata, service discovery, or anything else. Drivers report the resolved endpoint to the server via LeaderState::Follower { leader_endpoint }; the server forwards it in LeaderHint trailers so clients can redirect. The library itself never sees the mapping and exposes no flag for it. Single-node deployments (tsoracle-driver-file) have no peers to advertise to.
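As one possible shape for such a mapping, the sketch below resolves a consensus node id to an endpoint from explicit configuration. Everything here is a stand-in: LeaderStateSketch only mimics the Follower { leader_endpoint } variant named above, and the Option<String> field type, the u64 node ids, and the resolve function are assumptions rather than the crate's API.

```rust
use std::collections::HashMap;

/// Local stand-in for the driver-facing leader state. The real
/// `LeaderState` type may have different variants and field types.
#[derive(Debug, PartialEq)]
enum LeaderStateSketch {
    Leader,
    Follower { leader_endpoint: Option<String> },
}

/// Maps the current consensus leader id to the state this node reports.
/// `endpoints` is explicit configuration: node id -> tsoracle endpoint.
fn resolve(self_id: u64, leader_id: Option<u64>, endpoints: &HashMap<u64, String>) -> LeaderStateSketch {
    match leader_id {
        Some(id) if id == self_id => LeaderStateSketch::Leader,
        // Known leader: report its endpoint if the mapping has one.
        Some(id) => LeaderStateSketch::Follower { leader_endpoint: endpoints.get(&id).cloned() },
        // No leader elected yet: nothing to hint.
        None => LeaderStateSketch::Follower { leader_endpoint: None },
    }
}
```

A driver backed by service discovery or membership metadata would replace the HashMap lookup with its own source while reporting the same kind of value.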

§Deployment topologies

Single-node: one tsoracle serve process, tsoracle-driver-file. No HA. Good for dev, small services, and deployments where TSO availability is not in the critical path.

HA via your own consensus: N nodes (typically 3 or 5), each running tsoracle serve embedded in a binary that supplies a custom ConsensusDriver over your consensus library. Clients configure all N endpoints. The leader handles GetTs; followers redirect.

Sharded TSO domains: for systems wanting separate monotonic sequences per keyspace, run one tsoracle cluster per shard. The library has no opinion on sharding.