# Migration guide
`tsink-migrate` is the official tool for importing historical data into tsink from an existing metrics system. It handles the full migration lifecycle: backfilling raw data, verifying correctness, and performing a final readiness check before you cut traffic over.
---
## Contents
1. [Overview](#1-overview)
2. [Supported sources](#2-supported-sources)
3. [Installation & invocation](#3-installation--invocation)
4. [Migration plan file](#4-migration-plan-file)
   - [Source configuration](#41-source-configuration)
   - [Target configuration](#42-target-configuration)
   - [Selectors](#43-selectors)
   - [Optional fields](#44-optional-fields)
   - [Batch tuning](#45-batch-tuning)
   - [Comparison tolerances](#46-comparison-tolerances)
5. [Commands](#5-commands)
   - [backfill](#51-backfill)
   - [verify](#52-verify)
   - [cutover-check](#53-cutover-check)
6. [Capture manifests](#6-capture-manifests)
7. [Artifacts & reports](#7-artifacts--reports)
8. [Per-source examples](#8-per-source-examples)
   - [Prometheus](#81-prometheus)
   - [VictoriaMetrics](#82-victoriametrics)
   - [OTLP](#83-otlp)
   - [InfluxDB line protocol](#84-influxdb-line-protocol)
   - [StatsD](#85-statsd)
   - [Graphite](#86-graphite)
9. [Recommended workflow](#9-recommended-workflow)
---
## 1. Overview
Migration happens in three sequential steps:
1. **Backfill** — pull historical data from the source and write it into tsink via Prometheus remote write.
2. **Verify** — compare series counts, sample counts, metadata, and exemplars between source and destination over the same time window.
3. **Cutover check** — re-run verification plus validate that tsink's ingest surface supports every payload type required by the source.
All three steps are driven by a single JSON plan file that describes the source, the target, and what to check.
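As an illustrative sketch (the full field reference is in [Migration plan file](#4-migration-plan-file)), a minimal Prometheus plan needs little more than a source, a target, and one selector:

```json
{
  "source": {
    "kind": "prometheus",
    "remote_read_url": "http://prometheus:9090/api/v1/read"
  },
  "target": {
    "write_url": "http://tsink:9201/api/v1/write",
    "read_url": "http://tsink:9201/api/v1/read"
  },
  "selectors": ["up{job=\"node\"}"]
}
```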
---
## 2. Supported sources
| Source | `kind` | Access method |
|---|---|---|
| Prometheus | `prometheus` | Live — Prometheus remote read API |
| VictoriaMetrics | `victoriametrics` | Live — `/api/v1/export` NDJSON endpoint |
| OTLP | `otlp` | Capture manifest — pre-recorded protobuf payloads |
| InfluxDB line protocol | `influx_line_protocol` | Capture manifest — pre-recorded text payloads |
| StatsD | `statsd` | Capture manifest — pre-recorded UDP packet text |
| Graphite plaintext | `graphite_plaintext` | Capture manifest — pre-recorded plaintext lines |
Prometheus and VictoriaMetrics pull data directly from the live source system over HTTP. OTLP, InfluxDB, StatsD, and Graphite require you to capture a representative sample of payloads ahead of time and point the plan at that capture manifest file.
---
## 3. Installation & invocation
The tool ships as a separate binary in the `tsink-server` crate. Build it alongside the server:
```bash
cargo build -p tsink-server --bin tsink-migrate --release
```
A convenience wrapper is provided at [`scripts/tsink_migrate.sh`](../scripts/tsink_migrate.sh) that proxies arguments directly:
```bash
./scripts/tsink_migrate.sh <command> --config plan.json --start-ms <ms> --end-ms <ms>
```
Or invoke the binary directly:
```
target/release/tsink-migrate <command> --config <plan.json> [flags]

  --config <plan.json>     required   path to the migration plan file
  --start-ms <unix_ms>     required   start of the time window (milliseconds)
  --end-ms <unix_ms>       required   end of the time window (milliseconds)
  --artifact-dir <dir>     optional   directory to write JSON + Markdown reports
```
`--end-ms` must be greater than or equal to `--start-ms`. On success the tool exits with code `0`; on failure (verify issues, cutover issues, or an unrecoverable error) it exits with code `1`.
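For example, a quick sanity run via the wrapper script, using GNU `date` to derive the millisecond window (the timestamps below are illustrative):

```bash
# Derive a one-day window in unix milliseconds (GNU date syntax).
START_MS=$(date -d '2023-11-14 00:00:00 UTC' +%s%3N)
END_MS=$(date -d '2023-11-15 00:00:00 UTC' +%s%3N)

./scripts/tsink_migrate.sh verify --config plan.json --start-ms "$START_MS" --end-ms "$END_MS"
echo "exit code: $?"   # 0 = pass, 1 = verify issues or an unrecoverable error
```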
---
## 4. Migration plan file
The plan is a JSON object. All paths in the plan are interpreted relative to the plan file's directory unless they are absolute.
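For example, with a layout like the one below (file names are illustrative), `plan.json` can reference `capture.json` by bare name, and the manifest's `path` entries also resolve against the plan file's directory:

```
migration/
├── plan.json           # contains "capture_manifest_path": "capture.json"
├── capture.json        # contains "path": "payloads/batch-001.bin"
└── payloads/
    └── batch-001.bin
```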
### 4.1 Source configuration
```json
"source": {
"kind": "<source_kind>",
"headers": { "Authorization": "Bearer <token>" },
"remote_read_url": "http://prometheus:9090/api/v1/read",
"export_url": "http://victoria:8428/api/v1/export",
"query_range_url": "http://prometheus:9090/api/v1/query_range",
"metadata_url": "http://prometheus:9090/api/v1/metadata",
"exemplar_url": "http://prometheus:9090/api/v1/query_exemplars",
"capture_manifest_path": "capture.json"
}
```
| Field | Required for | Description |
|---|---|---|
| `kind` | all | Source system identifier. See [Supported sources](#2-supported-sources). |
| `headers` | optional | HTTP headers added to every request to the source. Use for authentication tokens. |
| `remote_read_url` | `prometheus` | Prometheus remote read endpoint. |
| `export_url` | `victoriametrics` | VictoriaMetrics `/api/v1/export` endpoint (NDJSON format). |
| `query_range_url` | optional | Used by `cutover-check` PromQL parity checks when the source is Prometheus or VictoriaMetrics. |
| `metadata_url` | optional | Source metadata endpoint (`/api/v1/metadata`). If omitted, metadata backfill and verification are skipped. |
| `exemplar_url` | optional (Prometheus only) | Source exemplar query endpoint. If omitted, exemplar backfill is skipped. |
| `capture_manifest_path` | `otlp`, `influx_line_protocol`, `statsd`, `graphite_plaintext` | Path to the capture manifest JSON file. See [Capture manifests](#6-capture-manifests). |
### 4.2 Target configuration
```json
"target": {
"write_url": "http://tsink:9201/api/v1/write",
"read_url": "http://tsink:9201/api/v1/read",
"query_range_url": "http://tsink:9201/api/v1/query_range",
"metadata_url": "http://tsink:9201/api/v1/metadata",
"exemplar_url": "http://tsink:9201/api/v1/query_exemplars",
"status_url": "http://tsink:9201/api/v1/status/tsdb",
"tenant": "default",
"headers": { "Authorization": "Bearer <token>" }
}
```
| Field | Required | Description |
|---|---|---|
| `write_url` | yes | tsink Prometheus remote write endpoint. Used by `backfill`. |
| `read_url` | yes | tsink Prometheus remote read endpoint. Used by `verify` and `cutover-check`. |
| `query_range_url` | optional | Used by `cutover-check` PromQL parity checks. |
| `metadata_url` | optional | Used when metadata verification or backfill is enabled. |
| `exemplar_url` | optional | Used when exemplar verification is configured. |
| `status_url` | optional | tsink `/api/v1/status/tsdb` endpoint. When provided, `cutover-check` probes whether required ingest payload types (metadata, exemplars, histograms, OTLP, InfluxDB, StatsD, Graphite) are enabled on the target. |
| `tenant` | optional | Target tenant name. Sent as `X-Tsink-Tenant` on every request. Defaults to `default`. |
| `headers` | optional | Additional HTTP headers for every request to tsink (for per-tenant auth tokens, etc.). |
### 4.3 Selectors
```json
"selectors": [
"http_requests_total{job=\"api\"}",
"node_cpu_seconds_total{mode=~\"idle|iowait\"}"
]
```
A list of one or more PromQL-style label selectors. At least one selector is required. During backfill the tool fetches all series matching any selector. During verify each selector is checked independently. Selectors support `=`, `!=`, `=~`, `!~` matcher operators.
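For reference, one illustrative selector per matcher operator:

```json
"selectors": [
  "up{job=\"node\"}",
  "up{job!=\"blackbox\"}",
  "node_cpu_seconds_total{mode=~\"idle|iowait\"}",
  "http_requests_total{status!~\"5..\"}"
]
```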
### 4.4 Optional fields
```json
"metadata_metrics": ["http_requests_total", "node_cpu_seconds_total"],
"exemplar_checks": [
{ "query": "http_request_duration_seconds_bucket{job=\"api\"}", "limit": 200 }
],
"promql_checks": [
{ "query": "sum(rate(http_requests_total[5m])) by (job)", "step": "30s" }
]
```
**`metadata_metrics`** — metric names whose type/help/unit metadata should be backfilled and verified. When omitted the tool derives a list automatically from metric names it can extract from the `selectors`.
**`exemplar_checks`** — exemplar queries to run during verify and cutover-check. Each entry has:
- `query` — a PromQL selector to identify the set of series (required)
- `limit` — maximum number of exemplars to retrieve per series (default: `200`)
When no `exemplar_checks` are specified the tool falls back to running one check per selector.
**`promql_checks`** — PromQL expressions compared between source and tsink during `cutover-check`. Each entry has:
- `query` — any PromQL expression (required)
- `step` — query range step (default: `"30s"`)
PromQL checks require `query_range_url` to be set on both source and target, and are only meaningful for Prometheus and VictoriaMetrics sources.
### 4.5 Batch tuning
```json
"batch": {
"max_series_per_write": 250,
"max_points_per_write": 25000,
"http_timeout_secs": 30
}
```
Controls how data is batched during backfill writes.
| Field | Default | Description |
|---|---|---|
| `max_series_per_write` | `250` | Maximum number of series per remote write request. |
| `max_points_per_write` | `25000` | Maximum total data points across all series per request. |
| `http_timeout_secs` | `30` | HTTP request timeout for all source and target calls. |
### 4.6 Comparison tolerances
```json
"compare": {
"max_absolute_value_delta": 1e-12,
"max_relative_value_delta": 1e-9
}
```
Sample values are compared within these tolerances during `verify` and `cutover-check`.
| Field | Default | Description |
|---|---|---|
| `max_absolute_value_delta` | `1e-12` | Maximum allowed absolute difference between source and target sample values. |
| `max_relative_value_delta` | `1e-9` | Maximum allowed relative difference (as a fraction of the source value). |
---
## 5. Commands
### 5.1 backfill
```bash
tsink-migrate backfill --config plan.json --start-ms 1700000000000 --end-ms 1700086400000
```
Fetches all series matching the plan selectors from the source within `[start-ms, end-ms]` and writes them to tsink in batched Prometheus remote write requests. Additionally backfills metric metadata (if `metadata_url` is configured on the source) and exemplars (if `exemplar_url` is configured and the source is Prometheus).
**Exit code:** `0` on success, `1` on any transport or write error.
**Console output on success:**
```
tsink-migrate backfill: pass
source_kind: Prometheus
selectors: 2
series: 18
samples: 43200
histograms: 0
exemplars: 0
metadata: 2
write_batches: 3
```
### 5.2 verify
```bash
tsink-migrate verify --config plan.json --start-ms 1700000000000 --end-ms 1700086400000
```
Compares source and target data for the specified window. The check has three layers:
1. **Raw checks** — per selector: series count, row count, sample count, histogram count, missing/extra series, and per-sample value mismatch within the configured tolerances.
2. **Metadata checks** — per metric: whether the type/help/unit metadata entries match.
3. **Exemplar checks** — per query: series count and per-exemplar value comparison.
Any discrepancy is recorded as an issue. The command exits with `1` if any issues are found.
**Console output on pass:**
```
tsink-migrate verify: pass
raw_checks: 2
metadata_checks: 2
exemplar_checks: 2
issues: 0
```
### 5.3 cutover-check
```bash
tsink-migrate cutover-check --config plan.json --start-ms 1700000000000 --end-ms 1700086400000
```
Runs the full `verify` check and then two additional layers:
1. **Target payload capability probes** — when `status_url` is configured, queries tsink's status endpoint to confirm that the ingest features required by the source are enabled: metadata, exemplars, histograms, OTLP, InfluxDB line protocol, StatsD, or Graphite (depending on the plan's source kind and payload usage). Issues are raised for any feature that is disabled.
2. **PromQL parity checks** — for each entry in `promql_checks`, executes the query range against both source and target and compares the result sets. Mismatches, partial responses, or target warnings are all recorded as issues.
The command exits with `1` if any verify issues, capability issues, or PromQL parity mismatches are found.
---
## 6. Capture manifests
OTLP, InfluxDB line protocol, StatsD, and Graphite sources cannot be queried retroactively via a read API. Instead you record the original inbound payloads to a capture manifest ahead of the migration and replay them locally.
A capture manifest is a JSON array. Each entry describes one payload:
```json
[
  {
    "path": "payloads/batch-001.bin",
    "received_at_ms": 1700000000000,
    "query_params": {}
  },
  {
    "body": "cpu,host=node-a value=1.5 1700000000000",
    "received_at_ms": 1700000000000,
    "query_params": { "db": "telegraf", "precision": "ms" }
  },
  {
    "body_base64": "<base64-encoded protobuf>",
    "received_at_ms": 1700000000000,
    "query_params": {}
  }
]
```
Each entry must have exactly one of:
| Field | Description |
|---|---|
| `path` | Path to the raw payload file (relative to the plan file, or absolute). |
| `body` | Inline payload text (UTF-8). |
| `body_base64` | Base64-encoded payload bytes. Useful for binary formats like OTLP protobuf. |
Additional fields:
| Field | Applies to | Description |
|---|---|---|
| `received_at_ms` | `influx_line_protocol`, `statsd`, `graphite_plaintext` | Wall-clock time when the payload arrived. Used as the fallback timestamp for lines that carry no timestamp of their own. |
| `query_params` | InfluxDB only | Query parameters from the original HTTP request. Recognized keys: `db`, `rp`, `bucket`, `org`, `precision`. The `db`, `rp`, `bucket`, and `org` values are promoted to labels (`influx_db`, `influx_rp`, `influx_bucket`, `influx_org`) on ingested series. |
Capture manifests are processed locally — the tool normalizes each payload using the same logic as the live ingest path and then writes the resulting series to tsink via remote write. The `backfill` command imports from the capture manifest; `verify` and `cutover-check` use the same normalized data as the source-side reference.
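A minimal sketch of assembling a manifest from payload files already saved to disk — this helper is not part of `tsink-migrate` — assuming `jq` and GNU `stat` are available and that each file's mtime approximates the original arrival time:

```bash
manifest=capture.json
echo '[]' > "$manifest"
for f in payloads/*.bin; do
  # mtime in milliseconds; replace with the real arrival time where you have it recorded
  ts_ms=$(( $(stat -c %Y "$f") * 1000 ))
  jq --arg path "$f" --argjson ts "$ts_ms" \
     '. + [{path: $path, received_at_ms: $ts, query_params: {}}]' \
     "$manifest" > "$manifest.tmp" && mv "$manifest.tmp" "$manifest"
done
```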
---
## 7. Artifacts & reports
Pass `--artifact-dir <dir>` to any command to write structured output files into that directory. The directory is created if it does not exist.
```bash
tsink-migrate backfill \
  --config plan.json \
  --start-ms 1700000000000 \
  --end-ms 1700086400000 \
  --artifact-dir ./migration-reports
```
Two files are written per run:
| File | Format | Description |
|---|---|---|
| `report.json` | JSON | Full structured report with all check details, counters, and issue lists. |
| `report.md` | Markdown | Human-readable summary suitable for attaching to a PR or ticket. |
The JSON report can be used for programmatic validation in CI pipelines. If the command itself fails (e.g. with a transport error), it exits with code `1` before any artifact files are written; reports are only produced when the command runs to completion.
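A sketch of a CI gate that keeps the reports regardless of the result and fails the job on the documented exit code; `$START_MS`, `$END_MS`, and `$CI_ARTIFACT_DIR` are placeholder variables for your pipeline:

```bash
status=0
tsink-migrate verify \
  --config plan.json \
  --start-ms "$START_MS" \
  --end-ms "$END_MS" \
  --artifact-dir ./migration-reports || status=$?

# Archive the human-readable summary for the PR / ticket, then propagate the result.
cp ./migration-reports/report.md "$CI_ARTIFACT_DIR/" || true
exit "$status"
```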
---
## 8. Per-source examples
### 8.1 Prometheus
Prometheus uses its remote read API for backfill and verification, and optionally its metadata and exemplar APIs.
```json
{
  "source": {
    "kind": "prometheus",
    "remote_read_url": "http://prometheus:9090/api/v1/read",
    "query_range_url": "http://prometheus:9090/api/v1/query_range",
    "metadata_url": "http://prometheus:9090/api/v1/metadata",
    "exemplar_url": "http://prometheus:9090/api/v1/query_exemplars",
    "headers": {
      "Authorization": "Bearer <source-token>"
    }
  },
  "target": {
    "write_url": "http://tsink:9201/api/v1/write",
    "read_url": "http://tsink:9201/api/v1/read",
    "query_range_url": "http://tsink:9201/api/v1/query_range",
    "metadata_url": "http://tsink:9201/api/v1/metadata",
    "exemplar_url": "http://tsink:9201/api/v1/query_exemplars",
    "status_url": "http://tsink:9201/api/v1/status/tsdb",
    "tenant": "default",
    "headers": {
      "Authorization": "Bearer <target-token>"
    }
  },
  "selectors": [
    "up{job=\"node\"}",
    "http_requests_total{job=~\"api|worker\"}"
  ],
  "metadata_metrics": ["up", "http_requests_total"],
  "exemplar_checks": [
    { "query": "http_request_duration_seconds_bucket{job=\"api\"}", "limit": 200 }
  ],
  "promql_checks": [
    { "query": "sum(rate(http_requests_total[5m])) by (job)", "step": "30s" },
    {
      "query": "histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))",
      "step": "30s"
    }
  ]
}
```
### 8.2 VictoriaMetrics
VictoriaMetrics uses the `/api/v1/export` endpoint (NDJSON format) for backfill and verification. The Prometheus-compatible `/prometheus/api/v1/query_range` and `/prometheus/api/v1/metadata` endpoints are used for PromQL parity and metadata checks.
```json
{
  "source": {
    "kind": "victoriametrics",
    "export_url": "http://victoria:8428/api/v1/export",
    "query_range_url": "http://victoria:8428/prometheus/api/v1/query_range",
    "metadata_url": "http://victoria:8428/prometheus/api/v1/metadata"
  },
  "target": {
    "write_url": "http://tsink:9201/api/v1/write",
    "read_url": "http://tsink:9201/api/v1/read",
    "query_range_url": "http://tsink:9201/api/v1/query_range",
    "metadata_url": "http://tsink:9201/api/v1/metadata",
    "status_url": "http://tsink:9201/api/v1/status/tsdb",
    "tenant": "default"
  },
  "selectors": [
    "node_cpu_seconds_total{mode=\"idle\"}",
    "http_requests_total{job=\"edge\"}"
  ],
  "metadata_metrics": ["node_cpu_seconds_total", "http_requests_total"],
  "promql_checks": [
    { "query": "sum(rate(node_cpu_seconds_total[5m])) by (instance)", "step": "30s" }
  ]
}
```
### 8.3 OTLP
OTLP sources require a capture manifest containing the raw protobuf `ExportMetricsServiceRequest` payloads. Payloads are typically base64-encoded since they are binary. The `cutover-check` command additionally verifies that the tsink OTLP ingest endpoint is enabled and that it supports the OTLP metric shapes present in the captured payloads (gauge, sum, histogram, summary).
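A one-liner sketch for producing a `body_base64` value from a captured payload file (GNU coreutils shown; on macOS use `base64 -i <file>` instead of `-w0`):

```bash
base64 -w0 payloads/otlp-batch-001.bin   # file name is illustrative
```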
**`otlp-capture.json`:**
```json
[
  {
    "body_base64": "<base64-encoded ExportMetricsServiceRequest>"
  }
]
```
**`otlp-plan.json`:**
```json
{
  "source": {
    "kind": "otlp",
    "capture_manifest_path": "otlp-capture.json"
  },
  "target": {
    "write_url": "http://tsink:9201/api/v1/write",
    "read_url": "http://tsink:9201/api/v1/read",
    "metadata_url": "http://tsink:9201/api/v1/metadata",
    "status_url": "http://tsink:9201/api/v1/status/tsdb",
    "tenant": "default"
  },
  "selectors": ["system_x2e_cpu_x2e_time"],
  "metadata_metrics": ["system_x2e_cpu_x2e_time"]
}
```
Note: OTLP metric names that contain `.` or other characters unsafe in Prometheus metric names are escaped by the normalizer using a hex form (`.` → `_x2e_`). Use the escaped form in selectors and `metadata_metrics`.
### 8.4 InfluxDB line protocol
InfluxDB sources require a capture manifest of raw line protocol HTTP request bodies. The `db`, `rp`, `bucket`, and `org` values from `query_params` are promoted to labels so selectors can filter by database or bucket.
**`influx-capture.json`:**
```json
[
  {
    "received_at_ms": 1700000000000,
    "query_params": { "db": "telegraf", "precision": "ms" },
    "body": "cpu,host=node-a value=1.5,temp=3.0 1700000000000\nmem,host=node-a used=42 1700000005000"
  }
]
```
**`influx-plan.json`:**
```json
{
  "source": {
    "kind": "influx_line_protocol",
    "capture_manifest_path": "influx-capture.json"
  },
  "target": {
    "write_url": "http://tsink:9201/api/v1/write",
    "read_url": "http://tsink:9201/api/v1/read",
    "metadata_url": "http://tsink:9201/api/v1/metadata",
    "status_url": "http://tsink:9201/api/v1/status/tsdb",
    "tenant": "default"
  },
  "selectors": [
    "cpu{host=\"node-a\",influx_db=\"telegraf\"}",
    "cpu_temp{host=\"node-a\",influx_db=\"telegraf\"}"
  ],
  "metadata_metrics": ["cpu", "cpu_temp"]
}
```
### 8.5 StatsD
StatsD sources require a capture manifest of UDP packet text (one or more `metric:value|type` lines per entry). StatsD has no native timestamps; every entry must supply `received_at_ms`.
**`statsd-capture.json`:**
```json
[
  {
    "received_at_ms": 1700000000000,
    "body": "jobs.completed:4|c|#env:prod\nworkers.active:2|g|#env:prod"
  }
]
```
**`statsd-plan.json`:**
```json
{
  "source": {
    "kind": "statsd",
    "capture_manifest_path": "statsd-capture.json"
  },
  "target": {
    "write_url": "http://tsink:9201/api/v1/write",
    "read_url": "http://tsink:9201/api/v1/read",
    "metadata_url": "http://tsink:9201/api/v1/metadata",
    "status_url": "http://tsink:9201/api/v1/status/tsdb",
    "tenant": "default"
  },
  "selectors": [
    "jobs_completed{env=\"prod\"}",
    "workers_active{env=\"prod\"}"
  ],
  "metadata_metrics": ["jobs_completed", "workers_active"]
}
```
### 8.6 Graphite
Graphite sources require a capture manifest of TCP plaintext lines. Graphite timestamps are in Unix seconds; `received_at_ms` is used as a fallback only for lines that omit the timestamp field.
**`graphite-capture.json`:**
```json
[
  {
    "received_at_ms": 1700000000999,
    "body": "servers.api.latency;env=prod;region=us-west 42.5 1700000000\nservers.api.errors 1 1700000001"
  }
]
```
**`graphite-plan.json`:**
```json
{
  "source": {
    "kind": "graphite_plaintext",
    "capture_manifest_path": "graphite-capture.json"
  },
  "target": {
    "write_url": "http://tsink:9201/api/v1/write",
    "read_url": "http://tsink:9201/api/v1/read",
    "metadata_url": "http://tsink:9201/api/v1/metadata",
    "status_url": "http://tsink:9201/api/v1/status/tsdb",
    "tenant": "default"
  },
  "selectors": [
    "servers_api_latency{env=\"prod\",region=\"us-west\"}",
    "servers_api_errors"
  ],
  "metadata_metrics": ["servers_api_latency", "servers_api_errors"]
}
```
---
## 9. Recommended workflow
A safe migration sequence from any supported source:
**Step 1 — Prepare the plan.** Write a plan JSON for your source and target. For live sources (Prometheus, VictoriaMetrics) verify the API URLs are reachable. For capture-manifest sources, record a representative set of payloads.
**Step 2 — Backfill historical data.** Choose a time window that covers the retention you want to import. For large windows, run backfill in overlapping slices if memory or network constraints require it (a slicing sketch follows the example below).
```bash
tsink-migrate backfill \
  --config plan.json \
  --start-ms 1696118400000 \
  --end-ms 1700000000000 \
  --artifact-dir ./artifacts/backfill
```
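Where memory or network limits force slicing, a sketch along these lines drives the same command over one-day chunks with a small overlap (the variable names and window values are illustrative):

```bash
DAY_MS=86400000
OVERLAP_MS=60000
start=1696118400000            # overall window start
final=1700000000000            # overall window end

while [ "$start" -lt "$final" ]; do
  end=$(( start + DAY_MS ))
  [ "$end" -gt "$final" ] && end=$final
  tsink-migrate backfill \
    --config plan.json \
    --start-ms "$start" \
    --end-ms "$end" \
    --artifact-dir "./artifacts/backfill-$start"
  [ "$end" -ge "$final" ] && break
  start=$(( end - OVERLAP_MS ))   # overlap slightly so boundary samples are not skipped
done
```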
**Step 3 — Verify a recent window.** Pick a short window near the present where the source still has fresh data and tsink has just received the backfill. Check that series counts and sample values agree.
```bash
tsink-migrate verify \
  --config plan.json \
  --start-ms 1699913600000 \
  --end-ms 1700000000000 \
  --artifact-dir ./artifacts/verify
```
Review any issues in the console output or in `artifacts/verify/report.md`. Common causes:
- **Missing series on target** — the series was created during the backfill window but not matched by the selector; broaden the selector or re-run backfill.
- **Sample count mismatch** — the source had data outside the window used for backfill; adjust `--start-ms` / `--end-ms`.
- **Metadata mismatch** — the source metadata endpoint was unreachable; check `source.metadata_url`.
**Step 4 — Run cutover-check.** Before switching write traffic, confirm that tsink is ready to accept every payload type the source uses.
```bash
tsink-migrate cutover-check \
  --config plan.json \
  --start-ms 1699913600000 \
  --end-ms 1700000000000 \
  --artifact-dir ./artifacts/cutover
```
A `pass` result means:
- Data verification passed.
- All required tsink ingest features (metadata, exemplars, histograms, protocol-specific endpoints) are enabled.
- PromQL query results match between source and tsink (when `promql_checks` are configured).
**Step 5 — Switch write traffic.** Reconfigure your instrumentation, scrape targets, or upstream forwarders to send new data directly to tsink. The source can be kept running in read-only mode for a grace period while operators confirm the cutover is clean.