tailtriage-controller

tailtriage-controller manages repeated, bounded capture windows for long-lived services.

Use it when you want to turn capture on, collect one generation, turn capture off, and later start a fresh generation without restarting the process.

Analysis is still done by tailtriage-cli.

When to use this crate

Use tailtriage-controller when you need repeated arm/disarm windows in one process.

Use tailtriage-core for a single explicit build -> capture -> shutdown run.

Use tailtriage when you want the default entry point with controller support enabled by default (or disabled via Cargo features).

Installation

cargo add tailtriage-controller

Quick start

output("tailtriage-run.json") configures the base artifact path template. Each activation writes a per-generation artifact with -generation-N in the file name (for example, generation 1 writes tailtriage-run-generation-1.json).

use tailtriage_controller::TailtriageController;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let controller = TailtriageController::builder("checkout-service")
        .initially_enabled(false)
        .output("tailtriage-run.json")
        .build()?;

    let _generation = controller.enable()?;

    let started = controller.begin_request("/checkout");
    started.completion.finish_ok();

    let _ = controller.disable()?;
    Ok(())
}

Mental model

A controller owns a template plus at most one active generation.

enable() creates a fresh generation from the current template.
disable() stops new admissions for that generation.
If no captured requests are still in flight, the generation finalizes immediately.
Otherwise the generation enters closing and finalizes after its already-admitted captured requests drain.
The next enable() creates a new generation with a new artifact path.

Requests started while the controller is disabled or closing are inert:

they preserve request metadata
they record no capture events
they never join a later generation

Each activation writes a per-generation artifact whose file name includes -generation-N.

Minimal TOML example

Use TOML when you want repeatable operational settings, including mode selection.

[controller]

service_name = "checkout-service"



[controller.activation]

mode = "light"



[controller.activation.sink]

type = "local_json"

output_path = "tailtriage-run.json"

Expanded TOML example

[controller]

service_name = "checkout-service"

initially_enabled = false



[controller.activation]

mode = "investigation"

strict_lifecycle = true



[controller.activation.capture_limits_override]

max_requests = 150000

max_stages = 300000

max_queues = 300000

max_inflight_snapshots = 300000

max_runtime_snapshots = 150000



[controller.activation.sink]

type = "local_json"

output_path = "tailtriage-run.json"



[controller.activation.runtime_sampler]

enabled_for_armed_runs = true

mode_override = "investigation"

interval_ms = 250

max_runtime_snapshots = 20000



[controller.activation.run_end_policy]

kind = "auto_seal_on_limits_hit"

Config precedence and reload rules

When TOML is loaded with config_path(...):

service_name from TOML overrides the builder value when present.
builder service_name is a fallback only when TOML omits service_name.
initially_enabled falls back to the builder value when omitted.
activation template settings come from TOML.
omitted optional activation subfields use TOML contract defaults.

reload_config() updates the template for future generations only.

It does not mutate a generation that is already active.

Run-end policies

Supported policies:

continue_after_limits_hit (default)
auto_seal_on_limits_hit

Behavior:

continue_after_limits_hit: generation stays active after the first truncation
auto_seal_on_limits_hit: on the first limits_hit, new admissions stop and the generation moves to closing; finalization happens immediately if no captured requests are still in flight, otherwise after they drain

TOML contract:

[controller.activation.run_end_policy] is optional
if that table is present, kind is required

Runtime sampler template

The controller can start a Tokio runtime sampler automatically for armed generations.

Important constraints:

sampler startup still requires an active Tokio runtime
sampler settings are fixed at activation time
runtime snapshot retention is still bounded by the resolved core capture limits

TOML field reference

`[controller]`

service_name (optional string): overrides the builder service name when present; must not be empty
initially_enabled (optional bool): when true, build() starts generation 1

`[controller.activation]`

mode (required string): light or investigation
strict_lifecycle (optional bool, default false)

`[controller.activation.sink]`

type (required string): local_json
output_path (required string for local_json): base path template for per-generation files

`[controller.activation.capture_limits_override]`

All fields are optional:

max_requests
max_stages
max_queues
max_inflight_snapshots
max_runtime_snapshots

`[controller.activation.runtime_sampler]`

Optional table. Default is disabled.

enabled_for_armed_runs
mode_override
interval_ms
max_runtime_snapshots

`[controller.activation.run_end_policy]`

Optional table. If present, kind is required.

kind = "continue_after_limits_hit"
kind = "auto_seal_on_limits_hit"

Important constraints

at most one generation is active at a time
active generation settings do not change after activation
requests remain bound to the generation that admitted them
controller capture and artifact analysis are separate; analysis happens in tailtriage-cli

Related crates

tailtriage: default entry point
tailtriage-core: direct instrumentation lifecycle
tailtriage-tokio: runtime-pressure sampling
tailtriage-cli: artifact analysis

tailtriage-controller 0.1.2