tailtriage-controller 0.1.2

Configurable control layer for repeated bounded capture windows in long-lived services
tailtriage-controller

tailtriage-controller manages repeated, bounded capture windows for long-lived services.

Use it when you want to turn capture on, collect one generation, turn capture off, and later start a fresh generation without restarting the process.

Analysis is still done by tailtriage-cli.

When to use this crate

Use tailtriage-controller when you need repeated arm/disarm windows in one process.

Use tailtriage-core for a single explicit build -> capture -> shutdown run.

Use tailtriage when you want the default entry point. Controller support is enabled there by default and can be disabled via Cargo features.

Installation

cargo add tailtriage-controller

Quick start

output("tailtriage-run.json") configures the base artifact path template. Each activation writes a per-generation artifact with -generation-N in the file name (for example, generation 1 writes tailtriage-run-generation-1.json).

use tailtriage_controller::TailtriageController;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let controller = TailtriageController::builder("checkout-service")
        .initially_enabled(false)
        .output("tailtriage-run.json")
        .build()?;

    let _generation = controller.enable()?;

    let started = controller.begin_request("/checkout");
    started.completion.finish_ok();

    let _ = controller.disable()?;
    Ok(())
}

Mental model

A controller owns a template plus at most one active generation.

  • enable() creates a fresh generation from the current template.
  • disable() stops new admissions for that generation.
  • If no captured requests are still in flight, the generation finalizes immediately.
  • Otherwise the generation enters closing and finalizes after its already-admitted captured requests drain.
  • The next enable() creates a new generation with a new artifact path.

Requests started while the controller is disabled or closing are inert:

  • they preserve request metadata
  • they record no capture events
  • they never join a later generation
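
The lifecycle above can be sketched as a small state machine. This is an illustrative model only: Phase, Admission, and Model are hypothetical names, not the tailtriage-controller API, and refusing enable() while a generation is still closing is an assumption of this sketch that the text here does not specify.

```rust
// Illustrative model of the lifecycle described above.
// All types here are hypothetical; they are not the crate's API.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    Disabled,
    Active { generation: u64, in_flight: u32 },
    Closing { generation: u64, in_flight: u32 },
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Admission {
    /// Captured request, bound to the generation that admitted it.
    Captured(u64),
    /// Inert request: records no capture events, never joins a later generation.
    Inert,
}

struct Model {
    phase: Phase,
    next_generation: u64,
    finalized: Vec<u64>,
}

impl Model {
    fn new() -> Self {
        Model { phase: Phase::Disabled, next_generation: 1, finalized: Vec::new() }
    }

    /// Creates a fresh generation. Assumption: only allowed from Disabled.
    fn enable(&mut self) -> Option<u64> {
        if self.phase != Phase::Disabled {
            return None;
        }
        let generation = self.next_generation;
        self.next_generation += 1;
        self.phase = Phase::Active { generation, in_flight: 0 };
        Some(generation)
    }

    /// Admits a request only while a generation is active.
    fn begin_request(&mut self) -> Admission {
        match self.phase {
            Phase::Active { generation, in_flight } => {
                self.phase = Phase::Active { generation, in_flight: in_flight + 1 };
                Admission::Captured(generation)
            }
            _ => Admission::Inert,
        }
    }

    /// Stops new admissions; finalizes at once if nothing is in flight.
    fn disable(&mut self) {
        if let Phase::Active { generation, in_flight } = self.phase {
            if in_flight == 0 {
                self.finalized.push(generation);
                self.phase = Phase::Disabled;
            } else {
                self.phase = Phase::Closing { generation, in_flight };
            }
        }
    }

    /// Completes a request; a closing generation finalizes once drained.
    fn finish_request(&mut self, admission: Admission) {
        let id = match admission {
            Admission::Captured(id) => id,
            Admission::Inert => return, // inert requests never touch a generation
        };
        match self.phase {
            Phase::Active { generation, in_flight } if generation == id => {
                self.phase = Phase::Active { generation, in_flight: in_flight.saturating_sub(1) };
            }
            Phase::Closing { generation, in_flight } if generation == id => {
                let remaining = in_flight.saturating_sub(1);
                if remaining == 0 {
                    self.finalized.push(generation);
                    self.phase = Phase::Disabled;
                } else {
                    self.phase = Phase::Closing { generation, in_flight: remaining };
                }
            }
            _ => {}
        }
    }
}
```

In this model, calling disable() with one captured request in flight leaves the generation in Closing, a request started at that point is Inert, and finishing the captured request finalizes the generation.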

Each activation writes a per-generation artifact whose file name includes -generation-N.
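
As a rough illustration of the naming rule, the helper below derives a per-generation file name from a base path template. generation_path is a hypothetical function written for this example, not part of the crate, and its handling of a template without an extension is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: inserts `-generation-N` before the file extension.
/// The crate's real path derivation may differ in edge cases.
fn generation_path(template: &Path, generation: u64) -> PathBuf {
    // Fall back to a fixed stem if the template has no usable file name.
    let stem = template
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("tailtriage-run");
    let name = match template.extension().and_then(|e| e.to_str()) {
        Some(ext) => format!("{stem}-generation-{generation}.{ext}"),
        // Assumption: with no extension, the suffix is simply appended.
        None => format!("{stem}-generation-{generation}"),
    };
    // Keep the parent directory and replace only the final component.
    template.with_file_name(name)
}
```

With the template tailtriage-run.json and generation 1, this yields tailtriage-run-generation-1.json, matching the example given in the quick start.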

Minimal TOML example

Use TOML when you want repeatable operational settings, including mode selection.

[controller]
service_name = "checkout-service"

[controller.activation]
mode = "light"

[controller.activation.sink]
type = "local_json"
output_path = "tailtriage-run.json"

Expanded TOML example

[controller]
service_name = "checkout-service"
initially_enabled = false

[controller.activation]
mode = "investigation"
strict_lifecycle = true

[controller.activation.capture_limits_override]
max_requests = 150000
max_stages = 300000
max_queues = 300000
max_inflight_snapshots = 300000
max_runtime_snapshots = 150000

[controller.activation.sink]
type = "local_json"
output_path = "tailtriage-run.json"

[controller.activation.runtime_sampler]
enabled_for_armed_runs = true
mode_override = "investigation"
interval_ms = 250
max_runtime_snapshots = 20000

[controller.activation.run_end_policy]
kind = "auto_seal_on_limits_hit"

Config precedence and reload rules

When TOML is loaded with config_path(...):

  • service_name from TOML overrides the builder value when present; the builder service_name is used only when TOML omits it.
  • initially_enabled falls back to the builder value when omitted.
  • activation template settings come from TOML.
  • omitted optional activation subfields use TOML contract defaults.
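
The precedence rules for the [controller] fields can be sketched with plain Option fallbacks. TomlController, BuilderValues, and resolve are hypothetical stand-ins for this example, not the crate's types. The empty-name check mirrors the "must not be empty" rule from the field reference.

```rust
/// Hypothetical stand-in for the optional fields parsed from [controller].
struct TomlController {
    service_name: Option<String>,
    initially_enabled: Option<bool>,
}

/// Hypothetical stand-in for values set on the builder.
struct BuilderValues {
    service_name: String,
    initially_enabled: bool,
}

/// Resolves (service_name, initially_enabled): TOML wins when present,
/// the builder value is the fallback when TOML omits the field.
fn resolve(
    toml: &TomlController,
    builder: &BuilderValues,
) -> Result<(String, bool), String> {
    let service_name = match &toml.service_name {
        Some(name) if name.is_empty() => {
            return Err("service_name must not be empty".to_string());
        }
        Some(name) => name.clone(),
        None => builder.service_name.clone(),
    };
    let initially_enabled = toml
        .initially_enabled
        .unwrap_or(builder.initially_enabled);
    Ok((service_name, initially_enabled))
}
```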

reload_config() updates the template for future generations only.

It does not mutate a generation that is already active.
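
One way to picture this rule is that each generation takes a snapshot of the template at activation time. The sketch below models that idea with hypothetical Template, Generation, and Controller types. It is not the crate's implementation, and it leaves out disable() for brevity, so assume the first generation has been disabled before the second enable().

```rust
/// Hypothetical subset of the activation template.
#[derive(Debug, Clone, PartialEq)]
struct Template {
    mode: String,
    strict_lifecycle: bool,
}

/// A generation owns a copy of the settings it was activated with.
struct Generation {
    id: u64,
    settings: Template,
}

/// Hypothetical controller model holding only the current template.
struct Controller {
    template: Template,
    next_id: u64,
}

impl Controller {
    /// Snapshots the current template into a new generation.
    fn enable(&mut self) -> Generation {
        let generation = Generation { id: self.next_id, settings: self.template.clone() };
        self.next_id += 1;
        generation
    }

    /// Replaces the template; existing Generation values are untouched.
    fn reload_config(&mut self, new_template: Template) {
        self.template = new_template;
    }
}
```

Because enable() clones the template, a reload between two activations changes only what the second generation sees.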

Run-end policies

Supported policies:

  • continue_after_limits_hit (default)
  • auto_seal_on_limits_hit

Behavior:

  • continue_after_limits_hit: generation stays active after the first truncation
  • auto_seal_on_limits_hit: on the first limits_hit, new admissions stop and the generation moves to closing; finalization happens immediately if no captured requests are still in flight, otherwise after they drain
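
The two behaviors can be expressed as a small decision function. The enum and function names below are hypothetical and chosen for this sketch. Only the kind strings come from the TOML contract. Passing None models an absent run_end_policy table.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RunEndPolicy {
    ContinueAfterLimitsHit,
    AutoSealOnLimitsHit,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GenerationState {
    Active,
    Closing,
    Finalized,
}

/// Maps the TOML `kind` string to a policy; `None` means the table is absent.
fn resolve_policy(kind: Option<&str>) -> Result<RunEndPolicy, String> {
    match kind {
        None => Ok(RunEndPolicy::ContinueAfterLimitsHit), // documented default
        Some("continue_after_limits_hit") => Ok(RunEndPolicy::ContinueAfterLimitsHit),
        Some("auto_seal_on_limits_hit") => Ok(RunEndPolicy::AutoSealOnLimitsHit),
        Some(other) => Err(format!("unknown run_end_policy kind: {other}")),
    }
}

/// State of an active generation right after its first limits_hit.
fn on_first_limits_hit(policy: RunEndPolicy, in_flight: u32) -> GenerationState {
    match policy {
        RunEndPolicy::ContinueAfterLimitsHit => GenerationState::Active,
        // Nothing left to drain: finalize immediately.
        RunEndPolicy::AutoSealOnLimitsHit if in_flight == 0 => GenerationState::Finalized,
        // Captured requests still in flight: stop admissions and wait.
        RunEndPolicy::AutoSealOnLimitsHit => GenerationState::Closing,
    }
}
```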

TOML contract:

  • [controller.activation.run_end_policy] is optional
  • if that table is present, kind is required

Runtime sampler template

The controller can start a Tokio runtime sampler automatically for armed generations.

Important constraints:

  • sampler startup still requires an active Tokio runtime
  • sampler settings are fixed at activation time
  • runtime snapshot retention is still bounded by the resolved core capture limits
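
The last constraint can be read as "the sampler cannot retain more snapshots than the core limit allows". The sketch below assumes the effective bound is the smaller of the two values. That reading, and the function name, are assumptions made for illustration. The crate's actual resolution logic is not shown in this document.

```rust
/// Hypothetical helper: effective runtime snapshot retention.
/// Assumption: the sampler's own cap applies when set, but it can never
/// exceed the resolved core capture limit.
fn effective_runtime_snapshots(sampler_max: Option<usize>, core_max: usize) -> usize {
    match sampler_max {
        Some(requested) => requested.min(core_max),
        None => core_max,
    }
}
```

Under this assumption, the expanded TOML example (sampler max_runtime_snapshots = 20000, core override max_runtime_snapshots = 150000) would retain at most 20000 runtime snapshots.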

TOML field reference

[controller]

  • service_name (optional string): overrides the builder service name when present; must not be empty
  • initially_enabled (optional bool): when true, build() starts generation 1

[controller.activation]

  • mode (required string): light or investigation
  • strict_lifecycle (optional bool, default false)

[controller.activation.sink]

  • type (required string): local_json
  • output_path (required string for local_json): base path template for per-generation files

[controller.activation.capture_limits_override]

All fields are optional:

  • max_requests
  • max_stages
  • max_queues
  • max_inflight_snapshots
  • max_runtime_snapshots

[controller.activation.runtime_sampler]

Optional table. When it is omitted, the runtime sampler is disabled.

  • enabled_for_armed_runs
  • mode_override
  • interval_ms
  • max_runtime_snapshots

[controller.activation.run_end_policy]

Optional table. If present, kind is required.

  • kind = "continue_after_limits_hit"
  • kind = "auto_seal_on_limits_hit"

Important constraints

  • at most one generation is active at a time
  • active generation settings do not change after activation
  • requests remain bound to the generation that admitted them
  • controller capture and artifact analysis are separate; analysis happens in tailtriage-cli

Related crates

  • tailtriage: default entry point
  • tailtriage-core: direct instrumentation lifecycle
  • tailtriage-tokio: runtime-pressure sampling
  • tailtriage-cli: artifact analysis