tailtriage-cli
tailtriage-cli loads tailtriage run artifacts and turns them into a triage report.
Install it after capture instrumentation is in place.
The binary name is:
What this tool does
tailtriage-cli owns the analysis-side contract:
- load a captured artifact
- validate schema compatibility
- produce JSON or human-readable triage output
- rank likely bottleneck families
- emit evidence and next checks
The output is intended to guide the next investigation step. It does not prove root cause on its own.
Installation
Minimal usage
Default text output:
Machine-readable JSON output:
The CLI artifact loader requires at least one request event in requests.
How to read the result
Read output in this order:
primary_suspect.kindprimary_suspect.evidence[]primary_suspect.next_checks[]
Then run one targeted check, change one thing, and re-run under comparable load.
Representative output shape
inflight_trend may be null when no in-flight gauges were captured.
What the report contains
A report can include:
- request count
- request latency percentiles (
p50,p95,p99) - p95 queue/service share summaries
- optional in-flight trend summary
- report warnings from analysis/report generation (for example truncation-related)
- primary and secondary suspects
tailtriage analyze also prints loader/lifecycle warnings to stderr before the report. Those warnings are surfaced separately; they are not merged into the report warnings field.
Each suspect includes:
kindscoreconfidenceevidence[]next_checks[]
Artifact compatibility contract
The tailtriage analyze workflow expects a supported tailtriage run artifact with minimum required content.
Current contract:
- top-level
schema_versionis required - missing
schema_versionis rejected - non-integer
schema_versionis rejected - unsupported
schema_versionis rejected - current supported schema version is
1 requestsmust contain at least one request event- artifacts with an empty
requestsarray are rejected by the CLI loader
Library note:
- this crate's library analyzer API,
analyze::analyze_run(&Run), can analyze an in-memoryRunwith zero requests - the stricter non-empty
requestsrule applies to CLI artifact loading from disk
Important interpretation notes
- suspects are investigation leads, not proof of root cause
- truncation warnings mean the diagnosis is based on partial retained data
- unfinished lifecycle warnings printed by the CLI indicate some requests were not completed cleanly
p95_queue_share_permilleandp95_service_share_permilleare independent percentile summaries and do not need to sum to1000
Suspect kinds
The current report surface includes these suspect kinds:
application_queue_saturationblocking_pool_pressureexecutor_pressure_suspecteddownstream_stage_dominatesinsufficient_evidence
When the result is insufficient_evidence
Usually the next step is to add more structure to capture:
- add queue wrappers around suspected waits
- add stage wrappers around suspected downstream work
- optionally add runtime sampling if runtime pressure is unclear
- re-run under comparable load
What this tool does not do
tailtriage-cli does not capture instrumentation data.
Use capture-side crates for that:
tailtriage: recommended capture-side entry pointtailtriage-core: direct instrumentation primitivestailtriage-controller: repeated bounded windowstailtriage-tokio: runtime-pressure samplingtailtriage-axum: Axum request-boundary integration