# Scepter

[![Crates.io](https://img.shields.io/crates/v/scepter.svg)](https://crates.io/crates/scepter)
[![Documentation](https://docs.rs/scepter/badge.svg)](https://docs.rs/scepter)
[![Book](https://img.shields.io/badge/book-GitHub%20Pages-0969da)](https://copyleftdev.github.io/scepter/)
[![CI](https://github.com/copyleftdev/scepter/actions/workflows/ci.yml/badge.svg)](https://github.com/copyleftdev/scepter/actions/workflows/ci.yml)

Composable Rust primitives for large-scale time-series routing, indexing,
aggregation, and query planning.

`scepter` is a small library for building observability, metrics, stream
processing, and distributed query systems. It packages the kinds of low-level
building blocks described in the public Monarch paper into independent Rust
APIs: ordered keys, range assignment, field-hint indexes, histogram-like
distributions, collection aggregation, ingest routing, pushdown planning, and
standing-query sharding.

The crate is inspired by Monarch-style system design, but it is not Google
Monarch and does not depend on any Google service or internal API. The name
keeps the theme without claiming ownership: a scepter is the control instrument
of a monarch; this crate focuses on control-plane and data-plane primitives for
distributed time-series systems.

## Use Cases

- Route metric writes to shards by lexicographic target ranges.
- Prune distributed query fanout with compact field-hint indexes.
- Represent and merge distribution-valued metrics with exemplars.
- Turn cumulative points into delta windows.
- Aggregate ingestion streams before storage.
- Finalize bucketed delta aggregates behind an admission window.
- Select query replicas by coverage, density, completeness, and recovery state.
- Attach partial-result health metadata to degraded distributed queries.
- Export hot-path data as Apache Arrow batches behind an optional feature.
- Use Roaring-backed numeric field hints for compressed postings.
- Encode compact CBOR/Zstd wire payloads behind optional features.
- Split logical query plans into leaf, zone, and root execution fragments.
- Shard periodic standing queries across evaluator workers.

## Install

```sh
cargo add scepter
```

Or add it manually:

```toml
[dependencies]
scepter = "0.1"
```

## Quick Example

```rust
use scepter::{FieldHintIndex, FieldPredicate, RangeAssigner};

let mut ranges = RangeAssigner::new();
ranges.assign(b"a".to_vec()..b"m".to_vec(), "leaf-1")?;
ranges.assign(b"m".to_vec()..b"z".to_vec(), "leaf-2")?;

assert_eq!(ranges.worker_for("latency"), Some(&"leaf-1"));

let mut index = FieldHintIndex::new();
index.insert_value("ComputeTask", "job", "monarch", "leaf-1");

let candidates = index.candidates(
    "ComputeTask",
    "job",
    &FieldPredicate::Equals("monarch".to_owned()),
);
assert!(candidates.contains("leaf-1"));

# Ok::<(), scepter::ShardError>(())
```

## Distribution Metrics

```rust
use scepter::{BucketLayout, Distribution};

let layout = BucketLayout::fixed_width(0.0, 10.0, 3)?;
let mut latency = Distribution::<()>::from_layout(&layout);

latency.record(4.2, 1)?;
latency.record(18.0, 2)?;

assert_eq!(latency.total_count(), 3);
assert!(latency.percentile(99.0)?.is_some());

# Ok::<(), scepter::DistributionError>(())
```
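The fixed-width layout in the example can be understood with a small standalone sketch of the arithmetic it implies (this is an illustration, not scepter's internal representation): values below the lower bound land in an underflow slot, values past the last boundary in an overflow slot.

```rust
/// Map a value onto a fixed-width layout: `num_buckets` buckets of `width`
/// starting at `start`, plus an underflow (0) and overflow slot.
/// Conceptual sketch only; scepter's actual bucket encoding may differ.
fn bucket_index(value: f64, start: f64, width: f64, num_buckets: usize) -> usize {
    if value < start {
        return 0; // underflow
    }
    let idx = ((value - start) / width) as usize;
    if idx >= num_buckets {
        num_buckets + 1 // overflow
    } else {
        idx + 1 // finite buckets occupy 1..=num_buckets
    }
}

fn main() {
    // Layout matching the example: start 0.0, width 10.0, 3 buckets.
    assert_eq!(bucket_index(4.2, 0.0, 10.0, 3), 1); // falls in [0, 10)
    assert_eq!(bucket_index(18.0, 0.0, 10.0, 3), 2); // falls in [10, 20)
    assert_eq!(bucket_index(35.0, 0.0, 10.0, 3), 4); // overflow
}
```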

## Collection Aggregation

```rust
use scepter::{AdmissionWindow, BucketedAggregator, DeltaPoint, SumReducer};

let mut iops =
    BucketedAggregator::with_reducer(60, AdmissionWindow::new(10), SumReducer)?;

iops.ingest(DeltaPoint {
    key: ("cluster-a", "storage-user"),
    end_time: 12,
    value: 3_000_u64,
})?;
iops.ingest(DeltaPoint {
    key: ("cluster-a", "storage-user"),
    end_time: 38,
    value: 2_000_u64,
})?;

let finalized = iops.advance_to(70);

assert_eq!(finalized[0].start, 0);
assert_eq!(finalized[0].end, 60);
assert_eq!(finalized[0].value, 5_000);

# Ok::<(), scepter::CollectionError>(())
```
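The admission-window behavior above can be sketched in a few lines of plain Rust (a conceptual model, not scepter's implementation): deltas are summed into period-aligned buckets, and a bucket is only finalized once the clock has advanced past `bucket_end + window`, so late writes can still be admitted.

```rust
use std::collections::BTreeMap;

/// Conceptual sketch of bucketed delta aggregation behind an admission
/// window. `buckets` maps each bucket's start time to its running sum.
struct Sketch {
    period: u64,
    window: u64,
    buckets: BTreeMap<u64, u64>,
}

impl Sketch {
    fn ingest(&mut self, end_time: u64, value: u64) {
        // Align the delta to the bucket covering its end time.
        let start = (end_time / self.period) * self.period;
        *self.buckets.entry(start).or_insert(0) += value;
    }

    /// Drain every bucket whose admission window has fully elapsed.
    fn advance_to(&mut self, now: u64) -> Vec<(u64, u64, u64)> {
        let mut done = Vec::new();
        self.buckets.retain(|&start, &mut value| {
            let end = start + self.period;
            if now >= end + self.window {
                done.push((start, end, value));
                false // finalized: remove from the live map
            } else {
                true
            }
        });
        done
    }
}

fn main() {
    let mut s = Sketch { period: 60, window: 10, buckets: BTreeMap::new() };
    s.ingest(12, 3_000);
    s.ingest(38, 2_000);
    assert!(s.advance_to(65).is_empty()); // 65 < 60 + 10: still admitting
    assert_eq!(s.advance_to(70), vec![(0, 60, 5_000)]); // 70 >= 60 + 10
}
```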

## Query Reliability

```rust
use scepter::{
    IssueKind, QueryHealth, ReplicaCandidate, ReplicaQuality, ReplicaResolver,
    ReplicaState,
};

let resolved = ReplicaResolver::with_max_fallbacks(1).resolve(vec![
    ReplicaCandidate::new(
        b"a".to_vec()..b"m".to_vec(),
        "leaf-a",
        ReplicaQuality::new(0, 60, 60, 60, true, ReplicaState::Available),
    ),
    ReplicaCandidate::new(
        b"a".to_vec()..b"m".to_vec(),
        "leaf-b",
        ReplicaQuality::new(0, 60, 55, 60, true, ReplicaState::Recovering),
    ),
]);

assert_eq!(resolved[0].primary, "leaf-a");
assert_eq!(resolved[0].fallbacks, vec!["leaf-b"]);

let mut health = QueryHealth::with_expected_children(2);
health.record_completed();
health.push_issue("zone-west", IssueKind::PrunedZone, "soft deadline elapsed");

assert!(health.is_partial());
assert_eq!(health.completeness(), 0.5);
```

## Primitive Families

`scepter` is organized by concern:

- `key`: ordered key encoding with `LexicographicKey` and `KeyEncoder`.
- `model`: schema-rich time-series types such as `TargetSchema`,
  `MetricSchema`, `TimeSeriesKey`, `MetricKind`, and `LocationResolver`.
- `shard`: range assignment, range load scoring, worker lookup, reassignment,
  and range splitting.
- `hint`: compact field-hint indexing with trigrams, n-grams, full-string
  excerpts, predicates, intersections, and unions.
- `distribution`: bucket layouts, distribution values, exemplars, cumulative
  points, percentile estimates, and delta windows.
- `aggregate`: distributed aggregation traits, `Sum`, and
  `CollectionAggregator`.
- `collect`: bucketed delta aggregation with admission windows, finalized
  buckets, load-smearing offsets, and pluggable reducers.
- `ingest`: write envelopes, stale-write drop policy, and range-based ingest
  routing.
- `query`: logical plans, execution levels, fanout plans, and pushdown
  fragments.
- `reliability`: replica quality ranking, primary/fallback replica resolution,
  and partial-result query health metadata.
- `standing`: periodic standing queries and stable evaluator sharding.
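The "stable evaluator sharding" idea in `standing` can be illustrated with a minimal sketch (the function name and hashing choice here are assumptions for illustration, not the crate's API): hash each standing query's identity deterministically so the same query always lands on the same worker for a fixed worker count.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Conceptual sketch: assign a standing query to one of `num_workers`
/// evaluators by hashing its identity. `DefaultHasher` is stable within a
/// process; a production system would pin a fixed hash function so
/// assignments survive restarts.
fn evaluator_for(query_id: &str, num_workers: u64) -> u64 {
    let mut h = DefaultHasher::new();
    query_id.hash(&mut h);
    h.finish() % num_workers
}

fn main() {
    // The assignment is deterministic for a fixed worker count.
    let a = evaluator_for("errors-by-zone", 8);
    assert_eq!(a, evaluator_for("errors-by-zone", 8));
    assert!(a < 8);
}
```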

## Performance Features

The default crate has no runtime dependencies. Optional features enable
interchange and compression formats for high-volume paths:

```toml
[dependencies]
scepter = { version = "0.1", features = ["arrow", "compressed-postings", "wire"] }
```

- `arrow`: Apache Arrow `RecordBatch` exporters for finalized buckets, replica
  candidates, and query-health issues.
- `compressed-postings`: `NumericFieldHintIndex`, a Roaring-backed field-hint
  index for stable `u32` child ordinals and dense set operations. The generic
  `FieldHintIndex` remains faster for tiny singleton-style candidate lookups.
- `cbor`: compact CBOR helpers for wire payloads.
- `zstd`: Zstandard compression helpers.
- `wire`: convenience feature enabling `cbor` and `zstd` together.

## Design Principles

- No runtime dependencies for the library itself.
- Invalid input is reported through `Result` values; the public API does not
  panic on recoverable errors.
- `unsafe` is forbidden.
- Public API documentation is required by `#![deny(missing_docs)]`.
- Public error types implement `std::error::Error`.
- APIs are small and composable instead of tied to one database runtime.
- Tests include unit tests, property tests, and mutation-regression tests.

## What This Crate Is Not

`scepter` is not a complete time-series database. It does not provide storage,
replication, networking, persistence, query parsing, or a hosted service. It is
the reusable primitive layer you can use while building those systems.

## Test Strategy

Scepter uses three layers of tests:

- Example-sized unit tests live beside each module.
- Property tests in `tests/properties.rs` check invariants over generated
  inputs.
- Mutation-regression tests in `tests/mutation_regression.rs` lock down
  behavior that mutation testing has proven easy to under-specify.

The current mutation suite catches every viable mutant reported by
`cargo-mutants`.

Quality gates used before publishing:

```sh
cargo fmt --check
cargo test
cargo clippy --all-targets -- -D warnings
cargo bench --bench critical_paths
cargo audit
cargo deny check advisories bans licenses sources
cargo mutants --timeout 120 --jobs 2
cargo publish --dry-run
```

## Benchmarking

Critical-path benchmarks live in `benches/critical_paths.rs` and use
Criterion. They cover:

- sortable key encoding
- range assignment lookup and splitting
- field-hint indexing and candidate pruning
- distribution record, merge, percentile, and delta operations
- collection aggregation and bucketed delta finalization
- query pushdown planning
- replica resolution
- standing-query evaluator sharding

Run them with:

```sh
cargo bench --bench critical_paths
```

The first benchmark pass exposed a slow field-hint candidate path. Switching
candidate intersection to start from the smallest posting list improved exact
trigram lookup from millisecond-scale to low-microsecond-scale in the 10k-value
benchmark.
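The optimization described above is a general posting-list technique, sketched below in plain Rust (not scepter's internal code): iterate the smallest set and probe the others, so the cost is bounded by the most selective hint rather than the largest list.

```rust
use std::collections::HashSet;

/// Smallest-first intersection: sort the posting sets by size, walk the
/// smallest one, and keep only ids present in every other set.
fn intersect_smallest_first(mut sets: Vec<&HashSet<u32>>) -> HashSet<u32> {
    if sets.is_empty() {
        return HashSet::new();
    }
    sets.sort_by_key(|s| s.len());
    let (smallest, rest) = sets.split_first().unwrap();
    smallest
        .iter()
        .copied()
        .filter(|id| rest.iter().all(|s| s.contains(id)))
        .collect()
}

fn main() {
    let large: HashSet<u32> = (0..10_000).collect();
    let small: HashSet<u32> = [3, 42, 9_999].into_iter().collect();
    // Only three probes into the 10k-element set, not 10k probes.
    let got = intersect_smallest_first(vec![&large, &small]);
    assert_eq!(got, small);
}
```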

## Worked Example

The `mini_engine` example wires the primitives together into a small
observability-engine flow: route an encoded write, index field hints, prune
query fanout, split a logical query into pushdown fragments, and merge
distribution-valued leaf results.

Run it with:

```sh
cargo run --example mini_engine
```

## Book

The mdBook guide is published at <https://copyleftdev.github.io/scepter/>.

Local docs workflow:

```sh
just book
just book-serve
```

## License

Licensed under either of:

- Apache License, Version 2.0
- MIT license

at your option.