
Module adaptive


Adaptive concurrency controller for client data operations.

Replaces hard-coded quote_concurrency / store_concurrency / download fan-out with a per-channel AIMD limiter that ramps up when the network is healthy and ramps down on stress signals (timeouts, errors, latency inflation). The goal is to give every machine and every connection profile a single client codebase that finds its own steady state without the user tweaking flags.

§Channels

Three independent limiters share the same algorithm but track state separately, because their workloads have different cost profiles:

  • quote — small DHT request/response messages, cheap per op
  • store — multi-MB chunk PUTs to a close group, expensive per op
  • fetch — multi-MB chunk GETs from peers, asymmetric to store

§Algorithm

TCP-style AIMD with slow-start:

  • Slow-start: concurrency doubles after each healthy window until the first stress signal arrives or the configured ceiling is reached.
  • Steady state: additive +1 per healthy window (>= success_target success rate AND p95 latency within latency_inflation_factor of the rolling baseline).
  • Stress: multiplicative decrease (current / 2, floor 1) on any of: success rate < success_target, timeout rate > timeout_ceiling, or p95 latency above latency_inflation_factor * baseline.

Decisions evaluate over a sliding window of the last window_ops observed outcomes per channel. Below min_window_ops outcomes the controller holds steady — too few samples to act on.
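As a concrete illustration, the decision rule above can be sketched as a single step function. Parameter names mirror the docs; the real Limiter tracks the sliding window and rolling baseline internally, so this is a simplified stand-in, not the actual implementation:

```rust
/// One controller decision over a completed window (simplified sketch).
#[derive(Clone, Copy)]
struct Window {
    success_rate: f64,
    timeout_rate: f64,
    p95_ms: f64,
}

struct Aimd {
    current: usize,
    ceiling: usize,
    slow_start: bool,
    baseline_p95_ms: f64, // a rolling baseline in the real controller
    success_target: f64,
    timeout_ceiling: f64,
    latency_inflation_factor: f64,
}

impl Aimd {
    fn step(&mut self, w: Window) {
        let stressed = w.success_rate < self.success_target
            || w.timeout_rate > self.timeout_ceiling
            || w.p95_ms > self.latency_inflation_factor * self.baseline_p95_ms;
        if stressed {
            // Multiplicative decrease; the first stress signal also ends slow-start.
            self.slow_start = false;
            self.current = (self.current / 2).max(1);
        } else if self.slow_start {
            // Slow-start: double each healthy window until the ceiling.
            self.current = (self.current * 2).min(self.ceiling);
            if self.current == self.ceiling {
                self.slow_start = false;
            }
        } else {
            // Steady state: additive +1 per healthy window.
            self.current = (self.current + 1).min(self.ceiling);
        }
    }
}
```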

§What this is not

  • Not a payment-batching controller. Wave / batch sizes are orthogonal (gas-economics tradeoff, not throughput).
  • Not a peer-quality scorer. That lives in peer_cache and feeds BootstrapManager. Outcomes flow into both, separately.

Structs§

AdaptiveConfig
Tunable knobs for the adaptive controller. Defaults are picked so that the controller behaves at least as well as the prior static defaults on a healthy network: starts at the previous static value and only deviates when signals demand it.
AdaptiveController
Bundle of per-channel limiters owned by the Client.
ChannelMax
Per-channel concurrency ceilings. Each channel has its own cap so that constraining one (e.g. user pinned a low store concurrency for a slow uplink) never bleeds into another (e.g. fetch).
ChannelStart
Suggested starting concurrency per channel for a brand-new client with no persisted state. Intentionally matches or exceeds the prior static defaults so the cold path is not slower.
Limiter
Per-channel adaptive limiter.
LimiterConfig
Per-limiter configuration. Carries the shared adaptive parameters plus the channel-specific max_concurrency. Held behind an Arc so cloning a Limiter is a refcount bump rather than a struct copy (avoids allocating AdaptiveConfig-worth of bytes per chunk in hot loops).

Enums§

Outcome
Outcome of a single observed operation on one channel.

Functions§

default_persist_path
Default persistence path: <data_dir>/client_adaptive.json. Falls back to None if the platform data dir is not resolvable; in that case the controller still works, it just won’t persist.
load_snapshot
Load a persisted snapshot from disk, returning None if the file does not exist, is unreadable, contains malformed JSON, or has a schema version this build does not understand. Persistence is best effort — never propagate errors that would block the user’s operation.
observe_op
Helper for instrumented call sites: time an async op, classify the result, and report to a Limiter. Returns the original result.
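A synchronous sketch of that pattern, assuming a simplified stand-in Outcome; the real helper is async and reports to a Limiter rather than a Vec:

```rust
use std::time::{Duration, Instant};

/// Simplified stand-in for the module's Outcome enum.
#[derive(Debug, PartialEq)]
enum Outcome {
    Success(Duration),
    Error(Duration),
}

/// Time an operation, classify the result, record it, and
/// hand the original result back unchanged.
fn observe_op<T, E>(
    report: &mut Vec<Outcome>,
    op: impl FnOnce() -> Result<T, E>,
) -> Result<T, E> {
    let start = Instant::now();
    let result = op();
    let elapsed = start.elapsed();
    report.push(match &result {
        Ok(_) => Outcome::Success(elapsed),
        Err(_) => Outcome::Error(elapsed),
    });
    result
}
```

The key property is that instrumentation is transparent: call sites keep their `Result` flow and gain outcome reporting for free.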
rebucketed
Backward-compatible wrapper. ordered = false -> rolling unordered. ordered = true -> the OLD batch-fence ordered path (kept for tests that explicitly assert batch-fence semantics). New call sites should use rebucketed_unordered or rebucketed_ordered directly.
rebucketed_ordered
Ordered variant: the caller tags items with a usize index (typically via iter.enumerate()); after rolling completion, results are sorted by index so output preserves input order. Use this when results are passed to APIs that consume them positionally (e.g. self_encryption’s get_root_data_map_parallel zips Vec<(idx, Bytes)> with input hashes positionally and discards the idx — without a final sort the bytes would pair with the wrong hashes).
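The final reordering step is small but load-bearing; a minimal sketch of it (hypothetical helper name, not part of this module's API):

```rust
/// Restore input order after unordered completion: items were tagged with
/// their enumerate() index before processing, so a sort by that index puts
/// results back in positional order for positional consumers.
fn reorder<T>(mut completed: Vec<(usize, T)>) -> Vec<T> {
    completed.sort_by_key(|(idx, _)| *idx);
    completed.into_iter().map(|(_, item)| item).collect()
}
```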
rebucketed_unordered
Process an iterator of items with a rolling scheduler whose cap is re-read from the limiter as each slot frees. Replaces the “snapshot the cap once at pipeline build” behavior of plain buffer_unordered(N) so a long pipeline (e.g. 10 GB download = ~2500 chunks) sees adaptive growth/decay mid-flight.
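The difference from a snapshotted cap can be sketched synchronously. This is a simplified, single-threaded stand-in (the real function is an async rolling scheduler); "completing" an op is simulated, and the point is only that the cap is re-read every time a slot frees:

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Drain `items` through a bounded in-flight set whose bound is re-read
/// from `cap` on every scheduling pass, so growth/decay applies mid-flight.
fn rolling_schedule(items: Vec<u32>, cap: &AtomicUsize) -> Vec<u32> {
    let mut pending: VecDeque<u32> = items.into();
    let mut in_flight: Vec<u32> = Vec::new();
    let mut done = Vec::new();
    while !pending.is_empty() || !in_flight.is_empty() {
        // Re-read the adaptive cap before filling free slots — this is what
        // plain buffer_unordered(N) cannot do after pipeline build.
        let cap_now = cap.load(Ordering::Relaxed).max(1);
        while in_flight.len() < cap_now {
            match pending.pop_front() {
                Some(item) => in_flight.push(item),
                None => break,
            }
        }
        // Simulate one op completing (stand-in for awaiting a future).
        if let Some(item) = in_flight.pop() {
            done.push(item * 2); // pretend work
        }
    }
    done
}
```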
save_snapshot
Save a snapshot to disk atomically (write to <path>.tmp, then rename). Best effort — failures are logged at warn and discarded.
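The write-then-rename shape looks roughly like this (hypothetical function name; errors are returned here for clarity, whereas the real function logs at warn and discards them):

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Atomic best-effort save: write the full payload to `<path>.tmp`,
/// then rename over the target. rename() is atomic on the same
/// filesystem, so readers never observe a half-written file.
fn save_atomic(path: &Path, bytes: &[u8]) -> std::io::Result<()> {
    let mut tmp_os = path.as_os_str().to_owned();
    tmp_os.push(".tmp");
    let tmp = PathBuf::from(tmp_os);
    fs::write(&tmp, bytes)?;
    fs::rename(&tmp, path)
}
```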
save_snapshot_with_timeout
Save with a wall-clock deadline. Spawns the synchronous save_snapshot on a detached thread and waits up to timeout for it to finish. If the thread is still running past the deadline (e.g. because the data dir is on a hung NFS mount), returns without joining — the OS will clean up the thread when the process exits.
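One way to sketch that detached-thread deadline with std primitives (hypothetical name and simplified signature; the save is abstracted as a closure):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run a blocking save on a detached thread and wait at most `timeout`
/// for its completion signal. Returns true if the save finished in time.
/// On timeout we simply return without joining; if the thread is stuck
/// (e.g. a hung NFS mount), the OS reclaims it at process exit.
fn save_with_timeout(save: impl FnOnce() + Send + 'static, timeout: Duration) -> bool {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        save();
        // The receiver may already be gone if we timed out; ignore that.
        let _ = tx.send(());
    });
    rx.recv_timeout(timeout).is_ok()
}
```

Dropping the receiver on timeout is safe: the worker's `send` fails harmlessly, so the hung thread never blocks or panics the caller.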