Struct SamplingDispatcher 

pub struct SamplingDispatcher {
    pub default: CodecKind,
    pub entropy_threshold: f64,
    pub prefer_gpu: bool,
    pub gpu_min_bytes: usize,
}
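A dispatcher that inspects the input sample and selects a codec.

Decision order (earlier rules take precedence):

  1. Input too short (< 128 bytes) → default
  2. Magic bytes identify an already-compressed format (gzip / zstd / png / jpeg / mp4 / zip / pdf / 7z / xz / bzip2) → Passthrough (recompressing gains nothing)
  3. Shannon entropy is at least entropy_threshold (default 7.5 bits/byte) → Passthrough (high entropy = nearly random = no room to compress)
  4. Otherwise → default (text, logs, Parquet numeric columns and similar data that still has room to compress)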
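
The decision order above can be sketched as follows. This is a minimal, synchronous illustration and not the crate's implementation: the CodecKind enum is a local stand-in, the free functions shannon_entropy and pick are hypothetical names, and the magic-byte table is a deliberately small subset (formats such as mp4, whose signature does not sit at offset 0, are left out).

```rust
/// Local stand-in for the crate's `CodecKind` (assumption: only the
/// variants needed for this sketch are modelled).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CodecKind {
    Passthrough,
    CpuZstd,
}

/// Mirrors `SamplingDispatcher::MIN_SAMPLE_BYTES`.
pub const MIN_SAMPLE_BYTES: usize = 128;

/// Illustrative subset of magic prefixes for already-compressed formats.
const MAGIC_PREFIXES: &[&[u8]] = &[
    &[0x1f, 0x8b],             // gzip
    &[0x28, 0xb5, 0x2f, 0xfd], // zstd frame
    &[0x89, b'P', b'N', b'G'], // png
    &[0xff, 0xd8, 0xff],       // jpeg
    b"PK\x03\x04",             // zip
    b"%PDF",                   // pdf
];

/// Shannon entropy of the sample in bits per byte (0.0 ..= 8.0).
pub fn shannon_entropy(sample: &[u8]) -> f64 {
    if sample.is_empty() {
        return 0.0;
    }
    // Histogram of byte values.
    let mut counts = [0u64; 256];
    for &b in sample {
        counts[b as usize] += 1;
    }
    let n = sample.len() as f64;
    // H = -sum(p * log2(p)) over byte values that actually occur.
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}

/// Applies the four rules in priority order.
pub fn pick(sample: &[u8], default: CodecKind, entropy_threshold: f64) -> CodecKind {
    // Rule 1: too short to judge, fall back to the default codec.
    if sample.len() < MIN_SAMPLE_BYTES {
        return default;
    }
    // Rule 2: already-compressed container, do not recompress.
    if MAGIC_PREFIXES.iter().any(|m| sample.starts_with(m)) {
        return CodecKind::Passthrough;
    }
    // Rule 3: near-random data, no room left to compress.
    if shannon_entropy(sample) >= entropy_threshold {
        return CodecKind::Passthrough;
    }
    // Rule 4: everything else is assumed compressible.
    default
}
```

In this sketch a 256-byte sample containing every byte value exactly once has an entropy of 8 bits/byte and is passed through, while 200 repeated 'a' bytes have an entropy of 0 and go to the default codec.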

Phase 1 assumes default = CpuZstd. Later in Phase 1, integer-column detection will be added, extending the default branch to "NvcompBitcomp for numeric columns, CpuZstd otherwise".

§v0.8 #56: GPU auto-detect at boot

Calling with_gpu_preference(true, gpu_min_bytes) promotes "objects whose default is CpuZstd and whose total size >= gpu_min_bytes" to NvcompZstd, but only if s4_codec::nvcomp::is_gpu_available() returned true at boot. When the size hint is None (chunked transfer), or for small objects below the threshold, the CPU codec is kept to avoid the GPU upload overhead.

When the nvcomp-gpu feature is off at build time, no promotion to NvcompZstd takes place (pointing at a codec that is not in the registry would fail at dispatch time with UnregisteredCodec). The orchestrator guarantees this by forcing prefer_gpu = false in main.rs.
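
A minimal sketch of the promotion rule, under stated assumptions: CodecKind is a local stand-in, promote is a hypothetical helper name, and the gpu_available flag stands for the combined result of the build-time feature check and the boot-time probe (the real dispatcher may structure this differently).

```rust
/// Local stand-in for the crate's `CodecKind` (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CodecKind {
    Passthrough,
    CpuZstd,
    NvcompZstd,
}

/// Rewrites a `CpuZstd` pick to `NvcompZstd` only when every condition holds:
/// GPU preference is on, a GPU was detected, and the caller supplied a total
/// size that reaches the threshold. Anything else is returned unchanged.
pub fn promote(
    picked: CodecKind,
    total_size: Option<u64>,
    prefer_gpu: bool,
    gpu_available: bool,
    gpu_min_bytes: usize,
) -> CodecKind {
    match (picked, total_size) {
        (CodecKind::CpuZstd, Some(size))
            if prefer_gpu && gpu_available && size >= gpu_min_bytes as u64 =>
        {
            CodecKind::NvcompZstd
        }
        // No size hint (chunked transfer), small object, non-zstd pick,
        // or GPU disabled: keep the original choice.
        _ => picked,
    }
}
```

Keeping the unknown-size case on the CPU path is the conservative choice described above: without a size hint there is no way to show that the upload overhead will be amortized.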

Fields§

§default: CodecKind

§entropy_threshold: f64

§prefer_gpu: bool

v0.8 #56: when set, route large CpuZstd picks through NvcompZstd.

§gpu_min_bytes: usize

v0.8 #56: GPU promotion only fires when the caller can prove total_size >= gpu_min_bytes via pick_with_size_hint. Below this threshold the GPU upload overhead exceeds the compress time so CPU wins; the default 1 MiB is the empirical break-even point on common text / log payloads with PCIe 4.0 + an A10G-class GPU.

Implementations§

impl SamplingDispatcher

pub const DEFAULT_ENTROPY_THRESHOLD: f64 = 7.5

pub const MIN_SAMPLE_BYTES: usize = 128

pub const DEFAULT_GPU_MIN_BYTES: usize = 1_048_576

v0.8 #56: 1 MiB. The empirical break-even point — below this, the PCIe upload + kernel launch overhead dominates the GPU’s compress throughput advantage.

pub fn new(default: CodecKind) -> Self

pub fn with_entropy_threshold(self, t: f64) -> Self

pub fn with_gpu_preference(self, prefer_gpu: bool, gpu_min_bytes: usize) -> Self

v0.8 #56: enable GPU promotion. When prefer_gpu = true, a CpuZstd pick on a body whose total_size >= gpu_min_bytes is rewritten to NvcompZstd. Pass prefer_gpu = false (the default) to disable. The threshold is in bytes; 1_048_576 (1 MiB) is the recommended default for PCIe 4.0 hosts.
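
The builder methods chain in the usual consuming style. The sketch below re-declares the struct locally so that it compiles on its own; the field layout, constants and method signatures follow this page, but the method bodies and the build_dispatcher helper are assumptions made for illustration.

```rust
/// Local stand-in for the crate's `CodecKind` (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CodecKind {
    CpuZstd,
}

/// Local mirror of the documented struct; method bodies are assumed.
#[derive(Debug, Clone)]
pub struct SamplingDispatcher {
    pub default: CodecKind,
    pub entropy_threshold: f64,
    pub prefer_gpu: bool,
    pub gpu_min_bytes: usize,
}

impl SamplingDispatcher {
    pub const DEFAULT_ENTROPY_THRESHOLD: f64 = 7.5;
    pub const DEFAULT_GPU_MIN_BYTES: usize = 1_048_576;

    /// Starts from the documented defaults: GPU promotion disabled.
    pub fn new(default: CodecKind) -> Self {
        Self {
            default,
            entropy_threshold: Self::DEFAULT_ENTROPY_THRESHOLD,
            prefer_gpu: false,
            gpu_min_bytes: Self::DEFAULT_GPU_MIN_BYTES,
        }
    }

    pub fn with_entropy_threshold(mut self, t: f64) -> Self {
        self.entropy_threshold = t;
        self
    }

    pub fn with_gpu_preference(mut self, prefer_gpu: bool, gpu_min_bytes: usize) -> Self {
        self.prefer_gpu = prefer_gpu;
        self.gpu_min_bytes = gpu_min_bytes;
        self
    }
}

/// Hypothetical boot-time helper: the caller passes in the result of its
/// GPU probe, so promotion is requested only when a GPU is usable.
pub fn build_dispatcher(gpu_available: bool) -> SamplingDispatcher {
    SamplingDispatcher::new(CodecKind::CpuZstd)
        .with_entropy_threshold(7.0)
        .with_gpu_preference(gpu_available, 4 * SamplingDispatcher::DEFAULT_GPU_MIN_BYTES)
}
```

Here the threshold is raised to 4 MiB purely as an example of tuning; on hosts that match the PCIe 4.0 profile described above, DEFAULT_GPU_MIN_BYTES is the documented recommendation.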

Trait Implementations§

impl Clone for SamplingDispatcher

fn clone(&self) -> SamplingDispatcher

Returns a duplicate of the value.
1.0.0 (const: unstable)

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl CodecDispatcher for SamplingDispatcher

fn pick<'life0, 'life1, 'async_trait>( &'life0 self, sample: &'life1 [u8], ) -> Pin<Box<dyn Future<Output = CodecKind> + Send + 'async_trait>>
where Self: 'async_trait, 'life0: 'async_trait, 'life1: 'async_trait,

fn pick_with_size_hint<'life0, 'life1, 'async_trait>( &'life0 self, sample: &'life1 [u8], total_size: Option<u64>, ) -> Pin<Box<dyn Future<Output = CodecKind> + Send + 'async_trait>>
where Self: 'async_trait, 'life0: 'async_trait, 'life1: 'async_trait,

v0.8 #56: size-hint aware pick. The default implementation delegates to pick(sample), so only dispatchers that make use of the extra information (such as SamplingDispatcher) need to override it. total_size = None represents the case where a chunked transfer carries no content-length.
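
The delegation pattern can be illustrated with a simplified, synchronous version of the trait. The real CodecDispatcher methods are async (the signatures above are the async_trait expansion); the trait below, along with FixedDispatcher and SizeAwareDispatcher, is a hypothetical stand-in that only shows how a provided method lets most implementors ignore the size hint.

```rust
/// Local stand-in for the crate's `CodecKind` (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CodecKind {
    CpuZstd,
    NvcompZstd,
}

/// Simplified synchronous analogue of `CodecDispatcher`.
pub trait CodecDispatcher {
    fn pick(&self, sample: &[u8]) -> CodecKind;

    /// Provided method: ignores the hint and delegates to `pick`.
    fn pick_with_size_hint(&self, sample: &[u8], _total_size: Option<u64>) -> CodecKind {
        self.pick(sample)
    }
}

/// Implements only `pick`; inherits the delegating default.
pub struct FixedDispatcher(pub CodecKind);

impl CodecDispatcher for FixedDispatcher {
    fn pick(&self, _sample: &[u8]) -> CodecKind {
        self.0
    }
}

/// Overrides the provided method because it can use the size hint.
pub struct SizeAwareDispatcher {
    pub min_bytes: u64,
}

impl CodecDispatcher for SizeAwareDispatcher {
    fn pick(&self, _sample: &[u8]) -> CodecKind {
        CodecKind::CpuZstd
    }

    fn pick_with_size_hint(&self, sample: &[u8], total_size: Option<u64>) -> CodecKind {
        match total_size {
            // A known size at or above the threshold selects the GPU codec.
            Some(n) if n >= self.min_bytes => CodecKind::NvcompZstd,
            // `None` models a chunked transfer without content-length.
            _ => self.pick(sample),
        }
    }
}
```

With this shape, adding pick_with_size_hint to the trait does not break existing implementors, which matches the stated intent that only size-aware dispatchers need to override it.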

impl Debug for SamplingDispatcher

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.
