Struct ConfigValueGroup 

pub struct ConfigValueGroup {
    pub min_spacing_between_global_dedup_queries: usize,
    pub local_cas_scheme: String,
    pub max_concurrent_file_ingestion: usize,
    pub max_concurrent_file_downloads: usize,
    pub ingestion_block_size: ByteSize,
    pub progress_update_interval: Duration,
    pub progress_update_speed_sampling_window: Duration,
    pub progress_update_speed_min_observations: u32,
    pub session_xorb_metadata_flush_interval: Duration,
    pub session_xorb_metadata_flush_max_count: usize,
    pub default_cas_endpoint: String,
    pub aggregate_progress: bool,
    pub default_prefix: String,
    pub staging_subdir: String,
}

ConfigValueGroup struct containing all configurable values

Fields§

§min_spacing_between_global_dedup_queries: usize

Gives the minimum spacing in number of chunks between global dedup queries sent to the server to limit the number of simultaneous queries.

The default value is 256, which means the server receives at most one query per 256 chunks, or per 4MB of data.

Use the environment variable HF_XET_DATA_MIN_SPACING_BETWEEN_GLOBAL_DEDUP_QUERIES to set this value.
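
A minimal sketch of how such a spacing rule could be enforced, assuming chunks are numbered sequentially. `QueryThrottle` and `should_query` are hypothetical names used for illustration; they are not part of this crate.

```rust
/// Hypothetical helper (not part of the crate): decides whether a global
/// dedup query may be sent for a chunk, enforcing a minimum spacing.
pub struct QueryThrottle {
    min_spacing: usize,
    last_query: Option<usize>,
}

impl QueryThrottle {
    pub fn new(min_spacing: usize) -> Self {
        Self { min_spacing, last_query: None }
    }

    /// Returns true if a query is allowed at `chunk_index` and records it.
    pub fn should_query(&mut self, chunk_index: usize) -> bool {
        let allowed = match self.last_query {
            // The very first chunk may always trigger a query.
            None => true,
            // Later chunks must be at least `min_spacing` chunks further on.
            Some(last) => chunk_index.saturating_sub(last) >= self.min_spacing,
        };
        if allowed {
            self.last_query = Some(chunk_index);
        }
        allowed
    }
}
```

With a spacing of 256, feeding chunk indices 0 through 1023 into `should_query` allows a query at indices 0, 256, 512, and 768 only.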

§local_cas_scheme: String

Scheme for a local filesystem-based CAS server.

The default value is “local://”.

Use the environment variable HF_XET_DATA_LOCAL_CAS_SCHEME to set this value.

§max_concurrent_file_ingestion: usize

The maximum number of files to ingest at once on the upload path. High performance mode (enabled via HF_XET_HIGH_PERFORMANCE or HF_XET_HP) automatically sets this to 100 via XetConfig::with_high_performance().

The default value is 8.

Use the environment variable HF_XET_DATA_MAX_CONCURRENT_FILE_INGESTION to set this value.

§max_concurrent_file_downloads: usize

The maximum number of files to download at once on the download path.

The default value is 8.

Use the environment variable HF_XET_DATA_MAX_CONCURRENT_FILE_DOWNLOADS to set this value.
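
The two concurrency limits above cap how many files are processed at the same time. The sketch below shows one generic way to impose such a cap with a fixed number of worker threads from the standard library. `run_with_limit` is a hypothetical helper, not the crate's implementation, which may use a different mechanism.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Hypothetical helper: runs `job` on every item using at most `limit`
/// worker threads and returns the peak number of jobs that were observed
/// running at the same time.
pub fn run_with_limit<T, F>(items: Vec<T>, limit: usize, job: F) -> usize
where
    T: Send,
    F: Fn(T) + Sync,
{
    let queue = Mutex::new(VecDeque::from(items));
    let active = AtomicUsize::new(0);
    let peak = AtomicUsize::new(0);

    thread::scope(|s| {
        // At least one worker, so a limit of 0 cannot stall the queue.
        for _ in 0..limit.max(1) {
            s.spawn(|| loop {
                // Take the next item; the lock is released before the job runs.
                let next = queue.lock().unwrap().pop_front();
                let Some(item) = next else { break };

                let now = active.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                job(item);
                active.fetch_sub(1, Ordering::SeqCst);
            });
        }
    });

    peak.load(Ordering::SeqCst)
}
```

Because only `limit` workers exist, the returned peak can never exceed the limit, regardless of how many items are queued.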

§ingestion_block_size: ByteSize

The maximum block size from a file to process at once.

The default value is 8mb.

Use the environment variable HF_XET_DATA_INGESTION_BLOCK_SIZE to set this value.
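
As an illustration of block-wise processing, the sketch below reads any `Read` source in blocks of at most `block_size` bytes. `for_each_block` is a hypothetical helper and the block size is passed as a plain `usize`; the crate's own ingestion loop and its `ByteSize` handling are not shown on this page.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads `reader` in blocks of at most `block_size`
/// bytes and passes each block to `process`. Returns the number of blocks.
pub fn for_each_block<R: Read>(
    mut reader: R,
    block_size: usize,
    mut process: impl FnMut(&[u8]),
) -> io::Result<usize> {
    assert!(block_size > 0, "block_size must be non-zero");
    let mut block = Vec::with_capacity(block_size);
    let mut count = 0;
    loop {
        block.clear();
        // `take` caps the read at one block; `read_to_end` fills it fully
        // unless the underlying reader reaches end of input first.
        let n = reader
            .by_ref()
            .take(block_size as u64)
            .read_to_end(&mut block)?;
        if n == 0 {
            return Ok(count);
        }
        process(&block);
        count += 1;
    }
}
```

For a 20-byte input and a block size of 8, `process` sees blocks of 8, 8, and 4 bytes.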

§progress_update_interval: Duration

How often to send updates on file progress, in milliseconds. Disables batching if set to 0.

The default value is 200ms.

Use the environment variable HF_XET_DATA_PROGRESS_UPDATE_INTERVAL to set this value.

§progress_update_speed_sampling_window: Duration

Half-life duration for the exponentially weighted moving average used to estimate progress completion speed. Older rate observations are exponentially decayed with this half-life.

The default value is 10sec.

Use the environment variable HF_XET_DATA_PROGRESS_UPDATE_SPEED_SAMPLING_WINDOW to set this value.

§progress_update_speed_min_observations: u32

Minimum number of speed observations before reporting a rate. Until this many updates have been recorded, the completion rate is reported as unknown (None). This avoids displaying noisy initial estimates.

The default value is 4.

Use the environment variable HF_XET_DATA_PROGRESS_UPDATE_SPEED_MIN_OBSERVATIONS to set this value.
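
The sampling window and the minimum observation count work together. The sketch below shows one way a half-life weighted moving average with a reporting threshold could look. `SpeedEstimator` is a hypothetical type and the update rule is an assumption based on the descriptions above, not the crate's actual estimator.

```rust
use std::time::Duration;

/// Hypothetical speed estimator combining a half-life EWMA with a minimum
/// observation count. This is a sketch, not the crate's implementation.
pub struct SpeedEstimator {
    half_life: Duration,
    min_observations: u32,
    observations: u32,
    rate: f64, // bytes per second
}

impl SpeedEstimator {
    pub fn new(half_life: Duration, min_observations: u32) -> Self {
        Self { half_life, min_observations, observations: 0, rate: 0.0 }
    }

    /// Records that `bytes` were transferred over `elapsed`.
    pub fn observe(&mut self, bytes: u64, elapsed: Duration) {
        let dt = elapsed.as_secs_f64();
        if dt <= 0.0 {
            return; // a zero-length interval carries no rate information
        }
        let sample = bytes as f64 / dt;
        if self.observations == 0 {
            self.rate = sample;
        } else {
            // The old estimate loses half its weight every `half_life`.
            let decay = 0.5f64.powf(dt / self.half_life.as_secs_f64());
            self.rate = decay * self.rate + (1.0 - decay) * sample;
        }
        self.observations += 1;
    }

    /// Returns `None` until enough observations have been recorded.
    pub fn rate(&self) -> Option<f64> {
        (self.observations >= self.min_observations).then_some(self.rate)
    }
}
```

With a 10-second half-life, an observation spanning exactly 10 seconds gives the new sample and the previous estimate equal weight.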

§session_xorb_metadata_flush_interval: Duration

How often new xorb data is flushed to disk during a long-running upload session.

The default value is 20sec.

Use the environment variable HF_XET_DATA_SESSION_XORB_METADATA_FLUSH_INTERVAL to set this value.

§session_xorb_metadata_flush_max_count: usize

Forces a flush of the xorb metadata once this many xorbs have been created, if that count is reached before the flush interval elapses.

The default value is 64.

Use the environment variable HF_XET_DATA_SESSION_XORB_METADATA_FLUSH_MAX_COUNT to set this value.
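
The flush interval and the maximum count can be read as a "whichever comes first" rule. The sketch below models that reading with a hypothetical `FlushPolicy` type driven by explicit elapsed times. It is an interpretation of the two settings above, not the crate's session code.

```rust
use std::time::Duration;

/// Hypothetical flush policy: flush when either the interval has elapsed
/// or enough new xorbs have accumulated, whichever comes first.
pub struct FlushPolicy {
    interval: Duration,
    max_count: usize,
    since_last_flush: Duration,
    pending_xorbs: usize,
}

impl FlushPolicy {
    pub fn new(interval: Duration, max_count: usize) -> Self {
        Self {
            interval,
            max_count,
            since_last_flush: Duration::ZERO,
            pending_xorbs: 0,
        }
    }

    /// Advances the clock by `elapsed` and records `new_xorbs`.
    /// Returns true when a flush is due, and resets the counters.
    pub fn tick(&mut self, elapsed: Duration, new_xorbs: usize) -> bool {
        self.since_last_flush += elapsed;
        self.pending_xorbs += new_xorbs;
        // Nothing is flushed while there is no pending metadata.
        let due = self.pending_xorbs > 0
            && (self.since_last_flush >= self.interval
                || self.pending_xorbs >= self.max_count);
        if due {
            self.since_last_flush = Duration::ZERO;
            self.pending_xorbs = 0;
        }
        due
    }
}
```

With the defaults of 20 seconds and 64 xorbs, a burst of 64 new xorbs triggers a flush immediately, while a single new xorb waits until 20 seconds have passed.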

§default_cas_endpoint: String

The default CAS endpoint.

The default value is “http://localhost:8080”.

Use the environment variable HF_XET_DATA_DEFAULT_CAS_ENDPOINT to set this value.

§aggregate_progress: bool

Whether to aggregate progress updates before sending them. When enabled, progress updates are batched and sent at regular intervals to reduce overhead.

The default value is true.

Use the environment variable HF_XET_DATA_AGGREGATE_PROGRESS to set this value.
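
A rough sketch of interval-based batching, assuming the interval behaves like progress_update_interval (a zero interval forwards every update). `ProgressBatcher` is a hypothetical type that takes elapsed time as an argument so the behavior is easy to follow; the crate's progress tracker is not shown on this page.

```rust
use std::time::Duration;

/// Hypothetical progress batcher: accumulates byte counts and releases
/// them as one update once the interval has elapsed.
pub struct ProgressBatcher {
    interval: Duration,
    since_last_send: Duration,
    pending_bytes: u64,
}

impl ProgressBatcher {
    pub fn new(interval: Duration) -> Self {
        Self { interval, since_last_send: Duration::ZERO, pending_bytes: 0 }
    }

    /// Records `bytes` of progress seen `elapsed` after the previous call.
    /// Returns the batched total when an update should be sent.
    pub fn record(&mut self, bytes: u64, elapsed: Duration) -> Option<u64> {
        self.pending_bytes += bytes;
        self.since_last_send += elapsed;
        // A zero interval disables batching: every update is forwarded.
        if self.interval.is_zero() || self.since_last_send >= self.interval {
            self.since_last_send = Duration::ZERO;
            return Some(std::mem::take(&mut self.pending_bytes));
        }
        None
    }
}
```

With a 200 ms interval, three updates of 100 bytes arriving over 200 ms are reported as a single update of 300 bytes.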

§default_prefix: String

Default prefix used for CAS and shard operations.

The default value is “default”.

Use the environment variable HF_XET_DATA_DEFAULT_PREFIX to set this value.

§staging_subdir: String

Subdirectory name for staging data within the endpoint cache directory.

The default value is “staging”.

Use the environment variable HF_XET_DATA_STAGING_SUBDIR to set this value.

Implementations§

impl ConfigValueGroup

pub fn new() -> Self

Create a new instance with default values only (no environment variable overrides).

pub fn apply_env_overrides(&mut self)

Apply environment variable overrides to this configuration group.

The group name is derived from the module path. For example, in module xet_config::groups::data, the env var for TEST_INT would be HF_XET_DATA_TEST_INT.
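
The naming rule can be sketched as a small function. `env_var_name` is a hypothetical helper, and taking the last segment of the module path as the group name is an assumption inferred from the example above.

```rust
/// Hypothetical helper mirroring the naming rule described above.
/// Assumption: the last segment of the module path is the group name.
pub fn env_var_name(module_path: &str, field: &str) -> String {
    let group = module_path.rsplit("::").next().unwrap_or(module_path);
    format!("HF_XET_{}_{}", group.to_uppercase(), field.to_uppercase())
}
```

Under that assumption, `env_var_name("xet_config::groups::data", "staging_subdir")` yields `HF_XET_DATA_STAGING_SUBDIR`, which matches the variable names listed for the fields on this page.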

pub fn field_names() -> &'static [&'static str]

Returns the list of field names in this configuration group.

pub fn get(&self, name: &str) -> Result<String, ConfigError>

Get a configuration field’s string representation by name.
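
To show how these four methods might fit together without depending on the crate, the sketch below uses a reduced stand-in. `DemoGroup`, its two fields, the `UnknownField` variant, and the handling of unparsable values are all assumptions made for illustration; only the method names and signatures are taken from this page.

```rust
use std::env;

/// Stand-in error type for this sketch; the real `ConfigError` variants
/// are not shown on this page.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    UnknownField(String),
}

/// Reduced stand-in with two of the documented fields.
#[derive(Debug, Clone)]
pub struct DemoGroup {
    pub max_concurrent_file_ingestion: usize,
    pub staging_subdir: String,
}

impl DemoGroup {
    /// Defaults only, no environment variable overrides.
    pub fn new() -> Self {
        Self {
            max_concurrent_file_ingestion: 8,
            staging_subdir: "staging".to_string(),
        }
    }

    /// Overrides fields from the environment when the variables are set.
    pub fn apply_env_overrides(&mut self) {
        if let Ok(v) = env::var("HF_XET_DATA_MAX_CONCURRENT_FILE_INGESTION") {
            // Assumption: unparsable values are ignored and the current
            // value is kept.
            if let Ok(n) = v.parse() {
                self.max_concurrent_file_ingestion = n;
            }
        }
        if let Ok(v) = env::var("HF_XET_DATA_STAGING_SUBDIR") {
            self.staging_subdir = v;
        }
    }

    pub fn field_names() -> &'static [&'static str] {
        &["max_concurrent_file_ingestion", "staging_subdir"]
    }

    /// Looks a field up by name and returns its string representation.
    pub fn get(&self, name: &str) -> Result<String, ConfigError> {
        match name {
            "max_concurrent_file_ingestion" => {
                Ok(self.max_concurrent_file_ingestion.to_string())
            }
            "staging_subdir" => Ok(self.staging_subdir.clone()),
            other => Err(ConfigError::UnknownField(other.to_string())),
        }
    }
}
```

A typical flow would be to call `new()`, then `apply_env_overrides()`, and then iterate over `field_names()` calling `get` on each name, for example to log the effective configuration.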

Trait Implementations§

impl AsRef<ConfigValueGroup> for ConfigValueGroup

fn as_ref(&self) -> &ConfigValueGroup

Converts this type into a shared reference of the (usually inferred) input type.

impl Clone for ConfigValueGroup

fn clone(&self) -> ConfigValueGroup

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for ConfigValueGroup

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for ConfigValueGroup

fn default() -> Self

Returns the “default value” for a type.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> DropFlavorWrapper<T> for T

type Flavor = MayDrop

The DropFlavor that wraps T into Self

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, W> HasTypeWitness<W> for T
where W: MakeTypeWitness<Arg = T>, T: ?Sized,

const WITNESS: W = W::MAKE

A constant of the type witness

impl<T> Identity for T
where T: ?Sized,

const TYPE_EQ: TypeEq<T, <T as Identity>::Type> = TypeEq::NEW

Proof that Self is the same type as Self::Type, provides methods for casting between Self and Self::Type.

type Type = T

The same type as Self, used to emulate type equality bounds (T == U) with associated type equality constraints (T: Identity<Type = U>).

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<E> ResultError for E
where E: Send + Debug + Sync,

impl<T> ResultType for T
where T: Send + Clone + Sync + Debug,