Type Alias MdbShardConfig 

pub type MdbShardConfig = ConfigValueGroup;

Aliased Type§

pub struct MdbShardConfig {
    pub target_size: u64,
    pub max_target_size: u64,
    pub cache_size_limit: ByteSize,
    pub chunk_index_table_max_size: usize,
    pub cache_subdir: String,
}

Fields§

§target_size: u64

The target shard size in bytes.

The default value is 67108864.

Use the environment variable HF_XET_SHARD_TARGET_SIZE to set this value.
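
As an illustration of the override behavior described above, the following is a minimal sketch of resolving `target_size` from the environment with a fallback to the documented default. The helper names `parse_u64_or` and `shard_target_size_from_env` are hypothetical and not part of this crate, and the sketch assumes the variable holds a plain decimal byte count; the crate's own parsing may accept other formats.

```rust
use std::env;

/// Documented default for `target_size` (67108864 bytes, i.e. 64 MiB).
const DEFAULT_SHARD_TARGET_SIZE: u64 = 67_108_864;

/// Hypothetical helper: parse a raw override, falling back to `default`
/// when the value is absent or is not a plain decimal integer.
fn parse_u64_or(raw: Option<&str>, default: u64) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(default)
}

/// Hypothetical helper: resolve `target_size` from the process environment.
fn shard_target_size_from_env() -> u64 {
    let raw = env::var("HF_XET_SHARD_TARGET_SIZE").ok();
    parse_u64_or(raw.as_deref(), DEFAULT_SHARD_TARGET_SIZE)
}
```

Keeping the parsing separate from the environment lookup makes the fallback logic easy to test without mutating process-wide state.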

§max_target_size: u64

The maximum shard size in bytes. Small shards are aggregated into larger ones as long as the result does not exceed this size.

The default value is 67108864.

Use the environment variable HF_XET_SHARD_MAX_TARGET_SIZE to set this value.
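
To make the aggregation rule concrete, here is a rough sketch of one way small shards could be grouped under a size cap. The function `group_shards` and its greedy, in-order strategy are assumptions made for illustration; the source does not describe the actual aggregation algorithm.

```rust
/// Hypothetical greedy grouping: walk shard sizes in order and start a new
/// group whenever adding the next shard would exceed `max_target_size`.
/// A shard that is already larger than the limit ends up in a group of its own.
fn group_shards(sizes: &[u64], max_target_size: u64) -> Vec<Vec<u64>> {
    let mut groups: Vec<Vec<u64>> = Vec::new();
    let mut current: Vec<u64> = Vec::new();
    let mut current_total: u64 = 0;

    for &size in sizes {
        // Close the current group if this shard would push it past the cap.
        if !current.is_empty() && current_total.saturating_add(size) > max_target_size {
            groups.push(std::mem::take(&mut current));
            current_total = 0;
        }
        current.push(size);
        current_total = current_total.saturating_add(size);
    }
    if !current.is_empty() {
        groups.push(current);
    }
    groups
}
```

With a cap of 100, the sizes `[30, 30, 30, 50, 120]` would be grouped as `[30, 30, 30]`, `[50]`, and `[120]` under this sketch.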

§cache_size_limit: ByteSize

The (soft) maximum size in bytes of the shard cache. Default is 16 GB.

As a rough estimate, a cache of size X allows deduplication against about 1000 * X of data, so the 16 GB default lets a repository of about 16 TB be deduplicated effectively.
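
The rule of thumb above can be written out as a small calculation. This is only a sketch of the arithmetic: the function name `estimated_dedup_coverage` is hypothetical, and the example assumes decimal units (1 GB = 10^9 bytes).

```rust
/// Rule-of-thumb factor from the text: bytes of data covered per byte of shard cache.
const DEDUP_COVERAGE_FACTOR: u64 = 1000;

/// Hypothetical helper: estimate how much data a shard cache of `cache_bytes`
/// can be deduplicated against. Saturates instead of overflowing.
fn estimated_dedup_coverage(cache_bytes: u64) -> u64 {
    cache_bytes.saturating_mul(DEDUP_COVERAGE_FACTOR)
}
```

For a 16 GB cache (16_000_000_000 bytes), the estimate comes out to 16_000_000_000_000 bytes, or 16 TB.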

Note that the cache is pruned to below this value only at the beginning of a session. During a single session, new shards may still be added, so the limit may be exceeded.
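
The "soft" nature of the limit can be sketched as follows. The function `prune_to_below` and the oldest-first eviction order are assumptions for illustration only; the source does not state which shards are removed first.

```rust
/// Hypothetical session-start pruning: drop entries from the front of
/// `shard_sizes` (assumed to be ordered oldest first) until the total size
/// is below `limit`. Returns the number of entries removed.
fn prune_to_below(shard_sizes: &mut Vec<u64>, limit: u64) -> usize {
    let mut total: u64 = shard_sizes.iter().sum();
    let mut removed = 0;
    while total >= limit && removed < shard_sizes.len() {
        total -= shard_sizes[removed];
        removed += 1;
    }
    shard_sizes.drain(..removed);
    removed
}
```

In this sketch nothing enforces the limit after the initial prune, so shards pushed onto the list later in the same session can take the total above `limit` again, which mirrors the behavior described above.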

The default value is 16gb.

Use the environment variable HF_XET_SHARD_CACHE_SIZE_LIMIT to set this value.

§chunk_index_table_max_size: usize

The maximum size of the in-memory chunk index table. Once the table reaches this size, no new chunks are loaded for deduplication.

The default value is 67108864.

Use the environment variable HF_XET_SHARD_CHUNK_INDEX_TABLE_MAX_SIZE to set this value.
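
The capping behavior of the chunk index table might look roughly like the sketch below. The `ChunkIndex` type, its key and value types, and the choice to count the limit in entries are all hypothetical; the source does not specify the table layout or the unit of the limit.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory chunk index that stops accepting new chunks
/// once it holds `max_size` entries.
struct ChunkIndex {
    /// Chunk hash -> shard id (both types are illustrative).
    entries: HashMap<u64, u32>,
    max_size: usize,
}

impl ChunkIndex {
    fn new(max_size: usize) -> Self {
        Self { entries: HashMap::new(), max_size }
    }

    /// Returns `true` if the chunk was recorded, or `false` if the table is
    /// full and the chunk was skipped. Existing keys can still be updated.
    fn try_insert(&mut self, chunk_hash: u64, shard_id: u32) -> bool {
        if self.entries.len() >= self.max_size && !self.entries.contains_key(&chunk_hash) {
            return false;
        }
        self.entries.insert(chunk_hash, shard_id);
        true
    }

    /// Look up the shard that holds a chunk, if the chunk was indexed.
    fn lookup(&self, chunk_hash: u64) -> Option<u32> {
        self.entries.get(&chunk_hash).copied()
    }
}
```

Chunks skipped because the table is full are simply not found by `lookup`, so in this sketch they would not be candidates for deduplication.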

§cache_subdir: String

Subdirectory name for shard cache within the endpoint cache directory.

The default value is “shard-cache”.

Use the environment variable HF_XET_SHARD_CACHE_SUBDIR to set this value.
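
As a small illustration of how the subdirectory name combines with the endpoint cache directory, consider the sketch below. The helper `shard_cache_dir` and the example paths are hypothetical; only the default name "shard-cache" comes from the documentation above.

```rust
use std::path::{Path, PathBuf};

/// Documented default for `cache_subdir`.
const DEFAULT_SHARD_CACHE_SUBDIR: &str = "shard-cache";

/// Hypothetical helper: build the shard cache path inside an endpoint cache
/// directory, using the default subdirectory name when none is given.
fn shard_cache_dir(endpoint_cache_dir: &Path, cache_subdir: Option<&str>) -> PathBuf {
    endpoint_cache_dir.join(cache_subdir.unwrap_or(DEFAULT_SHARD_CACHE_SUBDIR))
}
```

Note that `Path::join` replaces the base path when given an absolute path, so a relative subdirectory name is assumed here.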