pub type MdbShardConfig = ConfigValueGroup;
Aliased Type
pub struct MdbShardConfig {
pub target_size: u64,
pub max_target_size: u64,
pub cache_size_limit: ByteSize,
pub chunk_index_table_max_size: usize,
pub cache_subdir: String,
}
Fields
target_size: u64
The target shard size in bytes.
The default value is 67108864 (64 MiB).
Use the environment variable HF_XET_SHARD_TARGET_SIZE to set this value.
max_target_size: u64
The maximum shard size; small shards are aggregated into shards of at most this size.
The default value is 67108864.
Use the environment variable HF_XET_SHARD_MAX_TARGET_SIZE to set this value.
cache_size_limit: ByteSize
The (soft) maximum size in bytes of the shard cache. Default is 16 GB.
As a rough estimate, a cache of size X allows deduplication against about 1000 * X of data, so the default allows a 16 TB repo to be deduplicated effectively.
Note that the cache is pruned to below this value at the beginning of a session, but new shards added during a single session may push it over the limit.
The default value is 16gb.
Use the environment variable HF_XET_SHARD_CACHE_SIZE_LIMIT to set this value.
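The 1000x rule of thumb for cache_size_limit can be turned into a quick sizing estimate. The sketch below is illustrative only: the function name and the decimal GB/TB constants are assumptions made for this example and are not part of the crate.

```rust
/// Decimal units, assumed for this sketch (1 GB = 10^9 bytes).
const GB: u64 = 1_000_000_000;
const TB: u64 = 1_000 * GB;

/// Rough amount of data (in bytes) that a shard cache of `cache_bytes`
/// can deduplicate against, using the 1000x estimate from the field docs.
/// Hypothetical helper; this is an estimate, not a guarantee.
fn estimated_dedup_coverage(cache_bytes: u64) -> u64 {
    cache_bytes.saturating_mul(1000)
}

/// Inverse of the estimate: the cache size (in bytes) suggested for a
/// repository of `repo_bytes`, rounded up.
fn suggested_cache_size(repo_bytes: u64) -> u64 {
    repo_bytes.div_ceil(1000)
}
```

Under these assumptions, estimated_dedup_coverage(16 * GB) equals 16 * TB, which matches the default described above, and suggested_cache_size(4 * TB) equals 4 * GB.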
chunk_index_table_max_size: usize
The maximum size of the chunk index table held in memory. Once the table reaches this size, no new chunks are loaded for deduplication.
The default value is 67108864.
Use the environment variable HF_XET_SHARD_CHUNK_INDEX_TABLE_MAX_SIZE to set this value.
cache_subdir: String
Subdirectory name for shard cache within the endpoint cache directory.
The default value is “shard-cache”.
Use the environment variable HF_XET_SHARD_CACHE_SUBDIR to set this value.
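Since each field can be overridden through its environment variable, one way to apply overrides is to set them on a child process. The following is a minimal sketch using only the standard library: the helper name and the program argument are hypothetical, and the value formats ("33554432", "8gb") are assumed to mirror the default values shown above.

```rust
use std::process::Command;

/// Builds (but does not spawn) a command whose environment overrides
/// some shard settings via the documented HF_XET_SHARD_* variables.
/// `program` is a placeholder for whatever binary reads this configuration.
fn command_with_shard_overrides(program: &str) -> Command {
    let mut cmd = Command::new(program);
    // 32 MiB target shard size instead of the 64 MiB default.
    cmd.env("HF_XET_SHARD_TARGET_SIZE", "33554432")
        // Assumed to accept the same size format as the "16gb" default.
        .env("HF_XET_SHARD_CACHE_SIZE_LIMIT", "8gb")
        // Same as the default; shown only to illustrate the variable name.
        .env("HF_XET_SHARD_CACHE_SUBDIR", "shard-cache");
    cmd
}
```

Calling .status() or .spawn() on the returned Command would then run the program with these overrides, leaving the parent process environment untouched.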