
Function load_distributed 

pub fn load_distributed<T: Float>(
    dir: &Path,
    rank: usize,
    world_size: usize,
) -> Result<HashMap<String, Tensor<T>>, DistCheckpointError>

Load a distributed checkpoint for a specific rank.

Reads dir/metadata.json to discover the sharding layout the checkpoint was saved with. If the current world_size equals the world size recorded in that metadata, each rank loads its own shard file directly. Otherwise, the saved shards are automatically resharded via reshard.
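
The branch described above can be modelled with a minimal sketch. It is not the crate's implementation: the `LoadPlan` type, the `plan_load` helper, the `saved_world_size` parameter, and the `shard_{rank}.bin` file name are all hypothetical, introduced only to make the "same world size vs. reshard" decision concrete.

```rust
use std::path::{Path, PathBuf};

/// What a rank has to do to obtain its tensors.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum LoadPlan {
    /// World sizes match: read this rank's own shard file directly.
    OwnShard(PathBuf),
    /// World sizes differ: the saved shards have to be resharded.
    Reshard {
        saved_world_size: usize,
        world_size: usize,
    },
}

/// Decide how `rank` should load, given the world size recorded in the
/// checkpoint metadata. The shard file name pattern is an assumption,
/// not the crate's actual on-disk layout.
fn plan_load(dir: &Path, rank: usize, world_size: usize, saved_world_size: usize) -> LoadPlan {
    if saved_world_size == world_size {
        // Same layout as when the checkpoint was written: one file per rank.
        LoadPlan::OwnShard(dir.join(format!("shard_{}.bin", rank)))
    } else {
        // Layout changed: this rank's tensors must be rebuilt from the saved shards.
        LoadPlan::Reshard {
            saved_world_size,
            world_size,
        }
    }
}
```

Under these assumptions, `plan_load(Path::new("ckpt"), 1, 4, 4)` selects rank 1's own shard file, while `plan_load(Path::new("ckpt"), 1, 2, 4)` yields the reshard branch because the checkpoint was written by four ranks and is being read by two.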

§Errors

Returns a DistCheckpointError if the metadata file or a shard file is missing, if a tensor has an unexpected dtype, or if resharding fails.