Crate burn_collective

Structs§

CollectiveConfig
Parameter struct for configuring collective operations and reading back their settings. Used in most collective API calls. This config is per-node and is passed to reduce.
GlobalRegisterParams
Helper struct for the parameters of a multi-node register operation. Either all of the parameters are defined or none are. Passed to the global client to register at the global level and open the p2p tensor service.
NodeId
Unique identifier for any node in the global collective.
PeerId
A unique identifier for a peer in the context of collective operations. Each PeerId must be unique, even in multi-node contexts.
SharedAllReduceParams
Parameters for an all-reduce that should be the same on all devices.
SharedBroadcastParams
Parameters for a broadcast that should be the same on all devices.
SharedReduceParams
Parameters for a reduce that should be the same on all devices.
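
The uniqueness rule for PeerId, together with the requirement that every register call reports the same num_devices, can be pictured as simple bookkeeping. The following is a minimal sketch under stated assumptions, not the crate's implementation: `Registry`, `RegisterError`, and the use of a plain `u32` as the peer identifier are all hypothetical names chosen for illustration.

```rust
use std::collections::HashSet;

// Hypothetical stand-ins for illustration; none of these types come from burn_collective.
#[derive(Debug, PartialEq)]
enum RegisterError {
    DuplicatePeer(u32),
    DeviceCountMismatch { expected: usize, got: usize },
}

#[derive(Default)]
struct Registry {
    // Device count announced by the first peer to register.
    num_devices: Option<usize>,
    // Identifiers of the peers registered so far.
    peers: HashSet<u32>,
}

impl Registry {
    /// Accepts a peer only if its id is new and its device count agrees
    /// with the count given by every earlier caller.
    fn register(&mut self, peer: u32, num_devices: usize) -> Result<(), RegisterError> {
        let expected = *self.num_devices.get_or_insert(num_devices);
        if expected != num_devices {
            return Err(RegisterError::DeviceCountMismatch { expected, got: num_devices });
        }
        if !self.peers.insert(peer) {
            return Err(RegisterError::DuplicatePeer(peer));
        }
        Ok(())
    }

    /// True once the announced number of peers has registered.
    fn is_complete(&self) -> bool {
        self.num_devices == Some(self.peers.len())
    }
}
```

In this sketch a second registration with an already-used id, or with a different device count, is rejected instead of silently joining the session.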

Enums§

AllReduceStrategy
All-reduce can be implemented with different algorithms, all of which produce the same result.
BroadcastStrategy
Broadcast can be implemented with different algorithms, which all have the same result.
CollectiveError
Errors from collective operations.
ReduceOperation
A reduce can be performed in different ways.
ReduceStrategy
Reduce can be implemented with different algorithms, which all have the same result.
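
To make "different algorithms, same result" concrete, here is a small sketch that combines the same per-peer tensors in two orders: a centralized fold and a pairwise tree. It is illustrative only and assumes a non-empty input with equal-length tensors. The `Op` enum and both function names are hypothetical and are not the variants or functions of this crate. The example values are small integers, so both summation orders are exact in f32.

```rust
// Illustrative stand-ins only: none of these names come from burn_collective.
#[derive(Clone, Copy)]
enum Op {
    Sum,
    Mean,
}

/// Centralized approach: one peer folds every contribution in order.
fn reduce_centralized(inputs: &[Vec<f32>], op: Op) -> Vec<f32> {
    let mut acc = vec![0.0; inputs[0].len()];
    for tensor in inputs {
        for (a, x) in acc.iter_mut().zip(tensor) {
            *a += *x;
        }
    }
    finish(acc, inputs.len(), op)
}

/// Tree approach: neighbours are combined pairwise, level by level.
fn reduce_tree(inputs: &[Vec<f32>], op: Op) -> Vec<f32> {
    let mut level: Vec<Vec<f32>> = inputs.to_vec();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| match pair {
                [a, b] => a.iter().zip(b).map(|(x, y)| x + y).collect(),
                [a] => a.clone(), // odd peer out moves up unchanged
                _ => unreachable!("chunks(2) yields one or two items"),
            })
            .collect();
    }
    finish(level.remove(0), inputs.len(), op)
}

/// Applies the final step of the operation: nothing for Sum, divide by the peer count for Mean.
fn finish(mut acc: Vec<f32>, peers: usize, op: Op) -> Vec<f32> {
    if let Op::Mean = op {
        for a in acc.iter_mut() {
            *a /= peers as f32;
        }
    }
    acc
}
```

With three peers holding `[1, 2]`, `[3, 4]`, and `[5, 6]`, both functions return `[9, 12]` for a sum and `[3, 4]` for a mean.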

Functions§

all_reduce
Requests an all-reduce operation with the given parameters and returns the result. The parameters must match those passed by the other nodes.
broadcast
Broadcasts a tensor, or receives a broadcast tensor.
finish_collective
Closes the collective session, unregistering the device.
reduce
Reduces a tensor onto one device.
register
Registers a device. num_devices must be the same for every register call, and device_id must be unique.
reset_collective
Resets the local collective server. All registered callers and ongoing operations are forgotten.
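
The difference between reduce, broadcast, and all_reduce is easiest to see in terms of which peer holds what afterwards. The sketch below models each peer as an index into a slice and uses a plain element-wise sum. It is a hedged illustration of the semantics described above, not this crate's API: the signatures, the `root` parameter, and the helper `sum_all` are assumptions made for the example, and the inputs are assumed non-empty with equal-length tensors.

```rust
// Hypothetical model: each index in `inputs` stands for one registered peer.
// None of these signatures are burn_collective's.

/// Element-wise sum of every peer's tensor.
fn sum_all(inputs: &[Vec<f32>]) -> Vec<f32> {
    let mut acc = vec![0.0; inputs[0].len()];
    for tensor in inputs {
        for (a, x) in acc.iter_mut().zip(tensor) {
            *a += *x;
        }
    }
    acc
}

/// reduce: only `root` ends up holding the combined tensor.
fn reduce(inputs: &[Vec<f32>], root: usize) -> Vec<Option<Vec<f32>>> {
    let total = sum_all(inputs);
    (0..inputs.len())
        .map(|peer| if peer == root { Some(total.clone()) } else { None })
        .collect()
}

/// broadcast: one peer supplies a tensor (Some), the others pass None,
/// and every peer receives a copy of the supplied tensor.
fn broadcast(inputs: &[Option<Vec<f32>>]) -> Vec<Vec<f32>> {
    let sent = inputs
        .iter()
        .flatten()
        .next()
        .expect("one peer must supply a tensor")
        .clone();
    vec![sent; inputs.len()]
}

/// all_reduce: every peer ends up with the combined tensor,
/// modelled here as a reduce onto peer 0 followed by a broadcast.
fn all_reduce(inputs: &[Vec<f32>]) -> Vec<Vec<f32>> {
    broadcast(&reduce(inputs, 0))
}
```

For two peers holding `[1, 2]` and `[3, 4]`, `reduce` with root 1 leaves peer 0 with nothing and peer 1 with `[4, 6]`, while `all_reduce` gives `[4, 6]` to both.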