pub struct SubBackend { /* private fields */ }
A backend view that restricts communication to a subset of global ranks.
SubBackend wraps a parent Backend and a list of member global ranks
to form a logical sub-process-group. It translates every rank(),
world_size(), send(), and recv() call into the equivalent parent-
backend operation using the subgroup→global rank mapping.
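The subgroup→global rank mapping can be pictured with a small stand-alone sketch. `RankMap` and its methods are hypothetical names chosen for illustration, not part of this crate; the sketch only assumes that members are kept sorted and deduplicated, as described for `new` below.

```rust
/// Hypothetical helper (not part of the crate): translates between
/// subgroup-local ranks and global ranks.
pub struct RankMap {
    /// Member global ranks, sorted ascending with duplicates removed.
    members: Vec<usize>,
}

impl RankMap {
    pub fn new(mut members: Vec<usize>) -> Self {
        members.sort_unstable();
        members.dedup();
        Self { members }
    }

    /// Local rank -> global rank, or `None` if `local` is out of range.
    pub fn to_global(&self, local: usize) -> Option<usize> {
        self.members.get(local).copied()
    }

    /// Global rank -> local rank, or `None` if `global` is not a member.
    pub fn to_local(&self, global: usize) -> Option<usize> {
        self.members.binary_search(&global).ok()
    }

    /// Size of the subgroup as seen by its members.
    pub fn world_size(&self) -> usize {
        self.members.len()
    }
}
```

Under this sketch, a send to local rank `i` would be forwarded to the parent with `to_global(i)` as the destination. How SubBackend performs the translation internally is not shown in this page.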
Because SubBackend implements the Backend trait, the existing
allreduce,
all_gather, and
reduce_scatter collective functions
work on a subgroup without any changes: they only see the SubBackend
through the trait, so “rank 0” in the collective means “the first member
of the subgroup” and world_size means the subgroup size.
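To illustrate why trait-based collectives need no changes, here is a minimal sketch using a stand-in trait. `MiniBackend`, `Global`, `SubView`, and `ring_neighbors` are all hypothetical; the real Backend trait has more methods, and the real collectives are not reproduced here.

```rust
/// Stand-in for the real `Backend` trait, reduced to the two
/// methods this sketch needs. Hypothetical, for illustration only.
pub trait MiniBackend {
    fn rank(&self) -> usize;
    fn world_size(&self) -> usize;
}

/// A "parent" view that uses global numbering.
pub struct Global {
    pub rank: usize,
    pub world_size: usize,
}

impl MiniBackend for Global {
    fn rank(&self) -> usize { self.rank }
    fn world_size(&self) -> usize { self.world_size }
}

/// A subgroup view that reports its local rank and the subgroup size.
pub struct SubView {
    pub local_rank: usize,
    pub members: Vec<usize>,
}

impl MiniBackend for SubView {
    fn rank(&self) -> usize { self.local_rank }
    fn world_size(&self) -> usize { self.members.len() }
}

/// Ring neighbours `(send_to, recv_from)` as a ring-style collective
/// might compute them. Assumes `world_size() >= 1`.
pub fn ring_neighbors(b: &dyn MiniBackend) -> (usize, usize) {
    let n = b.world_size();
    let r = b.rank();
    ((r + 1) % n, (r + n - 1) % n)
}
```

The same `ring_neighbors` function yields global neighbours for `Global` and subgroup-local neighbours for `SubView`, because it only ever sees the trait.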
Used by FSDP’s HybridShard
strategy to form an intra-node (sharding) subgroup and an inter-node
(replication) subgroup.
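As a rough sketch of how the two member lists could be derived, the helper below assumes a node-major layout in which each node owns a contiguous block of `ranks_per_node` global ranks. The function name, its parameters, and the layout are assumptions for illustration; the page does not state how FSDP actually chooses the members.

```rust
/// Hypothetical helper: returns `(intra_node, inter_node)` member lists
/// for `rank`, assuming ranks are laid out contiguously per node.
pub fn hybrid_shard_groups(
    rank: usize,
    world_size: usize,
    ranks_per_node: usize,
) -> (Vec<usize>, Vec<usize>) {
    assert!(ranks_per_node > 0 && world_size % ranks_per_node == 0);
    assert!(rank < world_size);

    let node = rank / ranks_per_node; // which node this rank lives on
    let slot = rank % ranks_per_node; // position within that node

    // Sharding subgroup: every rank on the same node.
    let intra: Vec<usize> =
        (node * ranks_per_node..(node + 1) * ranks_per_node).collect();

    // Replication subgroup: the rank in the same slot on every node.
    let inter: Vec<usize> = (slot..world_size).step_by(ranks_per_node).collect();

    (intra, inter)
}
```

Each list could then be passed as `members` to `SubBackend::new` together with a clone of the parent `Arc<dyn Backend>`.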
CL-327.
Implementations
impl SubBackend
pub fn new(parent: Arc<dyn Backend>, members: Vec<usize>) -> FerrotorchResult<Self>
Create a subgroup view from a parent backend and a list of member global ranks.
The caller’s rank (read from parent.rank()) must be in members.
members is sorted and deduplicated before being stored.
§Errors
DistributedError::InvalidRank if the parent’s rank is not in members, or if any member is ≥ the parent’s world_size.
DistributedError::InvalidWorldSize if members is empty.
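The checks described above can be sketched as a free function. `SubgroupError` and `validate_members` are hypothetical stand-ins for the crate's own error type and constructor logic, and the order in which the checks run is an assumption.

```rust
/// Hypothetical error type mirroring the two documented failure cases.
#[derive(Debug, PartialEq, Eq)]
pub enum SubgroupError {
    InvalidRank,
    InvalidWorldSize,
}

/// Normalises `members` and returns them with the caller's local rank.
pub fn validate_members(
    parent_rank: usize,
    parent_world_size: usize,
    mut members: Vec<usize>,
) -> Result<(Vec<usize>, usize), SubgroupError> {
    // An empty subgroup has no valid world size.
    if members.is_empty() {
        return Err(SubgroupError::InvalidWorldSize);
    }

    // Sort and deduplicate so the index of a rank is its local rank.
    members.sort_unstable();
    members.dedup();

    // Every member must be a valid rank in the parent group.
    if members.iter().any(|&m| m >= parent_world_size) {
        return Err(SubgroupError::InvalidRank);
    }

    // The caller itself must belong to the subgroup.
    let local_rank = members
        .binary_search(&parent_rank)
        .map_err(|_| SubgroupError::InvalidRank)?;

    Ok((members, local_rank))
}
```

Sorting before storing means every process that passes the same set of members, in any order, agrees on the local numbering.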
pub fn members(&self) -> &[usize]
Return the global ranks that make up this subgroup, in ascending order. The index of this rank’s entry is its local rank.
Trait Implementations
impl Backend for SubBackend
fn world_size(&self) -> usize
fn recv(&self, dst: &mut [u8], src_rank: usize) -> FerrotorchResult<()>
Receive bytes into dst from src_rank. The caller must allocate dst with the correct length before calling.
fn recv_timeout(&self, dst: &mut [u8], src_rank: usize, timeout: Duration) -> FerrotorchResult<()>
fn barrier(&self) -> FerrotorchResult<()>
Auto Trait Implementations
impl Freeze for SubBackend
impl !RefUnwindSafe for SubBackend
impl Send for SubBackend
impl Sync for SubBackend
impl Unpin for SubBackend
impl UnsafeUnpin for SubBackend
impl !UnwindSafe for SubBackend
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> DistributionExt for T where T: ?Sized
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.