Error types for distributed operations.
§REQ status (per .design/ferrotorch-distributed/error.md)
Full evidence rows (implementation, non-test production consumer, and upstream citations) live in the design doc; this synopsis gives one summary line per REQ.
| REQ | Status | Evidence |
|---|---|---|
| REQ-1 (`DistributedError` enum) | SHIPPED | `pub enum DistributedError` in `error.rs` with 11 `#[non_exhaustive]` variants; consumers `use crate::error::DistributedError;` in `backend.rs`, `collective.rs`, `gloo_backend.rs`. |
| REQ-2 (diagnostic fields per variant) | SHIPPED | every variant carries named fields rendered in `#[error("...")]` strings; verified by `backend.rs` tests (`test_invalid_world_size`, `test_send_to_invalid_rank`). |
| REQ-3 (`From` conversion) | SHIPPED | `impl From<DistributedError> for FerrotorchError` at the bottom of `error.rs`; consumers call `.into()` at every fallible site in `backend.rs` and `collective.rs`. |
| REQ-4 (`BackendUnavailable` variant) | SHIPPED | `BackendUnavailable { backend: &'static str }` variant in `error.rs`; consumers in `gloo_backend.rs`, `mpi_backend.rs`, `ucc_backend.rs` (feature-off construction paths). |
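
The pattern summarized in the table (named diagnostic fields, a `From` conversion into the crate-wide error, and `.into()` at fallible sites) can be sketched roughly as follows. This is a minimal illustration, not the code in `error.rs`: only `BackendUnavailable { backend: &'static str }` is taken from the table, while `InvalidWorldSize`, `InvalidRank`, their field names, the message strings, the stand-in `FerrotorchError` with its `Distributed` variant, and `check_rank` are all hypothetical. `Display` is written by hand here so the sketch needs only `std`; the real enum renders its messages through `#[error("...")]` attributes.

```rust
use std::fmt;

/// Stand-in for the distributed error enum; only three variants are shown.
#[derive(Debug, PartialEq)]
#[non_exhaustive]
pub enum DistributedError {
    /// Hypothetical variant; the name and field are assumptions.
    InvalidWorldSize { world_size: usize },
    /// Hypothetical variant; the name and fields are assumptions.
    InvalidRank { rank: usize, world_size: usize },
    /// Variant shape taken from REQ-4.
    BackendUnavailable { backend: &'static str },
}

// Hand-written Display so every named field appears in the message.
impl fmt::Display for DistributedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::InvalidWorldSize { world_size } => {
                write!(f, "invalid world size: {world_size}")
            }
            Self::InvalidRank { rank, world_size } => {
                write!(f, "rank {rank} is out of range for world size {world_size}")
            }
            Self::BackendUnavailable { backend } => {
                write!(f, "backend `{backend}` is unavailable")
            }
        }
    }
}

impl std::error::Error for DistributedError {}

/// Hypothetical stand-in for the crate-wide error type.
#[derive(Debug, PartialEq)]
pub enum FerrotorchError {
    Distributed(DistributedError),
}

// The conversion that lets call sites use `.into()` or `?`.
impl From<DistributedError> for FerrotorchError {
    fn from(err: DistributedError) -> Self {
        FerrotorchError::Distributed(err)
    }
}

/// Hypothetical fallible site: validates a rank against the world size.
pub fn check_rank(rank: usize, world_size: usize) -> Result<(), FerrotorchError> {
    if world_size == 0 {
        return Err(DistributedError::InvalidWorldSize { world_size }.into());
    }
    if rank >= world_size {
        return Err(DistributedError::InvalidRank { rank, world_size }.into());
    }
    Ok(())
}
```

Because the conversion lives in a `From` impl, a function returning `Result<_, FerrotorchError>` can also propagate a `DistributedError` with `?` instead of an explicit `.into()`.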
Enums§
- DistributedError - Errors specific to the distributed training subsystem.