Module debugging

Debugging utilities for distributed training systems

This module provides debugging tools for distributed training: operation tracing, state inspection, diagnostic checks, and automated troubleshooting.

Structs§

ActiveOperation
Active operation information
CommunicationState
Communication state information
DebugConfig
Configuration for debugging utilities
DebugEvent
Debug event for tracking system operations
DiagnosticResult
Diagnostic check result
DistributedDebugger
Comprehensive debugging system for distributed training
ProcessGroupState
Process group state information
ResourceState
Resource state information
SystemStateSnapshot
System state snapshot for debugging

Enums§

LogLevel
Logging levels for debugging

Functions§

get_global_debugger
Get the global debugger instance
init_global_debugger
Initialize the global debugger with custom configuration
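
This page lists `init_global_debugger` and `get_global_debugger` without their signatures. The sketch below shows one common way such a pair can be built, using `std::sync::OnceLock`. Everything in it is an assumption made for illustration: the fields of `DebugConfig`, the variants of `LogLevel`, the `record` and `event_count` methods, and both function signatures are hypothetical stand-ins, not the module's actual definitions.

```rust
use std::sync::{Mutex, OnceLock};

// Hypothetical stand-in: the real `LogLevel` variants are not shown on this page.
// Variants are ordered from least to most verbose so they can be compared.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum LogLevel {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

// Hypothetical stand-in: the real `DebugConfig` fields are not shown on this page.
#[derive(Debug, Clone)]
pub struct DebugConfig {
    /// Most verbose level that is still recorded.
    pub max_level: LogLevel,
    /// Upper bound on retained events; the oldest is dropped first.
    pub max_events: usize,
}

impl Default for DebugConfig {
    fn default() -> Self {
        Self { max_level: LogLevel::Info, max_events: 1024 }
    }
}

// Hypothetical stand-in for the module's `DistributedDebugger`.
pub struct DistributedDebugger {
    config: DebugConfig,
    events: Mutex<Vec<String>>,
}

impl DistributedDebugger {
    pub fn new(config: DebugConfig) -> Self {
        Self { config, events: Mutex::new(Vec::new()) }
    }

    /// Records a message and returns `true`, or returns `false` if the
    /// message is more verbose than `max_level` or no events are retained.
    pub fn record(&self, level: LogLevel, message: &str) -> bool {
        if level > self.config.max_level || self.config.max_events == 0 {
            return false;
        }
        let mut events = self.events.lock().unwrap();
        if events.len() >= self.config.max_events {
            events.remove(0); // drop the oldest event to stay within the bound
        }
        events.push(format!("[{:?}] {}", level, message));
        true
    }

    pub fn event_count(&self) -> usize {
        self.events.lock().unwrap().len()
    }
}

// Process-wide instance, set at most once.
static GLOBAL_DEBUGGER: OnceLock<DistributedDebugger> = OnceLock::new();

/// Installs a debugger with a custom configuration.
/// Returns `false` if a global debugger was already installed.
pub fn init_global_debugger(config: DebugConfig) -> bool {
    GLOBAL_DEBUGGER.set(DistributedDebugger::new(config)).is_ok()
}

/// Returns the global debugger, creating one with the default
/// configuration if none has been installed yet.
pub fn get_global_debugger() -> &'static DistributedDebugger {
    GLOBAL_DEBUGGER.get_or_init(|| DistributedDebugger::new(DebugConfig::default()))
}
```

Under these assumptions, a process would call `init_global_debugger(DebugConfig { max_level: LogLevel::Debug, max_events: 256 })` once at startup and then use `get_global_debugger().record(LogLevel::Warn, "allreduce slow")` from any thread. `OnceLock` makes a second initialization a detectable no-op rather than a silent overwrite, which is why `init_global_debugger` returns a `bool` in this sketch.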