Workload definitions and modeling.
Structs
- AIInferenceWorkload - AI inference workload implementation
- AITrainingWorkload - AI training workload (placeholder for future implementation)
- CachingConfig - Caching configuration
- DynamicBatchingConfig - Dynamic batching configuration
- InferenceConfig - Inference configuration
- LatencyProfile - Latency characteristics
- ModelParameters - Model parameters for AI workloads
- PerformanceCharacteristics - Performance characteristics of a workload
- ResourcePatterns - Resource usage patterns
- ScalabilityProfile - Scalability characteristics
- ThroughputProfile - Throughput characteristics
- TrainingConfig - Training configuration (basic structure)
- WorkloadMetadata - Workload metadata
Enums
- CacheEvictionPolicy - Cache eviction policies
- LatencySensitivity - Latency sensitivity levels
- QuantizationLevel - Quantization levels for models
- ScalingCapability - Scaling capability levels
- ThroughputConsistency - Throughput consistency requirements
- UsagePattern - Usage pattern types
- WorkloadType - Types of workloads
Traits
- Workload - Trait for workload implementations
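
This page lists the Workload trait without its methods, so the following is only a rough sketch of how a trait of this kind might be implemented. Everything in it is assumed for illustration: the trait methods (name, workload_type, latency_sensitivity), the enum variants, the EchoInference type, and the describe helper are hypothetical stand-ins and are not taken from the crate's actual API.

```rust
// Hypothetical stand-ins: the real crate's enum variants and trait methods
// are not shown on this page, so all definitions below are assumptions.

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WorkloadType {
    AIInference,
    AITraining,
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LatencySensitivity {
    Low,
    High,
}

/// Assumed shape of a workload trait: a name plus two classification queries.
trait Workload {
    fn name(&self) -> &str;
    fn workload_type(&self) -> WorkloadType;
    fn latency_sensitivity(&self) -> LatencySensitivity;
}

/// Hypothetical implementor used only for this example.
struct EchoInference {
    name: String,
}

impl Workload for EchoInference {
    fn name(&self) -> &str {
        &self.name
    }

    fn workload_type(&self) -> WorkloadType {
        WorkloadType::AIInference
    }

    fn latency_sensitivity(&self) -> LatencySensitivity {
        LatencySensitivity::High
    }
}

/// Formats a one-line summary of any workload through the trait object.
fn describe(w: &dyn Workload) -> String {
    format!(
        "{} ({:?}, latency sensitivity: {:?})",
        w.name(),
        w.workload_type(),
        w.latency_sensitivity()
    )
}
```

Taking `&dyn Workload` in describe lets inference and training workloads be handled through one interface, which is the usual reason to model workloads behind a trait. Consult the trait's own documentation page for the methods it actually requires.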