Module workloads

Workload definitions and modeling.

Structs

AIInferenceWorkload
AI inference workload implementation
AITrainingWorkload
AI training workload (placeholder for future implementation)
CachingConfig
Caching configuration
DynamicBatchingConfig
Dynamic batching configuration
InferenceConfig
Inference configuration
LatencyProfile
Latency characteristics
ModelParameters
Model parameters for AI workloads
PerformanceCharacteristics
Performance characteristics of a workload
ResourcePatterns
Resource usage patterns
ScalabilityProfile
Scalability characteristics
ThroughputProfile
Throughput characteristics
TrainingConfig
Training configuration (basic structure)
WorkloadMetadata
Workload metadata

Enums

CacheEvictionPolicy
Cache eviction policies
LatencySensitivity
Latency sensitivity levels
QuantizationLevel
Quantization levels for models
ScalingCapability
Scaling capability levels
ThroughputConsistency
Throughput consistency requirements
UsagePattern
Usage pattern types
WorkloadType
Types of workloads

Traits

Workload
Trait for workload implementations
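
The index above lists only item names and one-line summaries, so the following is a minimal sketch of how a trait-based workload model of this shape could fit together, not the crate's actual API. Only the names `Workload`, `WorkloadType`, `LatencySensitivity`, and `AIInferenceWorkload` come from the listing; every enum variant, struct field, trait method, and the `describe` helper are assumptions made for illustration.

```rust
// Hypothetical sketch: the variants, fields, and methods below are assumed.
// Only the item names themselves appear in the module index.

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum WorkloadType {
    Inference, // assumed variant
    Training,  // assumed variant
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum LatencySensitivity {
    Low,  // assumed variant
    High, // assumed variant
}

/// Assumed shape of the trait: a workload reports its name, kind, and
/// how sensitive it is to latency.
trait Workload {
    fn name(&self) -> &str;
    fn workload_type(&self) -> WorkloadType;
    fn latency_sensitivity(&self) -> LatencySensitivity;
}

/// Assumed minimal layout; the real struct likely carries more state.
struct AIInferenceWorkload {
    name: String,
}

impl Workload for AIInferenceWorkload {
    fn name(&self) -> &str {
        &self.name
    }

    fn workload_type(&self) -> WorkloadType {
        WorkloadType::Inference
    }

    fn latency_sensitivity(&self) -> LatencySensitivity {
        // Assumption: inference serving is treated as latency-sensitive.
        LatencySensitivity::High
    }
}

/// Hypothetical helper that works with any workload through the trait object.
fn describe(w: &dyn Workload) -> String {
    format!(
        "{} ({:?}, latency sensitivity: {:?})",
        w.name(),
        w.workload_type(),
        w.latency_sensitivity()
    )
}

fn main() {
    let w = AIInferenceWorkload {
        name: "chat-model".to_string(),
    };
    println!("{}", describe(&w));
}
```

Taking `&dyn Workload` in the helper is one possible design choice: it lets inference and training workloads be handled uniformly by a scheduler or simulator without that code knowing the concrete type.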