axonml-profile 0.2.2

Profiling tools for the Axonml ML framework
Documentation

axonml-profile

Overview

axonml-profile provides comprehensive profiling capabilities for neural network training and inference. It includes memory tracking, compute profiling, timeline recording, and automatic bottleneck detection to help identify and resolve performance issues.

Features

  • Memory Profiler: Track allocations, deallocations, peak usage, and detect memory leaks
  • Compute Profiler: Measure operation times, FLOPS, throughput, and bandwidth
  • Timeline Profiler: Record timestamped events for visualization and analysis
  • Bottleneck Detection: Automatically identify slow operations, memory hotspots, and throughput issues
  • Report Generation: Export reports in Text, JSON, Markdown, and HTML formats
  • RAII Profiling: Scope-based profiling with automatic start/stop via guards
  • Global Profiler: Singleton profiler instance for convenient access throughout codebase
  • Thread-Safe: All profilers are thread-safe with parking_lot locks

Modules

Module Description
memory Memory profiler for tracking allocations, peak usage, and leak detection
compute Compute profiler for measuring operation times, FLOPS, and bandwidth
timeline Timeline profiler for recording events with timestamps and metadata
bottleneck Bottleneck analyzer for detecting performance issues with severity ratings
report Report generation in multiple formats (Text, JSON, Markdown, HTML)
error Error types and Result alias for profiling operations

Usage

Add this to your Cargo.toml:

[dependencies]
axonml-profile = "0.1.0"

Basic Profiling

use axonml_profile::Profiler;

// Create a profiler
let profiler = Profiler::new();

// Profile an operation
profiler.start("forward_pass");
let output = model.forward(&input);
profiler.stop("forward_pass");

// Check timing
let total = profiler.total_time("forward_pass");
let avg = profiler.avg_time("forward_pass");
println!("Total: {:?}, Average: {:?}", total, avg);

Memory Tracking

use axonml_profile::Profiler;

let profiler = Profiler::new();

// Track allocations
profiler.record_alloc("weights", 4 * 1024 * 1024);  // 4 MB
profiler.record_alloc("activations", 2 * 1024 * 1024);  // 2 MB

// Check memory usage
println!("Current: {} bytes", profiler.current_memory());
println!("Peak: {} bytes", profiler.peak_memory());

// Record deallocation
profiler.record_free("activations", 2 * 1024 * 1024);

RAII Scope Profiling

use axonml_profile::{Profiler, ProfileGuard, profile_scope};

let profiler = Profiler::new();

// Using ProfileGuard directly
{
    let _guard = ProfileGuard::new(&profiler, "expensive_operation");
    // Operation automatically timed from guard creation to drop
    perform_computation();
}

// Or using the macro
{
    profile_scope!(&profiler, "another_operation");
    perform_another_computation();
}

Bottleneck Analysis

use axonml_profile::Profiler;

let profiler = Profiler::new();

// Profile various operations
profiler.start("matmul");
// ... matrix multiplication
profiler.stop("matmul");

profiler.start("relu");
// ... activation
profiler.stop("relu");

// Analyze for bottlenecks
let bottlenecks = profiler.analyze_bottlenecks();
for b in &bottlenecks {
    println!("[{:?}] {}: {}", b.severity, b.name, b.description);
    println!("  Suggestion: {}", b.suggestion);
}

Report Generation

use axonml_profile::{Profiler, ReportFormat};
use std::path::Path;

let profiler = Profiler::new();
// ... profiling operations ...

// Generate and print summary
let report = profiler.summary();
println!("{}", report);

// Export to file
report.export(Path::new("profile.html"), ReportFormat::Html)?;
report.export(Path::new("profile.json"), ReportFormat::Json)?;
report.export(Path::new("profile.md"), ReportFormat::Markdown)?;

Global Profiler

use axonml_profile::{global_profiler, start, stop, record_alloc};

// Use global profiler functions
start("global_operation");
// ... operation ...
stop("global_operation");

record_alloc("global_tensor", 1024);

// Access the global profiler instance
let profiler = global_profiler();
profiler.print_summary();

Compute Statistics

use axonml_profile::ComputeProfiler;

let mut profiler = ComputeProfiler::new();

// Profile with FLOPS tracking
profiler.start_with_flops("matmul", 2.0 * 1024.0 * 1024.0 * 1024.0);  // 2 GFLOPS
// ... operation ...
profiler.stop("matmul");

// Get top operations
for op in profiler.top_by_time(5) {
    println!("{}: {} calls, {:?} total", op.name, op.call_count, op.total_time());
}

Bottleneck Types

Type Description Detection Threshold
SlowOperation Operation taking disproportionate time >20% of total time
HighCallCount Operation called too frequently >10,000 calls
MemoryHotspot Large memory allocation >30% of peak memory
MemoryLeak Memory not freed >5% of allocations
LowThroughput Low computational throughput <1 GFLOPS

Report Formats

Format Description Use Case
Text Plain text with ASCII tables Console output
JSON Structured JSON Programmatic analysis
Markdown GitHub-flavored Markdown Documentation
HTML Styled HTML page Browser viewing

Tests

Run the test suite:

cargo test -p axonml-profile

License

Licensed under either of:

  • MIT License
  • Apache License, Version 2.0

at your option.