# Using py-spy Library for Python Profiling
## Overview
The quality-agent now uses the **py-spy Rust crate** (v0.4.1) for Python runtime profiling, providing native integration without requiring external CLI tools.
## What is py-spy?
`py-spy` is a sampling profiler for Python programs that can attach to running Python processes and collect stack traces without requiring code instrumentation. The Rust crate provides programmatic access to py-spy's profiling capabilities.
## Implementation
### Direct Library Integration
Instead of spawning `py-spy` as a command-line tool, we now use the crate directly:
```rust
use std::process::Command;

use py_spy::{Config, PythonSpy};

// Start Python process
let mut child = Command::new("python3")
    .arg(&script_path)
    .spawn()?;
let pid = child.id() as py_spy::Pid;

// Create py-spy config
let config = Config::default();

// Attach to running process
let mut spy = PythonSpy::new(pid, &config)?;

// Collect stack traces
let traces = spy.get_stack_traces()?;
```
### Benefits
1. **No External Dependencies**: No need to install py-spy CLI via pip
2. **Native Integration**: py-spy runs in-process as a Rust library call, with no CLI output to parse
3. **Better Performance**: No process spawning overhead for py-spy
4. **More Control**: Direct access to stack traces and profiling data
5. **Type Safety**: Compile-time checks for profiling logic
## Permissions
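Attaching to another process requires permission to read its memory. macOS always requires root. On Linux the requirement depends on system security settings (for example `ptrace_scope`), so root may not be needed when the target is a child of the profiler. The options are: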
### Option 1: Run with sudo (Recommended for testing)
```bash
sudo cargo run -- python examples/server_simulation.py --runtime 3
```
### Option 2: Grant permissions (macOS)
```bash
# Add your user to the developer group
sudo dscl . -append /Groups/_developer GroupMembership $USER
# Or temporarily disable SIP (System Integrity Protection) - NOT RECOMMENDED
# Boot into recovery mode, open terminal, and run: csrutil disable
```
### Option 3: Use in production with proper service permissions
- Run as a system service with the capabilities needed to attach (on Linux, `CAP_SYS_PTRACE`)
- Use container environments with a matching security context (for example, Docker's `--cap-add SYS_PTRACE`)
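On Linux, one common source of the permission requirement is the Yama `ptrace_scope` setting. The following is a minimal sketch, assuming a Linux host with Yama enabled, of how a tool could check that setting before trying to attach. The names `PtraceScope`, `parse_ptrace_scope`, `read_ptrace_scope`, and `can_attach_to_child_without_root` are hypothetical and are not part of quality-agent or py-spy.

```rust
use std::fs;

/// Restriction levels of the Linux Yama `kernel.yama.ptrace_scope` setting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PtraceScope {
    /// 0: classic permissions (same-user processes may attach).
    Classic,
    /// 1: a process may only attach to its own descendants.
    Restricted,
    /// 2: only processes with CAP_SYS_PTRACE may attach.
    AdminOnly,
    /// 3: attaching is disabled entirely.
    NoAttach,
}

/// Parse the contents of the ptrace_scope file (a single digit plus newline).
pub fn parse_ptrace_scope(raw: &str) -> Option<PtraceScope> {
    match raw.trim() {
        "0" => Some(PtraceScope::Classic),
        "1" => Some(PtraceScope::Restricted),
        "2" => Some(PtraceScope::AdminOnly),
        "3" => Some(PtraceScope::NoAttach),
        _ => None,
    }
}

/// Read the current setting; returns None if the file is missing
/// (for example on macOS or on kernels without Yama).
pub fn read_ptrace_scope() -> Option<PtraceScope> {
    let raw = fs::read_to_string("/proc/sys/kernel/yama/ptrace_scope").ok()?;
    parse_ptrace_scope(&raw)
}

/// Whether an unprivileged profiler can expect to attach to a child
/// process that it spawned itself under the given setting.
pub fn can_attach_to_child_without_root(scope: PtraceScope) -> bool {
    matches!(scope, PtraceScope::Classic | PtraceScope::Restricted)
}
```

A check like this could be used to print a clearer hint than a bare "Permission denied" when attachment is likely to fail.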
## How It Works
1. **Process Launch**: Start the target Python script as a subprocess
2. **Attachment Wait**: Wait 500ms for Python runtime initialization
3. **Profiler Attachment**: Use py-spy crate to attach to the running process
4. **Sample Collection**: Collect stack traces every 10ms during the profiling duration
5. **Data Aggregation**: Count how many samples each function appears in and identify hot functions
6. **Cleanup**: Wait for the Python process to complete
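The sampling and aggregation steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the quality-agent implementation: `collect_samples` is a hypothetical helper, the `sample` closure stands in for a call such as `spy.get_stack_traces()` mapped to function keys, and the total is assumed to count every function key recorded.

```rust
use std::collections::HashMap;
use std::thread;
use std::time::{Duration, Instant};

/// Call `sample` once per `interval` until `duration` elapses.
///
/// `sample` is a hypothetical stand-in for the profiler: each call returns
/// the function keys seen in one sampling tick, or an error (for example
/// when the target process has exited). On error the loop stops and the
/// counts gathered so far are kept.
///
/// Returns (per-function sample counts, total keys recorded).
pub fn collect_samples<F>(
    duration: Duration,
    interval: Duration,
    mut sample: F,
) -> (HashMap<String, u64>, u64)
where
    F: FnMut() -> Result<Vec<String>, String>,
{
    let mut counts: HashMap<String, u64> = HashMap::new();
    let mut total: u64 = 0;
    let deadline = Instant::now() + duration;

    while Instant::now() < deadline {
        match sample() {
            Ok(functions) => {
                for name in functions {
                    *counts.entry(name).or_insert(0) += 1;
                    total += 1;
                }
            }
            // Target is gone or unreadable: keep the partial results.
            Err(_) => break,
        }
        thread::sleep(interval);
    }

    (counts, total)
}
```

Stopping on the first error rather than propagating it means a script that exits early still yields a partial profile.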
## Sample Output
```bash
$ sudo cargo run -- python examples/server_simulation.py --runtime 2
=== Profiling Results for Python ===
File Size: 2059 bytes
Lines of Code: 89
Functions: 6
Classes: 0
Imports: 3
Complexity Score: 22
Details:
- Detected 6 function definitions
- Detected 0 class definitions
- Detected 3 import statements
=== Runtime Profile (py-spy) ===
Total samples collected: 1847
Unique functions executed: 23
Top 10 Hot Functions:
1. process_data:server_simulation.py - 456 samples (24.69%)
2. compute_intensive_work:server_simulation.py - 398 samples (21.55%)
3. fibonacci:server_simulation.py - 287 samples (15.54%)
4. find_primes:server_simulation.py - 198 samples (10.72%)
5. is_prime:server_simulation.py - 156 samples (8.45%)
...
```
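Each hot-function line in the output above is consistent with the function's sample count divided by the total sample count. A minimal sketch of that formatting follows; `format_hot_function` is a hypothetical helper name, and the zero-total guard is an assumption about how an empty profile should be displayed.

```rust
/// Format one "Top 10 Hot Functions" line from a rank, a function key,
/// its sample count, and the total number of samples.
pub fn format_hot_function(rank: usize, name: &str, samples: u64, total_samples: u64) -> String {
    // Guard against division by zero when nothing was sampled.
    let percent = if total_samples == 0 {
        0.0
    } else {
        samples as f64 * 100.0 / total_samples as f64
    };
    format!("{}. {} - {} samples ({:.2}%)", rank, name, samples, percent)
}
```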
## Architecture
### Profiling Flow
```
User Script
↓
Python Process (subprocess)
↓
py-spy Library Attaches (native API)
↓
Stack Trace Collection (10ms intervals)
↓
Function Execution Counts
↓
Hot Function Analysis
↓
Profile Results
```
### Data Structure
```rust
use std::collections::HashMap;

pub struct PyCoverageData {
    pub execution_count: HashMap<String, u64>, // Function name → sample count
    pub hot_functions: Vec<(String, u64)>,     // Top 10 hot functions
    pub total_samples: u64,                    // Total samples collected
}
```
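One way to populate this structure from raw per-function counts is sketched below. The constructor `from_counts` is hypothetical, and two details are assumptions rather than documented behavior: `total_samples` is taken as the sum of all counts, and ties are broken by function name so the ordering is deterministic.

```rust
use std::collections::HashMap;

pub struct PyCoverageData {
    pub execution_count: HashMap<String, u64>, // Function name → sample count
    pub hot_functions: Vec<(String, u64)>,     // Top 10 hot functions
    pub total_samples: u64,                    // Total samples collected
}

impl PyCoverageData {
    /// Build the profile summary from per-function sample counts.
    pub fn from_counts(execution_count: HashMap<String, u64>) -> Self {
        // Assumption: the total is the sum of all per-function counts.
        let total_samples: u64 = execution_count.values().sum();

        let mut hot_functions: Vec<(String, u64)> = execution_count
            .iter()
            .map(|(name, count)| (name.clone(), *count))
            .collect();

        // Highest count first; equal counts are ordered by name.
        hot_functions.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        hot_functions.truncate(10);

        Self {
            execution_count,
            hot_functions,
            total_samples,
        }
    }
}
```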
## Comparison: CLI vs Library
### Before (CLI approach)
```rust
// Spawn py-spy command
Command::new("py-spy")
    .arg("record")
    .arg("--format").arg("speedscope")
    .arg("--output").arg("/tmp/profile.json")
    .arg("--").arg("python3").arg(script)
    .output()?;
// Parse JSON output file
parse_speedscope_json("/tmp/profile.json")?;
```
**Issues:**
- Requires py-spy CLI installation (`pip install py-spy`)
- Temporary file management
- JSON parsing overhead
- External process dependency
### After (Library approach)
```rust
// Start Python process
let mut child = Command::new("python3").arg(script).spawn()?;
// Attach py-spy library
let mut spy = PythonSpy::new(child.id() as py_spy::Pid, &Config::default())?;
// Collect traces directly
let traces = spy.get_stack_traces()?;
```
**Benefits:**
- No external CLI tool needed
- Stack traces are returned in-process, with no intermediate file
- No file I/O overhead
- Native Rust types
## Testing
### Unit Tests
All existing unit tests work without changes because they test static analysis, which does not use py-spy.
### Integration Tests
Runtime profiling tests can now run without pip-installed py-spy (permissions still required):
```bash
# Run all tests
cargo test
# Run runtime profiling tests (requires sudo)
sudo cargo test -- --ignored
```
## Troubleshooting
### Permission Denied
```
Error: Failed to attach py-spy to process 12345: Permission denied
```
**Solution**: Run with `sudo` or adjust system permissions
### Process Not Found
```
Error: Failed to attach py-spy to process 12345: Failed to open process
```
**Solution**: Increase the sleep duration in the code, or ensure the Python process is still running when the profiler attaches.
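
An alternative to a longer fixed sleep is to retry the attachment a few times with a short delay between attempts. The helper below is a sketch only: `retry_with_delay` is a hypothetical name, not part of py-spy or quality-agent, and it is generic so that it can wrap any fallible operation.

```rust
use std::thread;
use std::time::Duration;

/// Run `attempt` until it succeeds or `max_attempts` tries have failed,
/// sleeping `delay` between tries. At least one attempt is always made.
/// Returns the first success, or the error from the last attempt.
pub fn retry_with_delay<T, E, F>(max_attempts: u32, delay: Duration, mut attempt: F) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
{
    let mut tries = 0;
    loop {
        tries += 1;
        match attempt() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the most recent error.
            Err(err) if tries >= max_attempts => return Err(err),
            // Otherwise wait briefly and try again.
            Err(_) => thread::sleep(delay),
        }
    }
}
```

With this helper, the attach call could be wrapped as `retry_with_delay(10, Duration::from_millis(100), || PythonSpy::new(pid, &config))`, where the attempt count and delay are illustrative values.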
### No Samples Collected
```
Warning: Failed to get stack trace: Process exited
```
**Solution**: Ensure your Python script runs for at least the profiling duration
## Future Enhancements
1. **Configuration Options**: Expose py-spy Config options (sample rate, native unwinding, etc.)
2. **Multiple Processes**: Support profiling multi-process Python applications
3. **Real-time Streaming**: Stream profiling data instead of batch collection
4. **Flamegraph Generation**: Direct flamegraph creation from collected data
5. **Cross-Platform Improvements**: Better permission handling per platform
## References
- [py-spy crate](https://crates.io/crates/py-spy)
- [py-spy GitHub](https://github.com/benfred/py-spy)
- [remoteprocess crate](https://crates.io/crates/remoteprocess) - Underlying library
## Dependencies
```toml
[dependencies]
py-spy = "0.4.1" # Native Python profiling
```
No additional system dependencies required!