# Case Study: APR CLI Commands Demo
This case study demonstrates creating test models and using 27 apr-cli commands for model inspection, validation, transformation, testing, and inference.
## The Problem
APR model files need comprehensive tooling for:
| Need | Traditional Approach | Problem |
|------|----------------------|---------|
| Inspection | Custom scripts | No standardization |
| Validation | Manual checksums | Incomplete coverage |
| Transformation | Framework-specific | Lock-in |
| Regression | Manual testing | Error-prone |
## The Solution: apr-cli
The `apr` CLI provides 29+ commands for complete model lifecycle management:
```bash
# Build the CLI
cargo build -p apr-cli
# Inspect model metadata
./target/debug/apr inspect model.apr --json
# Validate integrity (100-point QA)
./target/debug/apr validate model.apr --quality
# Quantize model
./target/debug/apr convert model.apr --quantize int8 -o model-int8.apr
```
## Complete Example
Run: `cargo run --example apr_cli_commands`
```rust,ignore
{{#include ../../../examples/apr_cli_commands.rs}}
```
## All Commands
### Model Inspection
#### 1. INSPECT - View Model Metadata
```bash
apr inspect model.apr # Basic info
apr inspect model.apr --json # JSON output
apr inspect model.apr --weights # Include tensor info
```
Shows model type, framework, hyperparameters, and training info.
#### 2. TENSORS - List Tensor Info
```bash
apr tensors model.apr # List all tensors
apr tensors model.apr --stats # Include statistics
apr tensors model.apr --json # JSON output
```
Lists tensor names, shapes, dtypes, and statistics.
#### 3. TRACE - Layer-by-Layer Analysis
```bash
apr trace model.apr # Basic trace
apr trace model.apr --verbose # Detailed trace
apr trace model.apr --json # JSON output
```
Analyzes model layer by layer for debugging inference.
#### 4. DEBUG - Debug Output
```bash
apr debug model.apr # Standard debug
apr debug model.apr --drama # Detailed drama mode
apr debug model.apr --hex --limit 64 # Hex dump
```
Provides detailed tensor inspection for debugging.
### Quality & Validation
#### 5. VALIDATE - Check Model Integrity
```bash
apr validate model.apr # Basic validation
apr validate model.apr --quality # 100-point QA checklist
apr validate model.apr --strict # Strict mode
```
Runs the 100-point quality assessment with grades A+ to F.
#### 6. LINT - Best Practices Check
```bash
apr lint model.apr # Check best practices
```
Static analysis for naming conventions, metadata completeness, and efficiency.
Checks:
- Standard tensor naming patterns (layer.0.weight, not l0_w)
- Required metadata (author, license, provenance)
- Tensor alignment (64-byte boundaries)
- Compression for large tensors (>1MB)
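The alignment and compression checks in the list above come down to simple arithmetic on each tensor's offset and size. The following is a minimal sketch only: `TensorEntry`, `lint_tensor`, and the reading of "1MB" as 1 MiB are assumptions made for illustration, not the apr-cli implementation.

```rust
/// Hypothetical tensor record for illustration; not the apr-cli data model.
struct TensorEntry {
    name: String,
    offset: u64,     // byte offset of the tensor data within the file
    size_bytes: u64, // uncompressed size of the tensor data
    compressed: bool,
}

const ALIGNMENT: u64 = 64; // 64-byte boundary from the checklist
const LARGE_TENSOR_BYTES: u64 = 1 << 20; // ">1MB" from the checklist, assumed to mean 1 MiB

/// Returns one human-readable warning per failed check.
fn lint_tensor(t: &TensorEntry) -> Vec<String> {
    let mut warnings = Vec::new();
    if t.offset % ALIGNMENT != 0 {
        warnings.push(format!("{}: offset {} is not 64-byte aligned", t.name, t.offset));
    }
    if t.size_bytes > LARGE_TENSOR_BYTES && !t.compressed {
        warnings.push(format!("{}: {} bytes stored uncompressed", t.name, t.size_bytes));
    }
    warnings
}
```

An empty result means the tensor passes both checks; a real linter would also cover the naming and metadata rules.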
#### 7. DIFF - Compare Two Models
```bash
apr diff model_v1.apr model_v2.apr # Compare models
apr diff model_v1.apr model_v2.apr --json # JSON output
```
Shows metadata and tensor differences between model versions.
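To make the tensor comparison concrete, here is a rough sketch of how two tensor tables could be compared by name and shape. The `TensorDiff` enum and `diff_tensors` function are hypothetical names introduced for this example; the real `apr diff` output and internals may differ.

```rust
use std::collections::BTreeMap;

/// One difference between two models' tensor tables (illustrative only).
#[derive(Debug, PartialEq)]
enum TensorDiff {
    Added(String),
    Removed(String),
    ShapeChanged(String),
}

/// Compares tensor name -> shape maps from an old and a new model.
fn diff_tensors(
    old: &BTreeMap<String, Vec<usize>>,
    new: &BTreeMap<String, Vec<usize>>,
) -> Vec<TensorDiff> {
    let mut diffs = Vec::new();
    // Tensors that disappeared or changed shape.
    for (name, shape) in old {
        match new.get(name) {
            None => diffs.push(TensorDiff::Removed(name.clone())),
            Some(s) if s != shape => diffs.push(TensorDiff::ShapeChanged(name.clone())),
            Some(_) => {}
        }
    }
    // Tensors that exist only in the new model.
    for name in new.keys() {
        if !old.contains_key(name) {
            diffs.push(TensorDiff::Added(name.clone()));
        }
    }
    diffs
}
```

Using a `BTreeMap` keeps the report in a stable, sorted order, which is convenient when the result is diffed again in CI.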
### Model Transformation
#### 8. CONVERT - Quantization/Optimization
```bash
apr convert model.apr --quantize int8 -o model-int8.apr
apr convert model.apr --quantize int4 -o model-int4.apr
apr convert model.apr --quantize fp16 -o model-fp16.apr
```
Applies quantization for reduced model size and faster inference.
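| Quantization | Size Reduction | Accuracy Impact |
|--------------|----------------|-----------------|
| fp16 | 50% | Minimal |
| int8 | 75% | Small |
| int4 | 87.5% | Moderate |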
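The size figures above are consistent with a 32-bit float baseline: each weight shrinks from 32 bits to 16, 8, or 4 bits. A small sketch of that arithmetic, assuming fp32 source weights and ignoring quantization scales and file metadata (`size_reduction_pct` is a hypothetical helper, not part of apr-cli):

```rust
/// Size reduction in percent when each weight goes from `from_bits` to `to_bits`.
/// Ignores per-block scales, zero-points, and file metadata, so real files
/// usually shrink slightly less than this.
fn size_reduction_pct(from_bits: u32, to_bits: u32) -> f64 {
    (1.0 - f64::from(to_bits) / f64::from(from_bits)) * 100.0
}
```

For example, `size_reduction_pct(32, 8)` evaluates to `75.0`, matching the int8 row.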
#### 9. EXPORT - Export to Other Formats
```bash
apr export model.apr --format safetensors -o model.safetensors
apr export model.apr --format gguf -o model.gguf
```
Exports APR models to other ecosystems:
- **SafeTensors** - HuggingFace ecosystem
- **GGUF** - llama.cpp / local inference
#### 10. MERGE - Merge Models
```bash
apr merge model1.apr model2.apr --strategy average -o merged.apr
apr merge model1.apr model2.apr --strategy weighted -o merged.apr
```
Combines multiple models using different strategies:
- **average** - Simple tensor averaging
- **weighted** - Weighted combination
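Both strategies can be viewed as the same element-wise operation with different weights. The sketch below assumes two tensors of equal length and a single scalar weight; the function names are hypothetical, and the actual `apr merge` weighting scheme may be configured differently.

```rust
/// Element-wise weighted combination: w * a + (1 - w) * b.
/// Returns None when the tensors have different lengths.
fn merge_weighted(a: &[f32], b: &[f32], w: f32) -> Option<Vec<f32>> {
    if a.len() != b.len() {
        return None; // shapes must match to merge element-wise
    }
    Some(a.iter().zip(b).map(|(x, y)| w * x + (1.0 - w) * y).collect())
}

/// Simple averaging is the weighted merge with w = 0.5.
fn merge_average(a: &[f32], b: &[f32]) -> Option<Vec<f32>> {
    merge_weighted(a, b, 0.5)
}
```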
### Import & Interop
#### 11. IMPORT - Import External Models
```bash
apr import external.safetensors -o imported.apr
apr import hf://org/repo -o model.apr --arch whisper
```
Imports from SafeTensors, HuggingFace Hub, and other formats.
### Testing & Regression
#### 12. CANARY - Regression Testing
```bash
# Create canary from original model
apr canary create model.apr --input ref.wav --output canary.json
# Check optimized model against canary
apr canary check model-optimized.apr --canary canary.json
```
Captures tensor statistics for regression testing after transformations (quantization, pruning).
Canary data includes:
- Tensor shapes and counts
- Mean, std, min, max for each tensor
- Drift tolerance checking
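The statistics and drift check listed above can be sketched in a few lines. This is an illustration under stated assumptions: `TensorStats`, `tensor_stats`, and `within_tolerance` are hypothetical names, the standard deviation is the population form, and the tolerance is a single absolute bound, none of which is confirmed to match the canary file format.

```rust
/// Summary statistics recorded per tensor (the fields named in the list above).
#[derive(Debug, Clone, Copy)]
struct TensorStats {
    mean: f64,
    std: f64,
    min: f64,
    max: f64,
}

/// Computes population statistics; returns None for an empty tensor.
fn tensor_stats(data: &[f32]) -> Option<TensorStats> {
    if data.is_empty() {
        return None;
    }
    let n = data.len() as f64;
    let mean = data.iter().map(|&x| f64::from(x)).sum::<f64>() / n;
    let var = data.iter().map(|&x| (f64::from(x) - mean).powi(2)).sum::<f64>() / n;
    let min = data.iter().copied().fold(f32::INFINITY, f32::min);
    let max = data.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    Some(TensorStats { mean, std: var.sqrt(), min: f64::from(min), max: f64::from(max) })
}

/// True when every statistic of `candidate` is within `tol` (absolute) of `reference`.
fn within_tolerance(reference: &TensorStats, candidate: &TensorStats, tol: f64) -> bool {
    (reference.mean - candidate.mean).abs() <= tol
        && (reference.std - candidate.std).abs() <= tol
        && (reference.min - candidate.min).abs() <= tol
        && (reference.max - candidate.max).abs() <= tol
}
```

In this sketch, the statistics captured from the original model play the role of the canary, and a quantized model passes when `within_tolerance` holds for every tensor.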
#### 13. PROBAR - Visual Regression Testing
```bash
apr probar model.apr -o probar_output # Create probar suite
apr probar model.apr -o output --format json # JSON format
```
Exports model data for visual regression testing.
### Help & Documentation
#### 14. EXPLAIN - Get Explanations
```bash
apr explain E002 # Explain error code
apr explain --tensor encoder.conv1.weight # Explain tensor by convention
apr explain --tensor conv1 --file model.safetensors # Look up in actual model
apr explain --file model.apr # Analyze architecture
apr explain --kernel llama # Kernel pipeline for family
apr explain --kernel qwen2 --json # JSON output for tooling
apr explain --kernel /path/to/config.json --verbose # Resolve from config.json
apr explain --kernel Qwen/Qwen2.5-Coder-0.5B-Instruct # Resolve from HF repo
apr explain --kernel gemma --proof-status # Include proof status
```
Provides context-aware explanations for errors, tensors, model architectures, and kernel pipelines. When `--file` is provided with `--tensor`, the command looks up the tensor in the actual model via RosettaStone (supports APR, GGUF, SafeTensors). The `--kernel` flag explains which kernel equivalence class (A-F) a model uses, the architectural constraints that drive selection, and the kernel ops pipeline.
### Interactive
#### 15. TUI - Interactive Terminal UI
```bash
apr tui model.apr # Launch interactive UI
```
Interactive terminal interface for model exploration with four tabs:
| Tab | Key | Description |
|-----|-----|-------------|
| Overview | `1` | Model metadata, hyperparameters, training info |
| Tensors | `2` | Tensor list with shapes, dtypes, sizes |
| Stats | `3` | Tensor statistics (mean, std, min, max, zeros, NaNs) |
| Help | `?` | Keyboard shortcuts and navigation help |
**Keyboard Navigation:**
- `1`, `2`, `3`, `?` - Switch tabs directly
- `Tab` / `Shift+Tab` - Cycle through tabs
- `j` / `↓` - Next item in list
- `k` / `↑` - Previous item in list
- `q` / `Esc` - Quit
### Inference (requires `--features inference`)
Build with inference support:
```bash
cargo build -p apr-cli --features inference
```
#### 16. RUN - Run Model Inference
```bash
apr run model.apr --input "[1.0, 2.0]" # JSON array input
apr run model.apr --input "1.0,2.0" # CSV input
apr run model.apr --input "[1.0, 2.0]" --json # JSON output
```
Accepts APR, SafeTensors, or GGUF files; what the command does depends on the format:
| Format | Capability |
|--------|------------|
| APR (.apr) | Full ML inference via realizar |
| SafeTensors (.safetensors) | Tensor inspection |
| GGUF (.gguf) | Model inspection (mmap) |
**Input Formats:**
- JSON array: `"[1.0, 2.0, 3.0]"`
- CSV: `"1.0,2.0,3.0"`
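Both input forms carry the same flat list of numbers, so a single parser can accept either. A minimal sketch, assuming only flat numeric lists need to be handled (`parse_input` is a hypothetical helper and not the parser apr-cli uses):

```rust
/// Parses either "[1.0, 2.0]" (JSON-style array) or "1.0,2.0" (CSV) into floats.
/// Only flat numeric lists are handled; nested arrays are rejected as parse errors.
fn parse_input(raw: &str) -> Result<Vec<f32>, std::num::ParseFloatError> {
    let trimmed = raw.trim();
    // Drop one pair of surrounding brackets if present.
    let inner = trimmed
        .strip_prefix('[')
        .and_then(|s| s.strip_suffix(']'))
        .unwrap_or(trimmed);
    inner
        .split(',')
        .map(str::trim)
        .filter(|tok| !tok.is_empty())
        .map(str::parse::<f32>)
        .collect()
}
```

Collecting into a `Result` stops at the first token that is not a number, so malformed input surfaces as an error rather than a silently shortened vector.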
#### 17. SERVE - Inference Server and Capacity Planning
**Serve Plan** — Pre-flight capacity planning (no weights loaded):
```bash
# Plan from local file
apr serve plan model.gguf --gpu
# Plan from HuggingFace repo (fetches only ~2KB config.json)
apr serve plan hf://Qwen/Qwen2.5-Coder-1.5B-Instruct --gpu --quant Q4_K_M
# JSON output for CI/tooling
apr serve plan microsoft/phi-2 --gpu --format json
```
**Serve Run** — Start inference server:
```bash
apr serve run model.apr --port 8080 # Start on port 8080
apr serve run model.apr --host 0.0.0.0 --port 3000 # Bind to all interfaces
```
Starts a REST API server for model inference:
**APR Models (full inference):**
```bash
# Health check
curl http://localhost:8080/health
# Run inference
curl -X POST http://localhost:8080/predict \
-H "Content-Type: application/json" \
-d '{"input": [1.0, 2.0]}'
```
**Server Features:**
- `/health` - Health check endpoint
- `/predict` - Inference endpoint (APR models)
- `/model` - Model info endpoint (GGUF/SafeTensors)
- `/tensors` - Tensor listing (SafeTensors)
- Graceful shutdown via Ctrl+C
### Chat & Comparison
#### 18. CHAT - Interactive Chat (LLM models)
```bash
apr chat model.gguf # Interactive chat
apr chat model.gguf --system "You are a helpful assistant" # Custom system prompt
```
#### 19. FLOW - Visualize Data Flow
```bash
apr flow model.safetensors # Show data flow
apr flow model.gguf --json # JSON output (architecture, groups)
apr flow model.apr --verbose # Verbose with shapes
```
Detects architecture (Encoder-Decoder, Decoder-Only, Encoder-Only) and groups tensors by layer. Supports APR, GGUF, and SafeTensors.
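One plausible way to do this kind of detection is to look at tensor name prefixes and numeric path segments. The sketch below is a heuristic written for illustration: the `encoder.` / `decoder.` prefixes, the `Architecture` enum, and both function names are assumptions, and the rules `apr flow` actually applies may be more involved.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum Architecture {
    EncoderDecoder,
    DecoderOnly,
    EncoderOnly,
    Unknown,
}

/// Guesses the architecture from tensor name prefixes (an assumed heuristic).
fn detect_architecture(names: &[&str]) -> Architecture {
    let has_enc = names.iter().any(|n| n.starts_with("encoder."));
    let has_dec = names.iter().any(|n| n.starts_with("decoder."));
    match (has_enc, has_dec) {
        (true, true) => Architecture::EncoderDecoder,
        (false, true) => Architecture::DecoderOnly,
        (true, false) => Architecture::EncoderOnly,
        (false, false) => Architecture::Unknown,
    }
}

/// Groups names by their first numeric path segment,
/// e.g. "decoder.layers.3.attn.weight" goes into group 3.
/// Names without a numeric segment are left out.
fn group_by_layer<'a>(names: &[&'a str]) -> BTreeMap<usize, Vec<&'a str>> {
    let mut groups: BTreeMap<usize, Vec<&'a str>> = BTreeMap::new();
    for &name in names {
        if let Some(idx) = name.split('.').find_map(|seg| seg.parse::<usize>().ok()) {
            groups.entry(idx).or_default().push(name);
        }
    }
    groups
}
```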
#### 20. COMPARE-HF - Compare Against HuggingFace Source
```bash
apr compare-hf model.apr --hf openai/whisper-tiny # APR format
apr compare-hf model.gguf --hf openai/whisper-tiny # GGUF format
apr compare-hf model.safetensors --hf openai/whisper-tiny # SafeTensors format
apr compare-hf model.apr --hf openai/whisper-tiny --json # JSON output
```
Auto-detects local model format. Compares tensor-by-tensor against HuggingFace source.
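A tensor-by-tensor comparison typically reduces each pair of tensors to a single error figure. As a rough sketch, the maximum absolute difference is one common choice; whether `apr compare-hf` reports this metric or another is an assumption here, and `max_abs_diff` is a hypothetical helper.

```rust
/// Largest absolute element-wise difference between two tensors,
/// or None when their lengths differ.
/// Note: `f32::max` skips NaN, so NaN differences are not reported here.
fn max_abs_diff(local: &[f32], reference: &[f32]) -> Option<f32> {
    if local.len() != reference.len() {
        return None;
    }
    Some(
        local
            .iter()
            .zip(reference)
            .map(|(a, b)| (a - b).abs())
            .fold(0.0_f32, f32::max),
    )
}
```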
### HuggingFace Hub
#### 21. PUBLISH - Push to HuggingFace Hub
```bash
apr publish model_dir/ org/model-name --dry-run
```
#### 22. PULL - Download Model
```bash
apr pull hf://Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF -o ./models/
```
### Benchmarking & QA
#### 23. QA - Falsifiable QA Checklist
```bash
apr qa model.gguf # Run 8-gate QA checklist
apr qa model.gguf --json # JSON output
```
#### 24. QUALIFY - Cross-Subcommand Smoke Test
```bash
apr qualify model.gguf # Smoke test all 11 tools
apr qualify model.gguf --tier full # Full tier (+contracts +playbook)
apr qualify model.gguf --json # JSON output for CI
apr qualify model.gguf --skip validate,validate_quality # Skip slow gates
```
Runs every diagnostic CLI tool against a model to verify no crashes. Three tiers: smoke (11 in-process gates), standard (+contract audit), full (+playbook check).
#### 25. SHOWCASE - Performance Benchmark
```bash
apr showcase model.gguf --warmup 3 --iterations 10
```
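The `--warmup` and `--iterations` flags suggest the usual benchmark pattern: a few untimed runs to warm caches, followed by timed runs. The sketch below shows that pattern in general terms; it assumes this is what the flags mean, and `bench` and `mean_duration` are hypothetical helpers rather than apr-cli internals.

```rust
use std::time::{Duration, Instant};

/// Runs `f` `warmup` times untimed, then `iterations` times timed,
/// returning one duration per timed run.
fn bench<F: FnMut()>(mut f: F, warmup: usize, iterations: usize) -> Vec<Duration> {
    for _ in 0..warmup {
        f();
    }
    (0..iterations)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed()
        })
        .collect()
}

/// Mean of the samples; None when there are no samples.
fn mean_duration(samples: &[Duration]) -> Option<Duration> {
    if samples.is_empty() {
        return None;
    }
    let total: Duration = samples.iter().sum();
    Some(total / samples.len() as u32)
}
```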
#### 26. PROFILE - Deep Performance Profiling
```bash
apr profile model.gguf --roofline
```
#### 27. BENCH - Run Benchmarks
```bash
apr bench model.gguf --iterations 100
```
## Example Output
Running the example creates demo models:
```
=== APR CLI Commands Demo ===
--- Part 1: Creating Demo Model ---
Adding tensors...
Model type: Linear Regression
Tensors: 4
Size: 1690 bytes
Created: /tmp/apr_cli_demo/demo_model.apr
--- Part 2: Creating Second Model (for diff) ---
Model type: Linear Regression v2
Tensors: 4
Size: 1707 bytes
Created: /tmp/apr_cli_demo/demo_model_v2.apr
```
## Use Cases
### CI/CD Model Validation
```bash
# In CI pipeline: fail the job if validation or linting fails
if ! apr validate model.apr --strict --min-score 90 || ! apr lint model.apr; then
    echo "Model validation failed" >&2
    exit 1
fi
```
### Model Optimization Pipeline
```bash
# Quantize for production
apr convert model.apr --quantize int8 -o model-int8.apr
# Verify no regression
apr canary create model.apr --input test.wav --output canary.json
apr canary check model-int8.apr --canary canary.json
# Export for deployment
apr export model-int8.apr --format gguf -o model.gguf
```
### Model Version Comparison
```bash
# Compare before/after optimization
apr diff model.apr model-int8.apr
```
### Debugging Inference Issues
```bash
# Layer-by-layer trace
apr trace model.apr --verbose
# Drama mode for detailed analysis
apr debug model.apr --drama
```
## Benefits
| Benefit | Description |
|---------|-------------|
| Standardized | Consistent CLI for all APR models |
| Comprehensive | 29+ commands cover full lifecycle |
| Scriptable | JSON output for automation |
| Debuggable | Deep inspection with drama mode |
| Validatable | 100-point QA with grades |
| Transformable | Quantization and format conversion |
| Testable | Canary regression testing |
| Inference | Run predictions and serve REST APIs |
## Related Resources
- [Case Study: APR with JSON Metadata](./apr-with-metadata.md)
- [The .apr Format: A Five Whys Deep Dive](./apr-format-deep-dive.md)
- [APR Loading Modes](./apr-loading-modes.md)
- [apr (APR Model Operations CLI)](../tools/apr-cli.md)