# Developer's Guide to Shell History Models
Build personalized ML models from your shell history using the `.apr` format. This guide follows **EXTREME TDD** methodology—every code example compiles and runs.
## Why Shell History is Perfect for ML
Shell commands exhibit strong **Markov properties**:
```text
P(next_token | all_previous) ≈ P(next_token | last_n_tokens)
```
Translation: What you type next depends mostly on your last few words, not your entire history.
**Evidence from real data:**
- `git` → 65% followed by `status`, `commit`, `push`, `pull`
- `cargo` → 70% followed by `build`, `test`, `run`, `clippy`
- `cd` → 80% followed by `..`, project names, or `~`
This predictability makes N-gram models highly effective with minimal compute.
## Part 1: First Principles - Building from Scratch
### Step 1: Define the Core Data Structure (RED)
```rust
use std::collections::HashMap;
/// N-gram frequency table
/// Maps context (previous n-1 tokens) → next token → count
#[derive(Default)]
struct NgramTable {
/// context → (next_token → frequency)
table: HashMap<String, HashMap<String, u32>>,
}
impl NgramTable {
fn new() -> Self {
Self::default()
}
/// Record an observation: given context, next token appeared
fn observe(&mut self, context: &str, next_token: &str) {
self.table
.entry(context.to_string())
.or_default()
.entry(next_token.to_string())
.and_modify(|c| *c += 1)
.or_insert(1);
}
/// Get probability distribution for context
fn predict(&self, context: &str) -> Vec<(String, f32)> {
let Some(counts) = self.table.get(context) else {
return vec![];
};
let total: u32 = counts.values().sum();
let mut probs: Vec<_> = counts
.iter()
.map(|(token, count)| {
(token.clone(), *count as f32 / total as f32)
})
.collect();
// Sort by probability descending
probs.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
probs
}
}
// Test: Empty table returns empty predictions
let table = NgramTable::new();
assert!(table.predict("git").is_empty());
// Test: Single observation
let mut table = NgramTable::new();
table.observe("git", "status");
let preds = table.predict("git");
assert_eq!(preds.len(), 1);
assert_eq!(preds[0].0, "status");
assert!((preds[0].1 - 1.0).abs() < 0.001); // 100% probability
```
### Step 2: Train on Command Sequences (GREEN)
```rust
use std::collections::HashMap;
#[derive(Default)]
struct NgramTable {
table: HashMap<String, HashMap<String, u32>>,
n: usize,
}
impl NgramTable {
fn with_n(n: usize) -> Self {
Self { table: HashMap::new(), n: n.max(2) }
}
fn observe(&mut self, context: &str, next_token: &str) {
self.table
.entry(context.to_string())
.or_default()
.entry(next_token.to_string())
.and_modify(|c| *c += 1)
.or_insert(1);
}
/// Train on a single command
fn train_command(&mut self, command: &str) {
let tokens: Vec<&str> = command.split_whitespace().collect();
if tokens.is_empty() {
return;
}
// Empty context predicts first token
self.observe("", tokens[0]);
// Build n-grams from token sequence
for i in 0..tokens.len() {
let context_start = i.saturating_sub(self.n - 1);
let context = tokens[context_start..=i].join(" ");
if i + 1 < tokens.len() {
self.observe(&context, tokens[i + 1]);
}
}
}
fn predict(&self, context: &str) -> Vec<(String, f32)> {
let Some(counts) = self.table.get(context) else {
return vec![];
};
let total: u32 = counts.values().sum();
let mut probs: Vec<_> = counts
.iter()
.map(|(t, c)| (t.clone(), *c as f32 / total as f32))
.collect();
probs.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
probs
}
}
// Train on real command patterns
let mut model = NgramTable::with_n(3);
let commands = [
"git status",
"git commit -m fix",
"git push",
"git status", // Repeated - should have higher probability
"git status",
"cargo build",
"cargo test",
"cargo build", // Repeated
];
for cmd in &commands {
model.train_command(cmd);
}
// Test: "git" context should predict "status" highest (3x vs 1x each)
let preds = model.predict("git");
assert!(!preds.is_empty());
assert_eq!(preds[0].0, "status"); // Most frequent
// Test: "cargo" context
let preds = model.predict("cargo");
assert_eq!(preds[0].0, "build"); // 2x vs 1x for test
// Test: Empty context predicts first tokens
let preds = model.predict("");
### Saving Models with aprender
The `.apr` format provides:
- **32-byte header** with magic, version, CRC32
- **MessagePack metadata** for model info
- **Bincode payload** for efficient serialization
- **Optional encryption** for privacy
```rust,ignore
use aprender::format::{save, load, ModelType, SaveOptions};
use serde::{Serialize, Deserialize};
use std::collections::HashMap;
#[derive(Serialize, Deserialize)]
struct ShellModel {
n: usize,
ngrams: HashMap<String, HashMap<String, u32>>,
total_commands: usize,
}
impl ShellModel {
fn new(n: usize) -> Self {
Self {
n,
ngrams: HashMap::new(),
total_commands: 0,
}
}
fn train(&mut self, commands: &[String]) {
self.total_commands = commands.len();
for cmd in commands {
let tokens: Vec<&str> = cmd.split_whitespace().collect();
if tokens.is_empty() {
continue;
}
// Empty context → first token
self.ngrams
.entry(String::new())
.or_default()
.entry(tokens[0].to_string())
.and_modify(|c| *c += 1)
.or_insert(1);
// Build context n-grams
for i in 0..tokens.len() {
let start = i.saturating_sub(self.n - 1);
let context = tokens[start..=i].join(" ");
if i + 1 < tokens.len() {
self.ngrams
.entry(context)
.or_default()
.entry(tokens[i + 1].to_string())
.and_modify(|c| *c += 1)
.or_insert(1);
}
}
}
}
}
// Create and train model
let mut model = ShellModel::new(3);
model.train(&[
"git status".to_string(),
"git commit -m test".to_string(),
"cargo build".to_string(),
]);
// Save to .apr format
let options = SaveOptions::default()
.with_name("my-shell-model")
.with_description("3-gram shell completion model");
save(&model, ModelType::Custom, "shell.apr", options)?;
// Load and verify
let loaded: ShellModel = load("shell.apr", ModelType::Custom)?;
assert_eq!(loaded.n, 3);
assert_eq!(loaded.total_commands, 3);
```
### Inspecting .apr Files
```bash
# View model metadata
apr inspect shell.apr
# Output:
# Model: my-shell-model
# Type: Custom
# Description: 3-gram shell completion model
# Created: 2025-11-26T15:30:00Z
# Size: 2.1 KB
# Checksum: CRC32 valid
```
## Part 3: Encryption for Privacy
Shell history contains sensitive patterns. Encrypt your models:
```rust,ignore
use aprender::format::{save_encrypted, load_encrypted, ModelType, SaveOptions};
// Save with password encryption (AES-256-GCM + Argon2id)
let options = SaveOptions::default()
.with_name("private-shell-model")
.with_description("Encrypted personal shell history model");
save_encrypted(&model, ModelType::Custom, "shell.apr", options, "my-password")?;
// Load requires password
let loaded: ShellModel = load_encrypted("shell.apr", ModelType::Custom, "my-password")?;
// Wrong password fails with DecryptionFailed error
let result: Result<ShellModel, _> = load_encrypted("shell.apr", ModelType::Custom, "wrong");
assert!(result.is_err());
```
### Recipient Encryption (X25519)
For sharing models with specific people:
```rust,ignore
use aprender::format::{save_for_recipient, load_as_recipient, ModelType, SaveOptions};
use aprender::format::x25519::{generate_keypair, PublicKey, SecretKey};
// Generate recipient keypair (they share public key with you)
let (recipient_secret, recipient_public) = generate_keypair();
// Save encrypted for specific recipient
let options = SaveOptions::default()
.with_name("team-shell-model");
save_for_recipient(&model, ModelType::Custom, "team.apr", options, &recipient_public)?;
// Only recipient can decrypt
let loaded: ShellModel = load_as_recipient("team.apr", ModelType::Custom, &recipient_secret)?;
```
## Part 4: Single Binary Deployment
Embed your trained model directly in a Rust binary:
```rust,ignore
// In build.rs or your binary
const MODEL_BYTES: &[u8] = include_bytes!("../shell.apr");
fn main() {
use aprender::format::load_from_bytes;
// Load at runtime - zero filesystem access
let model: ShellModel = load_from_bytes(MODEL_BYTES, ModelType::Custom)
.expect("embedded model should be valid");
// Use model
let suggestions = model.suggest("git ");
println!("Suggestions: {:?}", suggestions);
}
```
**Benefits:**
- Zero runtime dependencies
- Works in sandboxed environments
- Tamper-proof (model is part of binary hash)
- ~500KB overhead for typical shell model
### Complete Bundling Pipeline
```bash
# 1. Train on your history
aprender-shell train --output shell.apr
# 2. Optionally encrypt
apr encrypt shell.apr --password "$SECRET" --output shell-enc.apr
# 3. Embed in binary (Cargo.toml)
# [package]
# include = ["shell.apr"]
# 4. Build release
cargo build --release
# Result: Single binary with embedded, optionally encrypted model
```
## Part 5: Extending the Model
### Add Command Categories
```rust
use std::collections::HashMap;
#[derive(Default)]
struct CategorizedModel {
/// Category → NgramTable
categories: HashMap<String, HashMap<String, HashMap<String, u32>>>,
}
impl CategorizedModel {
fn categorize(command: &str) -> &'static str {
let first = command.split_whitespace().next().unwrap_or("");
match first {
"git" | "gh" => "vcs",
"cargo" | "rustc" | "rustup" => "rust",
"docker" | "kubectl" | "helm" => "containers",
"npm" | "yarn" | "pnpm" => "node",
"cd" | "ls" | "cat" | "grep" | "find" => "filesystem",
_ => "other",
}
}
fn train(&mut self, command: &str) {
let category = Self::categorize(command);
let tokens: Vec<&str> = command.split_whitespace().collect();
if tokens.is_empty() {
return;
}
let table = self.categories.entry(category.to_string()).or_default();
// Train within category
table
.entry(String::new())
.or_default()
.entry(tokens[0].to_string())
.and_modify(|c| *c += 1)
.or_insert(1);
for i in 0..tokens.len().saturating_sub(1) {
table
.entry(tokens[i].to_string())
.or_default()
.entry(tokens[i + 1].to_string())
.and_modify(|c| *c += 1)
.or_insert(1);
}
}
}
let mut model = CategorizedModel::default();
model.train("git status");
model.train("git commit");
model.train("cargo build");
model.train("cargo test");
model.train("ls -la");
// Verify categorization
assert!(model.categories.contains_key("vcs"));
assert!(model.categories.contains_key("rust"));
assert!(model.categories.contains_key("filesystem"));
```
### Add Time-Weighted Decay
Recent commands matter more than old ones:
```rust
use std::collections::HashMap;
struct DecayingModel {
/// context → (token → weighted_count)
ngrams: HashMap<String, HashMap<String, f32>>,
/// Decay factor per observation (0.99 = 1% decay)
decay: f32,
}
impl DecayingModel {
fn new(decay: f32) -> Self {
Self {
ngrams: HashMap::new(),
decay: decay.clamp(0.9, 0.999),
}
}
fn observe(&mut self, context: &str, token: &str) {
// Decay all existing counts first
for counts in self.ngrams.values_mut() {
for count in counts.values_mut() {
*count *= self.decay;
}
}
// Add new observation with weight 1.0
self.ngrams
.entry(context.to_string())
.or_default()
.entry(token.to_string())
.and_modify(|c| *c += 1.0)
.or_insert(1.0);
}
fn predict(&self, context: &str) -> Vec<(String, f32)> {
let Some(counts) = self.ngrams.get(context) else {
return vec![];
};
let total: f32 = counts.values().sum();
if total < 0.001 {
return vec![];
}
let mut probs: Vec<_> = counts
.iter()
.map(|(t, c)| (t.clone(), *c / total))
.collect();
probs.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
probs
}
}
// Test decay behavior
let mut model = DecayingModel::new(0.9); // 10% decay per observation
// Old observation
model.observe("git", "status");
// Newer observation (git status decays, commit is fresh)
model.observe("git", "commit");
let preds = model.predict("git");
// "commit" should be weighted higher (fresher)
assert_eq!(preds[0].0, "commit");
```
### Privacy Filter
Filter sensitive commands before training:
```rust
struct PrivacyFilter {
sensitive_patterns: Vec<String>,
}
impl PrivacyFilter {
fn new() -> Self {
Self {
sensitive_patterns: vec![
"password".to_string(),
"passwd".to_string(),
"secret".to_string(),
"token".to_string(),
"api_key".to_string(),
"AWS_SECRET".to_string(),
"GITHUB_TOKEN".to_string(),
"Authorization:".to_string(),
],
}
}
fn is_safe(&self, command: &str) -> bool {
let lower = command.to_lowercase();
// Check sensitive patterns
for pattern in &self.sensitive_patterns {
if lower.contains(&pattern.to_lowercase()) {
return false;
}
}
// Skip history manipulation
if command.starts_with("history") || command.starts_with("fc ") {
return false;
}
// Skip very short commands
if command.len() < 2 {
return false;
}
true
}
fn filter(&self, commands: Vec<String>) -> Vec<String> {
commands.into_iter().filter(|c| self.is_safe(c)).collect()
}
}
let filter = PrivacyFilter::new();
// Safe commands pass through
assert!(filter.is_safe("git push origin main"));
assert!(filter.is_safe("cargo build --release"));
// Sensitive commands are blocked
assert!(!filter.is_safe("export API_KEY=secret123"));
assert!(!filter.is_safe("curl -H 'Authorization: Bearer token'"));
assert!(!filter.is_safe("echo $PASSWORD"));
// History manipulation blocked
assert!(!filter.is_safe("history -c"));
assert!(!filter.is_safe("fc -l"));
// Filter a batch
let commands = vec![
"git status".to_string(),
"export SECRET=abc".to_string(),
"cargo test".to_string(),
];
let safe = filter.filter(commands);
assert_eq!(safe.len(), 2);
assert_eq!(safe[0], "git status");
assert_eq!(safe[1], "cargo test");
```
## Part 6: Complete Working Example
```rust,ignore
//! Complete shell history model with .apr persistence
//!
//! cargo run --example shell_history_model
use aprender::format::{save, load, ModelType, SaveOptions};
use serde::{Serialize, Deserialize};
use std::collections::HashMap;
use std::path::Path;
#[derive(Serialize, Deserialize, Default)]
pub struct ShellHistoryModel {
n: usize,
ngrams: HashMap<String, HashMap<String, u32>>,
command_freq: HashMap<String, u32>,
total_commands: usize,
}
impl ShellHistoryModel {
pub fn new(n: usize) -> Self {
Self {
n: n.clamp(2, 5),
..Default::default()
}
}
pub fn train(&mut self, commands: &[String]) {
for cmd in commands {
self.train_command(cmd);
}
}
fn train_command(&mut self, cmd: &str) {
self.total_commands += 1;
*self.command_freq.entry(cmd.to_string()).or_insert(0) += 1;
let tokens: Vec<&str> = cmd.split_whitespace().collect();
if tokens.is_empty() {
return;
}
// Empty context → first token
self.observe("", tokens[0]);
// Build n-grams
for i in 0..tokens.len() {
let start = i.saturating_sub(self.n - 1);
let context = tokens[start..=i].join(" ");
if i + 1 < tokens.len() {
self.observe(&context, tokens[i + 1]);
}
}
}
fn observe(&mut self, context: &str, token: &str) {
self.ngrams
.entry(context.to_string())
.or_default()
.entry(token.to_string())
.and_modify(|c| *c += 1)
.or_insert(1);
}
pub fn suggest(&self, prefix: &str, count: usize) -> Vec<(String, f32)> {
let tokens: Vec<&str> = prefix.trim().split_whitespace().collect();
if tokens.is_empty() {
return self.top_first_tokens(count);
}
let start = tokens.len().saturating_sub(self.n - 1);
let context = tokens[start..].join(" ");
let Some(next_tokens) = self.ngrams.get(&context) else {
return vec![];
};
let total: u32 = next_tokens.values().sum();
let mut suggestions: Vec<_> = next_tokens
.iter()
.map(|(token, count)| {
let completion = format!("{} {}", prefix, token);
let prob = *count as f32 / total as f32;
(completion, prob)
})
.collect();
suggestions.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
suggestions.truncate(count);
suggestions
}
fn top_first_tokens(&self, count: usize) -> Vec<(String, f32)> {
let Some(firsts) = self.ngrams.get("") else {
return vec![];
};
let total: u32 = firsts.values().sum();
let mut results: Vec<_> = firsts
.iter()
.map(|(t, c)| (t.clone(), *c as f32 / total as f32))
.collect();
results.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
results.truncate(count);
results
}
pub fn save_to_apr(&self, path: &Path) -> Result<(), aprender::error::AprenderError> {
let options = SaveOptions::default()
.with_name("shell-history-model")
.with_description(&format!(
"{}-gram model trained on {} commands",
self.n, self.total_commands
));
save(self, ModelType::Custom, path, options)
}
pub fn load_from_apr(path: &Path) -> Result<Self, aprender::error::AprenderError> {
load(path, ModelType::Custom)
}
pub fn stats(&self) -> ModelStats {
ModelStats {
n: self.n,
total_commands: self.total_commands,
unique_commands: self.command_freq.len(),
ngram_count: self.ngrams.values().map(|m| m.len()).sum(),
}
}
}
#[derive(Debug)]
pub struct ModelStats {
pub n: usize,
pub total_commands: usize,
pub unique_commands: usize,
pub ngram_count: usize,
}
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Simulate shell history
let history = vec![
"git status",
"git add .",
"git commit -m fix",
"git push",
"git status",
"git log --oneline",
"cargo build",
"cargo test",
"cargo build --release",
"cargo clippy",
]
.into_iter()
.map(String::from)
.collect::<Vec<_>>();
// Train model
let mut model = ShellHistoryModel::new(3);
model.train(&history);
// Show stats
let stats = model.stats();
println!("Model Statistics:");
println!(" N-gram size: {}", stats.n);
println!(" Total commands: {}", stats.total_commands);
println!(" Unique commands: {}", stats.unique_commands);
println!(" N-gram count: {}", stats.ngram_count);
// Test suggestions
println!("\nSuggestions for 'git ':");
for (suggestion, prob) in model.suggest("git ", 5) {
println!(" {:.1}% {}", prob * 100.0, suggestion);
}
println!("\nSuggestions for 'cargo ':");
for (suggestion, prob) in model.suggest("cargo ", 5) {
println!(" {:.1}% {}", prob * 100.0, suggestion);
}
// Save to .apr
let path = std::path::Path::new("shell_history.apr");
model.save_to_apr(path)?;
println!("\nModel saved to: {}", path.display());
// Reload and verify
let loaded = ShellHistoryModel::load_from_apr(path)?;
assert_eq!(loaded.total_commands, model.total_commands);
println!("Model reloaded successfully!");
// Cleanup
std::fs::remove_file(path)?;
Ok(())
}
```
## Part 7: Model Validation with aprender Metrics
The `aprender-shell` CLI uses aprender's ranking metrics for proper evaluation:
```bash
# Train on your history
aprender-shell train
# Validate with holdout evaluation
aprender-shell validate
```
### Ranking Metrics (aprender::metrics::ranking)
```rust,ignore
use aprender::metrics::ranking::{hit_at_k, mrr, RankingMetrics};
// Hit@K: Is correct answer in top K predictions?
let predictions = vec!["git commit", "git push", "git pull"];
let target = "git push";
assert_eq!(hit_at_k(&predictions, target, 1), 0.0); // Not #1
assert_eq!(hit_at_k(&predictions, target, 2), 1.0); // In top 2
// Mean Reciprocal Rank: 1/rank of correct answer
let all_predictions = vec![
vec!["git commit", "git push"], // target at rank 2 → RR = 0.5
vec!["cargo test", "cargo build"], // target at rank 1 → RR = 1.0
];
let targets = vec!["git push", "cargo test"];
let score = mrr(&all_predictions, &targets); // (0.5 + 1.0) / 2 = 0.75
// Comprehensive metrics
let metrics = RankingMetrics::compute(&all_predictions, &targets);
println!("Hit@1: {:.1}%", metrics.hit_at_1 * 100.0);
println!("Hit@5: {:.1}%", metrics.hit_at_5 * 100.0);
println!("MRR: {:.3}", metrics.mrr);
```
### Validation Output
```text
🔬 aprender-shell: Model Validation
📂 History file: ~/.zsh_history
📊 Total commands: 21,763
⚙️ N-gram size: 3
📈 Train/test split: 80% / 20%
═══════════════════════════════════════════
VALIDATION RESULTS
═══════════════════════════════════════════
Training set: 17,410 commands
Test set: 4,353 commands
Evaluated: 3,857 commands
───────────────────────────────────────────
Hit@1 (top 1): 13.3%
Hit@5 (top 5): 26.2%
Hit@10 (top 10): 30.7%
MRR (Mean Recip): 0.181
═══════════════════════════════════════════
```
**Interpretation:**
- **Hit@5 ~27%**: Model suggests correct command in top 5 for ~1 in 4 predictions
- **MRR ~0.18**: Average rank of correct answer is ~5th position
- This is realistic for shell completion given command diversity
## Part 8: Synthetic Data Augmentation
Improve model coverage with three strategies:
```bash
# Generate 5000 synthetic commands and retrain
aprender-shell augment --count 5000
```
### CLI Command Templates
```rust,ignore
use aprender_shell::synthetic::CommandGenerator;
let gen = CommandGenerator::new();
let commands = gen.generate(1000);
// Generates realistic dev commands:
// - git status, git commit -m, git push --force
// - cargo build --release, cargo test --lib
// - docker run -it, kubectl get pods
// - npm install --save-dev, pip install -r
```
### Mutation Engine
```rust,ignore
use aprender_shell::synthetic::CommandMutator;
let mutator = CommandMutator::new();
// Original: "git commit -m test"
// Mutations:
// - "git add -m test" (command substitution)
// - "git commit -am test" (flag substitution)
// - "git commit test" (flag removal)
let mutations = mutator.mutate("git commit -m test");
```
### Coverage-Guided Generation
```rust,ignore
use aprender_shell::synthetic::{SyntheticPipeline, CoverageGuidedGenerator};
use std::collections::HashSet;
// Extract known n-grams from current model
let known_ngrams: HashSet<String> = model.ngram_keys().collect();
// Generate commands that maximize new n-gram coverage
let pipeline = SyntheticPipeline::new();
let result = pipeline.generate(&real_history, known_ngrams, 5000);
println!("New n-grams added: {}", result.report.new_ngrams);
println!("Coverage gain: {:.1}%", result.report.coverage_gain * 100.0);
```
### Augmentation Output
```text
🧬 aprender-shell: Data Augmentation
📂 History file: ~/.zsh_history
📊 Real commands: 21,761
🔢 Known n-grams: 39,176
🧪 Generating synthetic commands... done!
📈 Coverage Report:
Synthetic commands: 5,000
New n-grams added: 5,473
Coverage gain: 99.0%
✅ Augmented model saved
📊 Model Statistics:
Total training commands: 26,761
Unique n-grams: 46,340 (+18%)
Vocabulary size: 21,101 (+31%)
```
## Summary
| N-gram table | Token prediction | O(1) lookup |
| Trie index | Prefix completion | O(k) where k=prefix length |
| .apr format | Persistence + metadata | ~2KB overhead |
| Encryption | Privacy protection | +50ms save/load |
| Single binary | Zero-dependency deployment | +500KB binary size |
| **Ranking metrics** | Model validation | `aprender::metrics::ranking` |
| **Synthetic data** | Coverage improvement | +13% n-grams |
**Key insights:**
1. Shell commands are highly predictable (Markov property)
2. N-grams outperform neural nets for this domain (speed, size, accuracy)
3. `.apr` format provides type-safe, versioned persistence
4. Encryption enables sharing sensitive models securely
5. `include_bytes!()` enables self-contained deployment
6. **Ranking metrics** (Hit@K, MRR) are standard for language model evaluation
7. **Synthetic data** fills coverage gaps for commands you rarely use
## CLI Reference
```bash
# Training
aprender-shell train # Full retrain from history
aprender-shell update # Incremental update (fast)
# Evaluation
aprender-shell validate # Holdout evaluation with metrics
aprender-shell validate -n 4 # Test different n-gram sizes
aprender-shell stats # Model statistics
# Data Augmentation
aprender-shell augment # Generate synthetic data + retrain
aprender-shell augment -c 10000 # Custom synthetic count
# Inference
aprender-shell suggest "git " # Get completions
aprender-shell suggest "cargo t" # Prefix matching
# Export
aprender-shell export model.apr # Export to .apr format
```
## Next Steps
- [`aprender-shell` source code](https://github.com/paiml/aprender/tree/main/crates/aprender-shell)
- [Model Format Specification](../examples/model-format.md)
- Ranking Metrics API (see `aprender::metrics`)