shell_sanitize/
lib.rs

//! Type-safe input sanitization for shell arguments and file paths.
//!
//! This crate provides the core framework: the [`Rule`] trait, the
//! [`Sanitizer`] builder, and the [`Sanitized<T>`] proof type that
//! can only be constructed by passing all rules.
//!
//! For built-in rules and ready-made presets, see the companion crate
//! [`shell_sanitize_rules`](https://docs.rs/shell-sanitize-rules).
//!
//! # When to use this crate
//!
//! **Prefer [`std::process::Command`] when possible.** Passing arguments
//! via `Command::new("git").arg(user_input)` bypasses the shell entirely
//! and is always the safest option.
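//!
//! As a minimal sketch of that advice (the `git log` invocation and the
//! helper name `git_log_for` are illustrative assumptions, not part of this
//! crate), each argument is handed to the child process as its own `argv`
//! entry, so shell metacharacters in it are never interpreted:
//!
//! ```rust
//! use std::process::Command;
//!
//! /// Hypothetical helper: builds `git log -- <user_input>` without a shell.
//! fn git_log_for(user_input: &str) -> Command {
//!     let mut cmd = Command::new("git");
//!     // `--` ends option parsing, so input starting with `-` is not read as a flag.
//!     cmd.arg("log").arg("--").arg(user_input);
//!     cmd
//! }
//!
//! fn main() {
//!     // The command is only built here, not spawned.
//!     let cmd = git_log_for("README.md; rm -rf /");
//!     // The whole string is still a single argument; nothing is split on `;`.
//!     let args: Vec<_> = cmd.get_args().collect();
//!     assert_eq!(args.len(), 3);
//! }
//! ```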
//!
//! This crate is for situations where **shell evaluation is unavoidable**:
//!
//! | Scenario | Why you can't avoid the shell |
//! |----------|------------------------------|
//! | SSH remote commands | Remote side evaluates through shell |
//! | `docker exec ctr sh -c "..."` | Container-side shell |
//! | CI/CD pipeline `run:` blocks | YAML → shell evaluation |
//! | AI agent tool execution | LLM output may reach a shell |
//! | Legacy `system()` / `popen()` | API forces shell involvement |
//!
//! It is also valuable for **path validation** even without shell
//! involvement: blocking `../../etc/passwd` in upload paths, config
//! file references, and template includes.
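//!
//! The kind of check involved can be sketched with the standard library
//! alone. The function name `is_confined_relative` is hypothetical, and this
//! is not how the companion crate's rules are implemented; it only
//! illustrates rejecting traversal by inspecting path components:
//!
//! ```rust
//! use std::path::{Component, Path};
//!
//! /// Hypothetical check: accept only relative paths made of plain names.
//! fn is_confined_relative(input: &str) -> bool {
//!     !input.is_empty()
//!         && Path::new(input).components().all(|c| match c {
//!             // Ordinary file or directory names are fine.
//!             Component::Normal(_) => true,
//!             // A leading `./` is harmless.
//!             Component::CurDir => true,
//!             // `..`, a root `/`, or a Windows prefix could escape the base directory.
//!             Component::ParentDir | Component::RootDir | Component::Prefix(_) => false,
//!         })
//! }
//!
//! fn main() {
//!     assert!(is_confined_relative("uploads/photo.jpg"));
//!     assert!(!is_confined_relative("../../etc/passwd"));
//!     assert!(!is_confined_relative("/etc/passwd"));
//! }
//! ```
//!
//! A lexical check like this does not resolve symlinks, so it complements
//! rather than replaces filesystem-level confinement.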
//!
//! # Design principle: reject, don't escape
//!
//! Escaping is fragile: it depends on the target shell, can be
//! double-applied, and makes legitimate commands non-functional.
//! This crate **rejects** dangerous input with a clear error instead
//! of trying to transform it into something "safe".
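//!
//! A minimal sketch of the reject-based approach follows. The names
//! `check_arg` and `Rejected` and the metacharacter list are assumptions
//! made for illustration; they are not this crate's API or rule set:
//!
//! ```rust
//! /// Hypothetical error describing the first offending character.
//! #[derive(Debug, PartialEq)]
//! struct Rejected {
//!     ch: char,
//!     index: usize,
//! }
//!
//! /// Returns the input unchanged if it is clean, or an error otherwise.
//! /// Nothing is escaped or rewritten.
//! fn check_arg(input: &str) -> Result<&str, Rejected> {
//!     // Illustrative, non-exhaustive set of shell metacharacters.
//!     const META: &[char] = &[';', '|', '&', '$', '`', '<', '>', '(', ')', '\n'];
//!     match input.char_indices().find(|(_, c)| META.contains(c)) {
//!         Some((index, ch)) => Err(Rejected { ch, index }),
//!         None => Ok(input),
//!     }
//! }
//!
//! fn main() {
//!     assert_eq!(check_arg("https://example.com"), Ok("https://example.com"));
//!     assert_eq!(check_arg("a; rm -rf /"), Err(Rejected { ch: ';', index: 1 }));
//! }
//! ```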
//!
//! # Scope: argument validation, not command validation
//!
//! This crate validates **individual arguments and paths**; it does not
//! parse or validate entire shell command strings.
//!
//! ```text
//! "git clone https://example.com"   ← full command: out of scope
//!                                       (use sandbox + command allowlist)
//!
//! "https://example.com"             ← individual argument: in scope
//!                                       (validate with shell_command preset)
//!
//! "uploads/photo.jpg"               ← file path: in scope
//!                                       (validate with file_path preset)
//! ```
//!
//! Sanitizing an entire command string would break legitimate syntax
//! (pipes, redirects, and subshells are valid command constructs). Instead,
//! separate the **trusted command structure** from **untrusted data**,
//! then validate only the data.
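//!
//! One way to picture that separation is the sketch below. It is not this
//! crate's API: `Checked`, `check`, and `build_script` are hypothetical
//! names, and the allowlist is deliberately strict for illustration. The
//! template is a fixed string written by the developer, and only values
//! that passed validation are placed into it:
//!
//! ```rust
//! /// Hypothetical proof type, intended to be created only by `check`.
//! /// In a real crate the field would be private to its own module.
//! struct Checked(String);
//!
//! /// Accepts only ASCII alphanumerics plus `_`, `-`, `.`, and `/`, and
//! /// rejects a leading `-` so the value cannot be read as an option.
//! /// This guards shell syntax only; it is not a path-traversal check.
//! fn check(input: &str) -> Option<Checked> {
//!     let ok = !input.is_empty()
//!         && !input.starts_with('-')
//!         && input
//!             .chars()
//!             .all(|c| c.is_ascii_alphanumeric() || matches!(c, '_' | '-' | '.' | '/'));
//!     ok.then(|| Checked(input.to_string()))
//! }
//!
//! /// Trusted structure (`cd <dir> && make <target>`) with validated data in the slots.
//! fn build_script(dir: &Checked, target: &Checked) -> String {
//!     format!("cd {} && make {}", dir.0, target.0)
//! }
//!
//! fn main() {
//!     let dir = check("repo/src").expect("valid directory");
//!     let target = check("all").expect("valid target");
//!     assert_eq!(build_script(&dir, &target), "cd repo/src && make all");
//!     // Data that tries to add command structure is rejected up front.
//!     assert!(check("all; rm -rf /").is_none());
//! }
//! ```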
//!
//! # AI agent threat model
//!
//! LLM output should be treated as **untrusted input**: indirect prompt
//! injection can manipulate what the AI produces. However, in practice,
//! AI agents generate commands at different levels of structure:
//!
//! | Pattern | Example | shell-sanitize applicability |
//! |---------|---------|------------------------------|
//! | **Structured tool call** | `{ tool: "read", path: "src/lib.rs" }` | **High**: validate `path` with `file_path` |
//! | **Single command + args** | `Bash("git diff HEAD~3")` | **Medium**: tokenize first, then validate each arg |
//! | **Free-form command string** | `Bash("cd repo && make && ./run")` | **Out of scope**: use sandbox/container |
//!
//! ## Where this crate fits in defense-in-depth
//!
//! ```text
//! AI Agent Framework
//! ┌────────────────────────────────────────────────┐
//! │                                                │
//! │   Path-based tools        Bash tool            │
//! │   (Read/Write/Glob)       (free-form)          │
//! │         │                      │               │
//! │         ▼                      ▼               │
//! │   ★ shell-sanitize ★     Sandbox/Container     │
//! │   file_path() preset     (OS-level isolation)  │
//! │   file_path_absolute()                         │
//! │                                                │
//! └────────────────────────────────────────────────┘
//! ```
//!
//! - **Path arguments** (Read, Write, Include) → this crate's primary value
//! - **Structured tool call arguments** → effective with appropriate preset
//! - **Trusted template + sanitized slots** → `sh -c "cd {safe} && make {safe}"`
//! - **Free-form bash strings** → out of scope; rely on sandbox, container
//!   isolation, and command allowlists

mod error;
mod marker;
mod rule;
mod sanitized;
mod sanitizer;

pub use error::{RuleViolation, SanitizeError};
pub use marker::{FilePath, MarkerType, ShellArg};
pub use rule::{Rule, RuleResult};
pub use sanitized::Sanitized;
pub use sanitizer::Sanitizer;