shell-sanitize-rules 0.1.0

# shell-sanitize

Type-safe input sanitization for shell arguments and file paths.

**Reject, don't escape** — this crate rejects dangerous input with a clear
error instead of trying to transform it into something "safe".

## When to use this crate

**Prefer [`std::process::Command`] when possible.** Passing arguments
via `Command::new("git").arg(user_input)` bypasses the shell entirely,
so shell metacharacters in `user_input` are never interpreted. It does
not stop argument injection (see Known limitations).
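As a minimal sketch of why `Command` sidesteps the shell, the helper below builds (but never runs) a `git checkout` invocation and returns the argument vector the child process would receive. The function name `git_checkout_args` is made up for illustration; only `std::process::Command` is used.

```rust
use std::process::Command;

/// Builds `git checkout <branch>` without spawning it and returns the
/// arguments exactly as they would be handed to the child process.
/// (Hypothetical helper, for illustration only.)
fn git_checkout_args(branch: &str) -> Vec<String> {
    let mut cmd = Command::new("git");
    // Each `.arg()` call adds one argv entry; no shell ever parses it.
    cmd.arg("checkout").arg(branch);
    cmd.get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect()
}
```

A hostile value such as `"main; rm -rf /"` stays a single, literal argument: the `;` is never treated as a command separator because no shell is involved.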

This crate is for situations where **shell evaluation is unavoidable**:

| Scenario | Why you can't avoid the shell |
|----------|-------------------------------|
| SSH remote commands | Remote side evaluates through shell |
| `docker exec ctr sh -c "..."` | Container-side shell |
| CI/CD pipeline `run:` blocks | YAML → shell evaluation |
| AI agent tool execution | LLM output may reach a shell |
| Legacy `system()` / `popen()` | API forces shell involvement |

It is also valuable for **path validation** even without shell involvement:
blocking `../../etc/passwd` in upload paths, config file references, and
template includes.
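To make the idea of path validation concrete, here is a rough sketch of a reject-only traversal check written against the standard library. It is an assumption-laden stand-in, not this crate's implementation; the function name `is_confined_relative_path` is hypothetical.

```rust
use std::path::{Component, Path};

/// Returns true only for non-empty relative paths made of ordinary
/// segments. Any `..`, root, or prefix component causes rejection.
/// (Illustrative stand-in; NOT the crate's PathTraversal rule.)
fn is_confined_relative_path(input: &str) -> bool {
    if input.is_empty() {
        return false;
    }
    Path::new(input)
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}
```

The check rejects rather than normalizes: `uploads/../x` is refused even though it could be resolved to a harmless path, which mirrors the "reject, don't escape" stance described above.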

## Crates

| Crate | Description |
|-------|-------------|
| [`shell-sanitize`](https://crates.io/crates/shell-sanitize) | Core framework: `Rule` trait, `Sanitizer` builder, `Sanitized<T>` proof type |
| [`shell-sanitize-rules`](https://crates.io/crates/shell-sanitize-rules) | Built-in rules and ready-made presets |

## Quick start

```toml
[dependencies]
shell-sanitize-rules = "0.1"
```

```rust
use shell_sanitize_rules::presets;

// AI agent validates a file path argument
let s = presets::file_path();
assert!(s.sanitize("uploads/photo.jpg").is_ok());
assert!(s.sanitize("../../etc/passwd").is_err());

// Value interpolated into `sh -c "..."`
let s = presets::shell_command();
assert!(s.sanitize("my-branch").is_ok());
assert!(s.sanitize("branch; rm -rf /").is_err());
```

## Presets

| Preset | Target context | Rules |
|--------|---------------|-------|
| `command_arg()` | `Command::new().arg()` | ControlChar |
| `shell_command()` | `sh -c`, SSH, popen | ShellMeta + ControlChar + EnvExpansion + Glob |
| `file_path()` | Upload dest, include | PathTraversal + ControlChar |
| `file_path_absolute()` | Config file, absolute OK | PathTraversal(allow_abs) + ControlChar |
| `strict()` | SSH remote path ops, max protection | All 5 rules |

## Custom rules

Implement the `Rule` trait from `shell-sanitize` to create your own rules:

```rust
use shell_sanitize::{Rule, RuleResult, RuleViolation, Sanitizer, ShellArg};

struct MaxLengthRule(usize);

impl Rule for MaxLengthRule {
    fn name(&self) -> &'static str { "max_length" }

    fn check(&self, input: &str) -> RuleResult {
        if input.len() > self.0 {
            Err(vec![RuleViolation::new("max_length", "input too long")])
        } else {
            Ok(())
        }
    }
}

let sanitizer = Sanitizer::<ShellArg>::builder()
    .add_rule(MaxLengthRule(255))
    .build();
```

## Defense in depth for AI agents

```text
AI Agent Framework
┌────────────────────────────────────────────────┐
│                                                │
│   Path-based tools        Bash tool            │
│   (Read/Write/Glob)       (free-form)          │
│         │                      │               │
│         ▼                      ▼               │
│   ★ shell-sanitize ★     Sandbox/Container     │
│   file_path() preset     (OS-level isolation)  │
│   file_path_absolute()                         │
│                                                │
└────────────────────────────────────────────────┘
```

## Scope: argument validation, not command validation

This crate validates **individual arguments and paths** — it does not
parse or validate entire shell command strings. Sanitizing an entire
command string would break legitimate syntax (pipes, redirects, subshells).
Separate the **trusted command structure** from **untrusted data**, then
validate only the data.
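One way to picture this separation is a fixed command template with a single validated slot. The sketch below is standard-library only; `is_plain_token` is a hypothetical allowlist standing in for a real sanitizer preset, and the template (`/srv/app`, `deploy.log`) is invented for illustration.

```rust
/// Hypothetical allowlist check standing in for a sanitizer preset:
/// non-empty, no leading '-', and only a conservative character set.
fn is_plain_token(value: &str) -> bool {
    !value.is_empty()
        && !value.starts_with('-')
        && value
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.' | '/'))
}

/// The command structure (cd, &&, pipe) is trusted and fixed in code.
/// Only `branch` is untrusted, so only `branch` is validated.
fn remote_checkout_command(branch: &str) -> Result<String, String> {
    if !is_plain_token(branch) {
        return Err(format!("rejected untrusted value: {:?}", branch));
    }
    Ok(format!(
        "cd /srv/app && git checkout {} | tee deploy.log",
        branch
    ))
}
```

The `&&` and `|` in the template survive because they never pass through validation; running the whole string through a sanitizer would reject them along with any attack.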

## Known limitations

This crate does not protect against:

- **Free-form command strings**: use sandbox/container isolation
- **Argument injection** (`--upload-pack=evil`): use `--` separators or command-specific validation
- **URL-encoded bypasses** (`%2e%2e`): decode input before sanitizing
- **Semantic attacks**: a path like `safe/but/wrong/file.txt` passes all rules but may still be the wrong file
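The URL-encoding limitation can be illustrated with a small sketch: a literal check for `..` sees nothing wrong with `%2e%2e` until the input is decoded. Both helpers below are hypothetical and deliberately minimal (a production decoder from a maintained library is probably the better choice); neither is part of this crate.

```rust
/// Minimal percent-decoder for illustration. Returns None on a
/// truncated or non-hex escape, or if the result is not valid UTF-8.
fn percent_decode(input: &str) -> Option<String> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            // Expect exactly two hex digits after '%'.
            let hi = (*bytes.get(i + 1)? as char).to_digit(16)?;
            let lo = (*bytes.get(i + 2)? as char).to_digit(16)?;
            out.push((hi * 16 + lo) as u8);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    String::from_utf8(out).ok()
}

/// Naive literal check for a `..` segment (stand-in for a real rule).
fn has_parent_dir(path: &str) -> bool {
    path.split('/').any(|segment| segment == "..")
}
```

Run on the raw string `%2e%2e/%2e%2e/etc/passwd`, `has_parent_dir` finds nothing; run on the decoded string, it flags the traversal. Decoding therefore has to happen before any rule sees the input.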

## License

Licensed under either of

- [Apache License, Version 2.0](LICENSE-APACHE)
- [MIT License](LICENSE-MIT)

at your option.