Crate soft_canonicalize

§soft-canonicalize

Path canonicalization that works with non-existing paths.

Inspired by Python 3.6+ pathlib.Path.resolve(strict=False), this crate:

  • Matches std::fs::canonicalize exactly for fully-existing paths
  • Extends canonicalization to non-existing suffixes
  • Preserves robust behavior across Windows, macOS, and Linux
  • Provides a zero-dependency, security-focused implementation

§Quick Start

[dependencies]
soft-canonicalize = "0.3"

§Cross-Platform Example

use soft_canonicalize::soft_canonicalize;

// Existing path behaves like std::fs::canonicalize
let existing = soft_canonicalize(&std::env::temp_dir())?;

// Also works when suffixes don't exist yet
let non_existing = soft_canonicalize(
    std::env::temp_dir().join("some/deep/non/existing/path.txt")
)?;

§Windows Example (UNC/extended-length)

use soft_canonicalize::soft_canonicalize;
let p = r"C:\Users\user\documents\..\non\existing\config.json";
let result = soft_canonicalize(p)?;
assert!(result.to_string_lossy().starts_with(r"\\?\C:"));

§Anchored Canonicalization (Security-Focused)

For secure path handling within a known root directory:

use soft_canonicalize::{anchored_canonicalize, soft_canonicalize};
use std::fs;

// Set up an anchor directory
let root = std::env::temp_dir().join("workspace_root");
fs::create_dir_all(&root)?;
// No need to pre-canonicalize: anchored_canonicalize soft-canonicalizes the anchor internally
let anchor = &root;

// Canonicalize user input relative to anchor
let user_input = "../../../etc/passwd";
let resolved_path = anchored_canonicalize(anchor, user_input)?;
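
The idea of treating the anchor as a floor can be illustrated with the standard library alone. The following is a minimal lexical sketch, not the crate's implementation: `clamp_to_anchor` is a hypothetical helper, and it assumes that `..` components which would climb above the anchor are simply discarded and that absolute input is re-rooted under the anchor. It does not resolve symlinks or touch the filesystem.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper (not part of soft_canonicalize): joins `input` onto
/// `anchor` lexically, never letting `..` climb above the anchor.
fn clamp_to_anchor(anchor: &Path, input: &str) -> PathBuf {
    let mut out = anchor.to_path_buf();
    // Number of components currently pushed below the anchor.
    let mut depth = 0usize;
    for component in Path::new(input).components() {
        match component {
            Component::Normal(name) => {
                out.push(name);
                depth += 1;
            }
            Component::ParentDir => {
                // Only pop what was pushed below the anchor; extra `..` is dropped.
                if depth > 0 {
                    out.pop();
                    depth -= 1;
                }
            }
            // Roots, drive prefixes, and `.` are ignored, so absolute input
            // is interpreted relative to the anchor (an assumption of this sketch).
            Component::RootDir | Component::Prefix(_) | Component::CurDir => {}
        }
    }
    out
}
```

Under these assumptions, an input such as `../../../etc/passwd` stays inside the anchor and ends in `etc/passwd` below it. A real implementation also has to account for symlinks that point outside the anchor, which a purely lexical pass cannot see.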

§How It Works

  1. Input validation (empty path, platform pre-checks)
  2. Convert to absolute path (preserving drive/root semantics)
  3. Fast-path: try fs::canonicalize on the original absolute path
  4. Lexically normalize . and .. (streaming, no extra allocations)
  5. Fast-path: try fs::canonicalize on the normalized path when different
  6. Validate null bytes (platform-specific)
  7. Discover deepest existing prefix; resolve symlinks inline with cycle detection
  8. Optionally canonicalize the anchor (if symlinks seen) and rebuild
  9. Append non-existing suffix lexically, then normalize if needed
  10. Windows: ensure extended-length prefix for absolute paths
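
A few of the steps above (make absolute, normalize lexically, find the deepest existing prefix, append the remaining suffix) can be sketched with the standard library only. This is a simplified illustration, not the crate's code: `normalize_lexically` and `soft_canonicalize_sketch` are hypothetical names, any error from `fs::canonicalize` is treated as "this prefix does not exist", and the fast paths, cycle detection, NUL checks, and Windows-specific handling are all omitted.

```rust
use std::ffi::OsString;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Resolve `.` and `..` lexically, without touching the filesystem.
/// A `..` directly under the root or a drive prefix is dropped.
fn normalize_lexically(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                // Pop only ordinary components, never the root or prefix.
                if matches!(out.components().next_back(), Some(Component::Normal(_))) {
                    out.pop();
                }
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Hypothetical, simplified take on soft canonicalization.
fn soft_canonicalize_sketch(path: &Path) -> io::Result<PathBuf> {
    if path.as_os_str().is_empty() {
        return Err(io::Error::new(io::ErrorKind::NotFound, "empty path"));
    }
    // Make the path absolute relative to the current directory.
    let absolute = if path.is_absolute() {
        path.to_path_buf()
    } else {
        std::env::current_dir()?.join(path)
    };
    let mut prefix = normalize_lexically(&absolute);

    // Peel trailing components until the remaining prefix canonicalizes.
    let mut suffix: Vec<OsString> = Vec::new();
    loop {
        match std::fs::canonicalize(&prefix) {
            Ok(mut resolved) => {
                // Re-append the non-existing components in original order.
                for name in suffix.iter().rev() {
                    resolved.push(name);
                }
                return Ok(resolved);
            }
            Err(err) => {
                let name = match prefix.file_name() {
                    Some(name) => name.to_os_string(),
                    // Nothing left to peel (root failed): give up.
                    None => return Err(err),
                };
                suffix.push(name);
                prefix.pop();
            }
        }
    }
}
```

For a fully existing path the loop succeeds on the first iteration, so the result equals `std::fs::canonicalize`. For a path with a missing tail, the existing part is canonicalized and the tail is appended unchanged.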

§Security Considerations

  • Directory traversal (..) resolved lexically before filesystem access
  • Symlink chains resolved with cycle detection and depth limits
  • Windows NTFS ADS validation performed early and after normalization
  • Embedded NUL byte checks on all platforms
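
The depth-limit idea behind symlink chain handling can be shown with an in-memory model, which avoids creating real symlinks. This is only a sketch: the link table is a plain `HashMap`, `follow_links` is a hypothetical function, and `MAX_HOPS` is an arbitrary illustrative value rather than the crate's `MAX_SYMLINK_DEPTH`.

```rust
use std::collections::HashMap;

/// Illustrative limit only; not the value of the crate's MAX_SYMLINK_DEPTH.
const MAX_HOPS: usize = 40;

/// Follows `start` through a table of link -> target entries.
/// Returns the final non-link name, or an error once more than
/// MAX_HOPS links have been followed (which also catches cycles).
fn follow_links<'a>(
    links: &HashMap<&'a str, &'a str>,
    start: &'a str,
) -> Result<&'a str, String> {
    let mut current = start;
    let mut hops = 0usize;
    while let Some(&target) = links.get(current) {
        hops += 1;
        if hops > MAX_HOPS {
            return Err(format!("too many levels of symbolic links from {start}"));
        }
        current = target;
    }
    Ok(current)
}
```

A hop counter bounds the work even when the chain loops back on itself, so a cycle ends in an error instead of an infinite loop. Tracking visited entries would detect cycles earlier, at the cost of extra bookkeeping.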

§Cross-Platform Notes

  • Windows: returns extended-length verbatim paths for absolute results (\\?\C:\…, \\?\UNC\…)
  • Unix-like systems: standard absolute and relative path semantics
  • UNC floors and device namespaces are preserved and respected

§Testing

301 tests including:

  • std::fs::canonicalize compatibility tests (existing paths)
  • Path traversal and robustness tests
  • Python pathlib-inspired behavior checks
  • Platform-specific cases (Windows/macOS/Linux)
  • Symlink semantics and cycle detection
  • Windows-specific UNC, 8.3, and ADS validation
  • Anchored canonicalization tests (with anchored feature)

§Known Limitation (Windows 8.3)

On Windows, a short (8.3) name and its long form cannot be shown to be equivalent when the path does not exist, so the two forms may produce different results. For existing paths, both forms canonicalize to the same result.

use soft_canonicalize::soft_canonicalize;
let short_form = soft_canonicalize("C:/PROGRA~1/MyApp/config.json")?;
let long_form  = soft_canonicalize("C:/Program Files/MyApp/config.json")?;
assert_ne!(short_form, long_form); // for non-existing suffixes

Structs§

SoftCanonicalizeError
Error payload used by this crate to attach the offending path to I/O errors.

Constants§

MAX_SYMLINK_DEPTH
Maximum number of symlinks to follow before giving up. This matches the behavior of std::fs::canonicalize and OS limits:

Traits§

IoErrorPathExt
Extension to extract our path-aware payload from io::Error.
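
The general pattern of carrying a path inside an `io::Error` and recovering it through an extension trait can be sketched with the standard library. The names below (`PathPayload`, `PathPayloadExt`, `offending_path`, `error_with_path`) are hypothetical and are not the crate's actual types or method signatures; the sketch only shows how `io::Error::get_ref` and `downcast_ref` make such a payload retrievable.

```rust
use std::error::Error;
use std::fmt;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical payload type that remembers which path caused the error.
#[derive(Debug)]
struct PathPayload {
    path: PathBuf,
    detail: String,
}

impl fmt::Display for PathPayload {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.detail, self.path.display())
    }
}

impl Error for PathPayload {}

/// Hypothetical extension trait for reading the payload back out.
trait PathPayloadExt {
    fn offending_path(&self) -> Option<&Path>;
}

impl PathPayloadExt for io::Error {
    fn offending_path(&self) -> Option<&Path> {
        // get_ref exposes the boxed inner error, if there is one;
        // downcast_ref succeeds only when it is our payload type.
        self.get_ref()
            .and_then(|inner| inner.downcast_ref::<PathPayload>())
            .map(|payload| payload.path.as_path())
    }
}

/// Builds an io::Error that carries the offending path as its inner error.
fn error_with_path(kind: io::ErrorKind, path: &Path, detail: &str) -> io::Error {
    io::Error::new(
        kind,
        PathPayload {
            path: path.to_path_buf(),
            detail: detail.to_string(),
        },
    )
}
```

Callers keep working with plain `io::Error` values and the usual `ErrorKind`, while code that wants the path can ask for it and gets `None` for errors that did not originate from this pattern.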

Functions§

soft_canonicalize
Performs “soft” canonicalization on a path.