Crate typed_path

§Typed Path

Minimum supported rustc version: 1.58.1+

Provides typed variants of Path and PathBuf for Unix and Windows.

§Install

[dependencies]
typed-path = "0.11"

As of version 0.7, this library also supports no_std environments that depend on alloc. To build in this manner, remove the default std feature:

[dependencies]
typed-path = { version = "...", default-features = false }

§Why?

Some applications need to manipulate Windows or UNIX paths on different platforms, for a variety of reasons: constructing portable file formats, parsing files from other platforms, handling archive formats, working with certain network protocols, and so on.

– Josh Triplett

See the related issue for a discussion of this. The functionality actually exists within the standard library, but is not exposed!

This means that a path like C:\path\to\file.txt will be parsed differently by std::path::Path depending on which platform you are on!

use std::path::Path;

// On Windows, this prints out:
//
// * Prefix(PrefixComponent { raw: "C:", parsed: Disk(67) })
// * RootDir
// * Normal("path")
// * Normal("to")
// * Normal("file.txt")
//
// But on Unix, this prints out:
//
// * Normal("C:\\path\\to\\file.txt")
let path = Path::new(r"C:\path\to\file.txt");
for component in path.components() {
    println!("* {:?}", component);
}

§Usage

§Byte paths

The library provides a generic Path<T> and PathBuf<T> that use [u8] and Vec<u8> underneath instead of OsStr and OsString. An encoding generic type is provided to dictate how the underlying bytes are parsed in order to support consistent path functionality no matter what operating system you are compiling against!

use typed_path::WindowsPath;

// On all platforms, this prints out:
//
// * Prefix(PrefixComponent { raw: "C:", parsed: Disk(67) })
// * RootDir
// * Normal("path")
// * Normal("to")
// * Normal("file.txt")
//
let path = WindowsPath::new(r"C:\path\to\file.txt");
for component in path.components() {
    println!("* {:?}", component);
}
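
To illustrate the idea of an encoding carried as a type parameter, here is a minimal sketch that uses only the standard library. The names SeparatorRule, UnixRule, WindowsRule, and SketchPath are hypothetical and are not part of typed_path; the sketch only splits on separators and ignores prefixes and root directories, which the real crate does parse.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for an encoding: it only decides what counts as a separator.
trait SeparatorRule {
    fn is_separator(c: char) -> bool;
}

struct UnixRule;
struct WindowsRule;

impl SeparatorRule for UnixRule {
    fn is_separator(c: char) -> bool {
        c == '/'
    }
}

impl SeparatorRule for WindowsRule {
    fn is_separator(c: char) -> bool {
        c == '\\' || c == '/'
    }
}

/// A borrowed path whose parsing rules are chosen by the type parameter,
/// not by the platform the program is compiled for.
struct SketchPath<'a, R: SeparatorRule> {
    raw: &'a str,
    _rule: PhantomData<R>,
}

impl<'a, R: SeparatorRule> SketchPath<'a, R> {
    fn new(raw: &'a str) -> Self {
        Self { raw, _rule: PhantomData }
    }

    /// Splits the raw string on the separators chosen by `R`, skipping empty pieces.
    fn parts(&self) -> Vec<&'a str> {
        self.raw
            .split(|c: char| R::is_separator(c))
            .filter(|p| !p.is_empty())
            .collect()
    }
}
```

With these definitions, SketchPath::<WindowsRule>::new(r"C:\path\to\file.txt").parts() yields four pieces on every platform, while the same string under UnixRule stays a single piece.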

§UTF8-enforced paths

Alongside the byte paths, this library also supports UTF8-enforced paths through Utf8Path<T> and Utf8PathBuf<T>, which internally use str and String. An encoding generic type is provided to dictate how the underlying characters are parsed in order to support consistent path functionality no matter what operating system you are compiling against!

use typed_path::Utf8WindowsPath;

// On all platforms, this prints out:
//
// * Prefix(Utf8WindowsPrefixComponent { raw: "C:", parsed: Disk(67) })
// * RootDir
// * Normal("path")
// * Normal("to")
// * Normal("file.txt")
//
let path = Utf8WindowsPath::new(r"C:\path\to\file.txt");
for component in path.components() {
    println!("* {:?}", component);
}
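
The practical difference between the byte and UTF8-enforced families is whether the underlying data is guaranteed to be valid UTF-8. The following standard-library-only sketch shows that distinction; the helper name as_utf8_path is hypothetical and is not a typed_path API.

```rust
/// Returns the bytes as a `&str` when they are valid UTF-8, otherwise `None`.
/// A byte-based path can hold either input; a str-based path can only hold the first kind.
fn as_utf8_path(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}
```

Here as_utf8_path(b"/tmp/foo.txt") succeeds, while a path containing the byte 0xFF is rejected because that byte never appears in valid UTF-8.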

§Checking paths

When working with user-defined paths, an additional layer of defense is needed to guard against path traversal attacks and other risks.

To that end, you can use PathBuf::push_checked and Path::join_checked (and equivalents) to ensure that the paths being created do not alter pre-existing paths in unexpected ways.

use typed_path::{CheckedPathError, Path, PathBuf, UnixEncoding};

let path = Path::<UnixEncoding>::new("/etc");

// A valid path can be joined onto the existing one
assert_eq!(path.join_checked("passwd"), Ok(PathBuf::from("/etc/passwd")));

// An invalid path will result in an error
assert_eq!(
    path.join_checked("/sneaky/replacement"), 
    Err(CheckedPathError::UnexpectedRoot)
);

let mut path = PathBuf::<UnixEncoding>::from("/etc");

// Pushing a relative path whose parent directory references cannot be resolved
// within the pushed path is an error, as it is treated as a path traversal
// attack!
assert_eq!(
    path.push_checked(".."), 
    Err(CheckedPathError::PathTraversalAttack)
);
assert_eq!(path, PathBuf::from("/etc"));

// Pushing an absolute path will fail with an error
assert_eq!(
    path.push_checked("/sneaky/replacement"), 
    Err(CheckedPathError::UnexpectedRoot)
);
assert_eq!(path, PathBuf::from("/etc"));

// Pushing a relative path that is safe will succeed
assert!(path.push_checked("abc/../def").is_ok());
assert_eq!(path, PathBuf::from("/etc/abc/../def"));
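
The rules demonstrated above can be approximated with a small standard-library-only sketch that works on Unix-style strings. This is not the crate's implementation: SketchError and join_checked_sketch are hypothetical names, and the depth-counting approach is an assumption chosen to reproduce the behavior shown in the example.

```rust
#[derive(Debug, PartialEq)]
enum SketchError {
    /// The pushed path starts at the root and would replace the base.
    UnexpectedRoot,
    /// A `..` in the pushed path would climb above the base.
    PathTraversal,
}

/// Joins `rel` onto `base` only if `rel` is relative and never escapes `base`.
fn join_checked_sketch(base: &str, rel: &str) -> Result<String, SketchError> {
    if rel.starts_with('/') {
        return Err(SketchError::UnexpectedRoot);
    }

    // Track how deep we are inside the pushed path; going below zero means escaping.
    let mut depth: usize = 0;
    for part in rel.split('/') {
        match part {
            "" | "." => {}
            ".." => {
                depth = depth.checked_sub(1).ok_or(SketchError::PathTraversal)?;
            }
            _ => depth += 1,
        }
    }

    // The pushed path is appended as-is; it is validated, not normalized.
    let mut out = base.trim_end_matches('/').to_string();
    out.push('/');
    out.push_str(rel);
    Ok(out)
}
```

Note that "abc/../def" is accepted because its parent reference resolves inside the pushed path, whereas a bare ".." is rejected.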

§Converting between encodings

There may be times in which you need to convert between encodings such as when you want to load a native path and convert it into another format. In that case, you can use the with_encoding method (or specific variants like with_unix_encoding and with_windows_encoding) to convert a Path or Utf8Path into their respective PathBuf and Utf8PathBuf with an explicit encoding:

use typed_path::{Utf8Path, Utf8UnixEncoding, Utf8WindowsEncoding};

// Convert from Unix to Windows
let unix_path = Utf8Path::<Utf8UnixEncoding>::new("/tmp/foo.txt");
let windows_path = unix_path.with_encoding::<Utf8WindowsEncoding>();
assert_eq!(windows_path, Utf8Path::<Utf8WindowsEncoding>::new(r"\tmp\foo.txt"));

// Converting from Windows to Unix will drop any prefix
let windows_path = Utf8Path::<Utf8WindowsEncoding>::new(r"C:\tmp\foo.txt");
let unix_path = windows_path.with_encoding::<Utf8UnixEncoding>();
assert_eq!(unix_path, Utf8Path::<Utf8UnixEncoding>::new(r"/tmp/foo.txt"));

// Converting to itself should retain everything
let path = Utf8Path::<Utf8WindowsEncoding>::new(r"C:\tmp\foo.txt");
assert_eq!(
    path.with_encoding::<Utf8WindowsEncoding>(),
    Utf8Path::<Utf8WindowsEncoding>::new(r"C:\tmp\foo.txt"),
);
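
As a rough picture of what such a conversion involves, the sketch below performs the same two transformations on plain strings using only the standard library. The function names are hypothetical, and only drive-letter prefixes such as C: are handled; this is an illustration under those assumptions, not the crate's conversion logic.

```rust
/// Converts a Windows-style path string to a Unix-style one,
/// dropping a leading drive-letter prefix (other prefix kinds are not handled).
fn windows_to_unix_sketch(path: &str) -> String {
    let bytes = path.as_bytes();
    let has_drive = bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':';
    // Slicing at 2 is safe because both leading bytes are ASCII.
    let rest = if has_drive { &path[2..] } else { path };
    rest.replace('\\', "/")
}

/// Converts a Unix-style path string to a Windows-style one by swapping separators.
fn unix_to_windows_sketch(path: &str) -> String {
    path.replace('/', "\\")
}
```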

As with pushing and joining paths using checked variants, we can also ensure that paths created from changing encodings are still valid:

use typed_path::{CheckedPathError, Utf8Path, Utf8UnixEncoding, Utf8WindowsEncoding};

// Convert from Unix to Windows
let unix_path = Utf8Path::<Utf8UnixEncoding>::new("/tmp/foo.txt");
let windows_path = unix_path.with_encoding_checked::<Utf8WindowsEncoding>().unwrap();
assert_eq!(windows_path, Utf8Path::<Utf8WindowsEncoding>::new(r"\tmp\foo.txt"));

// Converting from Unix to Windows fails if the path contains characters that are valid on Unix but not on Windows
let unix_path = Utf8Path::<Utf8UnixEncoding>::new("/tmp/|invalid|/foo.txt");
assert_eq!(
    unix_path.with_encoding_checked::<Utf8WindowsEncoding>(),
    Err(CheckedPathError::InvalidFilename),
);
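
One way to picture the filename check is a scan of each component for characters that Windows reserves in file names. The sketch below uses only the standard library; the function name and the exact character set are assumptions for illustration (control characters are left out), and the crate's own validation rules may differ.

```rust
/// Characters that Windows does not allow in a file name (control characters omitted).
const WINDOWS_RESERVED: [char; 9] = ['<', '>', ':', '"', '/', '\\', '|', '?', '*'];

/// Returns the first component of a Unix-style path that would be an invalid
/// Windows file name, or `None` if every component passes this simple check.
fn first_invalid_windows_component(unix_path: &str) -> Option<&str> {
    unix_path
        .split('/')
        .filter(|component| !component.is_empty())
        .find(|component| component.chars().any(|ch| WINDOWS_RESERVED.contains(&ch)))
}
```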

§Typed Paths

In the above examples, we were using paths where the encoding (Unix or Windows) was known at compile time. There may be situations where we need runtime support to decide and switch between encodings. For that, this crate provides the TypedPath and TypedPathBuf enumerations (and their Utf8TypedPath and Utf8TypedPathBuf variations):

use typed_path::Utf8TypedPath;

// Derive the path by determining if it is Unix or Windows
let path = Utf8TypedPath::derive(r"C:\path\to\file.txt");
assert!(path.is_windows());

// Change the encoding to Unix
let path = path.with_unix_encoding();
assert_eq!(path, "/path/to/file.txt");
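
Deciding the path type at runtime needs some heuristic. The sketch below shows one plausible heuristic using only the standard library: treat a string as a Windows path if it starts with a drive letter followed by a colon, or with a backslash. GuessedType and guess_path_type are hypothetical names, and this rule is an assumption that may not match what derive actually does.

```rust
#[derive(Debug, PartialEq)]
enum GuessedType {
    Unix,
    Windows,
}

/// Guesses the path type from its first characters (assumed heuristic, see above).
fn guess_path_type(path: &str) -> GuessedType {
    let bytes = path.as_bytes();
    let has_drive = bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':';
    if has_drive || path.starts_with('\\') {
        GuessedType::Windows
    } else {
        GuessedType::Unix
    }
}
```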

§Normalization

Alongside implementing the standard methods associated with Path and PathBuf from the standard library, this crate also implements several additional methods including the ability to normalize a path by resolving . and .. without the need to have the path exist.

use typed_path::Utf8UnixPath;

assert_eq!(
    Utf8UnixPath::new("foo/bar//baz/./asdf/quux/..").normalize(),
    Utf8UnixPath::new("foo/bar/baz/asdf"),
);
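
Purely lexical normalization of this kind can be sketched with a stack of components, using only the standard library. The function name normalize_sketch is hypothetical, it handles Unix-style strings only, and dropping a .. that has nothing left to pop is this sketch's own choice and not a statement about the crate's behavior.

```rust
/// Resolves `.` and `..` in a Unix-style path string without touching the filesystem.
fn normalize_sketch(path: &str) -> String {
    let absolute = path.starts_with('/');
    let mut stack: Vec<&str> = Vec::new();

    for part in path.split('/') {
        match part {
            // Empty pieces come from repeated separators; `.` refers to the same directory.
            "" | "." => {}
            // `..` removes the previous component, if there is one.
            ".." => {
                stack.pop();
            }
            other => stack.push(other),
        }
    }

    let joined = stack.join("/");
    if absolute {
        format!("/{}", joined)
    } else {
        joined
    }
}
```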

In addition, you can leverage absolutize to convert a path to an absolute form by prepending the current working directory if the path is relative and then normalizing it (requires std feature):

use typed_path::{utils, Utf8UnixPath};

// With an absolute path, it is just normalized
// NOTE: This requires `std` feature, otherwise `absolutize` is missing!
let path = Utf8UnixPath::new("/a/b/../c/./d");
assert_eq!(path.absolutize().unwrap(), Utf8UnixPath::new("/a/c/d"));

// With a relative path, it is first joined with the current working directory
// and then normalized
// NOTE: This requires `std` feature, otherwise `utf8_current_dir` and
//       `absolutize` are missing!
let cwd = utils::utf8_current_dir().unwrap().with_unix_encoding();
let path = cwd.join(Utf8UnixPath::new("a/b/../c/./d"));
assert_eq!(path.absolutize().unwrap(), cwd.join(Utf8UnixPath::new("a/c/d")));

§Utility Functions

Helper functions are available in the utils module (requires std feature).

Today, there are three functions that mirror those found in std::env: current_dir, current_exe, and temp_dir.

Each has an implementation to produce a NativePathBuf and a Utf8NativePathBuf.

§Current directory

// Retrieves the current directory as a NativePathBuf:
//
// * For Unix family, this would be PathBuf<UnixEncoding>
// * For Windows family, this would be PathBuf<WindowsEncoding>
//
// NOTE: This requires `std` feature, otherwise `current_dir` is missing!
let _cwd = typed_path::utils::current_dir().unwrap();

// Retrieves the current directory as a Utf8NativePathBuf:
//
// * For Unix family, this would be Utf8PathBuf<Utf8UnixEncoding>
// * For Windows family, this would be Utf8PathBuf<Utf8WindowsEncoding>
//
// NOTE: This requires `std` feature, otherwise `utf8_current_dir` is missing!
let _utf8_cwd = typed_path::utils::utf8_current_dir().unwrap();

§Current exe

// Returns the full filesystem path of the current running executable as a NativePathBuf:
//
// * For Unix family, this would be PathBuf<UnixEncoding>
// * For Windows family, this would be PathBuf<WindowsEncoding>
//
// NOTE: This requires `std` feature, otherwise `current_exe` is missing!
let _exe = typed_path::utils::current_exe().unwrap();

// Returns the full filesystem path of the current running executable as a Utf8NativePathBuf:
//
// * For Unix family, this would be Utf8PathBuf<Utf8UnixEncoding>
// * For Windows family, this would be Utf8PathBuf<Utf8WindowsEncoding>
//
// NOTE: This requires `std` feature, otherwise `utf8_current_exe` is missing!
let _utf8_exe = typed_path::utils::utf8_current_exe().unwrap();

§Temporary directory

// Returns the path of a temporary directory as a NativePathBuf:
//
// * For Unix family, this would be PathBuf<UnixEncoding>
// * For Windows family, this would be PathBuf<WindowsEncoding>
//
// NOTE: This requires `std` feature, otherwise `temp_dir` is missing!
let _temp_dir = typed_path::utils::temp_dir().unwrap();

// Returns the path of a temporary directory as a Utf8NativePathBuf:
//
// * For Unix family, this would be Utf8PathBuf<Utf8UnixEncoding>
// * For Windows family, this would be Utf8PathBuf<Utf8WindowsEncoding>
//
// NOTE: This requires `std` feature, otherwise `utf8_temp_dir` is missing!
let _utf8_temp_dir = typed_path::utils::utf8_temp_dir().unwrap();
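
For comparison, the std::env functions that these helpers mirror can be called as sketched below. This uses only the standard library and returns platform-dependent std::path::PathBuf values; the wrapper name std_locations is hypothetical.

```rust
use std::env;
use std::path::PathBuf;

/// Collects the three std::env locations, turning lookup failures into `None`.
fn std_locations() -> Vec<(&'static str, Option<PathBuf>)> {
    vec![
        ("current_dir", env::current_dir().ok()),
        ("current_exe", env::current_exe().ok()),
        // std::env::temp_dir returns a PathBuf directly and does not report errors.
        ("temp_dir", Some(env::temp_dir())),
    ]
}
```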

§License

This project is licensed under either of

Apache License, Version 2.0 (LICENSE-APACHE or apache-license)
MIT license (LICENSE-MIT or mit-license)

at your option.

Modules§

constants
Contains constants associated with different path formats.
utils

Structs§

Ancestors
An iterator over Path and its ancestors.
Display
Helper struct for safely printing paths with format! and {}.
Iter
An iterator over the Components of a Path, as [u8] slices.
Path
A slice of a path (akin to str).
PathBuf
An owned, mutable path that mirrors std::path::PathBuf, but operates using an Encoding to determine how to parse the underlying bytes.
PlatformEncoding
Represents an abstraction of Encoding that represents the current platform encoding.
StripPrefixError
An error returned if the prefix was not found.
UnixComponents
UnixEncoding
Represents a Unix-specific Encoding
Utf8Ancestors
An iterator over Utf8Path and its ancestors.
Utf8Iter
An iterator over the Utf8Components of a Utf8Path, as str slices.
Utf8Path
A slice of a path (akin to str).
Utf8PathBuf
An owned, mutable path that mirrors std::path::PathBuf, but operates using a Utf8Encoding to determine how to parse the underlying str.
Utf8PlatformEncoding
Represents an abstraction of Utf8Encoding that represents the current platform encoding.
Utf8UnixComponents
Utf8UnixEncoding
Represents a Unix-specific Utf8Encoding
Utf8WindowsComponents
Represents a Windows-specific Utf8Components
Utf8WindowsEncoding
Represents a Windows-specific Utf8Encoding
Utf8WindowsPrefixComponent
A structure wrapping a Windows path prefix as well as its unparsed string representation. str version of std::path::PrefixComponent.
WindowsComponents
Represents a Windows-specific Components
WindowsEncoding
Represents a Windows-specific Encoding
WindowsPrefixComponent
A structure wrapping a Windows path prefix as well as its unparsed string representation. Byte slice version of std::path::PrefixComponent.

Enums§

CheckedPathError
An error returned when a path violates checked criteria.
PathType
Represents the type of the path.
TypedAncestors
An iterator over TypedPath and its ancestors.
TypedComponent
Byte slice version of std::path::Component that represents either a Unix or Windows path component.
TypedComponents
TypedIter
An iterator over the TypedComponents of a TypedPath, as [u8] slices.
TypedPath
Represents a path with a known type that can be either Unix or Windows.
TypedPathBuf
Represents a pathbuf with a known type that can be either Unix or Windows.
UnixComponent
Byte slice version of std::path::Component that represents a Unix-specific component
Utf8TypedAncestors
An iterator over Utf8TypedPath and its ancestors.
Utf8TypedComponent
Str slice version of std::path::Component that represents either a Unix or Windows path component.
Utf8TypedComponents
Utf8TypedIter
An iterator over the Utf8TypedComponents of a Utf8TypedPath, as str slices.
Utf8TypedPath
Represents a path with a known type that can be either Unix or Windows.
Utf8TypedPathBuf
Represents a pathbuf with a known type that can be either Unix or Windows.
Utf8UnixComponent
str slice version of std::path::Component that represents a Unix-specific component
Utf8WindowsComponent
str slice version of std::path::Component that represents a Windows-specific component
Utf8WindowsPrefix
Windows path prefixes, e.g., C: or \\server\share. This is a str slice version of std::path::Prefix.
WindowsComponent
Byte slice version of std::path::Component that represents a Windows-specific component
WindowsPrefix
Windows path prefixes, e.g., C: or \\server\share. This is a byte slice version of std::path::Prefix.

Traits§

Component
Interface representing a component in a Path
Components
Interface of an iterator over a collection of Components
Encoding
Interface to provide meaning to a byte slice such that paths can be derived
TryAsRef
Interface to try to perform a cheap reference-to-reference conversion.
Utf8Component
Interface representing a component in a Utf8Path
Utf8Components
Interface of an iterator over a collection of Utf8Components
Utf8Encoding
Interface to provide meaning to a str slice such that paths can be derived

Type Aliases§

NativeComponent
Component that is native to the platform during compilation
NativeEncoding
Encoding that is native to the platform during compilation
NativePath
Path that is native to the platform during compilation
NativePathBuf
PathBuf that is native to the platform during compilation
ParseError
PlatformPath
Path that has the platform’s encoding during compilation.
PlatformPathBuf
PathBuf that has the platform’s encoding during compilation.
UnixPath
Represents a Unix-specific Path
UnixPathBuf
Represents a Unix-specific PathBuf
Utf8NativeComponent
Utf8Component that is native to the platform during compilation
Utf8NativeEncoding
Utf8Encoding that is native to the platform during compilation
Utf8NativePath
Utf8Path that is native to the platform during compilation
Utf8NativePathBuf
Utf8PathBuf that is native to the platform during compilation
Utf8PlatformPath
Utf8Path that has the platform’s encoding during compilation.
Utf8PlatformPathBuf
Utf8PathBuf that has the platform’s encoding during compilation.
Utf8UnixPath
Represents a Unix-specific Utf8Path
Utf8UnixPathBuf
Represents a Unix-specific Utf8PathBuf
Utf8WindowsPath
Represents a Windows-specific Utf8Path
Utf8WindowsPathBuf
Represents a Windows-specific Utf8PathBuf
WindowsPath
Represents a Windows-specific Path
WindowsPathBuf
Represents a Windows-specific PathBuf