Crate os_str_bytes

This crate provides additional functionality for OsStr and OsString, without resorting to panics or corruption for invalid UTF-8. Thus, familiar methods from str and String can be used.

§Usage

The most important trait included is OsStrBytesExt, which provides methods analogous to those of str but for OsStr. These methods will never panic for invalid UTF-8 in a platform string, so they can be used to manipulate OsStr values as simply as str values.

Additionally, the following wrappers are provided. They are primarily legacy types from when this crate needed to perform more frequent encoding conversions. However, they may be useful for their trait implementations.

§User Input

Most methods in this crate should not be used to convert byte sequences that did not originate from OsStr or a related struct. The encoding used by this crate is an implementation detail, so it does not make sense to expose it to users.

For user input with an unknown encoding similar to UTF-8, use the following IO-safe methods, which avoid errors when writing to streams on Windows. These methods will not accept or return byte sequences that are invalid for input and output streams. Therefore, they can be used to convert between byte strings exposed to users and platform strings.

§Features

These features are optional and can be enabled or disabled in a “Cargo.toml” file.

§Default Features

§Optional Features

§Implementation

Some methods return Cow to account for platform differences. However, no guarantee is made that the same variant of that enum will always be returned for the same platform. Whichever can be constructed most efficiently will be returned.
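
As a rough illustration of why callers should not depend on which Cow variant comes back, the sketch below uses only the standard library's OsStr::to_string_lossy, which also returns a Cow. The helper names lossy_char_count and lossy_text are hypothetical and are not part of this crate.

```rust
use std::borrow::Cow;
use std::ffi::OsStr;

/// Hypothetical helper: counts characters in the lossy text form.
/// It reads through the `Cow` by dereferencing, so it behaves the same
/// for `Cow::Borrowed` and `Cow::Owned`.
fn lossy_char_count(os_str: &OsStr) -> usize {
    let text: Cow<'_, str> = os_str.to_string_lossy();
    text.chars().count()
}

/// Hypothetical helper: takes ownership only when it is actually needed.
/// `into_owned` clones borrowed data and moves owned data.
fn lossy_text(os_str: &OsStr) -> String {
    os_str.to_string_lossy().into_owned()
}
```

Neither helper matches on the variant, so it keeps working if a different variant is returned on another platform or in a later version.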

All traits are sealed, meaning that they can only be implemented by this crate. Otherwise, backward compatibility would be more difficult to maintain for new features.
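
For readers unfamiliar with sealing, the following is a minimal sketch of the usual Rust pattern, a public trait with a supertrait that other crates cannot name. It is not taken from this crate's source; the module sealed and the trait Utf8CheckExt are hypothetical names used only for illustration.

```rust
use std::ffi::OsStr;

mod sealed {
    use std::ffi::OsStr;

    /// Supertrait that lives in a private module. Other crates cannot
    /// name it, so they cannot implement any trait that requires it.
    pub trait Sealed {}

    impl Sealed for OsStr {}
}

/// Hypothetical public extension trait, sealed through its supertrait.
pub trait Utf8CheckExt: sealed::Sealed {
    /// Returns `true` if the platform string is valid UTF-8.
    fn is_utf8_text(&self) -> bool;
}

impl Utf8CheckExt for OsStr {
    fn is_utf8_text(&self) -> bool {
        self.to_str().is_some()
    }
}
```

Because only the defining crate can add implementations, new methods can later be added to such a trait without breaking downstream code.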

§Encoding Conversions

Methods provided by the “conversions” feature use an intentionally unspecified encoding. It may vary for different platforms, so defining it would run contrary to the goal of generic string handling. However, the following invariants will always be upheld:

  • The encoding will be compatible with UTF-8. In particular, splitting an encoded byte sequence by a UTF-8–encoded character always produces other valid byte sequences. They can be re-encoded without error using RawOsString::into_os_string and similar methods.

  • All characters valid in platform strings are representable. OsStr and OsString can always be losslessly reconstructed from extracted bytes.

Note that the chosen encoding may not match how OsStr stores these strings internally, which is undocumented. For instance, the result of calling OsStr::len will not necessarily match the number of bytes this crate uses to represent the same string. However, unlike the encoding used by OsStr, the encoding used by this crate can be validated safely using the following methods:

Concatenation may yield unexpected results without a UTF-8 separator. If two platform strings need to be concatenated, the only safe way to do so is using OsString::push. This limitation also makes it undesirable to use the bytes in interchange.
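
The recommended approach can be sketched with the standard library alone. The helper name concat_os is hypothetical; OsString::push is the only operation it relies on.

```rust
use std::ffi::{OsStr, OsString};

/// Hypothetical helper: joins two platform strings with `OsString::push`
/// instead of concatenating their encoded bytes.
fn concat_os(first: &OsStr, second: &OsStr) -> OsString {
    // `OsStr::len` is only used as a capacity hint here.
    let mut joined = OsString::with_capacity(first.len() + second.len());
    joined.push(first);
    joined.push(second);
    joined
}
```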

Since this encoding can change between versions and platforms, it should not be used for storage. The standard library provides implementations of OsStrExt and OsStringExt for various platforms, which should be preferred for that use case.
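
As a sketch of the storage approach suggested above, the following uses the Unix implementations of OsStrExt and OsStringExt from the standard library, so it only compiles on Unix targets. The helper names to_stored_bytes and from_stored_bytes are hypothetical; on Windows the corresponding extension traits work with 16-bit units instead of bytes.

```rust
use std::ffi::{OsStr, OsString};
use std::os::unix::ffi::{OsStrExt, OsStringExt};

/// Hypothetical helper (Unix only): copies the platform string's raw bytes
/// so that they can be written to storage.
fn to_stored_bytes(os_str: &OsStr) -> Vec<u8> {
    os_str.as_bytes().to_vec()
}

/// Hypothetical helper (Unix only): rebuilds the platform string from
/// previously stored bytes. Invalid UTF-8 is preserved as-is.
fn from_stored_bytes(bytes: Vec<u8>) -> OsString {
    OsString::from_vec(bytes)
}
```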

§Related Crates

  • print_bytes - Used to print byte and platform strings as losslessly as possible.

  • uniquote - Used to display paths using escapes instead of replacement characters.

§Examples

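use std::env;
use std::fs;
use std::io;

use os_str_bytes::OsStrBytesExt;

fn main() -> io::Result<()> {
    // Treat every argument that does not look like a flag as a file path.
    for file in env::args_os().skip(1) {
        if !file.starts_with('-') {
            let string = "Hello, world!";
            fs::write(&file, string)?;
            assert_eq!(string, fs::read_to_string(file)?);
        }
    }
    Ok(())
}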

Modules§

  • iter (feature: raw_os_str)
    Iterators provided by this crate.

Structs§

  • EncodingError (feature: checked_conversions)
    The error that occurs when a byte sequence is not representable in the platform encoding.
  • A container for platform strings containing no unicode characters.
  • A container providing additional functionality for OsStr.
  • A container for owned byte strings converted by this crate.

Traits§