pub fn escape<T: Into<OsString>>(s: T) -> Vec<u8>

Escape a string of bytes into a new Vec<u8>.

This will return one of the following:

The string as-is, if no escaping is necessary.
An ANSI-C escaped string, like $'foo bar'.

NOTE: It is possible to encode NUL in this syntax as $'\x00', but Bash appears to then truncate the rest of the string after that point, likely because NUL is the C string terminator. This seems like a bug in Bash.
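Because of that truncation, a caller may want to detect NUL bytes before escaping. A minimal sketch of such a pre-check follows; `reject_nul` is a hypothetical helper for illustration and is not part of this crate.

```rust
/// Hypothetical pre-check, not part of this crate: returns the input
/// unchanged if it contains no NUL byte, or the index of the first NUL
/// so the caller can decide how to handle it.
fn reject_nul(bytes: &[u8]) -> Result<&[u8], usize> {
    match bytes.iter().position(|&b| b == 0) {
        Some(index) => Err(index),
        None => Ok(bytes),
    }
}
```

Whether to fail, strip the NUL, or pass it through anyway is left to the caller; the sketch only reports where the first NUL is.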

See escape_into for a variant that extends an existing Vec instead of allocating a new one.

Examples

assert_eq!(bash::escape("foobar"), b"foobar");
assert_eq!(bash::escape("foo bar"), b"$'foo bar'");

Notes

From bash(1):

Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as follows:

\a     alert (bell)
\b     backspace
\e     an escape character
\f     form feed
\n     new line
\r     carriage return
\t     horizontal tab
\v     vertical tab
\\     backslash
\'     single quote
\nnn   the eight-bit character whose value is the
       octal value nnn (one to three digits)
\xHH   the eight-bit character whose value is the
       hexadecimal value HH (one or two hex digits)
\cx    a control-x character
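To illustrate how the table above can drive byte-wise escaping, here is a rough sketch. It is not this crate's implementation: `ansi_c_quote` is a hypothetical name, and the set of bytes treated as safe (ASCII alphanumerics and underscore) is an assumption chosen to keep the example short.

```rust
/// Hypothetical helper, not this crate's implementation: quote `bytes`
/// with Bash's ANSI-C `$'...'` syntax, or return them unchanged when
/// every byte is in a conservative "safe" set.
fn ansi_c_quote(bytes: &[u8]) -> Vec<u8> {
    // Assumption: only ASCII alphanumerics and `_` need no quoting.
    let is_safe = |b: u8| b.is_ascii_alphanumeric() || b == b'_';
    if !bytes.is_empty() && bytes.iter().all(|&b| is_safe(b)) {
        return bytes.to_vec();
    }

    let mut out = Vec::with_capacity(bytes.len() + 3);
    out.extend_from_slice(b"$'");
    for &b in bytes {
        match b {
            // Named escapes from the bash(1) table.
            0x07 => out.extend_from_slice(b"\\a"),
            0x08 => out.extend_from_slice(b"\\b"),
            0x1b => out.extend_from_slice(b"\\e"),
            0x0c => out.extend_from_slice(b"\\f"),
            b'\n' => out.extend_from_slice(b"\\n"),
            b'\r' => out.extend_from_slice(b"\\r"),
            b'\t' => out.extend_from_slice(b"\\t"),
            0x0b => out.extend_from_slice(b"\\v"),
            b'\\' => out.extend_from_slice(b"\\\\"),
            b'\'' => out.extend_from_slice(b"\\'"),
            // Remaining printable ASCII is copied through literally.
            0x20..=0x7e => out.push(b),
            // Everything else becomes \xHH. Always emitting two hex
            // digits keeps a following literal hex digit from being
            // read as part of the escape.
            _ => out.extend_from_slice(format!("\\x{:02X}", b).as_bytes()),
        }
    }
    out.push(b'\'');
    out
}
```

With this sketch, `ansi_c_quote(b"foobar")` yields `foobar` and `ansi_c_quote(b"foo bar")` yields `$'foo bar'`, matching the examples above; a byte such as 0xFF comes out as `\xFF` inside the quotes.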

Bash also allows (maybe only in newer versions?) non-ASCII Unicode characters with \uHHHH and \UXXXXXXXX syntax, but we avoid this and work only with bytes. Part of the problem is that it’s not clear how Bash then works with these strings. Does it encode these characters into bytes according to the user’s current locale? Are strings in Bash now natively Unicode?

For now it’s up to the caller to figure out encoding. A significant use case for this code is to escape filenames into scripts, and on *nix variants I understand that filenames are essentially arrays of bytes, even if the OS adds some normalisation and case-insensitivity on top.

If you have some expertise in this area I would love to hear from you.

The argument passed into escape is Into<OsString>, so you can pass in regular Rust strings, PathBuf, and so on. For a regular Rust string it will be quoted byte for byte.
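The following sketch shows what "byte for byte" means for the different argument types on a Unix target. It uses only the standard library; `raw_bytes` is a hypothetical helper, and treating the conversion as `OsStringExt::into_vec` is an assumption about how such bytes can be obtained, not a statement about this crate's internals.

```rust
use std::ffi::OsString;
use std::os::unix::ffi::OsStringExt; // Unix-only: exposes the raw bytes
use std::path::PathBuf;

/// Hypothetical helper: accept anything that converts into an
/// `OsString` and return its underlying bytes unchanged.
fn raw_bytes<T: Into<OsString>>(s: T) -> Vec<u8> {
    s.into().into_vec()
}

/// Builds an `OsString` that is not valid UTF-8, as a Unix filename
/// may be; it still round-trips through `raw_bytes` untouched.
fn non_utf8_name() -> OsString {
    OsString::from_vec(vec![b'f', 0xFF])
}
```

A `&str`, a `String`, a `PathBuf` and an `OsString` all satisfy the bound. A Rust string such as "é" contributes its UTF-8 bytes (0xC3, 0xA9), while the non-UTF-8 name above contributes exactly the bytes it was built from.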