
Quote strings for use with Bash, the GNU Bourne-Again Shell.

⚠️ Warning

It is possible to encode NUL in a Bash string, but Bash appears to truncate the string at that point, likely because NUL is the C string terminator. This appears to be a bug in Bash, or at least a serious limitation.

If you’re quoting UTF-8 content this may not be a problem, since only one code point (the null character itself, U+0000) ever produces a NUL byte. To avoid the problem entirely, consider using Modified UTF-8, in which the NUL byte can never appear in a valid byte stream.
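As a rough illustration of the Modified UTF-8 idea, the sketch below rewrites each NUL byte as the two-byte sequence 0xC0 0x80, which is how Modified UTF-8 represents U+0000. The function name `nul_to_modified_utf8` is hypothetical and not part of this crate, and the sketch assumes the rest of the input is already valid UTF-8.

```rust
/// Hypothetical helper (not part of this crate): replace every NUL byte
/// with 0xC0 0x80, the Modified UTF-8 encoding of U+0000.
/// Assumes the remaining bytes are already valid UTF-8.
fn nul_to_modified_utf8(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len());
    for &byte in input {
        if byte == 0 {
            // Overlong two-byte form; contains no NUL byte.
            out.extend_from_slice(&[0xC0, 0x80]);
        } else {
            out.push(byte);
        }
    }
    out
}
```

Whatever consumes the resulting bytes has to understand Modified UTF-8 too, since 0xC0 0x80 is not valid in standard UTF-8.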

Notes

From bash(1):

Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as follows:

\a     alert (bell)
\b     backspace
\e     an escape character
\f     form feed
\n     new line
\r     carriage return
\t     horizontal tab
\v     vertical tab
\\     backslash
\'     single quote
\nnn   the eight-bit character whose value is the
       octal value nnn (one to three digits)
\xHH   the eight-bit character whose value is the
       hexadecimal value HH (one or two hex digits)
\cx    a control-x character
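To make the table concrete, here is a minimal sketch of how bytes could be written out in $'…' form using only the escapes listed above. It is an illustration under stated assumptions, not this crate's actual implementation: the name `ansi_c_quote` is hypothetical, and the choice to emit every other non-printable or non-ASCII byte as a two-digit \xHH escape is an assumption made for simplicity.

```rust
/// Hypothetical helper (not this crate's implementation): wrap `input`
/// in Bash ANSI-C quoting, $'...', using the escapes from bash(1).
fn ansi_c_quote(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len() + 3);
    out.extend_from_slice(b"$'");
    for &byte in input {
        match byte {
            0x07 => out.extend_from_slice(b"\\a"),
            0x08 => out.extend_from_slice(b"\\b"),
            0x1B => out.extend_from_slice(b"\\e"),
            0x0C => out.extend_from_slice(b"\\f"),
            b'\n' => out.extend_from_slice(b"\\n"),
            b'\r' => out.extend_from_slice(b"\\r"),
            b'\t' => out.extend_from_slice(b"\\t"),
            0x0B => out.extend_from_slice(b"\\v"),
            b'\\' => out.extend_from_slice(b"\\\\"),
            b'\'' => out.extend_from_slice(b"\\'"),
            // Remaining printable ASCII is copied through unchanged.
            0x20..=0x7E => out.push(byte),
            // Everything else becomes \xHH; always two digits, so a
            // following hex-digit character cannot be absorbed.
            _ => out.extend_from_slice(format!("\\x{:02X}", byte).as_bytes()),
        }
    }
    out.push(b'\'');
    out
}
```

For example, the input bytes `a`, TAB, `b`, `'`, `c` would come out as `$'a\tb\'c'`. A NUL byte would be emitted as `\x00`, which is subject to the truncation described in the warning above.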

Newer versions of Bash also accept \uHHHH and \UXXXXXXXX escapes for non-ASCII Unicode characters inside these ANSI C quoted strings, but we avoid them and work only with bytes. Part of the problem is that it’s not clear how Bash then handles these strings. Does it encode these characters into bytes according to the user’s current locale? Are strings in Bash now natively Unicode?

For now it’s up to the caller to figure out encoding. A significant use case for this code is to escape filenames into scripts, and on *nix variants I understand that filenames are essentially arrays of bytes, even if the OS adds some normalisation and case-insensitivity on top.

If you have some expertise in this area I would love to hear from you.
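The byte-oriented view of filenames mentioned above can be sketched with the standard library on Unix-like targets, where `std::os::unix::ffi::OsStrExt` exposes the raw bytes of a path without any encoding step. The function name `path_bytes` is hypothetical, and the example only compiles on Unix-like platforms.

```rust
use std::os::unix::ffi::OsStrExt; // Unix-only extension trait
use std::path::Path;

/// Hypothetical helper: borrow the raw bytes of a path, with no
/// assumption that they form valid UTF-8.
fn path_bytes(path: &Path) -> &[u8] {
    path.as_os_str().as_bytes()
}
```

The resulting slice could then be handed to a byte-level escaping routine, leaving any decision about text encoding to the caller.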

Functions

Escape a string of bytes into a new Vec<u8>.

Escape a string of bytes into an existing Vec<u8>.

Escape a string of bytes into a new OsString.