Shell-quoting, à la Perl’s quotemeta function.
This crate currently provides a quotemeta function which shell-escapes a filename or similar
data, and its corresponding inverse, unquotemeta.
It is anticipated that the crate may expand to include fine-tuning of the escaping strategy, but for
now quotemeta returns a string which bash and zsh will expand back into the original string.
unquotemeta correctly round-trips the output of quotemeta, but does not support arbitrary
shell expansions.
At present, this crate is Unix-only. Windows filenames (and thus OsStr) have edge cases which
cannot be round-tripped in a way that is compatible with a Unix implementation.
Rationale and implementation details
This crate exists because I transferred a peculiar design trope from some of my Perl utility scripts to their Rust replacements, which is to emit a shell script to do a task rather than do it directly. This separation of responsibilities aligns with the Unix philosophy of composing small tools and gives a lot of interesting benefits: the script can be saved and executed later (or not at all) rather than piped into a shell, and not necessarily even on the same machine which generated it.
But you’re not here to listen to a sales pitch on Unix.
The problem with generating a shell script is that Unix filenames are an arbitrary sequence of
octets, many of them being shell metacharacters which need to be quoted before passing to a shell.
Perl’s quotemeta simply backslashes “all ASCII characters not matching /[A-Za-z_0-9]/” which works
most of the time, but often only by accident.
For example, a backslash-quoted newline is removed by POSIX-compliant shells. Non-ASCII characters
are not quoted at all, but this might still work provided the locale settings are just so and the
Perl string isn’t marked as UTF-8. Good luck!
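
To make the failure modes concrete, here is a minimal Rust sketch of the backslashing rule as quoted above. The function name perl_style_quotemeta is hypothetical and is not part of this crate; it mirrors only the rule as described in the text, not every detail of Perl's implementation.

```rust
/// Hypothetical helper (not part of this crate) that applies the rule quoted
/// above: put a backslash before every ASCII byte outside [A-Za-z_0-9] and
/// pass bytes with the high bit set through unchanged.
fn perl_style_quotemeta(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len() * 2);
    for &b in input {
        let is_word = b.is_ascii_alphanumeric() || b == b'_';
        if b.is_ascii() && !is_word {
            // This also covers b'\n'. A shell treats backslash-newline as a
            // line continuation and drops both bytes, losing the newline.
            out.push(b'\\');
        }
        out.push(b);
    }
    out
}
```

Feeding it b"a\nb" yields a backslash followed by a literal newline, which is exactly the sequence a POSIX shell deletes, and any high-bit-set octets come out untouched.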
Rust introduces a new gotcha because it is less laissez-faire with string types than Perl and
shells. One cannot println! text which is not valid UTF-8, and so filenames which are not valid
UTF-8 cannot be printed as-is. Write::write() makes it possible to write non-UTF-8 text to stdout,
but &[u8] lacks a lot of useful string-handling functions and has an unhelpful Debug
representation, which generally makes the code harder to write and less readable. So the quoted
form really needs to be valid UTF-8, and the path of least resistance is plain ASCII. Sometimes you
just want to println!("cat {}", quotemeta(path)) and have it work properly.
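
For contrast, this is roughly what the byte-oriented route looks like using only the standard library. It is a sketch, not code from this crate: write_raw_command is a hypothetical name, and it does no quoting at all, it only shows the raw bytes going through a writer.

```rust
use std::ffi::OsStr;
use std::io::{self, Write};
use std::os::unix::ffi::OsStrExt; // Unix-only: exposes the raw bytes of an OsStr

/// Hypothetical helper (not part of this crate): writes `cat <name>` using the
/// name's raw bytes. No quoting is done, so the output is only safe to hand to
/// a shell when the name contains no metacharacters.
fn write_raw_command<W: Write>(out: &mut W, name: &OsStr) -> io::Result<()> {
    out.write_all(b"cat ")?;
    out.write_all(name.as_bytes())?;
    out.write_all(b"\n")
}

/// A name can go through println! unchanged only if it is valid UTF-8.
fn printable_as_is(name: &OsStr) -> bool {
    name.to_str().is_some()
}
```

A Latin-1 name such as the bytes b"caf\xe9.txt" fails printable_as_is, so it has to take the write_raw_command route, and from then on every bit of string handling is done on byte slices.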
I’m specifically targeting bash, and the only way to shell-quote high-bit-set octets without
actually including them literally is using ANSI-C quoting. Pure POSIX shells do not understand
this, but zsh does as well, thus the two major religions are covered. fish doesn’t seem to allow
encoding high-bit-set octets at all, dash needs different syntax, and we don’t talk about csh in
polite company. There is sadly no one-size-fits-all solution. Welcome to Unix.
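
As an illustration of ANSI-C quoting, the sketch below wraps arbitrary bytes in a $'...' string made of plain ASCII. It is an assumption-laden example rather than this crate's actual strategy: ansi_c_quote is a hypothetical name, and it always quotes, even when the input needs no escaping.

```rust
/// Hypothetical sketch (not this crate's algorithm): quote arbitrary bytes as
/// a bash/zsh ANSI-C string of the form $'...'. The result is pure ASCII, so
/// it is always valid UTF-8 and can be passed to println!.
fn ansi_c_quote(input: &[u8]) -> String {
    let mut out = String::from("$'");
    for &b in input {
        match b {
            // The quote character and the escape character need a backslash.
            b'\'' => out.push_str("\\'"),
            b'\\' => out.push_str("\\\\"),
            // Common control characters have short mnemonic escapes.
            b'\n' => out.push_str("\\n"),
            b'\t' => out.push_str("\\t"),
            // Remaining printable ASCII is literal inside $'...'.
            0x20..=0x7e => out.push(b as char),
            // Everything else, including high-bit-set octets, becomes a
            // two-digit hex escape. NUL is not handled specially because a
            // Unix filename cannot contain it.
            _ => out.push_str(&format!("\\x{:02x}", b)),
        }
    }
    out.push('\'');
    out
}
```

With this sketch, the UTF-8 bytes of "café" come out as $'caf\xc3\xa9', which bash and zsh expand back to the original octets regardless of locale.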
Functions
- quotemeta: Shell-quotes the given OS string into a string.
- unquotemeta: Shell-unquotes a string into an OS string.