ming 0.1.0

Minimalist pedantic command line parser
Documentation
  • Coverage
  • 100%
    29 out of 29 items documented1 out of 13 items with examples
  • Size
  • Source code size: 66.14 kB This is the summed size of all the files inside the crates.io package for this release.
  • Documentation size: 2.78 MB This is the summed size of all files generated by rustdoc for all configured targets
  • Links
  • blyxxyz/lexopt
    370 12 4
  • crates.io
  • Dependencies
  • Versions
  • Owners
  • ldm0

Lexopt

Crates.io API reference MSRV CI

Lexopt is an argument parser for Rust. It tries to have the simplest possible design that's still correct. It's so simple that it's a bit tedious to use.

Lexopt is:

  • Small: one file, no dependencies, no macros. Easy to audit or vendor.
  • Correct: standard conventions are supported and ambiguity is avoided. Tested and fuzzed.
  • Pedantic: arguments are returned as OsStrings, forcing you to convert them explicitly. This lets you handle badly-encoded filenames.
  • Imperative: options are returned as they are found, nothing is declared ahead of time.
  • Annoyingly minimalist: only the barest necessities are provided.
  • Unhelpful: there is no help generation and error messages often lack context.

Example

struct Args {
    thing: String,
    number: u32,
    shout: bool,
}

fn parse_args() -> Result<Args, lexopt::Error> {
    use lexopt::prelude::*;

    let mut thing = None;
    let mut number = 1;
    let mut shout = false;
    let mut parser = lexopt::Parser::from_env();
    while let Some(arg) = parser.next()? {
        match arg {
            Short('n') | Long("number") => {
                number = parser.value()?.parse()?;
            }
            Long("shout") => {
                shout = true;
            }
            Value(val) if thing.is_none() => {
                thing = Some(val.into_string()?);
            }
            Long("help") => {
                println!("Usage: hello [-n|--number=NUM] [--shout] THING");
                std::process::exit(0);
            }
            _ => return Err(arg.unexpected()),
        }
    }

    Ok(Args {
        thing: thing.ok_or("missing argument THING")?,
        number,
        shout,
    })
}

fn main() -> Result<(), lexopt::Error> {
    let args = parse_args()?;
    let mut message = format!("Hello {}", args.thing);
    if args.shout {
        message = message.to_uppercase();
    }
    for _ in 0..args.number {
        println!("{}", message);
    }
    Ok(())
}

Let's walk through this:

  • We start parsing with Parser::from_env().
  • We call parser.next() in a loop to get all the arguments until they run out.
  • We match on arguments. Short and Long indicate an option.
  • To get the value that belongs to an option (like 10 in -n 10) we call parser.value().
    • This returns a standard OsString.
    • For convenience, use lexopt::prelude::* adds a .parse() method, analogous to str::parse.
  • Value indicates a free-standing argument. In this case, a filename.
    • if thing.is_none() is a useful pattern for positional arguments. If we already found thing we pass it on to another case.
    • It also contains an OsString.
      • The standard .into_string() method can decode it into a plain String.
  • If we don't know what to do with an argument we use return Err(arg.unexpected()) to turn it into an error message.
  • Strings can be promoted to errors for custom error messages.

This covers almost all the functionality in the library. Lexopt does very little for you.

For a larger example with useful patterns, see examples/cargo.rs.

Command line syntax

The following conventions are supported:

  • Short options (-q)
  • Long options (--verbose)
  • -- to mark the end of options
  • = to separate long options from values (--option=value)
  • Spaces to separate options from values (--option value, -f value)
  • Unseparated short options (-fvalue)
  • Combined short options (-abc to mean -a -b -c)

These are not supported:

  • -f=value for short options
  • Options with optional arguments (like GNU sed's -i, which can be used standalone or as -iSUFFIX)
  • Single-dash long options (like find's -name)
  • Abbreviated long options (GNU's getopt lets you write --num instead of --number if it can be expanded unambiguously)

Unicode

This library supports unicode while tolerating non-unicode arguments.

Short options may be unicode, but only a single codepoint. (If you need whole grapheme clusters you can use a long option. If you need normalization you're on your own, but it can be done.)

Options can be combined with non-unicode arguments. That is, --option=��� will not cause an error or mangle the value. This is surprisingly tricky to support: see os_str_bytes.

Options themselves are patched as by String::from_utf8_lossy if they're not valid unicode. That typically means you'll raise an error later when they're not recognized.

Why?

For a particular application I was looking for a small parser that's pedantically correct. There are other compact argument parsing libraries, but I couldn't find one that handled OsStrings and implemented all the fiddly details of the argument syntax faithfully.

This library may also be useful if a lot of control is desired, like when the exact argument order matters or not all options are known ahead of time. It could be considered more of a lexer than a parser.

Why not?

This library may not be worth using if:

  • You don't care about non-unicode arguments
  • You don't care about exact compliance and correctness
  • You don't care about code size
  • You do care about great error messages
  • You hate boilerplate

See also

  • clap/structopt: very fully-featured. The only other argument parser for Rust I know of that truly handles invalid unicode properly, if used right. Large.
  • argh and gumdrop: much leaner, yet still convenient and powerful enough for most purposes. Panic on invalid unicode.
    • argh adheres to the Fuchsia specification and therefore does not support --option=value and -ovalue, only --option value and -o value.
  • pico-args: slightly smaller than lexopt and easier to use (but less rigorous).
  • ap: I have not used this, but it seems to support iterative parsing while being less bare-bones than lexopt.
  • libc's getopt.

pico-args has a nifty table with build times and code sizes for different parsers. I've rerun the tests and added lexopt (using the program in examples/pico_test_app.rs):

null lexopt pico-args clap gumdrop structopt argh
Binary overhead 0KiB 14.5KiB 13.5KiB 372.8KiB 17.7KiB 371.2KiB 16.8KiB
Build time 0.9s 1.7s 1.6s 13.0s 7.5s 17.0s 7.5s
Number of dependencies 0 0 0 8 4 19 6
Tested version - 0.1.0 0.4.2 2.33.3 0.8.0 0.3.22 0.1.4

(Tests were run on x86_64 Linux with Rust 1.53 and cargo-bloat 0.10.1.)