Lexopt

Lexopt is an argument parser for Rust. It tries to have the simplest possible design that's still correct. It's so simple that it's a bit tedious to use.

Lexopt is:

Small: one file, no dependencies, no macros. Easy to audit or vendor.
Correct: standard conventions are supported and ambiguity is avoided. Tested and fuzzed.
Pedantic: arguments are returned as OsStrings, forcing you to convert them explicitly. This lets you handle badly-encoded filenames.
Imperative: options are returned as they are found; nothing is declared ahead of time.
Annoyingly minimalist: only the barest necessities are provided.
Unhelpful: there is no help generation and error messages often lack context.

Example

struct Args {
    thing: String,
    number: u32,
    shout: bool,
}

fn parse_args() -> Result<Args, lexopt::Error> {
    use lexopt::prelude::*;

    let mut thing = None;
    let mut number = 1;
    let mut shout = false;
    let mut parser = lexopt::Parser::from_env();
    while let Some(arg) = parser.next()? {
        match arg {
            Short('n') | Long("number") => {
                number = parser.value()?.parse()?;
            }
            Long("shout") => {
                shout = true;
            }
            Value(val) if thing.is_none() => {
                thing = Some(val.into_string()?);
            }
            Long("help") => {
                println!("Usage: hello [-n|--number=NUM] [--shout] THING");
                std::process::exit(0);
            }
            _ => return Err(arg.unexpected()),
        }
    }

    Ok(Args {
        thing: thing.ok_or("missing argument THING")?,
        number,
        shout,
    })
}

fn main() -> Result<(), lexopt::Error> {
    let args = parse_args()?;
    let mut message = format!("Hello {}", args.thing);
    if args.shout {
        message = message.to_uppercase();
    }
    for _ in 0..args.number {
        println!("{}", message);
    }
    Ok(())
}

Let's walk through this:

We start parsing with Parser::from_env().
We call parser.next() in a loop to get all the arguments until they run out.
We match on arguments. Short and Long indicate an option.
To get the value that belongs to an option (like 10 in -n 10) we call parser.value().
- This returns a standard OsString.
- For convenience, importing lexopt::prelude::* adds a .parse() method, analogous to str::parse.
Value indicates a free-standing argument. In this case, THING.
- if thing.is_none() is a useful pattern for positional arguments. If we already found thing, the guard fails and the argument is passed on to a later case.
- It also contains an OsString.
  - The standard .into_string() method can decode it into a plain String.
If we don't know what to do with an argument we use return Err(arg.unexpected()) to turn it into an error message.
Strings can be promoted to errors for custom error messages, as in thing.ok_or("missing argument THING")?.
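
As a rough illustration of the two conversions in the walkthrough, the following std-only sketch decodes an OsString with .into_string() and then applies str::parse. The helper parse_os is hypothetical and is not part of lexopt; the library's own .parse() may differ in its error type and other details.

```rust
use std::ffi::OsString;
use std::str::FromStr;

/// Hypothetical helper (not part of lexopt): decode an OsString, then parse it.
/// Errors are plain Strings here to keep the sketch short.
fn parse_os<T>(value: OsString) -> Result<T, String>
where
    T: FromStr,
    T::Err: std::fmt::Display,
{
    // into_string() fails on invalid Unicode and hands the original OsString back.
    let text = value
        .into_string()
        .map_err(|raw| format!("invalid unicode in {:?}", raw))?;
    // str::parse does the actual conversion to the target type.
    text.parse::<T>()
        .map_err(|err| format!("cannot parse {:?}: {}", text, err))
}

fn main() {
    let number: Result<u32, String> = parse_os(OsString::from("10"));
    println!("{:?}", number);

    let bad: Result<u32, String> = parse_os(OsString::from("ten"));
    println!("{:?}", bad);
}
```

Keeping the decoding step explicit is what makes it possible to treat a badly-encoded value as an error in one place and as a valid filename in another.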

This covers almost all the functionality in the library. Lexopt does very little for you.

For a larger example with useful patterns, see examples/cargo.rs.

Command line syntax

The following conventions are supported:

Short options (-q)
Long options (--verbose)
-- to mark the end of options
= to separate long options from values (--option=value)
Spaces to separate options from values (--option value, -f value)
Unseparated short options (-fvalue)
Combined short options (-abc to mean -a -b -c)

These are not supported:

-f=value for short options
Options with optional arguments (like GNU sed's -i, which can be used standalone or as -iSUFFIX)
Single-dash long options (like find's -name)
Abbreviated long options (GNU's getopt lets you write --num instead of --number if it can be expanded unambiguously)
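
To make the conventions above concrete, here is a minimal sketch of how a single argument might be classified by its surface syntax. It is not lexopt's implementation: Token and classify are hypothetical names, and the sketch only handles valid UTF-8. Note that it cannot tell -fvalue from -abc by looking at the text alone, so it returns the characters after the dash undivided and leaves that decision to the caller.

```rust
/// Hypothetical token type for this sketch; lexopt's own types differ.
#[derive(Debug, PartialEq)]
enum Token<'a> {
    /// `--name` or `--name=value`.
    Long { name: &'a str, value: Option<&'a str> },
    /// Everything after a single dash, e.g. "abc" for `-abc` or "fvalue" for `-fvalue`.
    /// Whether that means several flags or one flag plus a value is up to the caller.
    ShortCluster(&'a str),
    /// A bare `--`: everything after it is positional.
    EndOfOptions,
    /// A free-standing argument.
    Positional(&'a str),
}

fn classify(arg: &str) -> Token<'_> {
    if arg == "--" {
        Token::EndOfOptions
    } else if let Some(rest) = arg.strip_prefix("--") {
        // Split at the first `=` only, so values may themselves contain `=`.
        match rest.split_once('=') {
            Some((name, value)) => Token::Long { name, value: Some(value) },
            None => Token::Long { name: rest, value: None },
        }
    } else if arg.len() > 1 && arg.starts_with('-') {
        Token::ShortCluster(&arg[1..])
    } else {
        // Includes a lone "-", conventionally used for stdin/stdout.
        Token::Positional(arg)
    }
}

fn main() {
    for arg in ["-q", "--verbose", "--option=value", "-fvalue", "-abc", "--", "file.txt"] {
        println!("{:<16} {:?}", arg, classify(arg));
    }
}
```

A full parser would also stop treating arguments as options once EndOfOptions has been seen; the sketch leaves that state to the caller as well.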

Unicode

This library supports unicode while tolerating non-unicode arguments.

Short options may be unicode, but only a single codepoint. (If you need whole grapheme clusters you can use a long option. If you need normalization you're on your own, but it can be done.)

Options can be combined with non-unicode arguments. That is, --option=�� will not cause an error or mangle the value. This is surprisingly tricky to support: see os_str_bytes.

Option names themselves are patched, as if by String::from_utf8_lossy, when they're not valid unicode. That typically means you'll raise an error later when they're not recognized.
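
The two standard-library behaviours this section relies on can be checked in isolation. The sketch below uses only std; the helper names fits_short_option and patch_option are hypothetical and exist purely for illustration, and the byte string is a made-up example of a badly-encoded option.

```rust
/// Hypothetical check: a short option must be exactly one codepoint (one `char`).
fn fits_short_option(s: &str) -> bool {
    s.chars().count() == 1
}

/// Hypothetical helper: replace invalid UTF-8 sequences with U+FFFD,
/// which is what String::from_utf8_lossy does.
fn patch_option(raw: &[u8]) -> String {
    String::from_utf8_lossy(raw).into_owned()
}

fn main() {
    // "é" as one precomposed codepoint, and as "e" plus a combining accent.
    let precomposed = "\u{e9}";
    let decomposed = "e\u{301}";
    // They render alike, but only the first is a single codepoint.
    println!("{}", fits_short_option(precomposed));
    println!("{}", fits_short_option(decomposed));

    // The stray 0xFF byte is not valid UTF-8, so the patched name
    // no longer equals "--verbose" and would fall through to the error case.
    let patched = patch_option(b"--verbo\xffse");
    println!("{}", patched == "--verbose");
}
```

This is also why normalization is left to the application: two strings that look identical on screen can still differ codepoint by codepoint.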

Why?

For a particular application I was looking for a small parser that's pedantically correct. There are other compact argument parsing libraries, but I couldn't find one that handled OsStrings and implemented all the fiddly details of the argument syntax faithfully.

This library may also be useful if a lot of control is desired, like when the exact argument order matters or not all options are known ahead of time. It could be considered more of a lexer than a parser.

Why not?

This library may not be worth using if:

You don't care about non-unicode arguments
You don't care about exact compliance and correctness
You don't care about code size
You do care about great error messages
You hate boilerplate

                        null   lexopt   pico-args  clap      gumdrop  structopt  argh
Binary overhead         0KiB   14.5KiB  13.5KiB    372.8KiB  17.7KiB  371.2KiB   16.8KiB
Build time              0.9s   1.7s     1.6s       13.0s     7.5s     17.0s      7.5s
Number of dependencies  0      0        0          8         4        19         6
Tested version          -      0.1.0    0.4.2      2.33.3    0.8.0    0.3.22     0.1.4