rust-parallel 1.4.2

Fast command line app in rust/tokio to execute commands in parallel. Similar interface to GNU parallel or xargs.


Command-line utility to execute commands in parallel and aggregate their output.

Similar interface to GNU Parallel or xargs but implemented in rust and tokio.

  • Supports running commands read from stdin or input files, similar to xargs. For example: head -1000 /usr/share/dict/words | rust-parallel md5 -s
  • Supports the ::: syntax to run all combinations of argument groups, similar to GNU Parallel. For example: rust-parallel gzip -k ::: *.html

Prevents output interleaving and is very fast.

See the demos for example usage.



Usage:

$ rust-parallel --help
Execute commands in parallel

By Aaron Riekenberg <aaron.riekenberg@gmail.com>

https://github.com/aaronriekenberg/rust-parallel
https://crates.io/crates/rust-parallel

Usage: rust-parallel [OPTIONS] [COMMAND_AND_INITIAL_ARGUMENTS]...

Arguments:
  [COMMAND_AND_INITIAL_ARGUMENTS]...
          Optional command and initial arguments.

          If this contains 1 or more ::: delimiters the cartesian product of arguments from all groups are run.

Options:
  -d, --discard-output <DISCARD_OUTPUT>
          Discard output for commands

          Possible values:
          - stdout: Redirect stdout for commands to /dev/null
          - stderr: Redirect stderr for commands to /dev/null
          - all:    Redirect stdout and stderr for commands to /dev/null

  -i, --input-file <INPUT_FILE>
          Input file or - for stdin.  Defaults to stdin if no inputs are specified

  -j, --jobs <JOBS>
          Maximum number of commands to run in parallel, defauts to num cpus

          [default: 8]

  -0, --null-separator
          Use null separator for reading input files instead of newline

  -s, --shell
          Use shell mode for running commands.

          Each command line is passed to "<shell-path> -c" as a single argument.

      --channel-capacity <CHANNEL_CAPACITY>
          Input and output channel capacity, defaults to num cpus * 2

          [default: 16]

      --disable-path-cache
          Disable command path cache

      --shell-path <SHELL_PATH>
          Path to shell to use for shell mode

          [default: /bin/bash]

  -h, --help
          Print help (see a summary with '-h')

  -V, --version
          Print version
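
The ::: behavior described in the help text above (the cartesian product of all argument groups) can be illustrated with a small sketch. This is not the app's code: the Tech Stack section says the app uses itertools' multi_cartesian_product, while the version below uses only the standard library. The function names split_groups and cartesian_product are hypothetical, chosen for illustration.

```rust
/// Split arguments on the ":::" delimiter. The part before the first ":::" is
/// treated as the command and initial arguments; each later part is one
/// argument group. (Hypothetical helper, not the app's parser.)
fn split_groups(args: &[&str]) -> (Vec<String>, Vec<Vec<String>>) {
    let mut parts = args.split(|arg| *arg == ":::");
    let command: Vec<String> = parts
        .next()
        .unwrap_or_default()
        .iter()
        .map(|s| s.to_string())
        .collect();
    let groups: Vec<Vec<String>> = parts
        .map(|group| group.iter().map(|s| s.to_string()).collect())
        .collect();
    (command, groups)
}

/// Build every combination that takes one argument from each group, in order.
/// Stdlib-only stand-in for itertools' multi_cartesian_product.
fn cartesian_product(groups: &[Vec<String>]) -> Vec<Vec<String>> {
    // Start with a single empty combination and extend it group by group.
    let mut combos: Vec<Vec<String>> = vec![Vec::new()];
    for group in groups {
        let mut next = Vec::with_capacity(combos.len() * group.len());
        for combo in &combos {
            for arg in group {
                let mut extended = combo.clone();
                extended.push(arg.clone());
                next.push(extended);
            }
        }
        combos = next;
    }
    combos
}
```

Under these assumptions, the arguments echo ::: A B ::: 1 2 split into the command echo and two groups, and the product yields four argument lists: A 1, A 2, B 1, B 2. Each list would be appended to the command to form one command line.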

Installation:

Recommended:

  1. Download a pre-built release from GitHub Releases for Linux or macOS.
  2. Extract the executable and put it somewhere in your $PATH.

For manual installation/update:

  1. Install Rust
  2. Install the latest version of this app from crates.io:
     $ cargo install rust-parallel
  3. The same cargo install rust-parallel command will also update to the latest version after initial installation.

Demos:

See the wiki page for demos.

Benchmarks:

See the wiki page for benchmarks.

Features:

  • Use only safe rust.
    • main.rs contains #![forbid(unsafe_code)]
  • Prevent output interleaving.
  • Use only asynchronous operations supported by tokio, do not use any blocking operations. This includes writing to stdout and stderr.
  • Support an arbitrarily large number of input lines and avoid O(number of input lines) memory usage. In support of this:
    • tokio::sync::Semaphore is used carefully to limit the number of commands that run concurrently. Tasks are not spawned for all input lines at once, which keeps memory usage bounded.
  • Cache resolved command paths so expensive lookup in $PATH is not done for every command executed. This can be disabled with --disable-path-cache option.
  • Support running commands on local machine only, not on remote machines.
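
The bounded-concurrency idea from the list above can be sketched as follows. This is an illustration only, not the app's code: the app uses tokio::sync::Semaphore with async tasks, while this sketch swaps in OS threads and a hand-written counting semaphore built from std::sync::Mutex and Condvar so that it needs no external crates. The names Semaphore and run_limited are hypothetical. The key point is that a permit is acquired before a worker is spawned, so the input iterator is advanced only when a slot is free.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Minimal counting semaphore (hypothetical stand-in for tokio::sync::Semaphore).
struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    fn new(permits: usize) -> Self {
        Semaphore { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Block until a permit is free, then take it.
    fn acquire(&self) {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
    }

    /// Return a permit and wake one waiter.
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
    }
}

/// Run `job` once per input line with at most `max_jobs` running at a time.
/// Returns the highest number of jobs observed running concurrently.
/// Assumes `max_jobs >= 1` and that `job` does not panic.
fn run_limited<I, F>(lines: I, max_jobs: usize, job: F) -> usize
where
    I: Iterator<Item = String>,
    F: Fn(String) + Send + Sync + 'static,
{
    assert!(max_jobs > 0, "max_jobs must be at least 1");
    let semaphore = Arc::new(Semaphore::new(max_jobs));
    let job = Arc::new(job);
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));

    for line in lines {
        // Acquire BEFORE spawning: the next input line is not even read
        // until a slot is free, so pending work does not pile up in memory.
        semaphore.acquire();
        let (semaphore, job) = (semaphore.clone(), job.clone());
        let (running, peak) = (running.clone(), peak.clone());
        thread::spawn(move || {
            let now = running.fetch_add(1, Ordering::SeqCst) + 1;
            peak.fetch_max(now, Ordering::SeqCst);
            (*job)(line);
            running.fetch_sub(1, Ordering::SeqCst);
            semaphore.release();
        });
    }

    // Taking back every permit succeeds only after all workers have released theirs.
    for _ in 0..max_jobs {
        semaphore.acquire();
    }
    peak.load(Ordering::SeqCst)
}
```

Because no join handles are stored, memory use in this sketch depends on max_jobs rather than on the number of input lines.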

Tech Stack:

  • anyhow used for application error handling to propagate and format fatal errors.
  • clap command line argument parser.
  • itertools using multi_cartesian_product to process ::: command line inputs.
  • tokio asynchronous runtime for rust. From tokio this app uses:
    • async / await functions (aka coroutines)
    • Singleton CommandLineArgs instance using tokio::sync::OnceCell.
    • Asynchronous command execution using tokio::process::Command
    • tokio::sync::Semaphore used to limit number of commands that run concurrently.
    • tokio::sync::mpsc::channel used to receive inputs from input task, and to send command outputs to an output writer task. To await command completions, use the elegant property that when all Senders are dropped the channel is closed.
  • tracing structured debug and warning logs.
  • which used to resolve command paths for path cache.
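
The channel property mentioned in the list above (a channel closes once all Senders are dropped) also holds for the standard library's std::sync::mpsc, which makes a dependency-free sketch possible. This is an illustration under assumptions, not the app's code: it uses threads instead of tokio tasks, the function name collect_outputs is hypothetical, and the "command" is simulated with a formatted string rather than a real process.

```rust
use std::sync::mpsc;
use std::thread;

/// Start one worker per command and gather all outputs in a single place.
/// Returns the outputs sorted, since completion order is not deterministic.
fn collect_outputs(commands: Vec<String>) -> Vec<String> {
    let (sender, receiver) = mpsc::channel::<String>();

    for command in commands {
        // Each worker owns its own clone of the Sender.
        let sender = sender.clone();
        thread::spawn(move || {
            // Placeholder for actually running the command.
            let output = format!("done: {}", command);
            let _ = sender.send(output);
            // The worker's Sender clone is dropped when the closure returns.
        });
    }

    // Drop the original Sender. Without this the channel never closes
    // and the receive loop below would block forever.
    drop(sender);

    // The iterator ends once every Sender clone has been dropped, that is,
    // once every worker has finished. No explicit join or counter is needed.
    let mut outputs: Vec<String> = receiver.iter().collect();
    outputs.sort();
    outputs
}
```

Having a single receiver own all writes is also one way to keep the output of different commands from interleaving, since each message arrives as a whole unit.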