Crate pfmt

Overview

This library provides a flexible and powerful way to format data. At the heart of the library lie two traits: Fmt, for things that are formattable, and FormatTable, which maps placeholders in format strings to actual Fmts and supplies those with the information they need to produce output. Unlike with format! from the standard library, there is no restriction that format strings need to be static; in fact, the whole point of the library is to move as much control over the formatting process as possible into the format strings themselves (and ideally to move those into user-editable config files).

There are several impls of FormatTable, most notably for HashMaps (with either str or String keys and Borrow<dyn Fmt> values, which means a few type annotations are required to use them) and Vecs (also with Borrow<dyn Fmt> elements). The method on FormatTable that formats a string is format(&self, format_string: &str) -> Result<String, FormattingError>.

Format string syntax

A format string consists of literals and placeholders. They can be separated by colons, but this is not required. Just keep in mind that colons need to be escaped in literals.

A literal is any string not containing unescaped opening brackets “{” or colons “:”. Escaping is done in the usual fashion, with backslashes.

A placeholder has a more complex structure. Each placeholder is contained within a set of curly brackets, and starts with a name of the Fmt it requests formatting from. Name is a dot-separated list of segments denoting access to a (possibly nested) Fmt. There must be at least one segment, and empty segments are not allowed (FormattingErrors happen in both cases). All of the following are valid name-only placeholders:

  • "{name}"
  • "{a.path.to.some.nested.field}"
  • r"{escapes.\{are\}.allowed}"

Note that trailing and leading whitespace is stripped from each segment (which means that whitespace-only segments are not allowed as well as empty segments).
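The naming rules above can be illustrated with a small standalone sketch. It does not use pfmt and is not the library's parser; split_name is a hypothetical helper, and the choice to trim after unescaping is an assumption made for this example only.

```rust
/// Splits a placeholder name into segments on unescaped dots and trims
/// whitespace from each segment. Returns `None` for a dangling backslash
/// or for an empty (or whitespace-only) segment.
/// Hypothetical helper for illustration; not part of pfmt.
fn split_name(name: &str) -> Option<Vec<String>> {
    let mut segments = Vec::new();
    let mut current = String::new();
    let mut chars = name.chars();
    while let Some(c) = chars.next() {
        match c {
            // A backslash makes the following character literal.
            '\\' => current.push(chars.next()?),
            // An unescaped dot ends the current segment.
            '.' => segments.push(std::mem::take(&mut current)),
            _ => current.push(c),
        }
    }
    segments.push(current);

    let trimmed: Vec<String> = segments.iter().map(|s| s.trim().to_string()).collect();
    if trimmed.iter().any(|s| s.is_empty()) {
        None
    } else {
        Some(trimmed)
    }
}
```

Under these assumptions, " a . b " yields the segments "a" and "b", while "a..b" and the empty string are both rejected.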

The name may be followed by an arguments block. An arguments block is just a valid format string surrounded by a pair of curly brackets. Colons take on a special meaning in argument blocks: they separate individual arguments. While in a top-level format string the following two would behave identically, they are very different if used as arguments:

  • "{foo{baz}}"
  • "{foo:{baz}}"

If used as an argument block, the first string would form a single argument, concatenating a literal "foo" and the expansion of placeholder "baz". The second would form two arguments instead.
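To make the difference concrete, here is a rough standalone sketch of how the contents of an arguments block (without its surrounding brackets) could be split into arguments. It is not pfmt's code; split_arguments is a hypothetical name, and the sketch assumes that only colons outside any nested brackets act as separators.

```rust
/// Splits the contents of an arguments block into individual arguments.
/// A colon separates arguments only when it is neither escaped nor
/// nested inside brackets. Hypothetical helper; not part of pfmt.
fn split_arguments(block: &str) -> Vec<String> {
    let mut args = Vec::new();
    let mut current = String::new();
    let mut depth = 0usize;
    let mut chars = block.chars();
    while let Some(c) = chars.next() {
        match c {
            '\\' => {
                // Keep the escape so each argument is still a format string.
                current.push(c);
                if let Some(next) = chars.next() {
                    current.push(next);
                }
            }
            '{' => {
                depth += 1;
                current.push(c);
            }
            '}' => {
                depth = depth.saturating_sub(1);
                current.push(c);
            }
            // A top-level colon ends the current argument.
            ':' if depth == 0 => args.push(std::mem::take(&mut current)),
            _ => current.push(c),
        }
    }
    args.push(current);
    args
}
```

With this sketch, "foo{baz}" stays a single argument, while "foo:{baz}" is split into "foo" and "{baz}".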

Here are some examples of valid placeholders with arguments:

  • "{name{simple!}}"
  • "{name{two:arguments}}"
  • "{name{{a}{b}:{c}foobar}}"
  • "{nested{{also.can.have.arguments{arg}}}}"

There’s one limitation to the above: there is a rather arbitrary limit of 100 on the allowed nesting of placeholders inside placeholders in general and inside argument lists in particular, to avoid blowing up the stack by mistake or from malice.
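A depth guard of this kind can be sketched without recursion. The code below is only an illustration: it counts raw bracket depth, which may differ from how pfmt itself counts nesting, and both MAX_NESTING and check_nesting are hypothetical names.

```rust
/// Assumed limit for this sketch, mirroring the figure quoted above.
const MAX_NESTING: usize = 100;

/// Returns the deepest bracket nesting found in `format_string`, or
/// `None` if brackets are unbalanced or nest deeper than `MAX_NESTING`.
/// Hypothetical helper; not part of pfmt.
fn check_nesting(format_string: &str) -> Option<usize> {
    let mut depth = 0usize;
    let mut deepest = 0usize;
    let mut chars = format_string.chars();
    while let Some(c) = chars.next() {
        match c {
            // Skip whatever character is escaped.
            '\\' => {
                chars.next();
            }
            '{' => {
                depth += 1;
                if depth > MAX_NESTING {
                    return None;
                }
                deepest = deepest.max(depth);
            }
            // A closing bracket without a matching opener is an error.
            '}' => depth = depth.checked_sub(1)?,
            _ => {}
        }
    }
    if depth == 0 {
        Some(deepest)
    } else {
        None
    }
}
```

Checking the depth in a single pass like this means a hostile format string is rejected before any recursive parsing starts.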

After the arguments block (or after the name if there are no arguments) there may be a flags block. If the flags block follows the name, it has to be separated from it by a colon. If the flags block follows an argument list, the separating colon is optional. The flags block may be empty, or contain one or more single-character flags. Flags can be repeated; the exact meaning of the repetition depends on the Fmt the flags will be fed to. If the flags block is followed by options (see below), it has to be terminated with a colon; otherwise the terminating colon is optional.

Here are some examples of placeholders with flags:

  • "{name:a=}"
  • "{name{arg}asdf}"
  • "{name{arg}:asdf:}"

Finally, a placeholder can contain zero or more options after the flags block. Note that if the options are present, flags block must be present as well (but may be empty). Options are key-value pairs, with keys being simple strings with all leading and trailing whitespace stripped, while values can be any valid format strings (the same caveat about nesting as with arguments applies here as well). A key is separated from a value by an equals sign. Key-value pairs are separated by colons. Empty values are allowed, empty keys are not.

Here are some examples of placeholders with options:

  • "{name::opt=value}"
  • "{name::a=foo:b=baz}"
  • "{name::opt=foo{can.be.a.placeholder.as.well}}"

Different implementations of Fmt support different flags and options, see each entry to find out which. There is also a group of common options, described in a separate section below.
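The option rules described in this section can be sketched as a standalone function that takes only the options part of a placeholder (everything after the flags block). This is an illustration, not pfmt's parser: parse_options and split_top_level are hypothetical helpers, values are kept as unexpanded format strings, and splitting each pair at its first equals sign is an assumption.

```rust
use std::collections::HashMap;

/// Splits `s` on `sep` wherever `sep` is neither escaped nor nested
/// inside brackets. Hypothetical helper; not part of pfmt.
fn split_top_level(s: &str, sep: char) -> Vec<String> {
    let mut parts = Vec::new();
    let mut current = String::new();
    let mut depth = 0usize;
    let mut chars = s.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            // Copy the escape and the escaped character unchanged.
            current.push(c);
            if let Some(next) = chars.next() {
                current.push(next);
            }
        } else if c == sep && depth == 0 {
            parts.push(std::mem::take(&mut current));
        } else {
            if c == '{' {
                depth += 1;
            } else if c == '}' {
                depth = depth.saturating_sub(1);
            }
            current.push(c);
        }
    }
    parts.push(current);
    parts
}

/// Parses "key=value:key=value" into a map. Keys are trimmed; an empty
/// key or a pair without an equals sign yields `None`.
fn parse_options(options: &str) -> Option<HashMap<String, String>> {
    let mut map = HashMap::new();
    if options.is_empty() {
        return Some(map); // zero options is fine
    }
    for pair in split_top_level(options, ':') {
        let (key, value) = pair.split_once('=')?;
        let key = key.trim();
        if key.is_empty() {
            return None;
        }
        map.insert(key.to_string(), value.to_string());
    }
    Some(map)
}
```

For instance, "a=foo:b=baz" produces two entries, "a=" produces one entry with an empty value, and "=v" is rejected because its key is empty.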

Examples

Let’s start with something boring:

use std::collections::HashMap;
use pfmt::{Fmt, FormatTable};

let i = 2;
let j = 5;
let mut table: HashMap<&str, &dyn Fmt> = HashMap::new();
table.insert("i", &i);
table.insert("j", &j);
let s = table.format("i = {i}, j = {j}").unwrap();
assert_eq!(s, "i = 2, j = 5");

I can do that with format! too. This is a bit more fun, and shows both options and flags:

use std::collections::HashMap;
use pfmt::{Fmt, FormatTable};

let s = "a_really_long_string";
let i = 10;
let j = 12;
let mut table: HashMap<&str, &dyn Fmt> = HashMap::new();
table.insert("s", &s);
table.insert("i", &i);
table.insert("j", &j);
// (note escaped colons)
let s = table.format("hex\\: {i:px}, octal\\: {j:o}, fixed width\\: {s::truncate=r5}").unwrap();
assert_eq!(s, "hex: 0xa, octal: 14, fixed width: a_rea");

Can’t decide if you want your booleans as “true” and “false”, or “yes” and “no”? Easy:

use std::collections::HashMap;
use pfmt::{Fmt, FormatTable};

let a = true;
let b = false;
let mut table: HashMap<&str, &dyn Fmt> = HashMap::new();
table.insert("a", &a);
table.insert("b", &b);
let s = table.format("{a}, {b:y}, {b:Y}").unwrap();
assert_eq!(s, "true, no, N");

And here are Vecs as format tables:

use pfmt::{Fmt, FormatTable};
let i = 1;
let j = 2;
let table: Vec<&dyn Fmt> = vec![&i, &j];
let s = table.format("{0}, {1}, {0}").unwrap();
assert_eq!(s, "1, 2, 1");

All of the above examples used references as the element type of the format tables, but FormatTable is implemented (for hashmaps and vectors) for anything that is Borrow<dyn Fmt>, which means boxes, and reference counters and more. Tables thus can fully own the data:

use std::collections::HashMap;
use pfmt::{Fmt, FormatTable};

let mut table: HashMap<String, Box<dyn Fmt>> = HashMap::new();
table.insert("a".to_string(), Box::new(2) as Box<dyn Fmt>);
table.insert("b".to_string(), Box::new("foobar".to_string()) as Box<dyn Fmt>);
let s = table.format("{a}, {b}").unwrap();
assert_eq!(s, "2, foobar");

This is a bit on the verbose side, though.

The library also supports accessing elements of Fmts through the same dot-notation syntax Rust uses, provided the implementation of Fmt in question allows it:

use std::collections::HashMap;
use pfmt::{Fmt, FormatTable, SingleFmtError, util};
 
struct Point {
    x: i32,
    y: i32
}
 
impl Fmt for Point {
    fn format(
        &self,
        full_name: &[String],
        name: &[String],
        args: &[String],
        flags: &[char],
        options: &HashMap<String, String>,
    ) -> Result<String, SingleFmtError> {
        if name.is_empty() {
            Err(SingleFmtError::NamespaceOnlyFmt(util::join_name(full_name)))
        } else if name[0] == "x" {
            self.x.format(full_name, &name[1..], args, flags, options)
        } else if name[0] == "y" {
            self.y.format(full_name, &name[1..], args, flags, options)
        } else {
            Err(SingleFmtError::UnknownSubfmt(util::join_name(full_name)))
        }
    }
}
 
let p = Point { x: 1, y: 2 };
let mut table: HashMap<&str, &dyn Fmt> = HashMap::new();
table.insert("p", &p);
let s = table.format("{p.x}, {p.y}").unwrap();
assert_eq!(s, "1, 2");

This can be nested to arbitrary depth.

Errors

The format method on FormatTables returns a Result<String, FormattingError>. There are three primary kinds of errors: parsing errors, which occur when the format string is not well-formed; errors arising from the use of unknown options and flags, or of options with invalid values; and finally errors due to requesting Fmts that are missing from the table.

With hard-coded format strings and rigid format tables, most of these can be safely ignored, so unwrap() away.

Common options

Most pre-made implementations of Fmt honor several common options. Here’s a list of them, with detailed info available further in this section:

  • truncate
  • width

truncate: {'l', 'r'} + non-negative integer

Controls truncation of the field. If the value begins with l, the left part of the field that doesn’t fit is truncated; if it begins with r, the right part is removed instead. Note that "l0" is not actually forbidden, just very useless.

It is an InvalidOptionValue to pass anything not fitting into the template in the header as the value of this option.
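As a rough standalone illustration of the semantics described for truncate, the sketch below parses a value such as "r5" and applies it to a string. It is not the library's implementation: the truncate function here is hypothetical, it counts chars rather than display columns, and it signals an invalid value with None instead of InvalidOptionValue.

```rust
/// Applies a `truncate`-style value such as "l5" or "r5" to `field`.
/// Returns `None` when the value does not match the template
/// {'l', 'r'} + non-negative integer.
/// Hypothetical helper; not part of pfmt.
fn truncate(field: &str, value: &str) -> Option<String> {
    let mut chars = value.chars();
    let side = chars.next()?;
    if side != 'l' && side != 'r' {
        return None;
    }
    // Everything after the first character must be the length limit.
    let limit: usize = chars.as_str().parse().ok()?;
    let len = field.chars().count();
    if len <= limit {
        return Some(field.to_string());
    }
    Some(match side {
        // 'l': drop the left part, keep the last `limit` characters.
        'l' => field.chars().skip(len - limit).collect(),
        // 'r': drop the right part, keep the first `limit` characters.
        _ => field.chars().take(limit).collect(),
    })
}
```

Under this reading, "a_really_long_string" with "r5" becomes "a_rea", and with "l5" it becomes "tring".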

width: {'l', 'c', 'r'} + non-negative integer

Controls the width of the field. Has no effect if the field is already wider than the value supplied. If starts with “l”, the field will be left-justified. If starts with “c”, the field will be centered. If starts with “r”, the field will be right-justified.

It is an InvalidOptionValue to pass anything not fitting into the template in the header as the value for this option.
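The width behavior can be sketched in the same spirit. The code below is an assumption-laden illustration rather than pfmt's code: pad is a hypothetical helper, it pads with spaces, it counts chars rather than display columns, and when centering leaves an odd amount of padding it puts the extra space on the right.

```rust
/// Applies a `width`-style value such as "l8", "c8" or "r8" to `field`.
/// Returns `None` when the value does not match the template
/// {'l', 'c', 'r'} + non-negative integer.
/// Hypothetical helper; not part of pfmt.
fn pad(field: &str, value: &str) -> Option<String> {
    let mut chars = value.chars();
    let side = chars.next()?;
    let width: usize = chars.as_str().parse().ok()?;
    // Zero when the field is already at least `width` characters wide.
    let missing = width.saturating_sub(field.chars().count());
    let (left, right) = match side {
        'l' => (0, missing),                         // left-justified
        'r' => (missing, 0),                         // right-justified
        'c' => (missing / 2, missing - missing / 2), // centered
        _ => return None,
    };
    Some(format!("{}{}{}", " ".repeat(left), field, " ".repeat(right)))
}
```

With this sketch, "ab" with "r5" becomes "   ab", and a field that is already wider than the requested width is returned unchanged.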

Common numeric options

Most numeric Fmts honor these. For the detailed description skip to the end of this section.

  • prec
  • round

prec: integer

Controls the precision of the displayed number, with bigger values meaning more significant digits will be displayed. If negative, the number will be rounded; the rounding direction is controlled by the round option. Positive values are accepted by integer Fmts, but have no effect.

It is an InvalidOptionValue to pass a string that doesn’t parse as a signed integer as a value to this option.

round: {"up", "down", "nearest"}

Controls the direction of the rounding performed by the prec option, and has no effect without it. Defaults to nearest.

It is an InvalidOptionValue to pass a string different from the mentioned three to this option.
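One plausible reading of how a negative prec and the round option interact for integers is sketched below. Everything here is an assumption made for illustration: round_integer is a hypothetical function, a prec of -2 is taken to mean rounding to the nearest hundred, "up" and "down" are taken to mean toward positive and negative infinity, and "nearest" rounds halves upward. The library's actual behavior may differ on any of these points.

```rust
/// Rounds `n` to a multiple of 10^(-prec) when `prec` is negative.
/// Returns `None` for an unknown rounding mode or on overflow.
/// Hypothetical helper; not part of pfmt.
fn round_integer(n: i64, prec: i32, round: &str) -> Option<i64> {
    if !["up", "down", "nearest"].contains(&round) {
        return None;
    }
    if prec >= 0 {
        // Non-negative precision has no effect on integers.
        return Some(n);
    }
    let step = 10i64.checked_pow(prec.unsigned_abs())?;
    // Largest multiple of `step` that is <= n, also for negative n.
    let floor = n.div_euclid(step).checked_mul(step)?;
    let rem = n - floor; // always in 0..step
    match round {
        "down" => Some(floor),
        "up" if rem == 0 => Some(floor),
        "up" => floor.checked_add(step),
        // "nearest": halves go up in this sketch.
        _ if rem * 2 >= step => floor.checked_add(step),
        _ => Some(floor),
    }
}
```

Under these assumptions, 1234 with a prec of -2 gives 1200 for "down" and "nearest" and 1300 for "up".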

More fun

Format tables are pretty flexible in the type of format units they can return from the get_fmt method. They don’t even have to actually contain the Fmts; you can make your format tables produce format units on the fly. The drawback is that you’ll probably lose the ability to combine format tables via tuples (see below) if your Item is not &'a dyn Fmt. But variations on the following are possible (and possibly useful):

use pfmt::{Fmt, FormatTable};
 
struct Producer { }
 
impl<'a> FormatTable<'a> for Producer {
    type Item = i32;
 
    fn get_fmt(&'a self, name: &str) -> Option<Self::Item> {
        name.parse().ok()
    }
}
  
let table = Producer { };
let s = table.format("{1}, {12}").unwrap();
assert_eq!(s, "1, 12");

There’s also an implementation of FormatTable for tuples (up to 6-tuples) that contain format tables with the same Item type. It searches the format tables in order and uses the first Fmt successfully returned by get_fmt. This is particularly useful with the Mono format table from the extras module. It makes it easy to combine format tables or provide defaults or overrides without modifying the tables in question.

use std::collections::HashMap;
 
use pfmt::{Fmt, FormatTable};
use pfmt::extras::Mono;
 
let a = Mono("a", 5);
let b = Mono("b", "foo");
let t = {
    let mut res: HashMap<&str, Box<dyn Fmt>> = HashMap::new();
    res.insert("a", Box::new("not five"));
    res.insert("b", Box::new("not foo"));
    res
};
let s1 = (&a, &b, &t).format("{a}, {b}").expect("Failed to format");
let s2 = (&a, &t, &b).format("{a}, {b}").expect("Failed to format");
assert_eq!(s1, "5, foo");
assert_eq!(s2, "5, not foo");

Modules

extras: Non-essential niceties

Enums

FormattingError: Any error that can happen during formatting.
SingleFmtError: Errors that happen in individual Fmts. All of them contain the full path to the Fmt in error as the first field.

Traits

Fmt: A unit of formatting.
FormatTable: A collection or producer of format units.