Crate pretty_expressive


This crate is a Rust port of the pretty-expressive printer from Racket. It is an efficient pretty printer with an expressive language for describing documents. The algorithm is described in “A Pretty Expressive Printer”. The two official implementations, in Racket and OCaml, were both instrumental in porting the library to Rust.

§Overview

To use the pretty printer, you need to construct an “abstract document” (a Doc in this library). This document contains not just the text you want to format, but also instructions about the different ways that it is valid to format it. pretty-expressive provides a variety of functions and operators to help you construct these documents.

Let’s look at a small example:

let then_expr = text("'even");
let else_expr = text("'odd");
let cond_expr = text("(zero? (mod n 2))");

let one_line = cond_expr.clone() &
               space() &
               then_expr.clone() &
               space() &
               else_expr.clone();

let one_column = align(cond_expr.clone() &
                       nl() &
                       then_expr.clone() &
                       nl() &
                       else_expr.clone());

let if_expr = lparen() &
              text("if ") &
              (one_line | one_column) &
              rparen();

let doc = lparen() &
          text("defn even? (n)") &
          nest(2, nl() & if_expr & rparen());

It may look overwhelming at first, but if you can look past all the clone()s, it’s expressing something pretty simple. This document contains the code to define a small Lisp function to check if a number is even. To create the document:

  1. The three inputs to the if expression are created with text.
  2. We reuse those three documents to describe two different layouts for how those inputs could be formatted. One places all three on the same line, while the other aligns them vertically in a column. The & operator is used to concatenate the smaller documents together with punctuation into a larger document.
  3. The if expression combines the two layouts in one document using the | operator. This gives the printer a choice: it must render exactly one of these two layouts for this portion of the document. The printer decides which one is better and uses it.
  4. Finally, the complete document is built by combining the if expression with other documents to describe the layout of the entire function definition. nest is used to indent the body of the defn (i.e. the if expression) by 2 spaces.
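To make the roles of & and | more concrete, here is a minimal, self-contained sketch of how such operators could build a document tree through operator overloading. ToyDoc, toy_text, and layouts are hypothetical names invented for this illustration. This is not the crate's Doc type or its algorithm: the toy simply enumerates every possible layout, which the real printer avoids doing.

```rust
use std::ops::{BitAnd, BitOr};

/// A toy document tree (hypothetical; not the crate's `Doc`).
#[derive(Clone, Debug)]
enum ToyDoc {
    Text(String),
    Concat(Box<ToyDoc>, Box<ToyDoc>),
    Choice(Box<ToyDoc>, Box<ToyDoc>),
}

/// Builds a leaf holding literal text.
fn toy_text(s: &str) -> ToyDoc {
    ToyDoc::Text(s.to_string())
}

/// `a & b` concatenates two documents.
impl BitAnd for ToyDoc {
    type Output = ToyDoc;
    fn bitand(self, rhs: ToyDoc) -> ToyDoc {
        ToyDoc::Concat(Box::new(self), Box::new(rhs))
    }
}

/// `a | b` records a choice between two alternative layouts.
impl BitOr for ToyDoc {
    type Output = ToyDoc;
    fn bitor(self, rhs: ToyDoc) -> ToyDoc {
        ToyDoc::Choice(Box::new(self), Box::new(rhs))
    }
}

/// Lists every string the tree could render to, left alternative first.
fn layouts(doc: &ToyDoc) -> Vec<String> {
    match doc {
        ToyDoc::Text(s) => vec![s.clone()],
        ToyDoc::Concat(a, b) => {
            let right = layouts(b);
            let mut out = Vec::new();
            for l in layouts(a) {
                for r in &right {
                    out.push(format!("{l}{r}"));
                }
            }
            out
        }
        ToyDoc::Choice(a, b) => {
            let mut all = layouts(a);
            all.extend(layouts(b));
            all
        }
    }
}
```

In this toy, a tree with one choice yields two candidate strings. The number of candidates grows quickly as choices nest, which is why an efficient search over layouts matters in practice.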

Now that we’ve constructed the document, we can call doc.to_string() to render it to a String.

assert_eq!(doc.to_string(),
r"(defn even? (n)
  (if (zero? (mod n 2)) 'even 'odd))");

The main constraint we can impose on the printer is the page width: the maximum number of characters we want to fit on a single line. Calling to_string() uses a default page width of 80 characters. Since that leaves plenty of room, the printer opted to put all three if arguments on one line together.

To use a different page width, pass it as the width parameter to the format! macro. With a smaller page width, the printer makes a different choice:

assert_eq!(format!("{doc:20}"),
r"(defn even? (n)
  (if (zero? (mod n 2))
      'even
      'odd))");

It might seem contrived to use a width so small: nobody would actually try to format their code into a width of 20 characters! That’s probably true, but imagine this expression appears nested much further into a function, and further right in the line it appears on. The document we created here describes the valid ways for this expression to be formatted no matter where in the overall document it appears.

This means we can write a Rust function that can describe how any if expression should be formatted (this example isn’t far off from doing so already). We can then do the same for every possible construct in our language that needs formatting, composing them together as needed. In the end, we have a program that can format any document in our language, built from simple descriptions of the possible layouts each piece could use.
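The point about position can be sketched without the crate at all. The function below is a toy, written under the assumption that "fits on the line" is the only criterion; format_if and its parameters are hypothetical names, and the real printer makes this decision through costs rather than a hard-coded check.

```rust
/// Formats `(if cond then else)` as if it started at column `start_col`.
/// Prefers the one-line layout when it fits within `width`; otherwise
/// stacks the three arguments in a column aligned under `cond`.
fn format_if(
    cond: &str,
    then_expr: &str,
    else_expr: &str,
    start_col: usize,
    width: usize,
) -> String {
    let one_line = format!("(if {cond} {then_expr} {else_expr})");
    if start_col + one_line.chars().count() <= width {
        return one_line;
    }
    // "(if " is 4 characters wide, so the arguments align at start_col + 4.
    let pad = " ".repeat(start_col + 4);
    format!("(if {cond}\n{pad}{then_expr}\n{pad}{else_expr})")
}
```

With a page width of 80, the same three inputs come out on one line when the expression starts at column 2, but in a column when it starts at column 60. That is the situation described above: a deeply nested expression can run out of room even on a wide page.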

§A more complete example

I ported this library to Rust because I wanted to build a code formatter for MJL, a Lisp language I’m creating. Other approaches I tried for solving the problem didn’t get me the results I was hoping for, and the code was much more convoluted. The version I built with pretty-expressive handles more cases correctly and is significantly easier to read and reason about.

§Creating abstract documents

The documentation for Doc describes the different functions and operators used to create, wrap, and combine abstract documents to prepare them to be printed.

§Printing a document

The example above used to_string() and format! to turn an abstract document into a String. If you have another destination you want to write to, you can instead use one of the other Rust formatting macros to send the printed document wherever you like.


let doc = text("hello world");
println!("{doc:120}");

You can use the width parameter in the macro to request a particular page width.
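For readers curious how a type can pick up a width from the formatting macros, the standard library exposes it through Formatter::width() inside a Display implementation. The sketch below shows that mechanism with a hypothetical Words type that wraps words greedily; it is an assumption about the general approach, not code taken from this crate, and the fallback of 80 only mirrors the default mentioned above.

```rust
use std::fmt;

/// A hypothetical stand-in for a document: a list of words to wrap.
struct Words(Vec<&'static str>);

impl fmt::Display for Words {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `{:20}` arrives here as Some(20); a plain `{}` gives None.
        let page_width = f.width().unwrap_or(80);
        let mut col = 0;
        for (i, word) in self.0.iter().enumerate() {
            let len = word.chars().count();
            if i > 0 {
                if col + 1 + len > page_width {
                    // The next word would not fit: start a new line.
                    f.write_str("\n")?;
                    col = 0;
                } else {
                    f.write_str(" ")?;
                    col += 1;
                }
            }
            f.write_str(word)?;
            col += len;
        }
        Ok(())
    }
}

/// Writes the wrapped words into any `fmt::Write` sink, such as a `String`.
fn render_into<W: fmt::Write>(out: &mut W, words: &Words, width: usize) -> fmt::Result {
    write!(out, "{words:width$}")
}
```

Because the width travels through the format specification, the same value works with format!, println!, write!, and the other formatting macros.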

§Adjusting the printer’s notion of “optimal”

The pretty printer chooses an optimal layout by assigning a Cost to each layout it explores. By default, it’s designed to optimize first for not exceeding the desired page width and second for minimizing the number of lines produced. In other words, it will try to fit as much as it can on each line, using more lines only when it has no other choice or when a line would otherwise exceed the page width.
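The "first width, then line count" priority can be modeled as a lexicographic comparison. The sketch below is a toy under stated assumptions: ToyCost, toy_cost, and cheapest are hypothetical names, overflow is measured as a plain count of characters past the page width, and the crate's DefaultCost may well measure it differently.

```rust
/// Toy cost: compared field by field, so overflow dominates line count.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct ToyCost {
    /// Characters past the page width, summed over all lines.
    overflow: usize,
    /// Number of line breaks in the layout.
    newlines: usize,
}

/// Scores an already-rendered layout against a page width.
fn toy_cost(layout: &str, width: usize) -> ToyCost {
    let overflow: usize = layout
        .lines()
        .map(|line| line.chars().count().saturating_sub(width))
        .sum();
    let newlines = layout.matches('\n').count();
    ToyCost { overflow, newlines }
}

/// Picks the candidate with the smallest cost (the first one wins ties).
fn cheapest<'a>(candidates: &[&'a str], width: usize) -> Option<&'a str> {
    candidates
        .iter()
        .copied()
        .min_by_key(|layout| toy_cost(layout, width))
}
```

Deriving Ord on a struct compares fields in declaration order, so any layout with less overflow beats any layout with more, and line count only breaks ties between layouts that overflow equally.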

You can probably get pretty far using this default, but there are some ways to tweak it if needed:

  • Use the cost function to increase the cost imposed on a particular document, which can cause the printer to choose an alternative layout instead.
  • Replace the CostFactory that the printer uses to assign costs. If you do this, be sure to read the documentation carefully, as there are rules that costs and cost factories must follow for the printer to work correctly.
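As a rough illustration of the first option, the sketch below shows how an extra penalty on one layout can flip the printer's decision. Everything here is hypothetical (ToyCost, Candidate, pick, and the component-wise addition are assumptions made for the example) and it does not reproduce the crate's cost function or the CostFactory trait.

```rust
/// Toy cost, compared field by field (hypothetical, not the crate's type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct ToyCost {
    overflow: usize,
    newlines: usize,
}

impl ToyCost {
    /// Adds an extra cost component-wise.
    fn plus(self, extra: ToyCost) -> ToyCost {
        ToyCost {
            overflow: self.overflow + extra.overflow,
            newlines: self.newlines + extra.newlines,
        }
    }
}

/// One candidate layout with its measured cost and an author-chosen penalty.
struct Candidate {
    name: &'static str,
    base: ToyCost,
    penalty: ToyCost,
}

/// Returns the name of the candidate whose total cost is lowest.
fn pick(candidates: &[Candidate]) -> Option<&'static str> {
    candidates
        .iter()
        .min_by_key(|c| c.base.plus(c.penalty))
        .map(|c| c.name)
}
```

Without a penalty, a layout with one line break beats a layout with two. Adding a penalty worth two extra line breaks to the shorter layout makes the longer one cheaper overall, so it gets picked instead.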

§Credits

My thanks to the authors of “A Pretty Expressive Printer”:

  • Sorawee Porncharoenwase
  • Justin Pombrio
  • Emina Torlak

This crate is based almost entirely on their ideas and uses their implementations as a reference.

Structs§

DefaultCost
The default cost type used for documents.
DefaultCostFactory
The default strategy for assigning costs to layouts.
Doc
An abstract document containing pretty printing instructions.
Error
Error to signal that no printable layout could be found for a document.
PrintResult
The resolved optimal layout for a successful print attempt.

Traits§

Cost
A measurement of the badness of a particular printing layout.
CostFactory
Trait for types that can assign costs to particular layouts of a document.

Functions§

a_append
Concatenates two documents horizontally with alignment.
a_concat
Concatenates several documents horizontally with alignment.
align
Aligns a document.
as_append
Concatenates two documents horizontally with spacing and alignment.
as_concat
Concatenates several documents horizontally with spacing and alignment.
brk
Creates a newline document that flattens to "".
comma
Creates a document that renders the text ",".
concat
Concatenates several documents into one.
cost
Introduces an extra cost to a document.
dquote
Creates a document that renders the text "\"".
fail
Creates a document that fails to render.
flatten
Creates a flattened version of a document.
full
Requires that a document not be followed by any more text on the same line.
group
Creates a choice between a document and its flattened version.
hard_nl
A newline that cannot be flattened.
lbrace
Creates a document that renders the text "{".
lbrack
Creates a document that renders the text "[".
lparen
Creates a document that renders the text "(".
nest
Increases the indentation level while rendering a document.
newline
Creates a newline document.
nl
Creates a newline document that flattens to " ".
rbrace
Creates a document that renders the text "}".
rbrack
Creates a document that renders the text "]".
reset
Resets the indentation level for a document to 0.
rparen
Creates a document that renders the text ")".
space
Creates a document that renders the text " ".
text
Creates a document that renders a single-line string.
u_append
Concatenates two documents horizontally.
us_append
Concatenates two documents horizontally with spacing.
us_concat
Concatenates several documents horizontally with spacing.
v_append
Concatenates two documents vertically.
v_concat
Concatenates several documents vertically.

Type Aliases§

Result
Result type for pretty-printing operations.