Crate json5format

A stylized formatter for JSON5 ("JSON for Humans") documents.

The intent of this formatter is to rewrite a given valid JSON5 document, restructuring the output (if required) to conform to a consistent style.

The resulting document should preserve all data precision, data format representations, and semantic intent. Readability should be maintained, if not improved, by the consistency within and across documents.

Most importantly, all JSON5 comments should be preserved, maintaining the positional relationship with the JSON5 data elements they were intended to document.

Example

  use json5format::*;
  use maplit::hashmap;
  use maplit::hashset;

  let json5 = r##"{
      "name": {
          "last": "Smith",
          "first": "John",
          "middle": "Jacob"
      },
      "children": [
          "Buffy",
          "Biff",
          "Balto"
      ],
      // Consider adding a note field to the `other` contact option
      "contact_options": [
          {
              "home": {
                  "email": "jj@notreallygmail.com",
                  "phone": "212-555-4321"
              },
              "other": {
                  "email": "volunteering@serviceprojectsrus.org"
              },
              "work": {
                  "phone": "212-555-1234",
                  "email": "john.j.smith@worksforme.gov"
              }
          }
      ],
      "address": {
          "city": "Anytown",
          "country": "USA",
          "state": "New York",
          "street": "101 Main Street"
          /* Update schema to support multiple addresses:
             "work": {
                 "city": "Anytown",
                 "country": "USA",
                 "state": "New York",
                 "street": "101 Main Street"
             }
          */
      }
  }
  "##;

  let options = FormatOptions {
      indent_by: 2,
      collapse_containers_of_one: true,
      options_by_path: hashmap! {
          "/*" => hashset! {
              PathOption::PropertyNameOrder(vec![
                  "name",
                  "address",
                  "contact_options",
              ]),
          },
          "/*/name" => hashset! {
              PathOption::PropertyNameOrder(vec![
                  "first",
                  "middle",
                  "last",
                  "suffix",
              ]),
          },
          "/*/children" => hashset! {
              PathOption::SortArrayItems(true),
          },
          "/*/*/*" => hashset! {
              PathOption::PropertyNameOrder(vec![
                  "work",
                  "home",
                  "other",
              ]),
          },
          "/*/*/*/*" => hashset! {
              PathOption::PropertyNameOrder(vec![
                  "phone",
                  "email",
              ]),
          },
      },
      ..Default::default()
  };

  let filename = "new_contact.json5".to_string();

  let format = Json5Format::with_options(options)?;
  let parsed_document = ParsedDocument::from_str(&json5, Some(filename))?;
  let bytes: Vec<u8> = format.to_utf8(&parsed_document)?;

  assert_eq!(std::str::from_utf8(&bytes)?, r##"{
  name: {
    first: "John",
    middle: "Jacob",
    last: "Smith",
  },
  address: {
    city: "Anytown",
    country: "USA",
    state: "New York",
    street: "101 Main Street",

    /* Update schema to support multiple addresses:
       "work": {
           "city": "Anytown",
           "country": "USA",
           "state": "New York",
           "street": "101 Main Street"
       }
    */
  },

  // Consider adding a note field to the `other` contact option
  contact_options: [
    {
      work: {
        phone: "212-555-1234",
        email: "john.j.smith@worksforme.gov",
      },
      home: {
        phone: "212-555-4321",
        email: "jj@notreallygmail.com",
      },
      other: { email: "volunteering@serviceprojectsrus.org" },
    },
  ],
  children: [
    "Balto",
    "Biff",
    "Buffy",
  ],
}
"##);

Formatter Actions

When the options above are applied to the input, the formatter will make the following changes:

  • The formatted document will be indented by 2 spaces.
  • Quotes will be removed from all property names (since they are all legal ECMAScript identifiers).
  • The top-level properties will be reordered to [name, address, contact_options]. Since property name children was not included in the sort order, it will be placed at the end.
  • The name properties will be reordered to [first, middle, last].
  • The properties of the unnamed object in array contact_options will be reordered to [work, home, other].
  • The properties of the work, home, and other objects will be reordered to [phone, email].
  • The children names array of string primitives will be sorted.
  • All elements (except the top-level object, represented by the outermost curly braces) will end with a comma.
  • Since the contact_options descendant element other has only one property, the other object structure will collapse to a single line, with the internal trailing comma suppressed.
  • The line comment will retain its relative position, above contact_options.
  • The block comment will retain its relative position, inside and at the end of the address object.
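
The property reordering described above can be illustrated with a minimal sketch. This is not the crate's implementation; `order_properties` is a hypothetical helper that assumes properties are plain `(name, value)` string pairs. It relies on the stability of `sort_by_key` so that unlisted names keep their original relative order at the end.

```rust
/// Hypothetical helper (not part of json5format): moves the properties named
/// in `order` to the front, in that order. Properties that are not listed
/// keep their original relative order and are placed last.
fn order_properties<'a>(
    mut props: Vec<(&'a str, &'a str)>,
    order: &[&str],
) -> Vec<(&'a str, &'a str)> {
    // Listed names get their index in `order` as the sort key; all unlisted
    // names share the key `order.len()`. `sort_by_key` is a stable sort, so
    // properties with equal keys stay in their original order.
    props.sort_by_key(|(name, _)| {
        order
            .iter()
            .position(|listed| listed == name)
            .unwrap_or(order.len())
    });
    props
}
```

With the top-level names from the example in their input order (name, children, contact_options, address) and the order list [name, address, contact_options], the sketch yields name, address, contact_options, children. Names in the order list that do not occur in the object, such as suffix, simply have no effect.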

Formatter Behavior Details

For reference, the following sections detail how the JSON5 formatter verifies and processes JSON5 content.

Syntax Validation

  • Structural syntax is checked, such as validating matching braces, property name-colon-value syntax, enforced separation of values by commas, properly quoted strings, and both block and line comment extraction.
  • Non-string literal value syntax is checked (null, true, false, and the various legal formats for JSON5 Numbers).
  • Syntax errors produce error messages with the line and column where the problem was encountered.
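
As a rough illustration of how a line and column can be derived for an error message, the sketch below converts a byte offset into a position. `line_and_column` is a hypothetical helper, not the crate's API, and the 1-based numbering and counting of columns in characters are assumptions.

```rust
/// Hypothetical helper (not part of json5format): converts a byte offset
/// into a 1-based (line, column) pair, counting columns in characters.
///
/// Panics if `byte_offset` is past the end of `text` or does not fall on a
/// UTF-8 character boundary.
fn line_and_column(text: &str, byte_offset: usize) -> (usize, usize) {
    let prefix = &text[..byte_offset];
    // Every newline before the offset starts a new line.
    let line = prefix.matches('\n').count() + 1;
    // The column is the number of characters since the last newline, plus one.
    let column = match prefix.rfind('\n') {
        Some(newline) => prefix[newline + 1..].chars().count() + 1,
        None => prefix.chars().count() + 1,
    };
    (line, column)
}
```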

Property Names

  • Duplicate property names are retained, but may constitute errors in higher-level JSON5 parsers or schema-specific deserializers.
  • All JSON5 unquoted property name characters are supported, including '$' and '_'. Digits are the only valid property name character that cannot be the first character. Property names can also be represented as quoted strings. All valid JSON5 strings, if quoted, are valid property names (including multi-line strings and quoted numbers).

Example:

    $_meta_prop: 'Has "double quotes" and \'single quotes\' and \
multiple lines with escaped \\ backslash',
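
The rule for unquoted property names can be approximated with a short check. This is a simplified sketch, not the crate's logic: `can_be_unquoted` is a hypothetical helper that handles ASCII names only, whereas ECMAScript identifiers also permit Unicode letters and escape sequences.

```rust
/// Hypothetical helper (not part of json5format): returns true if `name`
/// could be written without quotes, under an ASCII-only approximation of
/// the identifier rules.
fn can_be_unquoted(name: &str) -> bool {
    let is_start = |c: char| c.is_ascii_alphabetic() || c == '$' || c == '_';
    let mut chars = name.chars();
    match chars.next() {
        // The first character may be a letter, '$', or '_', but not a digit.
        Some(first) if is_start(first) => {
            // Later characters may additionally be digits.
            chars.all(|c| is_start(c) || c.is_ascii_digit())
        }
        // Empty names and names with any other first character need quotes.
        _ => false,
    }
}
```

Under this approximation, `$_meta_prop` and `first` qualify, while `1st` and `has space` must stay quoted.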

Literal Values

  • JSON5 supports quoting strings (literal values or quoted property names) by either double (") or single (') quote. The formatter does not change the quotes. Double-quoting is conventional, but single quotes may be used when quoting strings containing double-quotes, and leaving the single quotes as-is is preferred.
  • JSON5 literal values are retained as-is. Strings retain all spacing characters, including escaped newlines. All other literals (unquoted tokens without spaces, such as false, null, 0.234, 1337, or l33t) are not interpreted syntactically. Other schema-based tools and JSON5 deserializers may flag these invalid values.

Optional Sorting

  • By default, array items and object properties retain their original order. (Some JSON arrays are order-dependent, and sorting them indiscriminately might change the meaning of the data.)
  • The formatter can automatically sort array items and object properties if enabled via FormatOptions:
    • To sort all arrays in the document, set FormatOptions.sort_array_items to true.
    • To sort only specific arrays in the target schema, specify the schema location under FormatOptions.options_by_path, and set its SortArrayItems option.
    • Properties are sorted based on an explicit user-supplied list of property names in the preferred order, for objects at a specified path. Specify the object's location in the target schema using FormatOptions.options_by_path, and provide a vector of property name strings with the PropertyNameOrder option. Properties not included in this option retain their original order, behind the explicitly ordered properties, if any.
  • When sorting array items, the formatter only sorts array item literal values (strings, numbers, bools, and null). Child arrays or objects are left in their original order, after sorted literals, if any, within the same array.
  • Array items are sorted in case-insensitive Unicode lexicographic order. (Note that, since the formatter does not parse unquoted literals, number types cannot be sorted numerically.) Items that are case-insensitively equal are re-compared and ordered case-sensitively with respect to each other.
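
The two-level comparison for array items can be sketched as follows. This is an illustration only: `compare_items` and `sort_literals` are hypothetical names, and `str::to_lowercase` is used as a stand-in for case-insensitive comparison, which may differ from how the crate folds case.

```rust
use std::cmp::Ordering;

/// Hypothetical comparator (not part of json5format): compares two literal
/// tokens case-insensitively first, then case-sensitively to break ties.
fn compare_items(a: &str, b: &str) -> Ordering {
    a.to_lowercase()
        .cmp(&b.to_lowercase())
        .then_with(|| a.cmp(b))
}

/// Sorts literal tokens in place using `compare_items`. Tokens are compared
/// as text, so numbers are ordered lexicographically, not numerically.
fn sort_literals(items: &mut [&str]) {
    items.sort_by(|a, b| compare_items(a, b));
}
```

Sorting `["biff", "Buffy", "Biff", "balto", "10", "9"]` with this sketch gives `["10", "9", "balto", "Biff", "biff", "Buffy"]`: the number tokens are ordered as text, and `Biff` precedes `biff` because the case-sensitive tie-break places uppercase 'B' before lowercase 'b'.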

Associated Comments

  • All comments immediately preceding an element (value or start of an array or object), and trailing line comments on the same line as the element, are retained and move with the associated item if the item is repositioned during sorting.
  • All line and block comments are retained. Typically, the comments are re-aligned vertically (indented) with the values with which they were associated.
  • A single line comment appearing immediately after a JSON value (primitive or closing brace), on the same line, will remain appended to that value on its line after re-formatting.
  • Spaces separate block comments from blocks of contiguous line comments associated with the same entry.
  • Comments at the end of a list (after the last property or item) are retained at the end of the same list.
  • Block comments with lines that extend to the left of the opening "/*" are not re-aligned.
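
One way to picture the block comment re-alignment rule is the sketch below. It is an assumption-laden illustration, not the crate's algorithm: `realign_block_comment` is a hypothetical helper, indentation is measured in spaces only, and continuation lines are assumed to be left untouched when any of them starts to the left of the opening "/*".

```rust
/// Number of leading space characters in `line`.
fn indent_of(line: &str) -> usize {
    line.len() - line.trim_start_matches(' ').len()
}

/// Hypothetical helper (not part of json5format). `lines[0]` is the comment
/// text starting at "/*" with its indentation removed; the remaining lines
/// keep their original leading spaces. `old_col` is the column where "/*"
/// originally started, and `new_col` is where it should start now.
fn realign_block_comment(lines: &[&str], old_col: usize, new_col: usize) -> Vec<String> {
    // If any non-blank continuation line starts left of the original "/*",
    // the continuation lines are not shifted (assumed behavior).
    let extends_left = lines
        .iter()
        .copied()
        .skip(1)
        .any(|l| !l.trim().is_empty() && indent_of(l) < old_col);
    lines
        .iter()
        .copied()
        .enumerate()
        .map(|(i, l)| {
            if i == 0 {
                format!("{}{}", " ".repeat(new_col), l)
            } else if extends_left || l.trim().is_empty() {
                l.to_string()
            } else {
                // Replace the first `old_col` spaces with `new_col` spaces,
                // which preserves indentation relative to the "/*".
                format!("{}{}", " ".repeat(new_col), &l[old_col..])
            }
        })
        .collect()
}
```

In the example at the top of this page, the block comment opens at column 10 with its body indented 13 spaces. Moving the opening to column 4 shifts the body to 7 spaces, keeping the same relative offset.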

Whitespace Handling

  • Unicode characters are allowed, and Unicode space characters should retain their meaning according to Unicode standards.
  • All spaces inside single- or multi-line strings are retained. All spaces in comments are retained except trailing spaces at the end of a line.
  • All other original spaces are removed.
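
The handling of spaces inside comments can be sketched in a few lines. `trim_comment_lines` is a hypothetical helper, not the crate's API. Note that `trim_end` strips all trailing Unicode whitespace (including tabs), which may be broader than what the formatter removes.

```rust
/// Hypothetical helper (not part of json5format): removes trailing
/// whitespace from each line of a comment while keeping all leading and
/// interior spaces.
fn trim_comment_lines(comment: &str) -> String {
    comment
        .lines()
        .map(|line| line.trim_end())
        .collect::<Vec<_>>()
        .join("\n")
}
```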

Macros

test_error

Create a TestFailure error including the source file location of the macro call.

Structs

FormatOptions

Options that change the style of the formatted JSON5 output.

Json5Format

A JSON5 formatter that parses a valid JSON5 input buffer and produces a new, formatted document.

Location

A location within a document buffer or document file. This module uses Location to refer to locations of JSON5 syntax errors (while parsing) and also to locations in this Rust source file, to improve unit testing output.

ParsedDocument

Represents the parsed state of a given JSON5 document.

Enums

Error

Errors produced by the json5format library.

PathOption

Options that can be applied to specific objects or arrays in the target JSON5 schema, through FormatOptions.options_by_path. Each option can be set at most once per unique path.

Functions

format

Format a JSON5 document, applying a consistent style, with given options.