Types for extracting and representing comments of a syntax tree.

Most programming languages support comments so that programmers can document their programs. Comments differ from other syntax because programming languages allow them in almost any position, giving programmers great flexibility in where they write them:

/**
 * Documentation comment
 */
async /* comment */ function Test () // line comment
{/*inline*/}

However, this flexibility makes formatting comments challenging because:

  • The formatter must place comments consistently so that re-formatting the output yields the same result and never produces invalid syntax (for example, by moving code onto the same line after a line comment).
  • It is essential that formatters place comments close to the syntax the programmer intended to document. However, the lack of rules regarding where comments are allowed and what syntax they document requires the use of heuristics to infer the documented syntax.
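The first requirement can be checked mechanically: formatting already formatted output must change nothing. The sketch below shows such a check. `is_idempotent` and `collapse_whitespace` are hypothetical names used only for illustration, and the toy formatter is a stand-in, not the formatter described here.

```rust
/// Returns true if formatting the formatter's own output changes nothing.
/// `format` is any function from source text to formatted text.
fn is_idempotent<F: Fn(&str) -> String>(format: F, source: &str) -> bool {
    let once = format(source);
    let twice = format(&once);
    once == twice
}

/// Toy stand-in formatter: collapses every run of whitespace into one space.
fn collapse_whitespace(source: &str) -> String {
    source.split_whitespace().collect::<Vec<_>>().join(" ")
}
```

A formatter that moves a comment on every pass fails this check even when each individual output looks reasonable.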

This module strikes a balance between placing comments as closely as possible to their source location and reducing the complexity of formatting comments. It does so by associating comments with nodes rather than with tokens. This greatly reduces the number of possible comment positions, yet in practice it is precise enough to keep comments close to their source location.
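As a rough illustration of per-node association, the sketch below keeps three comment lists for each node. `NodeId`, `NodeComments`, and `CommentMap` are hypothetical names chosen for this example and are not the types exported by this module.

```rust
use std::collections::HashMap;

/// Hypothetical node identifier; a real syntax tree would supply its own key.
type NodeId = u32;

/// The comments attached to one node, split by position relative to it.
#[derive(Debug, Default)]
struct NodeComments {
    leading: Vec<String>,
    dangling: Vec<String>,
    trailing: Vec<String>,
}

/// Illustrative per-node comment store (an assumption, not this module's API).
#[derive(Debug, Default)]
struct CommentMap {
    by_node: HashMap<NodeId, NodeComments>,
}

impl CommentMap {
    fn push_leading(&mut self, node: NodeId, text: &str) {
        self.by_node.entry(node).or_default().leading.push(text.to_string());
    }

    fn push_dangling(&mut self, node: NodeId, text: &str) {
        self.by_node.entry(node).or_default().dangling.push(text.to_string());
    }

    fn push_trailing(&mut self, node: NodeId, text: &str) {
        self.by_node.entry(node).or_default().trailing.push(text.to_string());
    }

    /// Leading comments of `node`, or an empty slice if it has none.
    fn leading(&self, node: NodeId) -> &[String] {
        match self.by_node.get(&node) {
            Some(comments) => &comments.leading,
            None => &[],
        }
    }

    /// Dangling comments of `node`, or an empty slice if it has none.
    fn dangling(&self, node: NodeId) -> &[String] {
        match self.by_node.get(&node) {
            Some(comments) => &comments.dangling,
            None => &[],
        }
    }

    /// Trailing comments of `node`, or an empty slice if it has none.
    fn trailing(&self, node: NodeId) -> &[String] {
        match self.by_node.get(&node) {
            Some(comments) => &comments.trailing,
            None => &[],
        }
    }
}
```

Keying by node means a formatting rule only ever asks three questions about a node, regardless of how many tokens the node contains.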

§Node comments

Comments are associated with a node and are further distinguished by their location relative to that node:

§Leading Comments

A comment at the start of a node.

// Leading comment of the statement
console.log("test");

[/* leading comment of identifier */ a ];

§Dangling Comments

A comment that is neither at the start nor at the end of a node.

[/* in between the brackets */ ];
async  /* between keywords */  function Test () {}

§Trailing Comments

A comment at the end of a node.

[a /* trailing comment of a */, b, c];
[
    a // trailing comment of a
]
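The three positions can be pictured as a comparison of byte ranges. The sketch below is a deliberately simplified assumption: it compares a comment's range only with the range covered by one node's own tokens and ignores child nodes and line breaks. `Placement` and `classify` are hypothetical names, not items of this module.

```rust
use std::ops::Range;

/// Position of a comment relative to a node.
#[derive(Debug, PartialEq, Eq)]
enum Placement {
    Leading,
    Dangling,
    Trailing,
}

/// Classifies `comment` against the byte range covered by the node's tokens.
/// Simplification: only the ranges are compared.
fn classify(node: &Range<usize>, comment: &Range<usize>) -> Placement {
    if comment.end <= node.start {
        // The comment ends before the node's first token starts.
        Placement::Leading
    } else if comment.start >= node.end {
        // The comment starts after the node's last token ends.
        Placement::Trailing
    } else {
        // The comment sits between the node's own tokens.
        Placement::Dangling
    }
}
```

Under this model, the comment in `[/* in between the brackets */ ];` lies inside the array's range and is therefore dangling, while the comment in `[a /* trailing comment of a */, b, c];` starts after `a` ends and is trailing for `a`.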

§Limitations

Limiting the placement of comments to leading, dangling, or trailing node comments reduces complexity inside the formatter, but it also means that the positions in which the formatter can place comments depend on the AST structure.

For example, the continue statement in JavaScript is defined as:

JsContinueStatement =
'continue'
(label: 'ident')?
';'?

but a programmer may decide to add a comment before or after the label:

continue /* comment 1 */ label;
continue label /* comment 2*/; /* trailing */

Because all children of the continue statement are tokens, the comments can only become leading, dangling, or trailing comments of the continue statement itself. This loses information: the formatting code can no longer tell whether a comment appeared before or after the label and therefore has to format both cases the same way.

This has not proven to be a significant limitation so far, but the infrastructure could be extended to support a label on SourceComment that further categorises comments.
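One way such a label could look is sketched below for the continue statement. Everything here is hypothetical: `ContinueCommentLabel`, `LabeledComment`, and `format_continue` are made up for illustration, and no such label exists on SourceComment today.

```rust
/// Hypothetical label recording where a comment sat inside a `continue` statement.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ContinueCommentLabel {
    BeforeLabel,
    AfterLabel,
}

/// Hypothetical stand-in for a comment that carries a label.
struct LabeledComment<'a> {
    text: &'a str,
    label: ContinueCommentLabel,
}

/// Rebuilds a `continue` statement, keeping each comment on its original side
/// of the statement label.
fn format_continue(label: &str, comments: &[LabeledComment<'_>]) -> String {
    let mut out = String::from("continue");
    for comment in comments
        .iter()
        .filter(|c| c.label == ContinueCommentLabel::BeforeLabel)
    {
        out.push(' ');
        out.push_str(comment.text);
    }
    out.push(' ');
    out.push_str(label);
    for comment in comments
        .iter()
        .filter(|c| c.label == ContinueCommentLabel::AfterLabel)
    {
        out.push(' ');
        out.push_str(comment.text);
    }
    out.push(';');
    out
}
```

With the label in place, the two inputs from the example above no longer collapse into the same output: one comment is emitted before the statement label and the other after it.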

Structs§

  • The comments of a syntax tree stored by node.
  • A comment decorated with additional information about its surrounding context in the source document.
  • Formats a comment as it was in the source document
  • A comment in the source document.

Enums§

Traits§

Functions§

  • Returns true if the comment is a multi-line block comment in which each line starts with a star (*). Such comments can be formatted so that the leading stars always line up in a column.
  • TODO: This is really JS-specific logic, both in syntax and semantics. It should probably be moved to biome_js_formatter when possible, but is currently tied to other behavior about formatting sets of comments (which might also be best to move as well, since it relates to the same specific behavior).