Crate biome_analyze
§Analyzer
The analyzer is a generic crate that aims to implement a visitor-like infrastructure, where it’s possible to inspect a piece of AST and emit diagnostics or actions based on a static check.
§Folder structure
First, you need to identify the crate where you want to implement the rule.
If the rule is going to be implemented for the JavaScript language (and its super languages),
then the rule will be implemented inside the biome_js_analyze crate.
Rules are divided by capabilities:
- analyzers/ folder contains rules that don’t require any particular capabilities, via the Ast<> query type;
- semantic_analyzers/ folder contains rules that require the use of the semantic model, via the Semantic<> query type;
- aria_analyzers/ folder contains rules that require the use of ARIA metadata, via the Aria<> query type;
- assists/ folder contains rules that contribute to refactoring code, with no associated diagnostics. These are rules that are usually meant for editors and IDEs.
Most of the rules will go under analyzers/ or semantic_analyzers/.
Inside these four folders, we have a folder for each group that Biome supports.
New rules have to be implemented under the nursery group.
New rules should always be considered unstable/not exhaustive.
In addition to selecting a group, rules may be flagged as recommended.
The recommended rules are enabled in the default configuration of the Biome linter.
As a general principle, a recommended rule should catch actual programming errors.
For instance, detecting a coding pattern that will throw an exception at runtime.
Pedantic rules that check for specific unwanted patterns but may have high false positive rates
should be left out of the recommended set.
Rules intended to be recommended should be flagged as such even if they are still part of the nursery group,
as unstable rules are only enabled by default on unstable builds.
This gives the project time to test the rule, find edge cases, etc.
§Lint rules
When creating or updating a lint rule, you need to be aware that there’s a lot of generated code inside our toolchain. Our CI ensures that this code is not out of sync and fails otherwise. See the code generation section for more details.
To create a new rule, you have to create and update several files. Because this is a bit tedious, Biome provides an easy way to create and test your rule using Just. Just is not part of the Rust toolchain, so you have to install it with a package manager.
§Choose a name
Biome follows a naming convention according to what the rule does:
- Forbid a concept: no<Concept>
  When a rule’s sole intention is to forbid a single concept, such as disallowing the use of debugger statements, the rule should be named using the no prefix. For example, the rule to disallow the use of debugger statements is named noDebugger.
- Mandate a concept: use<Concept>
  When a rule’s sole intention is to mandate a single concept, such as forcing the use of camel-casing, the rule should be named using the use prefix. For example, the rule mandating the use of camel-cased variable names is named useCamelCase.
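The prefix convention can be checked mechanically. The following is a minimal sketch, not part of Biome: `RuleIntent` and `classify_rule_name` are hypothetical names introduced only for illustration, and the check assumes that the concept following the prefix starts with an uppercase ASCII letter.

```rust
/// The two naming families described above (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum RuleIntent {
    /// `no<Concept>`: the rule forbids a single concept.
    Forbid,
    /// `use<Concept>`: the rule mandates a single concept.
    Mandate,
}

/// Classifies a camel-case rule name by its prefix.
/// Returns `None` when the name follows neither convention.
pub fn classify_rule_name(name: &str) -> Option<RuleIntent> {
    for (prefix, intent) in [("no", RuleIntent::Forbid), ("use", RuleIntent::Mandate)] {
        if let Some(concept) = name.strip_prefix(prefix) {
            // The concept must start with an uppercase letter, so that a
            // word such as "notify" is not read as `no` + `tify`.
            if concept.chars().next().map_or(false, |c| c.is_ascii_uppercase()) {
                return Some(intent);
            }
        }
    }
    None
}
```

With this sketch, `noDebugger` is classified as `Forbid` and `useCamelCase` as `Mandate`, while a name such as `notify` is rejected.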
§What a rule should say to the user
A rule should be informative to the user, and give as much explanation as possible.
When writing a rule, you must adhere to the following pillars:
- Explain to the user the error. Generally, this is the message of the diagnostic.
- Explain to the user why the error is triggered. Generally, this is implemented with an additional note.
- Tell the user what they should do. Generally, this is implemented using a code action. If a code action is not applicable a note should tell the user what they should do to fix the error.
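The three pillars can be pictured as three pieces of data that every diagnostic should carry. The sketch below is a simplified stand-in and does not use Biome's real diagnostic types: `DiagnosticSketch`, `Remedy`, and `missing_pillars` are hypothetical names introduced for illustration.

```rust
/// How the user is told to fix the problem (third pillar).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Remedy {
    /// An automatic fix offered as a code action.
    CodeAction(String),
    /// A note describing the manual fix, when no code action applies.
    Note(String),
}

/// Hypothetical, simplified stand-in for a rule diagnostic.
#[derive(Debug, Clone, Default)]
pub struct DiagnosticSketch {
    /// Pillar 1: what the error is (the diagnostic message).
    pub message: String,
    /// Pillar 2: why the error is triggered (an additional note).
    pub reason: Option<String>,
    /// Pillar 3: what the user should do about it.
    pub remedy: Option<Remedy>,
}

impl DiagnosticSketch {
    pub fn new(message: &str) -> Self {
        Self { message: message.to_string(), ..Self::default() }
    }

    pub fn reason(mut self, reason: &str) -> Self {
        self.reason = Some(reason.to_string());
        self
    }

    pub fn remedy(mut self, remedy: Remedy) -> Self {
        self.remedy = Some(remedy);
        self
    }

    /// Lists the pillars that this diagnostic does not cover yet.
    pub fn missing_pillars(&self) -> Vec<&'static str> {
        let mut missing = Vec::new();
        if self.message.trim().is_empty() {
            missing.push("what");
        }
        if self.reason.is_none() {
            missing.push("why");
        }
        if self.remedy.is_none() {
            missing.push("how to fix");
        }
        missing
    }
}
```

A diagnostic that only sets a message reports `["why", "how to fix"]` as missing, which is a quick way to spot an under-explained rule during review.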
§Create and implement the rule
Let’s say we want to create a new rule called myRuleName, which uses the semantic model.
- Run the command
  just new-lintrule crates/biome_js_analyze/src/semantic_analyzers/nursery myRuleName
  Rules go in different folders, and the folder depends on the type of query system your rule will use:
  - type Query = Ast<> -> analyzers/ folder
  - type Query = Semantic<> -> semantic_analyzers/ folder
  - type Query = SemanticServices -> semantic_analyzers/ folder
  - type Query = Aria<> -> aria_analyzers/ folder
  - type Query = ControlFlowGraph -> analyzers/ folder
  The core team will help you out if you don’t get the folder right. Using the incorrect folder won’t break any code.
- The Query needs to have the Semantic type, because we want to have access to the semantic model. Query tells the engine on which AST node we want to trigger the rule.
- The State type doesn’t have to be used, so it can be considered optional. However, it has to be defined as type State = ().
- Implement the run function. This function is called every time the analyzer finds a match for the query specified by the rule, and may return zero or more “signals”.
- Implement the diagnostic function, to tell the user where the error is and why:
  impl Rule for UseAwesomeTricks {
      // .. code
      fn diagnostic(_ctx: &RuleContext<Self>, _state: &Self::State) -> Option<RuleDiagnostic> {}
  }
  While implementing the diagnostic, please keep Biome’s technical principles in mind. This function is called for every signal emitted by the run function, and it may return zero or one diagnostic.
- Implement the optional action function, if we are able to provide an automatic code fix for the rule:
  impl Rule for UseAwesomeTricks {
      // .. code
      fn action(_ctx: &RuleContext<Self>, _state: &Self::State) -> Option<JsRuleAction> {}
  }
  This function is called for every signal emitted by the run function. It may return zero or one code action. Rules can return a code action that can be safe or unsafe. If a rule returns a code action, you must add fix_kind to the macro declare_rule.
  use biome_analyze::FixKind;
  declare_rule!{
      fix_kind: FixKind::Safe,
  }
  When returning a code action, you must pass the category and the applicability fields. category must be ActionCategory::QuickFix. applicability is either Applicability::MaybeIncorrect or Applicability::Always. Applicability::Always must only be used when the code transformation is safe. In other words, the code transformation should always result in code that does not change the behavior of the code. In the case of noVar, it is not always safe to turn var into const or let.
Don’t forget to format your code with cargo format and lint with cargo lint.
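The relationship between run, diagnostic and action can be sketched without any Biome types. Everything below is a hypothetical simplification: `RuleSketch`, `NoVarSketch` and `analyze` are illustrative names, the "query match" is a plain string instead of an AST node, and the rule uses naive text matching where a real rule would inspect syntax nodes.

```rust
/// Hypothetical, heavily simplified version of the rule lifecycle.
pub trait RuleSketch {
    /// Data carried from `run` to `diagnostic` and `action`.
    type State;

    /// Called once per query match; may return zero or more signals.
    fn run(source: &str) -> Vec<Self::State>;

    /// Called once per signal; may return zero or one diagnostic.
    fn diagnostic(source: &str, state: &Self::State) -> Option<String>;

    /// Called once per signal; may return zero or one code action.
    /// Optional, so the default implementation offers no fix.
    fn action(_source: &str, _state: &Self::State) -> Option<String> {
        None
    }
}

pub struct NoVarSketch;

impl RuleSketch for NoVarSketch {
    /// Byte offset of an offending `var` keyword.
    type State = usize;

    fn run(source: &str) -> Vec<Self::State> {
        // Naive text search, only for the sake of the example.
        source.match_indices("var ").map(|(offset, _)| offset).collect()
    }

    fn diagnostic(_source: &str, state: &Self::State) -> Option<String> {
        Some(format!("Use let or const instead of var (byte offset {state})."))
    }

    fn action(source: &str, state: &Self::State) -> Option<String> {
        // Rewriting `var` to `let` is not always safe, so a real rule
        // would mark this fix as unsafe.
        Some(format!("{}let{}", &source[..*state], &source[*state + 3..]))
    }
}

/// Drives one rule over one query match and collects, for every signal,
/// the optional diagnostic and the optional code action.
pub fn analyze<R: RuleSketch>(source: &str) -> Vec<(Option<String>, Option<String>)> {
    R::run(source)
        .iter()
        .map(|state| (R::diagnostic(source, state), R::action(source, state)))
        .collect()
}
```

Calling `analyze::<NoVarSketch>("var x = 1;")` yields one signal whose action rewrites the source to `let x = 1;`, while a source without `var` yields no signal, so neither `diagnostic` nor `action` is called.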
That’s it! Now, let’s test the rule.
§Test the rule
Inside the tests/specs/ folder, rules are divided by group and rule name.
The test infrastructure is rigid around the association of the pair “group/rule name”, which means that
if your test cases are placed inside the wrong group, you won’t see any diagnostics.
Since each new rule will start from nursery, that’s where we start.
If you used just new-lintrule, a folder that uses the name of the rule should exist.
Otherwise, create a folder called myRuleName/, and then create one or more files where you want to create different cases.
A common pattern is to create files prefixed by invalid or valid.
The files prefixed by invalid contain code that is reported by the rule.
The files prefixed by valid contain code that is not reported by the rule.
Files ending with the extension .jsonc are handled differently.
These files should contain an array of strings where each string is a code snippet.
For instance, for the rule noVar, the file invalidScript.jsonc contains:
["var x = 1; foo(x);", "for (var x of [1,2,3]) { foo(x); }"]
Note that code in a file ending with the extension .jsonc is interpreted in a script environment.
This means that you cannot use syntax that belongs to ECMAScript modules such as import and export.
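The file naming conventions above can be summarized as a small classification function. This is only a sketch of the conventions as described here, not the test infrastructure itself: `Expectation`, `SpecFile` and `classify_spec_file` are hypothetical names.

```rust
/// What a spec file is expected to produce, judging by its name prefix.
#[derive(Debug, PartialEq, Eq)]
pub enum Expectation {
    /// Files prefixed by `invalid`: the rule reports the code.
    Diagnostics,
    /// Files prefixed by `valid`: the rule does not report the code.
    NoDiagnostics,
    /// Any other name: the convention says nothing.
    Unspecified,
}

#[derive(Debug, PartialEq, Eq)]
pub struct SpecFile {
    pub expectation: Expectation,
    /// `true` for `.jsonc` files, which hold an array of snippets that are
    /// interpreted in a script environment (no `import` or `export`).
    pub snippets_in_script_env: bool,
}

pub fn classify_spec_file(file_name: &str) -> SpecFile {
    let expectation = if file_name.starts_with("invalid") {
        Expectation::Diagnostics
    } else if file_name.starts_with("valid") {
        Expectation::NoDiagnostics
    } else {
        Expectation::Unspecified
    };
    SpecFile {
        expectation,
        snippets_in_script_env: file_name.ends_with(".jsonc"),
    }
}
```

For example, `invalidScript.jsonc` is classified as a snippet array that must produce diagnostics, whereas a file named `valid.js` is a regular source file that must stay silent.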
Run the command
just test-lintrule myRuleName
and if you’ve done everything correctly, you should see some snapshots emitted with diagnostics and code actions.
Check our main contribution document to know how to deal with the snapshot tests.
§Promote a rule
Promoting a rule once it is stable can be tedious work. Internally, we have a script that does that for you:
just promote-rule noConsoleLog style
The first argument is the name of the rule, in camel case. The second argument is the name of the group where you’re promoting the rule to.
The script will run some checks and some other script for you.
You’re now ready to commit the changes using git!
§Document the rule
The documentation needs to adhere to the following rules:
- The first paragraph of the documentation is used as a brief description of the rule, and it must be written on one single line. Breaking the paragraph into multiple lines will break the table content of the rules page.
- The next paragraphs can be used to further document the rule with as many details as you see fit.
- The documentation must have a ## Examples header, followed by two headers: ### Invalid and ### Valid. ### Invalid must go first because we need to show when the rule is triggered.
- Each code block must have a language defined.
- When adding invalid snippets in the ### Invalid section, you must use the expect_diagnostic code block property. We use this property to generate a diagnostic and attach it to the snippet. A snippet must emit only ONE diagnostic.
- When adding valid snippets in the ### Valid section, you can use one single snippet.
- You can use the code block property ignore to tell the code generation script to not generate a diagnostic for an invalid snippet.
Here’s an example of what the documentation could look like:
use biome_analyze::declare_rule;

declare_rule! {
    /// Disallow the use of `var`.
    ///
    /// _ES2015_ allows creating variables with block scope instead of function scope
    /// using the `let` and `const` keywords.
    /// Block scope is common in many other programming languages and helps to avoid mistakes.
    ///
    /// Source: https://eslint.org/docs/latest/rules/no-var
    ///
    /// ## Examples
    ///
    /// ### Invalid
    ///
    /// ```js,expect_diagnostic
    /// var foo = 1;
    /// ```
    ///
    /// ```js,expect_diagnostic
    /// var bar = 1;
    /// ```
    ///
    /// ### Valid
    ///
    /// ```js
    /// const foo = 1;
    /// let bar = 1;
    /// ```
    pub(crate) NoVar {
        version: "next",
        name: "noVar",
        recommended: false,
    }
}
This will cause the documentation generator to ensure the rule does emit exactly one diagnostic for this code, and to include a snapshot for the diagnostic in the resulting documentation page.
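Several of the structural rules listed in this section can be checked mechanically. The function below is a rough sketch of such a check and is not the code generation script used by Biome: `check_rule_docs` is a hypothetical name, and it assumes the documentation text has already been stripped of the leading `///` markers.

```rust
/// Checks rule documentation against three of the structural rules:
/// a one-line first paragraph, the expected header order, and a language
/// on every code block. Returns one message per problem found.
pub fn check_rule_docs(docs: &str) -> Vec<&'static str> {
    let lines: Vec<&str> = docs.lines().map(str::trim).collect();
    let mut problems = Vec::new();

    // The first paragraph must fit on a single line, so the line that
    // follows the first non-empty line has to be blank (or absent).
    match lines.iter().position(|line| !line.is_empty()) {
        None => problems.push("documentation is empty"),
        Some(first) => {
            if lines.get(first + 1).map_or(false, |line| !line.is_empty()) {
                problems.push("first paragraph spans more than one line");
            }
        }
    }

    // `## Examples` must come first, then `### Invalid`, then `### Valid`.
    let position_of = |header: &str| lines.iter().position(|line| *line == header);
    match (
        position_of("## Examples"),
        position_of("### Invalid"),
        position_of("### Valid"),
    ) {
        (Some(examples), Some(invalid), Some(valid))
            if examples < invalid && invalid < valid => {}
        _ => problems.push("expected `## Examples`, then `### Invalid`, then `### Valid`"),
    }

    // Every opening code fence must be followed by a language.
    let fence = "`".repeat(3);
    let mut inside_block = false;
    for line in &lines {
        if let Some(info) = line.strip_prefix(fence.as_str()) {
            if !inside_block && info.is_empty() {
                problems.push("code block without a language");
            }
            inside_block = !inside_block;
        }
    }

    problems
}
```

The documentation of NoVar shown above passes all three checks. The sketch does not verify that each invalid snippet emits exactly one diagnostic, since that requires running the rule.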
§Code generation
For simplicity, use just to run all the commands with:
just gen-lint
This command runs several sub-commands:
- cargo codegen-configuration, this command must be run first, and it will update the configuration;
- cargo lintdoc, it will update the website with the documentation of the rules, check declare_rule for more information about it;
- cargo codegen-bindings, it will update the TypeScript types released inside the JS APIs;
- cargo codegen-schema, it will update the JSON Schema file of the configuration, used by the npm packages.
§Commit your work
Once the rule is implemented, tested, and documented, you are ready to open a pull request!
Stage and commit your changes:
> git add -A
> git commit -m 'feat(biome_js_analyze): myRuleName'
To test if everything is ready, run the following command:
just ready
§Rule configuration
Some rules may allow customization using options. We try to keep rule options to a minimum and only when needed. Before adding an option, it’s worth a discussion. Options should follow our technical philosophy.
Let’s assume that the rule we implement supports the following options:
- behavior: a string among "A", "B", and "C";
- threshold: an integer between 0 and 255;
- behaviorExceptions: an array of strings.
We would like to set the options in the biome.json configuration file:
{
    "linter": {
        "rules": {
            "recommended": true,
            "nursery": {
                "my-rule": {
                    "behavior": "A",
                    "threshold": 30,
                    "behaviorExceptions": ["f"]
                }
            }
        }
    }
}
The first step is to create the Rust data representation of the rule’s options.
use biome_deserializable_macros::Deserializable;

#[derive(Clone, Debug, Default, Deserializable)]
pub struct MyRuleOptions {
    behavior: Behavior,
    threshold: u8,
    behavior_exceptions: Vec<String>,
}

#[derive(Clone, Debug, Default, Deserializable)]
pub enum Behavior {
    #[default]
    A,
    B,
    C,
}
To allow deserializing instances of the types MyRuleOptions and Behavior,
they have to implement the Deserializable trait from the biome_deserialize crate.
This is what the Deserializable keyword in the #[derive] statements above does.
It’s a so-called derive macro, which generates the implementation of the Deserializable trait for you.
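To make the constraints on these options concrete, here is a hand-written, std-only sketch of the validation that a deserializer has to perform. It does not use biome_deserialize and is not what the derive macro generates: the `FromStr` implementation and the `build_options` helper are assumptions made for illustration.

```rust
use std::convert::TryFrom;
use std::str::FromStr;

#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub enum Behavior {
    #[default]
    A,
    B,
    C,
}

impl FromStr for Behavior {
    type Err = String;

    /// Accepts exactly the strings "A", "B" and "C".
    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "A" => Ok(Behavior::A),
            "B" => Ok(Behavior::B),
            "C" => Ok(Behavior::C),
            other => Err(format!("unknown behavior {other:?}")),
        }
    }
}

#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct MyRuleOptions {
    pub behavior: Behavior,
    pub threshold: u8,
    pub behavior_exceptions: Vec<String>,
}

/// Hypothetical helper: builds the options from loosely typed values,
/// rejecting anything outside the documented constraints.
pub fn build_options(
    behavior: &str,
    threshold: i64,
    behavior_exceptions: &[&str],
) -> Result<MyRuleOptions, String> {
    Ok(MyRuleOptions {
        behavior: behavior.parse()?,
        // `u8` covers exactly the documented range of 0 to 255.
        threshold: u8::try_from(threshold)
            .map_err(|_| format!("threshold {threshold} is outside 0..=255"))?,
        behavior_exceptions: behavior_exceptions.iter().map(|s| s.to_string()).collect(),
    })
}
```

Choosing `u8` for threshold means the range check comes for free from the type, and deriving `Default` gives the rule sensible options when the user configures nothing.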
With these types in place, you can set the associated type Options of the rule:
impl Rule for MyRule {
    type Query = Semantic<JsCallExpression>;
    type State = Fix;
    type Signals = Vec<Self::State>;
    type Options = MyRuleOptions;

    // ...
}
A rule can retrieve its options with:
let options = ctx.options();
The compiler should warn you that MyRuleOptions does not implement some required traits.
We currently require implementing serde’s traits Deserialize/Serialize.
You can simply use a derive macro:
use serde::{Deserialize, Serialize};
#[cfg(feature = "schemars")]
use schemars::JsonSchema;

#[derive(Debug, Default, Clone, Serialize, Deserialize)]
#[cfg_attr(feature = "schemars", derive(JsonSchema))]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
pub struct MyRuleOptions {
    #[serde(default, skip_serializing_if = "is_default")]
    main_behavior: Behavior,

    #[serde(default, skip_serializing_if = "is_default")]
    extra_behaviors: Vec<Behavior>,
}

// `Behavior` needs the serde traits too, because `MyRuleOptions` contains it
#[derive(Debug, Default, Clone, Serialize, Deserialize)]
#[cfg_attr(feature = "schemars", derive(JsonSchema))]
pub enum Behavior {
    #[default]
    A,
    B,
    C,
}
§Deprecate a rule
There are occasions when a rule must be deprecated, to avoid breaking changes. There can be multiple reasons for a deprecation.
To do so, the macro accepts an additional field that holds the reason for the deprecation:
use biome_analyze::declare_rule;

declare_rule! {
    /// Disallow the use of `var`.
    ///
    /// ## Examples
    ///
    /// ### Invalid
    ///
    /// ```js,expect_diagnostic
    /// var a, b;
    /// ```
    pub(crate) NoVar {
        version: "1.0.0",
        name: "noVar",
        deprecated: "Use the rule `noAnotherVar`",
        recommended: false,
    }
}
§Custom Visitors
Some lint rules may need to deeply inspect the child nodes of a query match
before deciding on whether they should emit a signal or not. These rules can be
inefficient to implement using the query system, as they will lead to redundant
traversal passes being executed over the same syntax tree. To make this more
efficient, you can implement a custom Queryable type and associated Visitor
to emit it as part of the analyzer’s main traversal pass. As an
example, here’s how this could be done to implement the useYield rule:
// First, create a visitor struct that holds a stack of function syntax nodes and booleans
#[derive(Default)]
struct MissingYieldVisitor {
    stack: Vec<(AnyFunctionLike, bool)>,
}

// Implement the `Visitor` trait for this struct
impl Visitor for MissingYieldVisitor {
    type Language = JsLanguage;

    fn visit(
        &mut self,
        event: &WalkEvent<SyntaxNode<Self::Language>>,
        mut ctx: VisitorContext<Self::Language>,
    ) {
        match event {
            WalkEvent::Enter(node) => {
                // When the visitor enters a function node, push a new entry on the stack
                if let Some(node) = AnyFunctionLike::cast_ref(node) {
                    self.stack.push((node, false));
                }

                if let Some((_, has_yield)) = self.stack.last_mut() {
                    // When the visitor enters a `yield` expression, set the
                    // `has_yield` flag for the top entry on the stack to `true`
                    if JsYieldExpression::can_cast(node.kind()) {
                        *has_yield = true;
                    }
                }
            }
            WalkEvent::Leave(node) => {
                // When the visitor exits a function, if it matches the node of the top-most
                // entry of the stack and the `has_yield` flag is `false`, emit a query match
                if let Some(exit_node) = AnyFunctionLike::cast_ref(node) {
                    if let Some((enter_node, has_yield)) = self.stack.pop() {
                        debug_assert_eq!(enter_node, exit_node);
                        if !has_yield {
                            ctx.match_query(MissingYield(enter_node));
                        }
                    }
                }
            }
        }
    }
}

// Declare a query match struct type containing a JavaScript function node
pub(crate) struct MissingYield(AnyFunctionLike);

impl QueryMatch for MissingYield {
    fn text_range(&self) -> TextRange {
        self.0.range()
    }
}

// Implement the `Queryable` trait for this type
impl Queryable for MissingYield {
    // `Input` is the type that `ctx.match_query()` is called with in the visitor
    type Input = Self;
    type Language = JsLanguage;
    // `Output` is the type that `ctx.query()` will return in the rule
    type Output = AnyFunctionLike;
    type Services = ();

    fn build_visitor(
        analyzer: &mut impl AddVisitor<Self::Language>,
        _: &<Self::Language as Language>::Root,
    ) {
        // Register our custom visitor to run in the `Syntax` phase
        analyzer.add_visitor(Phases::Syntax, MissingYieldVisitor::default);
    }

    // Extract the output object from the input type
    fn unwrap_match(_services: &ServiceBag, query: &Self::Input) -> Self::Output {
        query.0.clone()
    }
}

impl Rule for UseYield {
    // Declare the custom `MissingYield` queryable as the rule's query
    type Query = MissingYield;

    fn run(ctx: &RuleContext<Self>) -> Self::Signals {
        // Read the function's root node from the queryable output
        let query: &AnyFunctionLike = ctx.query();
        // ...
    }
}
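The stack technique used by this visitor can be tried out in isolation. The following std-only sketch replaces the syntax tree with a flat list of enter and leave events; `Kind`, `Event`, `MissingYieldSketch` and `functions_without_yield` are hypothetical names and not Biome APIs, and nodes are identified by plain integers.

```rust
/// Hypothetical, simplified node kinds.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Kind {
    Function,
    Yield,
    Other,
}

/// Mirrors the enter/leave events of a preorder tree walk.
/// The `usize` is an arbitrary node identifier.
#[derive(Clone, Copy, Debug)]
pub enum Event {
    Enter(usize, Kind),
    Leave(usize, Kind),
}

#[derive(Default)]
pub struct MissingYieldSketch {
    /// One entry per function currently being traversed:
    /// the function id and whether a `yield` was seen directly in it.
    stack: Vec<(usize, bool)>,
    /// Ids of the functions that were left without seeing a `yield`.
    pub matches: Vec<usize>,
}

impl MissingYieldSketch {
    pub fn visit(&mut self, event: Event) {
        match event {
            // Entering a function pushes a fresh entry on the stack.
            Event::Enter(id, Kind::Function) => self.stack.push((id, false)),
            // A `yield` only flags the innermost enclosing function.
            Event::Enter(_, Kind::Yield) => {
                if let Some((_, has_yield)) = self.stack.last_mut() {
                    *has_yield = true;
                }
            }
            // Leaving a function pops its entry and records a match
            // when no `yield` was seen.
            Event::Leave(id, Kind::Function) => {
                if let Some((entered, has_yield)) = self.stack.pop() {
                    debug_assert_eq!(entered, id);
                    if !has_yield {
                        self.matches.push(entered);
                    }
                }
            }
            _ => {}
        }
    }
}

/// Runs the visitor over a full event stream in a single pass.
pub fn functions_without_yield(events: &[Event]) -> Vec<usize> {
    let mut visitor = MissingYieldSketch::default();
    for event in events {
        visitor.visit(*event);
    }
    visitor.matches
}
```

Because a `yield` only marks the top of the stack, an outer function that merely contains an inner generator is still reported, and the whole tree is walked once regardless of nesting depth.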
§declare_rule
This macro is used to declare an analyzer rule type, and implement the RuleMeta trait for it.
The macro itself expects the following syntax:
use biome_analyze::declare_rule;

declare_rule! {
    /// Documentation
    pub(crate) ExampleRule {
        version: "next",
        name: "myRuleName",
        recommended: false,
    }
}
§Category Macro
Declaring a rule using declare_rule! will cause a new rule_category!
macro to be declared in the surrounding module. This macro can be used to
refer to the corresponding diagnostic category for this lint rule, if it
has one. Using this macro instead of getting the category for a diagnostic
by dynamically parsing its string name has the advantage of statically
injecting the category at compile time and checking that it is correctly
registered to the biome_diagnostics library.
declare_rule! {
    /// Documentation
    pub(crate) ExampleRule {
        version: "next",
        name: "myRuleName",
        recommended: false,
    }
}

impl Rule for ExampleRule {
    fn diagnostic(ctx: &RuleContext<Self>, _state: &Self::State) -> Option<RuleDiagnostic> {
        Some(RuleDiagnostic::new(
            rule_category!(),
            ctx.query().text_trimmed_range(),
            "message",
        ))
    }
}
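The idea of a macro that declares a second macro in the surrounding module can be illustrated with a toy example. This is not how declare_rule! is implemented: `declare_rule_sketch!` and `rule_category_sketch!` are hypothetical, the `lint/<group>/<name>` category format is an assumption, and the sketch does not check that the category is registered anywhere.

```rust
// Toy counterpart of `declare_rule!`: declares the rule type and, next to
// it, a macro that expands to the rule's category string.
macro_rules! declare_rule_sketch {
    ($rule:ident, $group:literal, $name:literal) => {
        pub struct $rule;

        // Declared in the module surrounding the macro invocation, so code
        // written after the declaration can use it.
        macro_rules! rule_category_sketch {
            () => {
                // `concat!` builds the string at compile time.
                concat!("lint/", $group, "/", $name)
            };
        }
    };
}

declare_rule_sketch!(ExampleRule, "nursery", "myRuleName");

/// The category is a `&'static str` produced at compile time: nothing is
/// parsed or looked up while the program runs.
pub fn example_rule_category() -> &'static str {
    rule_category_sketch!()
}
```

Since the category is produced during macro expansion, a typo in the group or rule name is fixed in one place, at the declaration, rather than repeated at every diagnostic site.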
§Semantic Model
The semantic model provides information about the references of a binding (variable) within a program, indicating if it is written (e.g., const a = 4), read (e.g., const b = a, where a is read), or exported.
§How to use the query Semantic<> in a lint rule
We have a for loop that creates an index i, and we need to identify where this index is used inside the body of the loop:
for (let i = 0; i < array.length; i++) {
    array[i] = i
}
To get started, we need to create a new rule using the semantic query type type Query = Semantic<JsForStatement>;.
We can now use ctx.model() to get information about bindings in the for loop.
impl Rule for ForLoopCountReferences {
    type Query = Semantic<JsForStatement>;
    type State = ();
    type Signals = Option<Self::State>;
    type Options = ();

    fn run(ctx: &RuleContext<Self>) -> Self::Signals {
        let node = ctx.query();

        // The model holds all the semantic information, like scopes and declarations
        let model = ctx.model();

        // Here we are extracting the `let i = 0;` declaration in the for loop
        let initializer = node.initializer()?;
        let declarators = initializer.as_js_variable_declaration()?.declarators();
        let initializer = declarators.first()?.ok()?;
        let initializer_id = initializer.id().ok()?;

        // Now we have the binding of this declaration
        let binding = initializer_id
            .as_any_js_binding()?
            .as_js_identifier_binding()?;

        // How many times this variable appears in the code
        let count = binding.all_references(model).count();

        // Get all read references
        let readonly_references = binding.all_reads(model);

        // Get all write references
        let write_references = binding.all_writes(model);

        // Placeholder: this example only collects references and never emits a signal
        None
    }
}
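The distinction between read and write references can be pictured with a small stand-alone model. The sketch below does not use Biome's semantic model: `ReferenceKind`, `BindingSketch` and `loop_index_binding` are hypothetical, and the classification of each use of i in the loop above is written by hand as an assumption (in particular, `i++` is recorded only as a write), so the real model may classify references differently.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ReferenceKind {
    Read,
    Write,
}

/// Hypothetical stand-in for a binding together with its references.
pub struct BindingSketch {
    pub name: &'static str,
    pub references: Vec<ReferenceKind>,
}

impl BindingSketch {
    /// Every reference to the binding, in source order.
    pub fn all_references(&self) -> impl Iterator<Item = ReferenceKind> + '_ {
        self.references.iter().copied()
    }

    /// Only the references that read the binding.
    pub fn all_reads(&self) -> impl Iterator<Item = ReferenceKind> + '_ {
        self.all_references().filter(|kind| *kind == ReferenceKind::Read)
    }

    /// Only the references that write to the binding.
    pub fn all_writes(&self) -> impl Iterator<Item = ReferenceKind> + '_ {
        self.all_references().filter(|kind| *kind == ReferenceKind::Write)
    }
}

/// Hand-written references for `i` in
/// `for (let i = 0; i < array.length; i++) { array[i] = i }`.
pub fn loop_index_binding() -> BindingSketch {
    BindingSketch {
        name: "i",
        references: vec![
            ReferenceKind::Read,  // i < array.length
            ReferenceKind::Write, // i++ (assumed to count as a write only)
            ReferenceKind::Read,  // array[i]
            ReferenceKind::Read,  // = i
        ],
    }
}
```

Under these assumptions the binding has four references, three of which are reads, which is the kind of information a rule would use to decide whether the index is only used to access the array.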
Re-exports§
pub use crate::options::AnalyzerConfiguration;
pub use crate::options::AnalyzerOptions;
pub use crate::options::AnalyzerRules;
Modules§
Macros§
- The category_concat! macro is a variant of category! using a slightly different syntax, for use in the declare_group and declare_rule macros in the analyzer
- This macro is used by the codegen script to declare an analyzer rule group, and implement the RuleGroup trait for it
- This macro is used to declare an analyzer rule type, and implement the
- Creates a single struct implementing Visitor over a collection of types implementing the NodeVisitor helper trait. Unlike the global Visitor, node visitors are transient: they get instantiated when the traversal enters the corresponding node and destroyed when the node is exited. They are intended as building blocks for creating and managing the state of complex visitors by allowing the implementation to be split over multiple smaller components.
Structs§
- Allows filtering the list of rules that will be executed in a run of the analyzer, and at what source code range signals (diagnostics or actions) may be raised
- The analyzer is the main entry point into the biome_analyze infrastructure. Its role is to run a collection of Visitors over a syntax tree, with each visitor implementing various analysis over this syntax tree to generate auxiliary data structures as well as emit “query match” events to be processed by lint rules and in turn emit “analyzer signals” in the form of diagnostics, code actions or both
- Code Action object returned by the analyzer, generated from a crate::RuleAction with additional information about the rule injected by the analyzer
- Small wrapper for diagnostics during the analysis phase.
- Query type usable by lint rules to match on specific AstNode types
- Simple implementation of AnalyzerSignal generating an AnalyzerDiagnostic from a provided factory function. Optionally, this signal can be configured to also emit a code action, by calling .with_action with a secondary factory function for said action.
- Adapter type wrapping a QueryMatcher type with a function that can be used to inspect the query matches emitted by the analyzer
- Parameters provided to QueryMatcher::match_query and required to run lint rules
- Stores metadata information for all the rules in the registry, sorted alphabetically
- Metadata entry for a rule and its group in the registry
- Code Action object returned by a single analysis rule
- Diagnostic object returned by a single analysis rule
- Opaque identifier for a single rule
- Static metadata containing information about a rule
- The rule registry holds type-erased instances of all active analysis rules for each phase. What defines a phase is the set of services that a phase offers. Currently we have:
- Set of nodes this rule has suppressed from matching its query
- Entry for a pending signal in the signal_queue
- An action meant to suppress a lint rule
- Payload received by the function responsible to mark a suppression comment
- The SyntaxVisitor is the simplest form of visitor implemented for the analyzer, it simply broadcasts each WalkEvent::Enter as a query match event for the SyntaxNode being entered
- Mutable context objects shared by all visitors
- Mutable context objects provided to the finish hook of visitors
Enums§
- The category of a code action, this type maps directly to the CodeActionKind type in the Language Server Protocol specification
- Used to identify the kind of code action emitted by a rule
- Utility type to be used as a default value for the B generic type on analyze when the provided callback never breaks
- Defines all the phases that the RuleRegistry supports.
- Represents which type a given Queryable type can match, either a specific subset of syntax node kinds or any generic type
- The sub-category of a refactor code action
- Allow filtering a single rule or group of rules by their names
- The sub-category of a source code action
- This enum is used to categorize what is disabled by a suppression comment and with what syntax
Traits§
- This trait is implemented on all types that support the registration of Visitor
- Event raised by the analyzer when a Rule emits a diagnostic, a code action, or both
- This trait is implemented for tuples of Rule types of size 1 to 29 if the language of all the groups in the tuple share the same associated Language (which is then aliased as the Language associated type on CategoryLanguage itself). It is used to ensure all the groups in a given category are all querying the same underlying language
- A group category is a collection of rule groups under a given category ID, serving as a broad classification on the kind of diagnostic or code action these rules emit, and allowing whole categories of rules to be disabled at once depending on the kind of analysis being performed
- This trait is implemented for tuples of Rule types of size 1 to 29 if the query type of all the rules in the tuple share the same associated Language (which is then aliased as the Language associated type on GroupLanguage itself). It is used to ensure all the rules in a given group are all querying the same underlying language
- A node visitor is a special kind of visitor that does not have a persistent state for the entire run of the analyzer. Instead these visitors are transient, they get instantiated when the traversal enters the corresponding node type and destroyed when the corresponding node exits
- Defines which phase a rule will run. This will be defined by the set of services a rule demands.
- Marker trait implemented for all the types analyzer visitors may emit
- The QueryMatcher trait is responsible for running lint rules on QueryMatch instances emitted by the various Visitor and pushing signals wrapped in SignalEntry to the signal queue
- Trait implemented for types that lint rules can query in order to emit diagnostics or code actions.
- Trait implemented by all analysis rules: declares interest to a certain AstNode type, and a callback function to be executed on all nodes matching the query to possibly raise an analysis event
- A rule group is a collection of rules under a given name, serving as a “namespace” for lint rules and allowing the entire set of rules to be disabled at once
- Visitors are the main building blocks of the analyzer: they receive syntax WalkEvents, process these events to build secondary data structures from the syntax tree, and emit rule query matches through the crate::RuleRegistry