ts_query_ls 3.14.0

An LSP implementation for Tree-sitter's query files
# An LSP implementation for [tree-sitter](https://tree-sitter.github.io/tree-sitter/) query files

<!-- vim: set spell: -->

## Configuration

Configuration can be done via server initialization or via a configuration file
named `.tsqueryrc.json` located in the project workspace directory, or in any of
its ancestor directories. Below is an example file:

```json
{
  "$schema": "https://raw.githubusercontent.com/ribru17/ts_query_ls/refs/heads/master/schemas/config.json",
  "parser_install_directories": ["${HOME}/my/parser", "/installation/paths"],
  "parser_aliases": {
    "ecma": "javascript"
  },
  "language_retrieval_patterns": [
    "languages/src/([^/]+)/[^/]+\\.scm$"
  ],
  "valid_captures": {
    "highlights": {
      "variable": "Simple identifiers",
      "variable.parameter": "Parameters of a function"
    }
  },
  "supported_abi_versions": {
    "start": 13,
    "end": 15
  },
  "valid_predicates": {
    "eq": {
      "parameters": [
        {
          "type": "capture",
          "arity": "required"
        },
        {
          "type": "any",
          "arity": "required"
        }
      ],
      "description": "Checks for equality between two nodes, or a node and a string.",
      "any": true
    }
  }
}
```

### Configuration options

#### `parser_install_directories`

A list of strings representing directories to search for parsers, of the form
`<lang>.(so|dll|dylib|wasm)` or `tree-sitter-<lang>.(so|dll|dylib|wasm)`.

Supports environment variable expansion of the form `${VAR}`.

**NOTE:** Directories are **NOT** searched recursively. Only immediate children
will be scanned. If you have the sort of file structure where each parser object
is stored in its own directory, consider creating one main directory which
contains symlinks to all of your parsers, and pass that as your parser install
directory.
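
The symlink approach suggested in the note can be sketched as follows. This is only an illustration: `link_parsers` and the paths in the example call are hypothetical names, not something shipped with `ts_query_ls`.

```sh
# link_parsers ROOT FLAT_DIR
# Hypothetical helper (not part of ts_query_ls): symlinks every parser object
# found anywhere below ROOT into the single flat directory FLAT_DIR.
# ROOT should be an absolute path so that the links resolve from anywhere.
link_parsers() {
  root=$1
  flat_dir=$2
  mkdir -p "$flat_dir"
  find "$root" -type f \
    \( -name '*.so' -o -name '*.dll' -o -name '*.dylib' -o -name '*.wasm' \) |
  while IFS= read -r parser; do
    # -f replaces an existing link of the same name
    ln -sf "$parser" "$flat_dir/$(basename "$parser")"
  done
}

# Example call (paths are placeholders):
# link_parsers "$HOME/parsers" "$HOME/.local/share/ts-parsers"
```

The flat directory can then be listed in `parser_install_directories`. Note that if two parser objects share a file name, the later one replaces the earlier link.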

#### `parser_aliases`

A map of parser aliases. E.g., to point `queries/ecma/*.scm` files to the
`javascript` parser:

```json
{
  "parser_aliases": {
    "ecma": "javascript"
  }
}
```

#### `language_retrieval_patterns`

A list of patterns to aid the LSP in finding a language, given a file path.
Patterns must have one capture group which represents the language name, and
are ordered from highest to lowest precedence. E.g., for `zed` support:

```json
{
  "language_retrieval_patterns": [
    "languages/src/([^/]+)/[^/]+\\.scm$"
  ]
}
```

**NOTE:** The following fallbacks are _always_ provided:

- `tree-sitter-([^/]+)/queries/[^/]+\.scm$`
- `queries/([^/]+)/[^/]+\.scm$`
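
To see how a capture group yields the language name, the lookup can be approximated in the shell. This is a rough sketch, not the server's implementation: `language_for_path` is a hypothetical helper, `sed -E` only approximates the server's regex engine, and it assumes the built-in fallbacks are tried after user-supplied patterns.

```sh
# language_for_path PATH PATTERN...
# Hypothetical helper: tries each pattern in order and prints the first
# capture group of the first pattern that matches PATH.
language_for_path() {
  path=$1
  shift
  for pattern in "$@"; do
    # Patterns are unanchored at the start, hence the leading ^.*
    lang=$(printf '%s\n' "$path" | sed -nE "s#^.*${pattern}#\\1#p")
    if [ -n "$lang" ]; then
      printf '%s\n' "$lang"
      return 0
    fi
  done
  return 1
}

# User pattern first, then the two fallbacks; prints "rust"
language_for_path "/repo/languages/src/rust/highlights.scm" \
  'languages/src/([^/]+)/[^/]+\.scm$' \
  'tree-sitter-([^/]+)/queries/[^/]+\.scm$' \
  'queries/([^/]+)/[^/]+\.scm$'
```

Note the single backslash in `\.scm$` here: the doubled `\\` in the JSON example is only JSON string escaping.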

#### `diagnostic_options`

An optional object specifying diagnostic style preferences. Currently supported
options are:

- `string_argument_style`
  - The style for predicate string arguments
  - Default: `none`
  - Possible values:
    - `none`
    - `prefer_quoted`
    - `prefer_unquoted`
- `warn_unused_underscore_captures`
  - Whether to warn on `_`-prefixed captures which are not referenced by a
    predicate or directive
  - Default: `true`
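
As an illustration, the options above could be passed on the command line through the `--config` flag described in the standalone tool section. The values chosen here are arbitrary examples, and the `./queries` path is a placeholder.

```sh
# Example diagnostic preferences (values picked for illustration only)
config='{
  "diagnostic_options": {
    "string_argument_style": "prefer_quoted",
    "warn_unused_underscore_captures": false
  }
}'

# Only invoke the linter when the binary is available on PATH
if command -v ts_query_ls >/dev/null 2>&1; then
  ts_query_ls lint ./queries --config "$config"
fi
```

The same `diagnostic_options` object can be placed in `.tsqueryrc.json` instead.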

#### `valid_captures`

A map from query file name to valid captures. Valid captures are represented as
a map from capture name (sans `@`) to a short (markdown format) description.
Note that captures prefixed with an underscore are always permissible.

```json
{
  "valid_captures": {
    "highlights": {
      "variable": "Simple identifiers",
      "variable.parameter": "Parameters of a function"
    }
  }
}
```

#### `valid_predicates`

A map of predicate names (sans `#` and `?`) to parameter specifications.

Parameters can be one or both of two types (a capture, a string, or, as in the
`eq` example above, `any` for either), and can be required, optional, or
"variadic" (there can be zero-to-many of them). Optional parameters cannot be
followed by required parameters, and a variadic parameter may only appear once,
as the last parameter.

Parameters can also be given **constraints** which are checked when they are
string values (not captures). The optional `constraint` field accepts the
following values:

- `none`: no constraint enforced (default)
- `integer`: parameter must be a valid integer
- `named_node`: parameter must be a named node kind
- `enum`: parameter must be one of the specified values

```json
{
  "valid_predicates": {
    "any-of": {
      "parameters": [
        {
          "type": "capture",
          "arity": "required"
        },
        {
          "type": "string",
          "arity": "required",
          "constraint": {
            "enum": ["0", "1"]
          }
        },
        {
          "type": "string",
          "arity": "variadic",
          "constraint": "integer"
        }
      ],
      "description": "Checks for equality between multiple strings"
    }
  }
}
```
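
The arity ordering rules stated earlier in this section (no required parameter after an optional one, at most one variadic parameter, and only in last position) can be sketched as a small check. `check_arities` is a hypothetical helper written for illustration; it is not how the server validates configuration.

```sh
# check_arities ARITY...
# Returns 0 when the sequence of arities is well-formed, 1 otherwise.
check_arities() {
  seen_optional=0
  seen_variadic=0
  for arity in "$@"; do
    # Nothing may follow a variadic parameter
    if [ "$seen_variadic" -eq 1 ]; then
      return 1
    fi
    case $arity in
      required)
        # A required parameter may not follow an optional one
        if [ "$seen_optional" -eq 1 ]; then
          return 1
        fi
        ;;
      optional) seen_optional=1 ;;
      variadic) seen_variadic=1 ;;
      *) return 1 ;;
    esac
  done
  return 0
}

# The "any-of" example above: required, required, variadic
check_arities required required variadic && echo "valid"
check_arities optional required || echo "invalid"
```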

Predicates are special because they can also accept two other properties: `not`
(`boolean`, default `true`), and `any` (`boolean`, default `false`). `not` means
that the predicate supports a `not-` prefixed version of itself, which acts as
its negation, and `any` means that it supports an `any-` prefixed version of
itself, which holds true if any of the nodes in a quantified capture hold true.
If both properties are `true`, then there will also be a predicate of the form
`#not-any-foo?`.
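
The effect of the two flags on the accepted predicate names can be enumerated with a short sketch. `predicate_variants` is a hypothetical helper that only mirrors the naming scheme described above, using the same defaults (`not` is `true`, `any` is `false`).

```sh
# predicate_variants NAME [NOT] [ANY]
# Prints every predicate name accepted for NAME under the given flags.
predicate_variants() {
  name=$1
  not=${2:-true}
  any=${3:-false}
  printf '#%s?\n' "$name"
  if [ "$not" = true ]; then
    printf '#not-%s?\n' "$name"
  fi
  if [ "$any" = true ]; then
    printf '#any-%s?\n' "$name"
  fi
  if [ "$not" = true ] && [ "$any" = true ]; then
    printf '#not-any-%s?\n' "$name"
  fi
}

# The "eq" entry from the example file sets "any": true and keeps "not" at
# its default, so this prints #eq?, #not-eq?, #any-eq? and #not-any-eq?
predicate_variants eq true true
```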

#### `valid_directives`

Same as `valid_predicates`, but for directives (e.g. `#foo!`).

#### `supported_abi_versions`

An inclusive range of ABI versions supported by your tool. The end of the range
must be greater than or equal to the start.

```json
{
  "supported_abi_versions": {
    "start": 13,
    "end": 15
  }
}
```

### Example setup (for Neovim):

```lua
-- Disable the (slow) builtin query linter
vim.g.query_lint_on = {}

vim.api.nvim_create_autocmd('FileType', {
  pattern = 'query',
  callback = function(ev)
    if vim.bo[ev.buf].buftype == 'nofile' then
      return
    end
    vim.lsp.start {
      name = 'ts_query_ls',
      cmd = { '/path/to/ts_query_ls/target/release/ts_query_ls' },
      root_dir = vim.fs.root(0, { '.tsqueryrc.json', 'queries' }),
      -- OPTIONAL: Override the query omnifunc
      on_attach = function(_, buf)
        vim.bo[buf].omnifunc = 'v:lua.vim.lsp.omnifunc'
      end,
      init_options = {
        parser_install_directories = {
          -- If using nvim-treesitter with lazy.nvim
          vim.fs.joinpath(
            vim.fn.stdpath('data'),
            '/lazy/nvim-treesitter/parser/'
          ),
        },
        parser_aliases = {
          ecma = 'javascript',
        },
        language_retrieval_patterns = {
          'languages/src/([^/]+)/[^/]+\\.scm$',
        },
      },
    }
  end,
})
```

### Run in VSCode

This repo provides a very minimal extension for usage within VSCode. First,
ensure you have a `scheme` extension installed for VSCode to recognize `.scm`
files as the `scheme` file type. Next, enter the `client/vscode` directory and
install dependencies with `npm i`. From that directory, start VSCode (`code .`).
Then you can press `F5`, or go into the debug menu and click `Run Extension`,
and this will activate the extension, building and starting the language server.

## Features

- Go to definition, references, renaming for captures
  - Captures, unlike node names, are treated like "variables" in a sense. They
    have definitions and references, and can be conveniently renamed. See the
    example below:

    ```query
    (function_definition
      name: (identifier) @function.builtin ; this is a capture *definition*
      (#eq? @function.builtin "print")) ; this is a capture *reference*
    ```
- Completions for valid node names, field names, and allowable captures and
  predicates/directives
  - Node and field names are determined by the installed language object, while
    allowable capture and predicate/directive names are specified in the
    language server configuration.
- Diagnostics for impossible patterns, invalid node names, invalid syntax, etc.
  - This language server strives for 1:1 parity with tree-sitter's query errors,
    to catch issues before they happen. If you notice a query error that was not
    caught by the server (or a false positive from the server), please report an
    issue!
- Formatting and analysis of query workspaces (see the
  [standalone tool section](#standalone-tool))
- Support for importing query modules from other queries
  - The language server will recognize comments of the form
    `; inherits: foo,bar` as "import statements", and will act as though these
    query files were imported in the current one. It will provide additional
    diagnostics and allow you to jump to the imported files using the go to
    definition functionality.
  - **IMPORTANT!** This comment _must_ be the _first line_ in the file,
    otherwise it will not be recognized. There must be exactly one space after
    `inherits:`, and there must be no spaces after the following comma(s). Files
    will be searched across the workspace until one matches a valid
    `language_retrieval_patterns` entry. Note that imported files will match the
    query type of the original: e.g., `; inherits: foo` inside
    `bar/highlights.scm` will retrieve `foo/highlights.scm`, and not
    `foo/folds.scm`.
  - Query files will not be searched within hidden directories or `gitignore`d
    locations.
- Support for hover, selection range, document symbols, semantic tokens, code
  actions, and document highlight
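
The formatting rules for the `; inherits:` comment listed above are easy to get subtly wrong, so a quick pre-check can help. The following is a rough sketch: `inherits_status` is a hypothetical helper, and its regular expressions are an approximation of the stated rules rather than the server's actual parser.

```sh
# inherits_status FILE
# Prints "recognized" when the first line has the form "; inherits: a,b",
# "malformed-or-misplaced" when an inherits comment exists but breaks the
# rules, and "none" when the file has no such comment.
inherits_status() {
  file=$1
  if head -n 1 "$file" | grep -Eq '^; inherits: [^ ,]+(,[^ ,]+)*$'; then
    echo recognized
  elif grep -Eq '^;+ *inherits *:' "$file"; then
    echo malformed-or-misplaced
  else
    echo none
  fi
}

# Example call (path is a placeholder):
# inherits_status queries/typescript/highlights.scm
```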

## Standalone tool

### Formatter

The language server can be used as a standalone formatter by passing the
`format` argument, e.g. `ts_query_ls format ./queries`. The command can accept
multiple directories to format. It can also run in "check" mode by passing the
`--check` (`-c`) flag, which will only validate formatting without writing to
the files.

```sh
# Use this command for the full documentation
ts_query_ls format --help
```

> **NOTE:** You can ignore formatting for a node by placing a `; format-ignore`
> comment before it.

### CI Tool

The language server can also be used as a standalone CI tool by passing the
`check` argument, e.g.:

```sh
ts_query_ls check ./queries --config \
'{"parser_install_directories": ["/home/jdoe/Documents/parsers/"]}'
```

The command can accept a list of directories to search for queries, as well as a
flag to pass JSON configuration to the server (needed to detect parser
locations). If no configuration flag is passed, the command will attempt to read
it from the `.tsqueryrc.json` configuration file in the current directory. The
command also accepts a `--format` (`-f`) flag which instructs it to also check
formatting for the given directories. Quick fixes can be applied to supported
diagnostics by passing the `--fix` flag. If no directories are specified to be
checked, then the command will search for all queries in the current directory.

It may also be useful to specify the workspace directory with the `--workspace`
flag (defaults to the current directory). This is the directory that will be
scanned for query modules when `; inherits` is used.

> **NOTE:** This command performs a superset of the work done by the lint
> command; it reads the query's language to validate query structure, node
> names, etc.

```sh
# Use this command for the full documentation
ts_query_ls check --help
```

### Linter

The server can be used as a general linter which can operate without access to
the underlying parser objects. The following command will lint the `queries`
directory, meaning it will scan it for invalid capture names or invalid
predicate signatures, as defined by the configuration. Configuration can be
passed in via the `--config` flag, or it will be read from the current directory
if no flag is passed. Quick fixes can be applied to supported diagnostics by
passing the `--fix` flag.

```sh
ts_query_ls lint ./queries
# Use this command for the full documentation
ts_query_ls lint --help
```

### Profiler

The server can be used to profile individual query patterns to check for
patterns which are very slow to compile (often because they are too complex).
This can be done via the `profile` subcommand, which prints each pattern's file
path, start line, and the time (in milliseconds) that it took to compile.
Alternatively, it can also time the entire query file itself (rather than each
pattern inside of it).

```sh
ts_query_ls profile ./queries
# Use this command for the full documentation
ts_query_ls profile --help
```

> **NOTE:** This command will not warm up the cache for you, so it may be best
> to run more than once.

## Checklist

- [x] References for captures
- [x] Renaming captures
- [x] Completions for capture names in a pattern (for predicates)
- [x] Completions for node names
- [x] Fix utility functions, making them robust when it comes to UTF-16 code
      points
- [x] Go to definition for captures
- [x] Recognition/completion of supertypes (requires `tree-sitter 0.25`)
- [x] Completions and diagnostics for a supertype's subtypes
  - Requires <https://github.com/tree-sitter/tree-sitter/pull/3938>
- [x] Completions for field names
- [x] Diagnostics for unrecognized nodes
- [x] Diagnostics for referencing undefined capture groups in predicates
- [x] Diagnostics for incorrect syntax
- [x] Diagnostics for impossible patterns
  - ~~Currently not possible without a full (sometimes expensive) run of the
    query file. This should either be implemented as a user command, or core
    methods should be exposed to gather pattern information more efficiently~~
  - For now, this has been made possible due to caching and spawning query scans
    on a separate, blocking thread. Ideally, in the future, the kinks of query
    creation will be ironed out so query creation will be quicker, and this
    logic can be simplified
- [x] Recognize parsers built for `WASM`
- [x] Document formatting compatible with the `nvim-treesitter` formatter
- [x] Code cleanup
- [x] Add tests for all* functionality
- [x] Support for importing query modules via the `; inherits: foo` modeline

### Packaging

- [x] [`homebrew`](https://github.com/Homebrew/homebrew-core)
- [x] [`nixpkgs`](https://github.com/NixOS/nixpkgs)
- [x] [`mason.nvim`](https://github.com/mason-org/mason-registry)
- [x] [`AUR`](https://aur.archlinux.org/)

And others?

## References

Many thanks to @lucario387, and the
[`asm-lsp`](https://github.com/bergercookie/asm-lsp),
[`jinja-lsp`](https://github.com/uros-5/jinja-lsp),
[`beancount-language-server`](https://github.com/polarmutex/beancount-language-server),
and [`helix-editor`](https://github.com/helix-editor/helix) projects for the
amazing code that I took inspiration from!