tree-sitter-ktav
A tree-sitter grammar for Ktav (כְּתָב) — the Written Configuration Format.
Languages: English · Русский · 简体中文
Playground: convert JSON / YAML / TOML / INI ⇄ Ktav in your browser at ktav-lang.github.io.
What is Ktav?
Ktav (Hebrew כְּתָב, "writing") is a plain-text configuration
format. Its data model is JSON-shaped (scalars, arrays, objects, null,
booleans), but strings need no quotes, there are no commas, and dotted
keys (server.port: 8080) express nesting. The full specification, the
same one all official Ktav implementations target, lives in the
ktav-lang/spec repository.
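To make the dotted-key idea concrete, here is a minimal sketch of how a flat key such as server.port maps onto a nested object. It is an illustration only: `expandDottedKeys` is a hypothetical helper, not part of any Ktav implementation, and it ignores whatever conflict and validation rules the specification defines.

```javascript
// Illustration only: expand flat dotted keys into nested objects,
// mirroring how a key like `server.port` denotes nesting.
// Hypothetical helper; not a Ktav parser and not spec-complete.
function expandDottedKeys(flat) {
  const root = {};
  for (const [dotted, value] of Object.entries(flat)) {
    const parts = dotted.split('.');
    let cursor = root;
    // Walk (and create on demand) every intermediate object.
    for (const part of parts.slice(0, -1)) {
      if (typeof cursor[part] !== 'object' || cursor[part] === null) {
        cursor[part] = {};
      }
      cursor = cursor[part];
    }
    // The last segment names the leaf that holds the value.
    cursor[parts[parts.length - 1]] = value;
  }
  return root;
}

const nested = expandDottedKeys({ 'server.port': 8080, 'server.host': 'localhost' });
console.log(JSON.stringify(nested));
// prints {"server":{"port":8080,"host":"localhost"}}
```

Both keys share the server prefix, so they end up as siblings inside one nested object.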
What is tree-sitter?
Tree-sitter is an incremental parser generator. Editors (Neovim, Helix, Emacs, VS Code, Zed, …) use it for syntax highlighting, code folding, structural selection, and other features that need a real parse tree rather than a regex-based tokenizer. Each language ships its own grammar package; this crate / npm package is the one for Ktav.
Installation
Rust (tree-sitter crate)
[dependencies]
tree-sitter = "0.25"
tree-sitter-ktav = "0.6.0"
Node.js (tree-sitter package)
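const Parser = require('tree-sitter');
const Ktav = require('tree-sitter-ktav');

const parser = new Parser();
parser.setLanguage(Ktav);

// Sample input; any Ktav source string works here.
const tree = parser.parse('server.port: 8080\n');
console.log(tree.rootNode.toString());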
Editor integration
Neovim (with nvim-treesitter)
Until the grammar is upstreamed, register it manually in your config:
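require("nvim-treesitter.parsers").get_parser_configs().ktav = {
  install_info = {
    url = "https://github.com/ktav-lang/tree-sitter-ktav",
    -- Adjust this list if the grammar also ships an external scanner.
    files = { "src/parser.c" },
    branch = "main",
  },
  filetype = "ktav",
}
vim.filetype.add({ extension = { ktav = "ktav" } })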
Then :TSInstall ktav. Drop queries/highlights.scm,
queries/locals.scm, and queries/injections.scm into your
~/.config/nvim/queries/ktav/ directory (or let nvim-treesitter pick
them up from the parser repo).
Helix
Add to ~/.config/helix/languages.toml:
[[language]]
name = "ktav"
scope = "source.ktav"
file-types = ["ktav"]
roots = []
comment-token = "#"
indent = { tab-width = 4, unit = " " }

[[grammar]]
name = "ktav"
source = { git = "https://github.com/ktav-lang/tree-sitter-ktav", rev = "main" }
Then hx --grammar fetch && hx --grammar build.
Other editors
The grammar exports the standard tree-sitter node-type metadata
(src/node-types.json) and queries (queries/*.scm), so any editor
with tree-sitter support can consume it once the parser is built.
Node types
The grammar produces the following named nodes:
| Node | What it captures |
|---|---|
| source_file | the whole document |
| comment | a #-line comment |
| blank_line | an empty line |
| object_pair | key SEP value line |
| key / dotted_key | the key portion (with optional . separators) |
| sep_string / sep_raw / sep_int / sep_float | the four separators |
| keyword / kw_null / kw_true / kw_false | keywords |
| scalar | the catch-all single-line value body |
| compound_object | { … } block |
| compound_array | [ … ] block |
| array_item | a single item in an array |
| multiline_stripped | ( … ) block |
| multiline_verbatim | (( … )) block |
| multiline_content_line | a single line inside a multi-line string |
| empty_object / empty_array / empty_paren / empty_double_paren | inline empty forms |
| empty_value | separator immediately followed by EOL |
object_pair exposes key, separator, and value as fields;
array_item exposes marker (optional) and value.
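The fields described above can be read with the field accessor that the Node.js tree-sitter bindings provide on syntax nodes. The following is a rough sketch, assuming each node offers type, text, namedChildren, and childForFieldName(name) as in node-tree-sitter; `collectPairs` is a hypothetical helper name, not something this package exports.

```javascript
// Sketch: collect every object_pair in a (sub)tree as plain text triples.
// `node` is assumed to behave like a node-tree-sitter SyntaxNode, i.e. it
// has `type`, `text`, `namedChildren`, and `childForFieldName(name)`.
function collectPairs(node, out = []) {
  if (node.type === 'object_pair') {
    // Read a field's source text, or null when the field is absent.
    const fieldText = (name) => {
      const child = node.childForFieldName(name);
      return child ? child.text : null;
    };
    out.push({
      key: fieldText('key'),
      separator: fieldText('separator'),
      value: fieldText('value'),
    });
  }
  // Recurse so that pairs nested inside compound values are found too.
  for (const child of node.namedChildren) {
    collectPairs(child, out);
  }
  return out;
}
```

With a tree produced by the parser, calling collectPairs(tree.rootNode) would return one entry per key/value line, in document order, including pairs nested inside { … } blocks.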
Building from source
The generated src/parser.c, src/grammar.json, src/node-types.json,
and src/tree_sitter/* are checked into git per tree-sitter convention,
so consumers do not need the CLI to build.
Status
0.6.0 — implements Ktav 0.6.0.
The grammar accepts every valid Ktav 0.6.0 document (verified against
all tests/valid/*.ktav fixtures from the spec repo). It is a
syntactic accepter, not a strict spec validator — see
CHANGELOG.md "Known limitations" for the small
number of pathological cases the grammar accepts that the spec
rejects (mostly missing-whitespace-after-marker — § 6.10).
License
Dual-licensed under MIT OR Apache-2.0 — see LICENSE-MIT and LICENSE-APACHE.