Ktav (כְּתָב)
A plain configuration format. JSON5-shaped, but without the quotes, without the commas, with dotted keys for nesting. Native
serdeintegration.
Languages: English · Русский · 简体中文
Specification: this crate implements Ktav 0.1. The format is
versioned and maintained independently of this crate — see
ktav-lang/spec for the formal
document.
Name
Ktav (Hebrew: כְּתָב) means "writing, that which is written" — a thing recorded in a form fixed enough that its meaning does not depend on who passes it along. The name fits literally: a config file is ktav on disk, and the library reads it and hands you back a live structure without making anything up along the way.
Motto
Be the config's friend, not its examiner. The config isn't perfect — but it's the best one.
Every rule is local. Every line either stands on its own or depends only on visible brackets. No indentation pitfalls, no forgotten quotes, no trailing-comma arithmetic.
The rules
A Ktav document is an implicit top-level object. Inside any object you have pairs; inside any array you have items.
# comment — any line starting with '#'
key: value — scalar pair; key may be a dotted path (a.b.c)
key:: value — scalar pair; value is ALWAYS a literal string
key: { ... } — multi-line object; `}` closes on its own line
key: [ ... ] — multi-line array; `]` closes on its own line
key: {} / key: [] — empty compound, inline
key: ( ... ) — multi-line string; common indent stripped
key: (( ... )) — multi-line string; verbatim (no stripping)
:: value — inside an array: literal-string item
That's the whole language. No commas, no quotes, no escape inside the
value itself — the only "escape" is the :: marker, and it lives in the
separator (for pairs) or as a line prefix (for array items).
Values and special tokens
Strings
Default for any scalar. Stored internally as Value::String. The value
is whatever follows : after trimming.
name: Russia
path: /etc/hosts
greeting: hello world
# `::` forces a literal string
pattern:: [a-z]+
Numbers
Numbers are written bare (no quotes). At the Value level they are
strings; serde parses them into the target Rust type (u16, i64,
f64, …) using FromStr on deserialization, and formats them with
Display on serialization.
port: 8080
ratio: 3.14159
offset: -42
huge: 1234567890123
A value like port: abc parses fine at the Ktav level (string
"abc"), but serde::deserialize into u16 will return a clear
ParseError.
Booleans: true / false
Strict lowercase. Anything else is a string.
# Value::Bool(true)
on: true
# Value::Bool(false)
off: false
# Value::String("True")
capitalized: True
# Value::String("FALSE")
yelling: FALSE
# Value::String("true")
literal:: true
Null: null
Strict lowercase. Matches Option::None on the Rust side, as well as
() for unit.
# Value::Null
label: null
# Value::String("Null")
capitalized: Null
# Value::String("null")
literal:: null
When serializing, Option::None is emitted as null. Suppress with
#[serde(skip_serializing_if = "Option::is_none")] if you prefer the
field absent.
Empty object / empty array
The only inline compound values allowed — nothing to separate, no commas needed.
# empty object
meta: {}
# empty array
tags: []
Keyword-like strings need ::
If a string's content happens to equal a keyword (true, false,
null) or begin with { or [, the serializer emits ::
automatically so the round-trip is lossless. On the writing side you
do the same:
# the string "true", not a bool
flag:: true
# the string "null", not a null
noun:: null
regex:: [a-z]+
ipv6:: [::1]:8080
template:: {issue.id}.tpl
Compound values are multi-line
Non-empty { ... } / [ ... ] must span multiple lines, with the
closing bracket on its own line. x: { a: 1 } and x: [1, 2, 3] are
rejected with a clear error — Ktav has no comma-separation rules and
no escape mechanism for them.
# rejected — inline non-empty compound
server: { host: 127.0.0.1, port: 8080 }
tags: [primary, eu, prod]
# accepted — multi-line form
server: {
host: 127.0.0.1
port: 8080
}
tags: [
primary
eu
prod
]
Using it from Rust
Ktav is serde-native. Any type implementing Serialize / Deserialize
(including #[derive]-generated ones) round-trips through Ktav out of
the box.
Parse — decode straight into a typed struct
use ;
const SRC: &str = "\
service: web
port:i 8080
ratio:f 0.75
tls: true
tags: [
prod
eu-west-1
]
db.host: primary.internal
db.timeout:i 30
";
let cfg: Config = from_str?;
println!;
Walk — match on the dynamic Value enum
use Value;
let v = parse?;
let Object = &v else ;
for in top
Build & render — construct a document in code
use ;
let mut top = default;
top.insert;
top.insert;
top.insert;
top.insert;
top.insert;
let text = render?;
For typical app use prefer the serde path — ktav::to_string(&cfg) —
and reach for Value only when the schema is dynamic.
Four public entry points: from_str /
from_file for reading, to_string /
to_file for writing. A complete runnable
example lives in examples/basic.rs.
Typed markers
Rust numeric types (u8..u128, i8..i128, usize, isize, f32,
f64) serialize to Ktav with explicit typed markers: port:i 8080,
ratio:f 0.5. Deserialization accepts both typed-marker and plain-string
forms — documents written without markers still work, exactly as before.
NaN / ±Infinity are rejected by the serializer (Ktav 0.1.0 does not
represent them).
Examples: Ktav → JSON5
JSON5 is on the right because it reads like ordinary JavaScript, allows comments, and shows exactly what the parser produces.
1. Scalars
name: Russia
port: 20082
{
name: "Russia",
port: "20082"
}
All scalars come out as strings at the Value level; numeric / boolean
types are parsed through serde when you deserialize into u16 / bool
/ f64 / …
2. Dotted keys = nested objects
server.host: 127.0.0.1
server.port: 8080
app.debug: true
{
server: { host: "127.0.0.1", port: "8080" },
app: { debug: "true" }
}
Any depth works. The full address is on every line.
3. Nested object as a value
server: {
host: 127.0.0.1
port: 8080
endpoints.api: /v1
endpoints.admin: /admin
}
{
server: {
host: "127.0.0.1",
port: "8080",
endpoints: { api: "/v1", admin: "/admin" }
}
}
4. Array of scalars
banned_patterns: [
.*\.onion:\d+
.*:25
]
{
banned_patterns: [".*\\.onion:\\d+", ".*:25"]
}
5. Array of objects
upstreams: [
{
host: a.example
port: 1080
}
{
host: b.example
port: 1080
}
]
{
upstreams: [
{ host: "a.example", port: "1080" },
{ host: "b.example", port: "1080" }
]
}
6. Arbitrary nesting
Every compound value spans multiple lines (single-line { ... } / [ ... ]
with contents is not accepted — only the empty forms {} / [] are
inline). Nest as deep as needed:
countries: [
{
name: Russia
cities: [
{
name: Moscow
buildings: [
{
name: Kremlin
}
{
name: Saint Basil's
}
]
}
{
name: Saint Petersburg
}
]
}
{
name: France
}
]
7. Literal strings: ::
Some values would otherwise be parsed as compound (because they start
with { or [): regular expressions, IPv6 addresses, template
placeholders. The double-colon :: flags them as "raw string, do not
parse further."
pattern:: [a-z]+
ipv6:: [::1]:8080
template:: {issue.id}.tpl
hosts: [
ok.example
:: [::1]
:: [2001:db8::1]:53
]
{
pattern: "[a-z]+",
ipv6: "[::1]:8080",
template: "{issue.id}.tpl",
hosts: ["ok.example", "[::1]", "[2001:db8::1]:53"]
}
For pairs the marker sits between key and value; for array items it
stands at the start of the line. Serialization emits ::
automatically when a string value begins with { or [, so
round-tripping regexes and IPv6 addresses just works.
8. Comments
# top-level comment
port: 8080
items: [
# this comment does not break the array
a
b
]
Comments are full lines starting with #. Inline comments are not
supported — they get confused with the value too easily.
9. Multi-line strings: ( ... ) and (( ... ))
Values that span multiple lines go inside parentheses. The opening and closing lines are NOT part of the value.
( ... ) — common leading whitespace is stripped, so you can indent
the block to match its surroundings without contaminating the content:
body: (
{
"qwe": 1
}
)
{ body: "{\n \"qwe\": 1\n}" }
(( ... )) — verbatim: every character between the markers ends up in
the value, including leading whitespace:
sig: ((
-----BEGIN-----
QUJDRA==
-----END-----
))
{ sig: " -----BEGIN-----\n QUJDRA==\n -----END-----" }
Inside a block, { / [ / # are just content — no compound parsing,
no comment skipping. The only special sequence is the terminator on
its own line.
Empty inline form: key: () or key: (()) — both yield the empty
string (same as key:).
Serialization: any string containing \n is emitted with (( ... ))
so the round-trip is byte-for-byte lossless. Strings without newlines
use the usual single-line form.
Limitation: a line whose trimmed content is exactly ) / )) always
closes the block, so such a literal cannot appear as content without
using an external file.
10. Empty compounds
meta: {}
tags: []
Inline empty is allowed. Anything with contents must span multiple
lines, and the closing } / ] must sit on its own line.
11. Enums
Ktav uses serde's default externally tagged enum representation.
# unit variant — just the name
mode: fast
# newtype variant — single-entry object
action: {
Log: hello
}
Round-trip
let cfg: MyConfig = from_str?;
let back = to_string?;
let again: MyConfig = from_str?;
assert_eq!;
Serialization preserves:
- Field order —
Value::Objectis backed by anIndexMap, so the order is whatever serde emits (for structs: declaration order). - Literal strings — values starting with
{or[are emitted with the::marker. Nonefields — skipped on output; reappear asNoneon input (via serde'sOptionhandling).
Architecture
ktav/
├── value/ — the Value enum, ObjectMap
├── parser/ — line-by-line parser (text → Value)
├── render/ — pretty-printer (Value → text)
├── ser/ — serde::Serializer (T: Serialize → Value)
├── de/ — serde::Deserializer (Value → T: Deserialize)
├── error/ — Error + serde::Error impls
└── lib.rs — glue: from_str / from_file / to_string / to_file
Each file holds one exported item; implementation details are private to their parent module.
What Ktav does NOT do — and never will
- Inline non-empty compounds like
x: { a: 1, b: 2 }. They'd bring commas, and commas would bring escaping. Compound values are multiline. - Anchors / aliases / merge keys (
&anchor,*ref,<<:). Any line whose meaning depends on a declaration far away stops being self-sufficient. If you want DRY, compose defaults in code. - File includes (
@include,!import). Write a wrapper in code for large configs. - Top-level arrays. The document is always an object.
Installation
Once published:
[]
= "0.1"
= { = "1", = ["derive"] }
Support the project
The author has many ideas that could be broadly useful to IT worldwide — not limited to Ktav. Realizing them requires funding. If you'd like to help, please reach out at phpcraftdream@gmail.com.
License
MIT. See LICENSE.