ktav 0.1.0

Ktav — a plain configuration format. Three rules, zero indentation, zero quoting. Serde-native.
# Ktav (כְּתָב)

> A plain configuration format. JSON5-shaped, but without the quotes,
> without the commas, with dotted keys for nesting. Native `serde`
> integration.

**Languages:** **English** · [Русский](README.ru.md) · [简体中文](README.zh.md)

**Specification:** this crate implements **Ktav 0.1**. The format is
versioned and maintained independently of this crate — see
[`ktav-lang/spec`](https://github.com/ktav-lang/spec) for the formal
document.

---

## Name

*Ktav* (Hebrew: **כְּתָב**) means "writing, that which is written" — a
thing recorded in a form fixed enough that its meaning does not depend
on who passes it along. The name fits literally: a config file *is*
ktav on disk, and the library reads it and hands you back a live
structure without making anything up along the way.

## Motto

> **Be the config's friend, not its examiner. The config isn't perfect —
> but it's the best one.**

Every rule is local. Every line either stands on its own or depends only
on visible brackets. No indentation pitfalls, no forgotten quotes, no
trailing-comma arithmetic.

## The rules

A Ktav document is an implicit top-level object. Inside any object you
have pairs; inside any array you have items.

```text
# comment              — any line starting with '#'
key: value             — scalar pair; key may be a dotted path (a.b.c)
key:: value            — scalar pair; value is ALWAYS a literal string
key: { ... }           — multi-line object; `}` closes on its own line
key: [ ... ]           — multi-line array; `]` closes on its own line
key: {}   /   key: []  — empty compound, inline
key: ( ... )           — multi-line string; common indent stripped
key: (( ... ))         — multi-line string; verbatim (no stripping)
:: value               — inside an array: literal-string item
```

That's the whole language. No commas, no quotes, no escape inside the
value itself — the only "escape" is the `::` marker, and it lives in the
separator (for pairs) or as a line prefix (for array items).
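
To make the role of the separator concrete, here is a minimal sketch that splits one pair line into key, marker, and raw value. It is not the crate's parser: `Pair` and `split_pair` are hypothetical names, the sketch assumes keys never contain `:`, and it ignores comments, typed markers, and compound openers.

```rust
/// One `key: value` or `key:: value` line, split into its parts.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
struct Pair<'a> {
    key: &'a str,
    /// True when the separator was `::`, i.e. the value is a literal string.
    literal: bool,
    value: &'a str,
}

/// Splits a pair line at the first `:`.
/// Returns `None` when there is no separator or the key is empty.
fn split_pair(line: &str) -> Option<Pair<'_>> {
    let line = line.trim();
    let idx = line.find(':')?;
    let key = line[..idx].trim_end();
    if key.is_empty() {
        return None;
    }
    let rest = &line[idx + 1..];
    // A second colon right after the first one is the literal marker.
    let (literal, rest) = match rest.strip_prefix(':') {
        Some(after) => (true, after),
        None => (false, rest),
    };
    Some(Pair { key, literal, value: rest.trim() })
}
```

Because the split happens at the first colon only, later colons such as those in `[::1]:8080` stay inside the value untouched.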

## Values and special tokens

### Strings

Default for any scalar. Stored internally as `Value::String`. The value
is whatever follows `:` after trimming.

```text
name: Russia
path: /etc/hosts
greeting: hello world
# `::` forces a literal string
pattern:: [a-z]+
```

### Numbers

Numbers are written bare (no quotes). At the `Value` level they are
strings; serde parses them into the target Rust type (`u16`, `i64`,
`f64`, …) using `FromStr` on deserialization, and formats them with
`Display` on serialization.

```text
port: 8080
ratio: 3.14159
offset: -42
huge: 1234567890123
```

A value like `port: abc` parses fine *at the Ktav level* (as the string
`"abc"`), but deserializing it into a `u16` through serde returns a
clear `ParseError`.
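
The split between "string at the Ktav level" and "number at the Rust level" can be sketched with the standard library alone. The helper below is hypothetical (`parse_number` is not part of the crate) and only shows how `FromStr` turns a raw scalar into a typed number or an error.

```rust
use std::fmt::Display;
use std::str::FromStr;

/// Parses a raw scalar into any numeric type that implements `FromStr`.
/// The error is a plain `String` here; the real crate has its own error type.
fn parse_number<T>(raw: &str) -> Result<T, String>
where
    T: FromStr,
    T::Err: Display,
{
    raw.trim()
        .parse::<T>()
        .map_err(|e| format!("invalid number {:?}: {}", raw, e))
}
```

With this helper, `parse_number::<u16>("8080")` succeeds, while `"abc"` or the out-of-range `"70000"` produce an error only once a concrete target type is requested.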

### Booleans: `true` / `false`

Strict lowercase. Anything else is a string.

```text
# Value::Bool(true)
on: true
# Value::Bool(false)
off: false
# Value::String("True")
capitalized: True
# Value::String("FALSE")
yelling:    FALSE
# Value::String("true")
literal:: true
```

### Null: `null`

Strict lowercase. Matches `Option::None` on the Rust side, as well as
`()` for unit.

```text
# Value::Null
label: null
# Value::String("Null")
capitalized: Null
# Value::String("null")
literal:: null
```

When serializing, `Option::None` is emitted as `null`. Suppress with
`#[serde(skip_serializing_if = "Option::is_none")]` if you prefer the
field absent.
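
The keyword rules for booleans and `null` fit in a few lines. The sketch below is an illustration, not the crate's code: `Scalar` and `classify_scalar` are hypothetical names, and the `literal` flag stands for "the pair used the `::` separator".

```rust
/// Simplified stand-in for the scalar cases of `Value`.
#[derive(Debug, PartialEq)]
enum Scalar {
    Null,
    Bool(bool),
    Str(String),
}

/// Classifies a trimmed raw value. Keywords are matched exactly,
/// so `True`, `FALSE`, and `Null` fall through to plain strings.
fn classify_scalar(raw: &str, literal: bool) -> Scalar {
    if literal {
        // `::` always wins: the value is a string, whatever it looks like.
        return Scalar::Str(raw.to_string());
    }
    match raw {
        "null" => Scalar::Null,
        "true" => Scalar::Bool(true),
        "false" => Scalar::Bool(false),
        other => Scalar::Str(other.to_string()),
    }
}
```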

### Empty object / empty array

The **only** inline compound values allowed — nothing to separate, no
commas needed.

```text
# empty object
meta: {}
# empty array
tags: []
```

### Keyword-like strings need `::`

If a string's content happens to equal a keyword (`true`, `false`,
`null`) or begin with `{` or `[`, the **serializer emits `::`
automatically** so the round-trip is lossless. On the writing side you
do the same:

```text
# the string "true", not a bool
flag:: true
# the string "null", not a null
noun:: null
regex:: [a-z]+
ipv6:: [::1]:8080
template:: {issue.id}.tpl
```
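
The serializer-side decision described above can be sketched as follows. Both function names are hypothetical, and the sketch covers only single-line strings; it ignores typed markers and multi-line values.

```rust
/// Returns true when a string would be misread as a keyword or a
/// compound opener if it were written after a plain `:`.
fn needs_literal_marker(s: &str) -> bool {
    matches!(s, "true" | "false" | "null") || s.starts_with('{') || s.starts_with('[')
}

/// Renders one string pair, choosing `::` only when it is required.
fn render_pair(key: &str, value: &str) -> String {
    let sep = if needs_literal_marker(value) { "::" } else { ":" };
    format!("{}{} {}", key, sep, value)
}
```

Emitting `::` only when needed keeps ordinary values such as `name: Russia` free of markers while still protecting `flag:: true` and `regex:: [a-z]+`.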

## Compound values are multi-line

Non-empty `{ ... }` / `[ ... ]` **must** span multiple lines, with the
closing bracket on its own line. `x: { a: 1 }` and `x: [1, 2, 3]` are
rejected with a clear error — Ktav has no comma-separation rules and
no escape mechanism for them.

```text
# rejected — inline non-empty compound
server: { host: 127.0.0.1, port: 8080 }
tags: [primary, eu, prod]

# accepted — multi-line form
server: {
    host: 127.0.0.1
    port: 8080
}

tags: [
    primary
    eu
    prod
]
```
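
A rough sketch of how a parser might tell these cases apart when it looks at the text after a plain `:`. The names `ValueStart` and `classify_value_start` are hypothetical, and the error wording is invented for the example; the crate's actual messages may differ.

```rust
/// What the text after a plain `:` turns out to be.
#[derive(Debug, PartialEq)]
enum ValueStart {
    EmptyObject,
    EmptyArray,
    OpenObject,
    OpenArray,
    Scalar,
}

/// Accepts `{}`, `[]`, a lone `{` or `[`, and scalars.
/// Anything else that starts with a bracket is an inline compound and is rejected.
fn classify_value_start(raw: &str) -> Result<ValueStart, String> {
    match raw.trim() {
        "{}" => Ok(ValueStart::EmptyObject),
        "[]" => Ok(ValueStart::EmptyArray),
        "{" => Ok(ValueStart::OpenObject),
        "[" => Ok(ValueStart::OpenArray),
        v if v.starts_with('{') || v.starts_with('[') => Err(format!(
            "inline non-empty compound {:?}: use the multi-line form, or `::` for a literal string",
            v
        )),
        _ => Ok(ValueStart::Scalar),
    }
}
```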

## Using it from Rust

Ktav is serde-native. Any type implementing `Serialize` / `Deserialize`
(including `#[derive]`-generated ones) round-trips through Ktav out of
the box.

```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
struct Upstream {
    host: String,
    port: u16,
}

#[derive(Debug, Serialize, Deserialize)]
struct Config {
    port: u16,
    banned_patterns: Vec<String>,
    upstreams: Vec<Upstream>,
}

fn main() -> Result<(), ktav::Error> {
    let cfg: Config = ktav::from_file("resocks5.conf")?;
    let text = ktav::to_string(&cfg)?;
    ktav::to_file(&cfg, "resocks5.conf")?;
    Ok(())
}
```

Four public entry points: [`from_str`](https://docs.rs/ktav) /
[`from_file`](https://docs.rs/ktav) for reading, [`to_string`](https://docs.rs/ktav) /
[`to_file`](https://docs.rs/ktav) for writing.

### Typed markers

Rust numeric types (`u8`..`u128`, `i8`..`i128`, `usize`, `isize`, `f32`,
`f64`) serialize to Ktav with explicit typed markers: `port:i 8080`,
`ratio:f 0.5`. Deserialization accepts *both* typed-marker and plain-string
forms — documents written without markers still work, exactly as before.
`NaN` / `±Infinity` are rejected by the serializer (Ktav 0.1.0 does not
represent them).
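
As an illustration of the output shape only, the sketch below renders typed-marker pairs and rejects non-finite floats. `render_int` and `render_float` are hypothetical helpers, not crate functions, and how the crate formats whole-valued floats is not covered here.

```rust
/// Renders an integer pair with the `:i` marker, e.g. `port:i 8080`.
fn render_int(key: &str, value: i64) -> String {
    format!("{}:i {}", key, value)
}

/// Renders a float pair with the `:f` marker, e.g. `ratio:f 0.5`.
/// NaN and the infinities are rejected, mirroring the rule above.
fn render_float(key: &str, value: f64) -> Result<String, String> {
    if !value.is_finite() {
        return Err(format!("cannot represent {} for key {:?}", value, key));
    }
    Ok(format!("{}:f {}", key, value))
}
```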

## Examples: Ktav → JSON5

Each Ktav snippet is followed by its JSON5 equivalent. JSON5 is used
because it reads like ordinary JavaScript, allows comments, and shows
exactly what the parser produces.

### 1. Scalars

```text
name: Russia
port: 20082
```
```json5
{
  name: "Russia",
  port: "20082"
}
```

Strings and numbers come out as strings at the `Value` level; numeric
types are parsed through serde when you deserialize into `u16` / `f64`
/ … The keywords `true`, `false`, and `null` are the exception: they
become `Value::Bool` and `Value::Null`.

### 2. Dotted keys = nested objects

```text
server.host: 127.0.0.1
server.port: 8080
app.debug: true
```
```json5
{
  server: { host: "127.0.0.1", port: "8080" },
  app: { debug: true }
}
```

Any depth works. The full address is on every line.
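
One way to picture dotted keys is as a walk through nested maps that creates intermediate objects on demand. The sketch below is an assumption-laden illustration: `Value` is a two-variant stand-in, `insert_dotted` is a hypothetical helper, a repeated leaf key simply overwrites, and `BTreeMap` is used to stay within the standard library, so it sorts keys instead of preserving insertion order.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for the crate's value tree.
#[derive(Debug, PartialEq)]
enum Value {
    Str(String),
    Object(BTreeMap<String, Value>),
}

/// Inserts `value` at a dotted `path`, creating nested objects as needed.
/// Fails when a path segment is already occupied by a scalar.
fn insert_dotted(
    root: &mut BTreeMap<String, Value>,
    path: &str,
    value: &str,
) -> Result<(), String> {
    let mut parts: Vec<&str> = path.split('.').collect();
    let leaf = match parts.pop() {
        Some(leaf) => leaf,
        None => return Err("empty key".to_string()),
    };
    let mut current = root;
    for part in parts {
        let entry = current
            .entry(part.to_string())
            .or_insert_with(|| Value::Object(BTreeMap::new()));
        current = match entry {
            Value::Object(map) => map,
            Value::Str(_) => return Err(format!("`{}` is already a scalar", part)),
        };
    }
    current.insert(leaf.to_string(), Value::Str(value.to_string()));
    Ok(())
}
```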

### 3. Nested object as a value

```text
server: {
    host: 127.0.0.1
    port: 8080
    endpoints.api: /v1
    endpoints.admin: /admin
}
```
```json5
{
  server: {
    host: "127.0.0.1",
    port: "8080",
    endpoints: { api: "/v1", admin: "/admin" }
  }
}
```

### 4. Array of scalars

```text
banned_patterns: [
    .*\.onion:\d+
    .*:25
]
```
```json5
{
  banned_patterns: [".*\\.onion:\\d+", ".*:25"]
}
```

### 5. Array of objects

```text
upstreams: [
    {
        host: a.example
        port: 1080
    }
    {
        host: b.example
        port: 1080
    }
]
```
```json5
{
  upstreams: [
    { host: "a.example", port: "1080" },
    { host: "b.example", port: "1080" }
  ]
}
```

### 6. Arbitrary nesting

Every compound value spans multiple lines (single-line `{ ... }` / `[ ... ]`
with contents is not accepted — only the empty forms `{}` / `[]` are
inline). Nest as deep as needed:

```text
countries: [
    {
        name: Russia
        cities: [
            {
                name: Moscow
                buildings: [
                    {
                        name: Kremlin
                    }
                    {
                        name: Saint Basil's
                    }
                ]
            }
            {
                name: Saint Petersburg
            }
        ]
    }
    {
        name: France
    }
]
```

### 7. Literal strings: `::`

Some values would otherwise be parsed as compound (because they start
with `{` or `[`): regular expressions, IPv6 addresses, template
placeholders. The double-colon `::` flags them as "raw string, do not
parse further."

```text
pattern:: [a-z]+
ipv6:: [::1]:8080
template:: {issue.id}.tpl

hosts: [
    ok.example
    :: [::1]
    :: [2001:db8::1]:53
]
```
```json5
{
  pattern: "[a-z]+",
  ipv6: "[::1]:8080",
  template: "{issue.id}.tpl",
  hosts: ["ok.example", "[::1]", "[2001:db8::1]:53"]
}
```

For pairs the marker sits between key and value; for array items it
stands at the start of the line. **Serialization emits `::`
automatically** when a string value begins with `{` or `[`, so
round-tripping regexes and IPv6 addresses just works.
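
For array items, the marker is a line prefix and not a separator. A minimal sketch of that check, with hypothetical names (`Item`, `parse_item`), might look like this; a real parser would go on to classify plain items as keywords, compound openers, or strings.

```rust
/// An array item line, before any further classification.
#[derive(Debug, PartialEq)]
enum Item<'a> {
    /// The line started with `::`; the rest is a literal string.
    Literal(&'a str),
    /// Any other item; still subject to keyword and bracket rules.
    Plain(&'a str),
}

/// Strips surrounding whitespace and detects the `::` line prefix.
fn parse_item(line: &str) -> Item<'_> {
    let line = line.trim();
    match line.strip_prefix("::") {
        Some(rest) => Item::Literal(rest.trim_start()),
        None => Item::Plain(line),
    }
}
```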

### 8. Comments

```text
# top-level comment
port: 8080

items: [
    # this comment does not break the array
    a
    b
]
```

Comments are full lines starting with `#`. Inline comments are not
supported — they get confused with the value too easily.
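
The comment rule is simple enough to sketch directly. The helpers below are hypothetical and assume, as in the array example above, that leading whitespace before `#` is allowed. They do not handle multi-line string blocks, where `#` is ordinary content.

```rust
/// A comment is a line whose first non-whitespace character is `#`.
fn is_comment(line: &str) -> bool {
    line.trim_start().starts_with('#')
}

/// Returns the trimmed lines that carry content, skipping blanks and comments.
fn content_lines(text: &str) -> Vec<&str> {
    text.lines()
        .map(str::trim)
        .filter(|l| !l.is_empty() && !is_comment(l))
        .collect()
}
```

Note that a `#` later in a line is never inspected, so `port: 8080 # note` would keep the whole tail as part of the value.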

### 9. Multi-line strings: `( ... )` and `(( ... ))`

Values that span multiple lines go inside parentheses. The opening and
closing lines are NOT part of the value.

`(` ... `)` — common leading whitespace is stripped, so you can indent
the block to match its surroundings without contaminating the content:

```text
body: (
    {
      "qwe": 1
    }
)
```
```json5
{ body: "{\n  \"qwe\": 1\n}" }
```

`((` ... `))` — verbatim: every character between the markers ends up in
the value, including leading whitespace:

```text
sig: ((
  -----BEGIN-----
  QUJDRA==
  -----END-----
))
```
```json5
{ sig: "  -----BEGIN-----\n  QUJDRA==\n  -----END-----" }
```

Inside a block, `{` / `[` / `#` are just content — **no compound parsing,
no comment skipping**. The only special sequence is the terminator on
its own line.

Empty inline form: `key: ()` or `key: (())` — both yield the empty
string (same as `key:`).

Serialization: any string containing `\n` is emitted with `(( ... ))`
so the round-trip is byte-for-byte lossless. Strings without newlines
use the usual single-line form.

Limitation: a line whose trimmed content is exactly `)` / `))` always
closes the block, so such a literal cannot appear as content without
using an external file.
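
The difference between the two block forms comes down to one step: whether the common indent is removed. The sketch below works on the lines between the markers and uses hypothetical helper names. It measures indent in bytes, so it assumes the block is indented consistently with spaces or consistently with tabs.

```rust
/// `( ... )` form: removes the smallest indent shared by all non-blank lines.
fn strip_common_indent(lines: &[&str]) -> String {
    let indent = lines
        .iter()
        .filter(|l| !l.trim().is_empty())
        .map(|l| l.len() - l.trim_start().len())
        .min()
        .unwrap_or(0);
    lines
        .iter()
        // Blank lines shorter than the indent collapse to the empty string.
        .map(|l| l.get(indent..).unwrap_or(""))
        .collect::<Vec<_>>()
        .join("\n")
}

/// `(( ... ))` form: lines are joined exactly as written.
fn verbatim(lines: &[&str]) -> String {
    lines.join("\n")
}
```

Applied to the two examples above, the first helper yields the JSON body with its inner two-space indent preserved, and the second keeps the leading spaces of the signature block.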

### 10. Empty compounds

```text
meta: {}
tags: []
```

Inline empty is allowed. Anything with contents must span multiple
lines, and the closing `}` / `]` must sit on its own line.

### 11. Enums

Ktav uses serde's default *externally tagged* enum representation.

```rust
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "lowercase")]
enum Mode { Fast, Slow }

#[derive(Serialize, Deserialize)]
enum Action {
    Log(String),
    Count(u32),
}
```

```text
# unit variant — just the name
mode: fast

# newtype variant — single-entry object
action: {
    Log: hello
}
```

## Round-trip

```rust
let cfg: MyConfig = ktav::from_str(text)?;
let back = ktav::to_string(&cfg)?;
let again: MyConfig = ktav::from_str(&back)?;
assert_eq!(cfg, again);
```

Serialization preserves:
- **Field order**: `Value::Object` is backed by an `IndexMap`, so the
  order is whatever serde emits (for structs: declaration order).
- **Literal strings**: values that start with `{` or `[`, or that equal
  a keyword (`true`, `false`, `null`), are emitted with the `::` marker.
- **`None` fields**: emitted as `null` by default, or skipped when the
  field carries `skip_serializing_if`; either way they come back as
  `None` on input (via serde's `Option` handling).

## Architecture

```text
ktav/
├── value/            — the Value enum, ObjectMap
├── parser/           — line-by-line parser (text → Value)
├── render/           — pretty-printer (Value → text)
├── ser/              — serde::Serializer (T: Serialize → Value)
├── de/               — serde::Deserializer (Value → T: Deserialize)
├── error/            — Error + serde::Error impls
└── lib.rs            — glue: from_str / from_file / to_string / to_file
```

Each file holds one exported item; implementation details are private to
their parent module.

## What Ktav does NOT do — and never will

- **Inline non-empty compounds** like `x: { a: 1, b: 2 }`. They'd bring
  commas, and commas would bring escaping. Compound values are
  multiline.
- **Anchors / aliases / merge keys** (`&anchor`, `*ref`, `<<:`). Any
  line whose meaning depends on a declaration far away stops being
  self-sufficient. If you want DRY, compose defaults in code.
- **File includes** (`@include`, `!import`). Write a wrapper in code
  for large configs.
- **Top-level arrays.** The document is always an object.

## Installation

Once published:

```toml
[dependencies]
ktav = "0.1"
serde = { version = "1", features = ["derive"] }
```

## Support the project

The author has many ideas that could be broadly useful to IT worldwide —
not limited to Ktav. Realizing them requires funding. If you'd like to
help, please reach out at **phpcraftdream@gmail.com**.

## License

MIT. See [LICENSE](LICENSE).