nightjar-lang 0.1.0

A declarative, prefix-notation DSL for formal verification of structured data.
# Nightjar Language — Supplement

Comprehensive reference for contributors, maintainers, and AI coding agents.
This document is a superset of [README.md](README.md): it reproduces the formal
grammar, defines every operator's exact semantics (including edge cases),
describes the crate's internal architecture, catalogs every error code, and
provides step-by-step recipes for extending the language.

A new user of the library does **not** need to read this document to use
Nightjar. A contributor designing a new operator, a new quantifier, a new data
type, or a new parser subsystem **does**.

---

## Table of contents

1. [Design philosophy](#design-philosophy)
2. [Formal language specification](#formal-language-specification)
3. [Operator semantics](#operator-semantics)
4. [Symbol table and flattening](#symbol-table-and-flattening)
5. [Execution pipeline](#execution-pipeline)
6. [Public API reference](#public-api-reference)
7. [Error codes — full reference](#error-codes--full-reference)
8. [Architecture and module layout](#architecture-and-module-layout)
9. [Extending the language](#extending-the-language)
10. [Testing strategy](#testing-strategy)
11. [Design decisions and rationale](#design-decisions-and-rationale)
12. [Known limitations and deferred features](#known-limitations-and-deferred-features)
13. [License](#license)

---

## Design philosophy

Nightjar is built around five non-negotiable principles. Every design decision
below follows from one of them; when a change proposes to break one, treat it
as a substantial change that requires explicit justification.

1. **Correctness first.** Every syntactically valid expression reduces to a
   well-defined result in `{ True, False, Error }`. There is no "undefined",
   no "maybe", no silent fallback to a default value. Errors are first-class,
   carry spans and codes, and are distinguishable from a well-formed `False`.
2. **Minimal surface.** The language is intentionally tiny. It has no
   variables, no lambdas, no user-defined functions, no I/O, no modules,
   no side effects, no state. Complexity belongs in the host application,
   not in the DSL.
3. **Formal foundations.** Every operator has a precise mathematical
   definition. Quantifiers mean what they mean in first-order logic;
   verifiers are binary relations; connectives are the Boolean algebra.
   The authoritative EBNF lives in-tree ([src/language/grammar.rs](src/language/grammar.rs))
   and this document reproduces it verbatim.
4. **Strictness.** No implicit type coercion, with one carefully bounded
   exception: `Int` promotes to `Float` when the other operand of an
   arithmetic function or comparison verifier is a `Float`. Every other
   cross-type combination is a `TypeError`. Missing symbols are
   `SymbolNotFound`; they do not default to `Null`.
5. **Safety.** Nightjar is always embedded in someone else's application.
   It must not crash the host on adversarial input. The parser enforces a
   configurable depth limit; integer arithmetic is checked; list unrolling
   is O(N) and the host is responsible for bounding the input.

Unicode is not a separate principle — it is a direct consequence of
correctness. Identifiers, string content, and map keys all go through
`char::is_alphanumeric`, which recognises Unicode letter and number
categories (L* and N*), so `.營收`, `.données.résultat`, and
`"紅嘴黑鵯"` are all first-class.

---

## Formal language specification

### EBNF grammar

This block is reproduced verbatim from
[src/language/grammar.rs:23-110](src/language/grammar.rs#L23-L110), which is the
authoritative copy. When editing the grammar, update both places in the same
commit; see the EBNF drift check in [Extending the language](#extending-the-language).

```ebnf
(* A program is a single expression that must reduce to Boolean. *)
program         = bool_expr ;

bool_expr       = bool_literal
                | verifier_expr
                | connective_expr
                | not_expr
                | unary_check_expr
                | quantifier_expr ;

verifier_expr   = "(" verifier_op value_expr value_expr ")" ;
verifier_op     = "EQ" | "NE" | "LT" | "LE" | "GT" | "GE" ;

connective_expr = "(" connective_op bool_expr bool_expr ")" ;
connective_op   = "AND" | "OR" ;
not_expr        = "(" "NOT" bool_expr ")" ;

unary_check_expr = "(" "NonEmpty" value_expr ")" ;

(* Quantifiers: assert a predicate over a List-typed Entity. *)
quantifier_expr = "(" quantifier_op predicate value_expr ")" ;
quantifier_op   = "ForAll" | "Exists" ;

(* Predicates are only legal inside quantifiers.                  *)
(* Partial vs. full predicate is disambiguated by operand count   *)
(* at parse time, not by a syntactic marker:                      *)
(*   (VerifierOp x)       — partial verifier (1 operand)          *)
(*   (VerifierOp x y)     — full bool_expr (2 operands)           *)
(*   NonEmpty (bare)      — unary check                           *)
(*   any other bool_expr  — full bool_expr                        *)
(* The body of a full predicate may use the "@" element-rooted    *)
(* symbol form to refer to the current iteration element.         *)
predicate       = partial_verifier | "NonEmpty" | bool_expr ;
partial_verifier = "(" verifier_op value_expr ")" ;

(* Value expressions produce an Entity. *)
value_expr      = literal
                | symbol
                | func_expr ;

(* Arity is enforced at parse time from FuncOp::expected_arity():  *)
(*   1-ary: Neg, Abs, Length, Upper, Lower,                        *)
(*          Head, Tail, Count, GetKeys, GetValues                  *)
(*   2-ary: Add, Sub, Mul, Div, Mod, Concat, Get                   *)
(*   3-ary: Substring                                              *)
func_expr       = "(" func_op value_expr { value_expr } ")" ;
func_op         = arith_op | string_op | collection_op ;
arith_op        = "Add" | "Sub" | "Mul" | "Div"
                | "Mod" | "Neg" | "Abs" ;
string_op       = "Concat" | "Length" | "Substring"
                | "Upper" | "Lower" ;
collection_op   = "Head" | "Tail" | "Get" | "Count"
                | "GetKeys" | "GetValues" ;

(* Terminals. *)
literal         = int_literal | float_literal
                | string_literal | bool_literal | null_literal ;

(* A leading "-" is part of the numeric literal    *)
int_literal     = [ "-" ] digit { digit } ;
float_literal   = [ "-" ] digit { digit } "." digit { digit } ;

string_literal  = '"' { any_char } '"' ;
bool_literal    = "True" | "False" ;
null_literal    = "Null" ;

(* Symbols have two namespaces:                                    *)
(*   "." — root-rooted (resolved against the whole input).         *)
(*   "@" — element-rooted (resolved against the current iteration  *)
(*         element of the nearest enclosing ForAll/Exists).        *)
(* Bare "." is the whole input; bare "@" is the current element.   *)
(* "@" is only legal inside a quantifier predicate; the parser     *)
(* rejects it elsewhere with a ParseError.                         *)
symbol          = ( "." | "@" ) [ segment { "." segment } ] ;

(* Segment characters are Unicode-aware:                           *)
(* char::is_alphanumeric() covers Unicode categories L* and N*,    *)
(* so keys like ".營收" and ".données.résultat" are valid.         *)
segment         = ident_start { ident_char } ;
ident_start     = unicode_letter | "_" ;
ident_char      = unicode_letter | unicode_digit | "_" ;

digit           = "0" | "1" | "2" | "3" | "4"
                | "5" | "6" | "7" | "8" | "9" ;
```

### Lexical rules

- **Whitespace-insensitive.** Spaces, tabs, and newlines only separate tokens.
  `(GT 1 2)`, `(GT  1   2)`, and `( GT\n  1\n  2\n)` are identical.
- **No comments.** Nightjar expressions have no comment syntax. Comments and
  documentation live in the host program, not in the rule text.
- **Numeric literals.** A `-` immediately followed by a digit is part of the
  number: `-5` and `-3.14` are single tokens. `- 5` (with a space) is a
  `ParseError` because `-` is not a standalone token.
- **String literals.** Double-quoted. There are no escape sequences defined;
  every character between the opening and closing quotes is literal, including
  any whitespace and any Unicode scalar. An unterminated string (missing
  closing quote before EOF) is a `ParseError`.
- **Keywords vs identifiers.** Operator names (`EQ`, `Add`, `ForAll`, …),
  boolean literals (`True`, `False`), and `Null` are keywords. They are
  case-sensitive: `true` is not a literal, `add` is not an operator. Keywords
  cannot be used as symbol segments because they do not start with `.` or `@`.
- **Symbol segments.** A segment starts with `ident_start` (Unicode letter or
  `_`) and continues with `ident_char` (Unicode letter or digit or `_`). This
  means `.1x` is a `ParseError` but `._1` is a valid list-index segment.
  Segments are joined by literal `.` characters.
- **Symbol sigils.** `.` starts a root-rooted symbol; `@` starts an
  element-rooted symbol. Bare `.` and bare `@` (with no segments) refer to the
  whole payload and the whole current element respectively.
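
The segment rule can be illustrated with a minimal sketch. The function name `is_valid_segment` is hypothetical and is not part of the crate; it uses `char::is_alphabetic` as a stand-in for "Unicode letter", which is an assumption about how the tokenizer classifies the first character.

```rust
/// Hypothetical helper: checks one symbol segment against the rule above.
/// First char: Unicode letter or `_`. Remaining chars: letter, digit, or `_`.
fn is_valid_segment(seg: &str) -> bool {
    let mut chars = seg.chars();
    match chars.next() {
        Some(c) if c.is_alphabetic() || c == '_' => {}
        // Empty segment, or a first char that is a digit or punctuation.
        _ => return false,
    }
    chars.all(|c| c.is_alphanumeric() || c == '_')
}
```

Under this sketch `營收`, `résultat`, and `_1` are accepted, while `1x` and the empty string are rejected.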

### Data types

The seven runtime types are defined by the `Entity` enum in
[src/context/entity.rs:53-61](src/context/entity.rs#L53-L61):

| Entity variant     | Underlying Rust type       | Literal in the language | "Empty" for `NonEmpty` |
|--------------------|----------------------------|-------------------------|------------------------|
| `Entity::Int`      | `i64`                      | `42`, `-7`              | never empty            |
| `Entity::Float`    | `f64`                      | `3.14`, `-0.5`          | never empty            |
| `Entity::String`   | `String`                   | `"hello"`, `"營收"`     | empty iff `""`         |
| `Entity::Bool`     | `bool`                     | `True`, `False`         | never empty            |
| `Entity::List`     | `Vec<Entity>`              | (from host data)        | empty iff `[]`         |
| `Entity::Map`      | `HashMap<String, Entity>`  | (from host data)        | empty iff `{}`         |
| `Entity::Null`     | (unit)                     | `Null`                  | always empty           |

`Entity::type_tag()` projects to a `TypeTag` enum, used throughout the
runtime for type checks and error messages.

### Type coercion

The only implicit coercion in the language is **Int → Float auto-promotion**,
and it applies in exactly two places:

- **Arithmetic functions** `Add`, `Sub`, `Mul`, `Div`, `Mod`: if either
  operand is `Float`, the other (if `Int`) is promoted, and the result is
  `Float`. Two `Int` operands use `Int` arithmetic; two `Float` operands use
  `Float` arithmetic.
- **Comparison verifiers** `EQ`, `NE`, `LT`, `LE`, `GT`, `GE`: when one side
  is `Int` and the other is `Float`, the `Int` is promoted before comparison.

Every other type mismatch is a `TypeError` (E002). `(Add 1 "abc")`,
`(GT "a" 1)`, `(Concat 1 2)`, `(Head 42)` are all errors.

`Null` is never silently converted. A `Null` operand to an arithmetic op is
a `TypeError`, a `Null` operand to `NonEmpty` is always `False`, and
`SymbolNotFound` is the rule for missing keys (not `Null`).
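
As an illustration of the promotion rule, here is a small self-contained sketch. The `Num` enum and `promote` function are hypothetical stand-ins for the numeric part of `Entity`; they are not the crate's implementation.

```rust
/// Hypothetical numeric value, standing in for Entity::Int / Entity::Float.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Num {
    Int(i64),
    Float(f64),
}

/// Promote a pair of operands: a mixed Int/Float pair becomes Float/Float,
/// while same-typed pairs are returned unchanged.
fn promote(a: Num, b: Num) -> (Num, Num) {
    match (a, b) {
        (Num::Int(x), Num::Float(_)) => (Num::Float(x as f64), b),
        (Num::Float(_), Num::Int(y)) => (a, Num::Float(y as f64)),
        _ => (a, b),
    }
}
```

Only the mixed case changes anything, which mirrors the claim that promotion is the single implicit coercion.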

---

## Operator semantics

Every operator below is listed with arity, input types, output type, and
every edge case worth documenting. Operators are grouped by family, matching
the AST enums in [src/language/grammar.rs](src/language/grammar.rs).

### Verifiers — `EQ NE LT LE GT GE`

Binary, two value expressions → `Bool`. Implemented in
[src/context/verifier.rs](src/context/verifier.rs).

- **Equality (`EQ`, `NE`) on `Float`** uses **epsilon-based comparison**:
  `EQ(a, b) ⇔ |a − b| < ε`, where `ε = ExecOptions::float_epsilon`
  (default `1e-10`). This is what makes `(EQ (Add 0.1 0.2) 0.3)` evaluate to
  `True`, despite IEEE 754 representation error. `NE` is the negation.
- **Ordering verifiers (`LT`, `LE`, `GT`, `GE`) on `Float`** use standard
  IEEE 754 comparison (`partial_cmp`). Epsilon does not apply.
- **NaN** — any comparison involving NaN (EQ, NE, LT, LE, GT, GE) returns
  `false`. This matches Rust's `partial_cmp` semantics and IEEE 754.
  Specifically, `(EQ NaN NaN)` is `False` (because
  `|NaN − NaN|` is `NaN`, not `< ε`).
- **Int ↔ Float promotion** applies for mixed-type compares.
- **String equality** is exact byte equality (which is also Unicode scalar
  equality for canonicalised UTF-8 strings).
- **Bool equality** is the obvious thing.
- **Cross-type comparisons** (e.g. `(GT "a" 1)`, `(EQ .list .int)`) are a
  `TypeError`.
- **Null equality.** `(EQ Null Null)` is `True`. `(EQ Null anything_else)`
  is a `TypeError`: we deliberately do not let `Null` silently equal scalars.
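
The epsilon rule for `Float` equality can be sketched in a few lines. The function name `float_eq` is hypothetical; the formula is the one stated above, with `eps` playing the role of `ExecOptions::float_epsilon`.

```rust
/// Epsilon-based equality as described above: |a - b| < eps.
/// If either operand is NaN, the difference is NaN and the test is false.
fn float_eq(a: f64, b: f64, eps: f64) -> bool {
    (a - b).abs() < eps
}
```

With `eps = 1e-10`, `float_eq(0.1 + 0.2, 0.3, eps)` holds even though the plain `==` comparison of those two values does not.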

### Unary check — `NonEmpty`

Unary, one value → `Bool`. Returns the result of `Entity::is_non_empty()`
([src/context/entity.rs:81-89](src/context/entity.rs#L81-L89)):

| Input          | `NonEmpty` result |
|----------------|-------------------|
| `Int`, `Float`, `Bool` (any value) | `True`   |
| `String ""`                          | `False`  |
| `String "anything else"`             | `True`   |
| `List []`                            | `False`  |
| `List [ … ]`                         | `True`   |
| `Map {}`                             | `False`  |
| `Map { … }`                          | `True`   |
| `Null`                               | `False`  |

### Connectives — `AND OR NOT`

`AND` and `OR` are binary boolean-in, boolean-out; `NOT` is unary.
Implemented in [src/context/connective.rs](src/context/connective.rs).

- **No short-circuit evaluation.** Both operands of `AND`/`OR` are always
  evaluated. If one branch produces an `Error`, the error surfaces immediately
  (regardless of whether the other branch would have decided the result).
  This keeps error behaviour deterministic — every error in a rule is
  surfaced, never masked by a short-circuit.
- **Adding short-circuit later is compatible** with the API shape, but would
  change the observable error behaviour. If it is ever added, it must be
  opt-in (e.g. via `ExecOptions`) so existing rules keep their diagnostic
  behaviour.
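
A minimal sketch of the eager behaviour, assuming both operands have already been evaluated to a `Result`. The name `eager_and`, the `String` error type, and the choice to report the left operand's error first are assumptions made for illustration only.

```rust
/// Eager AND: both operands were evaluated before this call, and an error in
/// either one wins over any Boolean result. When both fail, this sketch
/// reports the left error (an assumption, not documented crate behaviour).
fn eager_and(left: Result<bool, String>, right: Result<bool, String>) -> Result<bool, String> {
    let l = left?;
    let r = right?;
    Ok(l && r)
}
```

Note that `eager_and(Ok(false), Err(..))` is an error, whereas a short-circuiting `AND` would have returned `False` without looking at the right operand.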

### Quantifiers — `ForAll Exists`

`(QuantifierOp predicate operand)`. Implemented in
[src/context/quantifier.rs](src/context/quantifier.rs).

- **Predicate forms.** Three shapes are accepted, disambiguated at parse time
  by operand count:
  - `NonEmpty` (bare) — unary check, applied to the element.
  - `(VerifierOp x)` is a *partial verifier*: the bound value `x` is the second
    operand of the verifier; the element fills the first. So
    `(ForAll (GT 0) xs)` means "∀e ∈ xs. e > 0".
  - Any other `bool_expr` is a *full predicate*: it is re-evaluated once per
    element with the element bound as `@` in scope. The body can use `@`,
    `@.field`, `@._i`, etc.
- **Operand must be a `List`.** Passing a `Map` is a `TypeError` — quantifiers
  iterate over ordered sequences. For Maps, convert explicitly with
  `GetKeys` or `GetValues`: `(ForAll (GT 0) (GetValues .m))`.
- **Scalar fallback.** If the operand is a scalar (`Int`, `Float`, `String`,
  `Bool`, `Null`), the quantifier reduces to a single predicate application
  on that scalar. So `(ForAll (GT 0) 5)` is `True`, `(Exists (EQ 2) 10)` is
  `False`. This is intentional and documented — it lets callers treat "one
  value" and "many values" uniformly.
- **Empty list.** `(ForAll p [])` is `True` (vacuously true);
  `(Exists p [])` is `False` (no witness exists).
- **Nested quantifiers.** `@` always refers to the **innermost** enclosing
  element. Outer elements are accessible only through root-rooted paths
  (e.g. `.outer.inner.field`). This is lexical, innermost-wins scoping.
- **`@` outside a quantifier predicate** is a `ScopeError` (E010), caught by
  `validate_scope` during post-parse static analysis (see
  [Execution pipeline](#execution-pipeline)).
- **Evaluation strategy.** Partial verifiers and `NonEmpty` use
  `apply_quantifier`, which resolves the bound operand once and applies the
  predicate per element. Full predicates use `apply_quantifier_full`, which
  takes a closure that invokes `eval_bool` per element with the element
  bound in the `scope` parameter. Full predicates therefore re-evaluate
  their body N times for N elements.
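
The logical content of the two quantifiers, including the empty-list cases, can be sketched with plain slices. The functions `for_all` and `exists` below are hypothetical and generic over any element type; they do not model `Entity`, error propagation, or the crate's `apply_quantifier` signature.

```rust
/// ForAll over a slice: true when every element satisfies `pred`,
/// and vacuously true for an empty slice.
fn for_all<T>(items: &[T], pred: impl Fn(&T) -> bool) -> bool {
    items.iter().all(|e| pred(e))
}

/// Exists over a slice: true when at least one element satisfies `pred`,
/// and false for an empty slice because there is no witness.
fn exists<T>(items: &[T], pred: impl Fn(&T) -> bool) -> bool {
    items.iter().any(|e| pred(e))
}
```

A partial verifier such as `(GT 0)` corresponds to the closure `|e| *e > 0` here, and the scalar fallback corresponds to calling either function on a one-element slice.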

### Arithmetic — `Add Sub Mul Div Mod Neg Abs`

Implemented in [src/context/function.rs](src/context/function.rs).

- **Input types.** `Int` or `Float`. Anything else is `TypeError`.
- **Int + Int → Int** using `checked_add`, `checked_sub`, `checked_mul`,
  `checked_div`, `checked_rem`, `checked_neg`. Overflow → `IntegerOverflow`
  (E009). In particular `Abs(i64::MIN)` and `Neg(i64::MIN)` are overflow.
- **Mixed Int/Float** → Int is promoted, result is `Float`.
- **Float + Float → Float** using native IEEE operations. No overflow error;
  inputs that would overflow return `inf`/`-inf`, and NaN arithmetic
  propagates in the usual IEEE way.
- **Integer division truncates.** `(Div 7 2)` is `Int(3)`. For real division,
  promote explicitly: `(Div 7 2.0)` is `Float(3.5)`.
- **Division/modulo by zero** — both `Int 0` and `Float 0.0` divisors produce
  `DivisionByZero` (E006). Nightjar does not produce `inf` or NaN from
  `1.0 / 0.0`; we raise an error for consistency with integer semantics.
- **`Mod` works on floats.** `(Mod 3.5 1.5)` is `Float(0.5)` via Rust's `%`.
- **`Neg`, `Abs`** are unary; every other arithmetic op is binary.
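
The overflow and division rules can be sketched with the standard library's checked integer operations. The `ArithError` enum and the four helper functions are hypothetical names used for illustration; they only mirror the error kinds named above.

```rust
/// Hypothetical error type mirroring two of the error kinds described above.
#[derive(Debug, PartialEq)]
enum ArithError {
    IntegerOverflow, // E009 in the text above
    DivisionByZero,  // E006 in the text above
}

fn int_add(a: i64, b: i64) -> Result<i64, ArithError> {
    a.checked_add(b).ok_or(ArithError::IntegerOverflow)
}

fn int_abs(a: i64) -> Result<i64, ArithError> {
    // i64::MIN has no positive counterpart, so checked_abs returns None.
    a.checked_abs().ok_or(ArithError::IntegerOverflow)
}

fn int_div(a: i64, b: i64) -> Result<i64, ArithError> {
    if b == 0 {
        return Err(ArithError::DivisionByZero);
    }
    // checked_div still guards the i64::MIN / -1 overflow case.
    a.checked_div(b).ok_or(ArithError::IntegerOverflow)
}

fn float_div(a: f64, b: f64) -> Result<f64, ArithError> {
    // A zero divisor is an error rather than inf or NaN.
    if b == 0.0 {
        return Err(ArithError::DivisionByZero);
    }
    Ok(a / b)
}
```

Integer division truncates toward zero in Rust, which matches `(Div 7 2)` being `Int(3)`.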

### String — `Concat Length Substring Upper Lower`

- **`Concat`** (2-ary, `String × String → String`).
- **`Length`** (1-ary, `String → Int`). **Counts Unicode scalar values**, not
  bytes. `(Length "abc")` is `3`; `(Length "營收")` is `2`. This is what
  `Substring` indexes into — the two are consistent.
- **`Substring`** (3-ary, `String × Int × Int → String`). `(Substring s start
  len)` returns `len` characters starting at character index `start` (0-based,
  char-indexed). Going off the end of the string is an error; see
  [src/context/function.rs](src/context/function.rs) for the exact bounds.
- **`Upper`, `Lower`** (1-ary) — Unicode-aware case folding via Rust's
  `to_uppercase`/`to_lowercase`. Characters without a case variant pass
  through unchanged.
- Any non-String argument is a `TypeError`.
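
To make the character-indexing convention concrete, here is a minimal sketch. The helper names are hypothetical, and the bounds rule used (`start + len` must not exceed the character count) is an assumption; the crate's exact bounds live in the source file referenced above.

```rust
/// Length in Unicode scalar values, not bytes.
fn char_len(s: &str) -> usize {
    s.chars().count()
}

/// Char-indexed substring: `len` characters starting at 0-based `start`.
/// Returns None when the range runs past the end (assumed bounds rule).
fn substring(s: &str, start: usize, len: usize) -> Option<String> {
    let end = start.checked_add(len)?;
    if end > char_len(s) {
        return None;
    }
    Some(s.chars().skip(start).take(len).collect())
}
```

The string `"營收"` has a byte length of 6 but a character length of 2, which is why indexing by `chars()` rather than by byte offset keeps `Length` and `Substring` consistent.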

### Collection — `Head Tail Get Count GetKeys GetValues`

- **`Head`** (1-ary) — first element of a list. Empty list → `IndexError`
  (E008). Non-list input → `TypeError`.
- **`Tail`** (1-ary) — list of all but the first element. Empty list →
  `IndexError`. Non-list input → `TypeError`.
- **`Get`** (2-ary) — polymorphic index:
  - `(Get list Int)` returns the element at that 0-based index. Out of range →
    `IndexError`. Negative indices are not supported.
  - `(Get map String)` returns the value at that key. Missing key →
    `SymbolNotFound` with a message scoped to `Get`.
  - Any other combination is a `TypeError`.
- **`Count`** (1-ary) — length of a `List` or size of a `Map`. Non-container
  input is a `TypeError`.
- **`GetKeys`** (1-ary) — `Map → List<String>`, sorted by key for
  determinism. Non-map input is a `TypeError`.
- **`GetValues`** (1-ary) — `Map → List<Entity>`, values sorted by key
  (same ordering as `GetKeys`). Non-map input is a `TypeError`.
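
The determinism requirement for `GetKeys` and `GetValues` can be sketched as follows. The functions operate on a plain `HashMap<String, i64>` for brevity and are hypothetical; they show the sorting idea only, not the crate's code.

```rust
use std::collections::HashMap;

/// Keys in sorted order, independent of HashMap iteration order.
fn sorted_keys(map: &HashMap<String, i64>) -> Vec<String> {
    let mut keys: Vec<String> = map.keys().cloned().collect();
    keys.sort();
    keys
}

/// Values ordered by their key, so position i matches sorted_keys()[i].
fn values_by_key(map: &HashMap<String, i64>) -> Vec<i64> {
    sorted_keys(map).iter().map(|k| map[k]).collect()
}
```

Because both functions derive their order from the sorted keys, zipping their results pairs each key with its own value.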

---

## Symbol table and flattening

Root-rooted (`.`) symbols are resolved against a **flattened symbol table**
built once per evaluation. The construction is in
[src/symbol_table.rs](src/symbol_table.rs).

### Flattening rules

Starting from the root `Entity`, every nested path is registered with its
fully qualified dotted key:

- The root itself is registered under `"."`.
- Each `Map` child is registered under `{parent}.{key}`.
- Each `List` element is registered under `{parent}._{i}` with `i` the
  **0-based** index.
- Recursion continues into nested maps and lists.
- Scalars and `Null` are registered at their current prefix; they are not
  descended into.

### Worked example

```json
{
  "ids":  [10, 20, 30],
  "meta": {"name": "x"}
}
```

Flattens to (all entries live in the same `HashMap<String, Entity>`):

| Key             | Value                   |
|-----------------|-------------------------|
| `.`             | the whole root `Map`    |
| `.ids`          | `List [10, 20, 30]`     |
| `.ids._0`       | `Int 10`                |
| `.ids._1`       | `Int 20`                |
| `.ids._2`       | `Int 30`                |
| `.meta`         | `Map { name: "x" }`     |
| `.meta.name`    | `String "x"`            |

Nested containers chain naturally: `{m: [[1,2],[3,4]]}` produces `.m._0._0 =
1`, `.m._1._1 = 4`, etc.
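
The flattening rules can be sketched as a short recursive function. The `Value` enum is a hypothetical, reduced stand-in for `Entity`, and `flatten` / `flatten_into` are illustrative names rather than the crate's functions.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced stand-in for Entity.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
    List(Vec<Value>),
    Map(HashMap<String, Value>),
}

/// Register `value` under `key`, then recurse into containers.
fn flatten_into(key: &str, value: &Value, out: &mut HashMap<String, Value>) {
    // Intermediate containers are registered too, not only leaves.
    out.insert(key.to_string(), value.clone());
    // The root key is "." so its children must not get a doubled dot.
    let prefix = if key == "." { "" } else { key };
    match value {
        Value::Map(m) => {
            for (k, v) in m {
                flatten_into(&format!("{}.{}", prefix, k), v, out);
            }
        }
        Value::List(items) => {
            for (i, v) in items.iter().enumerate() {
                // List elements use the 0-based `_N` convention.
                flatten_into(&format!("{}._{}", prefix, i), v, out);
            }
        }
        // Scalars are registered at their key and not descended into.
        _ => {}
    }
}

fn flatten(root: &Value) -> HashMap<String, Value> {
    let mut out = HashMap::new();
    flatten_into(".", root, &mut out);
    out
}
```

Applied to the worked example, this sketch produces the same seven keys as the table above.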

### Resolution

- **Root-rooted (`.path`).** Resolved with `HashMap::get`, O(1) amortised.
  Missing path → `SymbolNotFound`.
- **Element-rooted (`@path`).** Resolved by `resolve_in_entity` in
  [src/symbol_table.rs](src/symbol_table.rs): walks the `path` directly
  against the current element `Entity`. No flattening involved — cost is
  O(path length), and there's no extra allocation of a per-element table.
  `_N` segments are still list-index segments with the same 0-based convention.
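
Element-rooted resolution can be pictured as a direct walk over the element, one segment at a time. The sketch below uses a hypothetical `Elem` enum and a `walk` function that returns `Option` instead of the crate's error type; it is not the body of `resolve_in_entity`.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced stand-in for Entity.
#[derive(Debug, PartialEq)]
enum Elem {
    Int(i64),
    List(Vec<Elem>),
    Map(HashMap<String, Elem>),
}

/// Walk a dotted path such as "items._1" against `root`.
/// An empty path returns the element itself (the bare `@` case).
fn walk<'a>(root: &'a Elem, path: &str) -> Option<&'a Elem> {
    let mut current = root;
    for seg in path.split('.').filter(|s| !s.is_empty()) {
        current = match current {
            Elem::Map(m) => m.get(seg)?,
            // "_N" selects the N-th list element, 0-based.
            Elem::List(items) => {
                let idx: usize = seg.strip_prefix('_')?.parse().ok()?;
                items.get(idx)?
            }
            // Scalars have no children to descend into.
            Elem::Int(_) => return None,
        };
    }
    Some(current)
}
```

The walk allocates nothing and touches one container per segment, which is the O(path length) cost described above.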

### Invariants to preserve

Anything that touches the symbol table must preserve these invariants, or
quantifiers and lookups will silently disagree:

1. The flattening convention (`.` for maps, `._N` for lists, 0-based) must
   match `resolve_in_entity`'s walking convention.
2. Intermediate containers must be registered (not only leaves), so
   `(NonEmpty .data)` works on the container as a whole.
3. `HashMap` is allowed to iterate in arbitrary order internally, but any
   operator that exposes ordering to the user (today: `GetKeys`, `GetValues`)
   must sort.

---

## Execution pipeline

Nightjar is strictly two-phase. The entry points in
[src/executor.rs](src/executor.rs) drive both phases, but they are cleanly
separable — `parse` / `parse_with_config` give you Phase 1 alone.

```
  source string  ──►  tokens  ──►  AST (Spanned<…>)  ──►  ExecResult
                │              │                     │
                │              │                     └── Phase 2: symbol table + scope
                │              └── Phase 1b: parser + validate_scope
                └── Phase 1a: tokenizer
```

### Phase 1a — Tokenizer

Located in [src/language/parser.rs](src/language/parser.rs). Walks the source
with `char_indices` so all byte offsets land on character boundaries
(UTF-8-safe). Produces `Spanned<Token>` values. Highlights:

- **Negative literals.** `-5` and `-3.14` are single tokens when the `-` is
  immediately followed by a digit. `- 5` (with a space between) is a
  `ParseError` because `-` is not a standalone token.
- **Strings.** No escape sequences. An unterminated string literal
  (`"abc` with EOF before the closing quote) is a `ParseError` with a span
  pointing at the opening quote.
- **Keywords.** Case-sensitive. The tokenizer has an explicit keyword table
  for operator names and reserved literals.
- **Symbols.** `.` and `@` sigils with dot-separated segments. Segment
  characters are validated against `char::is_alphanumeric` (Unicode L* and
  N* categories) plus `_`.
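
The negative-literal rule can be sketched as a small scanner that follows the `int_literal` and `float_literal` productions. The function `scan_number` is hypothetical and works on the start of a source slice; it is not the crate's tokenizer.

```rust
/// Scan a numeric literal at the start of `src` and return its byte length,
/// or None if `src` does not start with one. Only ASCII bytes are inspected,
/// so every returned offset lies on a character boundary.
fn scan_number(src: &str) -> Option<usize> {
    let bytes = src.as_bytes();
    let mut i = 0;
    if bytes.first() == Some(&b'-') {
        i += 1;
    }
    let digits_start = i;
    while i < bytes.len() && bytes[i].is_ascii_digit() {
        i += 1;
    }
    if i == digits_start {
        // A bare "-" (as in "- 5") is not the start of a number.
        return None;
    }
    // Optional fractional part: "." followed by at least one digit.
    if i + 1 < bytes.len() && bytes[i] == b'.' && bytes[i + 1].is_ascii_digit() {
        i += 1;
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
    }
    Some(i)
}
```

For `"-3.14)"` the scanner consumes five bytes and stops before the closing parenthesis; for `"- 5"` it reports no number at all.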

### Phase 1b — Parser

Recursive-descent over the token stream. Key properties:

- Per-operator arity is enforced at parse time using `FuncOp::expected_arity`
  ([src/language/grammar.rs](src/language/grammar.rs)), so `(Add 1)` and
  `(Substring "a" 0)` are caught before any evaluation.
- Depth tracking uses `ParserConfig::max_depth` (default 256). Exceeding it
  produces `RecursionError` (E007). The default is tunable via
  `ExecOptions::max_depth` → `ParserConfig::max_depth`.
- Every AST node is wrapped in `Spanned<T>` carrying the span of the
  originating tokens, so runtime errors can point back into the source
  string.

### Phase 1c — Scope validator

`validate_scope` ([src/language/parser.rs](src/language/parser.rs)) is a
post-parse AST walk that tracks an integer *predicate depth* counter.

- Entering the predicate position of a `Quantifier` increments the counter.
- Leaving it decrements.
- The quantifier's *operand* position stays at the current depth.
- Encountering an `@` symbol with counter `== 0` raises `ScopeError` (E010).

This catches `(EQ @.a 1)` at the top level, or `(AND (ForAll … .xs) (EQ @.a 1))`
where the second `@` is outside any predicate.
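
The depth-counter walk can be sketched on a toy AST. The `Node` enum and `check_scope` function are hypothetical simplifications: they keep only the distinction between `@` symbols, other leaves, ordinary operators, and quantifiers, and they return a string error code instead of the crate's error type.

```rust
/// Toy AST with just enough shape to show the predicate-depth rule.
enum Node {
    /// An `@`-rooted symbol.
    AtSymbol,
    /// Any other leaf (a literal or a `.`-rooted symbol).
    Leaf,
    /// A non-quantifier operator with child expressions.
    Op(Vec<Node>),
    /// A quantifier with a predicate position and an operand position.
    Quantifier { predicate: Box<Node>, operand: Box<Node> },
}

/// Returns an error if an `@` symbol appears at predicate depth 0.
fn check_scope(node: &Node, depth: usize) -> Result<(), &'static str> {
    match node {
        Node::AtSymbol if depth == 0 => Err("E010"),
        Node::AtSymbol | Node::Leaf => Ok(()),
        Node::Op(children) => children.iter().try_for_each(|c| check_scope(c, depth)),
        Node::Quantifier { predicate, operand } => {
            // The predicate is one level deeper; the operand stays put.
            check_scope(predicate, depth + 1)?;
            check_scope(operand, depth)
        }
    }
}
```

Passing `depth + 1` down the predicate branch plays the role of the increment and decrement described above, since the caller's own `depth` is untouched when the recursive call returns.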

### Phase 2 — Executor

[src/executor.rs](src/executor.rs) drives evaluation through two mutually
recursive functions:

- `eval_bool(expr, symbols, opts, scope)` — evaluates a `SpannedBoolExpr` to
  `Result<bool, NightjarLanguageError>`. Dispatches on the `BoolExpr` variant.
- `eval_value(expr, symbols, opts, scope)` — evaluates a `SpannedValueExpr` to
  `Result<Entity, …>`. Dispatches on `ValueExpr`.

The `scope` parameter is `Option<&Entity>`: the current iteration element
bound inside a quantifier predicate, or `None` at the top level.
Element-rooted (`@`) symbol resolution reads from `scope`; a `None` `scope`
combined with an `@` symbol is a defensive `ScopeError` (in practice
`validate_scope` catches this first).

The quantifier arm branches on predicate kind:

- **Partial verifier / `NonEmpty`** → `resolve_predicate` pre-evaluates the
  bound operand once, then calls
  `quantifier::apply_quantifier(op, &EvalPredicate, &operand, epsilon, span)`.
- **Full predicate** → calls
  `quantifier::apply_quantifier_full(op, &operand, span, closure)` where
  `closure: &Entity → Result<bool, …>` invokes `eval_bool` with the element
  bound in `scope`. Full predicates re-evaluate their body per element, which
  is how `@` inside the body resolves.

Top-level evaluation always starts with `scope = None`.

---

## Public API reference

All of the following are re-exported from the crate root
([src/lib.rs](src/lib.rs)). Consumers should `use nightjar_lang::{…}`.

### Parser

```rust
pub fn parse(input: &str) -> Result<Program, NightjarLanguageError>;

pub fn parse_with_config(
    input: &str,
    config: &ParserConfig,
) -> Result<Program, NightjarLanguageError>;

pub struct ParserConfig {
    pub max_depth: usize,   // default 256
}
```

`parse` is a convenience wrapper around `parse_with_config` using the default
`ParserConfig`. Both return a `Program` whose top-level expression is a
`SpannedBoolExpr`.

### AST

```rust
pub struct Program { pub expr: SpannedBoolExpr }

pub struct Spanned<T> { pub node: T, pub span: Span }
pub type   SpannedBoolExpr  = Spanned<BoolExpr>;
pub type   SpannedValueExpr = Spanned<ValueExpr>;

pub enum BoolExpr {
    Literal(bool),
    Verifier    { op: VerifierOp,    left:  Box<SpannedValueExpr>,
                                     right: Box<SpannedValueExpr> },
    And(Box<SpannedBoolExpr>, Box<SpannedBoolExpr>),
    Or (Box<SpannedBoolExpr>, Box<SpannedBoolExpr>),
    Not(Box<SpannedBoolExpr>),
    UnaryCheck  { op: UnaryCheckOp,  operand: Box<SpannedValueExpr> },
    Quantifier  { op: QuantifierOp,
                  predicate: Spanned<Predicate>,
                  operand:   Box<SpannedValueExpr> },
}

pub enum ValueExpr {
    Literal(Literal),
    Symbol   { root: SymbolRoot, path: String },
    FuncCall { op: FuncOp, args: Vec<SpannedValueExpr> },
}

pub enum Predicate {
    PartialVerifier { op: VerifierOp, bound: Box<SpannedValueExpr> },
    UnaryCheck(UnaryCheckOp),
    Full(Box<SpannedBoolExpr>),
}

pub enum Literal  { Int(i64), Float(f64), String(String), Bool(bool), Null }

pub enum VerifierOp   { EQ, NE, LT, LE, GT, GE }
pub enum UnaryCheckOp { NonEmpty }
pub enum QuantifierOp { ForAll, Exists }
pub enum FuncOp {
    Add, Sub, Mul, Div, Mod, Neg, Abs,
    Concat, Length, Substring, Upper, Lower,
    Head, Tail, Get, Count, GetKeys, GetValues,
}
pub enum Keyword { /* unified keyword enum used by the tokenizer */ }
```

`Spanned<T>` exists so every AST node carries its source span for diagnostics;
future passes that want to annotate nodes should wrap in `Spanned` rather
than threading spans separately.

### Runtime

```rust
pub enum Entity {
    Int(i64), Float(f64), String(String), Bool(bool),
    List(Vec<Entity>), Map(std::collections::HashMap<String, Entity>), Null,
}

pub enum TypeTag { Int, Float, String, Bool, List, Map, Null }

impl Entity {
    pub fn type_tag(&self) -> TypeTag;
    pub fn is_non_empty(&self) -> bool;
}

// Always-on conversions:
impl From<i64>     for Entity;
impl From<f64>     for Entity;
impl From<bool>    for Entity;
impl From<String>  for Entity;
impl From<&str>    for Entity;

// With the `json` feature:
#[cfg(feature = "json")]
impl From<serde_json::Value> for Entity;
```

```rust
pub struct SymbolTable { /* private */ }

impl SymbolTable {
    pub fn from_entity(root: Entity) -> Self;
    pub fn resolve(&self, symbol: &str, span: Span)
        -> Result<Entity, NightjarLanguageError>;
    pub fn resolve_root_path(&self, path: &str, span: Span)
        -> Result<Entity, NightjarLanguageError>;
    pub fn len(&self) -> usize;
    pub fn is_empty(&self) -> bool;
    pub fn contains(&self, symbol: &str) -> bool;
}

#[cfg(feature = "json")]
impl SymbolTable {
    pub fn from_json(value: serde_json::Value) -> Self;
}
```

```rust
pub struct ExecOptions {
    pub float_epsilon: f64,   // default 1e-10
    pub max_depth:     usize, // default 256
}
impl Default for ExecOptions { /* the defaults above */ }

pub enum ExecResult { True, False, Error(NightjarLanguageError) }

impl ExecResult {
    pub fn is_true(&self) -> bool;
    pub fn is_false(&self) -> bool;
    pub fn is_error(&self) -> bool;
}

impl From<Result<bool, NightjarLanguageError>> for ExecResult;

pub fn exec_entity(expression: &str, data: Entity, options: ExecOptions)
    -> ExecResult;

#[cfg(feature = "json")]
pub fn exec(expression: &str, data: serde_json::Value, options: ExecOptions)
    -> ExecResult;
```

### Errors

```rust
pub struct Span { pub start: usize, pub end: usize }
impl Span {
    pub const fn new(start: usize, end: usize) -> Self;
    pub const fn point(at: usize)               -> Self;
}

pub enum ErrorCode { E001, E002, E003, E004, E005, E006, E007, E008, E009, E010 }

pub enum NightjarLanguageError {
    ParseError       { span: Span, code: ErrorCode, message: String },
    TypeError        { span: Span, code: ErrorCode, message: String },
    ArgumentError    { span: Span, code: ErrorCode, message: String },
    SymbolNotFound   { span: Span, code: ErrorCode, message: String },
    AmbiguousSymbol  { span: Span, code: ErrorCode, message: String },
    DivisionByZero   { span: Span, code: ErrorCode, message: String },
    RecursionError   { span: Span, code: ErrorCode, message: String },
    IndexError       { span: Span, code: ErrorCode, message: String },
    IntegerOverflow  { span: Span, code: ErrorCode, message: String },
    ScopeError       { span: Span, code: ErrorCode, message: String },
}

impl NightjarLanguageError {
    pub fn span(&self)    -> Span;
    pub fn code(&self)    -> ErrorCode;
    pub fn message(&self) -> &str;
}
```

Error construction helpers (`parse_error`, `type_error`, …) live in
[src/error.rs](src/error.rs) and are `pub(crate)` — they are internal
conveniences, not part of the public API. Downstream code inspects errors
through `.code()`, `.span()`, `.message()`.
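
A host might turn those three accessors into a caret-style diagnostic. The sketch below is one way to do it, under stated assumptions: `Span` is mirrored locally, the span is treated as a half-open byte range into a single-line ASCII expression, and the error code is passed as a string because this document does not say how `ErrorCode` is formatted. `render_diagnostic` is a hypothetical helper, not part of the crate.

```rust
// Local mirror of the crate's `Span` so the sketch is self-contained.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span { pub start: usize, pub end: usize }

/// Hypothetical helper: render a three-line diagnostic with carets under
/// the offending range. Assumes a single-line ASCII `source` and a
/// half-open byte range; out-of-range offsets are clamped.
pub fn render_diagnostic(source: &str, code: &str, message: &str, span: Span) -> String {
    let start = span.start.min(source.len());
    let end = span.end.clamp(start, source.len());
    // Always draw at least one caret, even for an empty (point) span.
    let width = (end - start).max(1);
    format!(
        "error[{}]: {}\n  {}\n  {}{}",
        code,
        message,
        source,
        " ".repeat(start),
        "^".repeat(width),
    )
}
```

In real use the `code`, `message`, and `span` arguments would come from `.code()`, `.message()`, and `.span()` on the error value.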

---

## Error codes — full reference

The table lists every `ErrorCode` variant, the component that raises it, and
a minimal reproducing expression or condition.

| Code | Variant            | Raised by              | Minimal reproducer                                                 |
|------|--------------------|------------------------|--------------------------------------------------------------------|
| E001 | `ParseError`       | Tokenizer, parser      | `GT 1 2` (no parens); `(GT 1 2` (unclosed); `"abc` (unterminated). |
| E002 | `TypeError`        | Verifier, functions, quantifier | `(GT "a" 1)`; `(Head 42)`; `(ForAll (GT 0) .map)`.        |
| E003 | `ArgumentError`    | Parser (arity check)   | `(GT 1 2 3)`; `(Add 1)`; `(Substring "a" 0)`.                      |
| E004 | `SymbolNotFound`   | Symbol resolver, `Get` on Map | `(GT .absent 0)` against `{}`; `(Get .m "missing")`.       |
| E005 | `AmbiguousSymbol`  | Reserved — not raised today | *(no reproducer; placeholder for future shorthand lookup)*     |
| E006 | `DivisionByZero`   | `Div`, `Mod`           | `(Div 1 0)`; `(Mod 1 0.0)`.                                        |
| E007 | `RecursionError`   | Parser (depth guard)   | `(NOT (NOT (NOT …)))` deeper than `max_depth` (default 256).       |
| E008 | `IndexError`       | `Head`, `Tail`, `Get` on List | `(Head [])`; `(Tail [])`; `(Get [1,2] 5)`.                  |
| E009 | `IntegerOverflow`  | Checked arithmetic     | `(EQ (Add 9223372036854775807 1) 0)`.                              |
| E010 | `ScopeError`       | `validate_scope` (and defensive runtime check) | `(EQ @.a 1)` at top level.                 |

E005 is reserved for a future shorthand-lookup mode (leaf-name resolution
with ambiguity detection). Tools should accept it as a valid code but should
not expect to see it from the current executor.

---

## Architecture and module layout

All library code lives under `src/`; integration tests live under `tests/`.

| Path | Responsibility |
|------|----------------|
| [src/lib.rs](src/lib.rs) | Crate root and public re-exports. The authoritative list of what is `pub`. |
| [src/error.rs](src/error.rs) | `NightjarLanguageError`, `ErrorCode`, `Span`, internal `pub(crate)` helper constructors. |
| [src/language/grammar.rs](src/language/grammar.rs) | AST types, operator enums (`VerifierOp`, `FuncOp`, `QuantifierOp`, `UnaryCheckOp`, `Keyword`), `Predicate`, `Literal`, `SymbolRoot`, `Spanned`, `FuncOp::expected_arity`, authoritative EBNF in the module doc-comment. |
| [src/language/parser.rs](src/language/parser.rs) | Tokenizer, recursive-descent parser, `ParserConfig`, `parse`, `parse_with_config`, post-parse `validate_scope`. |
| [src/symbol_table.rs](src/symbol_table.rs) | `SymbolTable`, flattening algorithm, `resolve_in_entity` (element-rooted walker). |
| [src/executor.rs](src/executor.rs) | `ExecOptions`, `ExecResult`, `exec`, `exec_entity`, private `eval_bool` / `eval_value` / `resolve_predicate`. |
| [src/context/mod.rs](src/context/mod.rs) | Module grouping. |
| [src/context/entity.rs](src/context/entity.rs) | `Entity`, `TypeTag`, `is_non_empty`, `From` impls (including `serde_json::Value` under the `json` feature). |
| [src/context/verifier.rs](src/context/verifier.rs) | `apply_verifier`: EQ/NE/LT/LE/GT/GE dispatch, epsilon equality, NaN handling. |
| [src/context/function.rs](src/context/function.rs) | `apply_function`: arithmetic, string, collection functions. |
| [src/context/quantifier.rs](src/context/quantifier.rs) | `EvalPredicate`, `apply_predicate`, `apply_quantifier`, `apply_quantifier_full`. |
| [src/context/connective.rs](src/context/connective.rs) | `apply_and`, `apply_or`, `apply_not`. |
| [tests/test_parser.rs](tests/test_parser.rs) | Phase-1 integration tests. |
| [tests/test_executor.rs](tests/test_executor.rs) | Phase-2 integration tests. |

The directory structure mirrors the two-phase pipeline: `language/*` is
everything the parser needs, `context/*` is everything the runtime needs,
and `executor.rs` + `symbol_table.rs` glue them together.

---

## Extending the language

All recipes below assume you are editing the crate in-place. Every extension
should ship with tests — see [Testing strategy](#testing-strategy).

### Recipe A — Add a new built-in function

Suppose you are adding a `Reverse` function that takes a `String` or a `List`
and returns the reversed value.

1. **Grammar layer** ([src/language/grammar.rs](src/language/grammar.rs)):
   - Add `Reverse` to `FuncOp`.
   - Add an entry in `FuncOp::expected_arity` returning `1`.
   - Add a `Reverse` variant to the `Keyword` enum (and to any
     operator-name → `Keyword` mapping used by the tokenizer).
   - Update the EBNF comment to list `Reverse` under `arith_op` /
     `string_op` / `collection_op` as appropriate. Keep this block in sync
     with this document's `## Formal language specification` section.
2. **Tokenizer** ([src/language/parser.rs](src/language/parser.rs)):
   - Register the keyword string so the tokenizer emits the new `Keyword`
     variant.
3. **Parser** ([src/language/parser.rs](src/language/parser.rs)):
   - `func_expr` parsing is driven by `FuncOp::expected_arity`, so usually
     nothing new is needed. Verify by adding a parse test.
4. **Runtime** ([src/context/function.rs](src/context/function.rs)):
   - Extend the match arms in `apply_function` to handle `FuncOp::Reverse`.
   - Return the right `TypeTag`-tagged result; use `type_error` for bad
     input types; reuse the existing error helpers.
5. **Public re-exports** ([src/lib.rs](src/lib.rs)):
   - No change is needed if `FuncOp` is already re-exported (it is).
6. **Tests**:
   - Add unit tests in `#[cfg(test)] mod tests` inside
     [src/context/function.rs](src/context/function.rs) for the happy path
     and each error branch.
   - Add at least one integration test in
     [tests/test_parser.rs](tests/test_parser.rs) (parses) and
     [tests/test_executor.rs](tests/test_executor.rs) (evaluates).
7. **Documentation**:
   - Update the operator table in [README.md](README.md) under *Operator
     cheat-sheet*.
   - Update the relevant subsection under *Operator semantics* in this file.
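
The runtime step (step 4) could look roughly like the sketch below. It is not the crate's code: `Entity` is trimmed to three variants, `TypeError` is a stand-in for the crate's error type and its `type_error` helper, and `reverse` is a free function rather than an arm inside `apply_function`. Reversing a string by `char` is also an assumption about the intended semantics.

```rust
// Trimmed local stand-ins for the crate's `Entity` and error type.
#[derive(Debug, Clone, PartialEq)]
pub enum Entity { Int(i64), String(String), List(Vec<Entity>) }

#[derive(Debug, PartialEq)]
pub struct TypeError(pub String);

/// Roughly what a `FuncOp::Reverse` arm might do: accept `String` or
/// `List`, reject every other type explicitly.
pub fn reverse(arg: &Entity) -> Result<Entity, TypeError> {
    match arg {
        // Reverses by `char`, so multi-byte code points stay intact;
        // grapheme clusters are not handled.
        Entity::String(s) => Ok(Entity::String(s.chars().rev().collect())),
        Entity::List(items) => Ok(Entity::List(items.iter().rev().cloned().collect())),
        other => Err(TypeError(format!(
            "Reverse expects String or List, got {:?}",
            other
        ))),
    }
}
```

The explicit fallback arm is the part worth copying: it keeps the function total over `Entity`, which is the same property Recipe D asks every operator to preserve.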

### Recipe B — Add a new verifier

Adding, say, `Contains` (string contains substring):

1. Add `Contains` to `VerifierOp` (or, if it's genuinely a new family,
   create a new enum alongside `VerifierOp`). If in doubt, prefer a new
   family — verifiers are currently defined as total orders plus equality,
   and `Contains` breaks that.
2. If it lands in `VerifierOp`: extend `apply_verifier` in
   [src/context/verifier.rs](src/context/verifier.rs) with the new arm,
   including type checks and `TypeError` for bad inputs.
3. Extend tokenizer, parser arity, and EBNF as in Recipe A.
4. Tests + docs as in Recipe A.

### Recipe C — Add a new quantifier

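Example: `Count` (count elements satisfying a predicate). Note that this would
return an `Int`, not a `Bool`, so it belongs in a new family (value-producing
quantifier), not in `QuantifierOp`. `FuncOp` already has a `Count` variant, so
a real implementation would also need a distinct keyword.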

1. Decide whether it is boolean-returning (goes alongside `ForAll`/`Exists`)
   or value-returning (goes alongside `FuncOp`). Boolean quantifiers reuse
   the `Quantifier` arm of `BoolExpr`; value-returning quantifiers need a
   new AST variant — plan that change first.
2. For a boolean quantifier: add a variant to `QuantifierOp`; extend
   `apply_quantifier` / `apply_quantifier_full` with the new reduction;
   extend `eval_bool`'s quantifier arm if new predicate shapes are needed.
3. For a value-returning quantifier: add a new `ValueExpr` variant (e.g.
   `ValueQuantifier { op, predicate, operand }`), extend the parser with a
   new parse arm, add an executor arm in `eval_value`. Re-export the new
   AST types from `lib.rs`.
4. Scope validator: entering the predicate position must still increment
   `predicate_depth`, otherwise `@` will escape.
5. Tests + docs as in Recipe A.
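
For the value-returning case (step 3), the evaluation side might be sketched as follows. Everything here is illustrative: `Entity` and `TypeError` are local stand-ins, `count_where` is a hypothetical name, the predicate is modelled as a closure rather than the crate's `Predicate` AST, and the sketch rejects non-list operands instead of reproducing the scalar fallback that `apply_quantifier` has.

```rust
// Trimmed local stand-ins for the crate's `Entity` and error type.
#[derive(Debug, Clone, PartialEq)]
pub enum Entity { Int(i64), List(Vec<Entity>) }

#[derive(Debug, PartialEq)]
pub struct TypeError(pub String);

/// Hypothetical value-returning quantifier: count the list elements for
/// which `pred` holds. An error from any element aborts the whole count.
pub fn count_where<F>(operand: &Entity, mut pred: F) -> Result<Entity, TypeError>
where
    F: FnMut(&Entity) -> Result<bool, TypeError>,
{
    let items = match operand {
        Entity::List(items) => items,
        // Simplification: the sketch has no scalar fallback.
        other => return Err(TypeError(format!("expected List, got {:?}", other))),
    };
    let mut matches: i64 = 0;
    for item in items {
        if pred(item)? {
            matches += 1;
        }
    }
    Ok(Entity::Int(matches))
}

/// Stand-in for the partial verifier `(GT 0)` applied to one element.
pub fn gt_zero(element: &Entity) -> Result<bool, TypeError> {
    match element {
        Entity::Int(v) => Ok(*v > 0),
        other => Err(TypeError(format!("GT expects Int, got {:?}", other))),
    }
}
```

The result is an `Entity::Int`, which is why such an operator has to live on the `ValueExpr` side of the AST rather than in `BoolExpr::Quantifier`.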

### Recipe D — Add a new data type

Any change to `Entity` is load-bearing; every operator that inspects
`TypeTag` potentially needs updating.

1. Add the variant to `Entity` and `TypeTag` in
   [src/context/entity.rs](src/context/entity.rs). Implement `type_tag()`
   and `is_non_empty()`; both must remain total.
2. Provide `From` impls as appropriate for host integrations. If the `json`
   feature has to represent the new type, update `From<serde_json::Value>`.
3. Update the flattener in [src/symbol_table.rs](src/symbol_table.rs) so
   that the new type flattens correctly (either descend or not, but make
   the choice explicitly).
4. Update `apply_verifier` in
   [src/context/verifier.rs](src/context/verifier.rs): decide equality
   semantics for the new type, and whether ordering makes sense. Cross-type
   comparisons must remain `TypeError`.
5. Update `apply_function` in
   [src/context/function.rs](src/context/function.rs): every existing op
   must either accept or reject the new type explicitly (current match arms
   must gain a `_ => TypeError` path if they don't already).
6. Update the scalar fallback path in `apply_quantifier`: decide whether the
   new type supports iteration or is treated as a scalar.
7. Update `resolve_in_entity` in
   [src/symbol_table.rs](src/symbol_table.rs): if the new type is
   path-addressable (like Map/List) add a walker arm; otherwise let the
   `_ => TypeError` branch catch it.
8. Tests + docs; update the type table in both [README.md](README.md) and
   the *Data types* subsection here.

### Recipe E — Swap the Map backing or the `Clone` strategy

If you replace `HashMap<String, Entity>` with a different container, the
only externally-visible invariant that must survive is that `GetKeys` and
`GetValues` produce sorted output. If you replace `Entity: Clone` with
`Rc<Entity>`-sharing, every `From` impl, every `apply_*` signature, and
every executor arm that clones will need touching — plan the change as a
whole crate refactor, not an incremental one, and keep the public API
stable.
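
The sorted-output invariant is easy to preserve independently of the container. A minimal sketch for the key side, assuming plain lexicographic `String` ordering (the text above says only "sorted") and using a hypothetical free function `get_keys` rather than the crate's `GetKeys` arm:

```rust
use std::collections::HashMap;

/// Hypothetical helper: return the map's keys in sorted order regardless
/// of the container's iteration order. `HashMap` iteration order is
/// unspecified, so the explicit sort is what upholds the invariant.
pub fn get_keys<V>(map: &HashMap<String, V>) -> Vec<String> {
    let mut keys: Vec<String> = map.keys().cloned().collect();
    keys.sort();
    keys
}
```

With an ordered container such as `BTreeMap` the sort becomes redundant, but keeping it (or a test that asserts the order) guards the invariant against a later swap.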

### EBNF drift check

The EBNF in this file must match the EBNF block in
[src/language/grammar.rs:23-110](src/language/grammar.rs#L23-L110) exactly.
When you add an operator, update both and diff them in your commit. If they
drift, the parser and the documentation disagree and the next contributor
will act on the wrong one.

---

## Testing strategy

Nightjar has three layers of tests.

1. **Module-local unit tests.** Every non-trivial module has
   `#[cfg(test)] mod tests { … }` right at the bottom. These are the first
   line of defence for new behaviour. Every helper function and every match
   arm should have at least one happy-path test and one error-branch test
   (where an error branch exists).
2. **Integration tests.** [tests/test_parser.rs](tests/test_parser.rs) and
   [tests/test_executor.rs](tests/test_executor.rs) exercise the public API
   end-to-end: `parse`, `exec`, `exec_entity`, `ExecResult`, error variants.
   When you add an operator, add at least one parser test (it parses) and
   one executor test (it evaluates correctly on real data).
3. **Property-based testing.** `proptest` is in `[dev-dependencies]`. For
   operators with algebraic properties (associativity of `Concat`,
   commutativity of `Add` on `Int`, idempotence of `Upper ∘ Upper`, …),
   property tests are the appropriate form. Prefer them to hand-rolled
   edge-case tables for anything fuzz-adjacent.
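
To make the third layer concrete, here is the shape of one such property, written with the standard library only so it compiles without dev-dependencies. In the crate itself `proptest` would generate the inputs; the fixed sample grid, `add_int`, and `add_is_commutative_on` are stand-ins invented for this sketch.

```rust
/// Stand-in for the `Add` arm on `Int`: checked addition, with overflow
/// mapped to `None` (the crate reports `IntegerOverflow` instead).
pub fn add_int(a: i64, b: i64) -> Option<i64> {
    a.checked_add(b)
}

/// Property: `Add` on `Int` is commutative, including on inputs that
/// overflow (both orders must overflow together).
pub fn add_is_commutative_on(samples: &[i64]) -> bool {
    samples
        .iter()
        .all(|&a| samples.iter().all(|&b| add_int(a, b) == add_int(b, a)))
}
```

A typical call passes boundary values such as `i64::MIN`, `-1`, `0`, `1`, and `i64::MAX`, which is where a hand-written happy-path test tends to be silent.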

### Running tests

```sh
cargo test                           # default features (json on)
cargo test --no-default-features     # core-only build (no serde_json)
cargo test --features yaml           # yaml dep compiled in
```

CI should run all three to prevent feature-gated regressions.

---

## Design decisions and rationale

### Why prefix notation?

Prefix (S-expression-style) notation removes operator precedence and
associativity entirely. There is no "does `AND` bind tighter than `OR`?"
question because every expression is fully parenthesised. The parser is a
few hundred lines, the grammar is small enough to fit in this document,
and the AST shape is exactly the expression's surface shape. An infix
surface syntax could be added externally later as a layer that compiles
to this AST — the canonical form stays prefix.

### Why a three-valued `ExecResult`?

Formal verification loses its meaning if a missing key silently becomes
`Null` and the rule silently becomes `False`: the host could then no longer
distinguish rule-was-false from rule-could-not-be-evaluated. By carving `Error` out
from the result type, Nightjar forces the host to decide how to handle
each case (log, fail-open, fail-closed, retry, …) rather than collapsing
them at the library boundary.
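
A host-side policy can then be written as an exhaustive match. The sketch below mirrors `ExecResult` locally with the error payload simplified to a `String`; `Decision`, `fail_closed`, and `fail_open` are hypothetical names chosen for the example.

```rust
// Local mirror of the three-valued result; the real `Error` variant
// carries a `NightjarLanguageError`, simplified here to a `String`.
#[derive(Debug, PartialEq)]
pub enum ExecResult { True, False, Error(String) }

#[derive(Debug, PartialEq)]
pub enum Decision { Allow, Deny }

/// Fail-closed: only an explicit `True` allows; an error is treated
/// like a failed rule.
pub fn fail_closed(result: &ExecResult) -> Decision {
    match result {
        ExecResult::True => Decision::Allow,
        ExecResult::False | ExecResult::Error(_) => Decision::Deny,
    }
}

/// Fail-open: only an explicit `False` denies; an error lets the
/// request through.
pub fn fail_open(result: &ExecResult) -> Decision {
    match result {
        ExecResult::False => Decision::Deny,
        ExecResult::True | ExecResult::Error(_) => Decision::Allow,
    }
}
```

The two policies differ only in where `Error` lands, which is exactly the decision a two-valued result would have made on the host's behalf.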

### Why epsilon equality on floats but IEEE ordering?

Equality is the comparison most sensitive to IEEE 754 representation
error: `0.1 + 0.2 != 0.3` is a foot-gun that Nightjar rules should not
step on. Ordering is much less sensitive to the same error (a one-ulp drift
can change the relative order of two floats only when they are already
nearly equal), and the IEEE rules for ordering are already what users expect
from comparisons. Mixing the two would require users to reason about an
epsilon in contexts where it doesn't help them.
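
A minimal sketch of the split, assuming an absolute tolerance with an inclusive bound (neither detail is stated above). `float_eq` and `float_lt` are illustrative names, not the crate's functions, and the sketch makes no claim about how the crate treats NaN.

```rust
/// Equality within an absolute tolerance. The documented default for
/// `ExecOptions::float_epsilon` is 1e-10; whether the crate's comparison
/// is inclusive is an assumption of this sketch.
pub fn float_eq(a: f64, b: f64, epsilon: f64) -> bool {
    (a - b).abs() <= epsilon
}

/// Ordering uses the plain IEEE comparison, with no tolerance applied.
pub fn float_lt(a: f64, b: f64) -> bool {
    a < b
}
```

With these definitions `float_eq(0.1 + 0.2, 0.3, 1e-10)` holds even though the two values differ under `==`, while `float_lt(0.3, 0.1 + 0.2)` is also true because ordering sees the raw representations.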

### Why 0-based list indexing via `_N`?

0-based aligns with Rust, JavaScript, Python, C, and nearly every modern
language; 1-based would surprise most implementers. The `_` prefix keeps
the index segment syntactically distinct from map keys (which start with a
letter or digit-less identifier), and the same convention is used both in
the flat symbol table and in `resolve_in_entity`.

### Why flatten into a HashMap?

Most Nightjar rules look up several fields of the same payload; a flat
table makes each lookup O(1) after a single O(N) build. Path-walking at
every symbol reference would be cheaper in memory but much more expensive
per lookup, especially for rules with many references. The trade-off
matters most for wide, shallow data (typical API payloads); it's worse for
very long lists, which is why the host is expected to bound list size.
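
The build step can be pictured with the sketch below. It is not the crate's flattener: `Entity` is trimmed to three variants, the path format (`.key` for map fields, `._N` for list elements) is inferred from examples elsewhere in this document, and registering intermediate containers as well as leaves is an assumption.

```rust
use std::collections::HashMap;

// Trimmed local stand-in for the crate's `Entity`.
#[derive(Debug, Clone, PartialEq)]
pub enum Entity { Int(i64), List(Vec<Entity>), Map(HashMap<String, Entity>) }

/// Walk `entity` once and register every nested node under its dotted
/// path. Call with an empty `prefix` for the root.
pub fn flatten(entity: &Entity, prefix: &str, out: &mut HashMap<String, Entity>) {
    match entity {
        Entity::Map(fields) => {
            for (key, value) in fields {
                let path = format!("{}.{}", prefix, key);
                out.insert(path.clone(), value.clone());
                flatten(value, &path, out);
            }
        }
        Entity::List(items) => {
            // One entry per element: this is the unrolling that makes
            // very long lists expensive.
            for (index, item) in items.iter().enumerate() {
                let path = format!("{}._{}", prefix, index);
                out.insert(path.clone(), item.clone());
                flatten(item, &path, out);
            }
        }
        // Leaf: already registered by its parent, nothing to descend into.
        Entity::Int(_) => {}
    }
}
```

After the single walk, a reference such as `.data.revenue` is one hash lookup, whatever the nesting depth of the original payload.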

### Why no short-circuit in AND / OR today?

Error visibility. If `AND` short-circuits and the right-hand side would
have errored, the rule's author never learns. Non-short-circuit evaluation
surfaces every error, which is the behaviour a verification tool wants.
If a future release adds opt-in short-circuit (via `ExecOptions`), it must
document that errors in the skipped branch are hidden.
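
The behaviour can be sketched as a function over two already-evaluated operands. `EvalError` and `and_no_short_circuit` are hypothetical names, and reporting the left error first when both sides fail is an assumption of the sketch, not something this document specifies.

```rust
// Stand-in for the crate's error type.
#[derive(Debug, Clone, PartialEq)]
pub struct EvalError(pub String);

/// Non-short-circuit AND: both operands are evaluated before this is
/// called, so an error on either side surfaces even when the other side
/// is `false`. If both sides fail, the left error is returned.
pub fn and_no_short_circuit(
    left: Result<bool, EvalError>,
    right: Result<bool, EvalError>,
) -> Result<bool, EvalError> {
    let l = left?;
    let r = right?;
    Ok(l && r)
}
```

A short-circuiting version would return `Ok(false)` for a false left operand without ever looking at the right one, which is precisely the case where an error would go unnoticed.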

### Why `@` as a separate sigil, not a lambda?

A lambda would bring first-class functions, closures over names, and a
name-resolution layer into the language. Nightjar is deliberately
first-order — predicates are syntactic forms, not values. `@` is a lexical
marker that means "the current element of the innermost quantifier". It
has no runtime representation other than a value binding, and it cannot
escape its quantifier.
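
The "cannot escape" rule amounts to a depth counter threaded through a tree walk. The sketch below uses a deliberately tiny AST (`Expr`) rather than the crate's `BoolExpr` / `ValueExpr`, and a unit `ScopeError` in place of the real error variant. Treating the quantifier's operand as belonging to the enclosing scope is an assumption.

```rust
// Miniature AST for the sketch; not the crate's types.
pub enum Expr {
    RootSymbol(String),    // `.path`
    ElementSymbol(String), // `@.path`
    Not(Box<Expr>),
    Quantifier { predicate: Box<Expr>, operand: Box<Expr> },
}

#[derive(Debug, PartialEq)]
pub struct ScopeError;

/// Reject any `@` that is not inside a quantifier's predicate position.
/// Start the walk with `predicate_depth = 0`.
pub fn validate_scope(expr: &Expr, predicate_depth: usize) -> Result<(), ScopeError> {
    match expr {
        Expr::RootSymbol(_) => Ok(()),
        Expr::ElementSymbol(_) if predicate_depth == 0 => Err(ScopeError),
        Expr::ElementSymbol(_) => Ok(()),
        Expr::Not(inner) => validate_scope(inner, predicate_depth),
        Expr::Quantifier { predicate, operand } => {
            // Assumption: the operand is checked in the enclosing scope;
            // only the predicate gains access to `@`.
            validate_scope(operand, predicate_depth)?;
            validate_scope(predicate, predicate_depth + 1)
        }
    }
}
```

Any new AST form that introduces a predicate position has to perform the same `+ 1`, otherwise a valid `@` inside it would be rejected.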

---

## Known limitations and deferred features

- **Shorthand symbol resolution (E005).** Looking up a leaf name like
  `revenue` without the full path `.data.revenue` and reporting
  `AmbiguousSymbol` when it matches multiple paths is planned but not
  implemented. The strict, fully-qualified form is the only form today.
- **Short-circuit evaluation.** Not available today; see the rationale
  above.
- **REPL.** There is no interactive shell for Nightjar rules; the
  batch/CLI pattern in the README serves the same purpose.
- **Infix → prefix converter.** An external convenience tool would let
  users write `1 + 2 > 0` and compile it to `(GT (Add 1 2) 0)`. Out of
  scope for the language itself; a reasonable standalone crate.
- **Currying beyond quantifier predicates.** The partial-verifier form
  `(GT 0)` is the only currying the language does. Generalising it is
  possible (arity-based disambiguation already discriminates partial from
  full) but explicitly deferred.
- **`no_std` / WASM.** Not a current target. Neither `std` removal nor a
  dedicated WASM build is in scope today.
- **Unbounded list unrolling.** The flattener registers one symbol-table
  entry per list element. The host is responsible for bounding list
  sizes before passing data to Nightjar; there is no configurable
  upper bound in the library.
- **`Program`-accepting `exec`.** Today, `exec` / `exec_entity` re-parse
  on every call. A future release may add a variant that accepts a
  pre-built `Program` for hot loops; for now, consumers that need
  parse-once behaviour can drive evaluation themselves using the public
  AST.

---

## License

Licensed under the Apache License, Version 2.0.
See [LICENSE](LICENSE) for the full text.

Copyright © Wayne Hong (h-alice) &lt;contact@halice.art&gt;.