# `has` array-RHS overload — implementation spec
Status: proposal, unimplemented.
Owner: TBD.
Affected crates: `jetro-core` (parser, parse tests, builtin ops).
## 1. Problem
`lhs has rhs` is parsed today (`parse/parser.rs::parse_contains`, lines 531-544)
as syntactic sugar for `lhs.includes(rhs)`. `includes` is single-needle
membership — it expects `rhs` to be a scalar.
When a user writes an array literal on the right:
```text
$..books!.find_all(@.attribute_code has ["a", "b"])
```
the predicate silently degenerates:
- If `@.attribute_code` is a **string**: `includes_apply` (ops/misc.rs:162)
hits the `Val::Str` arm and computes `s.contains(item.as_str().unwrap_or_default())`.
`as_str()` on `Val::Arr` returns `None` and `unwrap_or_default()` yields `""`.
Every string contains `""`, so the predicate is `true` for every row and
`find_all` returns the entire input. This is the bug we want to fix.
- If `@.attribute_code` is an **array**: `val_to_key` serialises the literal
to the JSON string `["a","b"]`, no array element matches that string, the
predicate is `false` for every row.
Either branch is wrong relative to user intent ("the field contains both `a`
and `b`"). The string branch is the dangerous one because failure is silent.
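The string-branch failure can be reproduced in isolation. The following is a minimal sketch, not the engine's code: `Val` here is a hypothetical two-variant stand-in, and `includes_str_arm` is an illustrative name for the `Val::Str` arm described above.

```rust
// Hypothetical stand-in for the engine's value type; only the two shapes
// needed to reproduce the failure are modelled.
#[derive(Debug, Clone, PartialEq)]
enum Val {
    Str(String),
    Arr(Vec<Val>),
}

impl Val {
    // Only string values yield `Some`; an array yields `None`.
    fn as_str(&self) -> Option<&str> {
        match self {
            Val::Str(s) => Some(s.as_str()),
            _ => None,
        }
    }
}

// Model of the `Val::Str` arm: a substring test with a defaulted needle.
// When `item` is an array, the needle silently becomes "".
fn includes_str_arm(haystack: &str, item: &Val) -> bool {
    haystack.contains(item.as_str().unwrap_or_default())
}
```

With an array argument such as `Val::Arr(vec![Val::Str("a".into()), Val::Str("b".into())])`, this model reports a match for any haystack, including one that contains neither `a` nor `b`.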
## 2. Proposed semantics
Make `has` accept an array literal on the right and dispatch to a containment
check that means "every element of RHS is present in LHS". Keep existing
scalar-RHS behaviour unchanged.
| Receiver (LHS) | Scalar RHS (unchanged) | Array-literal RHS (new) |
| --- | --- | --- |
| `Arr` / typed vec | element equality (`includes`) | every literal appears as an element (subset) |
| `Str` / `StrSlice` | substring containment (`includes`) | every needle is a substring |
| `Obj` / `ObjSmall` | key existence (`includes`) | every key exists |
| other | `false` | `false` |
Design rules:
1. Overload `has`, do not introduce a second keyword. `in` stays reserved for
`for x in xs` / `let ... in ...` (grammar.pest:14, 185-215).
2. The overload triggers only when the RHS *is syntactically an array literal*
(`Expr::Array(...)`) at parse time. A runtime array produced by a
sub-expression keeps `includes` semantics — we do not want runtime branching
inside `includes` itself, and we do not want to make `has` polymorphic on a
value whose shape isn't visible to the planner.
3. RHS element types must be scalar literals (`Val::Str`, `Val::Int`, `Val::Float`,
`Val::Bool`, `Val::Null`). Reject non-literal or nested array elements at
parse time with a clear error — do not silently fall through to `includes`.
4. Empty RHS (`lhs has []`) is `true` (vacuous truth — every element of the
empty set is present). Document this explicitly in the parser test.
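The array-literal semantics above can be modelled in a few lines. This is a sketch under stated assumptions rather than the proposed implementation: `Val` is a hypothetical trimmed-down enum (typed vecs, `StrSlice`, and `ObjSmall` are omitted), `has_all` is an illustrative free function, and the model assumes rule 4 (empty RHS) takes precedence over the "other" row. It also uses strict equality and treats non-string needles as absent for string and object receivers, which is a simplification.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for the engine's `Val`.
#[derive(Debug, Clone, PartialEq)]
enum Val {
    Null,
    Bool(bool),
    Int(i64),
    Str(String),
    Arr(Vec<Val>),
    Obj(BTreeMap<String, Val>),
}

// Array-literal column of the table: every needle must be present.
fn has_all(recv: &Val, needles: &[Val]) -> bool {
    // Rule 4: an empty RHS is vacuously true (assumed to win over "other").
    if needles.is_empty() {
        return true;
    }
    match recv {
        // Subset check by element equality.
        Val::Arr(items) => needles.iter().all(|n| items.contains(n)),
        // Every needle must be a substring.
        Val::Str(s) => needles.iter().all(|n| match n {
            Val::Str(needle) => s.contains(needle.as_str()),
            _ => false, // simplification: no string coercion in this model
        }),
        // Every needle must be an existing key.
        Val::Obj(map) => needles.iter().all(|n| match n {
            Val::Str(key) => map.contains_key(key),
            _ => false, // simplification: no string coercion in this model
        }),
        // Any other receiver: false.
        _ => false,
    }
}
```

The early return is written out to make rule 4 visible; `Iterator::all` on an empty iterator already returns `true`, so only the "other" row depends on it.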
## 3. Parser change
File: `jetro-core/src/parse/parser.rs`, function `parse_contains`
(currently lines 531-544).
Today:
```rust
fn parse_contains(pair: Pair<Rule>) -> Expr {
    let mut inner = pair.into_inner();
    let lhs = parse_expr(inner.next().unwrap());
    match inner.next() {
        None => lhs,
        Some(_op_pair) => {
            let rhs = parse_expr(inner.next().unwrap());
            Expr::Chain(
                Box::new(lhs),
                vec![Step::Method("includes".to_string(), vec![Arg::Pos(rhs)])],
            )
        }
    }
}
```
Change: when `rhs` is `Expr::Array(elems)` and every element is a literal,
lower the chain step to a new builtin `has_all` taking the literal vector as
its argument. Otherwise keep the existing `includes` lowering.
Pseudocode:
```rust
Some(_op_pair) => {
    let rhs = parse_expr(inner.next().unwrap());
    let step = match &rhs {
        Expr::Array(elems) if elems.iter().all(is_scalar_literal) => {
            let lits = elems.iter().map(literal_to_val).collect::<Vec<_>>();
            Step::Method("has_all".into(), vec![Arg::Pos(Expr::Literal(Val::arr(lits)))])
        }
        Expr::Array(_) => {
            // Non-literal element inside `has [...]`. Reject loudly.
            return Expr::ParseError(
                "has [...] requires scalar literal elements; \
                 use contains_all/contains_any for dynamic arrays".into(),
            );
        }
        _ => Step::Method("includes".into(), vec![Arg::Pos(rhs)]),
    };
    Expr::Chain(Box::new(lhs), vec![step])
}
```
Helpers to add in the same file (private):
- `fn is_scalar_literal(e: &Expr) -> bool` — `true` for `Expr::Literal` whose
inner `Val` is `Str | Int | Float | Bool | Null`.
- `fn literal_to_val(e: &Expr) -> Val` — unwrap the literal; panic with a
clear message if called on a non-literal (the call site already filtered).
Check whether the codebase has an `Expr::ParseError` variant. If not, emit the
error through the parser's existing error channel (look at how other lowering
failures are surfaced — e.g. `classify_chain_write` in `parse/write_terminal.rs`
for a precedent of raising parse-time errors during AST rewriting).
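The helpers and the three-way lowering decision can be sketched in isolation. Everything below is hypothetical scaffolding: `Val`, `Expr`, and `Lowered` are trimmed-down stand-ins, `Expr::Ident` represents any non-literal expression, and `lower_has_rhs` returns a `Result` as one possible way to surface the error if no `Expr::ParseError` variant exists.

```rust
// Hypothetical, trimmed-down value and AST types; the real ones are larger.
#[derive(Debug, Clone, PartialEq)]
enum Val {
    Null,
    Bool(bool),
    Int(i64),
    Float(f64),
    Str(String),
    Arr(Vec<Val>),
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(Val),
    Array(Vec<Expr>),
    Ident(String), // stands in for any non-literal expression
}

// What the right-hand side of `has` lowers to (illustrative only).
#[derive(Debug, PartialEq)]
enum Lowered {
    HasAll(Vec<Val>),
    Includes(Expr),
}

// True only for literals whose value is a scalar; array literals are excluded.
fn is_scalar_literal(e: &Expr) -> bool {
    matches!(
        e,
        Expr::Literal(Val::Str(_) | Val::Int(_) | Val::Float(_) | Val::Bool(_) | Val::Null)
    )
}

// Unwraps a literal; the caller must have filtered with `is_scalar_literal`.
fn literal_to_val(e: &Expr) -> Val {
    match e {
        Expr::Literal(v) => v.clone(),
        other => panic!("literal_to_val called on non-literal: {other:?}"),
    }
}

// Array of scalar literals -> has_all; any other array -> error;
// everything else keeps the existing `includes` lowering.
fn lower_has_rhs(rhs: Expr) -> Result<Lowered, String> {
    match rhs {
        Expr::Array(elems) if elems.iter().all(is_scalar_literal) => {
            Ok(Lowered::HasAll(elems.iter().map(literal_to_val).collect()))
        }
        Expr::Array(_) => Err("has [...] requires scalar literal elements".to_string()),
        other => Ok(Lowered::Includes(other)),
    }
}
```

Note that an empty array passes the `all` guard and lowers to `HasAll(vec![])`, which is what design rule 4 relies on, while a nested array element fails the guard and is rejected.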
## 4. Builtin: `has_all`
Two implementation strategies were considered. Use **B**; A is recorded only to explain why it was rejected.
### A. Reuse existing builtins (parser-only change)
Lower to different existing builtins depending on a syntactic guess:
`contains_all` if the elements are strings, a synthesised `.includes(x) &&
.includes(y) && ...` chain for arrays. Fragile because the planner can't see
the receiver type at parse time and `contains_all` is string-only
(`ops/regex.rs:186-189`). Rejected.
### B. New builtin `has_all` (recommended)
Add a single builtin that does the right thing per receiver type at runtime.
#### B.1 Builtin definition
File: `jetro-core/src/builtins/defs.rs`. Add a new struct in the same style as
`Has` (lines 3469-3494):
```rust
/// `has_all([a, b, ...])`: every literal in the argument is present in
/// the receiver. Arrays: element equality. Strings: substring. Objects:
/// key existence. Empty argument: always `true`. Returns `Val::Bool`.
pub(crate) struct HasAll;
impl Builtin for HasAll {
    const METHOD: BuiltinMethod = BuiltinMethod::HasAll;
    const NAME: &'static str = "has_all";
    fn spec() -> BuiltinSpec {
        BuiltinSpec::new(BuiltinCategory::Scalar, BuiltinCardinality::OneToOne)
            .indexed()
            .view_native()
            .demand_law(BuiltinDemandLaw::MapLike)
            .order_effect(BuiltinPipelineOrderEffect::Preserves)
    }
    #[inline]
    fn apply_args(
        recv: &crate::data::value::Val,
        args: &super::BuiltinArgs,
    ) -> Option<crate::data::value::Val> {
        match args {
            super::BuiltinArgs::Val(v) => super::has_all_apply(recv, v),
            _ => None,
        }
    }
}
```
The `BuiltinMethod::HasAll` variant must be registered:
1. Add `HasAll` to the `BuiltinMethod` enum in `builtins/mod.rs`.
2. Add `HasAll` to the `for_each_builtin!` macro list in `builtins/mod.rs`.
Follow the existing `Has` registration as a template — both registrations
appear next to each other.
Do not add an alias for `has_all`. Users should reach it only through the
`has [...]` sugar; exposing it as a dotted method is a separate decision.
(If we later expose it, do it intentionally with a CHANGELOG note.)
#### B.2 Runtime helper
File: `jetro-core/src/builtins/ops/path.rs` (next to `has_apply` at line 258).
```rust
/// Returns `Val::Bool(true)` when every element of `needles` is present
/// in `recv`. Receiver dispatch mirrors `has_apply`:
/// - `Arr` / typed vecs: element equality (string-coerced).
/// - `Str` / `StrSlice`: substring containment per needle.
/// - `Obj` / `ObjSmall`: key existence per needle.
/// - other: `Val::Bool(false)`, because any non-`true` result from
///   `has_apply` (including `None`) counts as "not present".
///
/// Empty `needles` returns `Val::Bool(true)` (vacuous truth), whatever the
/// receiver. Returns `None` only when `needles` is not a `Val::Arr`.
#[inline]
pub fn has_all_apply(recv: &Val, needles: &Val) -> Option<Val> {
    let Val::Arr(items) = needles else { return None };
    if items.is_empty() {
        return Some(Val::Bool(true));
    }
    let all_present = items.iter().all(|n| {
        let key = crate::util::val_to_key(n);
        matches!(has_apply(recv, &key), Some(Val::Bool(true)))
    });
    Some(Val::Bool(all_present))
}
```
This delegates per-needle to the existing `has_apply` so receiver-type rules
stay in one place. Cost is O(N·M), where N is the number of needles and M is
the size of the receiver. N is statically bounded by the array literal, so
this is fine.
If profiling shows the per-needle `val_to_key` allocation matters, pre-convert
the literal array once at parse time into a `Val::StrVec` and add a fast path
that takes `BuiltinArgs::StrVec`. Out of scope for the initial change.
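The deferred fast path amounts to converting needles to string keys once and reusing them for every row. The sketch below illustrates that idea only: `Needle` and `PreparedNeedles` are hypothetical names, and the coercion shown (decimal text for integers, `true`/`false` for booleans) is an assumption, not the documented behaviour of `val_to_key`.

```rust
// Hypothetical needle type; the real code would start from `Val` literals.
#[derive(Debug, Clone, PartialEq)]
enum Needle {
    Str(String),
    Int(i64),
    Bool(bool),
}

// Needles converted to string keys exactly once, e.g. at parse time.
struct PreparedNeedles {
    keys: Vec<String>,
}

impl PreparedNeedles {
    // One allocation per needle, paid up front instead of per row.
    fn new(needles: &[Needle]) -> Self {
        let keys = needles
            .iter()
            .map(|n| match n {
                Needle::Str(s) => s.clone(),
                Needle::Int(i) => i.to_string(), // assumed coercion
                Needle::Bool(b) => b.to_string(), // assumed coercion
            })
            .collect();
        PreparedNeedles { keys }
    }

    // Per-row check against a string receiver; allocates nothing.
    fn all_in_str(&self, recv: &str) -> bool {
        self.keys.iter().all(|k| recv.contains(k.as_str()))
    }

    // Per-row check against an array receiver already held as string keys.
    fn all_in_keys(&self, recv: &[String]) -> bool {
        self.keys.iter().all(|k| recv.contains(k))
    }
}
```

A `PreparedNeedles` value would be built once per query and shared across all rows that `find_all` visits, which is where the saving over per-needle conversion would come from.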
## 5. Tests
All tests live under `jetro-core/src/tests/`. Add a new file
`tests/has_array_rhs.rs` (or extend the existing `has` tests if there is a
matching file — grep for `has_apply` / `fn test_has` first).
Required cases:
1. **String receiver, all needles present** —
`{"a":"hello world"}` with `$.a has ["hello","world"]` → `true`.
2. **String receiver, one needle missing** —
`$.a has ["hello","xyz"]` → `false`.
3. **Array receiver, subset** —
`{"a":["x","y","z"]}` with `$.a has ["x","y"]` → `true`.
4. **Array receiver, not a subset** —
`$.a has ["x","q"]` → `false`.
5. **Array receiver, numeric elements coerced** —
`{"a":[1,2,3]}` with `$.a has [1,2]` → `true`.
6. **Object receiver, key existence** —
`{"a":{"x":1,"y":2}}` with `$.a has ["x","y"]` → `true`,
`$.a has ["x","z"]` → `false`.
7. **Empty array** —
any receiver with `has []` → `true` (vacuous truth).
8. **Scalar RHS untouched** —
`$.a has "x"` still lowers to `includes` and behaves as today.
9. **The original bug reproduction** —
`$..books!.find_all(@.attribute_code has ["a","b"])` over a fixture where
`attribute_code` is a string equal to `"abc"`:
- Old behaviour (regression guard if we keep it): all books returned.
- New behaviour: only books whose `attribute_code` contains both `a` and `b`.
Write the test against the new behaviour; do not preserve the bug.
10. **Non-literal array RHS is rejected at parse time** —
`@.field has [@.x, @.y]` produces a parse error pointing at the array.
Confirms the parser guard from Section 3 (design rule 3 in Section 2).
Also rerun `cargo test --lib parse::` and `cargo test --lib builtins::` to
catch incidental breakage.
## 6. Documentation
Update:
- `jetro-core/README.md` (or whichever doc enumerates operators) — note that
`has` accepts an array literal on the right and document semantics per
receiver type.
- `CLAUDE.md` "v2 Tier 1" section under `has`/membership — one-line note.
- `CHANGELOG.md` for the next release — under "Language": "`has` now accepts
an array literal RHS, lowering to `has_all`."
## 7. Non-goals
- No new `in` operator. Grammar comment at line 162 already declares `has`
the membership keyword; adding a second spelling is bikeshed bait.
- No runtime polymorphism inside `includes` for `Val::Arr` arguments. Keep
`includes` strictly single-needle so its semantics are predictable.
- No "any" variant via punctuation. If users want OR semantics they write
`x has "a" || x has "b"` or call `contains_any` directly. A future
`has_any([...])` sugar (e.g. `has ?[a, b]`) is a separate proposal.
- No change to the `contains_all` / `contains_any` builtins themselves; they
remain string-only barrier ops.
## 8. Risk and rollout
- **Risk of regression**: low. Scalar-RHS `has` is the dominant case and is
untouched. The change is gated on `Expr::Array` at parse time.
- **Backwards compatibility**: anyone relying on the current
always-true-on-string behaviour (which is a bug) breaks. That's intended.
Call it out in the CHANGELOG.
- **Rollout**: single PR. Parser change, builtin registration, runtime
helper, tests, docs. No flag.
## 9. Estimated size
Around 200 LOC including tests and docs: parser (~30), builtin def +
registration (~40), runtime helper (~25), tests (~80), docs (~25). One focused PR.