A query language for JSON data that searches for matching regular paths in the JSON tree, using a derivation of regular expressions.
§Quick Start
let input = r#"{"users": [{"name": "Alice"}, {"name": "Bob"}]}"#;
let json: jsongrep::Value = serde_json::from_str(input).unwrap();
let results = jsongrep::grep(&json, "users[*].name").unwrap();
assert_eq!(results.len(), 2);
assert_eq!(results[0].value.to_string(), r#""Alice""#);
assert_eq!(results[1].value.to_string(), r#""Bob""#);

For more examples, see the examples directory in the repository.
§Overview
The engine is implemented as a deterministic finite automaton (DFA). The DFA is constructed from a query AST, which is a tree-like structure that represents the query. The DFA is then used to search for matches in the input JSON data.
A JSON data structure is represented as a tree, where each node is a JSON value (string, number, boolean, null, or object/array) and each edge is either a field name or an index. For example, let’s consider the following JSON data:
{
"name": "John Doe",
"age": 30,
"foo": [1, 2, 3]
}

The corresponding tree structure would be the root node, with three edges:
"name", "age", and "foo". The "name" edge would point to the string
"John Doe" and the "age" edge would point to the number 30. The "foo"
edge would point to a node with three edges of the array access [0], [1],
and [2], which point to the numbers 1, 2, and 3, respectively.
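The tree view above can be sketched in a few lines of plain Rust. This is only an illustration under stated assumptions: the `Json` and `Edge` types and the `paths` and `example_doc` functions are hypothetical names introduced here, not part of jsongrep's API (the crate itself re-exports `serde_json_borrow::Value`).

```rust
/// A minimal JSON tree for illustration (hypothetical; not jsongrep's `Value`).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// An edge label in the tree: a field name or an array index.
#[derive(Debug, Clone, PartialEq)]
enum Edge {
    Field(String),
    Index(usize),
}

/// Collect the edge path from the root to every node below it, depth-first.
fn paths(node: &Json, prefix: &mut Vec<Edge>, out: &mut Vec<Vec<Edge>>) {
    match node {
        Json::Object(fields) => {
            for (name, child) in fields {
                prefix.push(Edge::Field(name.clone()));
                out.push(prefix.clone());
                paths(child, prefix, out);
                prefix.pop();
            }
        }
        Json::Array(items) => {
            for (i, child) in items.iter().enumerate() {
                prefix.push(Edge::Index(i));
                out.push(prefix.clone());
                paths(child, prefix, out);
                prefix.pop();
            }
        }
        // Scalars are leaves: they have no outgoing edges.
        _ => {}
    }
}

/// The example document from the text, built by hand.
fn example_doc() -> Json {
    Json::Object(vec![
        ("name".to_string(), Json::String("John Doe".to_string())),
        ("age".to_string(), Json::Number(30.0)),
        (
            "foo".to_string(),
            Json::Array(vec![Json::Number(1.0), Json::Number(2.0), Json::Number(3.0)]),
        ),
    ])
}
```

Calling `paths(&example_doc(), &mut Vec::new(), &mut out)` fills `out` with six paths: `name`, `age`, `foo`, and the three index edges under `foo`.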
To query a JSON document, both the query and the document are parsed into intermediate ASTs. The query AST is first compiled into a non-deterministic finite automaton (NFA), which is then determinized into a deterministic finite automaton (DFA) that can be simulated directly against the input JSON document.
For more details on the automaton constructions, see the dfa and
nfa modules of the query module.
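To make "simulating a DFA against a path" concrete, here is a rough sketch for the Quick Start query `users[*].name`. The automaton is written out by hand, and the state numbering, the `Edge` type, and the `step` and `accepts` functions are all assumptions made for this example. It does not show how the dfa and nfa modules actually build or represent their automata.

```rust
/// An edge label in the JSON tree (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
enum Edge {
    Field(String),
    Index(usize),
}

/// Transition function of a hand-built DFA for `users[*].name`.
/// States: 0 = start, 1 = after `users`, 2 = after `[*]`, 3 = accepting.
/// `None` stands for the dead state: no match is possible from here.
fn step(state: u8, edge: &Edge) -> Option<u8> {
    match (state, edge) {
        (0, Edge::Field(f)) if f.as_str() == "users" => Some(1),
        (1, Edge::Index(_)) => Some(2),
        (2, Edge::Field(f)) if f.as_str() == "name" => Some(3),
        _ => None,
    }
}

/// Run the DFA over a root-to-node path, one edge per transition.
fn accepts(path: &[Edge]) -> bool {
    let mut state = 0u8;
    for edge in path {
        match step(state, edge) {
            Some(next) => state = next,
            None => return false,
        }
    }
    state == 3
}
```

Because each edge triggers exactly one transition, checking a path costs time linear in its length, and reaching the dead state lets a traversal skip the whole subtree below that edge.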
§Query Language
The query language relies on regular expression syntax, with some modifications to support JSON.
| Operator | Example | Description |
|---|---|---|
| Sequence | foo.bar.baz | Concatenation: match path foo → bar → baz |
| Disjunction | foo \| bar | Union: match either foo or bar |
| Kleene star | ** | Match zero or more field accesses |
| Repetition | foo* | Repeat the preceding step zero or more times |
| Wildcards | * or [*] | Match any single field or array index |
| Optional | foo?.bar | Optional foo field access |
| Field access | foo or "foo bar" | Match a specific field (quote if spaces) |
| Array index | [0] or [1:3] | Match specific index or slice (exclusive end) |
These queries can be arbitrarily nested with parentheses. For example,
foo.(bar|baz).qux matches foo.bar.qux or foo.baz.qux.
This also means that you can recursively descend any path with (* | [*])*,
e.g., (* | [*])*.foo to find every foo field at any
depth.
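The semantics of nesting and recursive descent can be illustrated with a small matcher over a hand-written query tree. This is a sketch under stated assumptions: `Query`, `Edge`, `ends`, and `matches` are hypothetical names, only a subset of the operators is covered, and the matcher tracks sets of positions instead of compiling to the NFA/DFA pipeline that the crate uses.

```rust
use std::collections::BTreeSet;

/// An edge label in the JSON tree (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
enum Edge {
    Field(String),
    Index(usize),
}

/// A toy query AST covering a subset of the operators in the table.
enum Query {
    Field(String),    // foo
    AnyField,         // *
    AnyIndex,         // [*]
    Seq(Vec<Query>),  // a.b.c
    Alt(Vec<Query>),  // a | b
    Star(Box<Query>), // a*
}

/// All positions at which a match of `q` that starts at `start` can end.
fn ends(q: &Query, path: &[Edge], start: usize) -> BTreeSet<usize> {
    let mut out = BTreeSet::new();
    match q {
        Query::Field(name) => {
            if let Some(Edge::Field(f)) = path.get(start) {
                if f == name {
                    out.insert(start + 1);
                }
            }
        }
        Query::AnyField => {
            if let Some(Edge::Field(_)) = path.get(start) {
                out.insert(start + 1);
            }
        }
        Query::AnyIndex => {
            if let Some(Edge::Index(_)) = path.get(start) {
                out.insert(start + 1);
            }
        }
        Query::Seq(parts) => {
            // Thread the set of reachable positions through each part in order.
            let mut current = BTreeSet::from([start]);
            for part in parts {
                current = current.iter().flat_map(|&p| ends(part, path, p)).collect();
            }
            out = current;
        }
        Query::Alt(options) => {
            for option in options {
                out.extend(ends(option, path, start));
            }
        }
        Query::Star(inner) => {
            // Zero repetitions always match; then expand until no new position appears.
            out.insert(start);
            let mut frontier = vec![start];
            while let Some(p) = frontier.pop() {
                for e in ends(inner, path, p) {
                    if out.insert(e) {
                        frontier.push(e);
                    }
                }
            }
        }
    }
    out
}

/// A path matches when the query can consume it entirely.
fn matches(q: &Query, path: &[Edge]) -> bool {
    ends(q, path, 0).contains(&path.len())
}
```

With this encoding, `(* | [*])*.foo` is a `Seq` of a `Star` over an `Alt` of the two wildcards, followed by `Field("foo")`. It matches both the one-edge path `foo` and deeper paths such as `a`, `[0]`, `foo`.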
Here are some example queries and their meanings:
- name: Matches the name field in the root object (e.g., "John Doe").
- address.street: Matches the street field inside the address object.
- address.*: Matches any field in the address object (e.g., street, city, etc.).
- address.[*]: Matches all elements in an array if address were an array.
- (name|age): Matches either the name or age field in the root object.
- address.([*] | *)*: Matches any value at any depth under address.
We can also use ranges to match specific indices in arrays:
- foo.[2:4]: Matches elements at indices 2 and 3 in the foo array.
- foo.[2:]: Matches all elements in the foo array from index 2 onward.
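The exclusive-end rule for ranges can be illustrated with a small helper. This is a sketch only: the `Slice` struct and the `select` function are hypothetical names introduced here and are not taken from jsongrep's source.

```rust
/// A hypothetical slice step: `[start:end]` with an exclusive end,
/// or `[start:]` when `end` is `None`.
struct Slice {
    start: usize,
    end: Option<usize>,
}

impl Slice {
    /// True when `index` lies in `start..end` (or `start..` for an open end).
    fn contains(&self, index: usize) -> bool {
        index >= self.start && self.end.map_or(true, |end| index < end)
    }
}

/// Keep only the elements whose index falls inside the slice.
fn select<'a, T>(items: &'a [T], slice: &Slice) -> Vec<&'a T> {
    items
        .iter()
        .enumerate()
        .filter(|(i, _)| slice.contains(*i))
        .map(|(_, value)| value)
        .collect()
}
```

For a five-element array, `Slice { start: 2, end: Some(4) }` keeps the elements at indices 2 and 3, while `Slice { start: 2, end: None }` keeps everything from index 2 onward.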
Finally, we can use wildcards to match any field or index:
- *: Matches any single field in the root object.
- [*]: Matches any single array index in the root array.
- [*].*: Matches any field inside each element of an array.
- ([*] | *)*: Matches any field or index at any level of the JSON tree.
§Playground
You can try queries interactively in the playground.
Modules§
- commands
- Available subcommands for the jsongrep binary.
- query
- This module provides the main query engine implementation, as well as the parser for the query language and the intermediary AST representations of queries.
- utils
- Miscellaneous utility functions.
Macros§
- field
- Constructs a query that matches a specific field name.
Enums§
- Value
- Re-export serde_json_borrow::Value so downstream users don’t need to depend on serde_json_borrow directly. Represents any valid JSON value.
Functions§
- grep
- Query a JSON document with a query string, returning all matches.