jq_rs/lib.rs
//! ## Overview
//!
//! > Prior to v0.4.0 this crate was named [json-query].
//!
//! This Rust crate provides access to [jq] 1.6 via the `libjq` C API (rather than
//! "shelling out").
//!
//! By leveraging [jq] we can extract data from JSON strings using `jq`'s DSL.
//!
//! This crate requires Rust **1.32** or above.
//!
//! ## Usage
//!
//! The interface provided by this crate is very basic. You supply a jq program
//! string and a JSON string to run the program over.
//!
//! ```rust
//! use jq_rs;
//! // ...
//!
//! let res = jq_rs::run(".name", r#"{"name": "test"}"#);
//! assert_eq!(res.unwrap(), "\"test\"\n".to_string());
//! ```
//!
//! In addition to running one-off programs with `jq_rs::run()`, you can also
//! use `jq_rs::compile()` to compile a jq program and reuse it with
//! different inputs.
//!
//! ```rust
//! use jq_rs;
//!
//! let tv_shows = r#"[
//!     {"title": "Twilight Zone"},
//!     {"title": "X-Files"},
//!     {"title": "The Outer Limits"}
//! ]"#;
//!
//! let movies = r#"[
//!     {"title": "The Omen"},
//!     {"title": "Amityville Horror"},
//!     {"title": "The Thing"}
//! ]"#;
//!
//! let mut program = jq_rs::compile("[.[].title] | sort").unwrap();
//!
//! assert_eq!(
//!     &program.run(tv_shows).unwrap(),
//!     "[\"The Outer Limits\",\"Twilight Zone\",\"X-Files\"]\n"
//! );
//!
//! assert_eq!(
//!     &program.run(movies).unwrap(),
//!     "[\"Amityville Horror\",\"The Omen\",\"The Thing\"]\n",
//! );
//! ```
//!
//! ## A Note on Performance
//!
//! While the benchmarks are far from exhaustive, they indicate that much of the
//! runtime of a simple jq program goes to the compilation. In fact, the compilation
//! is _quite expensive_: in the results below, a one-off run takes about 48.7 ms,
//! while running a pre-compiled program takes about 4.1 us.
//!
//! ```text
//! run one off             time:   [48.594 ms 48.689 ms 48.800 ms]
//! Found 6 outliers among 100 measurements (6.00%)
//!   3 (3.00%) high mild
//!   3 (3.00%) high severe
//!
//! run pre-compiled        time:   [4.0351 us 4.0708 us 4.1223 us]
//! Found 15 outliers among 100 measurements (15.00%)
//!   6 (6.00%) high mild
//!   9 (9.00%) high severe
//! ```
//!
//! If you need to run the same jq program multiple times, it is
//! _highly recommended_ to retain a pre-compiled `JqProgram` and reuse it.
//!
//! ## Handling Output
//!
//! The return values from jq are _strings_ since there is no certainty that the
//! output will be valid JSON. As such, the output will need to be parsed if you
//! want to work with the actual data types being represented.
//!
//! In such cases you may want to pair this crate with [serde_json] or similar.
//!
//! For example, here we want to extract the years from a set of movie objects:
//!
//! ```rust
//! use jq_rs;
//! use serde_json::{self, json};
//!
//! // ...
//!
//! let data = json!({
//!     "movies": [
//!         { "title": "Coraline", "year": 2009 },
//!         { "title": "ParaNorman", "year": 2012 },
//!         { "title": "Boxtrolls", "year": 2014 },
//!         { "title": "Kubo and the Two Strings", "year": 2016 },
//!         { "title": "Missing Link", "year": 2019 }
//!     ]
//! });
//!
//! let query = "[.movies[].year]";
//! // program output as a JSON string...
//! let output = jq_rs::run(query, &data.to_string()).unwrap();
//! // ... parse via serde
//! let parsed: Vec<i64> = serde_json::from_str(&output).unwrap();
//!
//! assert_eq!(vec![2009, 2012, 2014, 2016, 2019], parsed);
//! ```
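//!
//! A program can also produce more than one result, and all of them arrive in a
//! single string. For example, this crate's tests show `.[]` run over `[1,2,3]`
//! yielding `"1\n2\n3\n"`. As a minimal sketch, assuming each result is written
//! on its own line, the string can be split before parsing each piece.
//! `split_results` is a hypothetical helper for illustration, not part of this
//! crate.
//!
//! ```rust
//! // Hypothetical helper: one slice per jq result, skipping empty lines.
//! fn split_results(output: &str) -> Vec<&str> {
//!     output.lines().filter(|line| !line.is_empty()).collect()
//! }
//!
//! fn main() {
//!     // Output of the program `.[]` run over the input `[1,2,3]`.
//!     let output = "1\n2\n3\n";
//!     assert_eq!(split_results(output), vec!["1", "2", "3"]);
//! }
//! ```
//!
//! Each slice can then be handed to a JSON parser on its own.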
//!
//! Barely any of the options or flags available from the [jq] CLI are exposed
//! currently.
//! Literally all that is provided is the ability to execute a _jq program_ on a blob
//! of JSON.
//! Please pardon my dust as I sort out the details.
//!
//! ## Linking to libjq
//!
//! This crate requires access to `libjq` at build and/or runtime, depending on
//! your choice.
//!
//! When the `bundled` feature is enabled (**off by default**), `libjq` is provided
//! and linked statically to your crate by [jq-sys] and [jq-src]. Using this feature
//! requires having autotools and gcc in `PATH` in order for the build to work.
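//!
//! As a sketch, enabling the feature from a dependent crate's `Cargo.toml` might
//! look like the following. The version requirement shown is an assumption for
//! illustration; use whichever release you actually depend on.
//!
//! ```toml
//! [dependencies]
//! # The version is illustrative; `bundled` is the feature described above.
//! jq-rs = { version = "0.4", features = ["bundled"] }
//! ```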
//!
//! Without the `bundled` feature, _you_ will need to ensure your crate
//! can link to `libjq` in order for the bindings to work.
//!
//! You can choose to compile `libjq` yourself, or perhaps install it via your
//! system's package manager.
//! See the [jq-sys building docs][jq-sys-building] for details on how to share
//! hints with the [jq-sys] crate on how to link.
//!
//! [jq]: https://github.com/stedolan/jq
//! [serde_json]: https://github.com/serde-rs/json
//! [jq-rs]: https://crates.io/crates/jq-rs
//! [json-query]: https://crates.io/crates/json-query
//! [jq-sys]: https://github.com/onelson/jq-sys
//! [jq-sys-building]: https://github.com/onelson/jq-sys#building
//! [jq-src]: https://github.com/onelson/jq-src

#![deny(missing_docs)]

extern crate jq_sys;
#[cfg(test)]
#[macro_use]
extern crate serde_json;

mod errors;
mod jq;

use std::ffi::CString;

pub use errors::{Error, Result};

/// Run a jq program on a blob of JSON data.
///
/// In the case of failure to run the program, feedback from the jq API will be
/// available in the returned `Error` value.
/// Failures can occur for a variety of reasons, but mostly you'll see them as
/// a result of bad jq program syntax, or invalid JSON data.
pub fn run(program: &str, data: &str) -> Result<String> {
    compile(program)?.run(data)
}

/// A pre-compiled jq program which can be run against different inputs.
pub struct JqProgram {
    jq: jq::Jq,
}

impl JqProgram {
    /// Runs a JSON string input against a pre-compiled jq program.
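    ///
    /// Two checks are made on the input before jq is invoked: whitespace-only
    /// input short-circuits to an empty result, and input containing an interior
    /// NUL byte fails the `CString` conversion and is returned as an error. The
    /// sketch below mirrors those checks using only the standard library;
    /// `precheck` is a hypothetical helper for illustration and is not part of
    /// this crate.
    ///
    /// ```rust
    /// use std::ffi::CString;
    ///
    /// // Hypothetical helper mirroring the checks made before jq is called.
    /// fn precheck(data: &str) -> &'static str {
    ///     if data.trim().is_empty() {
    ///         // Whitespace-only input yields an empty result without calling jq.
    ///         "empty"
    ///     } else if CString::new(data).is_err() {
    ///         // Interior NUL bytes cannot be represented in a C string.
    ///         "interior-nul"
    ///     } else {
    ///         "ok"
    ///     }
    /// }
    ///
    /// fn main() {
    ///     assert_eq!(precheck("  \n"), "empty");
    ///     assert_eq!(precheck("{\"name\": \"te\0st\"}"), "interior-nul");
    ///     assert_eq!(precheck("{\"name\": \"test\"}"), "ok");
    /// }
    /// ```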
    pub fn run(&mut self, data: &str) -> Result<String> {
        if data.trim().is_empty() {
            // During work on #4, #7, the parser test which allows us to avoid a memory
            // error shows that an empty input just yields an empty response BUT our
            // implementation would yield a parse error.
            return Ok("".into());
        }
        let input = CString::new(data)?;
        self.jq.execute(input)
    }
}

/// Compile a jq program then reuse it, running several inputs against it.
pub fn compile(program: &str) -> Result<JqProgram> {
    let prog = CString::new(program)?;
    Ok(JqProgram {
        jq: jq::Jq::compile_program(prog)?,
    })
}

#[cfg(test)]
mod test {

    use super::{compile, run, Error};
    use matches::assert_matches;
    use serde_json;

    #[test]
    fn reuse_compiled_program() {
        let query = r#"if . == 0 then "zero" elif . == 1 then "one" else "many" end"#;
        let mut prog = compile(&query).unwrap();
        assert_eq!(prog.run("2").unwrap(), "\"many\"\n");
        assert_eq!(prog.run("1").unwrap(), "\"one\"\n");
        assert_eq!(prog.run("0").unwrap(), "\"zero\"\n");
    }

    #[test]
    fn jq_state_is_not_global() {
        let input = r#"{"id": 123, "name": "foo"}"#;
        let query1 = r#".name"#;
        let query2 = r#".id"#;

        // Basically this test is just to check that the state pointers returned by
        // `jq::init()` are completely independent and don't share any global state.
        let mut prog1 = compile(&query1).unwrap();
        let mut prog2 = compile(&query2).unwrap();

        assert_eq!(prog1.run(input).unwrap(), "\"foo\"\n");
        assert_eq!(prog2.run(input).unwrap(), "123\n");
        assert_eq!(prog1.run(input).unwrap(), "\"foo\"\n");
        assert_eq!(prog2.run(input).unwrap(), "123\n");
    }

    fn get_movies() -> serde_json::Value {
        json!({
            "movies": [
                { "title": "Coraline", "year": 2009 },
                { "title": "ParaNorman", "year": 2012 },
                { "title": "Boxtrolls", "year": 2014 },
                { "title": "Kubo and the Two Strings", "year": 2016 },
                { "title": "Missing Link", "year": 2019 }
            ]
        })
    }

    #[test]
    fn identity_nothing() {
        assert_eq!(run(".", "").unwrap(), "".to_string());
    }

    #[test]
    fn identity_empty() {
        assert_eq!(run(".", "{}").unwrap(), "{}\n".to_string());
    }

    #[test]
    fn extract_dates() {
        let data = get_movies();
        let query = "[.movies[].year]";
        let output = run(query, &data.to_string()).unwrap();
        let parsed: Vec<i64> = serde_json::from_str(&output).unwrap();
        assert_eq!(vec![2009, 2012, 2014, 2016, 2019], parsed);
    }

    #[test]
    fn extract_name() {
        let res = run(".name", r#"{"name": "test"}"#);
        assert_eq!(res.unwrap(), "\"test\"\n".to_string());
    }

    #[test]
    fn unpack_array() {
        let res = run(".[]", "[1,2,3]");
        assert_eq!(res.unwrap(), "1\n2\n3\n".to_string());
    }

    #[test]
    fn compile_error() {
        let res = run(". aa12312me dsaafsdfsd", "{\"name\": \"test\"}");
        assert_matches!(res, Err(Error::InvalidProgram));
    }

    #[test]
    fn parse_error() {
        let res = run(".", "{1233 invalid json ahoy : est\"}");
        assert_matches!(res, Err(Error::System { .. }));
    }

    #[test]
    fn just_open_brace() {
        let res = run(".", "{");
        assert_matches!(res, Err(Error::System { .. }));
    }

    #[test]
    fn just_close_brace() {
        let res = run(".", "}");
        assert_matches!(res, Err(Error::System { .. }));
    }

    #[test]
    fn total_garbage() {
        let data = r#"
        {
            moreLike: "an object literal but also bad"
            loveToDangleComma: true,
        }"#;

        let res = run(".", data);
        assert_matches!(res, Err(Error::System { .. }));
    }

    pub mod mem_errors {
        //! Attempting to run a program resulting in bad field access has been
        //! shown to sometimes trigger a use-after-free or double-free memory
        //! error.
        //!
        //! Technically the program and inputs are both valid, but the
        //! evaluation of the program causes bad memory access to happen.
        //!
        //! https://github.com/onelson/json-query/issues/4

        use super::*;

        #[test]
        fn missing_field_access() {
            let prog = ".[] | .hello";
            let data = "[1,2,3]";
            let res = run(prog, data);
            assert_matches!(res, Err(Error::System { .. }));
        }

        #[test]
        fn missing_field_access_compiled() {
            let mut prog = compile(".[] | .hello").unwrap();
            let data = "[1,2,3]";
            let res = prog.run(data);
            assert_matches!(res, Err(Error::System { .. }));
        }
    }
}
335}