jq_rs/
lib.rs

//! ## Overview
//!
//! > Prior to v0.4.0 this crate was named [json-query].
//!
//! This Rust crate provides access to [jq] 1.6 via the `libjq` C API (rather than
//! "shelling out").
//!
//! By leveraging [jq] we can extract data from json strings using `jq`'s DSL.
//!
//! This crate requires Rust **1.32** or above.
//!
//! ## Usage
//!
//! The interface provided by this crate is very basic. You supply a jq program
//! string and a string to run the program over.
//!
//! ```rust
//! use jq_rs;
//! // ...
//!
//! let res = jq_rs::run(".name", r#"{"name": "test"}"#);
//! assert_eq!(res.unwrap(), "\"test\"\n".to_string());
//! ```
//!
//! In addition to running one-off programs with `jq_rs::run()`, you can also
//! use `jq_rs::compile()` to compile a jq program and reuse it with
//! different inputs.
//!
//! ```rust
//! use jq_rs;
//!
//! let tv_shows = r#"[
//!     {"title": "Twilight Zone"},
//!     {"title": "X-Files"},
//!     {"title": "The Outer Limits"}
//! ]"#;
//!
//! let movies = r#"[
//!     {"title": "The Omen"},
//!     {"title": "Amityville Horror"},
//!     {"title": "The Thing"}
//! ]"#;
//!
//! let mut program = jq_rs::compile("[.[].title] | sort").unwrap();
//!
//! assert_eq!(
//!     &program.run(tv_shows).unwrap(),
//!     "[\"The Outer Limits\",\"Twilight Zone\",\"X-Files\"]\n"
//! );
//!
//! assert_eq!(
//!     &program.run(movies).unwrap(),
//!     "[\"Amityville Horror\",\"The Omen\",\"The Thing\"]\n",
//! );
//! ```
//!
//! ## A Note on Performance
//!
//! While the benchmarks are far from exhaustive, they indicate that most of the
//! runtime of a simple jq program goes to compilation. In fact, compilation is
//! _quite expensive_: in the measurements below, a one-off run takes about
//! 48.7 milliseconds while a run of a pre-compiled program takes about
//! 4.1 microseconds, a difference of roughly four orders of magnitude.
//!
//! ```text
//! run one off             time:   [48.594 ms 48.689 ms 48.800 ms]
//! Found 6 outliers among 100 measurements (6.00%)
//!   3 (3.00%) high mild
//!   3 (3.00%) high severe
//!
//! run pre-compiled        time:   [4.0351 us 4.0708 us 4.1223 us]
//! Found 15 outliers among 100 measurements (15.00%)
//!   6 (6.00%) high mild
//!   9 (9.00%) high severe
//! ```
//!
//! If you need to run the same jq program multiple times, it is
//! _highly recommended_ to retain a pre-compiled `JqProgram` and reuse it.
//!
//! ## Handling Output
//!
//! The return values from jq are _strings_ since there is no certainty that the
//! output will be valid json. As such the output will need to be parsed if you want
//! to work with the actual data types being represented.
//!
//! In such cases you may want to pair this crate with [serde_json] or similar.
//!
//! For example, here we want to extract the numbers from a set of objects:
//!
//! ```rust
//! use jq_rs;
//! use serde_json::{self, json};
//!
//! // ...
//!
//! let data = json!({
//!     "movies": [
//!         { "title": "Coraline", "year": 2009 },
//!         { "title": "ParaNorman", "year": 2012 },
//!         { "title": "Boxtrolls", "year": 2014 },
//!         { "title": "Kubo and the Two Strings", "year": 2016 },
//!         { "title": "Missing Link", "year": 2019 }
//!     ]
//! });
//!
//! let query = "[.movies[].year]";
//! // program output as a json string...
//! let output = jq_rs::run(query, &data.to_string()).unwrap();
//! // ... parse via serde
//! let parsed: Vec<i64> = serde_json::from_str(&output).unwrap();
//!
//! assert_eq!(vec![2009, 2012, 2014, 2016, 2019], parsed);
//! ```
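//!
//! When a program emits several results, as `.[]` does, they all arrive in a
//! single string. The sketch below shows one way to separate them using only the
//! standard library. `split_results` is a hypothetical helper, not part of this
//! crate, and it assumes each result occupies exactly one line of the output
//! (as in `"1\n2\n3\n"`, which is what `.[]` yields for `[1,2,3]`).
//!
//! ```rust
//! // Hypothetical helper (not provided by jq_rs): splits jq output into one
//! // string slice per result, assuming one result per line.
//! fn split_results(output: &str) -> Vec<&str> {
//!     // `lines()` drops the trailing newline; the filter skips blank lines,
//!     // so an empty output produces an empty vector.
//!     output.lines().filter(|line| !line.is_empty()).collect()
//! }
//! ```
//!
//! Given `"1\n2\n3\n"`, `split_results` returns the slices `"1"`, `"2"` and `"3"`,
//! each of which can then be handed to a parser such as [serde_json] on its own.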
//!
//! Barely any of the options or flags available from the [jq] cli are exposed
//! currently.
//! Literally all that is provided is the ability to execute a _jq program_ on a blob
//! of json.
//! Please pardon my dust as I sort out the details.
//!
//! ## Linking to libjq
//!
//! This crate requires access to `libjq` at build and/or runtime depending on
//! your choice.
//!
//! When the `bundled` feature is enabled (**off by default**) `libjq` is provided
//! and linked statically to your crate by [jq-sys] and [jq-src]. Using this feature
//! requires having autotools and gcc in `PATH` in order for the build to work.
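//!
//! As a sketch, enabling the feature from a downstream `Cargo.toml` might look
//! like the following. The version requirement shown is an assumption for
//! illustration only; substitute the release you actually depend on.
//!
//! ```toml
//! [dependencies]
//! # Version requirement is illustrative, not a recommendation.
//! jq-rs = { version = "0.4", features = ["bundled"] }
//! ```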
//!
//! Without the `bundled` feature, _you_ will need to ensure your crate
//! can link to `libjq` in order for the bindings to work.
//!
//! You can choose to compile `libjq` yourself, or perhaps install it via your
//! system's package manager.
//! See the [jq-sys building docs][jq-sys-building] for details on how to share
//! hints with the [jq-sys] crate on how to link.
//!
//! [jq]: https://github.com/stedolan/jq
//! [serde_json]: https://github.com/serde-rs/json
//! [jq-rs]: https://crates.io/crates/jq-rs
//! [json-query]: https://crates.io/crates/json-query
//! [jq-sys]: https://github.com/onelson/jq-sys
//! [jq-sys-building]: https://github.com/onelson/jq-sys#building
//! [jq-src]: https://github.com/onelson/jq-src

#![deny(missing_docs)]

extern crate jq_sys;
#[cfg(test)]
#[macro_use]
extern crate serde_json;

mod errors;
mod jq;

use std::ffi::CString;

pub use errors::{Error, Result};

/// Run a jq program on a blob of json data.
///
/// If the program fails to run, the failure is reported through the returned
/// `Error` value.
/// Failures can occur for a variety of reasons, but mostly you'll see them as
/// a result of bad jq program syntax, or invalid json data.
pub fn run(program: &str, data: &str) -> Result<String> {
    compile(program)?.run(data)
}

/// A pre-compiled jq program which can be run against different inputs.
pub struct JqProgram {
    jq: jq::Jq,
}

impl JqProgram {
    /// Runs a json string input against a pre-compiled jq program.
    pub fn run(&mut self, data: &str) -> Result<String> {
        if data.trim().is_empty() {
            // During work on #4, #7, the parser test which allows us to avoid a memory
            // error shows that an empty input just yields an empty response BUT our
            // implementation would yield a parse error.
            return Ok("".into());
        }
        let input = CString::new(data)?;
        self.jq.execute(input)
    }
}

/// Compile a jq program then reuse it, running several inputs against it.
pub fn compile(program: &str) -> Result<JqProgram> {
    let prog = CString::new(program)?;
    Ok(JqProgram {
        jq: jq::Jq::compile_program(prog)?,
    })
}

#[cfg(test)]
mod test {

    use super::{compile, run, Error};
    use matches::assert_matches;
    use serde_json;

    #[test]
    fn reuse_compiled_program() {
        let query = r#"if . == 0 then "zero" elif . == 1 then "one" else "many" end"#;
        let mut prog = compile(&query).unwrap();
        assert_eq!(prog.run("2").unwrap(), "\"many\"\n");
        assert_eq!(prog.run("1").unwrap(), "\"one\"\n");
        assert_eq!(prog.run("0").unwrap(), "\"zero\"\n");
    }

    #[test]
    fn jq_state_is_not_global() {
        let input = r#"{"id": 123, "name": "foo"}"#;
        let query1 = r#".name"#;
        let query2 = r#".id"#;

        // Basically this test is just to check that the state pointers returned by
        // `jq::init()` are completely independent and don't share any global state.
        let mut prog1 = compile(&query1).unwrap();
        let mut prog2 = compile(&query2).unwrap();

        assert_eq!(prog1.run(input).unwrap(), "\"foo\"\n");
        assert_eq!(prog2.run(input).unwrap(), "123\n");
        assert_eq!(prog1.run(input).unwrap(), "\"foo\"\n");
        assert_eq!(prog2.run(input).unwrap(), "123\n");
    }

    fn get_movies() -> serde_json::Value {
        json!({
            "movies": [
                { "title": "Coraline", "year": 2009 },
                { "title": "ParaNorman", "year": 2012 },
                { "title": "Boxtrolls", "year": 2014 },
                { "title": "Kubo and the Two Strings", "year": 2016 },
                { "title": "Missing Link", "year": 2019 }
            ]
        })
    }

    #[test]
    fn identity_nothing() {
        assert_eq!(run(".", "").unwrap(), "".to_string());
    }

    #[test]
    fn identity_empty() {
        assert_eq!(run(".", "{}").unwrap(), "{}\n".to_string());
    }

    #[test]
    fn extract_dates() {
        let data = get_movies();
        let query = "[.movies[].year]";
        let output = run(query, &data.to_string()).unwrap();
        let parsed: Vec<i64> = serde_json::from_str(&output).unwrap();
        assert_eq!(vec![2009, 2012, 2014, 2016, 2019], parsed);
    }

    #[test]
    fn extract_name() {
        let res = run(".name", r#"{"name": "test"}"#);
        assert_eq!(res.unwrap(), "\"test\"\n".to_string());
    }

    #[test]
    fn unpack_array() {
        let res = run(".[]", "[1,2,3]");
        assert_eq!(res.unwrap(), "1\n2\n3\n".to_string());
    }

    #[test]
    fn compile_error() {
        let res = run(". aa12312me  dsaafsdfsd", "{\"name\": \"test\"}");
        assert_matches!(res, Err(Error::InvalidProgram));
    }

    #[test]
    fn parse_error() {
        let res = run(".", "{1233 invalid json ahoy : est\"}");
        assert_matches!(res, Err(Error::System { .. }));
    }

    #[test]
    fn just_open_brace() {
        let res = run(".", "{");
        assert_matches!(res, Err(Error::System { .. }));
    }

    #[test]
    fn just_close_brace() {
        let res = run(".", "}");
        assert_matches!(res, Err(Error::System { .. }));
    }

    #[test]
    fn total_garbage() {
        let data = r#"
        {
            moreLike: "an object literal but also bad"
            loveToDangleComma: true,
        }"#;

        let res = run(".", data);
        assert_matches!(res, Err(Error::System { .. }));
    }

    pub mod mem_errors {
        //! Attempting to run a program resulting in bad field access has been
        //! shown to sometimes trigger a use-after-free or double-free memory
        //! error.
        //!
        //! Technically the program and inputs are both valid, but the
        //! evaluation of the program causes bad memory access to happen.
        //!
        //! https://github.com/onelson/json-query/issues/4

        use super::*;

        #[test]
        fn missing_field_access() {
            let prog = ".[] | .hello";
            let data = "[1,2,3]";
            let res = run(prog, data);
            assert_matches!(res, Err(Error::System { .. }));
        }

        #[test]
        fn missing_field_access_compiled() {
            let mut prog = compile(".[] | .hello").unwrap();
            let data = "[1,2,3]";
            let res = prog.run(data);
            assert_matches!(res, Err(Error::System { .. }));
        }
    }
}