lexpr/lib.rs
1#![deny(missing_docs)]
2#![warn(rust_2018_idioms)]
3
4//! This crate provides facilities for parsing, printing and
5//! manipulating S-expression data. S-expressions are the format used
6//! to represent code and data in the Lisp language family.
7//!
8//! ```scheme
9//! ((name . "John Doe")
10//!  (age . 43)
11//!  (address
12//!   (street "10 Downing Street")
13//!   (city "London"))
14//!  (phones "+44 1234567" "+44 2345678"))
15//! ```
16//!
17//! `lexpr` also supports more complex types, including keywords and
18//! configurable tokens for `true`, `false` and `nil`, by default
19//! using Scheme syntax:
20//!
21//! ```scheme
22//! (define-class rectangle ()
23//!   (width
24//!     #:init-value #nil ;; Nil value
25//!     #:settable #t ;; true
26//!     #:guard (> width 10)
27//!   )
28//!   (height
29//!     #:init-value 10
30//!     #:writable #f ;; false
31//!   ))
32//! ```
33//!
34//! Note that keywords, and the corresponding `#:` notation, are not
35//! part of standard Scheme, but are supported by `lexpr`'s default
36//! parser settings.
37//!
38//! There are three common ways that you might find yourself needing
39//! to work with S-expression data in Rust:
40//!
41//! - **As text data**. An unprocessed string of S-expression data
42//! that you receive from a Lisp program, read from a file, or
43//! prepare to send to a Lisp program.
44//!
45//! - **As a dynamically typed representation**. Maybe you want to check that
46//! some S-expression data is valid before passing it on, but without knowing
47//! the structure of what it contains. Or you want to handle arbitrarily
48//! structured data, like Lisp code.
49//!
50//! - **As a statically typed Rust data structure**. When you expect all
51//! or most of your data to conform to a particular structure and
52//! want to get real work done without the dynamically typed nature
53//! of S-expressions tripping you up.
54//!
55//! Only the first two items of this list are handled by `lexpr`; for conversion
56//! from and to statically typed Rust data structures see the [`serde-lexpr`]
57//! crate.
58//!
59//! # Operating on dynamically typed S-expression data
60//!
61//! Any valid S-expression can be manipulated using the [`Value`] data
62//! structure.
63//!
64//! ## Constructing S-expression values
65//!
66//! ```
67//! use lexpr::{Value, parse::Error};
68//!
69//! # fn main() -> Result<(), Error> {
70//! // Some S-expression data as a &str.
71//! let data = r#"((name . "John Doe")
72//! (age . 43)
73//! (phones "+44 1234567" "+44 2345678"))"#;
74//!
75//! // Parse the string of data into lexpr::Value.
76//! let v = lexpr::from_str(data)?;
77//!
78//! // Access parts of the data by indexing with square brackets.
79//! println!("Please call {} at the number {}", v["name"], v["phones"][1]);
80//!
81//! Ok(())
82//! # }
83//! ```
84//!
85//! # What are S-expressions?
86//!
87//! S-expressions, as mentioned above, are the notation used by various dialects
88//! of Lisp to represent data (and code). As a data format, they are roughly
89//! comparable to JSON (JavaScript Object Notation), but syntactically more
90//! lightweight. Also, JSON is designed for consumption and generation by
91//! machines, which is reflected by the fact that it does not specify a syntax
92//! for comments. S-expressions, on the other hand, are intended to be written
93//! and read by humans as well as machines. In this respect, they are more like
94//! YAML, but have a simpler and less syntactically rigid structure. For
95//! example, indentation does not convey any information to the parser, but is
96//! used only to allow for easier digestion by humans.
97//!
98//! Different Lisp dialects have notational differences for some data types, and
99//! some may lack specific data types completely. This section tries to give an
100//! overview of the different types of values representable by the [`Value`]
101//! data type and how they relate to different Lisp dialects. All examples are
102//! given in the syntax used by the [Guile](https://www.gnu.org/software/guile/)
103//! Scheme implementation.
104//!
105//! The parser and serializer implementation in `lexpr` can be
106//! tailored to parse and generate S-expression data in various
107//! "dialects" in use by different Lisp variants; the aim is to cover
108//! large parts of R6RS and R7RS Scheme with some Guile and Racket
109//! extensions, as well as Emacs Lisp.
110//!
111//! In the following, the S-expression values that are modeled by
112//! `lexpr` are introduced. In general, S-expression values can be
113//! split into two categories: primitive types and compound types.
114//!
115//! ## Primitive types
116//!
117//! Primitive, or non-compound, types are those that cannot
118//! recursively contain arbitrary other values. Numbers,
119//! strings and booleans fall into this category.
120//!
121//! ### Symbols and keywords
122//!
123//! Lisp has a data type not commonly found in other languages, namely
124//! "symbols". A symbol is conceptually similar to identifiers in other
125//! languages, but allows for a much richer set of characters than typically
126//! allowed for identifiers in other languages. Also, identifiers in other
127//! languages can usually not be used in data; Lisps expose them as a
128//! primitive data type, a result of the
129//! [homoiconicity](https://en.wikipedia.org/wiki/Homoiconicity) of the Lisp
130//! language family.
131//!
132//!
133//! ```scheme
134//! this-is-a-symbol ; A single symbol, dashes are allowed
135//! another.symbol ; Periods are allowed as well
136//! foo$bar!<_>? ; As are quite a few other characters
137//! ```
138//!
139//! Another data type, present in some Lisp dialects, such as Emacs
140//! Lisp, Common Lisp, and several Scheme implementations, is the
141//! keyword. Keywords are also supported by `lexpr`. They are very
142//! similar to symbols, but are typically prefixed by `:` or `#:` and
143//! are used for different purposes in the language.
144//!
145//! ```lisp
146//! #:foo ; A keyword named "foo", written in Guile/Racket notation
147//! :bar ; A keyword named "bar", written in Emacs Lisp or Common Lisp notation
148//! ```
149//!
150//! ### Booleans
151//!
152//! While Scheme has a primitive boolean data type, more traditional Lisps such
153//! as Emacs Lisp and Common Lisp do not; they instead use the symbols `t` and
154//! `nil` to represent boolean values. Using parser options, `lexpr` can
155//! parse these symbols as booleans, which may be desirable in some
156//! circumstances, as booleans are simpler to handle than symbols.
157//!
158//! ```scheme
159//! #t ; The literal representing true
160//! #f ; The literal representing false
161//! ```
162//!
163//! ### The empty list and "nil"
164//!
165//! In traditional Lisps, the end of a list is represented by a
166//! special atom written as `nil`. In Scheme, the empty list is an
167//! atom written as `()`, and there `nil` is just a regular
168//! symbol. Both `nil` and the empty list are present and
169//! distinguishable in `lexpr`.
170//!
171//! ### Numbers
172//!
173//! Numbers are represented by the [`Number`] abstract data type. It can handle
174//! signed and unsigned integers, each up to 64 bits in size, as well as floating
175//! point numbers. The Scheme syntax for hexadecimal, octal, and binary literals
176//! is supported.
177//!
178//! ```scheme
179//! 1 -4 3.14 ; A positive, negative, and a floating point number
180//! #xDEADBEEF ; An integer written using hexadecimal notation
181//! #o0677 ; Octal
182//! #b10110 ; Binary
183//! ```
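//!
//! As a rough illustration of what the radix prefixes mean, the digits after
//! `#x`, `#o`, and `#b` can be read with the standard library's
//! `from_str_radix`. This is only a sketch of the notation and makes no claim
//! about how `lexpr`'s own number parser is implemented:
//!
//! ```rust
//! fn main() {
//!     // Digits following `#x` are base 16, `#o` base 8, and `#b` base 2.
//!     let hex = u64::from_str_radix("DEADBEEF", 16).unwrap();
//!     let octal = u64::from_str_radix("0677", 8).unwrap();
//!     let binary = u64::from_str_radix("10110", 2).unwrap();
//!
//!     assert_eq!(hex, 3_735_928_559);
//!     assert_eq!(octal, 447);
//!     assert_eq!(binary, 22);
//! }
//! ```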
184//!
185//! Scheme has an elaborate numerical type hierarchy (called "numeric tower"),
186//! which supports fractions, numbers of arbitrary size, and complex
187//! numbers. These more advanced number types are not yet supported by `lexpr`.
188//!
189//!
190//! ### Characters
191//!
192//! Characters are Unicode code points, represented by Rust's `char` data type
193//! embedded in the [`Value::Char`] variant.
194//!
195//! ### Strings
196//!
197//! ```scheme
198//! "Hello World!"
199//! ```
200//!
201//! ## Lists
202//!
203//! A list is a sequence of values, each either an atom or a list. In fact,
204//! Lisp does not have a "real" list data type, but instead lists are
205//! represented by chains of so-called "cons cells", which are used to
206//! form a singly-linked list, terminated by the empty list (or `nil`
207//! in traditional Lisps). It is also possible for the terminator to not
208//! be the empty list, but instead be of an arbitrary other data type.
209//! In this case, the list is referred to as an "improper" or "dotted"
210//! list. Here are some examples:
211//!
212//! ```scheme
213//! ("Hello" "World") ; A regular list
214//! ;; A list having another, single-element list as
215//! ;; its second item
216//! ("Hello" ("World"))
217//! (1 . 2) ; A cons cell, represented as an improper list by `lexpr`
218//! (1 2 . 3) ; A dotted (improper) list
219//! ```
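//!
//! The nesting of cons cells can be mimicked with plain Rust tuples. This is
//! only an analogy built from standard Rust types, not how `lexpr` stores
//! lists: field `.0` plays the role of a cell's first field and `.1` holds
//! the rest of the list.
//!
//! ```rust
//! fn main() {
//!     // The proper list (1 2 3) is shorthand for (1 . (2 . (3 . ()))).
//!     let proper = (1, (2, (3, ())));
//!     // The improper list (1 2 . 3) ends in 3 instead of the empty list.
//!     let improper = (1, (2, 3));
//!
//!     let rest = proper.1; // corresponds to (2 3)
//!     assert_eq!(rest.0, 2);
//!     let last_cell = rest.1; // corresponds to (3)
//!     assert_eq!(last_cell.0, 3);
//!     assert_eq!(last_cell.1, ()); // terminated by the empty list
//!
//!     let tail = improper.1; // corresponds to (2 . 3)
//!     assert_eq!(tail.1, 3); // the terminator is a number, not ()
//! }
//! ```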
220//!
221//! Lists are not only used to represent sequences of values, but also
222//! associative arrays, also known as maps. A map is represented as a list
223//! containing cons cells, where the first field of each cons cell, called
224//! `car`, for obscure historical reasons, is the key, and the second field
225//! (`cdr`) of the cons cell is the associated value.
226//!
227//! ```scheme
228//! ;; An association list with the symbols `a` and `b` as keys
229//! ((a . 42) (b . 43))
230//! ```
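//!
//! Looking up a key in such a list is a linear scan over the pairs. A minimal
//! sketch with standard Rust types follows; the `assoc` helper is
//! hypothetical, named after the traditional Lisp function, and is not part
//! of `lexpr`:
//!
//! ```rust
//! /// Return the value of the first pair whose key equals `key`.
//! fn assoc<'a>(alist: &'a [(&str, i64)], key: &str) -> Option<&'a i64> {
//!     alist.iter().find(|(k, _)| *k == key).map(|(_, v)| v)
//! }
//!
//! fn main() {
//!     // Mirrors the association list ((a . 42) (b . 43)).
//!     let alist = [("a", 42), ("b", 43)];
//!     assert_eq!(assoc(&alist, "b"), Some(&43));
//!     assert_eq!(assoc(&alist, "c"), None);
//! }
//! ```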
231//!
232//! ## Vectors
233//!
234//! In contrast to lists, which are represented as singly-linked chains of "cons
235//! cells", vectors allow O(1) indexing, and thus are quite similar to Rust's
236//! `Vec` data type.
237//!
238//! ```scheme
239//! #(1 2 "three") ; A vector in Scheme notation
240//! ```
241//!
242//! ## Byte vectors
243//!
244//! Byte vectors are similar to regular vectors, but are uniform: each element
245//! only holds a single byte, i.e. an exact integer in the range of 0 to 255,
246//! inclusive.
247//!
248//! ```scheme
249//! #u8(41 42 43) ; A byte vector
250//! ```
251//!
252//! [Serde]: https://crates.io/crates/serde
253//! [`serde-lexpr`]: https://docs.rs/serde-lexpr
254
255/// Construct a [`Value`] using syntax similar to regular S-expressions.
256///
257/// The macro is intended to feel similar to an implicitly
258/// quasiquoted Scheme expression.
259///
260/// For interpolation, use `unquote` (aka "`,`"), like this:
261///
262/// ```
263/// # use lexpr::{sexp, Value};
264/// let number = 42;
265/// let list = sexp!((41 ,number 43));
266/// assert_eq!(list.to_vec(), Some(vec![Value::from(41), Value::from(42), Value::from(43)]));
267/// ```
268///
269/// You can also provide a *Rust* expression to interpolate by using
270/// parentheses:
271///
272/// ```
273/// # use lexpr::{sexp, Value};
274/// let number = 40;
275/// let alist = sexp!(((answer . ,(number + 2))));
276/// assert_eq!(alist.get("answer"), Some(&Value::from(42)));
277/// ```
278///
279/// The interpolated variable (or expression) must yield a value that
280/// is convertible to [`Value`] using the `From` trait.
281///
282/// # Booleans
283///
284/// ```
285/// # use lexpr::sexp;
286/// let t = sexp!(#t);
287/// let f = sexp!(#f);
288/// ```
289///
290/// # Symbols and keywords
291///
292/// Due to syntactic restrictions of Rust's macro system, to use
293/// kebab-case, you need to use the `#"..."` syntax.
294///
295/// ```
296/// # use lexpr::sexp;
297/// let sym = sexp!(symbol);
298/// let kw = sexp!(#:keyword);
299/// assert!(sym.is_symbol());
300/// assert!(kw.is_keyword());
301///
302/// let kebab_sym = sexp!(#"kebab-symbol");
303/// let kebab_kw = sexp!(#:"kebab-keyword");
304/// assert!(kebab_sym.is_symbol());
305/// assert!(kebab_kw.is_keyword());
306/// ```
307///
308/// Since `lexpr` version 0.2.7, symbols following the R7RS (Scheme)
309/// syntax, which additionally consist of *only* characters that Rust
310/// considers punctuation, can be written without quotation:
311///
312/// ```
313/// # use lexpr::sexp;
314/// let expr = sexp!((+ 1 2));
315/// assert!(expr.is_list());
316///
317/// let strange_symbol = sexp!(!$%&*+-./:<=>?@^~);
318/// assert_eq!(strange_symbol.as_symbol(), Some("!$%&*+-./:<=>?@^~"));
319/// ```
320///
321/// # Characters
322///
323/// Characters can be written using Rust's character syntax:
324///
325/// ```
326/// # use lexpr::sexp;
327/// let ch = sexp!('λ');
328/// assert!(ch.is_char());
329/// assert_eq!(ch.as_char(), Some('λ'));
330/// ```
331///
332/// # Lists
333///
334/// Lists can be formed by using the same syntax as in Lisp, including dot
335/// notation.
336///
337/// ```
338/// # use lexpr::sexp;
339/// let l1 = sexp!((1 2 3));
340/// let l2 = sexp!((1 . (2 . (3 . ()))));
341/// let l3 = sexp!((1 2 . (3 . ())));
342/// assert_eq!(l1, l2);
343/// assert_eq!(l2, l3);
344/// ```
345///
346/// Improper (aka dotted) lists are supported as well:
347///
348/// ```
349/// # use lexpr::sexp;
350/// let dotted = sexp!((1 2 . three));
351/// assert!(dotted.is_dotted_list());
352/// let tail = dotted.as_cons().unwrap().cdr();
353/// assert!(tail.is_cons());
354/// assert_eq!(tail, &sexp!((2 . three)));
355/// ```
356///
357/// # Vectors
358///
359/// Vectors can be written using Scheme notation, e.g.:
360///
361/// ```
362/// # use lexpr::sexp;
363/// let v = sexp!(#(1 2 "three"));
364/// assert!(v.is_vector());
365/// assert_eq!(v[2], sexp!("three"));
366/// ```
367///
368/// [`Value`]: enum.Value.html
369pub use lexpr_macros::sexp;
370
371mod syntax;
372
373pub mod cons;
374pub mod datum;
375pub mod number;
376pub mod parse;
377pub mod print;
378pub mod value;
379
380#[doc(inline)]
381pub use self::parse::{
382    from_reader, from_reader_custom, from_slice, from_slice_custom, from_str, from_str_custom,
383    Parser,
384};
385
386#[doc(inline)]
387pub use self::print::{
388    to_string, to_string_custom, to_vec, to_vec_custom, to_writer, to_writer_custom, Printer,
389};
390
391#[doc(inline)]
392pub use value::Value;
393
394#[doc(inline)]
395pub use datum::Datum;
396
397#[doc(inline)]
398pub use cons::Cons;
399
400#[doc(inline)]
401pub use value::Index;
402
403#[doc(inline)]
404pub use number::Number;
405
406#[cfg(test)]
407mod tests;