Crate lexpr

This crate provides facilities for parsing, printing and manipulating S-expression data. S-expressions are the format used to represent code and data in the Lisp language family.

((name . "John Doe")
 (age . 43)
 (address
  (street "10 Downing Street")
  (city "London"))
 (phones "+44 1234567" "+44 2345678"))

lexpr also supports more complex types, including keywords and configurable tokens for true, false and nil, by default using Scheme syntax:

(define-class rectangle ()
 (width
   #:init-value #nil ;; Nil value
   #:settable #t     ;; true
   #:guard (> width 10)
 )
 (height
   #:init-value 10
   #:writable #f ;; false
  ))

Note that keywords, and the corresponding #: notation, are not part of standard Scheme, but are supported by lexpr's default parser settings.

There are three common ways that you might find yourself needing to work with S-expression data in Rust:

  • As text data. An unprocessed string of S-expression data that you receive from a Lisp program, read from a file, or prepare to send to a Lisp program.

  • As a dynamically typed representation. Maybe you want to check that some S-expression data is valid before passing it on, but without knowing the structure of what it contains. Or you want to handle arbitrarily structured data, like Lisp code.

  • As a statically typed Rust data structure. When you expect all or most of your data to conform to a particular structure and want to get real work done without the dynamically typed nature of S-expressions tripping you up.

Currently, lexpr only handles the first two items of this list; the last item, also known as Serde support, is the next big item on the agenda.

Operating on dynamically typed S-expression data

Any valid S-expression can be manipulated using the Value data structure.

Constructing S-expression values

 use lexpr::{Value, Error};

 fn example() -> Result<(), Error> {
     // Some S-expression data in a &str.
     let data = r#"((name . "John Doe")
                    (age . 43)
                    (phones "+44 1234567" "+44 2345678"))"#;

     // Parse the string of data into lexpr::Value.
     let v: Value = lexpr::from_str(data)?;

     // Access parts of the data by indexing with square brackets.
     println!("Please call {} at the number {}", v["name"], v["phones"][1]);

     Ok(())
 }

More about S-expressions and their representation

Note that the representation chosen is intended for serialization and deserialization, not for manipulation with the same complexity guarantees as in a Lisp implementation. In particular, the representation of lists is based on Rust's Vec data type, which has quite different characteristics from the singly-linked lists used in Lisp. This should be no issue as long as you don't attempt to use the lexpr::Value type as the value representation of a "regular" Lisp implementation (which would also be made impossible by the fact that Lisp demands garbage collection), or rely on efficiently forming suffixes of lists.

What are S-expressions?

S-expressions, as mentioned above, are the notation used by various dialects of Lisp to represent data (and code). As a data format, it is roughly comparable to JSON (JavaScript Object Notation), but syntactically more lightweight and simpler. Note that different Lisp dialects have notational differences for some data types, and some may lack specific data types completely. This section tries to give an overview of the different types of values representable by the Value data type and how they relate to different Lisp dialects. All examples are given in the syntax used in the Guile Scheme implementation.

The parser and serializer implementation in lexpr can be tailored to parse and generate S-expression data in various "dialects" in use by different Lisp variants; the aim is to cover large parts of R6RS and R7RS Scheme with some Guile and Racket extensions, as well as Emacs Lisp.

In the following, the S-expression values that are modeled by lexpr are introduced. In general, S-expression values can be split into the two categories of "atoms" and lists.

Atoms

Atoms are primitive (i.e., non-compound) data types such as numbers, strings and booleans.

Symbols and keywords

Lisp also has a data type not commonly found in other languages, namely "symbols". A symbol is conceptually similar to an identifier in other languages, but allows for a much richer set of characters than is usually permitted in identifiers. Also, identifiers in other languages can typically not be used in data; Lisps expose them as a primitive data type, a result of the homoiconicity of the Lisp language family.

this-is-a-symbol ; A single symbol, dashes are allowed
another.symbol   ; Periods are allowed as well
foo$bar!<_>?     ; As are quite a few other characters

Another data type, present in some Lisp dialects, such as Emacs Lisp, Common Lisp, and several Scheme implementations, is the keyword. Keywords are also supported by lexpr. They are very similar to symbols, but are typically prefixed by : or #: and are used for different purposes in the language.

#:foo ; A keyword named "foo", written in Guile/Racket notation
:bar  ; A keyword named "bar", written in Emacs Lisp or Common Lisp notation
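
To make the two notations concrete, here is a minimal sketch of how a reader could tell symbols and keywords apart by prefix alone. The Token enum and the classify function are hypothetical names invented for this illustration; they are not part of lexpr's API, and lexpr's own parser options decide which notations are actually accepted.

```rust
/// Hypothetical token kinds for this illustration (not lexpr types).
#[derive(Debug, PartialEq)]
enum Token<'a> {
    Symbol(&'a str),
    Keyword(&'a str),
}

/// Classify a bare token by its prefix: `#:name` (Guile/Racket notation)
/// or `:name` (Emacs Lisp / Common Lisp notation) is treated as a keyword,
/// anything else as a symbol. A prefix with no name after it is left as a
/// symbol, which is a simplification made for this sketch.
fn classify(text: &str) -> Token<'_> {
    let keyword_name = text
        .strip_prefix("#:")
        .or_else(|| text.strip_prefix(':'))
        .filter(|name| !name.is_empty());
    match keyword_name {
        Some(name) => Token::Keyword(name),
        None => Token::Symbol(text),
    }
}
```

Under these assumptions, classify("#:foo") and classify(":foo") both yield a keyword named foo, while this-is-a-symbol stays a symbol. A real reader would make the accepted prefixes configurable rather than hard-coding both.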

Booleans

#t ; The literal representing true
#f ; The literal representing false

The empty list and "nil"

In traditional Lisps, the end of a list is represented by a special atom written as nil. In Scheme, the empty list is an atom written as (), and there is no special nil symbol. Both nil and the empty list are present and distinguishable in lexpr, but the empty list is not considered an atom (see also below for more on list representation in lexpr).

Numbers

Numbers are represented by the Number abstract data type. It can handle signed and unsigned integers, each up to 64 bits in size, as well as floating point numbers.

There is nothing surprising about the number syntax; extensions such as binary, octal and hexadecimal numbers are not yet implemented.

1 -4 3.14 ; A positive, a negative, and a floating point number
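
The three numeric categories above can be pictured with a small sketch. The Num enum and parse_num function below are hypothetical and only illustrate the idea of trying an unsigned 64-bit integer, then a signed one, then a float; how lexpr's Number type classifies values internally is not stated here and may differ.

```rust
/// Hypothetical number model for this illustration (not lexpr's Number).
#[derive(Debug, PartialEq)]
enum Num {
    Unsigned(u64),
    Signed(i64),
    Float(f64),
}

/// Try the narrowest interpretation first: unsigned, then signed, then float.
fn parse_num(text: &str) -> Option<Num> {
    // Rust's f64 parser also accepts words such as "inf" and "NaN";
    // requiring at least one digit keeps those available as symbols.
    if !text.chars().any(|c| c.is_ascii_digit()) {
        return None;
    }
    if let Ok(n) = text.parse::<u64>() {
        Some(Num::Unsigned(n))
    } else if let Ok(n) = text.parse::<i64>() {
        Some(Num::Signed(n))
    } else {
        text.parse::<f64>().ok().map(Num::Float)
    }
}
```

In this sketch an integer too large for 64 bits falls through to the float case and loses precision, which is one possible policy among several.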

Strings

"Hello World!"

Lists

A list is a sequence of values, each of which is either an atom or another list. In fact, Lisp does not have a "real" list data type; instead, lists are represented by chains of so-called "cons cells", which form a singly-linked list terminated by the empty list (or nil in traditional Lisps). It is also possible for the terminator to not be the empty list, but instead be an arbitrary primitive data type (i.e., an atom). In this case, the list is referred to as an "improper" or "dotted" list. Here are some examples:

("Hello" "World")   ; A regular list
;; A list with another, single-element, list as
;; its second item
("Hello" ("World"))
(1 . 2) ; A cons cell, represented as an improper list by `lexpr`
(1 2 . 3) ; A dotted (improper) list
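
The relationship between cons-cell chains and a flat sequence can be sketched as follows. The Cell enum, the cons helper and the flatten function are hypothetical names for this illustration and are not lexpr types; the sketch only shows how a chain decomposes into its elements plus a terminator, which is what distinguishes proper from improper lists.

```rust
/// Hypothetical cons-cell model, roughly as a Lisp would store lists.
#[derive(Debug, Clone, PartialEq)]
enum Cell {
    Int(i64),
    Nil, // the empty list, which terminates proper lists
    Cons(Box<Cell>, Box<Cell>),
}

/// Convenience constructor for a single cons cell.
fn cons(head: Cell, tail: Cell) -> Cell {
    Cell::Cons(Box::new(head), Box::new(tail))
}

/// Walk a chain of cons cells, collecting the elements and returning the
/// terminator. The list is proper if the terminator is `Cell::Nil`, and
/// improper ("dotted") otherwise.
fn flatten(mut cell: &Cell) -> (Vec<&Cell>, &Cell) {
    let mut items = Vec::new();
    while let Cell::Cons(head, tail) = cell {
        items.push(&**head);
        cell = &**tail;
    }
    (items, cell)
}
```

With these definitions, (1 2) is cons(Int(1), cons(Int(2), Nil)) and flattens to two elements with a Nil terminator, while (1 2 . 3) flattens to the same two elements with Int(3) as the terminator.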

Lists are not only used to represent sequences of values, but also associative arrays, also known as maps. A map is represented as a list containing sub-lists, where the first element of each sub-list is the key, and the remainder of the list is the associated value.

;; An association list with the symbols `a` and `b` as keys
((a . 42) (b . 43))
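
Looking up a key in such an association list is a linear scan. The following sketch models the list as a slice of (key, value) pairs using plain Rust types; alist_get is a hypothetical helper written for this illustration and is not a lexpr function.

```rust
/// Look up `key` in an association list, modeled here as a slice of
/// (key, value) pairs. As with Lisp's `assoc`, this is a linear scan and
/// the first matching entry wins.
fn alist_get<'a, V>(alist: &'a [(&str, V)], key: &str) -> Option<&'a V> {
    alist
        .iter()
        .find(|(k, _)| *k == key)
        .map(|(_, v)| v)
}
```

For the list shown above, looking up a would find 42 and looking up an absent key would find nothing. Because every lookup walks the list from the front, association lists suit small maps; for large ones a hash map is usually the better fit.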

In lexpr, lists are implemented not as singly-linked lists, but using vectors, which is generally more efficient. However, that choice precludes an efficient implementation of taking a suffix of an existing list.

Modules

atom

Non-compound S-expression value type.

number

Dynamically typed number type.

parse

S-expression parser and options.

print

Converting S-expression values into text.

value

The Value enum, a dynamically typed way of representing any valid S-expression value.

Macros

sexp

Construct a Value using syntax similar to regular S-expressions.

Structs

Error

This type represents all possible errors that can occur when serializing or deserializing S-expression data.

Number

Represents an S-expression number, whether integer or floating point.

Parser

Parser for the S-expression text representation.

Printer

A printer for S-expression values.

Enums

Atom

A Lisp atom.

Value

Represents an S-expression value.

Functions

from_reader

Parse a value from an IO stream of S-expressions, using the default parser options.

from_reader_custom

Parse a value from an IO stream containing a single S-expression.

from_slice

Parse a value from bytes representing a single S-expression, using the default parser options.

from_slice_custom

Parse a value from bytes representing a single S-expression.

from_str

Parse a value from a string slice representing a single S-expression, using the default parser options.

from_str_custom

Parse a value from a string slice representing a single S-expression.

to_string

Serialize the given value as an S-expression string, using the default printer options.

to_string_custom

Serialize the given value as an S-expression string.

to_vec

Serialize the given value as a byte vector containing S-expression text, using the default printer options.

to_vec_custom

Serialize the given value as a byte vector containing S-expression text.

to_writer

Serialize the given value as S-expression text into the IO stream, using the default printer options.

to_writer_custom

Serialize the given value as S-expression text into the IO stream.

Type Definitions

Result

Alias for a Result with the error type lexpr::Error.