1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
//! A library to assist in the creation of an interpreter or compiler and its
//! associated runtime.
//!
//! This is the third release of the Kamo project. This crate adds type-system
//! and type-checking support to the project. The type system is used to infer
//! the types of the intermediate representation and the AST. The type-checker
//! is used to check the types of the intermediate representation and the AST.
//!
//! This is the second release of the Kamo project. This crate contains the
//! parser library and the memory management library, which also includes the
//! runtime representation of values.
//!
//! ## Memory Management
//!
//! The [`Mutator`](crate::mem::Mutator) implements automatic memory management.
//! It holds an [`Arena`](crate::mem::Arena) for each type. Memory collection is
//! done by a mark and sweep garbage collector.
//!
//! The mutator implementation is thread local. The arena allocator is
//! implemented as a vector of [`Bucket`](crate::mem::Bucket)s. Each bucket
//! holds a vector of [`Slot`](crate::mem::Slot)s. The slots are used to store
//! the values. Buckets have a fixed size and are allocated on the heap. The
//! buckets are allocated on demand when the slots are full.
//!
//! The hierarchy of types is as follows:
//!
//! * Mutator contains Arenas.
//! * Arena contains Buckets.
//! * Bucket contains Slots.
//! * Slot contains a value.
//! * a value is represented by a Pointer.
//!
//! Values managed by the mutator must be traceable. The mutator uses the
//! [`Trace`](crate::mem::Trace) trait to trace the values. The `Trace` trait is
//! implemented for all types that can contain values managed by the mutator.
//!
//! [`Pointer`](crate::mem::Pointer)s use reference counting to keep track of
//! the number of references to the value. When the number of references reaches
//! zero the value may be subject to garbage collection. To avoid cycles
//! [`Pair`](crate::value::Pair)s, also known as `Cons` cells, and
//! [`Vector`](crate::value::Vector)s do not hold locks to their elements.
//! Instead they are traced by the mutator when memory collection is done. This
//! is the reason why a drop of the reference count of a pointer to zero does
//! not immediately free the value.
//!
//! The garbage collector is implemented as a mark and sweep collector,
//! [`collect_garbage()`](crate::mem::Mutator::collect_garbage()). All values
//! which hold a lock are roots. The garbage collector traces all roots and
//! marks all values that are reachable from the roots. All values that are not
//! marked are unreachable and are freed. Tracing is done iteratively by
//! repeatedly calling the [`trace()`](crate::mem::Trace::trace()) method on the
//! values and adding the values to a work list. This continues until the work
//! list is empty.
//!
//! ### `Allocator` Trait
//!
//! The mutator uses the global allocator to allocate memory for buckets,
//! arenas, vectors and strings etc. Dynamic memory allocation is done by the
//! allocator provided by the [`std::alloc`] crate. Since the implementation of
//! the allocator provided by the standard library is highly optimized it would
//! be a waste of effort to reimplement it.
//!
//! As a consequence the mutator is not able to free memory when the global
//! allocator fails to allocate memory. This is because the mutator does not
//! know which memory is allocated by the global allocator. The mutator only
//! knows which memory is allocated by its arenas. Collection of memory is
//! triggered only by the allocation pressure.
//!
//! When the allocator trait is stabilized the mutator will be able to use the
//! the trait to mitigate allocation failures by freeing memory when the global
//! allocator fails.
//!
//! ## Runtime Values
//!
//! [`Value`](crate::value::Value)s can either hold immediate values or pointers
//! to values allocated in the mutator.
//!
//! The following immediate values are supported:
//!
//! * `nil`, the empty list
//! * `bool`
//! * `char`, a Unicode scalar value
//! * `i64`
//! * `f64`
//!
//! The following pointer values are supported:
//!
//! * `pair`, a pair of values, also known as `Cons` cell
//! * `symbol`, a symbol, an interned string
//! * `string`, a UTF-8 string
//! * `vector`, a vector of values
//! * `bytevector`, a vector of bytes
//!
//! Value provides a safe interface to the values in the runtime. It is
//! implemented as a wrapper around [`ValueKind`](crate::value::ValueKind). It
//! provides methods to convert the value into a Rust type, to check the type of
//! the value, and to visit the value. Heap values are automatically locked and
//! unlocked when necessary.
//!
//! Value implements the visitor pattern. The [`Visitor`](crate::value::Visitor)
//! trait is used to visit the value. When implementing the `Visitor` trait it
//! must be considered that the value may be cyclic. The visitor must keep track
//! of the values it has already visited to avoid infinite loops.
//!
//! The `Visitor` trait is used to implement printing of values. The `Value`
//! itself does not implement printing. Instead the
//! [`Display`](std::fmt::Display) trait is implemented for a specific vistor.
//! This makes it possible to implement different printing strategies for
//! values.
//!
//! An implementation of the `Visitor` trait is provided to print values in a
//! Scheme-like syntax,
//! [`SimplePrinterVisitor`](crate::value::SimplePrinterVisitor).
//!
//! ## Combination Parser
//!
//! The parser library is a general purpose parser library that can be used to
//! parse any language. It is designed to be used in conjunction with the future
//! parts of this project, but it can also be used on its own.
//!
//! The memory management library is a library for automatic memory management.
//! Values are allocated in a mutator which holds an arena allocator for each
//! type. Memory collection is done by a mark and sweep garbage collector.
//!
//! The future parts of this project will extend this crate with a library for
//! compile time evaluation and transformation into an intermediate
//! representation, an implemenation of an interpreter for the intermediate
//! representation, and a reference implementation of Scheme. The implementation
//! will be based on the R7RS standard and will implement a subset of the
//! standard.
//!
//! The design goal of this project is to create a framework that can be used to
//! create a language interpreter or compiler. AST nodes will be created by the
//! parser and then passed to the evaluator. The AST is represented as symbolic
//! expressions (s-expressions) and is designed to be language agnostic. The
//! evaluator will then evaluate the AST into a combination of s-expressions and
//! intermediate representation. This is the compile time representation of the
//! program. The intermediate representation can then be either interpreted or
//! compiled into a target representation.
//!
//! ### Parser Example
//!
//! Example of a parser that parses a Scheme-like byte-vector and returns it as
//! `Vec<u8>`.
//!
//! The grammar is defined as follows:
//!
//! ```text
//! ByteVector = "#u8(" Byte* ')'
//! Byte = <any exact integer between 0 and 255>
//! ```
//!
//! The parser is defined as follows:
//!
//! ```rust
//! use kamo::parser::{code, prelude::*, Input, ParseError, ParseResult, Span};
//!
//! fn main() {
//! assert_eq!(
//! parse_bytevec(Input::new("#u8(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)")),
//! Ok((
//! vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
//! Input::new("")
//! ))
//! );
//! }
//!
//! fn parse_bytevec(input: Input<'_>) -> ParseResult<'_, Vec<u8>> {
//! context_as(
//! delimited(
//! pair(tag("#u8("), ascii::whitespace0),
//! list0(parse_bytevec_value, ascii::whitespace1),
//! context_as(
//! preceded(ascii::whitespace0, char(')')),
//! code::ERR_CUSTOM + 2,
//! "expecting a closing parenthesis: )",
//! ),
//! ),
//! code::ERR_CUSTOM + 1,
//! "expecting a byte vector: #u8(<byte>*)",
//! )(input)
//! }
//!
//! fn parse_bytevec_value(input: Input<'_>) -> ParseResult<'_, u8> {
//! let result = preceded(
//! ascii::whitespace0,
//! any((
//! parse_binary_natural,
//! parse_octal_natural,
//! parse_decimal_natural,
//! parse_hexadecimal_natural,
//! parse_natural,
//! )),
//! )(input);
//!
//! match result {
//! Ok((value, cursor)) => {
//! if value > u8::MAX as u64 {
//! Err(ParseError::new(
//! Span::new(input.position(), cursor.position()),
//! code::ERR_CUSTOM + 3,
//! "expecting a byte value: 0..255",
//! ))
//! } else {
//! Ok((value as u8, cursor))
//! }
//! }
//! Err(err) => Err(err),
//! }
//! }
//!
//! #[inline]
//! fn parse_binary_natural(input: Input<'_>) -> ParseResult<u64> {
//! preceded(tag("#b"), literal::natural(literal::Radix::Binary))(input)
//! }
//!
//! #[inline]
//! fn parse_octal_natural(input: Input<'_>) -> ParseResult<u64> {
//! preceded(tag("#o"), literal::natural(literal::Radix::Octal))(input)
//! }
//!
//! #[inline]
//! fn parse_decimal_natural(input: Input<'_>) -> ParseResult<u64> {
//! preceded(tag("#d"), parse_natural)(input)
//! }
//!
//! #[inline]
//! fn parse_hexadecimal_natural(input: Input<'_>) -> ParseResult<u64> {
//! preceded(tag("#x"), literal::natural(literal::Radix::Hexadecimal))(input)
//! }
//!
//! #[inline]
//! fn parse_natural(input: Input<'_>) -> ParseResult<u64> {
//! literal::natural(literal::Radix::Decimal)(input)
//! }
//! ```
//!
//! ## Type System
//!
//! The type system is available when the `types` feature is enabled. It is not
//! enabled by default. When the type system is enabled the [`mod@env`] module is
//! also available. It is used by the
//! [`TypeChecker`](crate::types::TypeChecker).
//!
//! Types are defined by a type code which is a byte array. The type code is
//! used to represent the type of a value. This allows for a simple and
//! efficient way to represent types. The syntax of the type code is
//! straightforward and easy to understand and parse. The type code allows the
//! construction of complex types with multiple levels of nesting by combining
//! simpler types.
//!
//! For more information see the [`types`] module.
//!
//! ## Environment
//!
//! The environment module is available when the `types` feature is enabled. It
//! is not enabled by default.
//!
//! The environment defines the scopes and bindings of an interpreter. The
//! environment is used to store the bindings of variables and functions. During
//! the execution or compilation of a program, the environment is used to lookup
//! the bindings of variables and functions. The environment is also used to
//! define new bindings.
//!
//! The [`TypeChecker`](crate::types::TypeChecker) uses the environment to store
//! or to lookup the types of variables and functions.
//!
//! ## Macros
//!
//! Macros are available when the `macros` feature is enabled. It is not
//! enabled by default.
//!
//! The macros are used to parse a string literal into a single or an array of
//! `kamo::value::Value`s. The macros are defined as follows:
//!
//! - `sexpr!(<mutator>, <expression>)` - A macro for parsing a single
//! s-expression from a string. It returns a `kamo::value::Value`.
//!
//! - `sexpr_file!(<mutator>, <filename>)` - A macro for parsing multiple
//! s-expressions from a file. It returns an array of `kamo::value::Value`s.
//! The array may be empty.
//!
//! - `sexpr_script!(<mutator>, <expressions>)` - A macro for parsing multiple
//! s-expressions from a string. It returns an array of `kamo::value::Value`s.
//! The array may be empty.
//!
//! The macros all take an optional `MutatorRef` identifier. This is used to allocate
//! values on the heap. If the expression does not contain any values that need to
//! be allocated on the heap, then the `Mutator` identifier can be omitted.
//!
//! The syntax for the macros is as defined by the Scheme standard R7RS for the
//! `read` procedure. The syntactic definition is the `<datum>` in section
//! [`7.1.2 External representations`](https://standards.scheme.org/official/r7rs.pdf)
//! of the standard.
//!
//! The syntax deviations from the standard are:
//!
//! - The extactness (`#e` and `#i`) of numbers is not supported. Floating-point
//! numbers are always inexact and integers are always exact.
//!
//! - Numbers may only be signed 64-bit integers or IEEE 754 double precision
//! floating-point numbers. The standard allows for arbitrary precision
//! integers, rationals and complex numbers.
//!
//! - Labels are not supported.
//!
//! - The `#;` comment syntax is only supported in the macros which parse
//! multiple s-expressions. The `#;` comment syntax may not be nested.
//!
//! - Character literals defined by a hex escape sequence may have 1 to 6
//! digits. The standard excepts 1 or more digits. The code must be a valid
//! Unicode code point.
//!
//! The parser is implemented with the [`pest`](https://crates.io/crates/pest)
//! crate. The grammar is defined in `src/sexpr/sexpr.pest`. This is necessary
//! because the combination parser library defined in `kamo::parser` cannot be
//! used here. It would be cyclic dependency. There will be an implementation
//! of a parser for s-expressions in the `kamo::form` module in the future. It
//! will be based on the `kamo::parser` crate and will be used by the scheme
//! interpreter.
//!
//! ### Examples
//!
//! ```rust
//! use kamo::{mem::Mutator, value::{print, Value}};
//! use kamo_macros::sexpr;
//!
//! let m = Mutator::new_ref();
//! let value = sexpr!(m, "(1 2 3)");
//!
//! assert_eq!(print(value).to_string(), "(1 2 3)");
//! ```
//!
//! ```rust
//! use kamo::value::{print, Value};
//! use kamo_macros::sexpr_file;
//!
//! let values: &[Value] = &sexpr_file!("tests/macros/values.scm");
//!
//! assert_eq!(values.len(), 3);
//! assert_eq!(print(values[0].clone()).to_string(), "()");
//! assert_eq!(print(values[1].clone()).to_string(), "100");
//! assert_eq!(print(values[2].clone()).to_string(), "#t");
//!
//! let values: &[Value] = &sexpr_file!("tests/macros/empty.scm");
//! assert_eq!(values.len(), 0);
//! ```
//!
//! ```rust
//! use kamo::{mem::Mutator, value::{print, Value}};
//! use kamo_macros::sexpr_script;
//!
//! let m = Mutator::new_ref();
//! let values: &[Value] = &sexpr_script!(m, "(define a 1)\n(define b 2)\n(+ a b)");
//!
//! assert_eq!(values.len(), 3);
//! assert_eq!(print(values[0].clone()).to_string(), "(define a 1)");
//! assert_eq!(print(values[1].clone()).to_string(), "(define b 2)");
//! assert_eq!(print(values[2].clone()).to_string(), "(+ a b)");
//!
//! let values: &[Value] = &sexpr_script!("");
//! assert_eq!(values.len(), 0);
//! ```
//!
//! ## Feature List
//!
//! - [x] Module `kamo::parser` for parsing UTF-8 text. A parser combinator
//! library for parsing UTF-8 text in a safe and mostly zero-copy way.
//! - [x] Module `kamo::mem` for automatic memory management. Values are allocated
//! in a mutator which holds an arena allocator for each type. Memory collection
//! is done by a mark and sweep garbage collector.
//! - [x] Module `kamo::value` for values. Values can either hold immediate values
//! or pointers to values allocated in the mutator.
//! - [x] Module `kamo::types` for types. The type system is used to infer the
//! types of the intermediate representation and the AST.
//! - [ ] Module `kamo::eval` for evaluation. The evaluator processes an AST, which
//! is an symbolic expression tree, and evaluates it to an intermediate
//! representation. The intermediate representation can then be interpreted or
//! compiled to a target representation. The interpreter is generic and can be
//! used to interpret any intermediate representation.
//! - [ ] Module `kamo::repl` for a read-eval-print-loop. The REPL is used to
//! interactively evaluate expressions and is generic and can be used to
//! evaluate any intermediate representation.
//! - [ ] Module `kamo::lang::scheme` for the Scheme language. The Scheme language
//! is implemented as a library on top of the `kamo` modules. It implements a
//! subset of the R7RS standard.
extern crate lazy_static;
pub use Position;
pub use ;