Expand description
A persistent string type with the same API as a linked-list of characters.
§Contents
§FiniteCharList
The FiniteCharList
type is probably what you want to use. It behaves just
like a linked list of characters but with better (TODO: verify) performance.
§CharList<Tail>
The CharList<Tail>
type allows customization of the data structure by
allowing a custom tail. I’m using this to emulate Prolog’s open-ended lists
where the tail of the list may be an uninstantiated Prolog variable.
A FiniteCharList
is just a type alias for CharList<()>
.
Generic CharList
s have much fewer abilities provided by this crate because
the nature of the tail is unknown. For instance, CharList<Tail>
does not
implement AsRef<str>
or even PartialEq
!
The intention is that you will wrap a CharList<YourTail>
in a newtype and
implement those traits if it makes sense for your problem space. For
example, I (plan to) use a CharList<ast::Expression>
which will
implement PartialEq
because I want syntactic equality.
§DISCLAIMER: unsafe
This crate is a work in progress. Specifically Not all uses of unsafe
have been validated! Please don’t use this for anything serious yet.
Safety audits are welcome and appreciated! I’m still quite new to writing
unsafe
code.
Also, this crate depends on front-vec
which is also badly in need of auditing.
§Internal Representation
This crate relies on front_vec::FrontString
to store arrays of bytes which allow for
efficient prepending (pushing characters onto the front of the array).
Each FrontString
is owned by a PqRcCell
(internal to this crate, soon
to be it’s own crate) which is sorta like a cross between a RefCell
and a
RcBox
(the thing an Rc
pointer points to).
A CharList
is just a pointer + length (though it’s not a slice) which
refers to a specific slice of a PqRcCell
-managed FrontString
.
The PqRcCell
ensures that out of all CharList
s which refer to it’s
FrontString
, only those with the longest length are allowed to mutate
the inner FrontString
value. And they can only* mutate it in ways that
other CharList
s won’t notice.
* (Ok, at this point we make developers pinky-swear 🤞 they’ll follow those
rules, and they gotta wrap their use in an unsafe
block. Maybe there’s a
better way to do this in the future though 🤔)
That was a lotta words, let’s look at an example with memory diagrams.
§Memory Diagram
§Note: Simplified Diagrams
Every time the word len
or refs
appears in these diagrams, in the code
it corresponds to either priority
or priorities
. The internal
representation of a CharList
is actually just a PqRc
(Priority Queue
Reference Counted) pointer which abstracts over its priority. For
CharList
s, priority is the same as string length.
§Example
Consider this code:
use assert2::assert; // For nicer assertions.
let icon = FiniteCharList::from("icon");
let nomicon = icon.cons_str("nom");
let rustonomicon = nomicon.cons_str("rusto");
let rustonomicon2 = rustonomicon.clone();
assert!(icon == "icon");
assert!(nomicon == "nomicon");
assert!(rustonomicon == "rustonomicon");
assert!(rustonomicon == rustonomicon2);
These values would be represented in memory like this:
(The arrows with dashed lines represent the implied beginnings of
CharList
slices, i.e. icon
implicitly “refers to” the last four bytes
in the buffer.)
Notice that all cons_str
and clone
operations are immutable, i.e. they
create new CharList
s instead of mutating the original.
§One More cons
let the_book = rustonomicon.cons_str("the ");
assert!(the_book == "the rustonomicon");
Here’s what memory looks like now:
Because rustonomicon
has the longest length in the refs
table, it’s
perfectly safe to call the underlying FrontString
’s push_char_front
method (mutably!).
Notice that by pushing a character onto the front, rustonomicon
doesn’t
affect any other CharList
’s view of the underlying FrontString
.
§Dropping
Ok now what happens if we drop
the longest three CharList
s?
drop(rustonomicon);
drop(the_book);
drop(rustonomicon2);
Notice that the internal FrontString
’s len
went down to 7. This means
we can reuse the memory that used to be held by the longer strings!
§Tricky cons
Here’s a problem though. What if we want to cons
the character 'b'
onto
the front of icon
?
let janelle_monae = icon.cons('b');
assert!(janelle_monae == "bicon");
Well, if we overwrite the 'm'
that currently sits in front of “icon”,
we’d be modifying nomicon
’s view of the string. That’s no good, so we
gotta clone icon
’s view of the FrontString
and mutate the copy.
Modules§
Structs§
- An efficient string type with the same API as a linked-list of characters.