Expand description
A persistent string type with the same API as a linked-list of characters.
§Contents
§FiniteCharList
The FiniteCharList type is probably what you want to use. It behaves just
like a linked list of characters but with better (TODO: verify) performance.
§CharList<Tail>
The CharList<Tail> type allows customization of the data structure by
allowing a custom tail. I’m using this to emulate Prolog’s open-ended lists
where the tail of the list may be an uninstantiated Prolog variable.
A FiniteCharList is just a type alias for CharList<()>.
Generic CharLists have much fewer abilities provided by this crate because
the nature of the tail is unknown. For instance, CharList<Tail> does not
implement AsRef<str> or even PartialEq!
The intention is that you will wrap a CharList<YourTail> in a newtype and
implement those traits if it makes sense for your problem space. For
example, I (plan to) use a CharList<ast::Expression> which will
implement PartialEq because I want syntactic equality.
§DISCLAIMER: unsafe
This crate is a work in progress. Specifically Not all uses of unsafe
have been validated! Please don’t use this for anything serious yet.
Safety audits are welcome and appreciated! I’m still quite new to writing
unsafe code.
Also, this crate depends on front-vec
which is also badly in need of auditing.
§Internal Representation
This crate relies on front_vec::FrontString to store arrays of bytes which allow for
efficient prepending (pushing characters onto the front of the array).
Each FrontString is owned by a PqRcCell (internal to this crate, soon
to be it’s own crate) which is sorta like a cross between a RefCell and a
RcBox (the thing an Rc pointer points to).
A CharList is just a pointer + length (though it’s not a slice) which
refers to a specific slice of a PqRcCell-managed FrontString.
The PqRcCell ensures that out of all CharLists which refer to it’s
FrontString, only those with the longest length are allowed to mutate
the inner FrontString value. And they can only* mutate it in ways that
other CharLists won’t notice.
* (Ok, at this point we make developers pinky-swear 🤞 they’ll follow those
rules, and they gotta wrap their use in an unsafe block. Maybe there’s a
better way to do this in the future though 🤔)
That was a lotta words, let’s look at an example with memory diagrams.
§Memory Diagram
§Note: Simplified Diagrams
Every time the word len or refs appears in these diagrams, in the code
it corresponds to either priority or priorities. The internal
representation of a CharList is actually just a PqRc (Priority Queue
Reference Counted) pointer which abstracts over its priority. For
CharLists, priority is the same as string length.
§Example
Consider this code:
use assert2::assert; // For nicer assertions.
let icon = FiniteCharList::from("icon");
let nomicon = icon.cons_str("nom");
let rustonomicon = nomicon.cons_str("rusto");
let rustonomicon2 = rustonomicon.clone();
assert!(icon == "icon");
assert!(nomicon == "nomicon");
assert!(rustonomicon == "rustonomicon");
assert!(rustonomicon == rustonomicon2);These values would be represented in memory like this:
(The arrows with dashed lines represent the implied beginnings of
CharList slices, i.e. icon implicitly “refers to” the last four bytes
in the buffer.)
Notice that all cons_str and clone operations are immutable, i.e. they
create new CharLists instead of mutating the original.
§One More cons
let the_book = rustonomicon.cons_str("the ");
assert!(the_book == "the rustonomicon");Here’s what memory looks like now:
Because rustonomicon has the longest length in the refs table, it’s
perfectly safe to call the underlying FrontString’s push_char_front
method (mutably!).
Notice that by pushing a character onto the front, rustonomicon doesn’t
affect any other CharList’s view of the underlying FrontString.
§Dropping
Ok now what happens if we drop the longest three CharLists?
drop(rustonomicon);
drop(the_book);
drop(rustonomicon2);Notice that the internal FrontString’s len went down to 7. This means
we can reuse the memory that used to be held by the longer strings!
§Tricky cons
Here’s a problem though. What if we want to cons the character 'b' onto
the front of icon?
let janelle_monae = icon.cons('b');
assert!(janelle_monae == "bicon");Well, if we overwrite the 'm' that currently sits in front of “icon”,
we’d be modifying nomicon’s view of the string. That’s no good, so we
gotta clone icon’s view of the FrontString and mutate the copy.
Modules§
Structs§
- An efficient string type with the same API as a linked-list of characters.