[−][src]Crate smart_access

Smart accessors

Let's begin with a few words on naming.

What is commonly named “smart pointer” is usually associated with trivial access (dereference) semantics and nontrivial clone/drop semantics.

Smart accessors provided by this crate also serve the purpose of accessing some data, but in a different way: they have trivial drop semantics and nontrivial access semantics.

TLDR

If you do not want to read a long text, just proceed to the essential part of the documentation.

Also there is the version migration guide.

High level overview

The goal of this crate is threefold:

to offer one possible solution to the problem that the current (rustc 1.44) borrowchecker doesn't understand functions with multiple exit points (Polonius doesn't have this problem but it is not stable yet)
to provide a way to do bidirectional programming (when updating some view of data updates the data viewed to match the updated view)
last but not least, to reduce amount of callback hell while programming in continuation-passing-style

If you are aqcuainted with optics in functional languages you can think of this crate as a minimalistic “lens” (more precisely, affine traversal) library using an opionated imperative approach.

Note

As a side-effect of the library design one can use a “build” pattern with standard &mut references (see below).

Usage examples

This crate already implements accessors for stdlib collections:

use smart_access::Cps;

let mut foo = vec![vec![1,2,3], vec![4,5,6]];

let bar = foo.at(0).at(1).replace(7);
assert!(foo == vec![vec![1,7,3], vec![4,5,6]]);
assert!(bar == Some(2));

let baz = foo.at(2).at(1).replace(8);
assert!(foo == vec![vec![1,7,3], vec![4,5,6]]);
assert!(baz == None);  // None is returned if path doesn't make sense

// Any mutable reference can be used as a "data location":
assert!(foo[0][0].replace(9) == Some(1));
assert!(foo == vec![vec![9,7,3], vec![4,5,6]]);

A somewhat more interesting example:

let mut foo = vec![1,2,3,4,5,6];

let bar = foo.at(1..=3).replace(vec![7,8]);
assert!(foo == vec![1,7,8,5,6]);
assert!(bar == Some(vec![2,3,4]));

An arbitrary mutating operation can be used instead of replacement:

let mut foo = vec![1,2,3,4,5,6];

let bar = foo.at(1..4).access(|v| { *v = vec![v.iter().sum()]; "baz" });
assert!(foo == vec![1,9,5,6]);
assert!(bar == Some("baz"));

// And with mutable references you get a sort of the "build" part of the Builder pattern
foo[0].access(|x| { /* do something with the element */ });

Usage guide

To add a smart accessor to your own datatype Data you need to:

choose some type Index
add trait At<Index> to the type Data
implement access_at method
at the usage site write use smart_access::Cps;
PROFIT!

Motivation (part I: lifetimes)

Suppose you have HashMap but without “Entry API” (Entry API is an implementation feature: not every datastructure in the wild provides any analogue).

Suppose also that you want to implement something akin to |m, k, v| m.entry(k).or_insert(v).

You could write

// for simplicity we use usize keys in the examples below
fn or_insert<V>(hm: &mut HashMap<usize,V>, k: usize, v: V) -> &mut V {
    if let Some(v) = hm.get_mut(&k) { 
        return v; 
        // this is the first exit point but the borrow checker
        // doesn't distinguish between it and the second exit point
    }     

    hm.insert(k, v);  // Oops: hm is already borrowed! 
                      // (It _MUST_ be borrowed until the exit point)

    hm.get_mut(&k).unwrap()
    // the second exit point
}

but it would not compile because of limitations of the borrow checker.

It seems there is no way to write such a function without additional queries to the HashMap and without resorting to reference-pointer-reference conversions or other unsafe techniques.

This crate provides a not-so-clumsy workaround:

use std::collections::HashMap;
use smart_access::{At, Cps};

struct Ensure<K,V> { key: K, value: V }

impl<V> At<Ensure<usize, V>> for HashMap<usize, V> 
{
    type View = V;

    fn access_at<R, F>(&mut self, kv: Ensure<usize, V>, f: F) -> Option<R> where
        F: FnOnce(&mut V) -> R
    {
        if let Some(v) = self.get_mut(&kv.key) { 
            return Some(f(v)); 
            // We use the so called CPS-transformation: we wrap each 
            // return site with a call to the provided function.
        } 

        self.insert(kv.key, kv.value);
        Some(f(self.get_mut(&kv.key).unwrap()))
    }
}

// now you can write or_insert (note the return type!):
fn or_insert<'a, V>(hm: &'a mut HashMap<usize,V>, k: usize, v: V) -> impl Cps<View=V> + 'a {
    hm.at(Ensure{ key: k, value: v })
}

There are some peculiarities though:

&mut V is eager: all code which is needed to obtain a reference to the value is executed at the site of access
impl Cps<View=V> is lazy: access is a zero-cost operation and all the machinery needed to reach the value is run at the site of modification
&'a mut V can be reborrowed, i.e. cloned for some subperiod of 'a, making it possible to modify the value referenced more than once
impl Cps<View=V> can be used only once but has batching. It comes in two flavors: compile-time batching which can't be used across any control flow and runtime batching which can't be used in no_std contexts

Note

The forementioned accessor Ensure { key: K, value: V } is defined in stdlib_impls simply as a pair (K,V) so for example you can write

map.at( ("foo".to_string(), "bar".to_string()) ).touch();

instead of

map.entry("foo".to_string()).or_insert("bar".to_string());

Motivation (part II: bidirectional programming)

We give a simple illustration: a toy example of a bidirectional vector parser.

Not only can it parse a vector but also can print it back (note that two bidirectional parsers can be combined into a bidirectional translator from one textual representation to another).

A combinator library greatly facilitating writing such parsers can be implemented but it is not a (current-time) goal of this crate.

Note

Some function definitions in the following code are hidden. To see them look at the full module source.

// A little showcase:
assert!(vector_parser().bi_left((Some(vec![1,2,3]),"".into())) == "[1,2,3]".to_string());
assert!(vector_parser().bi_right(&mut "[1,2,3] foo".into()).0  == Some(vec![1,2,3]));
assert!(vector_parser().bi_right(&mut "[1,2,3,]bar".into()).0  == Some(vec![1,2,3]));
assert!(vector_parser().bi_right(&mut "[,]".into()).0          == None);
assert!(vector_parser().bi_right(&mut "[]".into()).0           == Some(vec![]));
assert!(vector_parser().bi_right(&mut "]1,2,3[".into()).0      == None);

// The code:
use smart_access::{At, Cps};

// a minimal set of parser combinators
#[derive(Clone)] struct _Number;
#[derive(Clone)] struct _Char(char);
#[derive(Clone)] struct _Many<T>(T);
#[derive(Clone)] struct _Optional<T>(T);
#[derive(Clone)] struct _Cons<Car,Cdr>(Car,Cdr);
#[derive(Clone)] struct _Iso<Parser,F,G>(Parser,F,G);

fn vector_parser() -> impl Bidirectional<String, Parse<Vec<usize>>> {
    let grammar = 
        _Cons(_Char('['), 
        _Cons(_Many(_Cons(_Number, _Char(','))),
        _Cons(_Optional(_Number),
              _Char(']'))));
     
    let from_grammar = |(_bl, (xs, (ox, _br))): (_, (Vec<_>, (Option<_>, _)))| 
    {
        xs.into_iter().map(|(x, _comma)| x).chain(ox.into_iter()).collect()
    };

    let to_grammar = |mut vec: Vec<_>| {
        let last = vec.pop();

        ('[', (vec.into_iter().map(|x| (x, ',')).collect(), (last, ']')))
    };

    _Iso(grammar, from_grammar, to_grammar)
}

trait Bidirectional<A,B> {
    fn bi_left(self, b: B) -> A;
    fn bi_right(self, a: &mut A) -> B;
}

// DO NOT USE IN PRODUCTION: efficient parsing is incompatible 
// with using copies of tails of the parsed string
type Parse<T> = (Option<T>, String);

// a very simplistic blanket implementation
impl<A,B,I> Bidirectional<A,B> for I where
    A: At<I, View=B> + Default,
    B: Clone
{
    fn bi_left(self, b: B) -> A {
        let mut a = A::default();

        a.at(self).access(|x| { *x = b; });

        a
    }

    fn bi_right(self, a: &mut A) -> B {
        a.at(self).access(|b| b.clone()).unwrap()
    }
}
 
impl At<_Number> for String {
    type View = Parse<usize>;

    // access_at is hidden
}

impl At<_Char> for String {
    type View = Parse<char>;

    // access_at is hidden
}
 
impl<V, Parser> At<_Many<Parser>> for String where
    String: At<Parser, View=Parse<V>>,
    Parser: Bidirectional<String, Parse<V>> + Clone,
{
    type View = Parse<Vec<V>>;

    // access_at is hidden
}

impl<V, Parser> At<_Optional<Parser>> for String where
    String: At<Parser, View=Parse<V>>,
    Parser: Bidirectional<String, Parse<V>> + Clone,
{
    type View = Parse<Option<V>>;

    // access_at is hidden
}

impl<V1, V2, P1, P2> At<_Cons<P1, P2>> for String where
    String: At<P1, View=Parse<V1>>,
    String: At<P2, View=Parse<V2>>,
    P1: Bidirectional<String, Parse<V1>> + Clone,
    P2: Bidirectional<String, Parse<V2>> + Clone,
{
    type View = Parse<(V1,V2)>;

    // access_at is hidden
}

impl<Parser, FromParser, ToParser, T, V> 
At<_Iso<Parser, FromParser, ToParser>> for String where
    String: At<Parser, View=Parse<T>>,
    Parser: Bidirectional<String, Parse<T>> + Clone,
    T: Clone,
    FromParser: FnOnce(T) -> V,
    ToParser: FnOnce(V) -> T,
{
    type View = Parse<V>;

    // access_at is hidden
}

Connection to functional programming

Rust type fn(&mut V) roughly corresponds to Haskell type v -> v.

Thus Rust access_at type could be written in Haskell (after some argument-swapping) as

accessAt :: ix -> (v -> (v,r)) -> t -> (t, Maybe r)

Suppose that access_at always returns Some(..). In such a case the Haskell type of access_at can be simplified to

accessAt :: ix -> (v -> (v,r)) -> t -> (t,r)

Its type is isomorphic to any of the following

ix -> t -> (v -> (v,r))     -> (t,r)  -- by arg-swapping
ix -> t -> (v->v, v->r)     -> (t,r)  -- by universal property of products
ix -> t -> (v->v) -> (v->r) -> (t,r)  -- by currying

Recall that a lens is uniquely defined by a getter and a setter:

type Lens t v = (t -> v, t -> v -> t)

This type is isomorphic to

type Lens t v = t -> (v, v -> t)

Notice that the types (v, v->t) and forall r. (v->v) -> (v->r) -> (t,r) are rather similiar. Define

right :: (v, v->t) -> (v->v) -> (v->r) -> (t,r)
right (v, v_t) v_v v_r = (v_t (v_v v), v_r v)

left :: (forall r. (v->v) -> (v->r) -> (t,r)) -> (v, v->t)
left f = (snd (f     id        id    ),  -- getter
    \v -> fst (f (\_ -> v) (\_ -> ())))  -- setter

Now we prove (left . right) ~ id:

left (right (v, v_t)) = (v, \x -> v_t x) ~ (v, v_t)

I.e. right is an injection: every value lens :: Lens t v can be presented as left (accessAt ix): it suffices to define

accessAt ix = right lens  -- modulo aforementioned type-fu

In fact the full type (with Maybe)

accessAt ix :: (v -> (v,r)) -> t -> (t, Maybe r)

can house any lens, prism or affine traversal.

Version migration guide

From 0.6 to 0.7

Difference #1

Now the crate is fully independent of std. There is one thing however: HashMap and HashSet are not yet in the alloc::collections.

There are two options:

use the optional hashbrown dependency
link to std by enabling the std_hashmap feature

Difference #2

The first iteration of a general traversal (not affine, i.e. supporting more than one location) API is available. Obviously, it's fully backward-compatible.

From 0.5 to 0.6

Difference #1

The std_collections feature is now named collections.

The std feature is deprecated. In the next version it will very likely be replaced by the alloc feature, responsible for linking to the alloc crate.

There is a new feature iter_mut. It depends on the external crate multiref. That crate doesn't use any complex type-level programming, thus making the increase in the compilation time insignificant.

But if you do not tolerate any dependencies or unsafe code, you can opt it out, losing some expressiveness.

From 0.4 to 0.5

Difference #1

The AT type changed its representation.

Now it has simpler and more flat structure:

AT<CPS, (..(((), I1), I2) .. In)>

instead of

AT<..AT<AT<CPS, I1>, I2> .. In>

Unfortunately, the new structure doesn't play well with type inference but this issue has been circumvented by separating the Cps implementation into a helper trait (this trait isn't exposed to the public API).

Because the AT type isn't to be used explicitly, usually there is no need to change any code.

Nevertheless there may exist some code which has compiled on 0.4 and does not compile on 0.5.

Difference #2

Relevant only to the detach feature.

Now the detach method returns not only the detached part but also the left part (an accessor with the rest of the path attached).

A new method cut is provided. It allows one to mark the place from which the detach starts.

Difference #3

Also about detach.

The Attach trait changed its parameter to View instead of CPS.

Usually it's sufficient to change CPS to CPS::View in generic code.

Interestingly, there are some cases when

fn foo<CPS: Cps, Path: Attach<CPS::View>> // ...

is not equivalent to

fn foo<CPS: Cps<View=V>, Path: Attach<V>, V: ?Sized> // ...

For example, impl Attach<V> works after being returned from a function but impl Attach<CPS::View> doesn't (it seems to be connected with the fact that Rust currenlty doesn't “elaborate” (i.e. reduce) bounds on the associated types).

Cargo features

Currently there are following features:

alloc: Links to alloc.
collections: Provides accessors for some collections. Implies alloc.
hashbrown: Accessors for HashMap and HashSet from the hashbrown crate. Pulls the hashbrown crate, implies alloc.
std_hashmap: Accessors for HashMap and HashSet from std. Warning: links to std.
batch_rt: Provides runtime batching. Implies alloc.
batch_ct: Provides compile-time batching.
batch: An alias for batch_rt and batch_ct enabled simultaneously.
detach: Makes AT-paths detachable.
iter_mut: Accessors for iterators. Pulls the multiref crate, implies alloc.
traversal: Bidirectional iterators in continuation passing style.

All features except std_hashmap are enabled by default.

Modules

collections	Implementation of `At` for some collections. Requires the `collections` feature.
core_impls	Implementation of `At` for core datatypes.
iter_mut	Support for arbitrary mutating iterators. Requires `iter_mut`.
traversal	Support for general traversals. Requires `traversal`.

Macros

path	Constructs a pathlike type for `AT`, `DetachedPath`, and `CpsBatch`.

Structs

AT	A “reference” to some “location”.
CpsBatch	A builder for complex mutations. Requires `batch_ct` or `batch_rt`.

Traits

At	A smart access protocol.
Attach	A detached path. Requires `detach` feature.
Batch	An abstraction over compile-time and runtime batches. Requires `batch_ct` or `batch_rt`.
BatchCt	A compile-time batch. Requires `batch_ct` feature.
BatchRt	A runtime batch. Requires `batch_rt` feature.
Cps	Anything that can provide (or refuse to provide) a mutable parameter for a function.

Functions

detached_at

Creates a detached path. Requires detach feature.

Type Definitions

DetachedPath

A concrete type of detached paths. Requires detach feature.