haskell_bits 0.2.0

Rust implementations of various Haskell typeclasses and functions
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
# haskell_bits

Various typeclasses and concepts from Haskell, implemented in Rust.

## Synopsis

Currently this library is an implementation of the Functor/Applicative/Monad hierarchy in Rust. It's very much a work in progress, and I'm releasing it now largely as a request for comment, I'd like to have some discussions about my design choices with others before I go too far down this path.

There has been some work in this space, examples are:

* [`fp-core`](https://crates.io/crates/fp-core)
* [`functional`](https://crates.io/crates/functional)

## So what makes this library different?!

The one thing (I believe) makes this library unique is that one can define generic functions on Monads, Applicatives, Functors etc.

The libraries mentioned above, seem to define Functor, Applicative, Monad for various types, so `fmap`, `bind` etc, work on those types, but, there doesn't seem to be a way to write one generic function that works on all types defined as say, Monad. 

This I think is a very important issue. The great thing about these typeclasses like Monad in Haskell is that if you've got `M` different Monads, and `N` different functions defined on Monads, you've now got effectively `M*N` functions defined. Make a new Monad, you've got a huge infrastucture for free. Make a new Monad function, it works on 100s of Monadic types already defined by others. 

This is one of the key things that makes Monads and associated concepts, and more generally Haskell, so powerful, in that you can combine things in a multiplicative numbers of ways, so much so that many problems just consist of putting together existing bits of code, of which the type system ensures you can only do safely, and almost always, correctly. This is a huge advantage in terms of productivity and in particular, reliability. 


## Example

Here is a simple Haskell function:

```
monadic_pair :: Monad m => m a -> m a -> m (a, a)
monadic_pair x y = do
    x_val <- x
    y_val <- y
    pure (x_val, y_val)
```

And this is the equivalent using this library in Rust:

```
fn monadic_pair<TCon, T, TArg>(x: &TArg, y: &TArg) -> <TCon as WithTypeArg<(T, T)>>::Type
where
    TCon: Monad + WithTypeArg<T> + WithTypeArg<(T, T)>,
    TArg: TypeApp<TCon, T>,
    T: Clone,
{
    mdo! {
        x_val =<< x;
        y_val =<< y;
        ret<TCon> (x_val.clone(), y_val.clone());
    }
}
```

We can then apply this to vectors:

```
let v1: Vec<u32> = vec![1, 2, 3];
let v2: Vec<u32> = vec![4, 5, 6];        
let v_result = monadic_pair(&v1, &v2);

assert_eq!(
    v_result,
    vec![
        (1, 4),
        (1, 5),
        (1, 6),
        (2, 4),
        (2, 5),
        (2, 6),
        (3, 4),
        (3, 5),
        (3, 6)
    ]
);
```

Or options:

```
let o1: Option<u32> = Some(1);
let o2: Option<u32> = Some(2);
let o3: Option<u32> = None;

let o1_result = monadic_pair(&o1, &o2);
let o2_result = monadic_pair(&o1, &o3);

assert_eq!(o1_result, Some((1, 2)));
assert_eq!(o2_result, None);
```

All with the expected results.

We can even do the standard applicative style function application:

```
let _applicative_option_result = (|x| move |y| x + y).lmap(o1).lap(o2);
assert_eq!(Some(3), _applicative_option_result);
```

With some caveats I'll mention later.

## So how does it work?

Rust does not have higher kinded types, well at least ones we can abstract over. What I mean by this is that whilst Rust happily allows `Vec<T>` for some generic `T`, Rust doesn't allow `T<u32>` for some generic `T`. It allows `Vec<u32>`, but we can't abstract over `Vec` like we can `u32`. 

This has some particular consequences. Lets say we want to write a trait in Rust that encapsulates the idea of applying a function to all parts of a structure. We might try:

```
trait Mapable {
  map<TIn, TOut>(f: impl Func<TIn, TOut>, x: Self<TIn>) -> Self<TOut>;
}

impl Mapable Vec {
  ...
}

impl Mapable Option {
  ...
}

// etc...
```

#### A slight snag

Except we can't do this. Rust will refuse to compile this, because `Vec`, `Option` (and by inference, `Self`) are not types. `Option<u32>` is, but `Option` is not. 

So we need to do a trick:

```
pub trait WithTypeArg<T>
{
    type Type;
}
```

And then:

```
pub struct VecTypeCon;

impl<T> WithTypeArg<T> for VecTypeCon {
    type Type = Vec<T>;
}
```

We can then define `Mapable` like so:

```
pub trait Mapable {
    fn map<TIn, TOut>(
        f: impl Fn(TIn) -> TOut,
        x: <Self as WithTypeArg<TIn>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type
}
```

And here's a simple implementation for `Option`s:

```
impl Mapable for VecTypeCon {
    fn lmap<TIn, TOut>(
        f: impl Fn(TIn) -> TOut,
        x: <OptionTypeCon as WithTypeArg<TIn>>::Type,
    ) -> <OptionTypeCon as WithTypeArg<TOut>>::Type {
        Option::map(x, f) // Option itself already has a specific "map" function
    }
}
```

and for `Vec`s:

```
impl Mapable for VecTypeCon {
    fn map<TIn, TOut>(
        f: impl Fn(TIn) -> TOut,
        x: <VecTypeCon as WithTypeArg<TIn>>::Type,
    ) -> <VecTypeCon as WithTypeArg<TOut>>::Type {
        let size = x.capacity();
        let mut v: Vec<TOut> = Vec::with_capacity(size);
        for e in x {
            v.push(f(e));
        }
        v
    }
}
```

This is pretty good so far, but there's a small issue, which we sort out.

Notice the main motivation for this library is to be able to combine these functions in a generic way.

Well lets make a silly function, that just does two maps in a row (this is a silly function as it would be more efficient just to combine the functions, but lets go with it):

```
fn map2<TCon, TIn, TMid, TOut>(
    f: impl Fn(TIn) -> TMid,
    g: impl Fn(TMid) -> TOut,
    x: <TCon as WithTypeArg<TIn>>::Type,
) -> <TCon as WithTypeArg<TOut>>::Type
{
    map(g, map(f, x))
}
```    

The unfortunately doesn't work, even though it looks quite type correct. I think it's because Rust won't match against a type unless it's fully resolved, and `<TCon as WithTypeArg<TIn>>::Type` can't be resolved because we don't know what `TCon` or `TIn` is. Therefore we can't pass it to `map`, even though it wants an argument of `<TCon as WithTypeArg<TIn>>::Type` (the exact same thing), because I think Rust wants to resolve them both before checking if they're equal.

So what we do is adjust our definition of `WithTypeArg`

```
pub trait WithTypeArg<T> {
    type Type: TypeApp<Self, T>;
}
```

And write this new trait `TypeApp`:

```
pub trait TypeApp<TCon, T>
where
    TCon: WithTypeArg<T>,
    Self: Is<Type = <TCon as WithTypeArg<T>>::Type>,
{
}
```

Yes, notice these traits refer to each other. Also of importance is the `Is` trait. This does take some getting your head around, I would suggest looking at the source of my [`is_type`](https://docs.rs/crate/is_type/0.2.0/source/src/lib.rs) library. But basically what this allows one to do is to force a one-to-one correspondence between a trait and an actual type. 

So you can basically go back and forth between them at will. 

So then we can redefine `map` as follows, replacing any occurance of an argument (but not return type) of:

`<TCon as WithTypeArg<T>>::Type`

with:

`impl TypeApp<TCon, T>`


```
pub trait Mapable {
    fn map<TIn, TOut>(
        f: impl Fn(TIn) -> TOut,
        x: impl TypeApp<TCon, TIn>,
    ) -> <Self as WithTypeArg<TOut>>::Type
}
```

and likewise change `map2`:

```
fn map2<TCon, TIn, TMid, TOut>(
    f: impl Fn(TIn) -> TMid,
    g: impl Fn(TMid) -> TOut,
    x: impl TypeApp<TCon, TIn>,
) -> <TCon as WithTypeArg<TOut>>::Type
{
    map(g, map(f, x))
}
```

and we're good.

It seems for some reason whilst won't pass `<TCon as WithTypeArg<T>>::Type` to an argument of type `<TCon as WithTypeArg<T>>::Type`, it will pass `<TCon as WithTypeArg<T>>::Type` to a trait argument of `TypeApp<TCon, T>`.

Note in the implementation, that means we'll have to replace all occurances of `x` with `x.into_val()`, where `into_val` is part of the trait `Is` (in the where clause of `TypeApp`).

Basically, this is the core of the approach. We just do a similar thing for Applicative and Monad, but there are a few details and snags that are run into which I'll detail below. 

Note that called the above class `Mappable`, whereas actually in the library what I've detailed above is actually called `LinearFunctor` in the library, which I'll also talk about now:

## The structure of the library.

Currently, there are 7 main traits in the library:

* `Functor`
* `Lift`
* `Applicative`
* `Monad`
* `LinearFunctor`
* `LinearApplicative`
* `LinearMonad`

Functor, Applicative and Monad are basically copies from Haskell, except the `pure` function of `Applicative` is split into the trait `Lift`. So both `Lift` and `Applicative` extend `Functor`, and `Monad` extends both `Lift` and `Applicative`.

The `LinearFunctor/LinearApplicative/LinearMonad` are the "by value" versions of `Functor/Applicative/Monad`. As in they consume their arguments. This can be an efficiency gain because they don't need to copy their arguments, it also means they can be defined on types which aren't `Cloneable`.

The standard prefix for the normal (i.e. by reference versions) is either `f` (in Functor) or no prefix (otherwise), the by value versions (i.e. linear versions) always are prefixed by an `l`. 

Note out of the `Linear*` classes, `LinearFunctor` is a bit special. `Functor` extends `LinearFunctor`, that is every `Functor` is also a `LinearFunctor` (but not the other way around). This is because once one defines `Functor`, you can always define `LinearFunctor` just by taking a reference.

But both `LinearApplicative` and `LinearMonad` take their function argument as an `FnOnce` argument. This is, particularly in the case of `LinearApplicative`, to allow chaining, running a produced closure twice will probably require an explicit `.clone()` at some point. That means `LinearApplicative` and `LinearMonad` can be defined for far less types, generally ones that don't "multiply". Ie. whilst `Vec` is a `LinearFunctor` it's not a `LinearApplicative` or `LinearMonad`, `Option` however is all of these. 

Also all trait functions also have plain old top level functions that call them, as well as often having other trait functions that call them, although these trait functions are only for the purpose of allowing `.` notation. I'll go into some technical details about the reasoning for these now.

## Technical details (particularly for implementors of traits)

I'm going to give the `LinearFunctor` trait as an example. You've seen code similar to this before, but this is the actual code:

```
// Implement this trait for LinearFunctor
pub trait LinearFunctor {
    fn lmap<TIn, TOut>(
        f: impl Fn(TIn) -> TOut,
        x: <Self as WithTypeArg<TIn>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type
    where
        Self: WithTypeArg<TIn> + WithTypeArg<TOut>;
}

// Call this for lmap(f, x) syntax
pub fn lmap<TCon, TIn, TOut>(
    f: impl Fn(TIn) -> TOut,
    x: impl TypeApp<TCon, TIn>,
) -> <TCon as WithTypeArg<TOut>>::Type
where
    TCon: LinearFunctor + WithTypeArg<TIn> + WithTypeArg<TOut> + ?Sized,
{
    <TCon as LinearFunctor>::lmap(f, x.into_val())
}

// This is for x.lmapop(f) syntax
pub trait LMapExt {
    fn lmap<TCon, TIn, TOut>(self, x: impl TypeApp<TCon, TIn> + Sized) -> <TCon as WithTypeArg<TOut>>::Type
    where
        Self: Fn(TIn) -> TOut + Sized,
        TCon: LinearFunctor + WithTypeArg<TIn> + WithTypeArg<TOut>
    {
        lmap(self, x)
    }
}

impl<T> LMapExt for T {}
```

Note there's three functions here, one is just a definition, the second is an implementation that calls the first, and the third calls the second.

The first is what you actually implement. Note that the arguments here are actual types. There's no `x: impl TypeApp<TCon, TIn>`. This means, as discussed above, this function [works poorly](#A-slight-snag) when it comes to type inference. 

So the second function here, top level function, instead takes the `TypeApp` trait as it argument, which means it works better with type inference. But then here we need to worry about converting from the trait type to the "real" type, so we call `into_val()` from the `is_type` crate to do this.

The third function is just a trait so we can use the `.` syntax if we want. This becomes useful if we want to chain things like operators, i.e. `f.lmap(x).lap(y)`.

Note in the case of `Functor` only, the function `map` works with both by value and by reference arguments, and calls either `fmap` or `lmap` depending on whether the arguments are by value or by reference.

## Do notation

The macro `!mdo` allows one to write in "do-notation" form. This code is largely stolen (slightly modified) from [rust-mdo](https://github.com/TeXitoi/rust-mdo/blob/master/src/lib.rs).

Note that do notation is currently only defined with the by-reference version of the Monad functions, so you'll see `&` and `Clone::clone(...)` in many places. Not also sometimes `Clone::clone` is better for type inference than `.clone()`, I believe because the latter works for both values and references.

## Quirks

There's a few little quirks one gets from this approch, which makes things slightly messier than equivalent Haskell code in some cases. I'll talk about features that could be added to Rust in [this later section](#Features-that-could-be-added-to-Rust-to-make-things-nicer).

### `WithTypeArg` constraints everywhere

When writing generic functions over Functors, Applicatives or Monads, one has to use the constraint `WithTypeArg` all over the place. Indeed the function `map2` above doesn't actually work, one needs to write it like this:

```
    fn lmap2<TIn, TMid, TOut, TCon>(
        f: impl Fn(TIn) -> TMid,
        g: impl Fn(TMid) -> TOut,
        x: impl TypeApp<TCon, TIn>,
    ) -> <TCon as WithTypeArg<TOut>>::Type
    where
        TCon: LinearFunctor + WithTypeArg<TIn> + WithTypeArg<TMid> + WithTypeArg<TOut>
    {
        lmap(g, lmap(f, x))
    }
```

Notice all the `WithTypeArg<TIn> + WithTypeArg<TMid> + WithTypeArg<TOut>` constraints. One has to put this constraint for every type argument application that even exists in the function. For example in `lmap2` above, we need `WithTypeArg<TMid>` even though it's not an input or output argument, just because it's the result of the innermost map call. 

### Closure types can't be named

In stable rust, closure types can't be named, nor (as far as I could work out) can one define types that implement `Fn` (I think this requires extensions). This comes into play when trying to define some functions. For example, consider the definition of `Applicative`:

```
pub trait Applicative: Functor + Lift {
    fn ap<TIn, TOut, TFunc>(
        f: &<Self as WithTypeArg<TFunc>>::Type,
        x: &<Self as WithTypeArg<TIn>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type
    where
        Self: WithTypeArg<TFunc> + WithTypeArg<TIn> + WithTypeArg<TOut>,
        TFunc: Fn(&TIn) -> TOut,
    {
        <Self as Applicative>::lift2(|y1: &TFunc, y2: &TIn| y1(y2), f, x)
    }

    fn lift2<TIn1, TIn2, TOut, TFunc>(
        f: TFunc,
        x1: &<Self as WithTypeArg<TIn1>>::Type,
        x2: &<Self as WithTypeArg<TIn2>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type
    where
        Self: WithTypeArg<TIn1> + WithTypeArg<TIn2> + WithTypeArg<TOut>,
        TFunc: Fn(&TIn1, &TIn2) -> TOut;
}
```

We've defined `ap` in terms of `lift2`, but it would be nice to define it the other way around also, so implementers could choose which to implement. Roughly speakking, the [identity](http://hackage.haskell.org/package/base-4.12.0.0/docs/Control-Applicative.html#t:Applicative) is as follows:

```
lift2(f, x, y) = f.fmap(x).ap(y)
```

But in Rust, without automatic currying, we'll need something like:

```
lift2(f, x, y) = (|x| |y| f(x,y)).fmap(x).ap(y)
```

So in this case, the output of `fmap` is an Applicative of Functions. Eventually `ap` applies and this becomes just values, but the intermediate step is a function.

But this function doesn't have a type we can name. But as above, Rust wants a `WithTypeArg` constraint above for every argument to our functions, but we can't write the type for this function because is doesn't have a type name.

A practical issue with this is that you can't create say, and M<F>, where F is some function, inside a function. You have to always pass it in. 

### Some fiddling with type inference

Notice there's actually two `lift()` functions defined (`lift` is just `pure`, but `pure` was a former keyword so I've avoided it).

```
// lift(x)
pub fn lift<TCon, T>(x: T) -> <TCon as WithTypeArg<T>>::Type
where
    <TCon as WithTypeArg<T>>::Type : TypeApp<TCon, T>,
    TCon: Lift + WithTypeArg<T>,
{
    <TCon as Lift>::lift::<T>(x)
}


// lift_c(x)
pub fn lift_c<TCon, T, U>(x: U::Param) -> U
where
    T : Is<Type = U::Param>,
    U : TypeApp<TCon, T>,
    <TCon as WithTypeArg<T>>::Type : TypeApp<TCon, T>,
    TCon: Lift + WithTypeArg<T>,
{
    Is::from_val(lift::<TCon, T>(Is::from_val(x)))
}
```

They both do the same thing, indeed the second just calls the first, but their effect on type inference is different.

The first doesn't allow the inference to go backwards. For example, if one goes:

```
x: Option<u32> = lift(5);
```

then the compiler will not be able to work this out, because `5` could be many different numeric types.

But if one goes like this:

```
x: Option<u32> = lift_c(5);
```

It will be able to work out from the result type that `5` is a `u32`. 

The problem is that `lift_c` doesn't work as well in generic functions over say all monads.

So think of `lift_c` as "lift concrete", as in we have a concrete type.

There's a similar division with `bind` and `bind_c`.

As a result also, there's both `!mdo` and `!mdo_c` macros, `!mdo` should be used in generic code, but `!mdo_c` may be used when one knows the concrete types as there may be better inference in that case.

One example of this is as follows:

```
let o1: Option<u32> = lift_c(5);
let o2: Option<u32> = lift_c(7);

let _do_result: Option<(u32, u32)> = mdo_c! {
    x =<< &o1;
    y =<< &o2;
    ret (Clone::clone(x), Clone::clone(y));
};
```

Notice here one doesn't have to be explicit about the type constructor of the return value out of the macro.

But in `monadic_pair`:

```
    fn monadic_pair<TCon, T, TArg>(x: &TArg, y: &TArg) -> <TCon as WithTypeArg<(T, T)>>::Type
    where
        TCon: Monad + WithTypeArg<T> + WithTypeArg<(T, T)>,
        TArg: TypeApp<TCon, T>,
        T: Clone,
    {
        mdo! {
            x_val =<< x;
            y_val =<< y;
            ret<TCon> (x_val.clone(), y_val.clone());
        }
    }
```

We have to explicitly specify the type constructor of the result of the do-block, namely `TCon`. Rust can't infer this unfortunately.

## Features that could be added to Rust to make things nicer

Overall, despite this looking a bit messy in some places, for client code, it's not too bad. Generic functions over monads are a bit messy to define, but if what you're doing mostly is just using these functions on a particular concrete monad, the code is actually fairly clean. But here's some things that would help:

### Higher kinded types (HKT)

An actual implementation and syntax for higher kinded types would be ideal, as we wouldn't need to do the "hack things into a trait" thing to make this work, and presumably type inference would also be nicer if this was done. But failing that there's a few features which would go a long way to help:

### Forall in where clauses

If we could define something like:

```
trait TypeCon<TCon> where TCon : forall U. WithTypeArg<U> 
```

This would solve the [`WithTypeArg` constraints everywhere](#WithTypeArg-constraints-everywhere) issue. It would also solve the [closure types can't be named](#Closure-types-can't-be-named) issue because they'd be no need to explicitly name such temporary types.

### Generic associated types (GAT)

This is an actual feature underdevelopment which may help, similarly to a "forall" option, but the particular part of this feature needed hasn't landed in nightly yet and has no particular timeline, according to [this issue reply](https://github.com/rust-lang/rust/issues/44265#issuecomment-560941760)

## Future work

There's a few obvious things to be done, firstly implementing Functor/Applicative/Monad for types other than `Option` and `Vec`, `Result` is the obvious next choice, along with other things in the standard Rust library.

`IO` is also a possibility, and I think implementing some of Haskell's [`Parsec`](https://hackage.haskell.org/package/parsec) could allow for some good illistrative examples of the power of this.

Also, some of the more useful monadic functions, like [`mapM`](http://hackage.haskell.org/package/base-4.12.0.0/docs/Control-Monad.html#v:mapM) require [traversable](http://hackage.haskell.org/package/base-4.12.0.0/docs/Data-Traversable.html#t:Traversable) and [foldable](http://hackage.haskell.org/package/base-4.12.0.0/docs/Data-Foldable.html#t:Foldable), so these are the obvious traits to implement next.

## Where to start digging around the code.

There's a whole lot of test code dumped in the `test()` function in [`lib.rs`](https://github.com/clintonmead/haskell_bits/blob/master/src/lib.rs), that's probably the best place to start as there's plenty of examples of usage there. 

## Discussion and comments welcome!

The whole point of this was to open up discussion, particularly on core aspects of the design at this point, so please feel free to open up a github issue if you have any suggestions. Or even open up a PR if those changes are more concrete or would just like to add some functions, traits or implementations.