0.0.4
=====
* Design a backend abstraction (OpenCL, CUDA)
* clBLAS integration
* HDF5 reader
* HDF5 writer
0.0.5
=====
* cuBLAS integration
* MATLAB reader
* MATLAB writer
0.1
===
* More tests and examples
* Documentation improvements
* Public release