fekan
A library to build and train Kolmogorov-Arnold neural networks.
The fekan crate contains utilities to build and train Kolmogorov-Arnold Networks (KANs) in Rust, including a struct to represent a full model and a struct to represent an individual KAN layer, for use on its own or in other models.
Issues and pull requests are welcome!
What is a Kolmogorov-Arnold Network?
Rather than computing a weighted sum of the previous layer's activations and passing that sum through a fixed non-linear function, each node in a KAN passes each activation from the previous layer through a different, trainable non-linear function, then sums the results and outputs the total. This makes the network more interpretable than a traditional neural network and, in some cases, significantly more accurate with a smaller memory footprint.
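The difference between the two kinds of node can be sketched in a few lines of plain Rust. This is only an illustration of the idea, not fekan's implementation: `KanNode`, `edge_functions`, and `mlp_node` are hypothetical names, and ordinary closures stand in for the trainable functions.

```rust
/// Illustrative KAN node (hypothetical type, not part of fekan's API).
/// Each incoming activation gets its own univariate function, and the
/// node outputs the sum of the results.
pub struct KanNode {
    /// One function per incoming edge; closures stand in for trainable splines.
    pub edge_functions: Vec<Box<dyn Fn(f64) -> f64>>,
}

impl KanNode {
    /// Apply the i-th edge function to the i-th input, then sum.
    pub fn forward(&self, inputs: &[f64]) -> f64 {
        assert_eq!(inputs.len(), self.edge_functions.len());
        self.edge_functions
            .iter()
            .zip(inputs)
            .map(|(f, &x)| f(x))
            .sum()
    }
}

/// A conventional node for comparison: weighted sum first,
/// then a single fixed non-linearity (tanh here).
pub fn mlp_node(weights: &[f64], bias: f64, inputs: &[f64]) -> f64 {
    let z: f64 = weights.iter().zip(inputs).map(|(w, x)| w * x).sum::<f64>() + bias;
    z.tanh()
}
```

In the KAN node the non-linearity sits on every edge and is what gets trained; in the conventional node only the weights and bias are trained and the non-linearity is fixed.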
Because the activation of each KAN layer cannot be calculated using matrix multiplication, training a KAN is currently much slower than training a traditional neural network of comparable size. It is the author's hope, however, that the increased accuracy of KANs will allow smaller networks to be used in many cases, offsetting most of the increased training time, and that the interpretability of KANs will more than justify whatever additional training time remains.
For more information on the theory behind this library and examples of problem sets well suited to KANs, see the arXiv paper KAN: Kolmogorov-Arnold Networks.
Binary usage
The fekan crate includes a command-line tool to build and train KANs. Install with
Build a new model and train it on a dataset:
or
Load an existing model for further training:
Load an existing model to make predictions on a dataset:
Example full command to build a classification model to determine whether a set of features maps to a dog or a cat:
fekan will assign each distinct string found in the label field to a different output node, and store the mapping with the model for future use.
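One way to picture the label-to-node mapping is a small function that walks the label strings and hands out output-node indices. The sketch below is an assumption about the general idea only: `build_class_map` is a hypothetical helper, and the order in which fekan actually assigns nodes is not stated here (this sketch uses order of first appearance).

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: give each distinct label string its own
/// output-node index, numbered in order of first appearance.
pub fn build_class_map(labels: &[&str]) -> BTreeMap<String, usize> {
    let mut map = BTreeMap::new();
    for label in labels {
        // The next free index equals the number of labels seen so far.
        let next_index = map.len();
        // `or_insert` leaves the existing index alone for repeated labels.
        map.entry((*label).to_string()).or_insert(next_index);
    }
    map
}
```

Storing a map like this alongside the model is what lets a later prediction run translate the winning output node back into its original string.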
The data for a regression model with a single output would look like:
And for a model with multiple regression outputs, where every sample includes a target for each regression:
For a multi-regression model where a sample may only contain valid target values for certain outputs:
In the above example, the first sample will only be trained on the first label, and the second sample will only be trained on the second label, because those are the only labels with a corresponding true value in their respective samples' label_mask field.
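The effect of a label mask can be sketched as a loss that skips masked-out outputs. This is a minimal illustration under stated assumptions: `masked_mse` is a hypothetical function, and mean squared error is used only as an example loss, not as a claim about the loss fekan uses.

```rust
/// Hypothetical masked loss: mean squared error over only the outputs
/// whose mask entry is `true`. Returns `None` if no output is marked valid.
pub fn masked_mse(predictions: &[f64], labels: &[f64], label_mask: &[bool]) -> Option<f64> {
    assert_eq!(predictions.len(), labels.len());
    assert_eq!(labels.len(), label_mask.len());
    let mut total = 0.0_f64;
    let mut count = 0usize;
    for ((p, y), &valid) in predictions.iter().zip(labels).zip(label_mask) {
        if valid {
            // Only valid targets contribute to the loss (and so to the gradient).
            total += (p - y).powi(2);
            count += 1;
        }
    }
    if count == 0 {
        None
    } else {
        Some(total / count as f64)
    }
}
```

Because masked-out outputs contribute nothing to the loss, whatever placeholder value sits in their label slot has no influence on training.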
For complete usage details, use the help command, e.g. fekan help [COMMAND].
The CLI supports reading data from .pkl, .json, and .avro files (avro currently bugged). Features must be in a single 'list'-type column/field named "features", and labels (for training) must be in a single column named "labels". Labels should be strings for classification models and floats for regression models. Models can be saved as pickle, json, or cbor files, and the format is inferred from the provided file extension.
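Inferring a format from a file extension usually comes down to a small match on the path. The sketch below shows the general pattern only: `ModelFormat` and `infer_model_format` are hypothetical names, and the exact extension strings (`pkl`, `json`, `cbor`) and the case-insensitive comparison are assumptions rather than documented fekan behavior.

```rust
use std::path::Path;

/// Hypothetical enum of the model formats named above.
#[derive(Debug, PartialEq)]
pub enum ModelFormat {
    Pickle,
    Json,
    Cbor,
}

/// Hypothetical helper: choose a format from the file extension.
/// Returns `None` for a missing or unrecognized extension.
pub fn infer_model_format(path: &str) -> Option<ModelFormat> {
    // Assumption: extensions are compared case-insensitively.
    let ext = Path::new(path).extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "pkl" => Some(ModelFormat::Pickle),
        "json" => Some(ModelFormat::Json),
        "cbor" => Some(ModelFormat::Cbor),
        _ => None,
    }
}
```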
Code Example
Build, train and save a full KAN regression model with a 2-dimensional input, 1 hidden layer with 3 nodes, and 1 output node, where each layer uses degree-3 B-splines with 10 coefficients (AKA control points):
use ;
use ;
use tempfile;
// initialize the model
let model_options = KanOptions;
let mut untrained_model = new;
// train the model
let training_data: = Vecnew;
/* Load training data */
# let sample_1 = new;
# let sample_2 = new;
# let training_data = vec!;
let trained_model = train_model?;
// save the model
// both Kan and KanLayer implement the serde Serialize trait, so they can be saved to a file using any serde-compatible format
// here we use the ciborium crate to save the model in the CBOR format
let mut file = tempfile::tempfile().unwrap();
ciborium::into_writer(&trained_model, &mut file)?;
# Ok::
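To make "degree-3 B-splines with 10 coefficients" more concrete, here is a self-contained sketch of evaluating such a spline with the standard Cox-de Boor recursion. It does not use fekan: `basis`, `spline`, and `uniform_knots` are hypothetical helpers, and the evenly spaced knot vector is an assumption made for illustration. The relation it relies on is standard: a spline with n coefficients and degree k needs n + k + 1 knots, so 10 coefficients at degree 3 means 14 knots.

```rust
/// Value of the i-th B-spline basis function of degree k at x
/// (Cox-de Boor recursion over the given knot vector).
pub fn basis(i: usize, k: usize, knots: &[f64], x: f64) -> f64 {
    if k == 0 {
        // Degree 0: indicator of the half-open knot interval.
        return if knots[i] <= x && x < knots[i + 1] { 1.0 } else { 0.0 };
    }
    let left_span = knots[i + k] - knots[i];
    let right_span = knots[i + k + 1] - knots[i + 1];
    // Terms with a zero-width span are defined to be zero.
    let left = if left_span > 0.0 {
        (x - knots[i]) / left_span * basis(i, k - 1, knots, x)
    } else {
        0.0
    };
    let right = if right_span > 0.0 {
        (knots[i + k + 1] - x) / right_span * basis(i + 1, k - 1, knots, x)
    } else {
        0.0
    };
    left + right
}

/// Evaluate the spline: each coefficient (control point) weights one basis function.
pub fn spline(coefs: &[f64], degree: usize, knots: &[f64], x: f64) -> f64 {
    assert_eq!(knots.len(), coefs.len() + degree + 1);
    coefs
        .iter()
        .enumerate()
        .map(|(i, c)| c * basis(i, degree, knots, x))
        .sum()
}

/// Evenly spaced knots from lo to hi (an assumption made for this sketch).
pub fn uniform_knots(num_coefs: usize, degree: usize, lo: f64, hi: f64) -> Vec<f64> {
    let n = num_coefs + degree + 1;
    (0..n)
        .map(|j| lo + (hi - lo) * j as f64 / (n - 1) as f64)
        .collect()
}
```

Training a spline of this kind means adjusting the 10 coefficients; the basis functions themselves are fixed by the knots and the degree.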
Load and use a trained classification model
use Kan;
let trained_model = from_reader;
let data: = /* load data */
let predictions: = Vecwith_capacity;
for features in data
To-Do list
fekan is fully functional, but there are a number of improvements to make:
- Parity with Liu et al.
- grid extension
- Adjust coefficients on grid update to match previous function
- pruning unneeded nodes
- symbolification
- visualization
- train via methods other than SGD (Adam, LBFGS)
- Speed
- support multi-threading
- support SIMD/parallel computation
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.