fekan

A library to build and train Kolmogorov-Arnold neural networks.

The fekan crate contains utilities to build and train Kolmogorov-Arnold Networks (KANs) in Rust, including both a struct to represent a full model; and a struct to represant an individual KAN layer, for use on its own or in other models.

Issues and pull requests are welcome!

What is a Kolmogorov-Arnold Network?

Rather than perform a weighted sum of the activations of the previous layer and passing the sum through a fixed non-linear function, each node in a KAN passes each activation from the previous layer through a different, trainable non-linear function, then sums and outputs the result. This allows the network to be more interpretable than, and in some cases be significantly more accurate with a smaller memory footprint than, traditional neural networks.

Because the activation of each KAN layer can not be calculated using matrix multiplication, training a KAN is currently much slower than training a traditional neural network of comparable size. It is the author's hope, however, that the increased accuracy of KANs will allow smaller networks to be used in many cases, offsetting most increased training time; and that the interpretability of KANs will more than justify whatever aditional training time remains.

For more information on the theory behind this library and examples of problem-sets well suited to KANs, see the arXiv paper KAN: Kolmogorov-Arnold Neural Networks

Binary usage

The fekan crate includes a command-line tool to build and train KANs. Install with

cargo install fekan --features serialization

Build a new model and train it on a dataset:

fekan build classifier ...

fekan build regressor ...

Load an existing model for further training:

fekan load [model_file] train ...

Load an existing model to make predictions on a dataset:

fekan load [model_file] infer ...

Example full command to build a classification model to determine whether a set of features maps to a dog or a cat

fekan build classifier --data dog_or_cat_data.json --classes "cat,dog" --degree 3 --coefs 4 --hidden-layer-sizes "5,3" --num-epochs 250 --batch-size 100 --knot-adaptivity 0.1 --learning-rate 0.05 --validation-split 0.2 --validate-each-epoch --model-out my_new_model.cbor

[
  {
    "features": [1.2, 3.14159, -22.0]
    "label": "cat"
  }
  {
    "features": [2.89, -0.002, 16.288844]
    "label": "dog"
  }
]

fekan will assign each distinct string found in the label field to a different output node, and store the mapping with the model for future use

The data for a regression model with a single output would look like:

[
  {
    "features": [1.2, 3.14159, -22.0],
    "labels": [-0.01]
  }
  {
    "features": [2.89, -0.002, 16.288844],
    "labels": [100.5]
  }
]

And for a model with multiple regression outputs, where every sample includes a target for each regression:

[
  {
    "features": [1.2, 3.14159, -22.0],
    "labels": [-0.01, 2.2]
  }
  {
    "features": [2.89, -0.002, 16.288844],
    "labels": [100.5, 22.0]
  }
]

for a multi-regression model where a sample may only contain valid target values for certain outputs:

[
  {
    "features": [1.2, 3.14159, -22.0],
    "labels": [-0.01, 0.0]
    "label_mask": [true, false]
  }
  {
    "features": [2.89, -0.002, 16.288844],
    "labels": [0.0, 22.0]
    "label_mask": [false, true]
  }
]

in the above example, the first sample will only be trained on the first label, and the second sample will only be trained on the second label, because those are the only labels with a corresponding true value in their resepective samples' label_mask field

For complete usage details use the help command like fekan help [COMMAND]

The CLI supports reading data from .pkl, .json, and ~~.avro~~ files (avro currently bugged). Features must be in a single 'list'-type column/field named "features", and labels (for training) must be in a single column named "labels"; Labels should be strings for classification models and floats for regression models. Models can be saved as pickle, json, or cbor files, and the format is inferred from the provided file extension.

Code Example

Build, train and save a full KAN regression model with a 2-dimensional input, 1 hidden layer with 3 nodes, and 1 output node, where each layer uses degree-3 B-splines with 10 coefficients (AKA control points):

use fekan::kan::{Kan, KanOptions, ModelType};
use fekan::{Sample, training_options::{TrainingOptions, EachEpoch}};
use tempfile::tempfile;



// initialize the model
let model_options = KanOptions{
    input_size: 2,
    layer_sizes: vec![3, 1],
    degree: 3,
    coef_size: 7,
    model_type: ModelType::Regression,
    class_map: None};
let mut untrained_model = Kan::new(&model_options);

// train the model
let training_data: Vec<Sample> = Vec::new();
/* Load training data */
# let sample_1 = Sample::new(vec![1.0, 2.0], 3.0);
# let sample_2 = Sample::new(vec![-1.0, 1.0], 0.0);
# let training_data = vec![sample_1, sample_2];

let trained_model = fekan::train_model(untrained_model, &training_data, TrainingOptions::default())?;

// save the model
// both Kan and KanLayer implement the serde Serialize trait, so they can be saved to a file using any serde-compatible format
// here we use the ciborium crate to save the model in the CBOR format
let mut file = tempfile().unwrap();
ciborium::into_writer(&trained_model, &mut file)?;
# Ok::<(), Box<dyn std::error::Error>>(())

Load and use a trained classification model

use fekan::kan::Kan;

let trained_model = serde_json::from_reader(&model_file);
let data: Vec<Vec<f64>> = /* load data */
let predictions: Vec<(Vec<f64>, &str)> = Vec::with_capacity(data.len());
for features in data{
 let logits: Vec<f64> = trained_model.forward(features);
 let (index, probability) = /* interpret your logit data */
 let label: &str = trained_model.node_to_label(index); // get the human-readable label for a given output node
}

To-Do list

fekan is fully functional, but there are a number of improvements to make

Parity with Liu et. al
- grid extension
- Adjust coefficients on grid update to match previous function
- pruning un-needed nodes
- smybolification
- visualization
- train via methods other than SGD (Adam, LBFGS)
Speed
- support multi-threading
- support SIMD/parallel computation

License

Licensed under either of

Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.