Crate fekan

A library to build and train Kolmogorov-Arnold neural networks.

The fekan crate contains utilities to build and train Kolmogorov-Arnold Networks (KANs) in Rust. Of particular note:

  • the Kan struct, which represents a full KAN model
  • the train_model function, which trains a KAN model
  • the KanLayer struct, which represents a single layer of a KAN, and can be used to build full KANs or as a layer in other models

§What is a Kolmogorov-Arnold Network?

Rather than computing a weighted sum of the previous layer's activations and passing that sum through a fixed non-linear function, each node in a KAN passes each activation from the previous layer through its own trainable non-linear function, then sums the results and outputs the total. This makes the network more interpretable than a traditional neural network and, in some cases, significantly more accurate with a smaller memory footprint.
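The difference between the two kinds of node can be sketched in a few lines of plain Rust. This is an illustration only and does not use the fekan API: the names mlp_node, kan_node and EdgeFunction are hypothetical, and a cubic polynomial stands in for the trainable B-spline that a real KAN would place on each edge.

```rust
/// A traditional node: weighted sum of the inputs, then one fixed non-linearity.
fn mlp_node(inputs: &[f64], weights: &[f64], bias: f64) -> f64 {
    let sum: f64 = inputs.iter().zip(weights).map(|(x, w)| x * w).sum();
    (sum + bias).tanh()
}

/// Hypothetical stand-in for a trainable univariate function on one edge.
/// Assumption: a cubic polynomial is used here for brevity; KANs use B-splines.
struct EdgeFunction {
    /// Trainable coefficients c0..c3 of c0 + c1*x + c2*x^2 + c3*x^3.
    coefs: [f64; 4],
}

impl EdgeFunction {
    fn apply(&self, x: f64) -> f64 {
        // Horner's rule, starting from the highest-order coefficient.
        self.coefs.iter().rev().fold(0.0, |acc, c| acc * x + c)
    }
}

/// A KAN-style node: each input goes through its own function, then the results are summed.
fn kan_node(inputs: &[f64], edges: &[EdgeFunction]) -> f64 {
    inputs.iter().zip(edges).map(|(x, f)| f.apply(*x)).sum()
}

fn main() {
    let inputs = [0.5, -1.0];

    let mlp_out = mlp_node(&inputs, &[0.3, 0.7], 0.1);

    let edges = [
        EdgeFunction { coefs: [0.0, 1.0, 0.0, 0.0] }, // f(x) = x
        EdgeFunction { coefs: [0.0, 0.0, 1.0, 0.0] }, // f(x) = x^2
    ];
    let kan_out = kan_node(&inputs, &edges);

    println!("mlp node: {mlp_out}, kan node: {kan_out}");
}
```

In the traditional node the trainable parameters are the scalar weights and the non-linearity is fixed; in the KAN-style node the trainable parameters live inside each per-edge function, which is why each learned function can be inspected on its own.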

Because the activations of a KAN layer cannot be calculated using matrix multiplication, training a KAN is currently much slower than training a traditional neural network of comparable size. It is the author’s hope, however, that the increased accuracy of KANs will allow smaller networks to be used in many cases, offsetting most of the increased training time, and that the interpretability of KANs will more than justify whatever additional training time remains.

For more information on the theory behind this library and examples of problem sets well suited to KANs, see the arXiv paper KAN: Kolmogorov-Arnold Networks.

§Examples

Build, train and save a full KAN regression model with a 2-dimensional input, 1 hidden layer with 3 nodes, and 1 output node, where each layer uses degree-3 B-splines with 7 coefficients (AKA control points):

use fekan::kan::{Kan, KanOptions, ModelType};
use fekan::{Sample, training_options::{TrainingOptions, EachEpoch}};
use tempfile::tempfile;

// initialize the model
let model_options = KanOptions{
    num_features: 2,
    layer_sizes: vec![3, 1],
    degree: 3,
    coef_size: 7,
    model_type: ModelType::Regression,
    class_map: None,
    embedding_options: None,
};
let mut untrained_model = Kan::new(&model_options);

// train the model
// (load your training data here; the empty Vec is only a placeholder)
let training_data: Vec<Sample> = Vec::new();

let trained_model = fekan::train_model(untrained_model, &training_data, TrainingOptions::default())?;

// save the model
// both Kan and KanLayer implement the serde Serialize trait, so they can be saved to a file using any serde-compatible format
// here we use the ciborium crate to save the model in the CBOR format
let mut file = tempfile().unwrap();
ciborium::into_writer(&trained_model, &mut file)?;

Modules§

embedding_layer
Contains an embedding layer for optional inclusion in a KAN model.
kan
Contains the main struct of the library - Kan - which represents a full Kolmogorov-Arnold Network.
kan_layer
Contains the struct KanLayer, which represents a single layer of a Kolmogorov-Arnold Network.
layer_errors
Error types relating to the creation and manipulation of KanLayers.
training_error
Contains the struct TrainingError representing an error encountered during training.
training_options
Options for training a model with crate::train_model.

Structs§

Sample
A sample of data to be used in training a model.

Functions§

preset_knot_ranges
Scan over the training data and adjust the model's knot ranges. This is equivalent to calling Kan::forward on each sample in the training data, then calling Kan::update_knots_from_samples with a knot_adaptivity of 0.0. Presetting the ranges helps prevent a large number of training inputs from falling outside the knot ranges, which can cause the model to fail to converge.
train_model
Train the provided model with the provided data.
validate_model
Calculates the loss of the model on the provided validation data. If the model is a classification model, the cross-entropy loss is calculated. If the model is a regression model, the mean squared error is calculated.
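
As a rough illustration of the two losses named in the validate_model description, the following standalone sketch computes them in plain Rust. It does not call fekan, and the function names are hypothetical. It also assumes that the classification outputs are raw scores that are passed through a softmax before the cross-entropy is taken; how fekan normalizes its outputs and averages over samples is not stated above.

```rust
/// Mean squared error between predictions and targets of equal length.
fn mean_squared_error(predictions: &[f64], targets: &[f64]) -> f64 {
    assert_eq!(predictions.len(), targets.len());
    let sum_sq: f64 = predictions
        .iter()
        .zip(targets)
        .map(|(p, t)| (p - t).powi(2))
        .sum();
    sum_sq / predictions.len() as f64
}

/// Cross-entropy loss for a single sample: -ln(softmax(logits)[label]).
/// Assumption: `logits` are unnormalized class scores.
fn cross_entropy(logits: &[f64], label: usize) -> f64 {
    // Subtract the maximum before exponentiating for numerical stability.
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let log_sum_exp = logits.iter().map(|z| (z - max).exp()).sum::<f64>().ln() + max;
    log_sum_exp - logits[label]
}

fn main() {
    let mse = mean_squared_error(&[1.0, 2.0, 3.0], &[1.0, 2.0, 5.0]);
    let ce = cross_entropy(&[0.0, 0.0], 0);
    println!("mse = {mse}, cross-entropy = {ce}");
}
```

Lower values are better for both losses. With two equal scores the cross-entropy equals ln 2, the loss of an uninformed guess between two classes.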