A library to build and train Kolmogorov-Arnold neural networks.
The fekan crate contains utilities to build and train Kolmogorov-Arnold Networks (KANs) in Rust. Of particular note:
- the Kan struct, which represents a full KAN model
- the train_model function, which trains a KAN model
- the KanLayer struct, which represents a single layer of a KAN, and can be used to build full KANs or as a layer in other models
§What is a Kolmogorov-Arnold Network?
Rather than computing a weighted sum of the activations of the previous layer and passing that sum through a fixed non-linear function, each node in a KAN passes each activation from the previous layer through a different, trainable non-linear function, then sums the results and outputs the sum. This allows the network to be more interpretable than traditional neural networks and, in some cases, significantly more accurate with a smaller memory footprint.
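The difference between the two kinds of node can be sketched in a few lines. This is an illustration only, not fekan's implementation: mlp_node and kan_node are hypothetical names, tanh stands in for "a fixed non-linear function", and plain function pointers stand in for the trainable per-edge functions (which in a real KAN are B-splines).

```rust
/// A traditional node: weighted sum of the inputs, then ONE fixed non-linearity.
/// (`tanh` is an arbitrary choice for this illustration.)
fn mlp_node(inputs: &[f64], weights: &[f64], bias: f64) -> f64 {
    let weighted_sum: f64 = inputs.iter().zip(weights).map(|(x, w)| x * w).sum();
    (weighted_sum + bias).tanh()
}

/// A KAN-style node: every incoming activation goes through its OWN function,
/// and the node simply sums the results. In a real KAN each function is a
/// trainable spline; here fixed function pointers stand in for them.
fn kan_node(inputs: &[f64], edge_fns: &[fn(f64) -> f64]) -> f64 {
    inputs.iter().zip(edge_fns).map(|(x, f)| f(*x)).sum()
}
```

Note that kan_node has no weight vector to multiply by, which is why the layer's activations do not reduce to a single matrix multiplication.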
Because the activations of a KAN layer cannot be calculated using matrix multiplication, training a KAN is currently much slower than training a traditional neural network of comparable size. It is the author’s hope, however, that the increased accuracy of KANs will allow smaller networks to be used in many cases, offsetting most of the increased training time, and that the interpretability of KANs will more than justify whatever additional training time remains.
For more information on the theory behind this library and examples of problem sets well suited to KANs, see the arXiv paper KAN: Kolmogorov-Arnold Networks.
§Examples
Build, train, and save a full KAN regression model with a 2-dimensional input, 1 hidden layer with 3 nodes, and 1 output node, where each layer uses degree-3 B-splines with 7 coefficients (AKA control points):
use fekan::kan::{Kan, KanOptions, ModelType};
use fekan::{Sample, training_options::{TrainingOptions, EachEpoch}};
use tempfile::tempfile;
// initialize the model
let model_options = KanOptions{
    num_features: 2,
    layer_sizes: vec![3, 1],
    degree: 3,
    coef_size: 7,
    model_type: ModelType::Regression,
    class_map: None,
    embedding_options: None,
};
let mut untrained_model = Kan::new(&model_options);
// train the model
let training_data: Vec<Sample> = Vec::new();
/* Load training data */
let trained_model = fekan::train_model(untrained_model, &training_data, TrainingOptions::default())?;
// save the model
// both Kan and KanLayer implement the serde Serialize trait, so they can be saved to a file using any serde-compatible format
// here we use the ciborium crate to save the model in the CBOR format
let mut file = tempfile().unwrap();
ciborium::into_writer(&trained_model, &mut file)?;
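To give a feel for what the degree and coef_size options in the example above control, here is a minimal sketch of evaluating a B-spline with the textbook Cox-de Boor recursion. It is not fekan's code: basis, spline, and uniform_knots are hypothetical helpers, and the uniform knot layout is an assumption made for the illustration. In the standard formulation, a spline with n coefficients of degree k needs n + k + 1 knots, so degree: 3 and coef_size: 7 correspond to 11 knots.

```rust
/// Value at `t` of the i-th B-spline basis function of degree `k`
/// over the knot vector `knots` (Cox-de Boor recursion).
fn basis(i: usize, k: usize, t: f64, knots: &[f64]) -> f64 {
    if k == 0 {
        // Degree 0: indicator of the half-open knot interval [knots[i], knots[i+1]).
        return if knots[i] <= t && t < knots[i + 1] { 1.0 } else { 0.0 };
    }
    let left_den = knots[i + k] - knots[i];
    let right_den = knots[i + k + 1] - knots[i + 1];
    // Terms with a zero-width denominator are defined to be zero.
    let left = if left_den > 0.0 {
        (t - knots[i]) / left_den * basis(i, k - 1, t, knots)
    } else {
        0.0
    };
    let right = if right_den > 0.0 {
        (knots[i + k + 1] - t) / right_den * basis(i + 1, k - 1, t, knots)
    } else {
        0.0
    };
    left + right
}

/// Spline value: the coefficient-weighted sum of the basis functions.
/// Training a spline means adjusting `coefs`.
fn spline(t: f64, coefs: &[f64], degree: usize, knots: &[f64]) -> f64 {
    assert_eq!(knots.len(), coefs.len() + degree + 1);
    coefs.iter().enumerate().map(|(i, c)| c * basis(i, degree, t, knots)).sum()
}

/// `n` evenly spaced knots from `lo` to `hi` (an assumed layout for this sketch).
fn uniform_knots(n: usize, lo: f64, hi: f64) -> Vec<f64> {
    (0..n).map(|j| lo + (hi - lo) * j as f64 / (n - 1) as f64).collect()
}
```

With 7 coefficients and degree 3, uniform_knots(11, 0.0, 10.0) gives a valid knot vector. Setting every coefficient to 1.0 makes the spline evaluate to 1.0 anywhere between the fourth and eighth knots, because the basis functions sum to one there.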
Modules§
- embedding_layer: Contains an embedding layer for optional inclusion in a KAN model.
- kan: Contains the main struct of the library, Kan, which represents a full Kolmogorov-Arnold Network.
- kan_layer: Contains the struct KanLayer, which represents a single layer of a Kolmogorov-Arnold Network.
- layer_errors: Error types relating to the creation and manipulation of KanLayers.
- training_error: Contains the struct TrainingError, representing an error encountered during training.
- training_options: Options for training a model with crate::train_model.
Structs§
- Sample: A sample of data to be used in training a model.
Functions§
- preset_knot_ranges: Scan over the training data and adjust model knot ranges. This is equivalent to calling Kan::forward on each sample in the training data, then calling Kan::update_knots_from_samples with a knot_adaptivity of 0.0. This presetting helps avoid large amounts of training inputs falling outside the knot ranges, which can cause the model to fail to converge.
- train_model: Train the provided model with the provided data.
- validate_model: Calculates the loss of the model on the provided validation data. If the model is a classification model, the cross entropy loss is calculated. If the model is a regression model, the mean squared error is calculated.
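For readers unfamiliar with the two losses named under validate_model, the following sketch shows their textbook definitions. It is not the crate's code: mean_squared_error and cross_entropy are hypothetical helpers, and details such as how fekan averages over samples, or whether it applies a softmax to raw model outputs, are assumptions made here.

```rust
/// Mean squared error between predictions and targets (regression).
fn mean_squared_error(predictions: &[f64], targets: &[f64]) -> f64 {
    assert_eq!(predictions.len(), targets.len());
    let sum_sq: f64 = predictions
        .iter()
        .zip(targets)
        .map(|(p, t)| (p - t).powi(2))
        .sum();
    sum_sq / predictions.len() as f64
}

/// Cross-entropy loss for one sample (classification), assuming the model
/// outputs raw logits: -ln(softmax(logits)[label]).
/// Computed as log-sum-exp minus the true-class logit for numerical stability.
fn cross_entropy(logits: &[f64], label: usize) -> f64 {
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let log_sum_exp = max + logits.iter().map(|z| (z - max).exp()).sum::<f64>().ln();
    log_sum_exp - logits[label]
}
```

A lower value is better in both cases: the mean squared error is zero when every prediction matches its target, and the cross-entropy approaches zero as the logit of the true class dominates the others.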