Module for BUND standard library: Naive Bayes text classifier
A powerful and flexible text classification module built on top of the bund machine learning framework. This module is part of the Bund Standard Library (stdlib) and provides a streamlined workflow for training, evaluating, and deploying text classification models. This module is a library and not designed to be a standalone application, but it can be embedded inside BUND virtual machine.
Installation
This module required make and Rust framework to be installed first. After that:
Quick start
Get started with a simple example to classify text data. First, you have to create and train classifier
:TEST textclassifier.new
:rust "./scripts/rust.txt" textclassifier.train.from_file
"{A} tokens for RUST" format println
:kant "./scripts/kant.txt" textclassifier.train.from_file
"{A} tokens for KANT" format println
:astronomy "./scripts/astronomy.txt" textclassifier.train.from_file
"{A} tokens for ASTRONOMY" format println
:tolstoy "./scripts/tolstoy.txt" textclassifier.train.from_file
"{A} tokens for LEO TOLSTOY" format println
textclassifier.train.finish
Then you can classify any text lines.
:TEST
"At its simplest, a test in Rust is a function that’s annotated with the test attribute. Attributes are metadata about pieces of Rust code"
textclassifier.classify
The following call will return a DICT value:
BUND functions exposed in this module
| Name | Stack IN | Stack OUT | Description |
|---|---|---|---|
| textclassifier.new | Classifier name |
Classifier name |
Create new classifier |
| textclassifier.exists | Classifier name |
Classifier nameTRUE/FALSE |
Check if classifier exists |
| textclassifier.train.from_file | Classifier nameCategoryFilename |
Classifier nameNumber of tokens |
Train classifier from text file |
| textclassifier.train.finish | Classifier name |
Classifier name |
Finalize classifier training |
| textclassifier.classify | Classifier nameText for classification |
Classifier nameDICT with scores |
Classify text string using pre-trained classifier |