Kalosm is a simple interface for pre-trained models in Rust. It makes it easy to interact with pre-trained language, audio, and image models.
There are three different packages in Kalosm:
- kalosm::language - A simple interface for text generation and embedding models and surrounding tools. It includes support for search databases and text collection from websites, RSS feeds, and search engines.
- kalosm::sound - A simple interface for audio transcription and surrounding tools. It includes support for microphone input, transcription with the whisper model, and voice activity detection.
- kalosm::vision - A simple interface for image generation and segmentation models and surrounding tools. It includes support for the wuerstchen and segment-anything models and integration with the image crate.
A complete guide for Kalosm is available on the Kalosm website, and examples are available in the examples folder.
Quickstart!
- Install Rust
- Create a new project:
cargo new next-gen-ai
cd ./next-gen-ai
- Add Kalosm as a dependency
cargo add kalosm --git https://github.com/floneum/floneum --features full
cargo add tokio --features full
- Add this code to your main.rs file
use std::io::Write;
use kalosm::{*, language::*};

#[tokio::main]
async fn main() {
    let mut llm = Llama::new().await.unwrap();
    let prompt = "The following is a 300 word essay about Paris:";
    print!("{}", prompt);

    let mut stream = llm(prompt);
    stream.to_std_out().await.unwrap();
}
- Run your application with:
cargo run --release
What can you do with Kalosm?
You can think of Kalosm as the plumbing between different pre-trained models and each other or the surrounding world. Kalosm makes it easy to build applications that use pre-trained models to generate text, audio, and images. Here are some examples of what you can build with Kalosm:
The simplest way to get started with Kalosm language is to pull in one of the local large language models and use it to generate text. Kalosm supports a streaming API that allows you to generate text in real time without blocking your main thread:
use kalosm::language::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut llm = Llama::phi_3().await.unwrap();
    let prompt = "The following is a 300 word essay about why the capital of France is Paris:";
    print!("{}", prompt);

    let mut stream = llm(prompt);
    stream.to_std_out().await.unwrap();

    Ok(())
}
Natural language generation is interesting, but text is even more useful as a universal data format: you can encode any kind of data as text in a format like JSON. Kalosm lets you use LLMs with structured generation to create arbitrary types from natural language inputs:
use kalosm::language::*;
use std::sync::Arc;

#[derive(Parse, Clone, Debug)]
enum Class {
    Thing,
    Person,
    Animal,
}

#[derive(Parse, Clone, Debug)]
struct Response {
    classification: Class,
}

#[tokio::main]
async fn main() {
    let llm = Llama::new_chat().await.unwrap();
    let task = llm.task("You classify the user's message as about a person, animal or thing in a JSON format")
        .with_constraints(Arc::new(Response::new_parser()));
    let response = task("The Kalosm library lets you create structured data from natural language inputs").await.unwrap();
    println!("{:?}", response);
}
Kalosm also supports cloud models like GPT-4o with the same streaming API:
use kalosm::language::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut llm = OpenAICompatibleChatModel::builder()
        .with_gpt_4o_mini()
        .build();
    let mut chat = llm.chat();
    chat("What is the capital of France?").to_std_out().await?;

    Ok(())
}
Kalosm makes it easy to collect text data from a variety of sources. For example, you can use Kalosm to collect text from a local folder of documents, an RSS stream, a website, or a search engine:
use kalosm::language::*;
use std::convert::TryFrom;
use std::path::PathBuf;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Read an RSS stream
    let nyt = RssFeed::new(Url::parse("https://rss.nytimes.com/services/xml/rss/nyt/US.xml").unwrap());

    // Read a local folder of documents
    let mut documents = DocumentFolder::try_from(PathBuf::from("./documents")).unwrap();

    // Read a single web page and extract its article
    let page = Page::new(
        Url::parse("https://www.nytimes.com/live/2023/09/21/world/zelensky-russia-ukraine-news").unwrap(),
        BrowserMode::Static,
    )
    .unwrap();
    let document = page.article().await.unwrap();
    println!("Title: {}", document.title());
    println!("Body: {}", document.body());

    // Read pages from a search engine (the SERPER_API_KEY environment variable must be set)
    let query = "What is the capital of France?";
    let api_key = std::env::var("SERPER_API_KEY").unwrap();
    let search_query = SearchQuery::new(query, &api_key, 5);
    let documents = search_query.into_documents().await.unwrap();

    // Keep the first 300 words of each search result
    let mut text = String::new();
    for document in documents {
        for word in document.body().split(' ').take(300) {
            text.push_str(word);
            text.push(' ');
        }
        text.push('\n');
    }
    println!("{}", text);

    Ok(())
}
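The inner loop above keeps only the first 300 words of each search result before printing. A minimal standalone sketch of that truncation step, using only the standard library, might look like the following. The name `truncate_words` is hypothetical and not part of Kalosm; unlike the loop above, it splits on any whitespace and leaves no trailing space.

```rust
/// Keep at most `max_words` whitespace-separated words of `body`.
/// Hypothetical helper for illustration; not a Kalosm API.
fn truncate_words(body: &str, max_words: usize) -> String {
    body.split_whitespace() // split on any run of whitespace
        .take(max_words) // stop once the word budget is reached
        .collect::<Vec<_>>()
        .join(" ") // rejoin with single spaces
}
```

Calling `truncate_words(document.body(), 300)` for each document would give roughly the same text as the loop, with runs of whitespace collapsed to single spaces.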
Once you have your data, Kalosm includes tools to create embedding-powered search indexes. Embedding-based search lets you find documents that are semantically similar to a specific word or phrase even if no words are an exact match:
use kalosm::language::*;
use surrealdb::{engine::local::SurrealKv, Surreal};

#[tokio::main]
async fn main() {
    // Open an embedded database in the system temp directory
    let db = Surreal::new::<SurrealKv>(std::env::temp_dir().join("temp.db")).await.unwrap();

    // Select the namespace and database to use
    db.use_ns("search").use_db("documents").await.unwrap();

    // Create a table of documents with an embedding index
    let document_table = db
        .document_table_builder("documents")
        .build::<Document>()
        .await
        .unwrap();

    // Add every document in the ./documents folder to the table
    document_table.add_context(DocumentFolder::new("./documents").unwrap()).await.unwrap();

    loop {
        // Look up the 5 documents nearest to the user's query
        let user_question = prompt_input("Query: ").unwrap();
        let nearest_5 = document_table
            .search(&user_question)
            .with_results(5)
            .await
            .unwrap();

        println!("{:?}", nearest_5);
    }
}
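To give a feel for how embedding-based search can find related documents without exact word matches, here is a minimal sketch that ranks documents by cosine similarity between embedding vectors. This is one common ranking technique, shown under the assumption that embeddings are plain `f32` vectors; it is not necessarily how Kalosm's document table works internally, and `cosine_similarity` and `nearest` are hypothetical names.

```rust
/// Cosine similarity between two embeddings of the same dimension.
/// Returns 0.0 when either vector has zero length.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Indices of the `k` documents most similar to `query`, best match first.
fn nearest(query: &[f32], docs: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = docs
        .iter()
        .enumerate()
        .map(|(i, doc)| (i, cosine_similarity(query, doc)))
        .collect();
    // Sort by descending similarity; `total_cmp` gives a total order on f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}
```

This brute-force scan compares the query against every document, which is fine for small collections; larger collections typically use an approximate nearest-neighbor index instead.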
A large part of making modern LLMs performant is curating the context the models have access to. Retrieval Augmented Generation (or RAG) helps you do this by inserting context into the prompt based on a search query. For example, you can use Kalosm to create a chatbot that uses context from local documents to answer questions:
use kalosm::language::*;
use surrealdb::{engine::local::SurrealKv, Surreal};

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Check whether the database already exists before opening it
    let exists = std::path::Path::new("./db").exists();

    // Open (or create) the embedded database
    let db = Surreal::new::<SurrealKv>("./db/temp.db").await?;

    // Select the namespace and database to use
    db.use_ns("test").use_db("test").await?;

    // Create a table of documents with an embedding index
    let document_table = db
        .document_table_builder("documents")
        .at("./db/embeddings.db")
        .build::<Document>()
        .await?;

    // Only add the context documents the first time the database is created
    if !exists {
        std::fs::create_dir_all("documents")?;
        let context = [
            "https://floneum.com/kalosm/docs",
            "https://floneum.com/kalosm/docs/guides/retrieval_augmented_generation",
        ]
        .iter()
        .map(|url| Url::parse(url).unwrap());

        document_table.add_context(context).await?;
    }

    // Create a chat session with a local model
    let model = Llama::new_chat().await?;
    let mut chat = model.chat().with_system_prompt("The assistant helps answer questions based on the context given by the user. The model knows that the information the user gives it is always true.");

    loop {
        // Ask the user for a question
        let user_question = prompt_input("\n> ")?;

        // Retrieve the most relevant document and format it as context
        let context = document_table
            .search(&user_question)
            .with_results(1)
            .await?
            .into_iter()
            .map(|document| {
                format!(
                    "Title: {}\nBody: {}\n",
                    document.record.title(),
                    document.record.body()
                )
            })
            .collect::<Vec<_>>()
            .join("\n");

        // Prepend the retrieved context to the user's question
        let prompt = format!(
            "{context}\n{user_question}"
        );

        println!("{}", prompt);

        // Stream the model's answer to standard output
        let mut output_stream = chat(&prompt);
        print!("Bot: ");
        output_stream.to_std_out().await?;
    }
}
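The prompt-assembly step in the loop above can be pulled out into a small pure function, which makes it easy to see exactly what text the model receives. The sketch below uses only the standard library and assumes each retrieved document is a `(title, body)` pair of string slices; `build_prompt` is a hypothetical name, not a Kalosm API.

```rust
/// Format retrieved documents as context and append the user's question.
/// Mirrors the "Title: ...\nBody: ...\n" layout used in the example above.
fn build_prompt(documents: &[(&str, &str)], question: &str) -> String {
    let context = documents
        .iter()
        .map(|(title, body)| format!("Title: {title}\nBody: {body}\n"))
        .collect::<Vec<_>>()
        .join("\n"); // blank line between documents
    // The context comes first so the question is the last thing the model reads.
    format!("{context}\n{question}")
}
```

Keeping this step separate from retrieval and generation lets you unit-test the prompt layout without loading a model or a database.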
Kalosm makes it easy to build up context about the world around your application. For example, you can stream a transcription of microphone input using the Whisper model:
use kalosm::sound::*;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let model = Whisper::new().await?;
    let mic = MicInput::default();
    let stream = mic.stream();
    let mut text_stream = stream.transcribe(model);
    text_stream.to_std_out().await.unwrap();

    Ok(())
}
In addition to language, audio, and embedding models, Kalosm also supports image generation. For example, you can use Kalosm to generate images from text:
use kalosm::vision::*;

#[tokio::main]
async fn main() {
    let model = Wuerstchen::new().await.unwrap();
    let settings = WuerstchenInferenceSettings::new(
        "a cute cat with a hat in a room covered with fur with incredible detail",
    );
    let mut images = model.run(settings);
    while let Some(image) = images.next().await {
        if let Some(buf) = image.generated_image() {
            buf.save(&format!("{}.png", image.sample_num())).unwrap();
        }
    }
}
Kalosm also supports image segmentation with the segment-anything model:
use kalosm::vision::*;

#[tokio::main]
async fn main() {
    let model = SegmentAnything::builder().build().unwrap();
    let image = image::open("examples/landscape.jpg").unwrap();
    let images = model.segment_everything(image).unwrap();
    for (i, img) in images.iter().enumerate() {
        img.save(&format!("{}.png", i)).unwrap();
    }
}