Crate kalosm_vision

§Kalosm Vision

Kalosm Vision is a collection of image models and utilities for the Kalosm framework. It includes models for generating images from text, segmenting images into objects, and recognizing text in images (OCR).

§Image Generation

You can use the Wuerstchen model to generate images from text:

use futures_util::StreamExt;
use kalosm_vision::{Wuerstchen, WuerstchenInferenceSettings};

#[tokio::main]
async fn main() {
    let model = Wuerstchen::builder().build().await.unwrap();
    let settings = WuerstchenInferenceSettings::new(
        "a cute cat with a hat in a room covered with fur with incredible detail",
    );

    let mut images = model.run(settings);
    while let Some(image) = images.next().await {
        if let Some(buf) = image.generated_image() {
            buf.save(&format!("{}.png", image.sample_num())).unwrap();
        }
    }
}

§Image Segmentation

Kalosm supports image segmentation with the SegmentAnything model. Use the SegmentAnything::segment_everything method to segment a whole image into objects, or the SegmentAnything::segment_from_points method to segment the objects found at specific points:

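use kalosm::vision::*;

// Build the model and open the image to segment.
let model = SegmentAnything::builder().build().unwrap();
let image = image::open("examples/landscape.jpg").unwrap();
// Pick a goal point: horizontally centered, a quarter of the way down from the top.
let x = image.width() / 2;
let y = image.height() / 4;
// Segment the image around the goal point.
let images = model
    .segment_from_points(
        SegmentAnythingInferenceSettings::new(image)
            .add_goal_point(x, y),
    )
    .unwrap();

// Save the segmentation result.
images.save("out.png").unwrap();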
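
The goal point in the example above is derived from the image dimensions. The following is a minimal sketch of that arithmetic using only the standard library. Note that `point_at_fraction` is a hypothetical helper introduced here for illustration and is not part of kalosm_vision; the image size in `main` is likewise an assumed value.

```rust
/// Hypothetical helper (not part of kalosm_vision): returns the pixel
/// coordinates at the given fractions of the image width and height.
/// Fractions are clamped to [0, 1] and the result is kept inside the image.
fn point_at_fraction(width: u32, height: u32, fx: f64, fy: f64) -> (u32, u32) {
    let scale = |size: u32, fraction: f64| -> u32 {
        let pos = (size as f64 * fraction.clamp(0.0, 1.0)).floor() as u32;
        // Keep the coordinate inside the image (the last valid index is size - 1).
        pos.min(size.saturating_sub(1))
    };
    (scale(width, fx), scale(height, fy))
}

fn main() {
    // Same point as in the example above: horizontally centered,
    // one quarter of the way down from the top edge.
    // The 1920x1080 size is an assumed value for illustration.
    let (x, y) = point_at_fraction(1920, 1080, 0.5, 0.25);
    println!("goal point: ({x}, {y})");
}
```

The resulting `x` and `y` could then be passed to add_goal_point in the same way as in the example above.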

Structs§

ChannelImageStream
A stream of images from a tokio channel.
Image
An image generated by the model.
Ocr
The TrOCR optical character recognition model.
OcrBuilder
A builder for Ocr.
OcrInferenceSettings
Settings for running inference on Ocr.
OcrSource
The source of the model.
SegmentAnything
The segment anything model.
SegmentAnythingBuilder
A builder for SegmentAnything.
SegmentAnythingInferenceSettings
Settings for running inference on SegmentAnything.
SegmentAnythingSource
The source of the model.
Wuerstchen
A quantized Wuerstchen image diffusion model.
WuerstchenBuilder
A builder for the Wuerstchen model.
WuerstchenInferenceSettings
Settings for running inference with the Wuerstchen model.

Enums§

LoadOcrError
An error that can occur when loading an Ocr model.
LoadSegmentAnythingError
An error that can occur when loading a SegmentAnything model.
ModelLoadingProgress
The progress of starting a model.
OcrInferenceError
An error that can occur when running an Ocr model.
SegmentAnythingInferenceError
An error that can occur when running a SegmentAnything model.