Crate orkhon[][src]

Orkhon

Orkhon: ML Inference Framework and Server Runtime

What is it?

Orkhon is Rust framework for Machine Learning to run/use inference/prediction code written in Python, frozen models and process unseen data. It is mainly focused on serving models and processing unseen data in a performant manner. Instead of using Python directly and having scalability problems for servers this framework tries to solve them with built-in async API.

Main features

  • Sync & Async API for models.
  • Easily embeddable engine for well-known Rust web frameworks.
  • API contract for interacting with Python code.
  • High processing throughput
    • ~4.8361 GiB/s prediction throughput
    • 3_000 concurrent requests takes ~4ms on average

Installation

You can include Orkhon into your project with;

[dependencies]
orkhon = "0.2"

Dependencies

You will need:

  • If you use pymodel feature, Python dev dependencies should be installed and have proper python runtime to use Orkhon with your project.
  • If you want to have tensorflow inference. Installing tensorflow as library for linking is required.
  • ONNX interface doesn't need extra dependencies from the system side.
  • Point out your PYTHONHOME environment variable to your Python installation.

Python API contract

Python API contract is hook based. If you want to call a method for prediction you should write Python code with args and **kwargs.

def model_hook(args, **kwargs):
    print("Doing prediction...")
    return args

Python Hook Input

Both args and kwargs are HashSets. args can take any acceptable hashset key and passes down to python level. But kwargs keys are restricted to &str for keeping it only for option passing. args can contain your data for making prediction. Input contract is opinionated for making interpreter work without unknown type conversions.

Python Hook Output

Python hook output is passed up without downcasting or casting. Python bindings are still exposed to make sure you get the type you wanted. By default; python passes PyObject to Rust interface. You can extract the type from the object that Python passed with

pyobj.extract()?

This api uses PyO3 bindings for Python <-> Rust. You can look for PyO3's documentation to make conversions. Auto conversion methods soon will be added.

Examples

Request a Tensorflow prediction asynchronously

use orkhon::prelude::*;
use orkhon::tcore::prelude::*;
use orkhon::ttensor::prelude::*;
use rand::*;
use std::path::PathBuf;

let o = Orkhon::new()
   .config(
       OrkhonConfig::new()
           .with_default_input_fact_shape(InferenceFact::dt_shape(f32::datum_type(), tvec![10, 100])),
   )
   .tensorflow(
       "model_which_will_be_tested",
       PathBuf::from("tests/protobuf/manual_input_infer/my_model.pb"),
   )
   .shareable();

let mut rng = thread_rng();
let vals: Vec<_> = (0..1000).map(|_| rng.gen::<f32>()).collect();
let input = tract_ndarray::arr1(&vals).into_shape((10, 100)).unwrap();

let o = o.get();
let handle = async move {
   let processor = o.tensorflow_request_async(
      "model_which_will_be_tested",
      ORequest::with_body(TFRequest::new().body(input.into())),
   );
   processor.await
};
let resp = block_on(handle).unwrap();

Request an ONNX prediction synchronously

This example needs onnxmodel feature enabled.

use orkhon::prelude::*;
use orkhon::tcore::prelude::*;
use orkhon::ttensor::prelude::*;
use rand::*;
use std::path::PathBuf;

let o = Orkhon::new()
    .config(
        OrkhonConfig::new()
            .with_input_fact_shape(
                InferenceFact::dt_shape(f32::datum_type(), tvec![10, 100])),
    )
    .onnx(
        "model_which_will_be_tested",
        PathBuf::from("tests/protobuf/onnx_model/example.onnx"),
    )
    .build();

let mut rng = thread_rng();
let vals: Vec<_> = (0..1000).map(|_| rng.gen::<f32>()).collect();
let input = tract_ndarray::arr1(&vals).into_shape((10, 100)).unwrap();

let resp = o
    .onnx_request(
        "model_which_will_be_tested",
        ORequest::with_body(ONNXRequest::new().body(input.into())),
    )
    .unwrap();
assert_eq!(resp.body.output.len(), 1);

License

License is MIT

Discussion and Development

We use Gitter for development discussions. Also please don't hesitate to open issues on GitHub ask for features, report bugs, comment on design and more! More interaction and more ideas are better!

Contributing to Orkhon Open Source Helpers

All contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome.

A detailed overview on how to contribute can be found in the CONTRIBUTING guide on GitHub.

Re-exports

pub use tract_core as tcore;

Modules

config

Orkhon configuration structure

errors

Error types for Orkhon

orkhon

Base model server structure

prelude

Prelude for Orkhon

preprocessing
reqrep

Request response structures for Orkhon

service

Asynchronous service traits for serving ML models