Crate inference

Library wrapping the gRPC interface to NVIDIA’s Triton Inference Server

Exposes a Rust API for creating Triton clients and running ML inference
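
A minimal, hypothetical usage sketch follows; the import path, `connect` constructor, and `server_live` method are assumptions for illustration and may not match this crate's actual API.

```rust
// Hypothetical end-to-end sketch; `TritonClient::connect`, `server_live`,
// and the import path are assumptions, not this crate's confirmed API.
use inference::TritonClient;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Triton's gRPC endpoint listens on port 8001 by default.
    let mut client = TritonClient::connect("http://localhost:8001").await?;

    // Confirm the server is alive before submitting inference requests.
    assert!(client.server_live().await?);
    Ok(())
}
```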

Modules

Custom error implementations for the inference crate
Client for interfacing with the Triton Inference Server over gRPC
Implements model registry operations for a TritonClient
Triton gRPC requests related to the server’s health and readiness to serve a model (see the readiness sketch after this list)
Triton server gRPC requests related to the state of the hardware the server is running on
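
As referenced above, here is a hedged sketch of the health and readiness checks. The method names mirror Triton's `ServerLive`, `ServerReady`, and `ModelReady` gRPC calls, which are part of the real Triton protocol; the Rust wrapper names shown are assumptions about this crate.

```rust
use std::time::Duration;

use inference::TritonClient; // assumed import path

// Hypothetical helper that gates work on server and model readiness.
// Method names mirror Triton's ServerLive / ServerReady / ModelReady
// gRPC calls, but the Rust wrappers shown here are assumptions.
async fn wait_until_ready(
    client: &mut TritonClient,
    model: &str,
) -> Result<(), Box<dyn std::error::Error>> {
    // Liveness: the server process is up and answering gRPC.
    while !client.server_live().await? {
        tokio::time::sleep(Duration::from_millis(500)).await;
    }
    // Readiness: the server can accept requests and the model is loaded.
    while !client.server_ready().await? || !client.model_ready(model).await? {
        tokio::time::sleep(Duration::from_millis(500)).await;
    }
    Ok(())
}
```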

Structs

A gRPC client for the Triton Inference Server
Base configuration for a model in a Triton Inference Server. Used to store the model’s parameters and to generate inference requests that can be submitted through a TritonClient
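
A sketch of how these two structs could fit together, assuming an `inference_request` builder on the config and `infer`/`output` methods on the client and response; those names are illustrative, not confirmed API.

```rust
use inference::{ModelConfig, TritonClient}; // assumed import paths

// Hypothetical request flow: the ModelConfig stores the model's
// parameters and stamps out requests; `inference_request`, `infer`,
// and `output` are illustrative names, not confirmed API.
async fn run_inference(
    client: &mut TritonClient,
    config: &ModelConfig,
    input: Vec<f32>,
) -> Result<Vec<f32>, Box<dyn std::error::Error>> {
    // Build a request from the stored model parameters.
    let request = config.inference_request(input)?;
    // Submit it through the client and decode the output tensor.
    let response = client.infer(request).await?;
    Ok(response.output()?)
}
```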

Traits

Common methods for a derived-data-producing model to implement. TODO: Add more useful methods and factor more of the SimpleModel implementation into generics
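
The trait’s exact surface is not shown here; a plausible minimal shape, with the trait and method names as pure assumptions, might look like:

```rust
use inference::ModelConfig; // assumed import path

// Purely illustrative trait shape; the trait's real name and methods
// are not shown in these docs, so everything below is an assumption.
trait DerivedDataModel {
    /// Configuration used to generate this model's inference requests.
    fn config(&self) -> &ModelConfig;

    /// Convert raw Triton output tensors into the model's derived data.
    fn postprocess(&self, raw: Vec<f32>) -> Vec<f32>;
}
```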