pub struct GGMLRuntime { /* private fields */ }
GGML runtime. Reuses the GGUF runtime implementation, since llama-server supports both formats.
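One way such reuse could be structured is a thin wrapper that delegates to the GGUF runtime and only changes the reported format. The sketch below is an assumption about the shape, not the crate's code: `Runtime`, `GgufRuntime`, `GgmlRuntime`, and `Format` are hypothetical stand-ins for `ModelRuntime`, the GGUF runtime, `GGMLRuntime`, and `ModelFormat`, and the trait is synchronous to stay dependency-free.

```rust
// Hypothetical stand-in for the crate's `ModelFormat`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Format {
    Gguf,
    Ggml,
}

// Hypothetical, synchronous stand-in for `ModelRuntime`.
trait Runtime {
    fn supported_format(&self) -> Format;
    fn base_url(&self) -> String;
}

// Stand-in for the GGUF runtime that owns the server details.
struct GgufRuntime {
    port: u16,
}

impl Runtime for GgufRuntime {
    fn supported_format(&self) -> Format {
        Format::Gguf
    }

    fn base_url(&self) -> String {
        format!("http://127.0.0.1:{}", self.port)
    }
}

// Wrapper that reuses the GGUF runtime and only changes the reported format.
struct GgmlRuntime {
    inner: GgufRuntime,
}

impl Runtime for GgmlRuntime {
    fn supported_format(&self) -> Format {
        Format::Ggml
    }

    fn base_url(&self) -> String {
        // Everything server-related is delegated to the wrapped runtime.
        self.inner.base_url()
    }
}
```

With this shape, only `supported_format` differs between the two runtimes; every other method forwards to the inner value.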
Implementations
impl GGMLRuntime
Trait Implementations
impl Default for GGMLRuntime
impl ModelRuntime for GGMLRuntime
fn supported_format(&self) -> ModelFormat
Get the format this runtime supports
fn initialize<'life0, 'async_trait>(
    &'life0 mut self,
    config: RuntimeConfig,
) -> Pin<Box<dyn Future<Output = Result<()>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Initialize the runtime (start server process, load model, etc.)
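The `Pin<Box<dyn Future<...> + Send + 'async_trait>>` return types on this page are the form that `async fn` trait methods take after expansion by the `async_trait` macro. The following is a minimal hand-written sketch of that form using only the standard library. `Lifecycle`, `Dummy`, `NoopWake`, and `block_on` are hypothetical names, the `String` argument stands in for `RuntimeConfig`, and the busy-polling executor is for demonstration only.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical trait with one method written in the expanded form.
trait Lifecycle {
    fn initialize<'a>(
        &'a mut self,
        model: String,
    ) -> Pin<Box<dyn Future<Output = Result<(), String>> + Send + 'a>>;
}

struct Dummy {
    loaded: Option<String>,
}

impl Lifecycle for Dummy {
    fn initialize<'a>(
        &'a mut self,
        model: String,
    ) -> Pin<Box<dyn Future<Output = Result<(), String>> + Send + 'a>> {
        // The async block borrows `self` for 'a and is boxed and pinned,
        // which is what the macro does for an `async fn` body.
        Box::pin(async move {
            if model.is_empty() {
                return Err("empty model path".to_string());
            }
            self.loaded = Some(model);
            Ok(())
        })
    }
}

// Waker that does nothing; enough for futures that never wait on I/O.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Demonstration-only executor: polls in a loop until the future completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

In real code the returned future would be awaited inside an async runtime rather than driven by a polling loop like `block_on`.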
fn is_ready<'life0, 'async_trait>(
    &'life0 self,
) -> Pin<Box<dyn Future<Output = bool> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Check if runtime is ready for inference
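A caller will often poll a readiness check after starting a server process. The sketch below shows one bounded polling loop under stated assumptions: `wait_until_ready` is a hypothetical helper, the closure stands in for a call to `is_ready`, and the loop is synchronous for simplicity where real code would await instead of sleeping the thread.

```rust
use std::thread::sleep;
use std::time::Duration;

// Polls `is_ready` up to `max_attempts` times, sleeping `delay` between tries.
// Returns the 1-based attempt on which the check passed, or None on timeout.
fn wait_until_ready<F>(mut is_ready: F, max_attempts: u32, delay: Duration) -> Option<u32>
where
    F: FnMut() -> bool,
{
    for attempt in 1..=max_attempts {
        if is_ready() {
            return Some(attempt);
        }
        // No need to sleep after the final failed attempt.
        if attempt < max_attempts {
            sleep(delay);
        }
    }
    None
}
```

Bounding the number of attempts keeps a server that never comes up from blocking the caller indefinitely.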
fn health_check<'life0, 'async_trait>(
    &'life0 self,
) -> Pin<Box<dyn Future<Output = Result<String>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Get health status
fn base_url(&self) -> String
Get the base URL for inference API (e.g., “http://127.0.0.1:8001”)
fn generate<'life0, 'async_trait>(
    &'life0 self,
    request: InferenceRequest,
) -> Pin<Box<dyn Future<Output = Result<InferenceResponse>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Perform inference (non-streaming)
fn generate_stream<'life0, 'async_trait>(
    &'life0 self,
    request: InferenceRequest,
) -> Pin<Box<dyn Future<Output = Result<Box<dyn Stream<Item = Result<String, Error>> + Send + Unpin>>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Perform streaming inference
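The stream yields `Result<String, Error>` items, so a consumer typically appends text chunks until the stream ends or an item is an error. The sketch below illustrates that pattern with a plain `Iterator` standing in for the async `Stream` and `String` standing in for the error type. `collect_chunks` is a hypothetical helper, not part of this crate.

```rust
// Concatenates text chunks until the source ends or yields an error.
// `Iterator` stands in for the async `Stream` here.
fn collect_chunks<I>(chunks: I) -> Result<String, String>
where
    I: IntoIterator<Item = Result<String, String>>,
{
    let mut out = String::new();
    for chunk in chunks {
        // `?` stops at the first error and returns it to the caller.
        out.push_str(&chunk?);
    }
    Ok(out)
}
```

Stopping at the first error discards the text gathered so far; a caller that wants partial output would return both the accumulated text and the error.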
fn shutdown<'life0, 'async_trait>(
    &'life0 mut self,
) -> Pin<Box<dyn Future<Output = Result<()>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Shutdown the runtime (stop server, cleanup resources)
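Taken together, the lifecycle methods suggest a call order of initialize, generate, then shutdown. The sketch below walks through that order with a hypothetical, synchronous `MockRuntime`. Its method names mirror the trait, but the signatures, the `&str` arguments, and the echo behavior are assumptions made for illustration only.

```rust
// Hypothetical, synchronous stand-in for the runtime's lifecycle methods.
#[derive(Default)]
struct MockRuntime {
    ready: bool,
}

impl MockRuntime {
    fn initialize(&mut self, model_path: &str) -> Result<(), String> {
        if model_path.is_empty() {
            return Err("model path is empty".to_string());
        }
        self.ready = true;
        Ok(())
    }

    fn is_ready(&self) -> bool {
        self.ready
    }

    fn generate(&self, prompt: &str) -> Result<String, String> {
        if !self.ready {
            return Err("runtime not initialized".to_string());
        }
        Ok(format!("echo: {}", prompt))
    }

    fn shutdown(&mut self) -> Result<(), String> {
        self.ready = false;
        Ok(())
    }
}

// Runs one prompt through the lifecycle. Shutdown is attempted even when
// generation fails, so the (mock) server is not left running.
fn run_once(rt: &mut MockRuntime, model_path: &str, prompt: &str) -> Result<String, String> {
    rt.initialize(model_path)?;
    let result = rt.generate(prompt);
    rt.shutdown()?;
    result
}
```

Holding the generation result until after shutdown keeps cleanup on both the success and failure paths.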
fn metadata(&self) -> RuntimeMetadata
Get runtime metadata
fn completions_url(&self) -> String
Get the OpenAI-compatible chat completions endpoint
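OpenAI-compatible servers conventionally expose chat completions at the path `/v1/chat/completions`. The sketch below assumes the endpoint is the base URL joined with that path. `chat_completions_url` is a hypothetical free function, and the crate may build the URL differently.

```rust
// Joins a base URL with the conventional OpenAI-compatible chat path.
// Trailing slashes on the base are trimmed to avoid a double slash.
fn chat_completions_url(base_url: &str) -> String {
    format!("{}/v1/chat/completions", base_url.trim_end_matches('/'))
}
```

For a base URL of `http://127.0.0.1:8001`, this yields `http://127.0.0.1:8001/v1/chat/completions`.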
Auto Trait Implementations
impl Freeze for GGMLRuntime
impl !RefUnwindSafe for GGMLRuntime
impl Send for GGMLRuntime
impl Sync for GGMLRuntime
impl Unpin for GGMLRuntime
impl UnsafeUnpin for GGMLRuntime
impl !UnwindSafe for GGMLRuntime
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.