pub async fn tensor_response_stream(
state: Arc<State>,
request: NvCreateTensorRequest,
streaming: bool,
) -> Result<impl Stream<Item = Annotated<NvCreateTensorResponse>>, Status>
Tensor Request Handler
This method handles incoming requests for the tensor model type. The endpoint is a "source"
for an [super::OpenAICompletionsStreamingEngine] and returns a stream of
responses which will be forwarded to the client.
Note: For all requests, streaming or non-streaming, the engine is always invoked with streaming enabled. For non-streaming requests, this handler folds the resulting stream into a single response before replying.
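The fold-for-non-streaming pattern described in the note can be sketched as below. All types and helpers here (`TensorChunk`, `engine_stream`, `fold_into_single`) are simplified stand-ins, not the real `NvCreateTensorResponse` or engine API, and a synchronous iterator stands in for the async stream:

```rust
// Hypothetical stand-in for one partial response from the engine;
// the real handler works with Annotated<NvCreateTensorResponse>.
#[derive(Debug, Default, Clone, PartialEq)]
struct TensorChunk {
    values: Vec<f32>,
}

// The engine is always driven in streaming mode: it yields chunks.
// (Assumed shape for illustration only.)
fn engine_stream() -> impl Iterator<Item = TensorChunk> {
    vec![
        TensorChunk { values: vec![1.0, 2.0] },
        TensorChunk { values: vec![3.0] },
    ]
    .into_iter()
}

// For a non-streaming request, the handler folds the chunk stream
// into one aggregate response before replying to the client.
fn fold_into_single(stream: impl Iterator<Item = TensorChunk>) -> TensorChunk {
    stream.fold(TensorChunk::default(), |mut acc, chunk| {
        acc.values.extend(chunk.values);
        acc
    })
}

fn main() {
    let folded = fold_into_single(engine_stream());
    // The folded response carries all values from every chunk.
    assert_eq!(folded.values, vec![1.0, 2.0, 3.0]);
}
```

In the real handler the same idea applies to an async `Stream` (e.g. via `futures::StreamExt::fold`), so a single code path drives the engine regardless of the `streaming` flag.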