# Inference Gateway Rust SDK
An SDK written in Rust for the [Inference Gateway](https://github.com/inference-gateway/inference-gateway).
- [Inference Gateway Rust SDK](#inference-gateway-rust-sdk)
  - [Installation](#installation)
  - [Usage](#usage)
    - [Creating a Client](#creating-a-client)
    - [Listing Models](#listing-models)
    - [Listing Models from a specific provider](#listing-models-from-a-specific-provider)
    - [Generating Content](#generating-content)
    - [Streaming Content](#streaming-content)
    - [Tool-Use](#tool-use)
    - [Health Check](#health-check)
  - [Contributing](#contributing)
  - [License](#license)
## Installation
Run `cargo add inference-gateway-sdk`.
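The examples below also rely on a few companion crates (`tokio` for the async runtime, `log` and `env_logger` for logging). If your project does not already include them, something like the following should work:
```sh
cargo add tokio --features full
cargo add log env_logger
```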
## Usage
### Creating a Client
Here is a full example of how to create a client and interact with the Inference Gateway API:
```rust
use inference_gateway_sdk::{
    CreateChatCompletionResponse,
    GatewayError,
    InferenceGatewayAPI,
    InferenceGatewayClient,
    ListModelsResponse,
    Message,
    MessageRole,
    Provider,
};
use log::info;
use std::env;

#[tokio::main]
async fn main() -> Result<(), GatewayError> {
    if env::var("RUST_LOG").is_err() {
        env::set_var("RUST_LOG", "info");
    }
    env_logger::init();

    // Create a client
    let client = InferenceGatewayClient::new("http://localhost:8080");

    // List all models from all configured providers
    let response: ListModelsResponse = client.list_models().await?;
    for model in response.data {
        info!("Model: {:?}", model.id);
    }

    // List models for a specific provider
    let response: ListModelsResponse = client.list_models_by_provider(Provider::Groq).await?;
    info!("Models for provider: {:?}", response.provider);
    for model in response.data {
        info!("Model: {:?}", model.id);
    }

    // Generate content - choose from the available providers and models
    let response: CreateChatCompletionResponse = client.generate_content(Provider::Groq, "deepseek-r1-distill-llama-70b", vec![
        Message {
            role: MessageRole::System,
            content: "You are a helpful assistant.".to_string(),
            ..Default::default()
        },
        Message {
            role: MessageRole::User,
            content: "Tell me a funny joke".to_string(),
            ..Default::default()
        },
    ]).await?;
    info!(
        "Generated content: {:?}",
        response.choices[0].message.content
    );

    Ok(())
}
```
### Listing Models
To list all available models from all configured providers, use the `list_models` method:
```rust
use inference_gateway_sdk::{
    GatewayError,
    InferenceGatewayAPI,
    InferenceGatewayClient,
    ListModelsResponse,
};
use log::info;

#[tokio::main]
async fn main() -> Result<(), GatewayError> {
    // ...Create a client

    // List models from all providers
    let response: ListModelsResponse = client.list_models().await?;
    for model in response.data {
        info!("Model: {:?}", model.id);
    }

    // ...
}
```
### Listing Models from a specific provider
To list all available models from a specific provider, use the `list_models_by_provider` method:
```rust
use inference_gateway_sdk::{
    GatewayError,
    InferenceGatewayAPI,
    InferenceGatewayClient,
    ListModelsResponse,
    Provider,
};
use log::info;

// ...Open main function

// List models for a specific provider
let response: ListModelsResponse = client.list_models_by_provider(Provider::Groq).await?;
info!("Models for provider: {:?}", response.provider);
for model in response.data {
    info!("Model: {:?}", model.id);
}

// ...Rest of the main function
```
### Generating Content
To generate content using a model, use the `generate_content` method:
```rust
use inference_gateway_sdk::{
    CreateChatCompletionResponse,
    GatewayError,
    InferenceGatewayAPI,
    InferenceGatewayClient,
    Message,
    MessageRole,
    Provider,
};

// ...Open main function

// Generate content - choose from the available providers and models
let response: CreateChatCompletionResponse = client.generate_content(Provider::Groq, "deepseek-r1-distill-llama-70b", vec![
    Message {
        role: MessageRole::System,
        content: "You are a helpful assistant.".to_string(),
        ..Default::default()
    },
    Message {
        role: MessageRole::User,
        content: "Tell me a funny joke".to_string(),
        ..Default::default()
    },
]).await?;
log::info!(
    "Generated content: {:?}",
    response.choices[0].message.content
);
```
### Streaming Content
You need to add the following additional dependencies (example `cargo add` commands follow the list):
- `futures-util` for the `StreamExt` trait
- `serde` with the `derive` feature and `serde_json` for serializing and deserializing the response content
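For example, assuming default crate versions:
```sh
cargo add futures-util
cargo add serde --features derive
cargo add serde_json
```
With those in place, streaming looks like this: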
```rust
use futures_util::{pin_mut, StreamExt};
use inference_gateway_sdk::{
    CreateChatCompletionStreamResponse, GatewayError, InferenceGatewayAPI, InferenceGatewayClient,
    Message, MessageRole, Provider,
};
use log::info;
use std::env;

#[tokio::main]
async fn main() -> Result<(), GatewayError> {
    if env::var("RUST_LOG").is_err() {
        env::set_var("RUST_LOG", "info");
    }
    env_logger::init();

    let system_message = "You are a helpful assistant.".to_string();
    let model = "deepseek-r1-distill-llama-70b";

    let client = InferenceGatewayClient::new("http://localhost:8080/v1");

    let stream = client.generate_content_stream(
        Provider::Groq,
        model,
        vec![
            Message {
                role: MessageRole::System,
                content: system_message,
                ..Default::default()
            },
            Message {
                role: MessageRole::User,
                content: "Write a poem".to_string(),
                ..Default::default()
            },
        ],
    );
    pin_mut!(stream);

    // Iterate over the stream of Server-Sent Events
    while let Some(ssevent) = stream.next().await {
        let ssevent = ssevent?;

        // Deserialize the event payload
        let generate_response_stream: CreateChatCompletionStreamResponse =
            serde_json::from_str(&ssevent.data)?;

        let Some(choice) = generate_response_stream.choices.first() else {
            continue;
        };

        if let Some(usage) = generate_response_stream.usage.as_ref() {
            // Get the usage metrics from the response
            info!("Usage Metrics: {:?}", usage);
            // Probably send them over to a metrics service
            break;
        }

        // Print each token as it arrives from the server
        if let Some(content) = choice.delta.content.as_ref() {
            print!("{}", content);
        }

        if let Some(finish_reason) = choice.finish_reason.as_ref() {
            if finish_reason == "stop" {
                info!("Finished generating content");
                break;
            }
        }
    }

    Ok(())
}
```
### Tool-Use
You can also pass tools to the `generate_content` function; the LLM can then request calls to them:
```rust
use inference_gateway_sdk::{
    FunctionObject, GatewayError, InferenceGatewayAPI, InferenceGatewayClient, Message,
    MessageRole, Provider, Tool, ToolType,
};
use log::{info, warn};
use serde::{Deserialize, Serialize};
use serde_json::{json, Value};
use std::env;

#[tokio::main]
async fn main() -> Result<(), GatewayError> {
    // Configure logging
    if env::var("RUST_LOG").is_err() {
        env::set_var("RUST_LOG", "info");
    }
    env_logger::init();

    // API endpoint - store it as a variable so we can reuse it
    let api_endpoint = "http://localhost:8080/v1";

    // Initialize the API client
    let client = InferenceGatewayClient::new(api_endpoint);

    // Define the model and provider
    let provider = Provider::Groq;
    let model = "deepseek-r1-distill-llama-70b";

    // Define the weather tool
    let tools = vec![Tool {
        r#type: ToolType::Function,
        function: FunctionObject {
            name: "get_current_weather".to_string(),
            description: "Get the weather for a location".to_string(),
            parameters: json!({
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name"
                    }
                },
                "required": ["location"]
            }),
        },
    }];

    // Create the initial conversation
    let initial_messages = vec![
        Message {
            role: MessageRole::System,
            content: "You are a helpful assistant that can check the weather.".to_string(),
            ..Default::default()
        },
        Message {
            role: MessageRole::User,
            content: "What is the current weather in Berlin?".to_string(),
            ..Default::default()
        },
    ];

    // Make the initial API request
    info!("Sending initial request to model");
    let response = client
        .with_tools(Some(tools.clone()))
        .generate_content(provider, model, initial_messages)
        .await?;
    info!("Received response from model");

    // Check if we have a response
    let choice = match response.choices.first() {
        Some(choice) => choice,
        None => {
            warn!("No choice returned");
            return Ok(());
        }
    };

    // Check for tool calls in the response
    if let Some(tool_calls) = &choice.message.tool_calls {
        // Create a new conversation starting with the initial messages
        let mut follow_up_convo = vec![
            Message {
                role: MessageRole::System,
                content: "You are a helpful assistant that can check the weather.".to_string(),
                ..Default::default()
            },
            Message {
                role: MessageRole::User,
                content: "What is the current weather in Berlin?".to_string(),
                ..Default::default()
            },
            Message {
                role: MessageRole::Assistant,
                content: choice.message.content.clone(),
                tool_calls: choice.message.tool_calls.clone(),
                ..Default::default()
            },
        ];

        // Process each tool call
        for tool_call in tool_calls {
            info!("Tool Call Requested: {}", tool_call.function.name);

            if tool_call.function.name == "get_current_weather" {
                // Parse the arguments
                let args = tool_call.function.parse_arguments()?;

                // Call our function
                let weather_result = get_current_weather(args)?;

                // Add the tool response to the conversation
                follow_up_convo.push(Message {
                    role: MessageRole::Tool,
                    content: weather_result,
                    tool_call_id: Some(tool_call.id.clone()),
                    ..Default::default()
                });
            }
        }

        // Send the follow-up request with the tool results
        info!("Sending follow-up request with tool results");

        // Create a new client for the follow-up request
        let follow_up_client = InferenceGatewayClient::new(api_endpoint);
        let follow_up_response = follow_up_client
            .with_tools(Some(tools))
            .generate_content(provider, model, follow_up_convo)
            .await?;

        if let Some(choice) = follow_up_response.choices.first() {
            info!("Final response: {}", choice.message.content);
        } else {
            warn!("No response in follow-up");
        }
    } else {
        info!("No tool calls in the response");
        info!("Model response: {}", choice.message.content);
    }

    Ok(())
}

#[derive(Debug, Deserialize, Serialize)]
struct Weather {
    location: String,
}

fn get_current_weather(args: Value) -> Result<String, GatewayError> {
    // Parse the location from the arguments
    let weather: Weather = serde_json::from_value(args)?;
    info!(
        "Getting weather function was called for {}",
        weather.location
    );

    // In a real application, we would call an actual weather API here
    // For this example, we'll just return a mock response
    let location = weather.location;
    Ok(format!(
        "The weather in {} is currently sunny with a temperature of 22°C",
        location
    ))
}
```
### Health Check
To check if the Inference Gateway is running, use the `health_check` method:
```rust
// ...rest of the imports
use log::info;
// ...main function
let is_healthy = client.health_check().await?;
info!("API is healthy: {}", is_healthy);
```
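If you want a self-contained liveness probe, a minimal sketch (assuming the same local gateway endpoint as the streaming and tool-use examples above) could look like this:
```rust
use inference_gateway_sdk::{GatewayError, InferenceGatewayAPI, InferenceGatewayClient};

#[tokio::main]
async fn main() -> Result<(), GatewayError> {
    // Point the client at your gateway instance
    let client = InferenceGatewayClient::new("http://localhost:8080/v1");

    // health_check reports whether the gateway is reachable and healthy
    let is_healthy = client.health_check().await?;
    println!("API is healthy: {}", is_healthy);

    Ok(())
}
```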
## Contributing
Please refer to the [CONTRIBUTING.md](CONTRIBUTING.md) file for information about how to get involved. We welcome issues, questions, and pull requests.
## License
This SDK is distributed under the MIT License; see [LICENSE](LICENSE) for more information.