# LLM Kit Hugging Face

Hugging Face provider for LLM Kit - complete integration with the Hugging Face Responses API for chat models, with advanced features.

> **Note:** This provider uses the standardized builder pattern. See the Quick Start section for the recommended usage.
## Features
- Text Generation: Generate text using Hugging Face models via the Responses API
- Streaming: Stream responses in real-time with support for tool calls
- Tool Calling: Support for function calling with automatic execution
- Multi-modal: Support for text and image inputs
- Provider-Executed Tools: MCP (Model Context Protocol) integration for server-side tool execution
- Source Annotations: Automatic source citations in responses
- Structured Output: JSON schema support for constrained generation
- Reasoning Content: Support for models with reasoning capabilities
## Installation

Add this to your `Cargo.toml`:

```toml
[dependencies]
# Crate names below are representative; use the exact names published for your LLM Kit workspace.
llm-kit = "0.1"
llm-kit-provider = "0.1"
llm-kit-hugging-face = "0.1"
tokio = { version = "1", features = ["full"] }
```
## Quick Start

### Using the Client Builder (Recommended)

```rust
// Illustrative sketch; exact module paths and method names may differ.
use llm_kit::LanguageModel;
use llm_kit_hugging_face::HuggingFaceClient;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Reads HUGGINGFACE_API_KEY from the environment by default
    let provider = HuggingFaceClient::new().build();
    let model = provider.responses("meta-llama/Llama-3.3-70B-Instruct");
    // ... generate, stream, or call tools with `model`
    Ok(())
}
```

### Using Settings Directly (Alternative)

```rust
// Illustrative sketch; settings type names may differ.
use llm_kit::LanguageModel;
use llm_kit_hugging_face::{HuggingFaceProvider, HuggingFaceSettings};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let settings = HuggingFaceSettings::default();
    let provider = HuggingFaceProvider::new(settings);
    // ... use `provider` exactly as in the builder example
    Ok(())
}
```
## Configuration

### Environment Variables

Set your Hugging Face API key as an environment variable:
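For example, in a POSIX shell (the key value shown is a placeholder):

```shell
export HUGGINGFACE_API_KEY="hf_xxxxxxxx"
```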
### Using the Client Builder

```rust
use llm_kit_hugging_face::HuggingFaceClient;

// All builder calls are optional; the values shown are illustrative.
let provider = HuggingFaceClient::new()
    .api_key("hf_...")
    .base_url("https://router.huggingface.co/v1")
    .header("X-Custom-Header", "value")
    .name("huggingface")
    .build();
```
### Builder Methods

The `HuggingFaceClient` builder supports:

- `.api_key(key)` - Set the API key (overrides the `HUGGINGFACE_API_KEY` environment variable)
- `.base_url(url)` - Set a custom base URL (default: `https://router.huggingface.co/v1`)
- `.name(name)` - Set the provider name (optional)
- `.header(key, value)` - Add a single custom header
- `.headers(map)` - Add multiple custom headers
- `.build()` - Build the provider
## Provider-Specific Options

### Provider-Executed Tools (MCP)

The Hugging Face Responses API supports the Model Context Protocol (MCP), allowing tools to be executed on the provider side:
```rust
// Sketch only: request-builder and content-variant names are illustrative.
use llm_kit::Content;

// When tools are called, check if they were executed by the provider
let result = model.request()
    .prompt("...")
    .tools(tools)
    .execute()
    .await?;

for content in result.content {
    // Provider-executed tools arrive as finished tool results,
    // so no local execution is needed for them.
    if let Content::ToolResult(tool_result) = content {
        println!("{tool_result:?}");
    }
}
```
### Source Annotations

The API automatically returns source citations as separate content items:

```rust
// Illustrative: the exact variant shape depends on the llm-kit content model.
for content in result.content {
    if let Content::Source { title, url } = content {
        println!("Source: {title} ({url})");
    }
}
```
### Structured Output

Use a JSON schema to constrain model output:

```rust
// Illustrative sketch; builder and constructor names may differ.
use llm_kit::ResponseFormat;
use serde_json::json;

let schema = json!({
    "type": "object",
    "properties": { "answer": { "type": "string" } },
    "required": ["answer"]
});

let result = model.request()
    .prompt("...")
    .response_format(ResponseFormat::json_schema(schema))
    .execute()
    .await?;
```
### Multi-modal Inputs

Include images in your prompts:

```rust
// Illustrative sketch; `image_bytes` is raw image data loaded elsewhere.
use llm_kit::Prompt;

let prompt = Prompt::new()
    .add_text("Describe this image.")
    .add_image(image_bytes);

let result = model.request()
    .prompt(prompt)
    .execute()
    .await?;
```
## Supported Models

The provider includes constants for popular models:

```rust
// Illustrative constant name; see the crate's `models` module for the real list.
use llm_kit_hugging_face::models;

let model = provider.responses(models::LLAMA_3_3_70B_INSTRUCT);
```

All models available via the Hugging Face Responses API are supported. You can also use any model ID as a string:

```rust
let model = provider.responses("meta-llama/Llama-3.3-70B-Instruct");
```

For a complete list of available models, see the Hugging Face Responses API documentation.
## Supported Settings

| Setting | Supported | Notes |
|---|---|---|
| `temperature` | ✅ | Temperature for sampling |
| `top_p` | ✅ | Nucleus sampling |
| `max_output_tokens` | ✅ | Maximum tokens to generate |
| `tools` | ✅ | Function calling with MCP support |
| `tool_choice` | ✅ | `auto`, `required`, or a specific tool |
| `response_format` | ✅ | JSON schema support |
| `top_k` | ❌ | Not supported by the API |
| `seed` | ❌ | Not supported by the API |
| `presence_penalty` | ❌ | Not supported by the API |
| `frequency_penalty` | ❌ | Not supported by the API |
| `stop_sequences` | ❌ | Not supported by the API |
## Examples

See the `examples/` directory for complete examples:

- `chat.rs` - Basic chat completion with usage statistics
- `stream.rs` - Streaming responses with real-time output
- `chat_tool_calling.rs` - Tool calling with function definitions
- `stream_tool_calling.rs` - Streaming with tool calls

Run examples with:
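For instance, assuming the standard Cargo example layout:

```shell
cargo run --example chat
```

Substitute any of the example names above for `chat`.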
## Documentation
- API Documentation
- LLM Kit Documentation
- Hugging Face Responses API Reference
- Model Context Protocol (MCP)
## License
MIT
## Contributing
Contributions are welcome! Please see the Contributing Guide for more details.