Hibachi
Efficient batched inference for tensor models

Hibachi is a Rust library for efficient batched inference with autoregressive (and soon feedforward) models. It dynamically groups multiple generation requests into batches, manages tensor operations, and streams results back to clients as they become available.
Key Features
- Dynamic Batching - Optimizes resource utilization by batching requests
- Asynchronous Processing - Non-blocking architecture built on Tokio
- Stream-Based API - Tokens are streamed back to clients as they're generated
- Backend Agnostic - Works with any tensor library that implements the Backend trait; includes implementations for the Candle and Burn backends (max Burn tensor rank of 9)
- Memory Efficient - Manages tensor padding, concatenation, and cleanup
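The padding, concatenation, and cleanup steps mentioned above can be illustrated with plain vectors. This is a minimal sketch and not Hibachi's implementation: the `Batch` type, the `PAD` token id, and the choice of left-padding are all assumptions made for the example, and it assumes `PAD` never appears as a real token.

```rust
const PAD: u32 = 0; // assumed padding token id, never used as a real token

/// A rectangular batch of token ids: every row has the same length.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct Batch {
    rows: Vec<Vec<u32>>,
}

impl Batch {
    fn seq_len(&self) -> usize {
        self.rows.first().map_or(0, Vec::len)
    }

    /// Left-pad `row` with PAD until it holds `len` tokens.
    fn pad_left(row: &[u32], len: usize) -> Vec<u32> {
        let mut out = vec![PAD; len.saturating_sub(row.len())];
        out.extend_from_slice(row);
        out
    }

    /// Concatenate a new sequence along the batch dimension,
    /// padding whichever side is shorter.
    fn push(&mut self, seq: &[u32]) {
        let len = self.seq_len().max(seq.len());
        for row in &mut self.rows {
            *row = Self::pad_left(row, len);
        }
        self.rows.push(Self::pad_left(seq, len));
    }

    /// Cleanup: drop finished rows, then trim the leading padding
    /// columns that no remaining row needs.
    fn remove(&mut self, finished: &[usize]) {
        let mut idx = 0;
        self.rows.retain(|_| {
            let keep = !finished.contains(&idx);
            idx += 1;
            keep
        });
        let trim = self
            .rows
            .iter()
            .map(|r| r.iter().take_while(|&&t| t == PAD).count())
            .min()
            .unwrap_or(0);
        for row in &mut self.rows {
            row.drain(..trim);
        }
    }
}

fn main() {
    let mut batch = Batch { rows: vec![] };
    batch.push(&[1, 2, 3]);
    batch.push(&[4]); // shorter request joins: it is padded
    batch.push(&[5, 6, 7, 8]); // longer request joins: existing rows are padded
    println!("{:?}", batch.rows);
    batch.remove(&[2]); // the longest request finishes: padding shrinks again
    println!("{:?}", batch.rows);
}
```

Trimming after removal keeps the batch no wider than the longest sequence still being generated, which is the kind of saving the "Memory Efficient" item refers to.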
Installation
Add this to your Cargo.toml:
[dependencies]
hibachi = { version = "0.1.0", features = ["candle", "autoregressive"] } # burn flag available as well
tokio = { version = "1", features = ["full"] }
Quick Start
use ;
use ;
use std::sync::Arc;
use ;
// 1. Implement the Autoregressive trait for your model
// 3. Create the batched inference engine
async
Architecture
Hibachi consists of several core components:
- Backend Abstraction
  - Traits that define required tensor operations
  - Enables support for different tensor libraries
- Autoregressive Models
  - Interface for models that predict the next token based on previous tokens
  - Supports variable batch and sequence dimensions
- Batching Engine
  - Dynamically manages multiple generation requests
  - Handles tensor padding, concatenation, and state management
  - Streams generated tokens back to clients
- Communication Layer
  - Asynchronous channels for efficient token streaming
  - Proper error handling and resource cleanup
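How the batching engine and the communication layer fit together can be sketched with the standard library alone. The following is an illustration under stated assumptions, not Hibachi's code: it uses threads and `std::sync::mpsc` in place of Tokio, a toy "model" that returns the last token plus one, and hypothetical names (`Request`, `Active`, `run_engine`, `submit`, `STOP`, `MAX_LEN`). Padding is left out to keep the control flow visible.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

const STOP: u32 = 5; // assumed stop token id
const MAX_LEN: usize = 16; // assumed cap so generation always terminates

/// One generation request: a prompt plus a channel for streaming tokens back.
struct Request {
    prompt: Vec<u32>,
    reply: Sender<u32>,
}

/// A sequence that is currently part of the running batch.
struct Active {
    seq: Vec<u32>,
    reply: Sender<u32>,
}

/// Toy stand-in for a batched model call: next token = last token + 1.
fn toy_forward(batch: &[&[u32]]) -> Vec<u32> {
    batch.iter().map(|seq| seq.last().copied().unwrap_or(0) + 1).collect()
}

/// Engine loop: admit waiting requests, run one batched step,
/// stream each new token, and drop sequences that have finished.
fn run_engine(incoming: Receiver<Request>) {
    let mut active: Vec<Active> = Vec::new();
    loop {
        if active.is_empty() {
            // Idle: block until a request arrives or every client handle is gone.
            match incoming.recv() {
                Ok(req) => active.push(Active { seq: req.prompt, reply: req.reply }),
                Err(_) => return,
            }
        }
        // Requests that arrived during the previous step join the batch now.
        while let Ok(req) = incoming.try_recv() {
            active.push(Active { seq: req.prompt, reply: req.reply });
        }
        let batch: Vec<&[u32]> = active.iter().map(|a| a.seq.as_slice()).collect();
        let next = toy_forward(&batch);
        // Walk backwards so removing index i leaves lower indices untouched.
        for i in (0..active.len()).rev() {
            let tok = next[i];
            active[i].seq.push(tok);
            let delivered = active[i].reply.send(tok).is_ok();
            if !delivered || tok == STOP || active[i].seq.len() >= MAX_LEN {
                // Dropping the sender closes the client's stream.
                active.remove(i);
            }
        }
    }
}

/// Submit a prompt and get back a receiver that yields tokens as they are produced.
fn submit(engine: &Sender<Request>, prompt: Vec<u32>) -> Receiver<u32> {
    let (reply, stream) = mpsc::channel();
    engine.send(Request { prompt, reply }).expect("engine thread has stopped");
    stream
}

fn main() {
    let (engine_tx, engine_rx) = mpsc::channel();
    let engine = thread::spawn(move || run_engine(engine_rx));
    let a = submit(&engine_tx, vec![1, 2]);
    let b = submit(&engine_tx, vec![3]);
    let a: Vec<u32> = a.into_iter().collect();
    let b: Vec<u32> = b.into_iter().collect();
    println!("a = {a:?}, b = {b:?}"); // a = [3, 4, 5], b = [4, 5]
    drop(engine_tx);
    engine.join().unwrap();
}
```

Each client sees only its own stream, and a client that drops its receiver is removed from the batch on the next step because the send fails. An async version would follow the same shape with Tokio channels and tasks.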
Advanced Usage
Custom Tensor Backends
To use Hibachi with a custom tensor library, implement the Backend and Unsqueezable traits:
use ;
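The exact signatures of these traits are defined in the crate documentation and are not reproduced here. As a rough illustration of why an unsqueeze-style operation might live in its own trait when the tensor rank is part of the type, here is a sketch with entirely hypothetical names (`AddBatchAxis`, `Rank1`, `Rank2`); none of them are Hibachi's API.

```rust
/// Hypothetical trait (not Hibachi's definition): a tensor that can gain
/// a leading batch axis. The result is a different type because the rank changes.
trait AddBatchAxis {
    type Out;
    fn add_batch_axis(self) -> Self::Out;
}

/// Toy tensors whose rank is fixed by the type.
#[derive(Debug, PartialEq)]
struct Rank1(Vec<u32>);

#[derive(Debug, PartialEq)]
struct Rank2(Vec<Vec<u32>>);

impl AddBatchAxis for Rank1 {
    type Out = Rank2;

    /// Shape [n] becomes shape [1, n].
    fn add_batch_axis(self) -> Rank2 {
        Rank2(vec![self.0])
    }
}

impl Rank2 {
    /// Concatenate along the batch axis; rows are assumed to share a length.
    fn cat(mut self, other: Rank2) -> Rank2 {
        self.0.extend(other.0);
        self
    }
}

fn main() {
    // Two single sequences are lifted to rank 2, then joined into one batch.
    let a = Rank1(vec![1, 2, 3]).add_batch_axis();
    let b = Rank1(vec![4, 5, 6]).add_batch_axis();
    let batch = a.cat(b);
    println!("{batch:?}");
}
```

With statically ranked tensors, each rank needs its own implementation of such a trait, which is one plausible reason for a fixed upper bound on supported rank.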
Custom Autoregressive Models
Implement the Autoregressive trait for your model:
use Autoregressive;
use async_trait::async_trait;
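To make the idea of an autoregressive model interface concrete, here is a small synchronous sketch. The `NextToken` trait, the `Bigram` model, and the `greedy` helper are hypothetical stand-ins chosen for the example; Hibachi's own `Autoregressive` trait is async and its exact associated types and method signature should be taken from the crate documentation.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an autoregressive model interface
/// (not Hibachi's trait): one next token per row of a (batch, seq) input.
trait NextToken {
    fn forward(&self, batch: &[Vec<u32>]) -> Vec<u32>;
}

/// Toy model: the next token depends only on the last token of each row.
struct Bigram {
    table: HashMap<u32, u32>,
    fallback: u32,
}

impl NextToken for Bigram {
    fn forward(&self, batch: &[Vec<u32>]) -> Vec<u32> {
        batch
            .iter()
            .map(|row| {
                row.last()
                    .and_then(|t| self.table.get(t))
                    .copied()
                    .unwrap_or(self.fallback)
            })
            .collect()
    }
}

/// Greedy decoding for a single sequence: feed the growing sequence back in
/// until the stop token appears or the budget runs out.
fn greedy<M: NextToken>(model: &M, prompt: &[u32], stop: u32, max_new: usize) -> Vec<u32> {
    let mut seq = prompt.to_vec();
    for _ in 0..max_new {
        let tok = model.forward(&[seq.clone()])[0];
        seq.push(tok);
        if tok == stop {
            break;
        }
    }
    seq
}

fn main() {
    let model = Bigram {
        table: HashMap::from([(1, 2), (2, 3), (3, 9)]),
        fallback: 9,
    };
    // The same model accepts any batch size and any sequence length.
    println!("{:?}", model.forward(&[vec![1], vec![5, 2]])); // [2, 3]
    println!("{:?}", greedy(&model, &[1], 9, 10)); // [1, 2, 3, 9]
}
```

Because the model only ever sees a batch and returns one token per row, a batching engine can call it without knowing anything about individual requests.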
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Building Docs
RUSTDOCFLAGS="--cfg docsrs" cargo +nightly doc --all-features