# rs-llama-cpp
> Automated Rust bindings generation for LLaMA.cpp
## Description
[LLaMA.cpp][1] is under heavy development, with contributions pouring in from numerous individuals every day. Its C API is currently very low-level, and given how fast the project is evolving, keeping up with the changes and porting the examples to a higher-level API proves difficult. As a trade-off, this project prioritizes automation over flexibility by automatically generating Rust bindings for the [main example][2] of LLaMA.cpp.
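For a sense of how this kind of automation typically works, here is a minimal `build.rs` sketch using [bindgen](https://github.com/rust-lang/rust-bindgen). It is illustrative only, not this crate's actual build script; `wrapper.h` and the allowlisted names are hypothetical:
```rust
// build.rs — illustrative sketch, not this crate's actual build script.
fn main() {
    // Re-run generation whenever the wrapped header changes.
    println!("cargo:rerun-if-changed=wrapper.h");

    let bindings = bindgen::Builder::default()
        // Hypothetical C header wrapping LLaMA.cpp's main example.
        .header("wrapper.h")
        // Only pull in the symbols the high-level API needs.
        .allowlist_function("run_inference")
        .allowlist_type("gpt_params_c")
        .generate()
        .expect("failed to generate bindings");

    let out_dir = std::path::PathBuf::from(std::env::var("OUT_DIR").unwrap());
    bindings
        .write_to_file(out_dir.join("bindings.rs"))
        .expect("failed to write bindings.rs");
}
```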
### Limitations
The main design goal of this project is to minimize the effort of tracking LLaMA.cpp updates by automating as many steps as possible. However, this approach has some limitations:
1. The API is very high-level: it resembles a single call to LLaMA.cpp's `main` function, with generated tokens delivered through a callback.
2. Currently, the project does not expose parameters with types that are more challenging to convert to Rust, such as `std::unordered_map` and `std::vector`.
3. Some of the parameters exposed via the Rust API are only relevant for a CLI.
4. The generated C++ library writes a significant amount of debug information to `stderr` and `stdout`. This is not configurable at the moment (see the workaround sketch after this list).
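Until that is configurable, one consumer-side workaround is to mute the streams at the file-descriptor level with a third-party crate such as [gag](https://crates.io/crates/gag). A minimal sketch, assuming the diagnostic output can be safely discarded (note that gagging `stdout` as well would also mute your own prints):
```rust
use gag::Gag;
use rs_llama_cpp::{gpt_params_c, run_inference, str_to_mut_i8};

fn main() {
    let params = gpt_params_c {
        model: str_to_mut_i8("/path/to/model.bin"),
        prompt: str_to_mut_i8("Hello "),
        ..Default::default()
    };

    // Redirecting stderr at the fd level also silences writes made by the
    // underlying C++ code; dropping the guard restores the stream.
    let _mute_stderr = Gag::stderr().expect("failed to redirect stderr");

    run_inference(params, |token| {
        // Tokens still arrive through the callback; only stderr is muted.
        print!("{}", token);
        true
    });
}
```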
## Usage
```rust
use rs_llama_cpp::{gpt_params_c, run_inference, str_to_mut_i8};

fn main() {
    let params = gpt_params_c {
        model: str_to_mut_i8("/path/to/model.bin"),
        prompt: str_to_mut_i8("Hello "),
        ..Default::default()
    };

    run_inference(params, |token| {
        println!("Token: {}", token);
        // Return false to stop inference, true to continue.
        !token.ends_with('\n')
    });
}
```
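Since tokens arrive one at a time, a common variation is to collect them instead of printing. A sketch, assuming the callback behaves as in the example above (the `RefCell` sidesteps any `Fn`-vs-`FnMut` question about the closure's signature):
```rust
use std::cell::RefCell;

use rs_llama_cpp::{gpt_params_c, run_inference, str_to_mut_i8};

fn main() {
    let params = gpt_params_c {
        model: str_to_mut_i8("/path/to/model.bin"),
        prompt: str_to_mut_i8("Hello "),
        ..Default::default()
    };

    // Accumulate the streamed tokens into a single String.
    let output = RefCell::new(String::new());
    run_inference(params, |token| {
        output.borrow_mut().push_str(&token);
        true // let the model run until it stops on its own
    });

    println!("full output: {}", output.into_inner());
}
```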
[1]: https://github.com/ggerganov/llama.cpp
[2]: https://github.com/ggerganov/llama.cpp/blob/master/examples/main/main.cpp