pub struct Chat<M: CreateChatSession> { /* private fields */ }
Chat is a chat interface that builds on top of crate::ChatModel and crate::StructuredChatModel. It makes it easy to create a chat session with streaming responses and constrained generation.
Let’s start with a simple chat application:
// Before you create a chat session, you need a model. Llama::new_chat will create a good default chat model.
let model = Llama::new_chat().await.unwrap();
// Then you can build a chat session that uses that model
let mut chat = model.chat()
// The builder exposes methods for settings like the system prompt and constraints the bot response must follow
.with_system_prompt("The assistant will act like a pirate");
loop {
// To use the chat session, you need to add messages to it
let mut response_stream = chat(&prompt_input("\n> ").unwrap());
// And then display the response stream to the user
response_stream.to_std_out().await.unwrap();
}

If you run the application, you may notice that it takes more time for the assistant to start responding to long prompts. The LLM needs to read and transform the prompt into a format it understands before it can start generating a response. Kalosm stores that state in the chat session, which can be saved to and loaded from the filesystem so that existing conversations resume faster.
You can save and load chat sessions from the filesystem using the ChatSession::to_bytes and ChatSession::from_bytes methods:
// First, create a model to chat with
let model = Llama::new_chat().await.unwrap();
// Then try to load the chat session from the filesystem
let save_path = std::path::PathBuf::from("./chat.llama");
let mut chat = model.chat();
if let Some(old_session) = std::fs::read(&save_path)
.ok()
.and_then(|bytes| LlamaChatSession::from_bytes(&bytes).ok())
{
chat = chat.with_session(old_session);
}
// Then you can add messages to the chat session as usual
let mut response_stream = chat(&prompt_input("\n> ").unwrap());
// And then display the response stream to the user
response_stream.to_std_out().await.unwrap();
// After you are done, you can save the chat session to the filesystem
let session = chat.session().unwrap();
let bytes = session.to_bytes().unwrap();
std::fs::write(&save_path, bytes).unwrap();

LLMs are powerful because of their generality, but sometimes you need more control over the output. For example, you might want the assistant to start with a certain phrase, or to follow a certain format.
In kalosm, you can use constraints to guide the model's response. Constraints are a way to specify the format of the output. When generating with constraints, the model will always respond in the specified format.
Let’s create a chat application that uses constraints to guide the assistant’s response to always start with “Yes!”:
let model = Llama::new_chat().await.unwrap();
// Create constraints that parse "Yes!" and then stop at the end of the assistant's response
let constraints = LiteralParser::new("Yes!")
.then(model.default_assistant_constraints());
// Create a chat session with the model and the constraints
let mut chat = model.chat();
// Chat with the user
loop {
let mut output_stream = chat(&prompt_input("\n> ").unwrap()).with_constraints(constraints.clone());
output_stream.to_std_out().await.unwrap();
}
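Beyond a hand-written parser, the same with_constraints hook can be driven by a parser derived from a Rust type, which is the structured side of this API that crate::StructuredChatModel hints at. The sketch below is hedged: it assumes the Parse derive macro from kalosm's prelude and the new_parser() constructor it generates, and the Character type with its fields is only a placeholder:

// A response format described with the `Parse` derive (assumed to come from kalosm's prelude)
#[derive(Parse, Clone, Debug)]
struct Character {
    name: String,
    description: String,
}

let model = Llama::new_chat().await.unwrap();
let mut chat = model.chat();
// Constrain the assistant's reply with the parser generated by `#[derive(Parse)]`
let mut output_stream = chat(&prompt_input("\n> ").unwrap())
    .with_constraints(Character::new_parser());
// Stream the constrained text to stdout, just like the literal-parser example above
output_stream.to_std_out().await.unwrap();

Only the raw constrained text is shown here; how the parsed Character value itself is retrieved depends on the response builder API in your kalosm version.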
Implementations§

impl<M: CreateChatSession> Chat<M>

pub fn new(model: M) -> Chat<M>
Create a new chat session with the default settings.
§Example
// Before you create a chat session, you need to create a model. Llama::new_chat will create a good default chat model.
let model = Llama::new_chat().await.unwrap();
// If you don't need to customize the chat session, you can use the `new` method to create a chat session with the default settings
let mut chat = Chat::new(model);

pub fn with_system_prompt(self, system_prompt: impl ToString) -> Self
Adds a system prompt to the chat. The system prompt guides the model to respond in a certain way. If no system prompt is added, a default system prompt is used that instructs the model to respond in a way that is safe and respectful.
§Example
let model = Llama::new_chat().await.unwrap();
let mut chat = model
.chat()
.with_system_prompt("The assistant will act like a pirate.");

pub fn with_session(self, session: M::ChatSession) -> Self
Starts the chat instance with the given model session. This can be useful for resuming a chat session with a long context that has already been processed.
§Example
let model = Llama::new_chat().await.unwrap();
// Load the model session from the filesystem
let session =
LlamaChatSession::from_bytes(std::fs::read("chat.llama").unwrap().as_slice()).unwrap();
// Start the chat session with the cached session
let mut chat = model.chat().with_session(session);

pub fn add_message(&mut self, message: impl IntoChatMessage) -> ChatResponseBuilder<'_, M>
Adds a user message to the chat session and streams the bot response.
§Example
let model = Llama::new_chat().await.unwrap();
let mut chat = model.chat();
let prompt = prompt_input("\n> ").unwrap();
// You can add the user message to the chat session with the `add_message` method
let mut response_stream = chat.add_message(prompt);
// And then stream the result to std out
response_stream.to_std_out().await.unwrap();
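Successive calls to add_message continue the same conversation, because the chat session keeps the accumulated history. A minimal sketch of a two-turn exchange built only from the calls shown above (the prompts are placeholder text):

let model = Llama::new_chat().await.unwrap();
let mut chat = model.chat();
// First turn: the session records the user message and the assistant's reply
chat.add_message("My name is Alice.").to_std_out().await.unwrap();
// Second turn: the model can refer back to the earlier turn stored in the session
chat.add_message("What is my name?").to_std_out().await.unwrap();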
pub fn into_add_message(self, message: impl IntoChatMessage) -> ChatResponseBuilder<'static, M>
Adds a user message to the chat session and streams the bot response while consuming the chat session.
§Example
let model = Llama::new_chat().await.unwrap();
let mut chat = model.chat();
let prompt = prompt_input("\n> ").unwrap();
// You can add the user message and consume the chat session with the `into_add_message` method
let mut response_stream = chat.into_add_message(prompt);
// And then stream the result to std out
response_stream.to_std_out().await.unwrap();
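Because into_add_message consumes the chat and returns a ChatResponseBuilder<'static, M>, the response can be driven from a background task. A minimal sketch, assuming a tokio runtime and that the response builder is Send for your model:

let model = Llama::new_chat().await.unwrap();
let chat = model.chat();
let prompt = prompt_input("\n> ").unwrap();
// The 'static response builder owns the chat state it needs, so it can move into the spawned task
let response_stream = chat.into_add_message(prompt);
let handle = tokio::spawn(async move {
    // Stream the response to stdout from the background task
    response_stream.to_std_out().await.unwrap();
});
handle.await.unwrap();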
pub fn session(&self) -> Result<impl Deref<Target = M::ChatSession> + use<'_, M>, &M::Error>
Get a reference to the chat session or an error if the session failed to load.
You can use the session to save the chat for later:
let model = Llama::new_chat().await.unwrap();
let mut chat = model.chat();
let session = chat.session().unwrap();
let bytes = session.to_bytes().unwrap();
std::fs::write("./chat.llama", bytes).unwrap();Or get the chat history:
let model = Llama::new_chat().await.unwrap();
let mut chat = model.chat();
// Add a message to the chat history
chat("Hello, world!").to_std_out().await.unwrap();
// Get the chat session
let session = chat.session().unwrap();
// Get the chat history
let history = session.history();
println!("{:?}", history);