# Conversational API
`fchat` is the conversation orchestration layer for Fiddlesticks.
It sits above `fprovider` and is responsible for handling chat turns, session history loading/saving, and assembling provider requests from conversational state.
It can also integrate with `ftooling` for provider tool-call execution loops.
## Responsibilities
- Own chat-session and turn request/response types
- Load prior transcript messages from a conversation store
- Build and execute provider requests through `fprovider::ModelProvider`
- Persist new user/assistant transcript messages
`fchat` does **not**:
- Implement model-provider transports (that belongs to `fprovider`)
- Execute tools (that belongs to `ftooling`)
- Define memory retrieval/summarization engines (that belongs to `fmemory`)
## Current implementation scope
The implementation currently supports:
- Non-streaming turn execution via `ChatService::run_turn(...)`
- Live streaming turn execution via `ChatService::stream_turn(...)`
- Session-level system prompt injection
- In-memory transcript storage implementation for local use/tests
- Optional tool-call execution loop via `ftooling::ToolRuntime`
- Provider-call retries via `fprovider::RetryPolicy`
- Tool round-cap signaling when execution limits are reached
## Add dependency
```toml
[dependencies]
fchat = { path = "../fchat" }
ftooling = { path = "../ftooling" }
fprovider = { path = "../fprovider", features = ["provider-openai"] }
```
## Basic usage
```rust
use std::sync::Arc;
use fchat::prelude::*;
use fprovider::ProviderId;
async fn run_chat(provider: Arc<dyn fprovider::ModelProvider>) -> Result<(), ChatError> {
let chat = ChatService::builder(provider)
.default_temperature(Some(0.2))
.default_max_tokens(Some(400))
.build();
let session = ChatSession::new("session-1", ProviderId::OpenAi, "gpt-4o-mini")
.with_system_prompt("You are concise and helpful.");
let request = ChatTurnRequest::new(session, "Summarize this repo layout");
let result = chat.run_turn(request).await?;
println!("assistant: {}", result.assistant_message);
Ok(())
}
```
## High-level builders and defaults
`fchat` includes an opinionated builder path so you can configure once and keep turn calls lightweight.
- `ChatService::builder(provider)` defaults to `InMemoryConversationStore`
- `ChatPolicy::default()` is applied unless overridden
- per-turn values are merged with service defaults (`ChatTurnRequest` values win)
- provider retries default to `RetryPolicy::default()` and can be overridden
- `ChatTurnRequest::builder(...)` provides turn-level ergonomics for overrides
```rust
use std::sync::Arc;
use fchat::prelude::*;
fn build_service(provider: Arc<dyn fprovider::ModelProvider>) -> ChatService {
ChatService::builder(provider)
.default_temperature(Some(0.3))
.default_max_tokens(Some(512))
.max_tool_round_trips(4)
.provider_retry_policy(fprovider::RetryPolicy::new(3))
.build()
}
```
For turn-level overrides, use `ChatTurnOptions`:
```rust
use fchat::prelude::*;
let options = ChatTurnOptions {
temperature: Some(0.7),
max_tokens: Some(120),
stream: false,
};
let request = ChatTurnRequest::new(session, "Explain this quickly")
.with_options(options);
```
Or use the builder for symmetry with `ChatService::builder(...)`:
```rust
use fchat::prelude::*;
let request = ChatTurnRequest::builder(session, "Explain this quickly")
.temperature(0.7)
.max_tokens(120)
.build();
```
## Streaming usage
```rust
use std::sync::Arc;
use futures_util::StreamExt;
use fchat::prelude::*;
use fprovider::ProviderId;
async fn run_streaming(provider: Arc<dyn fprovider::ModelProvider>) -> Result<(), ChatError> {
let store = Arc::new(InMemoryConversationStore::new());
let chat = ChatService::new(provider, store);
let session = ChatSession::new("session-stream", ProviderId::OpenAi, "gpt-4o-mini");
let request = ChatTurnRequest::new(session, "Stream this response").enable_streaming();
let mut events = chat.stream_turn(request).await?;
while let Some(event) = events.next().await {
match event? {
ChatEvent::TextDelta(delta) => {
println!("delta: {}", delta);
}
ChatEvent::ToolCallDelta(_) => {}
ChatEvent::AssistantMessageComplete(_) => {}
ChatEvent::TurnComplete(result) => {
println!("final: {}", result.assistant_message);
}
}
}
Ok(())
}
```
Current streaming semantics:
- `stream_turn` maps provider stream events into chat-layer events.
- streaming supports multi-round tool execution when a tool runtime is configured.
- tool lifecycle events are emitted (`ToolExecutionStarted`, `ToolExecutionFinished`).
- Transcript persistence still occurs before `TurnComplete` is emitted.
- Events are forwarded as they arrive from the provider stream.
- Stream acquisition uses retry policy; once a stream is established, event failures are surfaced immediately.
## Tool loop usage (`ftooling` integration)
When configured, `ChatService::run_turn(...)` can execute provider tool calls and continue model turns.
```rust
use std::sync::Arc;
use fchat::prelude::*;
use fprovider::ProviderId;
use ftooling::prelude::*;
fn build_chat(provider: Arc<dyn fprovider::ModelProvider>) -> ChatService {
let mut registry = ToolRegistry::new();
registry.register_sync_fn(
fprovider::ToolDefinition {
name: "echo".to_string(),
description: "Echo tool".to_string(),
input_schema: "{\"type\":\"string\"}".to_string(),
},
|args, _ctx| Ok(args),
);
let runtime = Arc::new(DefaultToolRuntime::new(Arc::new(registry)));
let store = Arc::new(InMemoryConversationStore::new());
ChatService::new(provider, store)
.with_tool_runtime(runtime)
.with_max_tool_round_trips(4)
}
fn _session() -> ChatSession {
ChatSession::new("session-tools", ProviderId::OpenAi, "gpt-4o-mini")
}
```
Tool loop semantics:
- Tool execution is only used when both a runtime is configured and `max_tool_round_trips > 0`.
- Each provider `ToolCall` is executed through `ftooling::ToolRuntime`.
- Tool outputs are returned to the provider as `ToolResult` values for follow-up completions.
- Loop stops when no tool calls remain or max round-trips is reached.
- If the max round-trip cap is reached with pending tool calls:
- `run_turn` sets `ChatTurnResult.tool_round_limit_reached = true`
- `stream_turn` also emits `ChatEvent::ToolRoundLimitReached { ... }`
## Public API overview
- `ChatService`: turn orchestrator over provider + store
- `ChatSession`: session metadata (`id`, `provider`, `model`, optional `system_prompt`)
- `ChatTurnRequest`: user input + per-turn model params
- `ChatTurnResult`: assistant text + tool calls + stop reason + usage
- `ChatTurnResult`: includes `tool_round_limit_reached` for cap visibility
- `ChatTurnRequestBuilder`: ergonomic builder for per-turn options
- `ChatEvent`: streaming event envelope (`TextDelta`, `ToolCallDelta`, `ToolExecutionStarted`, `ToolExecutionFinished`, `AssistantMessageComplete`, `ToolRoundLimitReached`, `TurnComplete`)
- `ChatEventStream`: stream alias for chat event consumers
- `ConversationStore`: async conversation history contract
- `InMemoryConversationStore`: default in-crate store implementation
- `with_tool_runtime(...)`: opt-in `ftooling::ToolRuntime` integration
- `with_max_tool_round_trips(...)`: cap recursive tool/model rounds
## Error model
`ChatErrorKind` variants:
- `InvalidRequest`
- `Provider`
- `Store`
- `Tooling`
Provider errors from `fprovider` are mapped into `ChatErrorKind::Provider`.
Tool errors from `ftooling` are mapped into `ChatErrorKind::Tooling`.
`ChatError` also exposes:
- `retryable`: normalized retry hint for higher layers
- `phase`: where the failure occurred (`Provider`, `Tooling`, `Storage`, `Streaming`, etc.)
- `source`: source error kind (`ProviderErrorKind` or `ToolErrorKind`)
- helper methods: `is_retryable()` and `is_user_error()`
`phase` values are stable API hints for where the failure happened:
- `RequestValidation`
- `Provider`
- `Streaming`
- `Tooling`
- `Storage`