fchat 3.0.0

Chat library for the Fiddlesticks agent harness framework

Conversational API

fchat is the conversation orchestration layer for Fiddlesticks.

It sits above fprovider and is responsible for handling chat turns, session history loading/saving, and assembling provider requests from conversational state.

It can also integrate with ftooling for provider tool-call execution loops.

Responsibilities

  • Own chat-session and turn request/response types
  • Load prior transcript messages from a conversation store
  • Build and execute provider requests through fprovider::ModelProvider
  • Persist new user/assistant transcript messages

fchat does not:

  • Implement model-provider transports (that belongs to fprovider)
  • Execute tools (that belongs to ftooling)
  • Define memory retrieval/summarization engines (that belongs to fmemory)

Current implementation scope

The implementation currently supports:

  • Non-streaming turn execution via ChatService::run_turn(...)
  • Live streaming turn execution via ChatService::stream_turn(...)
  • Session-level system prompt injection
  • In-memory transcript storage implementation for local use/tests
  • Optional tool-call execution loop via ftooling::ToolRuntime
  • Provider-call retries via fprovider::RetryPolicy
  • Tool round-cap signaling when execution limits are reached

Add dependency

[dependencies]
fchat = { path = "../fchat" }
ftooling = { path = "../ftooling" }
fprovider = { path = "../fprovider", features = ["provider-openai"] }

Basic usage

use std::sync::Arc;

use fchat::prelude::*;
use fprovider::ProviderId;

async fn run_chat(provider: Arc<dyn fprovider::ModelProvider>) -> Result<(), ChatError> {
    let chat = ChatService::builder(provider)
        .default_temperature(Some(0.2))
        .default_max_tokens(Some(400))
        .build();

    let session = ChatSession::new("session-1", ProviderId::OpenAi, "gpt-4o-mini")
        .with_system_prompt("You are concise and helpful.");

    let request = ChatTurnRequest::new(session, "Summarize this repo layout");

    let result = chat.run_turn(request).await?;

    println!("assistant: {}", result.assistant_message);
    Ok(())
}

High-level builders and defaults

fchat includes an opinionated builder path so you can configure once and keep turn calls lightweight.

  • ChatService::builder(provider) defaults to InMemoryConversationStore
  • ChatPolicy::default() is applied unless overridden
  • per-turn values are merged with service defaults (ChatTurnRequest values win)
  • provider retries default to RetryPolicy::default() and can be overridden
  • ChatTurnRequest::builder(...) provides turn-level ergonomics for overrides

use std::sync::Arc;

use fchat::prelude::*;

fn build_service(provider: Arc<dyn fprovider::ModelProvider>) -> ChatService {
    ChatService::builder(provider)
        .default_temperature(Some(0.3))
        .default_max_tokens(Some(512))
        .max_tool_round_trips(4)
        .provider_retry_policy(fprovider::RetryPolicy::new(3))
        .build()
}

For turn-level overrides, use ChatTurnOptions:

use fchat::prelude::*;

let options = ChatTurnOptions {
    temperature: Some(0.7),
    max_tokens: Some(120),
    stream: false,
};

let request = ChatTurnRequest::new(session, "Explain this quickly")
    .with_options(options);

Or use the builder for symmetry with ChatService::builder(...):

use fchat::prelude::*;

let request = ChatTurnRequest::builder(session, "Explain this quickly")
    .temperature(0.7)
    .max_tokens(120)
    .build();
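The precedence rule for these overrides (turn-level values win, service defaults fill gaps) boils down to Option::or. A minimal std-only sketch; merge_option is a hypothetical helper illustrating the rule, not part of the fchat API:

```rust
/// Illustrative only: merge one turn-level option with a service default.
/// Turn-level values take precedence; the service default fills gaps.
fn merge_option<T>(turn: Option<T>, service_default: Option<T>) -> Option<T> {
    turn.or(service_default)
}

fn main() {
    // Turn-level temperature overrides the service default.
    assert_eq!(merge_option(Some(0.7), Some(0.3)), Some(0.7));
    // Absent turn-level value falls back to the service default.
    assert_eq!(merge_option(None, Some(512u32)), Some(512));
    // Neither set: the provider's own default applies downstream.
    assert_eq!(merge_option::<u32>(None, None), None);
}
```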

Streaming usage

use std::sync::Arc;

use futures_util::StreamExt;
use fchat::prelude::*;
use fprovider::ProviderId;

async fn run_streaming(provider: Arc<dyn fprovider::ModelProvider>) -> Result<(), ChatError> {
    let store = Arc::new(InMemoryConversationStore::new());
    let chat = ChatService::new(provider, store);

    let session = ChatSession::new("session-stream", ProviderId::OpenAi, "gpt-4o-mini");
    let request = ChatTurnRequest::new(session, "Stream this response").enable_streaming();

    let mut events = chat.stream_turn(request).await?;
    while let Some(event) = events.next().await {
        match event? {
            ChatEvent::TextDelta(delta) => {
                println!("delta: {}", delta);
            }
            ChatEvent::ToolCallDelta(_) => {}
            ChatEvent::AssistantMessageComplete(_) => {}
            ChatEvent::TurnComplete(result) => {
                println!("final: {}", result.assistant_message);
            }
            // Ignore tool lifecycle and round-limit events in this example.
            _ => {}
        }
    }

    Ok(())
}

Current streaming semantics:

  • stream_turn maps provider stream events into chat-layer events.
  • streaming supports multi-round tool execution when a tool runtime is configured.
  • tool lifecycle events are emitted (ToolExecutionStarted, ToolExecutionFinished).
  • Transcript persistence still occurs before TurnComplete is emitted.
  • Events are forwarded as they arrive from the provider stream.
  • Stream acquisition uses retry policy; once a stream is established, event failures are surfaced immediately.
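The last bullet (retries apply only to stream acquisition, never to per-event failures) can be sketched as a plain loop. RetryPolicy here is a simplified stand-in, not the real fprovider::RetryPolicy API:

```rust
/// Illustrative stand-in for a retry policy: max attempts only.
struct RetryPolicy {
    max_attempts: u32,
}

/// Try to acquire a stream up to `max_attempts` times. Once acquired,
/// per-event failures are NOT retried; they surface to the caller.
fn acquire_with_retry<S, E>(
    policy: &RetryPolicy,
    mut open: impl FnMut() -> Result<S, E>,
) -> Result<S, E> {
    let mut last_err = None;
    for _ in 0..policy.max_attempts {
        match open() {
            Ok(stream) => return Ok(stream),
            Err(e) => last_err = Some(e),
        }
    }
    Err(last_err.expect("max_attempts must be > 0"))
}

fn main() {
    let policy = RetryPolicy { max_attempts: 3 };
    let mut calls = 0;
    // Fails twice, then succeeds: acquisition is retried transparently.
    let result = acquire_with_retry(&policy, || {
        calls += 1;
        if calls < 3 { Err("transient") } else { Ok("stream") }
    });
    assert_eq!(result, Ok("stream"));
    assert_eq!(calls, 3);
}
```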

Tool loop usage (ftooling integration)

When configured, ChatService::run_turn(...) can execute provider tool calls and continue model turns.

use std::sync::Arc;

use fchat::prelude::*;
use fprovider::ProviderId;
use ftooling::prelude::*;

fn build_chat(provider: Arc<dyn fprovider::ModelProvider>) -> ChatService {
    let mut registry = ToolRegistry::new();
    registry.register_sync_fn(
        fprovider::ToolDefinition {
            name: "echo".to_string(),
            description: "Echo tool".to_string(),
            input_schema: "{\"type\":\"string\"}".to_string(),
        },
        |args, _ctx| Ok(args),
    );

    let runtime = Arc::new(DefaultToolRuntime::new(Arc::new(registry)));
    let store = Arc::new(InMemoryConversationStore::new());

    ChatService::new(provider, store)
        .with_tool_runtime(runtime)
        .with_max_tool_round_trips(4)
}

fn _session() -> ChatSession {
    ChatSession::new("session-tools", ProviderId::OpenAi, "gpt-4o-mini")
}

Tool loop semantics:

  • Tool execution is only used when both a runtime is configured and max_tool_round_trips > 0.
  • Each provider ToolCall is executed through ftooling::ToolRuntime.
  • Tool outputs are returned to the provider as ToolResult values for follow-up completions.
  • Loop stops when no tool calls remain or max round-trips is reached.
  • If the max round-trip cap is reached with pending tool calls:
    • run_turn sets ChatTurnResult.tool_round_limit_reached = true
    • stream_turn also emits ChatEvent::ToolRoundLimitReached { ... }
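The loop and cap semantics above can be sketched in plain Rust. The types here are simplified stand-ins for illustration, not the real fchat/ftooling API:

```rust
/// Simplified stand-in for one model response: just its pending tool calls.
struct ModelResponse {
    tool_calls: Vec<String>,
}

/// Run the tool loop: execute pending calls, ask the model again, and stop
/// when no calls remain or the round cap is hit with calls still pending.
/// Returns the last response and whether the cap was reached.
fn run_tool_loop(
    mut next_response: impl FnMut(&[String]) -> ModelResponse,
    max_tool_round_trips: u32,
) -> (ModelResponse, bool) {
    let mut response = next_response(&[]);
    let mut rounds = 0;
    while !response.tool_calls.is_empty() {
        if rounds >= max_tool_round_trips {
            // Cap reached with tool calls pending: signal it to the caller
            // (run_turn sets tool_round_limit_reached = true).
            return (response, true);
        }
        // Execute tools (stubbed here: echo the call names as results)...
        let results: Vec<String> = response.tool_calls.clone();
        // ...and feed results back for a follow-up completion.
        response = next_response(&results);
        rounds += 1;
    }
    (response, false)
}

fn main() {
    // Model asks for a tool once, then finishes: no cap hit.
    let mut turn = 0;
    let (_, capped) = run_tool_loop(
        |_| {
            turn += 1;
            ModelResponse {
                tool_calls: if turn == 1 { vec!["echo".into()] } else { vec![] },
            }
        },
        4,
    );
    assert!(!capped);

    // Model asks for tools forever: the cap is reached and signaled.
    let (_, capped) =
        run_tool_loop(|_| ModelResponse { tool_calls: vec!["echo".into()] }, 2);
    assert!(capped);
}
```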

Public API overview

  • ChatService: turn orchestrator over provider + store
  • ChatSession: session metadata (id, provider, model, optional system_prompt)
  • ChatTurnRequest: user input + per-turn model params
  • ChatTurnResult: assistant text + tool calls + stop reason + usage + tool_round_limit_reached cap flag
  • ChatTurnRequestBuilder: ergonomic builder for per-turn options
  • ChatEvent: streaming event envelope (TextDelta, ToolCallDelta, ToolExecutionStarted, ToolExecutionFinished, AssistantMessageComplete, ToolRoundLimitReached, TurnComplete)
  • ChatEventStream: stream alias for chat event consumers
  • ConversationStore: async conversation history contract
  • InMemoryConversationStore: default in-crate store implementation
  • with_tool_runtime(...): opt-in ftooling::ToolRuntime integration
  • with_max_tool_round_trips(...): cap recursive tool/model rounds
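The real ConversationStore contract is async; as a rough std-only sketch of the load/append shape it describes (trait and method names here are hypothetical simplifications):

```rust
use std::collections::HashMap;

/// Simplified, synchronous sketch of a conversation store contract.
/// The real fchat ConversationStore is async; this only shows the shape.
trait ConversationStoreSketch {
    fn load(&self, session_id: &str) -> Vec<String>;
    fn append(&mut self, session_id: &str, message: String);
}

/// In-memory implementation, analogous to InMemoryConversationStore.
#[derive(Default)]
struct InMemorySketch {
    messages: HashMap<String, Vec<String>>,
}

impl ConversationStoreSketch for InMemorySketch {
    fn load(&self, session_id: &str) -> Vec<String> {
        self.messages.get(session_id).cloned().unwrap_or_default()
    }

    fn append(&mut self, session_id: &str, message: String) {
        self.messages
            .entry(session_id.to_string())
            .or_default()
            .push(message);
    }
}

fn main() {
    let mut store = InMemorySketch::default();
    store.append("session-1", "user: hi".into());
    store.append("session-1", "assistant: hello".into());
    assert_eq!(store.load("session-1").len(), 2);
    assert!(store.load("session-2").is_empty());
}
```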

Error model

ChatErrorKind variants:

  • InvalidRequest
  • Provider
  • Store
  • Tooling

Provider errors from fprovider are mapped into ChatErrorKind::Provider. Tool errors from ftooling are mapped into ChatErrorKind::Tooling.

ChatError also exposes:

  • retryable: normalized retry hint for higher layers
  • phase: where the failure occurred (Provider, Tooling, Storage, Streaming, etc.)
  • source: the underlying error kind (ProviderErrorKind or ToolErrorKind)
  • helper methods: is_retryable() and is_user_error()

phase values are stable API hints for where the failure happened:

  • RequestValidation
  • Provider
  • Streaming
  • Tooling
  • Storage
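As a std-only sketch of how the kind/phase/retryable fields described above might compose (a guess at the shape, not the crate's actual definitions):

```rust
/// Illustrative stand-ins for the error taxonomy described above.
#[derive(Debug, PartialEq)]
enum ChatErrorKind {
    InvalidRequest,
    Provider,
    Store,
    Tooling,
}

#[derive(Debug, PartialEq)]
enum ChatPhase {
    RequestValidation,
    Provider,
    Streaming,
    Tooling,
    Storage,
}

struct ChatError {
    kind: ChatErrorKind,
    phase: ChatPhase,
    retryable: bool,
}

impl ChatError {
    /// Normalized retry hint for higher layers.
    fn is_retryable(&self) -> bool {
        self.retryable
    }

    /// Invalid requests are caller mistakes, not infrastructure faults.
    fn is_user_error(&self) -> bool {
        self.kind == ChatErrorKind::InvalidRequest
    }
}

fn main() {
    let err = ChatError {
        kind: ChatErrorKind::Provider,
        phase: ChatPhase::Streaming,
        retryable: true,
    };
    assert!(err.is_retryable());
    assert!(!err.is_user_error());
    assert_eq!(err.phase, ChatPhase::Streaming);
}
```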