
# Stream Formatter


A stream formatter processes the LLM's output token-by-token in real time, deciding what to show the user and what to hide.

## Why you need one


When the LLM outputs raw text, it may include:
- `<think>` / `</think>` tags (reasoning blocks)
- `[TOOL_CALL]...[/TOOL_CALL]` blocks (raw tool call JSON)

These are useful for the machine but noisy for humans. The formatter strips or replaces them before the text reaches the user.

## Default: PassThroughFormatter


By default, nothing is filtered: `push()` returns every token unchanged, and `flush()` has nothing left to emit.

```rust
pub struct PassThroughFormatter;
impl StreamFormatter for PassThroughFormatter {
    fn push(&mut self, token: &str) -> String {
        token.to_string()
    }
    fn flush(&mut self) -> String {
        String::new()
    }
}
```

## Standard formatting


```rust
let agent = Agent::make(config).await?
    .with_standard_formatting();
```

This enables `StandardStreamFormatter`, which:

1. Scans each incoming token for the tool call start/end tags and the think tags
2. Buffers text inside tool call blocks and suppresses it
3. Labels think blocks with `[Thinking]:\n` in place of the raw `<think>` tags
4. Labels non-tool content with `[Content]:`
5. Enforces a maximum buffer size (8192 bytes by default) so that an oversized chunk cannot exhaust memory

Example transformation:

```
Raw LLM output:
  Let me think about this
  <think>
  The user is asking about the weather...
  </think>
  [TOOL_CALL]{"name":"get_weather","args":{"city":"Tokyo"}}[/TOOL_CALL]
  The weather in Tokyo is...

Formatted output:
  [Thinking]:
  The user is asking about the weather...
  [Content]:
  The weather in Tokyo is...
```
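The buffering and suppression steps above can be sketched as a small stateful formatter. This is a minimal illustration, not `StandardStreamFormatter`'s actual code: the `StreamFormatter` trait is redeclared locally so the snippet compiles on its own, and `ToolCallStripper`, `partial_tag_len`, and `run` are hypothetical names. It handles only tool call blocks and shows why buffering is needed when a tag is split across two tokens.

```rust
// Local stand-in for the trait shown in this guide; the real one comes from `ambi`.
trait StreamFormatter {
    fn push(&mut self, token: &str) -> String;
    fn flush(&mut self) -> String;
}

const START: &str = "[TOOL_CALL]";
const END: &str = "[/TOOL_CALL]";

/// Hypothetical formatter that hides everything between START and END.
#[derive(Default)]
struct ToolCallStripper {
    buf: String,   // text that has been neither emitted nor discarded yet
    in_tool: bool, // true while inside a tool call block
}

/// Length of the longest suffix of `s` that is a proper prefix of `tag`.
fn partial_tag_len(s: &str, tag: &str) -> usize {
    (1..tag.len())
        .rev()
        .find(|&n| s.ends_with(&tag[..n]))
        .unwrap_or(0)
}

impl StreamFormatter for ToolCallStripper {
    fn push(&mut self, token: &str) -> String {
        self.buf.push_str(token);
        let mut out = String::new();
        loop {
            // Outside a block we look for START, inside a block for END.
            let tag = if self.in_tool { END } else { START };
            let (cut, skip, found) = match self.buf.find(tag) {
                Some(pos) => (pos, tag.len(), true),
                // No full tag: hold back only a tail that could still become one.
                None => (self.buf.len() - partial_tag_len(&self.buf, tag), 0, false),
            };
            if !self.in_tool {
                out.push_str(&self.buf[..cut]); // visible text is emitted
            }
            self.buf.replace_range(..cut + skip, ""); // consumed text and tag are dropped
            if !found {
                return out;
            }
            self.in_tool = !self.in_tool;
        }
    }

    fn flush(&mut self) -> String {
        // A held-back tail outside a block was ordinary text after all;
        // the remains of an unterminated block are discarded.
        let rest = std::mem::take(&mut self.buf);
        if self.in_tool { String::new() } else { rest }
    }
}

/// Drives one formatter over a token sequence and collects the visible text.
fn run(tokens: &[&str]) -> String {
    let mut fmt = ToolCallStripper::default();
    let mut out = String::new();
    for t in tokens {
        out.push_str(&fmt.push(t));
    }
    out.push_str(&fmt.flush());
    out
}
```

Calling `run(&["Hi ", "[TOOL", "_CALL]{}[/TOOL", "_CALL]", " done"])` yields `"Hi  done"`: the partial `[TOOL` is held back until the next token shows whether it completes a tag.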

## Custom StreamFormatter


Implement the `StreamFormatter` trait:

```rust
use ambi::types::StreamFormatter;

struct MyFormatter;

impl StreamFormatter for MyFormatter {
    fn push(&mut self, token: &str) -> String {
        // Simple: just uppercase everything
        token.to_uppercase()
    }

    fn flush(&mut self) -> String {
        String::new()
    }
}
```

Then inject it:

```rust
let agent = Agent::make(config).await?
    .with_stream_formatter(|| Box::new(MyFormatter));
```

The `with_stream_formatter` method takes a **factory closure** (not an instance) because a new formatter is created for each streaming request. This matters for stateful formatters that accumulate buffers: each request starts with fresh state, so nothing carries over from one response to the next.
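To make the per-request behavior concrete, here is a minimal sketch of how a factory might be used. Everything here is an assumption for illustration: the trait is redeclared locally, and `FormatterFactory`, `NumberingFormatter`, `numbering_factory`, and `handle_request` are hypothetical names, not part of ambi's API.

```rust
// Local stand-in for the trait shown in this guide; the real one comes from `ambi`.
trait StreamFormatter {
    fn push(&mut self, token: &str) -> String;
    fn flush(&mut self) -> String;
}

/// Hypothetical stateful formatter: prefixes each chunk with a running index.
struct NumberingFormatter {
    seen: usize,
}

impl StreamFormatter for NumberingFormatter {
    fn push(&mut self, token: &str) -> String {
        self.seen += 1;
        format!("{}:{} ", self.seen, token)
    }

    fn flush(&mut self) -> String {
        String::new()
    }
}

/// Assumed shape of a factory: a closure that builds a fresh boxed formatter.
type FormatterFactory = Box<dyn Fn() -> Box<dyn StreamFormatter>>;

fn numbering_factory() -> FormatterFactory {
    Box::new(|| -> Box<dyn StreamFormatter> { Box::new(NumberingFormatter { seen: 0 }) })
}

/// Simulates one streaming request: a new formatter is built, used, and dropped.
fn handle_request(factory: &FormatterFactory, tokens: &[&str]) -> String {
    let mut fmt = factory(); // fresh state for this request
    let mut out = String::new();
    for t in tokens {
        out.push_str(&fmt.push(t));
    }
    out.push_str(&fmt.flush());
    out
}
```

Two consecutive calls to `handle_request` with the same factory both start counting at 1. Had a single shared instance been passed instead, the second request would continue from where the first left off.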

## When the formatter is called


- In streaming mode: every LLM token chunk goes through `push()`, and `flush()` runs after the stream ends.
- In sync mode: the full output passes through a formatter once. The pipeline builds the formatted result internally from `push()` and `flush()` calls, so the same formatter logic applies in both modes.
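The two call patterns can be sketched as follows. This is an assumption about the general shape, not ambi's pipeline code: the trait is redeclared locally, and `format_stream`, `format_full`, and `LineFormatter` are hypothetical names. `LineFormatter` only releases complete lines, which shows why the final `flush()` matters.

```rust
// Local stand-in for the trait shown in this guide; the real one comes from `ambi`.
trait StreamFormatter {
    fn push(&mut self, token: &str) -> String;
    fn flush(&mut self) -> String;
}

/// Hypothetical formatter that holds text back until a line is complete.
#[derive(Default)]
struct LineFormatter {
    pending: String,
}

impl StreamFormatter for LineFormatter {
    fn push(&mut self, token: &str) -> String {
        self.pending.push_str(token);
        match self.pending.rfind('\n') {
            Some(i) => {
                // Keep the unfinished tail, hand back the complete lines.
                let rest = self.pending.split_off(i + 1);
                std::mem::replace(&mut self.pending, rest)
            }
            None => String::new(),
        }
    }

    fn flush(&mut self) -> String {
        std::mem::take(&mut self.pending)
    }
}

/// Streaming mode: push every chunk, emit non-empty pieces, flush at the end.
fn format_stream<'a, I, F>(fmt: &mut dyn StreamFormatter, tokens: I, mut emit: F)
where
    I: IntoIterator<Item = &'a str>,
    F: FnMut(&str),
{
    for token in tokens {
        let piece = fmt.push(token);
        if !piece.is_empty() {
            emit(&piece);
        }
    }
    let tail = fmt.flush();
    if !tail.is_empty() {
        emit(&tail);
    }
}

/// Sync mode: one push with the whole output, followed by a flush.
fn format_full(fmt: &mut dyn StreamFormatter, full: &str) -> String {
    let mut out = fmt.push(full);
    out.push_str(&fmt.flush());
    out
}
```

Without the trailing `flush()`, the last unterminated line (`"de"` for the chunks `"ab"`, `"c\nd"`, `"e"`) would never reach the user.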

## Buffer overflow protection


`StandardStreamFormatter` has a hard cap (`max_buffer_size`, default 8192 bytes). If the buffer exceeds it, the formatter clears itself and logs an error. This is a safety net against pathological LLM output.
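A minimal sketch of such a cap, assuming the check happens before the chunk is appended. `CappedBuffer` and its methods are hypothetical and stand in for the formatter's internal buffer; only the `max_buffer_size` name and the 8192-byte default come from the description above, and `eprintln!` stands in for whatever logging the library uses.

```rust
/// Hypothetical capped buffer mirroring the described behavior; not ambi's code.
struct CappedBuffer {
    buf: String,
    max_buffer_size: usize,
}

impl CappedBuffer {
    fn new(max_buffer_size: usize) -> Self {
        Self { buf: String::new(), max_buffer_size }
    }

    /// Appends `chunk`. Returns `false` if that would exceed the cap, in which
    /// case the buffer is cleared and the chunk is dropped.
    fn append(&mut self, chunk: &str) -> bool {
        if self.buf.len() + chunk.len() > self.max_buffer_size {
            // Stand-in for the error log mentioned in the text.
            eprintln!(
                "stream formatter buffer would exceed {} bytes; clearing",
                self.max_buffer_size
            );
            self.buf.clear();
            return false;
        }
        self.buf.push_str(chunk);
        true
    }

    /// Number of bytes currently buffered.
    fn len(&self) -> usize {
        self.buf.len()
    }
}
```

With `CappedBuffer::new(8192)`, an unterminated tool call block can grow to at most 8192 buffered bytes before the state is reset, so memory use stays bounded regardless of what the model emits.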