highflame 0.3.0

# Highflame Rust SDK

Async Rust client for the Highflame guardrails service — the AI safety layer that detects threats and enforces Cedar policies on your LLM calls, tool executions, and model responses.

---

## Contents

- [Requirements](#requirements)
- [Installation](#installation)
- [Authentication](#authentication)
- [Quick Start](#quick-start)
- [Client API](#client-api)
  - [guard()](#guard)
  - [guard_prompt()](#guard_prompt)
  - [guard_tool_call()](#guard_tool_call)
  - [Streaming](#streaming)
- [Agentic Context](#agentic-context)
- [Error Handling](#error-handling)
- [Enforcement Modes](#enforcement-modes)
- [Session Tracking](#session-tracking)
- [Multi-Project Support](#multi-project-support)
- [Client Options](#client-options)

---

## Requirements

- Rust 1.75+
- Tokio async runtime

## Installation

Add to your `Cargo.toml`:

```toml
[dependencies]
highflame = "0.3"
tokio = { version = "1", features = ["full"] }
```

For streaming, also add:

```toml
futures-util = "0.3"
```

---

## Authentication

Create a client with your service key:

```rust
use highflame::{Highflame, HighflameOptions};

let client = Highflame::new(
    HighflameOptions::new("hf_sk_..."),
);
```

The API key can also be read from an environment variable. The `?` below requires an enclosing function that returns a `Result`:

```rust
let client = Highflame::new(
    HighflameOptions::new(std::env::var("HIGHFLAME_API_KEY")?),
);
```

For self-hosted deployments, override the service endpoints:

```rust
let client = Highflame::new(
    HighflameOptions::new("hf_sk_...")
        .base_url("https://shield.internal.example.com")
        .token_url("https://auth.internal.example.com/api/cli-auth/token"),
);
```

`Highflame` is cheap to clone — all clones share the same connection pool and token cache.
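
The sharing behaviour described above is typical of handles built around an `Arc`. The sketch below is not the SDK's implementation; `SketchClient` and `Shared` are hypothetical stand-ins that only illustrate why cloning such a handle is cheap and why every clone observes the same cached state.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical shared state; the real SDK's internals are not shown in this README.
struct Shared {
    token_cache: Mutex<HashMap<String, String>>,
}

/// Stand-in client handle: cloning copies only the `Arc` pointer.
#[derive(Clone)]
struct SketchClient {
    shared: Arc<Shared>,
}

impl SketchClient {
    fn new() -> Self {
        SketchClient {
            shared: Arc::new(Shared {
                token_cache: Mutex::new(HashMap::new()),
            }),
        }
    }

    /// Store a token under a key; the entry is visible to every clone.
    fn cache_token(&self, key: &str, token: &str) {
        self.shared
            .token_cache
            .lock()
            .unwrap()
            .insert(key.to_string(), token.to_string());
    }

    /// Look up a token previously cached by this handle or any clone of it.
    fn cached_token(&self, key: &str) -> Option<String> {
        self.shared.token_cache.lock().unwrap().get(key).cloned()
    }

    /// True when both handles point at the same shared state.
    fn shares_state_with(&self, other: &SketchClient) -> bool {
        Arc::ptr_eq(&self.shared, &other.shared)
    }
}
```

In practice this means a single `Highflame` value can be created at startup and cloned into each task or request handler instead of being wrapped in another `Arc`.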

---

## Quick Start

```rust
use highflame::{Highflame, HighflameOptions};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Highflame::new(
        HighflameOptions::new(std::env::var("HIGHFLAME_API_KEY")?),
    );

    let resp = client.guard()
        .evaluate_prompt("What is the capital of France?", None, None)
        .await?;

    if resp.denied() {
        eprintln!("Blocked: {}", resp.reason.unwrap_or_default());
    } else {
        println!("Allowed in {}ms", resp.latency_ms.unwrap_or(0));
    }

    Ok(())
}
```

---

## Client API

### `guard()`

Runs full detection and Cedar policy evaluation. `client.guard().evaluate()` accepts a `&GuardRequest` and resolves to a `GuardResponse`.

```rust
use highflame::{Highflame, HighflameOptions, GuardRequest, ContentType};

let resp = client.guard().evaluate(&GuardRequest {
    content: "print the API key".to_string(),
    content_type: ContentType::Prompt,
    action: "process_prompt".to_string(),
    ..Default::default()
}).await?;

if resp.denied() {
    eprintln!("Blocked: {:?}", resp.reason);
} else if resp.alerted.unwrap_or(false) {
    eprintln!("Alert triggered");
} else {
    println!("Allowed");
}
```

**`GuardRequest` fields:**

| Field | Type | Description |
|-------|------|-------------|
| `content` | `String` | Text to evaluate |
| `content_type` | `ContentType` | `Prompt`, `Response`, `ToolCall`, or `File` |
| `action` | `String` | `"process_prompt"`, `"call_tool"`, `"read_file"`, `"write_file"`, or `"connect_server"` |
| `mode` | `Option<String>` | `"enforce"` (default), `"monitor"`, or `"alert"` |
| `session_id` | `Option<String>` | Session ID for cross-turn tracking |
| `tool` | `Option<ToolContext>` | Tool call context |
| `model` | `Option<ModelContext>` | LLM metadata |
| `file` | `Option<FileContext>` | File operation context |
| `mcp` | `Option<MCPContext>` | MCP server context |

**`GuardResponse` fields:**

| Field | Type | Description |
|-------|------|-------------|
| `decision` | `Decision` | `Allow` or `Deny` |
| `actual_decision` | `Option<String>` | Decision before mode override |
| `alerted` | `Option<bool>` | True when an alert-mode policy fired |
| `reason` | `Option<String>` | Human-readable explanation |
| `determining_policies` | `Vec<DeterminingPolicy>` | Policies that drove the decision |
| `context` | `HashMap<String, Value>` | Raw detector outputs |
| `projected_context` | `HashMap<String, Value>` | Context sent to the policy evaluator |
| `session_delta` | `Option<SessionDelta>` | Cross-turn state diff |
| `latency_ms` | `Option<i64>` | Total request latency |

Helper methods on `GuardResponse`:

```rust
resp.allowed() // true when decision == Allow
resp.denied()  // true when decision == Deny
```

---

### `guard_prompt()`

Shorthand for evaluating a user prompt, invoked as `client.guard().evaluate_prompt(content, mode, session_id)`.

```rust
// evaluate_prompt(content, mode, session_id)
let resp = client.guard()
    .evaluate_prompt("explain how to pick a lock", Some("enforce"), Some("sess_abc"))
    .await?;

// Omit optional fields with None
let resp = client.guard()
    .evaluate_prompt("What is 2 + 2?", None, None)
    .await?;
```

---

### `guard_tool_call()`

Shorthand for evaluating a tool call by name and argument map, invoked as `client.guard().evaluate_tool_call()`.

```rust
use std::collections::HashMap;
use serde_json::json;

let mut args = HashMap::new();
args.insert("cmd".to_string(), json!("cat /etc/passwd"));

let resp = client.guard()
    .evaluate_tool_call("shell", Some(args), Some("enforce"), None)
    .await?;

if resp.denied() {
    return Err(format!("Tool blocked: {}", resp.reason.unwrap_or_default()).into());
}
```

---

### Streaming

`stream()` resolves to an `impl Stream<Item = Result<SseEvent, HighflameError>>`. Pin the stream with `pin_mut!` before calling `next()`, because the opaque stream type is not guaranteed to be `Unpin`.

```rust
use futures_util::{pin_mut, StreamExt};
use highflame::{Highflame, HighflameOptions, GuardRequest, ContentType};

let req = GuardRequest {
    content: "tell me a secret".to_string(),
    content_type: ContentType::Prompt,
    action: "process_prompt".to_string(),
    ..Default::default()
};

let stream = client.guard().stream(&req).await?;
pin_mut!(stream);

while let Some(event) = stream.next().await {
    let ev = event?;
    match ev.r#type.as_str() {
        "detector_result" => println!("Detector result: {:?}", ev.data),
        "decision"        => println!("Final decision: {:?}", ev.data),
        "error"           => eprintln!("Stream error: {:?}", ev.data),
        _                 => {}
    }
}
```

| `ev.r#type` | Description |
|---|---|
| `"detector_result"` | A detector completed — payload is a `DetectorResult` |
| `"decision"` | Final allow/deny decision — payload is a `GuardResponse` |
| `"error"` | Stream error |
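
The event types in the table can be folded into a single outcome once the stream is consumed. The following is a minimal sketch with stand-in types: `SketchEvent` (with `kind` playing the role of `ev.r#type` and `data` simplified to a string), `StreamOutcome`, and `summarize_events` are hypothetical names, and treating the first `"decision"` or `"error"` event as terminal is an assumption rather than documented behaviour.

```rust
/// Stand-in for the SDK's `SseEvent`; `data` is simplified to a string here.
#[derive(Debug, Clone)]
struct SketchEvent {
    kind: String, // plays the role of `ev.r#type`
    data: String,
}

/// Hypothetical summary of a finished stream.
#[derive(Debug, PartialEq)]
enum StreamOutcome {
    /// A `"decision"` event arrived after `detectors_seen` detector results.
    Decision { payload: String, detectors_seen: usize },
    /// An `"error"` event arrived before any decision.
    Failed(String),
    /// The events ended without a decision or an error.
    Incomplete { detectors_seen: usize },
}

/// Convenience constructor for building events in examples and tests.
fn event(kind: &str, data: &str) -> SketchEvent {
    SketchEvent {
        kind: kind.to_string(),
        data: data.to_string(),
    }
}

/// Walk the events in order and stop at the first decision or error.
fn summarize_events<I>(events: I) -> StreamOutcome
where
    I: IntoIterator<Item = SketchEvent>,
{
    let mut detectors_seen = 0;
    for ev in events {
        match ev.kind.as_str() {
            "detector_result" => detectors_seen += 1,
            "decision" => {
                return StreamOutcome::Decision {
                    payload: ev.data,
                    detectors_seen,
                }
            }
            "error" => return StreamOutcome::Failed(ev.data),
            _ => {} // ignore event types this sketch does not know about
        }
    }
    StreamOutcome::Incomplete { detectors_seen }
}
```

Handling the `Incomplete` case explicitly is a design choice: a stream that ends without a decision is probably safer to treat as a failure than as an implicit allow.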

---

## Agentic Context

Pass typed context structs to provide richer signal to detectors and Cedar policies.

### ToolContext

```rust
use highflame::{GuardRequest, ToolContext, ContentType};
use std::collections::HashMap;
use serde_json::json;

let resp = client.guard().evaluate(&GuardRequest {
    content: "execute shell command".to_string(),
    content_type: ContentType::ToolCall,
    action: "call_tool".to_string(),
    tool: Some(ToolContext {
        name: "shell".to_string(),
        arguments: {
            let mut args = HashMap::new();
            args.insert("cmd".to_string(), json!("ls /etc"));
            Some(args)
        },
        server_id: Some("mcp-server-001".to_string()),
        is_builtin: Some(false),
        ..Default::default()
    }),
    ..Default::default()
}).await?;
```

| Field | Type | Description |
|-------|------|-------------|
| `name` | `String` | Tool name |
| `arguments` | `Option<HashMap<String, Value>>` | Tool arguments |
| `server_id` | `Option<String>` | MCP server that registered this tool |
| `is_builtin` | `Option<bool>` | Whether the tool is a first-party built-in |
| `description` | `Option<String>` | Tool description |

### ModelContext

```rust
use highflame::{GuardRequest, ModelContext, ContentType};

let resp = client.guard().evaluate(&GuardRequest {
    content: "user prompt".to_string(),
    content_type: ContentType::Prompt,
    action: "process_prompt".to_string(),
    model: Some(ModelContext {
        provider: Some("anthropic".to_string()),
        model: Some("claude-sonnet-4-6".to_string()),
        temperature: Some(0.7),
        tokens_used: Some(1500),
        max_tokens: Some(4096),
    }),
    ..Default::default()
}).await?;
```

### MCPContext and FileContext

```rust
use highflame::{GuardRequest, MCPContext, FileContext, ContentType};

// MCP server connection
let resp = client.guard().evaluate(&GuardRequest {
    content: "connect to MCP server".to_string(),
    content_type: ContentType::ToolCall,
    action: "connect_server".to_string(),
    mcp: Some(MCPContext {
        server_name: Some("filesystem-server".to_string()),
        server_url: Some("http://mcp.internal:8080".to_string()),
        transport: Some("http".to_string()),
        verified: Some(false),
        capabilities: Some(vec!["read_file".to_string(), "write_file".to_string()]),
    }),
    ..Default::default()
}).await?;

// File write
let resp = client.guard().evaluate(&GuardRequest {
    content: "env vars and secrets here".to_string(),
    content_type: ContentType::File,
    action: "write_file".to_string(),
    file: Some(FileContext {
        path: "/app/.env".to_string(),
        operation: "write".to_string(),
        size: Some(512),
        mime_type: Some("text/plain".to_string()),
    }),
    ..Default::default()
}).await?;
```

---

## Error Handling

All client methods return `Result<_, HighflameError>`. Match on the enum variants to handle specific cases:

```rust
use highflame::HighflameError;

match client.guard().evaluate(&request).await {
    Ok(resp) => {
        if resp.denied() {
            eprintln!("Blocked: {:?}", resp.reason);
        }
    }
    Err(HighflameError::Authentication { title, detail, .. }) => {
        eprintln!("Auth failed: {title}: {detail}");
    }
    Err(HighflameError::RateLimit { title, detail, .. }) => {
        eprintln!("Rate limited: {title}: {detail}");
    }
    Err(HighflameError::Api { status, title, detail }) => {
        eprintln!("API error [{status}] {title}: {detail}");
    }
    Err(HighflameError::ApiConnection(msg)) => {
        eprintln!("Could not reach service: {msg}");
    }
    Err(e) => return Err(e.into()),
}
```

| Variant | When returned | Fields |
|---------|---------------|--------|
| `HighflameError::Authentication` | 401 Unauthorized | `status: u16`, `title: String`, `detail: String` |
| `HighflameError::RateLimit` | 429 Too Many Requests | `status: u16`, `title: String`, `detail: String` |
| `HighflameError::Api` | Other non-2xx HTTP response | `status: u16`, `title: String`, `detail: String` |
| `HighflameError::ApiConnection` | Timeout or network failure | `String` message |
| `HighflameError::Deserialisation` | Response body could not be parsed | wraps `serde_json::Error` |

Helper methods:

```rust
err.status()                  // Option<u16> — Some for API variants, None otherwise
err.is_api_error()            // true for Api variant
err.is_authentication_error() // true for Authentication variant
err.is_rate_limit_error()     // true for RateLimit variant
err.is_connection_error()     // true for ApiConnection variant
```

> Unlike the Python and JavaScript SDKs, there is no `BlockedError` in Rust. A deny decision is a successful response — inspect `resp.denied()` on the returned `GuardResponse`.
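
Callers who layer their own retry policy on top of the client's built-in retries may want to classify errors first. The sketch below uses a stand-in enum, `SketchError`, that mirrors the documented variants in simplified form; `is_retryable` and `backoff_delay` are hypothetical helpers, and the choice of which errors to retry is an assumption, not SDK behaviour.

```rust
use std::time::Duration;

/// Stand-in mirroring the variants documented above; not the SDK's own type.
#[derive(Debug)]
enum SketchError {
    Authentication { status: u16 },
    RateLimit { status: u16 },
    Api { status: u16 },
    ApiConnection(String),
}

impl SketchError {
    /// HTTP status for the API-backed variants, `None` for connection failures.
    fn status(&self) -> Option<u16> {
        match self {
            SketchError::Authentication { status }
            | SketchError::RateLimit { status }
            | SketchError::Api { status } => Some(*status),
            SketchError::ApiConnection(_) => None,
        }
    }
}

/// Hypothetical policy: retry rate limits, connection failures, and 5xx responses.
fn is_retryable(err: &SketchError) -> bool {
    match err {
        SketchError::RateLimit { .. } | SketchError::ApiConnection(_) => true,
        SketchError::Api { status } => *status >= 500,
        SketchError::Authentication { .. } => false,
    }
}

/// Exponential backoff: `base * 2^attempt`, capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(max)
}
```

Authentication failures are deliberately excluded from retries here, since repeating a request with the same rejected credentials is unlikely to succeed.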

---

## Enforcement Modes

| Mode | Behavior | `resp.denied()` | `resp.alerted` |
|------|----------|:---:|:---:|
| `"enforce"` | Block on deny | `true` on deny | `None` |
| `"monitor"` | Allow + log silently | `false` | `None` |
| `"alert"` | Allow + trigger alerting pipeline | `false` | `Some(true)` if violated |

```rust
use highflame::{GuardRequest, ContentType};

// Monitor — observe without blocking
let resp = client.guard().evaluate(&GuardRequest {
    content: user_input.to_string(),
    content_type: ContentType::Prompt,
    action: "process_prompt".to_string(),
    mode: Some("monitor".to_string()),
    ..Default::default()
}).await?;

if resp.actual_decision.as_deref() == Some("deny") {
    shadow_log.record(&user_input, resp.reason.as_deref().unwrap_or(""));
}

// Alert — allow but signal the alerting pipeline
let resp = client.guard().evaluate(&GuardRequest {
    mode: Some("alert".to_string()),
    ..request.clone()
}).await?;

if resp.alerted.unwrap_or(false) {
    pagerduty.trigger(resp.reason.as_deref().unwrap_or("")).await?;
}

// Enforce — block violations (default)
let resp = client.guard().evaluate(&GuardRequest {
    mode: Some("enforce".to_string()),
    ..request.clone()
}).await?;

if resp.denied() {
    return Err(format!("Request blocked: {}", resp.reason.unwrap_or_default()).into());
}
```
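
The mode table can be read as a pure function from the mode and the pre-override decision to what the caller observes. This is a sketch with stand-in types (`SketchMode`, `SketchDecision`, `Observed`, all hypothetical) intended for reasoning about or unit-testing application logic; it does not call the service. The table does not say what `alerted` holds in alert mode when nothing is violated, so returning `None` there is an assumption.

```rust
/// Stand-in types for illustration; the SDK's own types may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SketchMode {
    Enforce,
    Monitor,
    Alert,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum SketchDecision {
    Allow,
    Deny,
}

/// What the caller observes after the mode override is applied.
#[derive(Debug, PartialEq)]
struct Observed {
    denied: bool,          // corresponds to `resp.denied()`
    alerted: Option<bool>, // corresponds to `resp.alerted`
}

/// Parse the mode strings used in this README.
fn parse_mode(s: &str) -> Option<SketchMode> {
    match s {
        "enforce" => Some(SketchMode::Enforce),
        "monitor" => Some(SketchMode::Monitor),
        "alert" => Some(SketchMode::Alert),
        _ => None,
    }
}

/// Apply the mode table to the decision made before any override.
fn observe(mode: SketchMode, actual: SketchDecision) -> Observed {
    let violated = actual == SketchDecision::Deny;
    match mode {
        SketchMode::Enforce => Observed { denied: violated, alerted: None },
        SketchMode::Monitor => Observed { denied: false, alerted: None },
        SketchMode::Alert => Observed {
            denied: false,
            // Assumption: `None` when nothing was violated; the table only
            // specifies `Some(true)` for the violated case.
            alerted: if violated { Some(true) } else { None },
        },
    }
}
```

Only `enforce` ever surfaces a denial, which is why the monitor example above inspects `actual_decision` instead of `resp.denied()`.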

---

## Session Tracking

Pass the same `session_id` across all turns of a conversation to enable cumulative risk tracking. The service maintains action history across turns, which Cedar policies can reference.

```rust
let session_id = format!("sess_{}_{}", user_id, conversation_id);

let resp = client.guard().evaluate(&GuardRequest {
    content: turn_content.to_string(),
    content_type: ContentType::Prompt,
    action: "process_prompt".to_string(),
    session_id: Some(session_id.clone()),
    ..Default::default()
}).await?;

if let Some(delta) = &resp.session_delta {
    println!("Turn {}, cumulative risk: {:.2}", delta.turn_count, delta.cumulative_risk);
}
```
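
An application might act on the cumulative risk reported in the session delta. The sketch below is illustrative only: `SketchDelta` is a stand-in carrying just the `cumulative_risk` field read in the example above, `SessionAction` and `next_action` are hypothetical, and the thresholds are placeholders because this README does not define the scale of `cumulative_risk`.

```rust
/// Stand-in carrying only the risk field read in the example above.
struct SketchDelta {
    cumulative_risk: f64,
}

/// Hypothetical follow-up actions an application might take.
#[derive(Debug, PartialEq)]
enum SessionAction {
    Continue,
    Review,
    Terminate,
}

/// Build a session ID in the same shape as the example above.
fn session_id(user_id: &str, conversation_id: &str) -> String {
    format!("sess_{}_{}", user_id, conversation_id)
}

/// Choose an action from the latest delta.
/// `review_at` and `terminate_at` are caller-chosen thresholds (assumed values).
fn next_action(delta: Option<&SketchDelta>, review_at: f64, terminate_at: f64) -> SessionAction {
    match delta {
        // No delta in the response: nothing to act on.
        None => SessionAction::Continue,
        Some(d) if d.cumulative_risk >= terminate_at => SessionAction::Terminate,
        Some(d) if d.cumulative_risk >= review_at => SessionAction::Review,
        Some(_) => SessionAction::Continue,
    }
}
```

Keeping the ID construction in one function helps ensure every turn of a conversation sends an identical `session_id`.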

---

## Multi-Project Support

Pass `account_id` and `project_id` to scope all requests to a specific project:

```rust
let client = Highflame::new(
    HighflameOptions::new("hf_sk_...")
        .account_id("acc_123")
        .project_id("proj_456"),
);
```

---

## Client Options

`HighflameOptions` uses a builder pattern. All methods after `new()` are optional.

```rust
use std::time::Duration;

let opts = HighflameOptions::new("hf_sk_...")
    .base_url("https://shield.internal.example.com")
    .token_url("https://auth.internal.example.com/api/cli-auth/token")
    .timeout(Duration::from_secs(10))
    .max_retries(1)
    .account_id("acc_123")
    .project_id("proj_456");

let client = Highflame::new(opts);
```

| Method | Default | Description |
|--------|---------|-------------|
| `new(api_key)` | — | Service key (`hf_sk_...`) or raw JWT |
| `.base_url(url)` | Highflame SaaS | Guard service URL |
| `.token_url(url)` | Highflame SaaS | Token exchange URL |
| `.timeout(duration)` | `30s` | Per-request timeout |
| `.max_retries(n)` | `2` | Retries on transient errors |
| `.account_id(id)` | — | Override tenant account ID |
| `.project_id(id)` | — | Override tenant project ID |
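
For readers unfamiliar with consuming builders, the pattern can be modelled in a few lines. `SketchOptions` below is a hypothetical stand-in, not the SDK's `HighflameOptions`; it covers only a subset of the options and reuses the default values from the table (30-second timeout, 2 retries) to show how unset options fall back to defaults.

```rust
use std::time::Duration;

/// Stand-in options type; the SDK's real struct is not reproduced here.
#[derive(Debug, Clone, PartialEq)]
struct SketchOptions {
    api_key: String,
    timeout: Duration,
    max_retries: u32,
    account_id: Option<String>,
    project_id: Option<String>,
}

impl SketchOptions {
    /// Only the key is required; everything else starts at its default.
    fn new(api_key: impl Into<String>) -> Self {
        SketchOptions {
            api_key: api_key.into(),
            timeout: Duration::from_secs(30),
            max_retries: 2,
            account_id: None,
            project_id: None,
        }
    }

    /// Each setter takes `self` by value and returns it, so calls chain.
    fn timeout(mut self, timeout: Duration) -> Self {
        self.timeout = timeout;
        self
    }

    fn max_retries(mut self, n: u32) -> Self {
        self.max_retries = n;
        self
    }

    fn account_id(mut self, id: impl Into<String>) -> Self {
        self.account_id = Some(id.into());
        self
    }

    fn project_id(mut self, id: impl Into<String>) -> Self {
        self.project_id = Some(id.into());
        self
    }
}
```

Because each setter consumes and returns the value, a partially configured options value can be stored and extended later without mutable bindings.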