rig-cat 0.1.2

LLM agent framework built on comp-cat-rs: typed effects, no async, categorical foundations

rig-cat

An LLM agent framework built on comp-cat-rs. No async, no tokio. All effects are Io<Error, A>, all concurrency is Fiber, all streaming is Stream.

Installation

[dependencies]
rig-cat = "0.1"

Quick start

use rig_cat::provider::openai::{OpenAiCompletion, ApiKey, ModelName};
use rig_cat::agent::AgentBuilder;

let model = OpenAiCompletion::new(
    ApiKey::new("sk-...".into()),
    ModelName::new("gpt-4o".into()),
);

let agent = AgentBuilder::new(model)
    .preamble("You are a helpful assistant.")
    .temperature(0.7)
    .build();

// Nothing runs until .run()
let response = agent.prompt("What is a Kan extension?").run();

Architecture

model/          CompletionModel trait, Message, Request/Response types
embedding/      EmbeddingModel trait, Embedding type, cosine similarity
tool/           Tool trait, generic Toolbox<T>
vector_store/   VectorStoreIndex trait, InMemoryVectorStore
agent/          Agent<M, T> with builder pattern
pipeline/       RAG pipeline via Io::flat_map chains
provider/       OpenAI, Anthropic implementations
error/          Hand-rolled Error enum

Every function that touches the network returns Io<Error, A>. Composition happens via map, flat_map, zip. Side effects only happen when you call .run().
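
To make the deferred-execution claim concrete, here is a minimal sketch built on a hypothetical stand-in type, LazyIo. It is not the comp-cat-rs Io implementation, whose internals are not shown in this README. It only illustrates how map and flat_map can build up a description of work that does nothing until run is called. The names LazyIo, suspend, and demo_lazy are assumptions made for illustration.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Stand-in for a lazy effect type; NOT the comp-cat-rs `Io`.
/// It wraps a boxed closure that is only invoked by `run`.
pub struct LazyIo<E, A> {
    thunk: Box<dyn FnOnce() -> Result<A, E>>,
}

impl<E: 'static, A: 'static> LazyIo<E, A> {
    /// Wrap a computation without executing it.
    pub fn suspend(f: impl FnOnce() -> Result<A, E> + 'static) -> Self {
        LazyIo { thunk: Box::new(f) }
    }

    /// Lift an already-computed value into the effect type.
    pub fn pure(a: A) -> Self {
        Self::suspend(move || Ok(a))
    }

    /// Transform the eventual result; nothing executes yet.
    pub fn map<B: 'static>(self, f: impl FnOnce(A) -> B + 'static) -> LazyIo<E, B> {
        LazyIo::suspend(move || (self.thunk)().map(f))
    }

    /// Sequence a dependent effect; nothing executes yet.
    pub fn flat_map<B: 'static>(
        self,
        f: impl FnOnce(A) -> LazyIo<E, B> + 'static,
    ) -> LazyIo<E, B> {
        LazyIo::suspend(move || (self.thunk)().and_then(|a| f(a).run()))
    }

    /// Execute the whole chain. This is the only place side effects happen.
    pub fn run(self) -> Result<A, E> {
        (self.thunk)()
    }
}

/// Returns (effect count before run, effect count after run, result).
pub fn demo_lazy() -> (u32, u32, Result<usize, String>) {
    let calls = Rc::new(Cell::new(0u32));
    let counter = Rc::clone(&calls);

    // Hypothetical "network call": here it only bumps a counter.
    let fetch = LazyIo::<String, String>::suspend(move || {
        counter.set(counter.get() + 1);
        Ok("hello".to_string())
    });

    // Building the chain does not invoke the closure above.
    let program = fetch
        .map(|s| s.to_uppercase())
        .flat_map(|s| LazyIo::pure(s.len()));

    let before = calls.get();
    let result = program.run();
    let after = calls.get();
    (before, after, result)
}
```

In this sketch the counter is untouched while the chain is being assembled and is incremented exactly once by run, which is the behaviour the paragraph above describes for Io.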

Providers

OpenAI

use rig_cat::provider::openai::{OpenAiCompletion, OpenAiEmbedding, ApiKey, ModelName};

let completion = OpenAiCompletion::new(
    ApiKey::new(api_key.clone()), // clone: api_key is reused for the embedding model below
    ModelName::new("gpt-4o".into()),
);

let embedding = OpenAiEmbedding::new(
    ApiKey::new(api_key),
    ModelName::new("text-embedding-3-small".into()),
);

Anthropic

use rig_cat::provider::anthropic::{AnthropicCompletion, ApiKey, ModelName};

let model = AnthropicCompletion::new(
    ApiKey::new(api_key),
    ModelName::new("claude-sonnet-4-20250514".into()),
    4096, // max_tokens
);

Tools

Toolbox<T> is generic over a single concrete tool type. To mix heterogeneous tools in one toolbox, wrap them in an enum that implements Tool. A single-tool example:

use rig_cat::tool::{Tool, ToolDefinition, Toolbox};
use comp_cat_rs::effect::io::Io;
use serde_json::Value;

struct Calculator;

impl Tool for Calculator {
    fn definition(&self) -> ToolDefinition {
        ToolDefinition::new(
            "calculate".into(),
            "Evaluate a math expression".into(),
            serde_json::json!({"type": "object", "properties": {"expr": {"type": "string"}}}),
        )
    }

    // Stub: ignores its arguments and always returns 42.
    fn call(&self, _args: Value) -> Io<rig_cat::error::Error, Value> {
        Io::pure(serde_json::json!({"result": 42}))
    }
}

let toolbox = Toolbox::new().with_tool(Calculator);
let result = toolbox.invoke("calculate", serde_json::json!({"expr": "6 * 7"})).run();
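
The enum approach for heterogeneous tools can be sketched as follows. To stay self-contained, this uses a hypothetical stand-in trait, SimpleTool, with string input and output instead of the rig-cat Tool trait, and a hypothetical Registry<T> instead of Toolbox<T>. None of these names come from rig-cat; the point is only the dispatch pattern.

```rust
/// Stand-in for a tool trait; NOT the rig-cat `Tool` trait.
pub trait SimpleTool {
    fn name(&self) -> &'static str;
    fn call(&self, input: &str) -> Result<String, String>;
}

pub struct Adder;
pub struct Shouter;

impl SimpleTool for Adder {
    fn name(&self) -> &'static str {
        "add"
    }
    /// Parses "a,b" and returns the sum as text.
    fn call(&self, input: &str) -> Result<String, String> {
        let (a, b) = input.split_once(',').ok_or("expected \"a,b\"")?;
        let a: i64 = a.trim().parse().map_err(|e| format!("bad lhs: {e}"))?;
        let b: i64 = b.trim().parse().map_err(|e| format!("bad rhs: {e}"))?;
        Ok((a + b).to_string())
    }
}

impl SimpleTool for Shouter {
    fn name(&self) -> &'static str {
        "shout"
    }
    fn call(&self, input: &str) -> Result<String, String> {
        Ok(input.to_uppercase())
    }
}

/// One concrete type that can hold either tool.
pub enum AnyTool {
    Add(Adder),
    Shout(Shouter),
}

/// The enum implements the trait by delegating to the wrapped tool.
impl SimpleTool for AnyTool {
    fn name(&self) -> &'static str {
        match self {
            AnyTool::Add(t) => t.name(),
            AnyTool::Shout(t) => t.name(),
        }
    }
    fn call(&self, input: &str) -> Result<String, String> {
        match self {
            AnyTool::Add(t) => t.call(input),
            AnyTool::Shout(t) => t.call(input),
        }
    }
}

/// Minimal registry generic over one tool type, looked up by name.
pub struct Registry<T> {
    tools: Vec<T>,
}

impl<T: SimpleTool> Registry<T> {
    pub fn new() -> Self {
        Registry { tools: Vec::new() }
    }
    pub fn with_tool(mut self, tool: T) -> Self {
        self.tools.push(tool);
        self
    }
    pub fn invoke(&self, name: &str, input: &str) -> Result<String, String> {
        self.tools
            .iter()
            .find(|t| t.name() == name)
            .ok_or_else(|| format!("unknown tool: {name}"))?
            .call(input)
    }
}
```

Because the registry is instantiated at the single type AnyTool, dispatch is a plain match with no trait objects. The trade-off is that every new tool needs a new enum variant and two new match arms.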

RAG pipeline

The pipeline::rag function composes embedding, search, and generation into a single Io chain:

use std::rc::Rc;
use rig_cat::pipeline::rag;

let response = rag(
    "What is the return policy?".into(),
    Rc::new(completion_model),
    Rc::new(embedding_model),
    Rc::new(vector_store),
    Some("You are a customer service agent.".into()),
    3, // top_k
).run();
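
As a rough illustration of the search and prompt-assembly steps such a pipeline composes, the sketch below ranks documents by cosine similarity and builds a prompt from the top hits. It is written against plain slices and does not use rig-cat's VectorStoreIndex or pipeline::rag; the function names retrieve and build_prompt, and the prompt layout, are assumptions for illustration only.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Return the `k` document texts most similar to `query`, best first.
pub fn retrieve<'a>(query: &[f32], docs: &'a [(String, Vec<f32>)], k: usize) -> Vec<&'a str> {
    let mut scored: Vec<(f32, &str)> = docs
        .iter()
        .map(|(text, emb)| (cosine_similarity(query, emb), text.as_str()))
        .collect();
    // Sort by descending similarity; total_cmp gives a total order on f32.
    scored.sort_by(|l, r| r.0.total_cmp(&l.0));
    scored.into_iter().take(k).map(|(_, text)| text).collect()
}

/// Assemble a prompt from an optional preamble, retrieved context, and the question.
/// The layout here is hypothetical, not the one rig-cat uses.
pub fn build_prompt(preamble: Option<&str>, context: &[&str], question: &str) -> String {
    let mut out = String::new();
    if let Some(p) = preamble {
        out.push_str(p);
        out.push_str("\n\n");
    }
    out.push_str("Context:\n");
    for c in context {
        out.push_str("- ");
        out.push_str(c);
        out.push('\n');
    }
    out.push_str("\nQuestion: ");
    out.push_str(question);
    out
}
```

In a real pipeline the query vector would come from the embedding model and the assembled prompt would be passed to the completion model, with each step chained through flat_map so that nothing executes before run.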

Concurrency

Use comp-cat-rs Fiber for parallel LLM calls:

use comp_cat_rs::effect::fiber::par_zip;

let task_a = agent.prompt("Summarize document A");
let task_b = agent.prompt("Summarize document B");

// Both run on separate threads; `?` assumes the enclosing function returns a compatible Result
let (summary_a, summary_b) = par_zip(task_a, task_b).run()?;
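
For readers unfamiliar with the thread-per-task model, the following sketch shows the general idea using only std::thread. It is a hypothetical stand-in, not the comp-cat-rs par_zip or Fiber implementation: par_zip_sketch takes two closures instead of Io values, runs each on its own OS thread, and pairs the results.

```rust
use std::thread;
use std::time::Duration;

/// Stand-in for running two fallible tasks in parallel; NOT comp-cat-rs `par_zip`.
pub fn par_zip_sketch<A, B, E>(
    left: impl FnOnce() -> Result<A, E> + Send + 'static,
    right: impl FnOnce() -> Result<B, E> + Send + 'static,
) -> Result<(A, B), E>
where
    A: Send + 'static,
    B: Send + 'static,
    E: Send + 'static,
{
    // Each task gets its own OS thread.
    let h_left = thread::spawn(left);
    let h_right = thread::spawn(right);

    // Wait for both before inspecting errors, so no thread is left detached.
    // join() only fails if the task panicked; this sketch re-panics in that case.
    let a = h_left.join().expect("left task panicked");
    let b = h_right.join().expect("right task panicked");

    // The left error wins if both tasks failed.
    Ok((a?, b?))
}

/// Hypothetical usage: two slow "LLM calls" simulated with sleep.
pub fn demo_parallel() -> Result<(String, String), String> {
    par_zip_sketch(
        || {
            thread::sleep(Duration::from_millis(50));
            Ok("summary A".to_string())
        },
        || {
            thread::sleep(Duration::from_millis(50));
            Ok("summary B".to_string())
        },
    )
}
```

With a handful of calls that each take seconds, the cost of spawning one thread per call is negligible next to the network latency, which is the trade-off this crate makes.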

Why no async?

LLM API calls are high-latency (1-30 seconds) and low-concurrency (a handful of calls, not thousands), so one thread per request via Fiber is adequate. The benefit: no tokio, no Pin<Box<dyn Future>>, no colored functions. Everything composes with flat_map.

The categorical foundation

This crate is the practical application of the thesis proved in comp-cat-theory (Lean 4):

  • Io is a monad, which is a pair of Kan extensions
  • Stream is a colimit, which is a left Kan extension
  • Fiber::fork is a coproduct, Fiber::join is a limit
  • The RAG pipeline is a composition of monadic effects

The proofs are in Lean 4 with zero sorrys. The Rust code is the runtime implementation.

Status

Alpha. The core architecture is stable, but:

  • Streaming responses are not yet implemented (providers return Stream::empty())
  • The tool-calling loop (agent calls tool, feeds result back, repeats) is not yet implemented
  • Only two providers (OpenAI, Anthropic); more planned
  • Only in-memory vector store; external stores planned

License

MIT