vllora_core 0.1.14

AI gateway core for vLLora.

Lightweight, Real-time Debugging for AI Agents

Debug your agents in real time. Trace, analyze, and optimize instantly. Works seamlessly with LangChain, Google ADK, OpenAI, and all major frameworks.

Quick Start

First, install Homebrew if you haven't already, then:

brew tap vllora/vllora
brew install vllora

Start vLLora:

vllora

The server will start on http://localhost:9090 and the UI will be available at http://localhost:9091.

vLLora exposes an OpenAI-compatible chat completions API, so when your AI agents make calls through vLLora, it automatically collects traces and debugging information for every interaction.

Send Your First Request

  1. Configure API Keys: Visit http://localhost:9091 to configure your AI provider API keys through the UI
  2. Make a request to see debugging in action:
curl http://localhost:9090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "What is the capital of France?"}]
  }'
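
If your agent is written in Rust, the same request can be routed through vLLora with any OpenAI-compatible client. The following is a minimal sketch, assuming the async-openai crate (not part of vLLora itself); the only change from talking to OpenAI directly is the base URL:

use async_openai::{
    config::OpenAIConfig,
    types::{ChatCompletionRequestUserMessageArgs, CreateChatCompletionRequestArgs},
    Client,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Point the client at vLLora instead of api.openai.com; vLLora traces
    // the call and forwards it using the provider keys configured in its UI.
    let client = Client::with_config(
        OpenAIConfig::new().with_api_base("http://localhost:9090/v1"),
    );

    let request = CreateChatCompletionRequestArgs::default()
        .model("gpt-4o-mini")
        .messages([ChatCompletionRequestUserMessageArgs::default()
            .content("What is the capital of France?")
            .build()?
            .into()])
        .build()?;

    let response = client.chat().create(request).await?;
    println!("{}", response.choices[0].message.content.as_deref().unwrap_or(""));
    Ok(())
}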

Rust streaming example (OpenAI-compatible)

In llm/examples/openai_stream_basic/src/main.rs you can find a minimal Rust example that:

  • Builds an OpenAI-style request using CreateChatCompletionRequestArgs (sketched just after this list) with:
    • model("gpt-4.1-mini")
    • a system message: "You are a helpful assistant."
    • a user message: "Stream numbers 1 to 20 in separate lines."
  • Constructs a VlloraLLMClient and configures credentials via:
export VLLORA_OPENAI_API_KEY="your-openai-compatible-key"
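
The request construction looks roughly like this (a sketch assuming the async-openai types that CreateChatCompletionRequestArgs comes from):

use async_openai::types::{
    ChatCompletionRequestSystemMessageArgs, ChatCompletionRequestUserMessageArgs,
    CreateChatCompletionRequestArgs,
};

let openai_req = CreateChatCompletionRequestArgs::default()
    .model("gpt-4.1-mini")
    .messages([
        ChatCompletionRequestSystemMessageArgs::default()
            .content("You are a helpful assistant.")
            .build()?
            .into(),
        ChatCompletionRequestUserMessageArgs::default()
            .content("Stream numbers 1 to 20 in separate lines.")
            .build()?
            .into(),
    ])
    .build()?;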

Inside the example, the client is created roughly as:

let client = VlloraLLMClient::new()
    .with_credentials(Credentials::ApiKey(ApiKeyCredentials {
        api_key: std::env::var("VLLORA_OPENAI_API_KEY")
            .expect("VLLORA_OPENAI_API_KEY must be set")
    }));

Then it streams the completion using the original OpenAI-style request:

// `stream.next()` needs a StreamExt import, typically:
use futures::StreamExt;

let mut stream = client
    .completions()
    .create_stream(openai_req)
    .await?;

// Print each streamed content delta as it arrives.
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    for choice in chunk.choices {
        if let Some(delta) = choice.delta.content {
            print!("{delta}");
        }
    }
}

This will print the streamed response chunks (in this example, numbers 1 to 20) to stdout as they arrive.
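
Assuming the example is its own package in the workspace (the package name below is a guess based on the path above), it can be run from the repository root with:

cargo run -p openai_stream_basic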

Features

Real-time Tracing - Monitor AI agent interactions as they happen with live observability of calls, tool interactions, and agent workflow. See exactly what your agents are doing in real-time.

MCP Support - Full support for Model Context Protocol (MCP) servers, enabling seamless integration with external tools over HTTP and SSE connections.

Development

To get started with development:

  1. Clone the repository:
git clone https://github.com/vllora/vllora.git
cd vllora

  2. Build the project:
cargo build --release

The binary will be available at target/release/vllora.

  3. Run tests:
cargo test

License

vLLora is fair-code distributed under the Elastic License 2.0 (ELv2).

  • Source Available: The vLLora source code is always visible
  • Self-Hostable: Deploy vLLora anywhere you need
  • Extensible: Add your own providers, tools, MCP servers, and custom functionality

For Enterprise License, contact us at hello@vllora.dev.

Additional information about the license model can be found in the docs.

Contributing

We welcome contributions! Please check out our Contributing Guide for guidelines on:

  • How to submit issues
  • How to submit pull requests
  • Code style conventions
  • Development workflow
  • Testing requirements