Harmony Protocol
A reverse-engineered Rust library implementing OpenAI's Harmony response format for structured conversational AI interactions.
⚠️ IMPORTANT: This Library Requires an OpenAI Model
This library does NOT include an AI model. It provides the conversation formatting and parsing layer for OpenAI models that understand the Harmony format. You still need:
- OpenAI API access or compatible model
- A model that understands Harmony formatting (`<|start|>`, `<|message|>`, `<|end|>` tokens)
- Integration code to send formatted tokens to the model and receive responses
What this library does: Formats conversations → [Your OpenAI Model] → Parses responses
Overview
This library provides a complete implementation of the Harmony response format used by OpenAI's open-weight model series (gpt-oss). It enables parsing and rendering of structured conversations with support for:
- Multiple communication channels (analysis, commentary, final)
- Tool calling and function integration
- Reasoning effort control
- Streaming token parsing
- System and developer instructions
Key Features
🚀 High Performance
- Rust-based core with minimal overhead
- Thread-local regex optimization
- Efficient tokenization with BPE encoding
- Memory-efficient streaming parser
🔧 Flexible Architecture
- Support for multiple encoding configurations
- Extensible tool system with namespaces
- Configurable channel routing
- Role-based message validation
🌐 Multi-Platform Support
- Native Rust library
- Python bindings (PyO3) with full API compatibility
- WebAssembly support with interactive demo
- Cross-platform vocabulary download and caching
📊 Production Ready
- Comprehensive test suite (13 tests passing)
- Performance benchmarks for all operations
- Graceful error handling and network failure recovery
- 4 detailed examples with documentation
- Thread-safe concurrent processing
Quick Start
Installation
Add to your Cargo.toml:
```toml
[dependencies]
harmony-protocol = { git = "https://github.com/yourusername/harmony-protocol" }
```
Basic Usage
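A minimal, self-contained sketch of the idea: structured messages rendered into the Harmony token format. The `Role`, `Message`, and `render` names below are illustrative stand-ins, not this crate's actual API.

```rust
// Illustrative sketch only: these types stand in for the crate's
// chat/encoding modules and are not its actual public items.
enum Role {
    System,
    User,
    Assistant,
}

impl Role {
    fn as_str(&self) -> &'static str {
        match self {
            Role::System => "system",
            Role::User => "user",
            Role::Assistant => "assistant",
        }
    }
}

struct Message {
    role: Role,
    channel: Option<&'static str>,
    content: String,
}

/// Render a conversation into the Harmony token format shown below.
fn render(messages: &[Message]) -> String {
    let mut out = String::new();
    for m in messages {
        out.push_str("<|start|>");
        out.push_str(m.role.as_str());
        if let Some(ch) = m.channel {
            out.push_str("<|channel|>");
            out.push_str(ch);
        }
        out.push_str("<|message|>");
        out.push_str(&m.content);
        out.push_str("<|end|>");
    }
    out
}

fn main() {
    let convo = vec![
        Message { role: Role::User, channel: None, content: "What is 2 + 2?".into() },
        Message { role: Role::Assistant, channel: Some("final"), content: "2 + 2 equals 4.".into() },
    ];
    println!("{}", render(&convo));
}
```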
With Tool Support
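A self-contained sketch of what a tool-call round trip looks like at the token level. The `to=` recipient syntax and `<|constrain|>json` marker shown here are inferred from the format, not copied from this crate's API, so treat them as assumptions.

```rust
// Illustrative sketch of a tool-call round trip in the Harmony token
// format. The assistant emits a call on the commentary channel, and the
// tool's result is fed back into the conversation.
fn render_tool_call(recipient: &str, args_json: &str) -> String {
    format!(
        "<|start|>assistant<|channel|>commentary to={recipient} \
         <|constrain|>json<|message|>{args_json}<|call|>"
    )
}

fn render_tool_result(tool: &str, result_json: &str) -> String {
    format!("<|start|>{tool} to=assistant<|channel|>commentary<|message|>{result_json}<|end|>")
}

fn main() {
    // Assistant asks a hypothetical `functions.get_weather` tool for data...
    let call = render_tool_call("functions.get_weather", r#"{"city":"Tokyo"}"#);
    // ...and the tool's JSON result is rendered back for the model.
    let ret = render_tool_result("functions.get_weather", r#"{"temp_c":21}"#);
    println!("{call}\n{ret}");
}
```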
Streaming Parser
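A self-contained sketch of incremental parsing: chunks are buffered as they arrive, and a message is emitted whenever an `<|end|>` token completes one. The `StreamParser` type below is an illustrative stand-in, not the crate's streaming API.

```rust
// Illustrative sketch of incremental parsing: buffer chunks as they
// arrive and emit a message each time an <|end|> token completes one.
struct StreamParser {
    buf: String,
    messages: Vec<String>,
}

impl StreamParser {
    fn new() -> Self {
        Self { buf: String::new(), messages: Vec::new() }
    }

    /// Feed the next chunk of model output; completed messages are
    /// collected, partial input stays buffered until more arrives.
    fn feed(&mut self, chunk: &str) {
        self.buf.push_str(chunk);
        while let Some(end) = self.buf.find("<|end|>") {
            let msg: String = self.buf.drain(..end + "<|end|>".len()).collect();
            self.messages.push(msg);
        }
    }
}

fn main() {
    let mut p = StreamParser::new();
    // Chunk boundaries can fall anywhere, even mid-token.
    for chunk in ["<|start|>assistant<|channel|>final<|message|>Hi", "!<|e", "nd|>"] {
        p.feed(chunk);
    }
    assert_eq!(p.messages.len(), 1);
    println!("{}", p.messages[0]);
}
```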
Message Format
The Harmony format structures conversations using special tokens:
```
<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
Reasoning: medium
# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|>
<|start|>user<|message|>What is 2 + 2?<|end|>
<|start|>assistant<|channel|>analysis<|message|>I need to perform a simple arithmetic calculation.<|end|>
<|start|>assistant<|channel|>final<|message|>2 + 2 equals 4.<|end|>
```
Channel System
The library supports multiple communication channels for organized model outputs:
- analysis: Internal reasoning and analysis
- commentary: Model explanations and meta-commentary
- final: User-facing final responses
Channels can be configured as required, and the system automatically handles analysis dropping when final responses are complete.
Tool Integration
Built-in Tool Namespaces
- Browser Tools: Web browsing, search, and content extraction
- Python Tools: Code execution environment
- Function Tools: Custom function definitions
Custom Tools
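`ToolDescription`'s actual fields are not documented in this README, so the struct below is a hypothetical shape showing how a custom tool might be described and rendered into a namespace block for the model:

```rust
// Hypothetical sketch: `ToolDescription`'s real fields are not shown in
// this README; this struct and renderer only illustrate the idea of
// describing custom tools under a namespace.
struct ToolDescription {
    name: &'static str,
    description: &'static str,
    /// TypeScript-ish parameter signature shown to the model.
    params: &'static str,
}

fn render_namespace(namespace: &str, tools: &[ToolDescription]) -> String {
    let mut out = format!("namespace {namespace} {{\n");
    for t in tools {
        out.push_str(&format!(
            "  // {}\n  type {} = (_: {}) => any;\n",
            t.description, t.name, t.params
        ));
    }
    out.push('}');
    out
}

fn main() {
    let tools = [ToolDescription {
        name: "lookup_order",
        description: "Look up an order by id.",
        params: "{ order_id: string }",
    }];
    println!("{}", render_namespace("functions", &tools));
}
```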
Architecture
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Chat Module   │     │ Encoding Module │     │ Registry Module │
│                 │     │                 │     │                 │
│ • Message       │◄──►│ • Rendering     │◄──►│ • Configurations│
│ • Conversation  │     │ • Parsing       │     │ • Token Mappings│
│ • Content Types │     │ • Streaming     │     │ • Vocab Loading │
└─────────────────┘     └─────────────────┘     └─────────────────┘
        ▲                       ▲
        │                       │
        ▼                       ▼
┌─────────────────┐     ┌─────────────────┐
│ Tiktoken Module │     │   Extensions    │
│                 │     │                 │
│ • BPE Encoding  │     │ • Public Vocabs │
│ • Tokenization  │     │ • Hash Verify   │
│ • Thread Safety │     │ • Remote Loading│
└─────────────────┘     └─────────────────┘
```
Special Tokens
| Token | Purpose |
|---|---|
| `<\|start\|>` | Begins a message and introduces its role |
| `<\|message\|>` | Separates the message header from its content |
| `<\|end\|>` | Marks the end of a message |
| `<\|channel\|>` | Introduces the message's channel (analysis, commentary, final) |
| `<\|call\|>` | Marks a message as a tool call |
| `<\|return\|>` | Marks the end of a complete response |
| `<\|constrain\|>` | Constrains the content type of a tool-call payload (e.g. json) |
Configuration
Environment Variables
- `TIKTOKEN_ENCODINGS_BASE`: Custom vocabulary file directory
- `TIKTOKEN_RS_CACHE_DIR`: Custom cache directory
Features
- `python-binding`: Enable PyO3 Python bindings
- `wasm-binding`: Enable WebAssembly support
Performance
- Context Window: 1,048,576 tokens (1M)
- Max Action Length: 524,288 tokens (512K)
- Thread-Safe: Optimized for concurrent access
- Memory Efficient: Token reuse and streaming parsing
Testing
```shell
# Run all tests (13 tests covering unit + integration)
cargo test

# Run performance benchmarks
cargo bench

# Run a specific example
cargo run --example basic_usage
```
The test suite includes comprehensive validation against canonical examples and edge cases.
Examples
The library includes 4 comprehensive examples:
- `basic_usage.rs` - Message creation and conversation rendering
- `tool_integration.rs` - Custom tools and function calling
- `streaming_parser.rs` - Real-time token processing
- `channel_management.rs` - Multi-channel workflows
See examples/README.md for detailed usage instructions.
Python Bindings
The bindings expose the same API as the Rust library; build them with the `python-binding` feature enabled.
WebAssembly Demo
Benchmarks
Run the benchmarks (for example, `cargo bench`) to see performance characteristics.
Results show the library can handle:
- Large conversations: 1000+ messages efficiently
- Real-time streaming: Process tokens as they arrive from model
- Concurrent access: Thread-safe for multiple conversations
Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
Documentation
License
This project is licensed under the Apache License 2.0.
Disclaimer
This is a reverse-engineered implementation for educational and research purposes. It is not affiliated with or endorsed by OpenAI.