harmony-protocol 0.1.0
# Harmony Protocol

A reverse-engineered Rust library implementing OpenAI's Harmony response format for structured conversational AI interactions.

## ⚠️ **IMPORTANT: This Library Requires an OpenAI Model**

**This library does NOT include an AI model.** It provides the conversation formatting and parsing layer that works with OpenAI's models that understand the Harmony format. You still need:

- **OpenAI API access** or compatible model
- A model that understands Harmony formatting (`<|start|>`, `<|message|>`, `<|end|>` tokens)
- Integration code to send formatted tokens to the model and receive responses

**What this library does:** Formats conversations → **[Your OpenAI Model]** → Parses responses

## Overview

This library provides a complete implementation of the Harmony response format used by OpenAI's open-weight model series (gpt-oss). It enables parsing and rendering of structured conversations with support for:

- **Multiple communication channels** (analysis, commentary, final)
- **Tool calling and function integration**
- **Reasoning effort control**
- **Streaming token parsing**
- **System and developer instructions**

## Key Features

### 🚀 **High Performance**
- Rust-based core with minimal overhead
- Thread-local regex optimization
- Efficient tokenization with BPE encoding
- Memory-efficient streaming parser

### 🔧 **Flexible Architecture**
- Support for multiple encoding configurations
- Extensible tool system with namespaces
- Configurable channel routing
- Role-based message validation

### 🌐 **Multi-Platform Support**
- Native Rust library
- Python bindings (PyO3) with full API compatibility
- WebAssembly support with interactive demo
- Cross-platform vocabulary download and caching

### 📊 **Production Ready**
- Comprehensive test suite (13 tests passing)
- Performance benchmarks for all operations
- Graceful error handling and network failure recovery
- 4 detailed examples with documentation
- Thread-safe concurrent processing

## Quick Start

### Installation

Add to your `Cargo.toml`:

```toml
[dependencies]
harmony-protocol = { git = "https://github.com/yourusername/harmony-protocol" }
```

### Basic Usage

```rust
use harmony_protocol::{
    load_harmony_encoding, HarmonyEncodingName,
    chat::{Role, Message, Conversation, SystemContent}
};

fn main() -> anyhow::Result<()> {
    // Load the encoding
    let enc = load_harmony_encoding(HarmonyEncodingName::HarmonyGptOss)?;

    // Create a conversation
    let convo = Conversation::from_messages([
        Message::from_role_and_content(
            Role::System,
            SystemContent::new()
                .with_required_channels(["analysis", "commentary", "final"])
        ),
        Message::from_role_and_content(Role::User, "Hello, world!"),
    ]);

    // Render for completion (ready to send to OpenAI model)
    let input_tokens = enc.render_conversation_for_completion(&convo, Role::Assistant, None)?;
    println!("Generated {} tokens ready for OpenAI model", input_tokens.len());

    // TODO: Send input_tokens to your OpenAI model and get response_tokens
    // let response_tokens = your_openai_client.complete(input_tokens).await?;

    // Parse the model's response back to structured messages
    // let messages = enc.parse_messages_from_completion_tokens(response_tokens, Some(Role::Assistant))?;

    Ok(())
}
```

### With Tool Support

```rust
use harmony_protocol::chat::{
    SystemContent, ToolDescription, ToolNamespaceConfig, Message, Role
};

fn main() -> anyhow::Result<()> {
    let tools = vec![
        ToolDescription::new(
            "calculate",
            "Performs mathematical calculations",
            Some(serde_json::json!({
                "type": "object",
                "properties": {
                    "expression": {"type": "string"}
                },
                "required": ["expression"]
            }))
        )
    ];

    let function_namespace = ToolNamespaceConfig::new("functions", None, tools);

    let system_content = SystemContent::new()
        .with_browser_tool()
        .with_tools(function_namespace);

    let message = Message::from_role_and_content(Role::System, system_content);
    Ok(())
}
```

### Streaming Parser

```rust
use harmony_protocol::{StreamableParser, load_harmony_encoding, HarmonyEncodingName};
use harmony_protocol::chat::Role;

fn main() -> anyhow::Result<()> {
    let encoding = load_harmony_encoding(HarmonyEncodingName::HarmonyGptOss)?;
    let mut parser = StreamableParser::new(encoding.clone(), Some(Role::Assistant))?;

    // In practice, response_tokens would come from your OpenAI model's streaming API
    let response_tokens = vec![200006, 1234, 5678]; // These would be from OpenAI

    // Process tokens as they arrive from the model
    for token in response_tokens {
        parser.process(token)?;

        // Get content delta for real-time streaming UI updates
        if let Ok(Some(delta)) = parser.last_content_delta() {
            print!("{}", delta); // Show new content to user immediately
        }
    }

    // Get final structured messages after streaming is complete
    let messages = parser.into_messages();
    println!("\nParsed {} messages from model output", messages.len());
    Ok(())
}
```

## Message Format

The Harmony format structures conversations using special tokens:

```text
<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06

Reasoning: medium

# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|>
<|start|>user<|message|>What is 2 + 2?<|end|>
<|start|>assistant<|channel|>analysis<|message|>I need to perform a simple arithmetic calculation.<|end|>
<|start|>assistant<|channel|>final<|message|>2 + 2 equals 4.<|end|>
```

## Channel System

The library supports multiple communication channels for organized model outputs:

- **analysis**: Internal reasoning and analysis
- **commentary**: Model explanations and meta-commentary
- **final**: User-facing final responses

Channels can be marked as required, and the parser automatically drops analysis messages once a final response is complete.
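To illustrate the analysis-dropping rule, here is a minimal, self-contained sketch using plain structs rather than the library's own types (the `ChannelMessage` type and `drop_analysis` function are hypothetical, for illustration only):

```rust
// Hypothetical sketch of the analysis-dropping rule, not the library's implementation.
#[derive(Debug, Clone, PartialEq)]
struct ChannelMessage {
    channel: &'static str,
    content: String,
}

/// Once a `final` message exists, earlier `analysis` messages can be dropped;
/// if no final response has arrived yet, everything is kept.
fn drop_analysis(messages: Vec<ChannelMessage>) -> Vec<ChannelMessage> {
    let has_final = messages.iter().any(|m| m.channel == "final");
    if !has_final {
        return messages;
    }
    messages
        .into_iter()
        .filter(|m| m.channel != "analysis")
        .collect()
}

fn main() {
    let msgs = vec![
        ChannelMessage { channel: "analysis", content: "simple arithmetic".into() },
        ChannelMessage { channel: "final", content: "2 + 2 equals 4.".into() },
    ];
    let kept = drop_analysis(msgs);
    assert_eq!(kept.len(), 1);
    assert_eq!(kept[0].channel, "final");
    println!("{:?}", kept);
}
```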

## Tool Integration

### Built-in Tool Namespaces

1. **Browser Tools**: Web browsing, search, and content extraction
2. **Python Tools**: Code execution environment
3. **Function Tools**: Custom function definitions

### Custom Tools

```rust
use harmony_protocol::chat::ToolDescription;

fn main() {
    let custom_tool = ToolDescription::new(
        "weather",
        "Gets current weather for a location",
        Some(serde_json::json!({
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }))
    );

    println!("Created custom tool: {}", custom_tool.name);
}
```

## Architecture

```text
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Chat Module   │    │ Encoding Module │    │ Registry Module │
│                 │    │                 │    │                 │
│ • Message       │◄──►│ • Rendering     │◄──►│ • Configurations│
│ • Conversation  │    │ • Parsing       │    │ • Token Mappings│
│ • Content Types │    │ • Streaming     │    │ • Vocab Loading │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         ▲                       ▲
         │                       │
         ▼                       ▼
┌─────────────────┐    ┌─────────────────┐
│ Tiktoken Module │    │    Extensions   │
│                 │    │                 │
│ • BPE Encoding  │    │ • Public Vocabs │
│ • Tokenization  │    │ • Hash Verify   │
│ • Thread Safety │    │ • Remote Loading│
└─────────────────┘    └─────────────────┘
```

## Special Tokens

| Token | ID | Purpose |
|-------|-----|---------|
| `<|start|>` | 200006 | Message start marker |
| `<|message|>` | 200008 | Content start marker |
| `<|end|>` | 200007 | Message end marker |
| `<|channel|>` | 200005 | Channel specification |
| `<|call|>` | 200012 | Tool call end marker |
| `<|return|>` | 200002 | Training completion |
| `<|constrain|>` | 200003 | Constrained format |
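Using only the IDs from the table, message framing can be sketched with plain `Vec<u32>` operations. This is an illustrative, self-contained example, not the library's encoder; the role/content IDs (`1234`, `5678`) are placeholders for what a real BPE tokenizer would produce:

```rust
// Special-token IDs from the table above.
const START: u32 = 200006;   // <|start|>
const MESSAGE: u32 = 200008; // <|message|>
const END: u32 = 200007;     // <|end|>

/// Frame a single message: <|start|> role <|message|> content <|end|>
fn frame_message(role_tokens: &[u32], content_tokens: &[u32]) -> Vec<u32> {
    let mut out = vec![START];
    out.extend_from_slice(role_tokens);
    out.push(MESSAGE);
    out.extend_from_slice(content_tokens);
    out.push(END);
    out
}

fn main() {
    // 1234 / 5678 stand in for BPE-encoded role and content text.
    let framed = frame_message(&[1234], &[5678]);
    assert_eq!(framed, vec![200006, 1234, 200008, 5678, 200007]);
    println!("{:?}", framed);
}
```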

## Configuration

### Environment Variables

- `TIKTOKEN_ENCODINGS_BASE`: Custom vocabulary file directory
- `TIKTOKEN_RS_CACHE_DIR`: Custom cache directory
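For example, to work offline with pre-downloaded vocabulary files (the paths shown are placeholders for your own directories):

```shell
# Point the loader at local vocabulary files instead of downloading them
export TIKTOKEN_ENCODINGS_BASE=/path/to/encodings
# Redirect the download cache to a writable location
export TIKTOKEN_RS_CACHE_DIR=/path/to/cache
```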

### Features

- `python-binding`: Enable PyO3 Python bindings
- `wasm-binding`: Enable WebAssembly support

## Performance

- **Context Window**: 1,048,576 tokens (1M)
- **Max Action Length**: 524,288 tokens (512K)
- **Thread-Safe**: Optimized for concurrent access
- **Memory Efficient**: Token reuse and streaming parsing

## Testing

```bash
# Run all tests (13 tests covering unit + integration)
cargo test

# Run performance benchmarks
cargo bench

# Run specific examples
cargo run --example basic_usage
cargo run --example tool_integration
cargo run --example streaming_parser
cargo run --example channel_management
```

The test suite includes comprehensive validation against canonical examples and edge cases.

## Examples

The library includes 4 comprehensive examples:

1. **`basic_usage.rs`** - Message creation and conversation rendering
2. **`tool_integration.rs`** - Custom tools and function calling
3. **`streaming_parser.rs`** - Real-time token processing
4. **`channel_management.rs`** - Multi-channel workflows

See [`examples/README.md`](examples/README.md) for detailed usage instructions.

## Python Bindings

```bash
cd python
python setup.py build_rust
pip install -e .
```

```python
import harmony_protocol as hr

# Same API as Rust, but in Python
encoding = hr.load_harmony_encoding(hr.HarmonyEncodingName.harmony_gpt_oss())
conversation = hr.Conversation.from_messages([
    hr.Message.from_role_and_content(hr.Role.user(), "Hello!")
])
tokens = encoding.render_conversation(conversation)
```

## WebAssembly Demo

```bash
cd www
npm run build  # Requires wasm-pack
npm run serve  # Open http://localhost:8000
```

## Benchmarks

Run benchmarks to see performance characteristics:

```bash
cargo bench
```

Results show the library can handle:
- **Large conversations**: 1000+ messages efficiently
- **Real-time streaming**: Process tokens as they arrive from model
- **Concurrent access**: Thread-safe for multiple conversations

## Contributing

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Ensure all tests pass
5. Submit a pull request

## Documentation

- [API Reference](docs/api-reference.md)
- [Reverse Engineering Notes](docs/reverse-engineering-notes.md)

## License

This project is licensed under the Apache License 2.0.

## Disclaimer

This is a reverse-engineered implementation for educational and research purposes. It is not affiliated with or endorsed by OpenAI.