Harmony Protocol
A reverse-engineered Rust library implementing OpenAI's Harmony response format for structured conversational AI interactions.
⚠️ IMPORTANT: This Library Requires an OpenAI Model
This library does NOT include an AI model. It provides the conversation formatting and parsing layer for OpenAI models that understand the Harmony format. You still need:
- OpenAI API access or compatible model
- A model that understands Harmony formatting (`<|start|>`, `<|message|>`, `<|end|>` tokens)
- Integration code to send formatted tokens to the model and receive responses
What this library does: Formats conversations → [Your OpenAI Model] → Parses responses
Overview
This library provides a complete implementation of the Harmony response format used by OpenAI's open-weight model series (gpt-oss). It enables parsing and rendering of structured conversations with support for:
- Multiple communication channels (analysis, commentary, final)
- Tool calling and function integration
- Reasoning effort control
- Streaming token parsing
- System and developer instructions
Key Features
🚀 High Performance
- Rust-based core with minimal overhead
- Thread-local regex optimization
- Efficient tokenization with BPE encoding
- Memory-efficient streaming parser
🔧 Flexible Architecture
- Support for multiple encoding configurations
- Extensible tool system with namespaces
- Configurable channel routing
- Role-based message validation
🌐 Multi-Platform Support
- Native Rust library
- Python bindings (PyO3) with full API compatibility
- WebAssembly support with interactive demo
- Cross-platform vocabulary download and caching
📊 Production Ready
- Comprehensive test suite (13 tests passing)
- Performance benchmarks for all operations
- Graceful error handling and network failure recovery
- 4 detailed examples with documentation
- Thread-safe concurrent processing
Quick Start
Installation
Add to your Cargo.toml:
```toml
[dependencies]
harmony-protocol = { git = "https://github.com/yourusername/harmony-protocol" }
```
Basic Usage
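A minimal, self-contained sketch of the idea: structured messages rendered into the Harmony token format. The `Role`, `Message`, and `render` names below are illustrative stand-ins, not this crate's actual API.

```rust
// Illustrative sketch only: these types stand in for the crate's
// chat/encoding modules and are not its actual public items.
enum Role {
    System,
    User,
    Assistant,
}

impl Role {
    fn as_str(&self) -> &'static str {
        match self {
            Role::System => "system",
            Role::User => "user",
            Role::Assistant => "assistant",
        }
    }
}

struct Message {
    role: Role,
    channel: Option<&'static str>,
    content: String,
}

/// Render a conversation into the Harmony token format shown below.
fn render(messages: &[Message]) -> String {
    let mut out = String::new();
    for m in messages {
        out.push_str("<|start|>");
        out.push_str(m.role.as_str());
        if let Some(ch) = m.channel {
            out.push_str("<|channel|>");
            out.push_str(ch);
        }
        out.push_str("<|message|>");
        out.push_str(&m.content);
        out.push_str("<|end|>");
    }
    out
}

fn main() {
    let convo = vec![
        Message { role: Role::User, channel: None, content: "What is 2 + 2?".into() },
        Message { role: Role::Assistant, channel: Some("final"), content: "2 + 2 equals 4.".into() },
    ];
    println!("{}", render(&convo));
}
```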
With Tool Support
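A self-contained sketch of what a tool-call round trip looks like at the token level. The `to=` recipient syntax and `<|constrain|>json` marker shown here are inferred from the format, not copied from this crate's API, so treat them as assumptions.

```rust
// Illustrative sketch of a tool-call round trip in the Harmony token
// format. The assistant emits a call on the commentary channel, and the
// tool's result is fed back into the conversation.
fn render_tool_call(recipient: &str, args_json: &str) -> String {
    format!(
        "<|start|>assistant<|channel|>commentary to={recipient} \
         <|constrain|>json<|message|>{args_json}<|call|>"
    )
}

fn render_tool_result(tool: &str, result_json: &str) -> String {
    format!("<|start|>{tool} to=assistant<|channel|>commentary<|message|>{result_json}<|end|>")
}

fn main() {
    // Assistant asks a hypothetical `functions.get_weather` tool for data...
    let call = render_tool_call("functions.get_weather", r#"{"city":"Tokyo"}"#);
    // ...and the tool's JSON result is rendered back for the model.
    let ret = render_tool_result("functions.get_weather", r#"{"temp_c":21}"#);
    println!("{call}\n{ret}");
}
```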
Streaming Parser
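A self-contained sketch of incremental parsing: chunks are buffered as they arrive, and a message is emitted whenever an `<|end|>` token completes one. The `StreamParser` type below is an illustrative stand-in, not the crate's streaming API.

```rust
// Illustrative sketch of incremental parsing: buffer chunks as they
// arrive and emit a message each time an <|end|> token completes one.
struct StreamParser {
    buf: String,
    messages: Vec<String>,
}

impl StreamParser {
    fn new() -> Self {
        Self { buf: String::new(), messages: Vec::new() }
    }

    /// Feed the next chunk of model output; completed messages are
    /// collected, partial input stays buffered until more arrives.
    fn feed(&mut self, chunk: &str) {
        self.buf.push_str(chunk);
        while let Some(end) = self.buf.find("<|end|>") {
            let msg: String = self.buf.drain(..end + "<|end|>".len()).collect();
            self.messages.push(msg);
        }
    }
}

fn main() {
    let mut p = StreamParser::new();
    // Chunk boundaries can fall anywhere, even mid-token.
    for chunk in ["<|start|>assistant<|channel|>final<|message|>Hi", "!<|e", "nd|>"] {
        p.feed(chunk);
    }
    assert_eq!(p.messages.len(), 1);
    println!("{}", p.messages[0]);
}
```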
Message Format
The Harmony format structures conversations using special tokens:
```
<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
Reasoning: medium
# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|>
<|start|>user<|message|>What is 2 + 2?<|end|>
<|start|>assistant<|channel|>analysis<|message|>I need to perform a simple arithmetic calculation.<|end|>
<|start|>assistant<|channel|>final<|message|>2 + 2 equals 4.<|end|>
```
Channel System
The library supports multiple communication channels for organized model outputs:
- analysis: Internal reasoning and analysis
- commentary: Model explanations and meta-commentary
- final: User-facing final responses
Channels can be configured as required, and the system automatically handles analysis dropping when final responses are complete.
Tool Integration
Built-in Tool Namespaces
- Browser Tools: Web browsing, search, and content extraction
- Python Tools: Code execution environment
- Function Tools: Custom function definitions
Custom Tools
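`ToolDescription`'s actual fields are not documented in this README, so the struct below is a hypothetical shape showing how a custom tool might be described and rendered into a namespace block for the model:

```rust
// Hypothetical sketch: `ToolDescription`'s real fields are not shown in
// this README; this struct and renderer only illustrate the idea of
// describing custom tools under a namespace.
struct ToolDescription {
    name: &'static str,
    description: &'static str,
    /// TypeScript-ish parameter signature shown to the model.
    params: &'static str,
}

fn render_namespace(namespace: &str, tools: &[ToolDescription]) -> String {
    let mut out = format!("namespace {namespace} {{\n");
    for t in tools {
        out.push_str(&format!(
            "  // {}\n  type {} = (_: {}) => any;\n",
            t.description, t.name, t.params
        ));
    }
    out.push('}');
    out
}

fn main() {
    let tools = [ToolDescription {
        name: "lookup_order",
        description: "Look up an order by id.",
        params: "{ order_id: string }",
    }];
    println!("{}", render_namespace("functions", &tools));
}
```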
Architecture
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Chat Module   │     │ Encoding Module │     │ Registry Module │
│                 │     │                 │     │                 │
│ • Message       │◄──►│ • Rendering     │◄──►│ • Configurations│
│ • Conversation  │     │ • Parsing       │     │ • Token Mappings│
│ • Content Types │     │ • Streaming     │     │ • Vocab Loading │
└─────────────────┘     └─────────────────┘     └─────────────────┘
        ▲                       ▲
        │                       │
        ▼                       ▼
┌─────────────────┐     ┌─────────────────┐
│ Tiktoken Module │     │   Extensions    │
│                 │     │                 │
│ • BPE Encoding  │     │ • Public Vocabs │
│ • Tokenization  │     │ • Hash Verify   │
│ • Thread Safety │     │ • Remote Loading│
└─────────────────┘     └─────────────────┘
```
Special Tokens
| Token | Purpose |
|---|---|
| `<\|start\|>` | Begins a message and introduces its role |
| `<\|message\|>` | Separates the message header from its content |
| `<\|end\|>` | Marks the end of a message |
| `<\|channel\|>` | Introduces the message's channel (analysis, commentary, final) |
| `<\|call\|>` | Marks a message as a tool call |
| `<\|return\|>` | Marks the end of a complete response |
| `<\|constrain\|>` | Constrains the content type of a tool-call payload (e.g. json) |
Configuration
Environment Variables
- `TIKTOKEN_ENCODINGS_BASE`: Custom vocabulary file directory
- `TIKTOKEN_RS_CACHE_DIR`: Custom cache directory
Features
- `python-binding`: Enable PyO3 Python bindings
- `wasm-binding`: Enable WebAssembly support
Performance
- Context Window: 1,048,576 tokens (1M)
- Max Action Length: 524,288 tokens (512K)
- Thread-Safe: Optimized for concurrent access
- Memory Efficient: Token reuse and streaming parsing
Testing
```shell
# Run all tests (13 tests covering unit + integration)
cargo test

# Run performance benchmarks
cargo bench

# Run a specific example
cargo run --example basic_usage
```
The test suite includes comprehensive validation against canonical examples and edge cases.
Examples
The library includes 4 comprehensive examples:
- `basic_usage.rs` - Message creation and conversation rendering
- `tool_integration.rs` - Custom tools and function calling
- `streaming_parser.rs` - Real-time token processing
- `channel_management.rs` - Multi-channel workflows
See examples/README.md for detailed usage instructions.
Python Bindings
The bindings expose the same API as the Rust library; build them with the `python-binding` feature enabled.
WebAssembly Demo
Benchmarks
Run the benchmarks (for example, `cargo bench`) to see performance characteristics.
Results show the library can handle:
- Large conversations: 1000+ messages efficiently
- Real-time streaming: Process tokens as they arrive from model
- Concurrent access: Thread-safe for multiple conversations
Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
Documentation
License
This project is licensed under the Apache License 2.0.
Disclaimer
This is a reverse-engineered implementation for educational and research purposes. It is not affiliated with or endorsed by OpenAI.