cloudllm 0.4.2

A Rust library for bridging applications with remote LLMs across various platforms.
0.4.0 OCT/13/2025
 - feat: Multi-Agent Council System
   - New council.rs module implementing collaborative multi-agent orchestration
   - Five council modes for different collaboration patterns:
     - Parallel: All agents process the prompt simultaneously; responses are aggregated
     - RoundRobin: Agents take sequential turns building on previous responses
     - Moderated: Agents submit proposals, moderator synthesizes final answer
     - Hierarchical: Lead agent coordinates, specialists handle specific aspects
     - Debate: Agents discuss and challenge each other until convergence
   - Agent identity system with name, expertise, personality, and optional tool access
   - Conversation history tracking with CouncilMessage and round metadata
   - CouncilResponse includes final answer, message history, rounds executed, convergence score, and total tokens used
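
   For orientation, a minimal usage sketch follows. The constructor and method names here (Agent::new, Council::new, run) are assumptions for illustration, not the crate's confirmed API; council.rs and its doc comments hold the real signatures.

   ```rust
   // Hypothetical sketch of a two-agent debate council. Agent::new,
   // Council::new, and run() are assumed names; only the concepts
   // (modes, identities, CouncilResponse fields) come from this changelog.
   use std::sync::Arc;

   async fn debate_demo(client: Arc<dyn ClientWrapper>) -> Result<(), Box<dyn std::error::Error>> {
       // Each agent carries a name, expertise, and personality.
       let optimist = Agent::new(client.clone(), "Ada", "systems engineering", "optimistic");
       let skeptic = Agent::new(client.clone(), "Raj", "risk analysis", "skeptical");

       // Debate mode runs rounds until convergence or a round limit.
       let council = Council::new(vec![optimist, skeptic], CouncilMode::Debate);
       let response = council.run("How should we sequester a gigaton of CO2?").await?;

       println!("final answer: {}", response.final_answer);
       println!("rounds: {}, convergence: {:?}",
                response.rounds_executed, response.convergence_score);
       Ok(())
   }
   ```
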
 - feat: Tool Protocol Abstraction Layer (tool_protocol.rs)
   - Flexible abstraction for connecting agents to various tool protocols
   - ToolProtocol trait with execute(), list_tools(), get_tool_metadata()
   - Support for MCP (Model Context Protocol), custom functions, and user-defined protocols
   - ToolResult struct with success status, output, error, and execution metadata
   - ToolParameter system with support for String, Number, Integer, Boolean, Array, Object types
   - ToolMetadata with parameter definitions and protocol-specific metadata
   - ToolRegistry for centralized tool management
   - Tool and ToolError types for type-safe tool operations
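
   The trait's rough shape, as described above; the method names match this changelog, but the argument and return types are assumptions, so treat this as a sketch of tool_protocol.rs rather than its literal contents:

   ```rust
   // Sketch only: method names come from the changelog, signatures are guesses.
   use async_trait::async_trait;

   #[async_trait]
   pub trait ToolProtocol: Send + Sync {
       /// Run a named tool with JSON parameters.
       async fn execute(&self, name: &str, params: serde_json::Value)
           -> Result<ToolResult, ToolError>;

       /// Enumerate every tool this protocol exposes.
       async fn list_tools(&self) -> Result<Vec<ToolMetadata>, ToolError>;

       /// Fetch one tool's description and parameter definitions.
       async fn get_tool_metadata(&self, name: &str)
           -> Result<ToolMetadata, ToolError>;
   }
   ```
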
 - feat: Tool Adapters (tool_adapters.rs)
   - CustomToolAdapter: Execute user-defined Rust closures as tools
   - MCPToolAdapter: Integration with Model Context Protocol servers
   - OpenAIToolAdapter: Compatible with OpenAI function calling format
   - All adapters implement async ToolProtocol trait
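
   As an example of the closure path, a hypothetical registration might look like the following; the CustomToolAdapter constructor and register() signature are assumptions:

   ```rust
   // Hypothetical: CustomToolAdapter::new/register are illustrative names.
   let mut adapter = CustomToolAdapter::new();
   adapter.register(
       "add",                // tool name the LLM will call
       "Adds two integers",  // description surfaced to the model
       |params: serde_json::Value| async move {
           let a = params["a"].as_i64().unwrap_or(0);
           let b = params["b"].as_i64().unwrap_or(0);
           Ok(serde_json::json!({ "sum": a + b }))
       },
   );
   ```
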
 - feat: Automatic Tool Execution in Agent Generation
   - Agents automatically discover and execute tools during response generation
   - Tool information injected into system prompts with name, description, and parameters
   - JSON-based tool calling format: {"tool_call": {"name": "...", "parameters": {...}}}
   - Automatic tool execution loop with max 5 iterations to prevent infinite loops
   - Tool results fed back to LLM as user messages for continued generation
   - Token usage tracked cumulatively across all LLM calls and tool executions
   - New AgentResponse struct returns both content and token usage
   - Agent::generate_with_tokens() method for internal use with token tracking
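
   Conceptually, the loop looks like the sketch below; call_llm, parse_tool_call, and execute_tool stand in for the crate's internals and are not real cloudllm functions:

   ```rust
   // Illustrative reconstruction of the automatic tool-execution loop.
   const MAX_TOOL_ITERATIONS: usize = 5; // hard cap against runaway loops

   async fn generate(mut history: Vec<Message>) -> Result<String, Box<dyn std::error::Error>> {
       let mut reply = call_llm(&history).await?;
       for _ in 0..MAX_TOOL_ITERATIONS {
           // Look for {"tool_call": {"name": ..., "parameters": ...}}.
           let Some(call) = parse_tool_call(&reply) else { break };
           let result = execute_tool(&call.name, &call.parameters).await?;
           // Feed the tool output back as a user message, then keep generating.
           history.push(Message::user(result.output));
           reply = call_llm(&history).await?;
       }
       Ok(reply)
   }
   ```
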
 - feat: Convergence Detection for Debate Mode
   - Jaccard similarity-based convergence detection for debate termination
   - Compares word sets between consecutive debate rounds
   - Configurable convergence threshold (default: 0.75 / 75% similarity)
   - Early termination when agents reach consensus, saving tokens and cost
   - Convergence score returned in CouncilResponse for inspection
   - calculate_convergence_score() and jaccard_similarity() helper methods
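
   The metric itself is standard; below is a faithful paraphrase of a jaccard_similarity() helper over whitespace-split word sets (the crate's version may tokenize differently):

   ```rust
   use std::collections::HashSet;

   /// Jaccard similarity of two texts' word sets: |A ∩ B| / |A ∪ B|, in [0.0, 1.0].
   fn jaccard_similarity(a: &str, b: &str) -> f64 {
       let set_a: HashSet<&str> = a.split_whitespace().collect();
       let set_b: HashSet<&str> = b.split_whitespace().collect();
       if set_a.is_empty() && set_b.is_empty() {
           return 1.0; // two empty texts are trivially identical
       }
       let intersection = set_a.intersection(&set_b).count() as f64;
       let union = set_a.union(&set_b).count() as f64;
       intersection / union
   }
   // Debate stops early once consecutive rounds score >= the threshold (0.75).
   ```
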
 - feat: Token Usage Tracking in Council Modes
   - Parallel mode tracks and aggregates tokens from all concurrent agents
   - RoundRobin mode accumulates tokens across sequential turns
   - Token usage includes all LLM calls plus tool execution overhead
   - CouncilResponse.total_tokens_used provides complete cost visibility
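
   Reading the total is a one-liner; the surrounding call is illustrative, but the field name comes straight from this release:

   ```rust
   // total_tokens_used aggregates every LLM call and tool execution.
   let response = council.run(prompt).await?; // hypothetical call site
   eprintln!("council cost: {} tokens", response.total_tokens_used);
   ```
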
 - feat: Comprehensive Multi-Agent Tutorial (COUNCIL_TUTORIAL.md)
   - Cookbook-style tutorial with progressive complexity
   - Five detailed recipes demonstrating each council mode
   - Real-world carbon capture strategy problem domain
   - Examples use multiple LLM providers (OpenAI, Claude, Gemini, Grok)
   - Up to 5 agents per example with distinct expertise and personalities
   - Tool integration example with MCPToolAdapter
   - Best practices guide for agent design and council mode selection
   - Troubleshooting section with common issues and solutions
   - Complete multi-stage pipeline example combining multiple modes
 - test: Added comprehensive test coverage
   - test_agent_with_tool_execution: Validates tool discovery, execution, and result integration
   - test_debate_mode_convergence: Validates convergence detection with mock agents
   - test_parallel_execution: Tests concurrent agent execution
   - test_round_robin_execution: Tests sequential turn-taking
   - test_moderated_execution: Tests proposal aggregation
   - test_hierarchical_execution: Tests lead-specialist coordination
   - test_debate_execution: Tests debate discussion flow
   - All 12 tests passing
 - refactor: Code quality improvements
   - Fixed all compiler warnings
   - Switched from std::sync::Mutex to tokio::sync::Mutex for async compatibility (see the sketch after this list)
   - Removed unused imports and assignments
   - Improved error handling in council execution paths
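
   The Mutex swap matters because tokio's lock is await-friendly; a minimal sketch (the helper and slot layout are illustrative, TokenUsage is the crate's struct):

   ```rust
   use tokio::sync::Mutex;

   // tokio::sync::Mutex locks without blocking a runtime thread, and its
   // guard can be held across .await points, unlike std::sync::MutexGuard.
   async fn record_usage(slot: &Mutex<Option<TokenUsage>>, usage: TokenUsage) {
       *slot.lock().await = Some(usage); // async lock acquisition
   }
   ```
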
 - docs: Added inline documentation for council system and tool protocol
   - Module-level documentation with architecture diagrams
   - Example code in doc comments
   - Detailed parameter and return value documentation

0.3.0 OCT/11/2025
 - feat: Add first-class streaming support for LLM responses
   - New MessageChunk type for incremental content delivery
   - ClientWrapper::send_message_stream() method for streaming responses
   - Implemented in OpenAIClient and GrokClient (GrokClient delegates to OpenAIClient)
   - LLMSession::send_message_stream() for session-aware streaming
   - Dramatically reduces perceived latency: users see tokens as they arrive
   - Streaming returns Stream<Item = Result<MessageChunk, Box<dyn Error>>>
   - Backward compatible: existing non-streaming code unchanged
   - Added streaming_example.rs demonstrating usage
   - Added futures-util dependency for Stream trait
 - Note: Token usage tracking not available for streaming responses
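
   Consumption follows the usual Stream pattern; a sketch, assuming a chunk exposes its text as a content field (the real layout lives in MessageChunk):

   ```rust
   use futures_util::StreamExt; // the dependency added for the Stream trait

   // Hypothetical call site; send_message_stream is named above, but the
   // argument list and MessageChunk's fields are assumptions here.
   let mut stream = session.send_message_stream(&messages).await?;
   while let Some(chunk) = stream.next().await {
       match chunk {
           Ok(part) => print!("{}", part.content), // tokens render as they arrive
           Err(e) => eprintln!("stream error: {e}"),
       }
   }
   ```
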
 - Optimized LLMSession conversation history with VecDeque to eliminate O(N²) insert/remove churn (see the sketch at the end of this section)
 - Minimized string formatting overhead in hot-path error logs
 - Replaced std::sync::Mutex with tokio::sync::Mutex for async-friendly token usage tracking
 - Changed ClientWrapper::send_message to accept &[Message] instead of Vec<Message>
 - refactor: Move all client tests to external integration test files
 - feat: make token usage retrieval async and clean up Claude/OpenAI client usage
 - Implemented token count caching for efficient message trimming
 - Implemented pre-transmission trimming to reduce network overhead and latency
 - Introduced arena/bump allocation for message bodies
 - Reused request buffers in LLMSession to avoid repeated allocations on each send
 - OpenAI and Gemini clients now preallocate formatted_messages with Vec::with_capacity
 - feat: Add model_name() to ClientWrapper and LLMSession
 - feat: Implement persistent HTTP connection pooling for all provider clients
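
The VecDeque change called out above comes down to where elements leave the buffer; a sketch with assumed Message and estimated_tokens names:

```rust
use std::collections::VecDeque;

// Trimming history from the front is O(1) per pop with VecDeque, whereas
// Vec::remove(0) shifts every remaining element: O(N) per removal and
// O(N²) over a long session. (Message and estimated_tokens are assumed.)
let mut history: VecDeque<Message> = VecDeque::new();
history.push_back(incoming);                    // newest turn at the back
while estimated_tokens(&history) > max_tokens {
    history.pop_front();                        // oldest turn drops in O(1)
}
```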

0.2.12 SEP/21/2025
 - Added Claude client implementation at src/cloudllm/clients/claude.rs:
   - ClaudeClient struct follows the same delegate pattern as GrokClient, using OpenAIClient internally
   - Supports Anthropic API base URL (https://api.anthropic.com/v1)
   - Includes 6 Claude model variants: Claude35Sonnet20241022, Claude35Haiku20241022, Claude3Opus20240229, Claude35Sonnet20240620, Claude3Sonnet20240229, Claude3Haiku20240307
   - Implements all standard constructor methods and ClientWrapper trait
   - Added a test function that reads the CLAUDE_API_KEY environment variable
   - Updated README.md to mark Claude as supported (removed "Coming Soon")
   - Added Claude example to interactive_session.rs example
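
Putting the pieces together, construction might look like this; the exact constructor arguments are assumptions, but the model enum variant and the Arc-based LLMSession come from this changelog:

```rust
use std::sync::Arc;

// Hypothetical wiring: ClaudeClient wraps OpenAIClient internally and
// points it at https://api.anthropic.com/v1 (the delegate pattern above).
let key = std::env::var("CLAUDE_API_KEY")?;
let client = Arc::new(ClaudeClient::new(&key, Model::Claude35Sonnet20241022));
let session = LLMSession::new(client, system_prompt, max_tokens);
```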

0.2.10 SEP/21/2025
 - New Grok model enums added to the GrokClient:
   - grok-4-fast-reasoning
   - grok-4-fast-non-reasoning
   - grok-code-fast-1

0.2.9
 - New OpenAI model enums added to the OpenAIClient:
   - gpt-5
   - gpt-5-mini
   - gpt-5-nano
   - gpt-5-chat-latest
 - Upgraded tokio to 1.47.1

0.2.8
 - Bumped cloudllm version to 0.2.8
 - Upgraded tokio dependency from 1.44.5 to 1.46.1
 - Updated Grok client model names and enums in src/cloudllm/clients/grok.rs:
  - Renamed Grok3MiniFastBeta to Grok3MiniFast, Grok3MiniBeta to Grok3Mini, Grok3FastBeta to Grok3Fast, Grok3Beta to Grok3, and Grok3Latest to Grok4_0709
  - Updated model_to_string function to reflect new model names
  - Changed test client initialization to use Grok4_0709 instead of Grok3Latest
 - Updated Gemini client model names and enums in src/cloudllm/clients/gemini.rs:
  - Renamed Gemini25FlashPreview0520 to Gemini25Flash and Gemini25ProPreview0506 to Gemini25Pro to reflect stable releases
  - Added new model enum Gemini25FlashLitePreview0617 for lightweight preview model
  - Updated model_to_string function to map new enum names: gemini-2.5-flash, gemini-2.5-pro, and gemini-2.5-flash-lite-preview-06-17

0.2.7
- Bumped cloudllm version to 0.2.7
- Upgraded openai-rust2 dependency from 1.5.9 to 1.6.0
- Extended ChatArguments and client wrappers for search and tool support:
  - Added `SearchParameters` struct and `with_search_parameters()` builder to `openai_rust::chat::ChatArguments`
  - Added `ToolType` enum and `Tool` struct, plus `tools` field and `with_tools()` builder (snake_case serialization)
  - Updated `ClientWrapper::send_message` signature to accept `optional_search_parameters: Option<SearchParameters>`
  - Modified `clients/common.rs` `send_and_track()` to take and inject `optional_search_parameters`
  - Updated `OpenAIClient`, `GeminiClient`, and `GrokClient` to forward `optional_search_parameters` to `send_and_track`
  - Exposed `optional_search_parameters` through `LLMSession::send_message` and its callers
- Other updates:
  - Added `Grok3Latest` variant to `grok::Model` enum and updated test to use it
  - Ensured backward compatibility: all existing call sites default `optional_search_parameters` to `None`
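
A sketch of the builder flow; SearchParameters' fields aren't listed here, so assume search_params and tools were built elsewhere, and treat the ChatArguments::new argument list as an assumption following openai-rust's convention:

```rust
// with_search_parameters and with_tools are the builders named above.
let args = openai_rust::chat::ChatArguments::new("grok-3", messages)
    .with_search_parameters(search_params) // attach live-search config
    .with_tools(tools);                    // serialized in snake_case
```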

0.2.6
- Implemented token usage tracking across the LLMSession and ClientWrapper trait, including:
  - New TokenUsage struct for standardized tracking of input, output, and total tokens.
  - LLMSession now accumulates actual token usage after each message.
  - LLMSession::token_usage() method added for external inspection.
  - ClientWrapper trait extended with get_last_usage() (default: None) and new usage_slot() hook.
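
  A plausible shape for the struct, with field names inferred from the description above (the crate's actual names may differ):

  ```rust
  // Assumed field names for "input, output, and total tokens".
  #[derive(Clone, Copy, Debug, Default)]
  pub struct TokenUsage {
      pub input_tokens: u32,
      pub output_tokens: u32,
      pub total_tokens: u32,
  }
  // Callers inspect the running total via LLMSession::token_usage().
  ```
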
- Refactored token usage handling in OpenAIClient and GeminiClient:
  - Introduced a common send_and_track helper in clients/common.rs to centralize usage capture logic.
  - OpenAIClient and GeminiClient now store usage in an internal Mutex<Option<TokenUsage>>.
  - Redundant get_last_usage() implementations removed; only usage_slot() is overridden.
  - Added multiple constructors to GeminiClient: support for model strings, enums, and base URL configuration.
- Improved LLMSession context management:
  - Added max_tokens field with get_max_tokens() accessor.
  - Prunes conversation history using estimated token count per message.
  - Precise control over total_context_tokens and total_token_count.
- Example interactive_session.rs refactored to:
  - Demonstrate integration with OpenAIClient, GrokClient, and GeminiClient.
  - Show token usage in real-time after each LLM response.
  - Test max_tokens pruning logic with visible metrics.
- Added model variants to GeminiClient enum:
  - Gemini25FlashPreview0520
  - Gemini25ProPreview0506
- Cleaned up and reorganized internal code:
  - Moved constructors and imports for clarity.
  - Removed redundant comments and unused stub code.
- Updated README example (interactive_session.md) with new usage patterns.

0.2.5
- Bumped tokio from 1.44.2 to 1.44.5
- Updated openai-rust2 from 1.5.8 to 1.5.9, with updated support for image generation models

0.2.4
- New enums for OpenAI client: gpt-4.1, gpt-4.1-mini, gpt-4.1-nano
- Example in interactive_session.rs now uses gpt-4.1-mini

0.2.3
- Modified LLMSession to use Arc<dyn ClientWrapper> instead of generic T: ClientWrapper, enabling dynamic selection of client implementations (e.g., OpenAIClient, GrokClient, GeminiClient) at runtime.
- Updated LLMSession::new to accept Arc<dyn ClientWrapper>, removing Arc::new wrapping inside the constructor.
- Adjusted tests in gemini.rs and grok.rs to use Arc::new and non-generic LLMSession.
- Updated interactive_session.rs example to wrap client in Arc::new.
- Added init_logger function in lib.rs for thread-safe logger initialization using env_logger and Once.
- Replaced env_logger::try_init with crate::init_logger in gemini.rs and grok.rs tests for consistency.
- Updated GrokClient test to use Grok3MiniFastBeta enum variant.
- Updated LLMSession documentation to reflect Arc::new usage.
- Updated ClientWrapper trait to require Send + Sync, ensuring Arc<dyn ClientWrapper> is thread-safe and LLMSession can be used in async tasks (e.g., Tokio spawn).
- Enables safe dynamic client selection in multithreaded contexts
- All tests pass with the new implementation
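
The practical payoff is runtime client selection; a sketch with assumed constructor and LLMSession::new arguments (Arc<dyn ClientWrapper> itself is the change described above):

```rust
use std::sync::Arc;

// Any provider client can be chosen at runtime behind the trait object.
let client: Arc<dyn ClientWrapper> = match provider.as_str() {
    "openai" => Arc::new(OpenAIClient::new(&key, openai_model)),
    "grok" => Arc::new(GrokClient::new(&key, grok_model)),
    _ => Arc::new(GeminiClient::new(&key, gemini_model)),
};
// Send + Sync on the trait lets the session move into a Tokio task.
let session = LLMSession::new(client, system_prompt, max_tokens);
tokio::spawn(async move { /* drive the session here */ });
```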

0.2.2
- New Grok3 enums available for the GrokClient
- Dependencies updated; code formatted with cargo fmt

0.2.1
- Added new enums for O4Mini, O4MiniHigh, and O3
- New enum for gpt-4.5-preview for the OpenAIClient

0.1.9 - feb.26.2025
- Update dependencies: tokio to 1.43.0, async-trait to 0.1.86, log to 0.4.26
- Refactor send_message in ClientWrapper to remove opt_url_path parameter
- Update GeminiClient to use openai_rust::Client and handle Gemini API directly
- Adjust OpenAIClient, GrokClient, LLMSession, and examples to new send_message signature

0.1.8 - feb.26.2025
- documentation updates

0.1.7 - feb.26.2025
- Update README.md to reflect support for Gemini
- Introduce Model enum in openai.rs for OpenAI models with model_to_string function
- Modify ClientWrapper trait to include optional URL path for API requests
- Update GeminiClient to use new ClientWrapper signature and set base URL to 'https://generativelanguage.googleapis.com/v1beta/'
- Adjust GrokClient to align with updated ClientWrapper signature
- Enhance OpenAIClient with Model enum support and optional URL paths in send_message
- Update LLMSession to pass optional URL path to client in send_message
- Revise examples/interactive_session.rs to use Model enum and new client methods
- Increment openai-rust2 dependency to version 1.5.8
- Fix minor formatting and improve error logging in clients

0.1.6 - feb.23.2025
- Introduced GrokClient in src/cloudllm/clients/grok.rs with support for multiple Grok models.
- Implemented the ClientWrapper trait for GrokClient to enable message sending via OpenAIClient.
- Added a test (test_grok_client) demonstrating basic usage and integration.
- Updated src/cloudllm/clients/mod.rs to include the new grok module.

0.1.5 - jan.23.2025
- Removed the original `openai-rust` dependency.
- Added `openai-rust2` as a new dependency, pointing to a custom fork with improvements.
- Added a new constructor `new_with_base_url` to allow specifying a custom base URL
- Ensured all modules are properly referenced and re-exported for future scalability.