kowalski-tools 0.5.0

# Kowalski Tools

A comprehensive toolkit for the Kowalski AI agent framework, providing specialized tools for data processing, code analysis, web scraping, document processing, and more.

## Description

Kowalski Tools is a modular collection of specialized tools designed to extend the capabilities of the Kowalski AI agent framework. Each tool implements the `Tool` trait from `kowalski-core` and provides domain-specific functionality for various data processing and analysis tasks.

## Dependencies

### Core Dependencies
- **serde** (1.0) - Serialization/deserialization with derive features
- **serde_json** (1.0) - JSON serialization/deserialization
- **thiserror** (2.0) - Error handling utilities
- **anyhow** (1.0) - Error propagation utilities
- **async-trait** - Async trait support
- **tracing** - Logging and diagnostics
- **tokio** (1.32) - Async runtime with full features
- **chrono** - Date and time utilities
- **url** - URL parsing and manipulation

### Tool-Specific Dependencies
- **reqwest** (0.12) - HTTP client for web tools
- **scraper** (0.23) - HTML parsing and CSS selector support
- **lopdf** (0.36) - PDF processing library
- **csv** (1.1) - CSV parsing and processing

### Development Dependencies
- **mockall** (0.13) - Mocking framework for testing
- **wiremock** (0.6.0-rc.3) - HTTP mocking for integration tests

## Features

The module supports optional features that can be enabled:
- `web` - Web scraping and search functionality
- `pdf` - PDF document processing
- `data` - CSV and data analysis tools
- `code` - Code analysis tools

## Architecture

### Core Structure
```kowalski-tools/
├── src/
│   ├── lib.rs          # Main library entry point
│   ├── tool.rs         # Tool manager and utilities
│   ├── data.rs         # CSV and data processing tools
│   ├── code.rs         # Code analysis tools
│   ├── web/            # Web-related tools
│   │   ├── mod.rs
│   │   ├── search.rs   # Web search functionality
│   │   └── scrape.rs   # Web scraping functionality
│   └── document/       # Document processing tools
│       ├── mod.rs
│       └── pdf.rs      # PDF processing
```

### Design Patterns
- **Trait-based Architecture**: All tools implement the `Tool` trait from `kowalski-core`
- **Async Support**: Tools are designed for async execution using `async-trait`
- **Error Handling**: Comprehensive error handling using `KowalskiError` types
- **Parameter Validation**: Built-in parameter validation and type checking
- **Metadata Support**: Tools provide execution metadata for debugging and logging

### Tool Manager
The `ToolManager` provides centralized tool registration and execution:
- Tool registration and discovery
- Parameter validation
- Async execution handling
- Tool listing and metadata

## Tools

### CSV Tool
**Location**: `src/data.rs`  
**Tool Name**: `csv_tool`

A comprehensive CSV processing and analysis tool with statistical capabilities.

**Features**:
- Proper CSV parsing with error handling
- Statistical analysis (min, max, average, sum for numeric columns)
- Text analysis (unique counts, most common values)
- Configurable limits (max_rows, max_columns)
- Two task types: `process_csv` and `analyze_csv`

**Parameters**:
- `content` (required): CSV content to process
- `max_rows` (optional): Maximum number of rows to process
- `max_columns` (optional): Maximum number of columns to process

**Output**: JSON with headers, records, and statistical summary

### Code Analysis Tools
**Location**: `src/code.rs`

#### Java Analysis Tool
**Tool Name**: `java_analysis`

Comprehensive Java code analysis with metrics and quality suggestions.

**Features**:
- Code metrics (lines, characters, words, classes, methods, imports)
- Cyclomatic complexity calculation
- Code quality suggestions
- Basic syntax validation
- Complexity level assessment (Low/Medium/High)

**Parameters**:
- `content` (required): Java code to analyze

**Output**: JSON with metrics, complexity analysis, suggestions, and syntax errors

#### Python Analysis Tool
**Tool Name**: `python_analysis`

Python-specific code analysis with PEP 8 compliance checking.

**Features**:
- Python-specific metrics and analysis
- PEP 8 style guide compliance checking
- Function and class detection
- Import analysis
- Code quality suggestions

**Parameters**:
- `content` (required): Python code to analyze

**Output**: JSON with Python-specific metrics, PEP 8 violations, and suggestions

#### Rust Analysis Tool
**Tool Name**: `rust_analysis`

Rust code analysis with safety and performance considerations.

**Features**:
- Rust-specific syntax and safety analysis
- Ownership and borrowing pattern detection
- Performance optimization suggestions
- Error handling analysis
- Cargo.toml dependency analysis

**Parameters**:
- `content` (required): Rust code to analyze

**Output**: JSON with Rust-specific analysis, safety checks, and optimization suggestions

### Web Tools
**Location**: `src/web/`

#### Web Search Tool
**Tool Name**: `web_search`

Performs web searches using multiple search providers.

**Features**:
- Multiple search provider support (DuckDuckGo, Serper)
- Configurable result count
- Provider selection via parameters
- Environment-based API key configuration

**Parameters**:
- `query` (required): Search query string
- `num_results` (optional): Number of results (default: 3)
- `provider` (optional): Search provider (default: duckduckgo)

**Output**: JSON with search results and metadata

#### Web Scrape Tool
**Tool Name**: `web_scrape`

Scrapes web pages using CSS selectors with recursive link following.

**Features**:
- CSS selector-based content extraction
- Recursive link following with depth control
- HTML and text content extraction
- Error handling for network issues
- Configurable scraping depth

**Parameters**:
- `url` (required): URL to scrape
- `selectors` (required): Array of CSS selectors
- `follow_links` (optional): Follow links recursively (default: false)
- `max_depth` (optional): Maximum recursion depth (default: 1)

**Output**: JSON array of extracted content with metadata

### Document Tools
**Location**: `src/document/`

#### PDF Tool
**Tool Name**: `pdf_tool`

Comprehensive PDF processing with text, metadata, and image extraction.

**Features**:
- Text extraction from PDF pages
- Metadata extraction (title, author, creation date, etc.)
- Image extraction capabilities
- PDF structure analysis
- Error handling for corrupted files

**Parameters**:
- `file_path` (required): Path to PDF file
- `extract_text` (optional): Extract text content (default: true)
- `extract_metadata` (optional): Extract metadata (default: false)
- `extract_images` (optional): Extract images (default: false)

**Output**: JSON with extracted content based on selected options

## Usage Examples

### Basic Tool Usage
```rust
use kowalski_tools::{CsvTool, ToolManager};

let mut manager = ToolManager::new();
let csv_tool = CsvTool::new(1000, 50);
manager.register_tool(csv_tool);

// Execute CSV analysis
let input = ToolInput {
    task_type: "analyze_csv".to_string(),
    content: csv_content,
    parameters: HashMap::new(),
};

let result = manager.execute_tool("csv_tool", input).await?;
```

### Web Search Example
```rust
use kowalski_tools::web::WebSearchTool;

let search_tool = WebSearchTool::new("duckduckgo".to_string());
let input = ToolInput {
    task_type: "search".to_string(),
    content: "".to_string(),
    parameters: {
        let mut params = HashMap::new();
        params.insert("query".to_string(), json!("Rust programming"));
        params.insert("num_results".to_string(), json!(5));
        params
    },
};

let result = search_tool.execute(input).await?;
```

## Error Handling

All tools use the `KowalskiError` type for consistent error handling:
- `ToolExecution` - Errors during tool execution
- `ToolConfig` - Configuration errors
- `ContentProcessing` - Data processing errors
- `Execution` - General execution errors

## Testing

The module includes comprehensive tests for each tool:
- Unit tests for individual tool functionality
- Integration tests for tool interactions
- Mock-based testing for external dependencies
- HTTP mocking for web tool testing

## Future Enhancements

- Additional code analysis tools for more languages
- Enhanced PDF processing with OCR capabilities
- Database connectivity tools
- Machine learning model integration
- Real-time data streaming tools
- Advanced web scraping with JavaScript rendering