heroindex_client 0.1.3

# HeroIndex Client

[![Crates.io](https://img.shields.io/crates/v/heroindex_client.svg)](https://crates.io/crates/heroindex_client)
[![Documentation](https://docs.rs/heroindex_client/badge.svg)](https://docs.rs/heroindex_client)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Repository](https://img.shields.io/badge/repo-forge.ourworld.tf-blue)](https://forge.ourworld.tf/lhumina_research/hero_index_server)

A Rust client library for [HeroIndex](https://crates.io/crates/heroindex), a high-performance full-text search server built on Tantivy.

**Repository:** https://forge.ourworld.tf/lhumina_research/hero_index_server

> **Need the server?** Install it with `cargo install heroindex`, or see [heroindex on crates.io](https://crates.io/crates/heroindex).

## Features

- **Async/Await** - Built on Tokio for async operations
- **Type-Safe** - Strongly typed responses
- **Simple API** - Intuitive method names matching RPC calls
- **10 Field Types** - text, str, u64, i64, f64, date, bool, json, bytes, ip
- **9 Query Types** - match, term, fuzzy, phrase, prefix, range, regex, boolean, all
- **Batch Operations** - Efficient bulk document insertion

## Installation

Add to your `Cargo.toml`:

```toml
[dependencies]
heroindex_client = "0.1"
tokio = { version = "1", features = ["full"] }
serde_json = "1"
```

## Quick Start

```rust
use heroindex_client::HeroIndexClient;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), heroindex_client::Error> {
    // Connect to HeroIndex server
    let mut client = HeroIndexClient::connect("/tmp/heroindex.sock").await?;
    
    // Create a database with schema
    client.db_create("articles", json!({
        "fields": [
            {"name": "title", "type": "text", "stored": true, "indexed": true},
            {"name": "body", "type": "text", "stored": true, "indexed": true},
            {"name": "views", "type": "u64", "stored": true, "indexed": true, "fast": true}
        ]
    })).await?;
    
    // Select database and add documents
    client.db_select("articles").await?;
    client.doc_add(json!({"title": "Hello World", "body": "Welcome to search", "views": 100})).await?;
    client.commit().await?;
    client.reload().await?;
    
    // Search
    let results = client.search(json!({"type": "match", "field": "body", "value": "search"}), 10, 0).await?;
    println!("Found {} results", results.total_hits);
    
    Ok(())
}
```

## Field Types

When creating a schema, you can use these field types:

| Type | Description | Example Value | Use Case |
|------|-------------|---------------|----------|
| `text` | Full-text searchable, tokenized | `"Hello World"` | Articles, descriptions |
| `str` | Exact match keyword, not tokenized | `"user-123"` | IDs, tags, status |
| `u64` | Unsigned 64-bit integer | `42` | Counts, ages |
| `i64` | Signed 64-bit integer | `-10` | Scores, offsets |
| `f64` | 64-bit floating point | `3.14` | Prices, ratings |
| `date` | DateTime (RFC 3339 format) | `"2024-01-15T10:30:00Z"` | Timestamps |
| `bool` | Boolean | `true` | Flags, toggles |
| `json` | Nested JSON object | `{"key": "value"}` | Metadata, attributes |
| `bytes` | Binary data (base64) | `"SGVsbG8="` | Hashes, binary |
| `ip` | IP address | `"192.168.1.1"` | Network logs |

### Field Options

Each field can have these options:

- `stored: true` - Store the value to retrieve it in search results
- `indexed: true` - Index the field to make it searchable
- `fast: true` - Enable fast fields for sorting and aggregations (numeric and date fields)
- `tokenizer: "en_stem"` - Use stemming tokenizer for text fields

### Schema Example

```rust
client.db_create("products", json!({
    "fields": [
        {"name": "id", "type": "str", "stored": true, "indexed": true},
        {"name": "name", "type": "text", "stored": true, "indexed": true, "tokenizer": "en_stem"},
        {"name": "description", "type": "text", "stored": true, "indexed": true},
        {"name": "price", "type": "f64", "stored": true, "indexed": true, "fast": true},
        {"name": "stock", "type": "u64", "stored": true, "indexed": true, "fast": true},
        {"name": "created_at", "type": "date", "stored": true, "indexed": true},
        {"name": "active", "type": "bool", "stored": true, "indexed": true},
        {"name": "metadata", "type": "json", "stored": true, "indexed": true}
    ]
})).await?;
```

## Query Types

### 1. Match Query (Full-Text Search)

Tokenizes the query and finds documents containing any of the terms.

```rust
// Simple full-text search
client.search(json!({
    "type": "match", 
    "field": "description", 
    "value": "wireless bluetooth headphones"
}), 10, 0).await?;
```
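
To make "any of the terms" concrete, the sketch below mimics match semantics locally with a deliberately simplified tokenizer (lowercase, split on non-alphanumeric characters). `tokenize` and `matches_any` are hypothetical helpers written for illustration; they are not part of `heroindex_client`, and the server's actual tokenizer may behave differently.

```rust
/// Simplified tokenizer (assumption): lowercase, split on non-alphanumeric characters.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// True if the document shares at least one token with the query.
fn matches_any(doc: &str, query: &str) -> bool {
    let doc_tokens = tokenize(doc);
    tokenize(query).iter().any(|q| doc_tokens.contains(q))
}
```

Under these assumptions, a document that mentions only "Bluetooth" still matches the three-word query above, because one shared term is enough.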

### 2. Term Query (Exact Match)

Finds documents with the exact value (no tokenization). Best for keyword fields.

```rust
// Find by exact ID
client.search(json!({
    "type": "term",
    "field": "id",
    "value": "prod-001"
}), 10, 0).await?;

// Find by status
client.search(json!({
    "type": "term",
    "field": "status",
    "value": "published"
}), 10, 0).await?;
```

### 3. Fuzzy Query (Typo-Tolerant)

Finds documents even when the query contains spelling mistakes. `distance` is the maximum edit distance (number of single-character edits) allowed between the query term and an indexed term.

```rust
// Finds "keyboard" even when user types "keybaord"
client.search(json!({
    "type": "fuzzy",
    "field": "name",
    "value": "keybaord",
    "distance": 2  // Allow up to 2 character differences
}), 10, 0).await?;

// Finds "smartphone" from "smartfone"
client.search(json!({
    "type": "fuzzy",
    "field": "name",
    "value": "smartfone",
    "distance": 2
}), 10, 0).await?;
```
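
As a rough guide to choosing `distance`, the sketch below computes the classic Levenshtein distance (insertions, deletions, and substitutions each cost 1). This is an assumption about how edits are counted: the server may count a swap of two adjacent characters as a single edit, in which case the real distance can be lower. `levenshtein` is a local helper, not part of `heroindex_client`.

```rust
/// Classic Levenshtein distance over Unicode scalar values,
/// using two rolling rows of the dynamic-programming table.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i chars of `a` and the first j chars of `b`
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + if ca == cb { 0 } else { 1 };
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}
```

With this definition, both typos in the examples above ("keybaord" and "smartfone") are at distance 2 from the intended word, which is why `"distance": 2` is used there.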

### 4. Phrase Query (Exact Phrase Match)

Finds documents containing the exact phrase in order.

```rust
// Must contain "machine learning algorithms" as an exact phrase
client.search(json!({
    "type": "phrase",
    "field": "content",
    "value": "machine learning algorithms"
}), 10, 0).await?;
```
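
The difference from a match query is that every term must appear, adjacent and in the same order. A minimal local sketch of that rule, assuming a simplified tokenizer (lowercase, split on non-alphanumeric characters); `tokenize` and `contains_phrase` are hypothetical helpers, not part of `heroindex_client`:

```rust
/// Simplified tokenizer (assumption): lowercase, split on non-alphanumeric characters.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// True if the phrase tokens occur consecutively, in order, in the document.
fn contains_phrase(doc: &str, phrase: &str) -> bool {
    let doc_tokens = tokenize(doc);
    let phrase_tokens = tokenize(phrase);
    if phrase_tokens.is_empty() {
        return false; // `windows(0)` would panic, and an empty phrase matches nothing here
    }
    doc_tokens
        .windows(phrase_tokens.len())
        .any(|window| window == phrase_tokens.as_slice())
}
```

A document containing the same three words in a different order would satisfy a match query but not this phrase check.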

### 5. Prefix Query (Autocomplete)

Finds documents where the field starts with the given prefix.

```rust
// Autocomplete: find all products starting with "wire"
client.search(json!({
    "type": "prefix",
    "field": "name",
    "value": "wire"  // Matches "wireless", "wired", "wire-free"
}), 10, 0).await?;
```

### 6. Range Query (Numeric/Date Ranges)

Finds documents within a numeric or date range.

```rust
// Products between $50 and $100
client.search(json!({
    "type": "range",
    "field": "price",
    "gte": 50.0,   // Greater than or equal
    "lt": 100.0    // Less than
}), 10, 0).await?;

// Products with at least 10 in stock
client.search(json!({
    "type": "range",
    "field": "stock",
    "gte": 10
}), 10, 0).await?;

// All options: gt, gte, lt, lte
client.search(json!({
    "type": "range",
    "field": "views",
    "gt": 100,     // Greater than (exclusive)
    "lte": 1000    // Less than or equal
}), 10, 0).await?;
```
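
The four bound keys differ only in whether the endpoint itself is included. The sketch below models that with a hypothetical `RangeFilter` type (not part of `heroindex_client`) that treats every missing bound as "no constraint"; it uses `f64` for simplicity, whereas the server compares values in the field's own type.

```rust
/// Local model of a range query's bounds; `None` means the bound is not set.
#[derive(Default)]
struct RangeFilter {
    gt: Option<f64>,  // exclusive lower bound
    gte: Option<f64>, // inclusive lower bound
    lt: Option<f64>,  // exclusive upper bound
    lte: Option<f64>, // inclusive upper bound
}

impl RangeFilter {
    /// True if `v` satisfies every bound that is set.
    fn contains(&self, v: f64) -> bool {
        self.gt.map_or(true, |b| v > b)
            && self.gte.map_or(true, |b| v >= b)
            && self.lt.map_or(true, |b| v < b)
            && self.lte.map_or(true, |b| v <= b)
    }
}
```

For the price example above (`gte: 50.0`, `lt: 100.0`), a product priced at exactly 50.0 is included and one priced at exactly 100.0 is not.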

### 7. Regex Query (Pattern Matching)

Finds documents matching a regular expression.

```rust
// Find product codes matching pattern
client.search(json!({
    "type": "regex",
    "field": "sku",
    "value": "PRD-[0-9]{4}-[A-Z]+"
}), 10, 0).await?;

// Find emails from specific domain
client.search(json!({
    "type": "regex",
    "field": "email",
    "value": ".*@company\\.com"
}), 10, 0).await?;
```

### 8. Boolean Query (Complex Combinations)

Combines multiple queries with AND/OR/NOT logic.

```rust
// Complex search: electronics that are premium but not discontinued
client.search(json!({
    "type": "boolean",
    "must": [
        // ALL must match (AND)
        {"type": "match", "field": "category", "value": "electronics"},
        {"type": "range", "field": "price", "gte": 100.0}
    ],
    "should": [
        // Optional when `must` is present; each match raises the score (OR)
        {"type": "match", "field": "name", "value": "premium"},
        {"type": "match", "field": "name", "value": "pro"}
    ],
    "must_not": [
        // NONE should match (NOT)
        {"type": "term", "field": "status", "value": "discontinued"},
        {"type": "term", "field": "status", "value": "out_of_stock"}
    ]
}), 10, 0).await?;
```
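
The following sketch shows one common way such clauses are combined in Lucene-style engines; it is an assumption, not a confirmed description of HeroIndex's behavior. Each slice holds the per-clause results for a single document, and `is_match` is a hypothetical helper. Under this interpretation, `should` clauses only affect ranking when `must` clauses are present, but at least one of them has to match when there are no `must` clauses.

```rust
/// Decide whether one document satisfies a boolean query, given the
/// pre-computed result of every sub-clause for that document.
fn is_match(must: &[bool], should: &[bool], must_not: &[bool]) -> bool {
    let must_ok = must.iter().all(|&m| m); // AND: every `must` clause matches
    let excluded = must_not.iter().any(|&m| m); // NOT: any hit excludes the document
    // Assumption: `should` is required only when no `must` clause exists.
    let should_ok = if must.is_empty() && !should.is_empty() {
        should.iter().any(|&m| m)
    } else {
        true
    };
    must_ok && !excluded && should_ok
}
```

In the electronics example, a product that satisfies both `must` clauses and neither `must_not` clause is returned even if its name contains neither "premium" nor "pro"; it simply ranks lower.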

### 9. All Query (Match Everything)

Returns all documents in the index.

```rust
// Get all documents
client.search(json!({"type": "all"}), 100, 0).await?;
```

## Real-World Query Examples

### E-commerce Product Search

```rust
// User searches "wireless mouse" - combine fuzzy for typo tolerance with filters
client.search(json!({
    "type": "boolean",
    "must": [
        {"type": "fuzzy", "field": "name", "value": "wireless mouse", "distance": 1}
    ],
    "should": [
        {"type": "match", "field": "description", "value": "ergonomic"},
        {"type": "range", "field": "rating", "gte": 4.0}
    ],
    "must_not": [
        {"type": "term", "field": "in_stock", "value": false}
    ]
}), 20, 0).await?;
```

### Log Search

```rust
// Find error logs since a given time (here a fixed timestamp; compute it at runtime for "the last hour")
client.search(json!({
    "type": "boolean",
    "must": [
        {"type": "term", "field": "level", "value": "ERROR"},
        {"type": "range", "field": "timestamp", "gte": "2024-01-15T09:00:00Z"}
    ],
    "should": [
        {"type": "match", "field": "message", "value": "connection timeout"},
        {"type": "match", "field": "message", "value": "database error"}
    ]
}), 100, 0).await?;
```
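
The `gte` bound in this query is a fixed RFC 3339 string. To compute "one hour ago" at runtime without extra dependencies, something like the following sketch could be used. `rfc3339_utc` and `one_hour_ago` are hypothetical helpers, not part of `heroindex_client`; the date arithmetic is the well-known days-to-civil-date conversion, and a date/time crate is usually the better choice in real code.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Format seconds since the Unix epoch as an RFC 3339 UTC timestamp.
fn rfc3339_utc(secs: u64) -> String {
    let days = (secs / 86_400) as i64;
    let rem = secs % 86_400;
    let (h, m, s) = (rem / 3600, (rem % 3600) / 60, rem % 60);

    // Convert a day count into (year, month, day) using 400-year eras
    // that start on March 1st, so leap days fall at the end of a year.
    let z = days + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097); // day of era
    let yoe = (doe - doe / 1460 + doe / 36_524 - doe / 146_096) / 365; // year of era
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };

    format!("{:04}-{:02}-{:02}T{:02}:{:02}:{:02}Z", year, month, day, h, m, s)
}

/// RFC 3339 timestamp for one hour before the current system time.
fn one_hour_ago() -> String {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    rfc3339_utc(now.saturating_sub(3600))
}
```

The string returned by `one_hour_ago()` can then be passed as the `gte` value of the `timestamp` range clause.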

### Article Search with Pagination

```rust
// Search articles, page 3 (20 results per page)
let page = 3;
let per_page = 20;
let offset = (page - 1) * per_page;

let results = client.search(json!({
    "type": "match",
    "field": "content",
    "value": "rust programming"
}), per_page, offset).await?;

println!("Page {} of {}", page, (results.total_hits + per_page - 1) / per_page);
```
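
The same arithmetic can be factored into two small helpers that also guard the edge cases (page 0 and a page size of 0). `page_offset` and `total_pages` are hypothetical names, and `u64` is an assumption; adjust the integer type to whatever `search` and `total_hits` actually use.

```rust
/// Offset of the first result on a 1-based page. Page 0 is treated like page 1.
fn page_offset(page: u64, per_page: u64) -> u64 {
    page.saturating_sub(1) * per_page
}

/// Number of pages needed to show `total_hits` results (ceiling division).
fn total_pages(total_hits: u64, per_page: u64) -> u64 {
    if per_page == 0 {
        return 0; // avoid division by zero
    }
    (total_hits + per_page - 1) / per_page
}
```

For example, with 20 results per page, page 3 starts at offset 40, and 95 hits span 5 pages.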

## Document Operations

### Adding Documents

```rust
// Single document
client.doc_add(json!({
    "id": "prod-001",
    "name": "Wireless Mouse",
    "price": 29.99
})).await?;

// Batch insert (much faster for bulk operations)
client.doc_add_batch(vec![
    json!({"id": "prod-002", "name": "Keyboard", "price": 59.99}),
    json!({"id": "prod-003", "name": "Monitor", "price": 299.99}),
    json!({"id": "prod-004", "name": "Webcam", "price": 79.99}),
]).await?;

// IMPORTANT: Commit and reload to make documents searchable
client.commit().await?;
client.reload().await?;
```

### Deleting Documents

```rust
// Delete by field value
client.doc_delete("id", json!("prod-001")).await?;
client.commit().await?;
client.reload().await?;
```

## Database Management

```rust
// List all databases
let list = client.db_list().await?;
for db in &list.databases {
    println!("{}: {} docs, {} bytes", db.name, db.doc_count, db.size_bytes);
}

// Create database
client.db_create("logs", json!({
    "fields": [
        {"name": "timestamp", "type": "date", "stored": true, "indexed": true, "fast": true},
        {"name": "level", "type": "str", "stored": true, "indexed": true},
        {"name": "message", "type": "text", "stored": true, "indexed": true}
    ]
})).await?;

// Select database (required before operations)
client.db_select("logs").await?;

// Get database info
let info = client.db_info().await?;
println!("Documents: {}, Segments: {}", info.doc_count, info.segment_count);

// Delete database
client.db_delete("old_logs").await?;
```

## Error Handling

```rust
use heroindex_client::{Error, error_codes};

match client.db_select("nonexistent").await {
    Ok(_) => println!("Selected"),
    Err(Error::Rpc { code, message }) => {
        match code {
            error_codes::DATABASE_NOT_FOUND => println!("Database not found"),
            error_codes::NO_DATABASE_SELECTED => println!("Select a database first"),
            error_codes::INVALID_QUERY => println!("Invalid query syntax"),
            _ => println!("RPC error {}: {}", code, message),
        }
    }
    Err(Error::Connection(e)) => println!("Connection error: {}", e),
    Err(e) => println!("Other error: {}", e),
}
```

## Response Types

All methods return strongly-typed responses:

| Type | Fields |
|------|--------|
| `PingResponse` | `status`, `version` |
| `ServerStats` | `uptime_secs`, `databases`, `total_docs` |
| `DatabaseList` | `databases: Vec<DatabaseInfo>` |
| `DatabaseInfo` | `name`, `doc_count`, `size_bytes`, `segment_count` |
| `SchemaInfo` | `fields: Vec<FieldInfo>` |
| `SearchResult` | `total_hits`, `hits: Vec<SearchHit>`, `took_ms` |
| `SearchHit` | `score`, `doc: serde_json::Value` |
| `CountResult` | `count` |
| `OpResult` | `success`, `opstamp` |

## Performance Tips

1. **Use batch inserts** - `doc_add_batch` is 10-100x faster than individual `doc_add` calls
2. **Commit periodically** - Don't commit after every document, batch them
3. **Enable fast fields** - For fields used in sorting/filtering/aggregations
4. **Use term queries** - For exact matches on keyword fields, `term` is faster than `match`
5. **Limit results** - Always set reasonable `limit` values, fetch more with pagination

## Related Crates

- [heroindex](https://crates.io/crates/heroindex) - The HeroIndex server

## License

MIT License