# HeroIndex
[crates.io](https://crates.io/crates/heroindex) | [docs.rs](https://docs.rs/heroindex) | [License: MIT](https://opensource.org/licenses/MIT)

A high-performance full-text search server built on [Tantivy](https://github.com/quickwit-oss/tantivy), exposing an OpenRPC interface over Unix sockets.
## Features
- **Multiple Index Management** - Create, delete, and manage multiple search indexes
- **Dynamic Schemas** - Define custom schemas with 10+ field types
- **Powerful Queries** - Full-text, term, fuzzy, phrase, prefix, range, regex, and boolean queries
- **OpenRPC Discovery** - Self-documenting API via `rpc.discover`
- **Concurrent Connections** - Handle multiple clients simultaneously
- **Fast Fields** - Columnar storage for sorting and aggregations
- **Zero-Copy Search** - Efficient memory-mapped index files
## Installation
### From crates.io
```bash
cargo install heroindex
```
### From source
```bash
git clone https://github.com/heroindex/heroindex
cd heroindex
cargo build --release
```
## Quick Start
### 1. Start the Server
```bash
heroindex --dir /var/lib/heroindex --socket /tmp/heroindex.sock
```
### 2. Connect with the Client Library
Use [heroindex_client](https://crates.io/crates/heroindex_client) to connect:
```rust
use heroindex_client::HeroIndexClient;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), heroindex_client::Error> {
    let mut client = HeroIndexClient::connect("/tmp/heroindex.sock").await?;

    // Create an index
    client.db_create("articles", json!({
        "fields": [
            {"name": "title", "type": "text", "stored": true, "indexed": true},
            {"name": "body", "type": "text", "stored": true, "indexed": true}
        ]
    })).await?;

    // Select the index and add documents
    client.db_select("articles").await?;
    client.doc_add(json!({"title": "Hello", "body": "World"})).await?;
    // Commit the writes, then reload so searches see them
    client.commit().await?;
    client.reload().await?;

    // Search
    let results = client.search(
        json!({"type": "match", "field": "body", "value": "world"}),
        10, 0
    ).await?;
    println!("Found {} results", results.total_hits);
    Ok(())
}
```
## Command Line Options
```
heroindex [OPTIONS]

Options:
  -d, --dir <DIR>        Base directory for all indexes
  -s, --socket <SOCKET>  Unix socket path for RPC interface
  -h, --help             Print help
  -V, --version          Print version
```
## Schema Definition
Define your index schema with these field types:

| Type | Description | Options |
|------|-------------|---------|
| `text` | Full-text searchable (tokenized) | `stored`, `indexed`, `fast`, `tokenizer` |
| `str` | Exact match string (keyword) | `stored`, `indexed`, `fast` |
| `u64` | Unsigned 64-bit integer | `stored`, `indexed`, `fast` |
| `i64` | Signed 64-bit integer | `stored`, `indexed`, `fast` |
| `f64` | 64-bit floating point | `stored`, `indexed`, `fast` |
| `date` | DateTime (RFC 3339) | `stored`, `indexed`, `fast` |
| `bool` | Boolean | `stored`, `indexed`, `fast` |
| `json` | JSON object | `stored`, `indexed` |
| `bytes` | Binary data | `stored`, `indexed`, `fast` |
| `ip` | IP address | `stored`, `indexed`, `fast` |
### Example Schema
```json
{
  "fields": [
    {"name": "id", "type": "str", "stored": true, "indexed": true},
    {"name": "title", "type": "text", "stored": true, "indexed": true, "tokenizer": "en_stem"},
    {"name": "content", "type": "text", "stored": true, "indexed": true},
    {"name": "views", "type": "u64", "stored": true, "indexed": true, "fast": true},
    {"name": "rating", "type": "f64", "stored": true, "indexed": true, "fast": true},
    {"name": "published", "type": "date", "stored": true, "indexed": true, "fast": true},
    {"name": "active", "type": "bool", "stored": true, "indexed": true},
    {"name": "metadata", "type": "json", "stored": true, "indexed": true}
  ]
}
```
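The options column in the field type table can be turned into a quick client-side check before a schema is sent to the server. The sketch below is only an illustration: `allowed_options` and `check_field` are hypothetical helpers, not part of HeroIndex or `heroindex_client`, and the server may apply different or stricter rules.

```rust
// Option lists transcribed from the field type table in this README.
const TEXT_OPTIONS: &[&str] = &["stored", "indexed", "fast", "tokenizer"];
const JSON_OPTIONS: &[&str] = &["stored", "indexed"];
const BASIC_OPTIONS: &[&str] = &["stored", "indexed", "fast"];

/// Returns the options listed for a field type, or None for an unlisted type.
/// Hypothetical helper; not a HeroIndex API.
fn allowed_options(field_type: &str) -> Option<&'static [&'static str]> {
    match field_type {
        "text" => Some(TEXT_OPTIONS),
        "json" => Some(JSON_OPTIONS),
        "str" | "u64" | "i64" | "f64" | "date" | "bool" | "bytes" | "ip" => Some(BASIC_OPTIONS),
        _ => None,
    }
}

/// Checks one field definition and reports the first problem found.
fn check_field(field_type: &str, options: &[&str]) -> Result<(), String> {
    let allowed = match allowed_options(field_type) {
        Some(list) => list,
        None => return Err(format!("unknown field type `{}`", field_type)),
    };
    for opt in options {
        if !allowed.contains(opt) {
            return Err(format!("`{}` is not listed for `{}` fields", opt, field_type));
        }
    }
    Ok(())
}

fn main() {
    println!("{:?}", check_field("text", &["stored", "indexed", "tokenizer"]));
    println!("{:?}", check_field("json", &["stored", "fast"]));
    println!("{:?}", check_field("uuid", &["stored"]));
}
```

A check like this only catches mismatches against the table; the schema returned by `schema.get` remains the authoritative view of what the server accepted.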
## Query Types
### Match Query (Full-Text)
```json
{"type": "match", "field": "content", "value": "search terms"}
```
### Term Query (Exact)
```json
{"type": "term", "field": "id", "value": "abc123"}
```
### Fuzzy Query (Typo-Tolerant)
```json
{"type": "fuzzy", "field": "title", "value": "serch", "distance": 2}
```
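Tantivy's fuzzy term query matches terms within a given Levenshtein (edit) distance of the query term. Assuming HeroIndex passes `distance` through with that meaning, the following standalone sketch shows how the distance is computed, so you can judge how permissive a value such as `2` is. The `levenshtein` function is illustrative only and is not how the server evaluates fuzzy queries internally.

```rust
/// Classic dynamic-programming Levenshtein distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a`
    // and the first j characters of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        // curr[0] = i: deleting i characters turns the prefix of `a` into "".
        let mut curr = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            let substitution = prev[j - 1] + cost;
            let deletion = prev[j] + 1;
            let insertion = curr[j - 1] + 1;
            curr[j] = substitution.min(deletion).min(insertion);
        }
        prev = curr;
    }
    prev[b.len()]
}

fn main() {
    // "serch" is one insertion away from "search".
    println!("serch/search: {}", levenshtein("serch", "search"));
    println!("kitten/sitting: {}", levenshtein("kitten", "sitting"));
}
```

Under this reading, the example query above would match `search` (distance 1) as well as any other indexed term within two edits of `serch`.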
### Phrase Query
```json
{"type": "phrase", "field": "content", "value": "exact phrase match"}
```
### Prefix Query
```json
{"type": "prefix", "field": "title", "value": "hel"}
```
### Range Query
```json
{"type": "range", "field": "views", "gte": 100, "lt": 1000}
```
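Assuming `gte` and `lt` carry their usual meanings (greater than or equal, strictly less than), the query above selects a half-open interval. The hypothetical `in_range` predicate below spells out which boundary values would be included under that assumption.

```rust
/// Half-open range check: the lower bound is inclusive (`gte`),
/// the upper bound is exclusive (`lt`).
/// Treating `None` as "unbounded on that side" is an assumption of this sketch.
fn in_range(value: u64, gte: Option<u64>, lt: Option<u64>) -> bool {
    gte.map_or(true, |lo| value >= lo) && lt.map_or(true, |hi| value < hi)
}

fn main() {
    for &views in &[99u64, 100, 999, 1000] {
        println!("{} -> {}", views, in_range(views, Some(100), Some(1000)));
    }
}
```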
### Regex Query
```json
{"type": "regex", "field": "title", "value": "test.*"}
```
### Boolean Query
```json
{
  "type": "boolean",
  "must": [{"type": "match", "field": "content", "value": "rust"}],
  "should": [{"type": "match", "field": "title", "value": "tutorial"}],
  "must_not": [{"type": "term", "field": "status", "value": "draft"}]
}
```
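Because boolean queries nest other queries, it can help to build them from typed values rather than hand-written JSON. The sketch below models three of the query types above with a hypothetical `Query` enum and serializes them using only the standard library. The enum, `quote`, and `list` are illustrative names, and whether the server accepts empty clause arrays is an assumption; in a real project `serde_json` would be the more natural choice.

```rust
/// Hypothetical query model covering three of the query types shown above.
enum Query {
    Match { field: String, value: String },
    Term { field: String, value: String },
    Boolean { must: Vec<Query>, should: Vec<Query>, must_not: Vec<Query> },
}

/// Wraps a string in quotes, escaping backslashes and double quotes.
/// Control characters are not handled in this sketch.
fn quote(s: &str) -> String {
    format!("\"{}\"", s.replace('\\', "\\\\").replace('"', "\\\""))
}

/// Serializes a list of sub-queries as a JSON array.
fn list(queries: &[Query]) -> String {
    let parts: Vec<String> = queries.iter().map(|q| q.to_json()).collect();
    format!("[{}]", parts.join(", "))
}

impl Query {
    fn to_json(&self) -> String {
        match self {
            Query::Match { field, value } => format!(
                "{{\"type\": \"match\", \"field\": {}, \"value\": {}}}",
                quote(field),
                quote(value)
            ),
            Query::Term { field, value } => format!(
                "{{\"type\": \"term\", \"field\": {}, \"value\": {}}}",
                quote(field),
                quote(value)
            ),
            // All three clauses are always emitted, even when empty (an assumption).
            Query::Boolean { must, should, must_not } => format!(
                "{{\"type\": \"boolean\", \"must\": {}, \"should\": {}, \"must_not\": {}}}",
                list(must),
                list(should),
                list(must_not)
            ),
        }
    }
}

fn main() {
    let query = Query::Boolean {
        must: vec![Query::Match { field: "content".to_string(), value: "rust".to_string() }],
        should: vec![],
        must_not: vec![Query::Term { field: "status".to_string(), value: "draft".to_string() }],
    };
    println!("{}", query.to_json());
}
```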
## RPC Methods
| Method | Description |
|--------|-------------|
| `rpc.discover` | Get OpenRPC schema |
| `server.ping` | Health check |
| `server.stats` | Server statistics |
| `db.list` | List all databases |
| `db.create` | Create database with schema |
| `db.delete` | Delete a database |
| `db.select` | Select database for operations |
| `db.info` | Get database info |
| `schema.get` | Get current schema |
| `doc.add` | Add single document |
| `doc.add_batch` | Add multiple documents |
| `doc.delete` | Delete by term |
| `index.commit` | Commit changes |
| `index.reload` | Reload to see changes |
| `search.query` | Execute search |
| `search.count` | Count matches |
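OpenRPC describes JSON-RPC 2.0 services, so the methods above can in principle be called without the client crate. The sketch below builds a JSON-RPC 2.0 request and sends it over a Unix socket using only the standard library. Two things are assumptions rather than documented behavior: that messages are newline-delimited, and that `server.ping` accepts an empty `params` array. The helper names `build_request` and `call` are hypothetical; consult the output of `rpc.discover` for the real parameter shapes.

```rust
use std::io::{BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;
use std::time::Duration;

/// Builds a JSON-RPC 2.0 request envelope.
/// `params_json` must already be valid JSON; `method` is inserted unescaped.
fn build_request(id: u64, method: &str, params_json: &str) -> String {
    format!(
        "{{\"jsonrpc\": \"2.0\", \"id\": {}, \"method\": \"{}\", \"params\": {}}}",
        id, method, params_json
    )
}

/// Sends one request and reads one line of response.
/// Assumption: the server uses newline-delimited framing.
fn call(socket_path: &str, request: &str) -> std::io::Result<String> {
    let mut stream = UnixStream::connect(socket_path)?;
    // Avoid blocking forever if the framing assumption is wrong.
    stream.set_read_timeout(Some(Duration::from_secs(2)))?;
    stream.write_all(request.as_bytes())?;
    stream.write_all(b"\n")?;
    let mut reader = BufReader::new(stream);
    let mut line = String::new();
    reader.read_line(&mut line)?;
    Ok(line)
}

fn main() {
    let request = build_request(1, "server.ping", "[]");
    println!("request: {}", request);
    match call("/tmp/heroindex.sock", &request) {
        Ok(reply) => println!("reply: {}", reply.trim_end()),
        Err(err) => eprintln!("could not reach server: {}", err),
    }
}
```

For anything beyond a quick health check, the `heroindex_client` crate is likely the safer route, since it encapsulates whatever framing the server actually uses.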
## Performance Tips
1. **Use batch inserts** - `doc.add_batch` is much faster than individual adds
2. **Commit periodically** - Don't commit after every document
3. **Enable fast fields** - For fields used in sorting/filtering
4. **Use appropriate tokenizers** - `en_stem` for English, `raw` for keywords
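
The first two tips combine naturally: send documents in batches and commit once per several batches. The sketch below shows one way to structure that loop. The `Writer` trait and `CountingWriter` are hypothetical stand-ins that count calls instead of talking to a server, so the example runs on its own; with the real client, the two trait methods would wrap the `doc.add_batch` and `index.commit` calls. The batch sizes are arbitrary illustration values, not tuned recommendations.

```rust
/// Hypothetical stand-in for a client, limited to the two calls the tips mention.
trait Writer {
    fn add_batch(&mut self, docs: &[String]);
    fn commit(&mut self);
}

/// Counts calls instead of contacting a server.
#[derive(Default)]
struct CountingWriter {
    batches: usize,
    docs: usize,
    commits: usize,
}

impl Writer for CountingWriter {
    fn add_batch(&mut self, docs: &[String]) {
        self.batches += 1;
        self.docs += docs.len();
    }
    fn commit(&mut self) {
        self.commits += 1;
    }
}

/// Sends `docs` in batches of `batch_size`, commits after every
/// `batches_per_commit` batches, and commits once more at the end
/// if any batch is still uncommitted.
fn ingest<W: Writer>(writer: &mut W, docs: &[String], batch_size: usize, batches_per_commit: usize) {
    assert!(batch_size > 0 && batches_per_commit > 0);
    let mut uncommitted = 0;
    for chunk in docs.chunks(batch_size) {
        writer.add_batch(chunk);
        uncommitted += 1;
        if uncommitted == batches_per_commit {
            writer.commit();
            uncommitted = 0;
        }
    }
    if uncommitted > 0 {
        writer.commit();
    }
}

fn main() {
    let docs: Vec<String> = (0..2500).map(|i| format!("doc-{}", i)).collect();
    let mut writer = CountingWriter::default();
    ingest(&mut writer, &docs, 1000, 2);
    println!(
        "{} docs in {} batches, {} commits",
        writer.docs, writer.batches, writer.commits
    );
}
```

With the real server, a single `index.reload` after the final commit should be enough to make the ingested documents searchable.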
## Related Crates
- [heroindex_client](https://crates.io/crates/heroindex_client) - Client library for connecting to HeroIndex
## License
MIT License - see [LICENSE](LICENSE) for details.
## Credits
Built on the excellent [Tantivy](https://github.com/quickwit-oss/tantivy) search engine library.