marisa-ffi 0.3.1

Rust FFI bindings for libmarisa - a space-efficient trie data structure
Documentation
# Marisa FFI Rust Bindings

[![Build Status](https://github.com/shinecrazy/marisa-ffi/workflows/CI/badge.svg)](https://github.com/shinecrazy/marisa-ffi/actions)
[![Crates.io](https://img.shields.io/crates/v/marisa-ffi.svg)](https://crates.io/crates/marisa-ffi)
[![Documentation](https://docs.rs/marisa-ffi/badge.svg)](https://docs.rs/marisa-ffi)
[![License](https://img.shields.io/badge/license-BSD--2--Clause-blue.svg)](LICENSE)

This project provides Rust FFI bindings for the [libmarisa](https://github.com/s-yata/marisa-trie) library, a space-efficient trie data structure implementation.

## Features

- **Lookup**: Check whether a string exists in the dictionary
- **Reverse lookup**: Restore a key from its ID  
- **Common prefix search**: Find keys from prefixes of a given string
- **Predictive search**: Find keys starting with a given string
- **Memory efficient**: Significantly more compact than other trie implementations
- **Cross-platform**: Works on Linux, macOS, and Windows

## Installation

### Prerequisites

- Rust 1.70+
- CMake 3.22+
- C++ compiler with C++20 support

### From Crates.io

Add this to your `Cargo.toml`:

```toml
[dependencies]
marisa-ffi = "0.3.1"
```

### From Source

1. Clone the repository:
```bash
git clone https://github.com/shinecrazy/marisa-ffi.git
cd marisa-ffi
```

2. Build the FFI library:
```bash
cargo build --release
```

### Using as a Dependency

Add this to your `Cargo.toml`:

```toml
[dependencies]
marisa-ffi = "0.3.1"
```

## Quick Start

```rust
use marisa_ffi::{Trie, Keyset, Agent};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a keyset and add some keys
    let mut keyset = Keyset::new()?;
    keyset.push("hello")?;
    keyset.push("world")?;
    keyset.push("rust")?;

    // Build the trie
    let trie = Trie::build(&keyset)?;

    // Lookup a key
    if let Some(id) = trie.lookup("hello") {
        println!("Found 'hello' with ID: {}", id);
    }

    // Predictive search
    let mut agent = Agent::new()?;
    if trie.predictive_search("h", &mut agent)? {
        println!("Found key: {}", agent.key()?);
    }

    Ok(())
}
```

## API Reference

### Trie

The main trie structure for efficient string storage and lookup.

```rust
impl Trie {
    /// Create a new empty trie
    pub fn new() -> Result<Self, MarisaError>;
    
    /// Build a trie from a keyset
    pub fn build(keyset: &Keyset) -> Result<Self, MarisaError>;
    
    /// Load a trie from a file
    pub fn load(&mut self, filename: &str) -> Result<(), MarisaError>;
    
    /// Save the trie to a file
    pub fn save(&self, filename: &str) -> Result<(), MarisaError>;
    
    /// Lookup a key and return its ID if found
    pub fn lookup(&self, key: &str) -> Option<u32>;
    
    /// Perform predictive search starting with the given prefix
    pub fn predictive_search(&self, prefix: &str, agent: &mut Agent) -> Result<bool, MarisaError>;
    
    /// Perform reverse lookup to get a key by its ID
    pub fn reverse_lookup(&self, id: u32, agent: &mut Agent) -> Result<(), MarisaError>;
    
    /// Perform common prefix search
    pub fn common_prefix_search(&self, prefix: &str, agent: &mut Agent) -> Result<bool, MarisaError>;
}
```

### Keyset

A collection of keys used to build tries.

```rust
impl Keyset {
    /// Create a new empty keyset
    pub fn new() -> Result<Self, MarisaError>;
    
    /// Add a key to the keyset
    pub fn push(&mut self, key: &str) -> Result<(), MarisaError>;
    
    /// Add a key with a specific ID to the keyset
    pub fn push_with_id(&mut self, key: &str, id: u32) -> Result<(), MarisaError>;
    
    /// Reset the keyset
    pub fn reset(&mut self);
}
```

### Agent

An agent for performing search operations and retrieving results.

```rust
impl Agent {
    /// Create a new agent
    pub fn new() -> Result<Self, MarisaError>;
    
    /// Get the current key as a string
    pub fn key(&self) -> Result<String, MarisaError>;
    
    /// Get the current key ID
    pub fn id(&self) -> u32;
    
    /// Move to the next result
    pub fn next(&mut self) -> Result<bool, MarisaError>;
}
```

## Examples

### Basic Usage

See `examples/basic_usage.rs` for a comprehensive example demonstrating all features.

### Benchmark

See `examples/benchmark.rs` for performance testing and comparison.

## Performance

Marisa trie is significantly more space-efficient than other implementations:

| Implementation | Size (bytes) | Remarks |
|----------------|--------------|---------|
| darts-clone    | 376,613,888  | Compacted double-array trie |
| tx-trie        | 127,727,058  | LOUDS-based trie |
| marisa-trie    | 50,753,560   | MARISA trie |

*Source: English Wikipedia page titles (9,805,576 keys)*

## Error Handling

All operations return `Result` types with detailed error information:

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum MarisaError {
    StateError,
    NullError,
    BoundError,
    RangeError,
    CodeError,
    ResetError,
    SizeError,
    MemoryError,
    IoError,
    FormatError,
    KeyNotFound,
    NoResults,
    Utf8Error(std::str::Utf8Error),
}
```

## License

This project is licensed under the same terms as libmarisa:
- BSD-2-Clause OR LGPL-2.1-or-later

## Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.

## Acknowledgments

- [s-yata]https://github.com/s-yata for the original libmarisa implementation
- The Rust community for excellent FFI tooling and documentation