Crate dictutils

High-performance dictionary utilities library

This library provides fast dictionary operations for several dictionary formats: Monkey’s Dictionary (MDict), StarDict, and ZIM.

§Features

  • Multiple Format Support: MDict, StarDict, and ZIM formats
  • High Performance: B-TREE indexing for fast lookups
  • Full-Text Search: Inverted indexing for content search
  • Memory-Mapped Files: Efficient large file handling
  • Compression Support: GZIP, LZ4, Zstandard compression
  • Batch Operations: Efficient bulk processing
  • Thread Safety: Safe concurrent access
  • Lazy Loading: Memory-efficient on-demand loading

§Quick Start

use dictutils::prelude::*;

fn main() {
    // This is a usage example, not executed in doctests.
    let loader = DictLoader::new();

    // Load a dictionary (format auto-detected)
    let dict = loader.load("path/to/dictionary.mdict");

    // Handle Result in real code (omitted here for brevity)
    let _ = dict;
}

§Configuration

use dictutils::prelude::*;

// Example configuration (not executed in doctests)
let config = DictConfig {
    load_btree: true,      // Fast key lookups
    load_fts: true,        // Full-text search
    use_mmap: true,        // Memory mapping
    cache_size: 1000,      // Entry cache size
    batch_size: 100,       // Batch operation size
    build_btree: true,     // Allow building missing B-TREE sidecar
    build_fts: true,       // Allow building missing FTS sidecar
    ..Default::default()
};

let loader = DictLoader::with_config(config);
let _dict = loader.load("path/to/dictionary");
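The `..Default::default()` line in the configuration above is Rust's struct update syntax: the fields named explicitly are overridden and every other field takes its value from the type's `Default` implementation. The following is a minimal, self-contained sketch of that pattern. `SketchConfig` and its default values are hypothetical stand-ins invented for illustration; they are not the crate's `DictConfig` or its real defaults.

```rust
/// Hypothetical stand-in for a configuration struct.
/// This is NOT dictutils' `DictConfig`; field defaults are invented.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchConfig {
    pub load_btree: bool,
    pub use_mmap: bool,
    pub cache_size: usize,
    pub batch_size: usize,
}

impl Default for SketchConfig {
    fn default() -> Self {
        // Illustrative defaults only.
        SketchConfig {
            load_btree: false,
            use_mmap: false,
            cache_size: 256,
            batch_size: 50,
        }
    }
}

/// Overrides two fields and inherits the rest from `Default`.
pub fn lookup_heavy_config() -> SketchConfig {
    SketchConfig {
        load_btree: true,
        cache_size: 1000,
        ..Default::default()
    }
}
```

Because only the overridden fields are spelled out, a call site written this way keeps compiling if the struct later gains new fields that have defaults.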

§Performance Tips

  1. Build Indexes: Use build_indexes() for large dictionaries
  2. Use B-TREE: Enable for fast exact lookups
  3. Enable FTS: For content search functionality
  4. Memory Mapping: Recommended for files > 100MB
  5. Batch Operations: Use get_multiple() for multiple lookups
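To make tips 2 and 5 concrete, here is a small sketch of exact-key and batched lookups over an ordered in-memory index built on the standard library's `BTreeMap`. It only illustrates the shape of such calls: `build_index` and the free function `get_multiple` below are hypothetical helpers written for this example, and their signatures are assumptions, not the crate's `get_multiple()` API.

```rust
use std::collections::BTreeMap;

/// Builds an ordered in-memory index from (headword, definition) pairs.
/// Hypothetical helper for illustration only.
pub fn build_index(entries: &[(&str, &str)]) -> BTreeMap<String, String> {
    entries
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

/// Resolves many keys in one call. The result has one slot per requested
/// key, in request order; keys that are absent yield `None`.
pub fn get_multiple<'a>(
    index: &'a BTreeMap<String, String>,
    keys: &[&str],
) -> Vec<Option<&'a str>> {
    keys.iter()
        .map(|k| index.get(*k).map(String::as_str))
        .collect()
}
```

Returning one `Option` per key keeps the output aligned with the input, so a caller can tell which of the requested headwords were missing.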

§Supported Dictionary Formats

  • Monkey’s Dictionary (.mdict): Fast lookup format with optional indexes
  • StarDict (.dict): Classic format with binary search
  • ZIM (.zim): Wikipedia offline format with article storage
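This page does not describe how the loader auto-detects a format. As one plausible approach, and purely as an assumption, a loader could dispatch on the file extensions listed above. `SketchFormat` and `guess_format` are hypothetical names for this sketch; the real crate may inspect file headers or use different rules.

```rust
use std::path::Path;

/// Hypothetical format tag; not a type exported by dictutils.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchFormat {
    MDict,
    StarDict,
    Zim,
}

/// Guesses a format from the file extension, ignoring ASCII case.
/// Returns `None` when there is no extension or it is not recognized.
pub fn guess_format(path: &str) -> Option<SketchFormat> {
    let ext = Path::new(path).extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "mdict" => Some(SketchFormat::MDict),
        "dict" => Some(SketchFormat::StarDict),
        "zim" => Some(SketchFormat::Zim),
        _ => None,
    }
}
```

Extension matching is cheap but easy to fool, which is why a production loader would likely confirm the guess against the file's contents before parsing.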

§Thread Safety

All dictionary operations are thread-safe, and dictionary instances can be shared across threads using standard Rust concurrency patterns.
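One such standard pattern is wrapping a read-only value in `std::sync::Arc` and cloning the handle into each worker thread. The sketch below uses a hypothetical `SketchDict` backed by a `HashMap` as a stand-in for any `Send + Sync` dictionary type; it does not use or describe the crate's own types.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Hypothetical read-only dictionary used only for this sketch.
pub struct SketchDict {
    entries: HashMap<String, String>,
}

impl SketchDict {
    pub fn new(pairs: &[(&str, &str)]) -> Self {
        SketchDict {
            entries: pairs
                .iter()
                .map(|(k, v)| (k.to_string(), v.to_string()))
                .collect(),
        }
    }

    pub fn get(&self, key: &str) -> Option<&str> {
        self.entries.get(key).map(String::as_str)
    }
}

/// Looks up each key on its own thread, sharing one dictionary via `Arc`.
/// Results are returned in the same order as `keys`.
pub fn parallel_lookup(dict: Arc<SketchDict>, keys: Vec<String>) -> Vec<Option<String>> {
    let handles: Vec<_> = keys
        .into_iter()
        .map(|key| {
            // Each thread gets its own reference-counted handle.
            let dict = Arc::clone(&dict);
            thread::spawn(move || dict.get(&key).map(str::to_owned))
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().expect("lookup thread panicked"))
        .collect()
}
```

No lock is needed here because the threads only read; a type that mutates internal state, such as an entry cache, would need interior synchronization to remain safe under this pattern.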

Re-exports§

pub use dict::BatchOperations;
pub use dict::DictLoader;
pub use dict::MDict;
pub use dict::StarDict;
pub use dict::ZimDict;
pub use index::btree;
pub use index::fts;
pub use util::buffer;
pub use util::compression;
pub use util::encoding;
pub use util::file_utils;
pub use traits::*;

Modules§

dict
Dictionary format implementations
index
High-performance indexing system for dictionary operations
prelude
traits
Core dictionary trait definitions and types
util
Utility functions and helpers for dictionary operations

Constants§

DEFAULT_BATCH_SIZE
Default batch size for operations
DEFAULT_CACHE_SIZE
Default cache size for entries
DESCRIPTION
Library description
MAX_DICT_SIZE
Maximum supported dictionary size (2GB)
MIN_MEMORY
Minimum memory required for basic operations (64MB)
NAME
Library name
RECOMMENDED_MEMORY
Recommended memory for optimal performance (256MB)
VERSION
Library version