Crate zipora

§Zipora: High-Performance Data Structures and Compression

Zipora is a Rust library of high-performance data structures and compression algorithms, covering containers, succinct structures, tries, blob storage, entropy coding, memory management, and concurrency.

§Key Features

  • Fast Containers: Optimized vector and string types with zero-copy semantics
  • Succinct Data Structures: Rank-select operations with SIMD optimizations
  • Advanced Tries: LOUDS, Critical-Bit, and Patricia tries with full FSA support
  • Blob Storage: Memory-mapped and compressed blob storage systems
  • Entropy Coding: Huffman, rANS, and dictionary-based compression algorithms
  • Memory Management: Advanced allocators including memory pools and bump allocators
  • Specialized Algorithms: Suffix arrays, radix sort, and multi-way merge
  • Fiber-based Concurrency: High-performance async/await with work-stealing execution
  • Real-time Compression: Adaptive algorithms with strict latency guarantees
  • C FFI Support: Complete C API compatibility layer for gradual migration
  • Memory Safety: Performance intended to be comparable to C++, with Rust’s memory safety guarantees
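
To make the rank-select terminology above concrete, here is a minimal, standard-library-only sketch of the two operations. It is not zipora's implementation or API: the `NaiveBits` type and its methods are hypothetical names introduced for illustration, and both operations scan the words linearly, whereas succinct structures such as those in the `succinct` module add precomputed indexes to answer in constant time.

```rust
/// Hypothetical, deliberately naive bit vector (not a zipora type).
/// Bits are packed into 64-bit words, least significant bit first.
struct NaiveBits {
    words: Vec<u64>,
    len: usize,
}

impl NaiveBits {
    /// Pack a slice of booleans into 64-bit words.
    fn from_bools(bits: &[bool]) -> Self {
        let mut words = vec![0u64; (bits.len() + 63) / 64];
        for (i, &bit) in bits.iter().enumerate() {
            if bit {
                words[i / 64] |= 1u64 << (i % 64);
            }
        }
        NaiveBits { words, len: bits.len() }
    }

    /// rank1(pos): number of set bits in positions [0, pos).
    fn rank1(&self, pos: usize) -> usize {
        assert!(pos <= self.len, "position out of range");
        let full_words = pos / 64;
        let mut count: usize = self.words[..full_words]
            .iter()
            .map(|w| w.count_ones() as usize)
            .sum();
        let rem = pos % 64;
        if rem > 0 {
            // Count only the low `rem` bits of the partially covered word.
            let mask = (1u64 << rem) - 1;
            count += (self.words[full_words] & mask).count_ones() as usize;
        }
        count
    }

    /// select1(k): position of the k-th set bit (0-based), if it exists.
    fn select1(&self, k: usize) -> Option<usize> {
        let mut remaining = k;
        for (word_index, &w) in self.words.iter().enumerate() {
            let ones = w.count_ones() as usize;
            if remaining < ones {
                // Clear the lowest set bit `remaining` times, then read
                // the position of the next set bit.
                let mut word = w;
                for _ in 0..remaining {
                    word &= word - 1;
                }
                return Some(word_index * 64 + word.trailing_zeros() as usize);
            }
            remaining -= ones;
        }
        None
    }
}
```

For the bit sequence 1, 0, 1, 1, 0, `rank1(3)` is 2 and `select1(1)` is `Some(2)`. The two operations are inverses in the sense that `rank1(select1(k).unwrap())` equals `k` whenever the k-th set bit exists.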

§Quick Start

// Import only the items this example uses; `Trie` is the trait that
// provides the trie methods called below.
use zipora::{
    FastVec, ValVec32, SmallMap, FixedCircularQueue, AutoGrowCircularQueue,
    FastStr, LoudsTrie, Trie, GoldHashMap
};

// High-performance vector with realloc optimization
let mut vec = FastVec::new();
vec.push(42).unwrap();

// Memory-efficient vector indexed by u32 instead of usize,
// which saves space on large collections
let mut vec32 = ValVec32::new();
vec32.push(42).unwrap();

// Small map optimized for ≤8 elements
let mut small_map = SmallMap::new();
small_map.insert("key", "value").unwrap();

// Fixed-size circular queue with lock-free operations
let mut fixed_queue: FixedCircularQueue<i32, 16> = FixedCircularQueue::new();
fixed_queue.push_back(1).unwrap();
assert_eq!(fixed_queue.pop_front(), Some(1));

// Auto-growing circular queue
let mut auto_queue = AutoGrowCircularQueue::new();
for i in 0..100 { auto_queue.push_back(i).unwrap(); }

// Zero-copy string operations
let s = FastStr::from_string("hello world");
println!("Hash: {:x}", s.hash_fast());

// Advanced trie operations
let mut trie = LoudsTrie::new();
trie.insert(b"hello").unwrap();
assert!(trie.contains(b"hello"));

// High-performance hash map
let mut map = GoldHashMap::new();
map.insert("key", "value").unwrap();
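
The fixed-capacity queue in the example above takes its capacity as a const generic parameter. As a rough illustration of how such a ring buffer can work, the following sketch uses only the standard library. It is an assumption-laden teaching example rather than zipora's `FixedCircularQueue`: the `RingQueue` name, its fields, and its `Result<(), T>` error type are hypothetical, and it makes no attempt at lock-free operation.

```rust
/// Hypothetical fixed-capacity FIFO queue (not a zipora type).
/// Capacity N is fixed at compile time; storage never reallocates.
struct RingQueue<T, const N: usize> {
    slots: [Option<T>; N],
    head: usize, // index of the oldest element
    len: usize,  // number of stored elements
}

impl<T, const N: usize> RingQueue<T, N> {
    fn new() -> Self {
        RingQueue {
            // Build the array element by element so T need not be Copy.
            slots: std::array::from_fn(|_| None),
            head: 0,
            len: 0,
        }
    }

    /// Append at the tail; hands the value back if the queue is full.
    fn push_back(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        // The tail index wraps around the end of the array.
        let tail = (self.head + self.len) % N;
        self.slots[tail] = Some(value);
        self.len += 1;
        Ok(())
    }

    /// Remove and return the oldest element, if any.
    fn pop_front(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        let value = self.slots[self.head].take();
        self.head = (self.head + 1) % N;
        self.len -= 1;
        value
    }
}
```

With `RingQueue<i32, 2>`, two pushes succeed, a third returns `Err` carrying the rejected value, and after one `pop_front` the next push reuses the freed slot by wrapping around. An auto-growing variant would instead allocate a larger buffer and copy the elements over when full.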

Re-exports§

pub use containers::AutoGrowCircularQueue;
pub use containers::EasyHashMap;
pub use containers::EasyHashMapBuilder;
pub use containers::EasyHashMapStats;
pub use containers::FastVec;
pub use containers::FixedCircularQueue;
pub use containers::FixedLenStrVec;
pub use containers::FixedStr4Vec;
pub use containers::FixedStr8Vec;
pub use containers::FixedStr16Vec;
pub use containers::FixedStr32Vec;
pub use containers::FixedStr64Vec;
pub use containers::GoldHashIdx;
pub use containers::HashStrMap;
pub use containers::HashStrMapStats;
pub use containers::SmallMap;
pub use containers::SortableStrIter;
pub use containers::SortableStrSortedIter;
pub use containers::SortableStrVec;
pub use containers::UintVector;
pub use containers::ValVec32;
pub use containers::ZoSortedStrVec;
pub use containers::ZoSortedStrVecIter;
pub use containers::ZoSortedStrVecRange;
pub use error::Result;
pub use error::ZiporaError;
pub use string::FastStr;
pub use string::LexicographicIterator;
pub use string::SortedVecLexIterator;
pub use string::StreamingLexIterator;
pub use string::LexIteratorBuilder;
pub use string::UnicodeProcessor;
pub use string::UnicodeAnalysis;
pub use string::Utf8ToUtf32Iterator;
pub use string::LineProcessor;
pub use string::LineProcessorConfig;
pub use string::LineProcessorStats;
pub use string::LineSplitter;
pub use string::utf8_byte_count;
pub use string::validate_utf8_and_count_chars;
pub use succinct::BitVector;
pub use succinct::BitwiseOp;
pub use succinct::BuilderOptions;
pub use succinct::CpuFeatures;
pub use succinct::MixedDimensionView;
pub use succinct::PerformanceStats;
pub use succinct::RankSelect256;
pub use succinct::RankSelectBuilder;
pub use succinct::RankSelectFew;
pub use succinct::RankSelectFewBuilder;
pub use succinct::RankSelectInterleaved256;
pub use succinct::RankSelectMixedIL256;
pub use succinct::RankSelectMixedSE512;
pub use succinct::RankSelectMixedXL256;
pub use succinct::RankSelectMultiDimensional;
pub use succinct::RankSelectOps;
pub use succinct::RankSelectPerformanceOps;
pub use succinct::RankSelectSe256;
pub use succinct::RankSelectSeparated256;
pub use succinct::RankSelectSeparated512;
pub use succinct::RankSelectSimple;
pub use succinct::RankSelectSparse;
pub use succinct::SimdCapabilities;
pub use succinct::SimdOps;
pub use succinct::bulk_popcount_simd;
pub use succinct::bulk_rank1_simd;
pub use succinct::bulk_select1_simd;
pub use blob_store::BlobStore;
pub use blob_store::MemoryBlobStore;
pub use blob_store::PlainBlobStore;
pub use fsa::CritBitTrie;
pub use fsa::DoubleArrayTrie;
pub use fsa::DoubleArrayTrieBuilder;
pub use fsa::DoubleArrayTrieConfig;
pub use fsa::FiniteStateAutomaton;
pub use fsa::LoudsTrie;
pub use fsa::PatriciaTrie;
pub use fsa::Trie;
pub use io::DataInput;
pub use io::DataOutput;
pub use io::VarInt;
pub use hash_map::GoldHashMap;
pub use io::MemoryMappedInput;
pub use io::MemoryMappedOutput;
pub use blob_store::DictionaryBlobStore;
pub use blob_store::EntropyAlgorithm;
pub use blob_store::EntropyCompressionStats;
pub use blob_store::HuffmanBlobStore;
pub use blob_store::RansBlobStore;
pub use entropy::dictionary::Dictionary;
pub use entropy::rans::RansSymbol;
pub use entropy::DictionaryBuilder;
pub use entropy::DictionaryCompressor;
pub use entropy::EntropyStats;
pub use entropy::HuffmanDecoder;
pub use entropy::HuffmanEncoder;
pub use entropy::HuffmanTree;
pub use entropy::OptimizedDictionaryCompressor;
pub use entropy::RansDecoder;
pub use entropy::RansEncoder;
pub use entropy::RansState;
pub use memory::BumpAllocator;
pub use memory::BumpArena;
pub use memory::CACHE_LINE_SIZE;
pub use memory::CacheAlignedVec;
pub use memory::MemoryConfig;
pub use memory::MemoryPool;
pub use memory::MemoryStats;
pub use memory::NumaPoolStats;
pub use memory::NumaStats;
pub use memory::PoolConfig;
pub use memory::PooledBuffer;
pub use memory::PooledVec;
pub use memory::SecureMemoryPool;
pub use memory::SecurePoolConfig;
pub use memory::SecurePoolStats;
pub use memory::SecurePooledPtr;
pub use memory::clear_numa_pools;
pub use memory::get_global_pool_for_size;
pub use memory::get_global_secure_pool_stats;
pub use memory::get_numa_stats;
pub use memory::get_optimal_numa_node;
pub use memory::init_numa_pools;
pub use memory::numa_alloc_aligned;
pub use memory::numa_dealloc;
pub use memory::set_current_numa_node;
pub use memory::size_to_class;
pub use memory::HugePage;
pub use memory::HugePageAllocator;
pub use algorithms::AlgorithmConfig;
pub use algorithms::ExternalSort;
pub use algorithms::LcpArray;
pub use algorithms::LoserTree;
pub use algorithms::LoserTreeConfig;
pub use algorithms::MergeSource;
pub use algorithms::MultiWayMerge;
pub use algorithms::RadixSort;
pub use algorithms::RadixSortConfig;
pub use algorithms::ReplaceSelectSort;
pub use algorithms::ReplaceSelectSortConfig;
pub use algorithms::SuffixArray;
pub use algorithms::SuffixArrayBuilder;
pub use algorithms::TournamentNode;
pub use concurrency::AsyncBlobStore;
pub use concurrency::AsyncFileStore;
pub use concurrency::AsyncMemoryBlobStore;
pub use concurrency::ConcurrencyConfig;
pub use concurrency::Fiber;
pub use concurrency::FiberHandle;
pub use concurrency::FiberId;
pub use concurrency::FiberPool;
pub use concurrency::FiberPoolConfig;
pub use concurrency::FiberStats;
pub use concurrency::ParallelLoudsTrie;
pub use concurrency::ParallelTrieBuilder;
pub use concurrency::Pipeline;
pub use concurrency::PipelineBuilder;
pub use concurrency::PipelineStage;
pub use concurrency::PipelineStats;
pub use concurrency::Task;
pub use concurrency::WorkStealingExecutor;
pub use concurrency::WorkStealingQueue;
pub use compression::AdaptiveCompressor;
pub use compression::AdaptiveConfig;
pub use compression::Algorithm;
pub use compression::CompressionMode;
pub use compression::CompressionProfile;
pub use compression::CompressionStats;
pub use compression::Compressor;
pub use compression::CompressorFactory;
pub use compression::PerformanceRequirements;
pub use compression::RealtimeCompressor;
pub use compression::RealtimeConfig;
pub use system::CpuFeatureSet;
pub use system::RuntimeCpuFeatures;
pub use system::get_cpu_features;
pub use system::has_cpu_feature;
pub use system::PerfTimer;
pub use system::BenchmarkSuite;
pub use system::HighPrecisionTimer;
pub use system::ProfiledFunction;
pub use system::ProcessManager;
pub use system::ProcessPool;
pub use system::BidirectionalPipe;
pub use system::ProcessExecutor;
pub use system::AdaptiveBase64;
pub use system::SimdBase64Encoder;
pub use system::SimdBase64Decoder;
pub use system::base64_encode_simd;
pub use system::base64_decode_simd;
pub use system::VmManager;
pub use system::PageAlignedAlloc;
pub use system::KernelInfo;
pub use system::vm_prefetch;
pub use system::get_kernel_info;
pub use dev_infrastructure::FactoryRegistry;
pub use dev_infrastructure::GlobalFactory;
pub use dev_infrastructure::AutoRegister;
pub use dev_infrastructure::Factoryable;
pub use dev_infrastructure::FactoryBuilder;
pub use dev_infrastructure::global_factory;
pub use dev_infrastructure::HighPrecisionTimer as DevHighPrecisionTimer;
pub use dev_infrastructure::ScopedTimer;
pub use dev_infrastructure::BenchmarkSuite as DevBenchmarkSuite;
pub use dev_infrastructure::BenchmarkResult;
pub use dev_infrastructure::MemoryDebugger;
pub use dev_infrastructure::MemoryStats as DevMemoryStats;
pub use dev_infrastructure::PerformanceProfiler;
pub use dev_infrastructure::global_profiler;
pub use dev_infrastructure::global_memory_debugger;
pub use dev_infrastructure::format_duration;
pub use dev_infrastructure::Histogram;
pub use dev_infrastructure::U32Histogram;
pub use dev_infrastructure::U64Histogram;
pub use dev_infrastructure::HistogramStats;
pub use dev_infrastructure::StatAccumulator;
pub use dev_infrastructure::AccumulatorStats;
pub use dev_infrastructure::MultiDimensionalStats;
pub use dev_infrastructure::GlobalStatsRegistry;
pub use dev_infrastructure::global_stats;
pub use dev_infrastructure::StatIndex;
pub use thread::PlatformSync;
pub use thread::DefaultPlatformSync;
pub use thread::InstanceTls;
pub use thread::OwnerTls;
pub use thread::TlsPool;
pub use thread::AtomicExt;
pub use thread::AsAtomic;
pub use thread::AtomicNode;
pub use thread::AtomicStack;
pub use thread::AtomicBitOps;
pub use thread::spin_loop_hint;
pub use thread::memory_ordering;
pub use cache::LruPageCache;
pub use cache::SingleLruPageCache;
pub use cache::PageCacheConfig;
pub use cache::CacheBuffer;
pub use cache::LockingConfig;
pub use cache::MemoryConfig as CacheMemoryConfig;
pub use cache::KernelAdvice;
pub use cache::PerformanceConfig;
pub use cache::EvictionConfig;
pub use cache::EvictionAlgorithm;
pub use cache::WarmingStrategy;
pub use cache::MaintenanceConfig;
pub use cache::CacheStatistics;
pub use cache::CacheStatsSnapshot;
pub use cache::BufferPool;
pub use cache::BufferPoolStats;
pub use cache::CacheError;
pub use cache::CacheHitType;
pub use cache::FileId;
pub use cache::PageId;
pub use cache::NodeIndex;
pub use cache::hash_file_page;
pub use cache::get_shard_id;
pub use cache::prefetch_hint;
pub use cache::PAGE_SIZE;
pub use cache::PAGE_BITS;
pub use cache::HUGE_PAGE_SIZE;
pub use cache::MAX_SHARDS;
pub use cache::CACHE_LINE_SIZE as CACHE_CACHE_LINE_SIZE;
pub use thread::LinuxFutex;
pub use thread::FutexMutex;
pub use thread::FutexCondvar;
pub use thread::FutexRwLock;
pub use thread::FutexGuard;
pub use thread::FutexReadGuard;
pub use thread::FutexWriteGuard;
pub use thread::x86_64_optimized;
pub use blob_store::ZstdBlobStore;

Modules§

algorithms
Specialized algorithms for high-performance data processing
blob_store
Blob storage systems
cache
LRU Page Cache
compression
Real-time compression with adaptive algorithms
concurrency
Fiber-based concurrency and pipeline processing
containers
High-performance container types
dev_infrastructure
Development Infrastructure
entropy
Entropy coding and compression algorithms
error
Error handling for the zipora library
ffi
C FFI compatibility layer
fsa
Finite State Automata and Trie structures
hash_map
High-performance hash map implementations
io
I/O operations and streaming
memory
Memory management utilities and allocators
string
Zero-copy string operations with SIMD optimization
succinct
Succinct data structures with constant-time rank and select operations
system
System Integration Utilities
thread
Thread and synchronization utilities

Macros§

debug_assert_msg
Debug assertion with custom message and optional panic
debug_print
Conditional debug print macro
impl_complex_serialize
Macro for implementing ComplexSerialize for custom structs
measure_time
Performance measurement macro
register_factory
Macro for convenient factory registration
register_factory_type
Macro for factory registration with automatic type name
since_version
time_block
Macro for timing code blocks
time_expr
Macro for timing expressions
versioned_field
Convenience macros for version management
versioned_field_with_default

Constants§

VERSION
Library version information

Functions§

has_simd_support
Check if SIMD optimizations are available
init
Initialize the library (currently no-op, for future use)
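
A check like `has_simd_support` is typically built on runtime CPU feature detection. The sketch below shows one way to do that with the standard library's detection macros. It is only an assumption about the general technique: the `simd_available` function is a hypothetical name, and the features it tests (SSE2, AVX2, NEON) may differ from what zipora actually checks.

```rust
/// Hypothetical helper (not zipora's `has_simd_support`): reports whether
/// the running CPU exposes a common SIMD extension.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn simd_available() -> bool {
    // Checked at runtime, so one binary can adapt to the host CPU.
    std::is_x86_feature_detected!("avx2") || std::is_x86_feature_detected!("sse2")
}

#[cfg(target_arch = "aarch64")]
fn simd_available() -> bool {
    std::arch::is_aarch64_feature_detected!("neon")
}

/// Fallback for architectures this sketch does not know how to probe.
#[cfg(not(any(
    target_arch = "x86",
    target_arch = "x86_64",
    target_arch = "aarch64"
)))]
fn simd_available() -> bool {
    false
}
```

Callers would usually branch once on the result and select a SIMD or scalar code path, rather than re-checking inside hot loops.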

Type Aliases§

RecordId
Record identifier type for blob store operations
StateId
State identifier type for FSA operations