Crate mismall

§mismall - Streaming Huffman Compression Library

A Rust library for file compression and decompression built around canonical Huffman coding and a streaming architecture. It is designed to handle arbitrarily large files with bounded memory usage and supports optional AES-256-GCM encryption.
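For readers unfamiliar with the term, canonical Huffman coding assigns codes from code lengths alone, so only the lengths need to be stored with the compressed data. The sketch below is a generic illustration of that assignment rule using only the standard library; `canonical_codes` is a hypothetical helper and is not taken from mismall's source.

```rust
/// Assign canonical Huffman codes from per-symbol code lengths.
///
/// Returns `(symbol, code, length)` triples in canonical order.
/// Symbols with length 0 are treated as unused and skipped.
/// Assumption: lengths are small enough (below 32) to fit in a `u32` code.
fn canonical_codes(lengths: &[u8]) -> Vec<(usize, u32, u8)> {
    let mut symbols: Vec<(usize, u8)> = lengths
        .iter()
        .copied()
        .enumerate()
        .filter(|&(_, len)| len > 0)
        .collect();

    // Canonical order: shorter codes first, ties broken by symbol value.
    symbols.sort_by_key(|&(sym, len)| (len, sym));

    let mut out = Vec::with_capacity(symbols.len());
    let mut code: u32 = 0;
    let mut prev_len: u8 = 0;
    for (i, &(sym, len)) in symbols.iter().enumerate() {
        if i > 0 {
            // Next code is the previous code plus one,
            // shifted left whenever the code length grows.
            code = (code + 1) << (len - prev_len);
        }
        out.push((sym, code, len));
        prev_len = len;
    }
    out
}
```

With lengths `[2, 1, 3, 3]` for symbols 0 through 3, this yields the bit patterns `0` for symbol 1, `10` for symbol 0, `110` for symbol 2, and `111` for symbol 3.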

§🚀 Quick Start

§Basic Usage

use mismall::{compress_file, decompress_file};

// Note: These examples require existing files
// let result = compress_file("document.txt", None)?;
// println!("Compressed {} -> {} bytes", result.original_size, result.compressed_size);
//
// // Decompress a file
// let result = decompress_file("document.txt.small", None)?;
// println!("Decompressed {} bytes", result.original_size);

§Advanced Usage

use mismall::compress::CompressionBuilder;

// Note: This requires an existing file
// let result = CompressionBuilder::new("large_video.mp4")
//     .with_password("secret123")
//     .with_chunk_size(64 * 1024 * 1024) // 64MB chunks
//     .with_progress_callback(|progress: &mismall::progress::ProgressInfo| {
//         println!("Progress: {}%", progress.percentage);
//     })
//     .compress()?;
//
// println!("Compressed with {:.1}% ratio", result.compression_ratio);

§Archive Operations

use mismall::archive::{ArchiveBuilder, ArchiveExtractor};

// Create archive
ArchiveBuilder::new()
    .add_file("doc1.pdf", b"PDF content")?
    .add_file("image.jpg", b"JPG content")?
    .with_password("archive_secret")
    .build("backup.small")?;

// Note: This requires an existing archive
// // Extract from archive
// ArchiveExtractor::new("backup.small")
//     .with_password("archive_secret")
//     .extract_all()?;

§Features

  • Streaming Architecture: Bounded memory usage (16MB default) with chunked I/O
  • AES-256-GCM Encryption: Optional password-based encryption with authenticated data integrity
  • Archive Support: Pack multiple files into single .small containers with metadata
  • Memory Efficient: Uses temporary files for intermediate processing, never loads entire files into RAM
  • Raw-Store Heuristic: Automatically stores uncompressed data if compression would expand file size
  • Configurable Chunk Sizes: Users can adjust memory usage from 64KB to 1GB+
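The raw-store heuristic in the list above can be pictured as a size comparison made before any output is written. The following is a minimal sketch under stated assumptions: `StorageMode`, `choose_mode`, and the `header_bytes` parameter are hypothetical names for illustration, and the crate's actual decision rule may differ.

```rust
/// Hypothetical outcome of the heuristic; not a type from the crate.
#[derive(Debug, PartialEq)]
enum StorageMode {
    Raw,
    Huffman,
}

/// Pick raw storage when the estimated Huffman output (payload plus header)
/// would not be smaller than the input.
///
/// `freqs[b]` is how often byte `b` occurs; `lengths[b]` is its code length in bits.
fn choose_mode(freqs: &[u64; 256], lengths: &[u8; 256], header_bytes: u64) -> StorageMode {
    let raw_bytes: u64 = freqs.iter().sum();

    // Total payload size in bits: sum of frequency * code length.
    let coded_bits: u64 = freqs
        .iter()
        .zip(lengths.iter())
        .map(|(&f, &l)| f * l as u64)
        .sum();

    // Round up to whole bytes and add the fixed header overhead.
    let coded_bytes = (coded_bits + 7) / 8 + header_bytes;

    if coded_bytes >= raw_bytes {
        StorageMode::Raw
    } else {
        StorageMode::Huffman
    }
}
```

Uniformly distributed input, where every byte needs an 8-bit code, lands on `Raw` because the header overhead makes the coded form larger than the original.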

§Quick Start

use mismall::{compress_file, decompress_file};

// Note: These examples require existing files
// let compressed = compress_file("document.txt", None)?;
// println!("File compressed successfully!");
//
// // Simple decompression
// let decompressed = decompress_file("document.txt.small", None)?;
// println!("File decompressed successfully!");

§Advanced Usage

use mismall::compress::CompressionBuilder;

// Note: This requires an existing file
// let result = CompressionBuilder::new("large_video.mp4")
//     .with_password("secret123")
//     .with_chunk_size(64 * 1024 * 1024) // 64MB chunks
//     .with_progress_callback(|progress: &mismall::progress::ProgressInfo| {
//         println!("Progress: {}%", progress.percentage);
//     })
//     .compress()?;
//
// println!("Compressed {} bytes to {} bytes",
//          result.original_size, result.compressed_size);

§Archive Operations

use mismall::archive::{ArchiveBuilder, ArchiveExtractor};

// Create archive
ArchiveBuilder::new()
    .add_file("doc1.pdf", b"PDF content")?
    .add_file("image.jpg", b"JPG content")?
    .with_password("archive_secret")
    .build("backup.small")?;

// Note: This requires an existing archive
// // Extract from archive
// ArchiveExtractor::new("backup.small")
//     .with_password("archive_secret")
//     .extract_file("doc1.pdf", "restored_doc.pdf")?;

§Streaming Utilities

use mismall::stream::{Compressor, Decompressor, stream_reader, stream_writer};
use std::fs::File;
use std::io::{Write, Read};

// Note: This example shows the pattern but requires actual files
// // Streaming compression
// let output_file = File::create("data.txt.small")?;
// let mut compressor = stream_writer(output_file, "data.txt", None);
// compressor.write_all(b"Hello, ")?;
// compressor.write_all(b"world!")?;
// compressor.finish()?;
//
// // Streaming decompression
// let input_file = File::open("data.txt.small")?;
// let mut reader = stream_reader(input_file, None);
// let mut buffer = String::new();
// reader.read_to_string(&mut buffer)?;
// println!("Decompressed: {}", buffer);

§🎯 Core APIs

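§Simple API

  • compress_file - Compress a single file in one call
  • decompress_file - Decompress a single file in one call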

§Builder API

  • CompressionBuilder - Advanced compression with options
  • DecompressionBuilder - Advanced decompression with options
  • ArchiveBuilder - Create multi-file archives
  • ArchiveExtractor - Extract from archives with options

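§Streaming API

  • Compressor - Stateful streaming compressor
  • Decompressor - Stateful streaming decompressor
  • stream_writer - Wrap a writer for streaming compression
  • stream_reader - Wrap a reader for streaming decompression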

§Progress Tracking

  • ProgressInfo - Progress information for long operations
  • ProgressCallback - Callback type for progress updates
  • ProcessingStage - Different stages of compression/decompression
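The callback pattern behind these types can be sketched without the crate. In the example below, `Progress` and `process_chunks` are hypothetical stand-ins written for illustration; they are not the crate's `ProgressInfo` or `ProgressCallback`, whose exact fields and signatures are not shown on this page.

```rust
/// Hypothetical stand-in for a progress report; not the crate's `ProgressInfo`.
#[derive(Debug, Clone, Copy)]
struct Progress {
    bytes_done: u64,
    bytes_total: u64,
}

impl Progress {
    /// Completion as a percentage; an empty input counts as fully done.
    fn percentage(&self) -> f64 {
        if self.bytes_total == 0 {
            100.0
        } else {
            self.bytes_done as f64 * 100.0 / self.bytes_total as f64
        }
    }
}

/// Walk `data` in chunks and report progress after each one.
/// `chunk_size` must be non-zero (`slice::chunks` panics on 0).
fn process_chunks<F: FnMut(&Progress)>(data: &[u8], chunk_size: usize, mut on_progress: F) {
    let total = data.len() as u64;
    let mut done = 0u64;
    for chunk in data.chunks(chunk_size) {
        // Real per-chunk work (e.g. encoding) would happen here.
        done += chunk.len() as u64;
        on_progress(&Progress {
            bytes_done: done,
            bytes_total: total,
        });
    }
}
```

Taking the callback as `FnMut` lets callers update local state, such as a progress bar or a log buffer, from inside the closure.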

§📚 Module Organization

§compress

High-level compression functions and builder pattern for advanced options.

§decompress

High-level decompression functions and builder pattern for advanced options.

§archive

Multi-file archive operations including creation, extraction, and listing.

§stream

Low-level streaming utilities for custom I/O patterns.

§error

Comprehensive error hierarchy with context and suggestions.

§progress

Progress tracking utilities with callback support.

§💾 Memory Management

mismall is designed for bounded memory usage regardless of file size:

System Type      | Recommended Chunk Size | Memory Usage
Low (1GB)        | 64KB                   | Minimal
Standard (8GB+)  | 16MB (default)         | Balanced
High (32GB+)     | 1GB                    | Maximum

Why this matters: The streaming architecture processes files in chunks, never loading entire files into memory. This enables compression of multi-gigabyte files on resource-constrained systems.
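To make the bounded-memory claim concrete, here is a sketch of chunked processing using only `std::io::Read`. It counts byte frequencies, which is the usual first pass of a Huffman coder. `byte_histogram` is a hypothetical function for illustration and does not reflect how mismall is implemented internally.

```rust
use std::io::{self, Read};

/// Count byte frequencies from any reader using one fixed-size buffer.
///
/// Peak memory is `chunk_size` bytes plus the 256-entry table,
/// regardless of how long the input is. `chunk_size` should be non-zero;
/// with a zero-length buffer the loop ends immediately.
fn byte_histogram<R: Read>(mut reader: R, chunk_size: usize) -> io::Result<[u64; 256]> {
    let mut freqs = [0u64; 256];
    let mut buf = vec![0u8; chunk_size];
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // Interrupted reads are safe to retry.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        for &b in &buf[..n] {
            freqs[b as usize] += 1;
        }
    }
    Ok(freqs)
}
```

Because the function is generic over `Read`, the same loop works for a `File`, a network stream, or an in-memory slice; only the buffer size changes the memory footprint.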

§🔒 Security Features

  • AES-256-GCM encryption with authenticated data integrity
  • Password-based key derivation using PBKDF2 with random salt
  • Authentication tags prevent tampering and verify data integrity
  • Optional encryption: compress without passwords for speed

§⚡ Performance Characteristics

  • Compression: Huffman coding optimized for text and binary data
  • Raw-store heuristic: Automatically skips compression for incompressible data
  • Chunked I/O: Overlaps computation with I/O for better throughput
  • Zero-copy operations: Minimizes memory allocations where possible

§Feature Flags

  • compression (default): Compression and decompression functionality
  • archives (default): Multi-file archive operations
  • encryption (default): AES-256-GCM encryption support
  • cli: Command-line interface (enables all other features)

§Error Handling

All library functions return Result<T, MismallError> where MismallError provides detailed error information with context for troubleshooting.
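The general shape of this pattern, an error enum with context plus `?` propagation, can be sketched as follows. Everything here is hypothetical and for illustration only: `SketchError`, `SKETCH_MAGIC`, and `check_header` are not part of mismall, and the crate's `MismallError` variants and `MAGIC_BYTES` value are not shown on this page.

```rust
use std::fmt;
use std::io::{self, Read};

/// Hypothetical error type; the crate's `MismallError` may look different.
#[derive(Debug)]
enum SketchError {
    Io(io::Error),
    BadMagic { found: [u8; 4] },
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::Io(e) => write!(f, "I/O error: {e}"),
            SketchError::BadMagic { found } => {
                write!(f, "unexpected magic bytes: {found:?}")
            }
        }
    }
}

impl std::error::Error for SketchError {}

// Lets `?` convert `io::Error` into `SketchError` automatically.
impl From<io::Error> for SketchError {
    fn from(e: io::Error) -> Self {
        SketchError::Io(e)
    }
}

/// Placeholder signature for this sketch, not the crate's `MAGIC_BYTES`.
const SKETCH_MAGIC: [u8; 4] = *b"DEMO";

/// Read and validate a 4-byte header, reporting what was found on mismatch.
fn check_header<R: Read>(mut reader: R) -> Result<(), SketchError> {
    let mut magic = [0u8; 4];
    reader.read_exact(&mut magic)?;
    if magic != SKETCH_MAGIC {
        return Err(SketchError::BadMagic { found: magic });
    }
    Ok(())
}
```

Callers can match on the variant to decide between retrying, prompting the user, or aborting, while the `Display` output carries the context needed for troubleshooting.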

§License

MIT - do whatever you want, just don’t claim you wrote it.

Re-exports§

pub use error::CompressionError;
pub use error::DecompressionError;
pub use error::MismallError;
pub use compress::compress_file;
pub use compress::compress_stream;
pub use compress::validate_chunk_size;
pub use decompress::decompress_file;
pub use decompress::decompress_stream;
pub use archive::extract_archive;
pub use archive::list_archive_contents;
pub use archive::ArchiveBuilder;
pub use archive::ArchiveExtractor;
pub use archive::ArchiveInfo;
pub use archive::FileInfo;
pub use stream::stream_reader;
pub use stream::stream_writer;
pub use stream::Compressor;
pub use stream::Decompressor;

Modules§

archive
Archive operations API
compress
High-level compression API
decompress
High-level decompression API
error
Error types for mismall compression library
progress
Progress tracking and callback utilities
stream
Streaming utilities for stateful compression and decompression
tests

Constants§

DEFAULT_CHUNK_SIZE
MAGIC_BYTES
VERSION