ranged-mmap 0.1.2

Type-safe memory-mapped file library with lock-free concurrent writes to non-overlapping ranges
Documentation

ranged-mmap

Crates.io Documentation License

English | δΈ­ζ–‡

A type-safe, high-performance memory-mapped file library optimized for lock-free concurrent writes to non-overlapping ranges.

Features

  • πŸš€ Zero-Copy Writes: Data is written directly to mapped memory without system calls
  • πŸ”’ Lock-Free Concurrency: Multiple threads can write to different file regions simultaneously without locks
  • βœ… Type-Safe API: Prevents overlapping writes at compile-time through the type system
  • πŸ“¦ Reference Counting: Can be cloned and shared among multiple workers
  • ⚑ High Performance: Optimized for concurrent random writes (see benchmarks)
  • πŸ”§ Manual Flushing: Fine-grained control over when data is synchronized to disk
  • 🌐 Runtime Agnostic: Works with any async runtime (tokio, async-std) or without one

When to Use

Perfect for:

  • 🌐 Multi-threaded downloaders: Concurrent writes to different file chunks
  • πŸ“ Logging systems: Multiple threads writing to different log regions
  • πŸ’Ύ Database systems: Concurrent updates to different data blocks
  • πŸ“Š Large file processing: Parallel processing of different file sections

Not suitable for:

  • Files that need dynamic resizing (size must be known at creation)
  • Sequential or small file operations (overhead not justified)
  • Systems with limited virtual memory

Quick Start

Add to your Cargo.toml:

[dependencies]

ranged-mmap = "0.1"

Type-Safe Version (Recommended)

The MmapFile API provides compile-time safety guarantees through range allocation:

use ranged_mmap::MmapFile;

fn main() -> std::io::Result<()> {
    // Create a 1MB file and range allocator
    let (file, mut allocator) = MmapFile::create("output.bin", 1024 * 1024)?;

    // Allocate non-overlapping ranges in the main thread
    let range1 = allocator.allocate(512 * 1024).unwrap(); // [0, 512KB)
    let range2 = allocator.allocate(512 * 1024).unwrap(); // [512KB, 1MB)

    // Concurrent writes to different ranges (compile-time safe!)
    std::thread::scope(|s| {
        let f1 = file.clone();
        let f2 = file.clone();
        
        s.spawn(move || {
            let receipt = f1.write_range(range1, &vec![1u8; 512 * 1024]).unwrap();
            f1.flush_range(receipt).unwrap();
        });
        
        s.spawn(move || {
            let receipt = f2.write_range(range2, &vec![2u8; 512 * 1024]).unwrap();
            f2.flush_range(receipt).unwrap();
        });
    });

    // Final synchronous flush to ensure all data is written
    unsafe { file.sync_all()?; }
    Ok(())
}

Unsafe Version (Maximum Performance)

For scenarios where you can manually guarantee non-overlapping writes:

use ranged_mmap::MmapFileInner;

fn main() -> std::io::Result<()> {
    let file = MmapFileInner::create("output.bin", 1024)?;

    let file1 = file.clone();
    let file2 = file.clone();

    std::thread::scope(|s| {
        // ⚠️ Safety: You must ensure non-overlapping regions
        s.spawn(|| unsafe { 
            file1.write_at(0, &[1; 512]).unwrap();
        });
        s.spawn(|| unsafe { 
            file2.write_at(512, &[2; 512]).unwrap();
        });
    });

    unsafe { file.flush()?; }
    Ok(())
}

API Overview

Main Types

  • MmapFile: Type-safe memory-mapped file with compile-time safety
  • MmapFileInner: Unsafe high-performance version for manual safety management
  • RangeAllocator: Allocates non-overlapping file ranges sequentially
  • AllocatedRange: Represents a valid, non-overlapping file range
  • WriteReceipt: Proof that a range has been written (enables type-safe flushing)

Core Methods

MmapFile (Type-Safe)

// Create file and allocator
let (file, mut allocator) = MmapFile::create(path, size)?;

// Allocate ranges (main thread)
let range = allocator.allocate(1024)?;

// Write to range (returns receipt)
let receipt = file.write_range(range, data)?;

// Flush using receipt
file.flush_range(receipt)?;

// Sync all data to disk
unsafe { file.sync_all()?; }

MmapFileInner (Unsafe)

// Create file
let file = MmapFileInner::create(path, size)?;

// Write at offset (must ensure non-overlapping)
unsafe { file.write_at(offset, data)?; }

// Flush to disk
unsafe { file.flush()?; }

Safety Guarantees

Compile-Time Safety (MmapFile)

The type system ensures:

  • βœ… All ranges are allocated through RangeAllocator
  • βœ… Ranges are allocated sequentially, preventing overlaps
  • βœ… Data length must match range length
  • βœ… Only written ranges can be flushed (via WriteReceipt)

Runtime Safety (MmapFileInner)

You must ensure:

  • ⚠️ Different threads write to non-overlapping memory regions
  • ⚠️ No reads occur to a region during writes
  • ⚠️ Proper synchronization if violating the above rules

Performance

This library is optimized for concurrent random write scenarios. Compared to standard tokio::fs::File, it offers:

  • Zero system calls for writes: Direct memory modification
  • No locks required: True parallel writes to different regions
  • Batch flushing: Control when data is synchronized to disk

See benches/concurrent_write.rs for detailed benchmarks.

Advanced Usage

With Tokio Runtime

use ranged_mmap::MmapFile;
use tokio::task;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let (file, mut allocator) = MmapFile::create("output.bin", 1024 * 1024)?;
    
    // Allocate ranges
    let range1 = allocator.allocate(512 * 1024).unwrap();
    let range2 = allocator.allocate(512 * 1024).unwrap();
    
    // Spawn async tasks
    let f1 = file.clone();
    let f2 = file.clone();
    
    let task1 = task::spawn_blocking(move || {
        f1.write_range(range1, &vec![1u8; 512 * 1024])
    });
    
    let task2 = task::spawn_blocking(move || {
        f2.write_range(range2, &vec![2u8; 512 * 1024])
    });
    
    let receipt1 = task1.await.unwrap()?;
    let receipt2 = task2.await.unwrap()?;
    
    // Flush specific ranges
    file.flush_range(receipt1)?;
    file.flush_range(receipt2)?;
    
    unsafe { file.sync_all()?; }
    Ok(())
}

Reading Data

use ranged_mmap::MmapFile;

fn main() -> std::io::Result<()> {
    let (file, mut allocator) = MmapFile::create("output.bin", 1024)?;
    let range = allocator.allocate(100).unwrap();
    
    // Write data
    file.write_range(range, &[42u8; 100])?;
    
    // Read back
    let mut buf = vec![0u8; 100];
    file.read_range(range, &mut buf)?;
    
    assert_eq!(buf, vec![42u8; 100]);
    Ok(())
}

Opening Existing Files

use ranged_mmap::MmapFile;

fn main() -> std::io::Result<()> {
    // Open existing file
    let (file, mut allocator) = MmapFile::open("existing.bin")?;
    
    println!("File size: {} bytes", file.size());
    println!("Remaining allocatable: {} bytes", allocator.remaining());
    
    // Continue allocating and writing
    if let Some(range) = allocator.allocate(1024) {
        file.write_range(range, &[0u8; 1024])?;
    }
    
    Ok(())
}

Limitations

  • Fixed Size: File size must be specified at creation and cannot be changed
  • Virtual Memory: Maximum file size is limited by system virtual memory
  • Platform Support: Currently optimized for Unix-like systems and Windows
  • No Built-in Locking: Users must manage concurrent access patterns

How It Works

  1. Memory Mapping: The file is memory-mapped using memmap2, making it accessible as a continuous memory region
  2. Range Allocation: RangeAllocator sequentially allocates non-overlapping ranges
  3. Type Safety: AllocatedRange can only be created through the allocator, guaranteeing validity
  4. Lock-Free Writes: Each thread writes to its own AllocatedRange, avoiding locks
  5. Manual Flushing: Users control when data is synchronized to disk for optimal performance

Comparison

Feature ranged-mmap tokio::fs::File std::fs::File
Concurrent writes βœ… Lock-free ❌ Requires locks ❌ Requires locks
Zero-copy βœ… Yes ❌ No ❌ No
Type safety βœ… Compile-time ⚠️ Runtime ⚠️ Runtime
System calls (write) βœ… Zero ❌ Per write ❌ Per write
Dynamic size ❌ Fixed βœ… Yes βœ… Yes
Async support βœ… Runtime agnostic βœ… Tokio only ❌ No

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

License

This project is licensed under either of:

at your option.

Acknowledgments

Built on top of the excellent memmap2 crate.