Crate mmap_snapshot

Safe mmap() with snapshot isolation and atomic commits.

(Linux-only, works best on XFS/btrfs.)

§Example

Mmap a file as &[u8]:

use mmap_snapshot::Mmap;

let mmap = Mmap::open(&path)?;
assert_eq!(mmap.len(), 12);
assert_eq!(&mmap[..], b"Hello world!");

Mmap a file as &mut [u8], committing the changes back to disk:

use mmap_snapshot::MmapMut;

let mut mmap = MmapMut::open(&path)?;
mmap[6..11].copy_from_slice(b"sekai");
mmap.commit()?;
assert_eq!(std::fs::read_to_string(&path)?, "Hello sekai!");

§Safety

The unsafe thing about mmapping a file is that it gives you volatile memory: when someone modifies the file, the memory changes. This is not the way a respectable &[u8] (or even &mut [u8]) should behave.

So we use a trick: instead of mapping the file directly, we map a private “snapshot” of the file which doesn’t change, even when the file is externally modified. The only way to modify the snapshot is via the mmap, which makes it valid according to Rust’s rules.
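To make the snapshot semantics concrete, here is a minimal sketch that emulates them with a plain in-memory copy. It is not how this crate works internally (the crate maps a cloned file rather than reading it into a `Vec`), and `CopiedSnapshot` is a hypothetical name invented for this illustration. It only shows the property that matters: once the snapshot is taken, later writes to the original file are not visible through it.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for a snapshot: a private, owned copy of the
/// file's bytes taken at one point in time. This is always an O(n) copy,
/// unlike a reflink-based snapshot.
pub struct CopiedSnapshot {
    bytes: Vec<u8>,
}

impl CopiedSnapshot {
    /// Reads the whole file once; the file is never consulted again.
    pub fn open(path: &Path) -> io::Result<Self> {
        Ok(Self { bytes: fs::read(path)? })
    }

    /// The bytes as they were when `open` was called.
    pub fn as_slice(&self) -> &[u8] {
        &self.bytes
    }
}
```

If you open a `CopiedSnapshot`, overwrite the file with `std::fs::write`, and then call `as_slice()`, you still get the old contents. That is the behaviour a `&[u8]` is supposed to have, and it is what a direct mapping of the file cannot promise.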

See the SAFETY comments in the code for a more thorough explanation.

§Performance

The cost of safety? On my machine, Mmap::open() takes just 0.1 ms longer than File::open() - that’s it! And it doesn’t matter how big the file is. A small price to pay.
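If you want to check the overhead on your own machine and filesystem, a rough timing helper along these lines may be enough. This is only a sketch: `mean_duration` is a hypothetical helper, not part of this crate, and a proper benchmark harness will give more trustworthy numbers.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `op` `iterations` times and returns the mean wall-clock time per
/// call. The result of each call is dropped inside the loop, so the cost
/// of dropping it (for example closing a file) is included.
pub fn mean_duration<T>(iterations: u32, mut op: impl FnMut() -> T) -> Duration {
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iterations {
        // black_box discourages the optimizer from discarding the call.
        black_box(op());
    }
    start.elapsed() / iterations
}
```

For a comparison, call it once with a closure that runs `File::open(&path)` and once with a closure that runs `Mmap::open(&path)`, using the same file and the same iteration count, then compare the two means.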

But there’s a catch: if the file is on a filesystem which doesn’t support reflinks then we have to copy the whole file. Therefore, while the semantics are the same on all filesystems, the performance characteristics vary wildly.

This table shows whether methods are constant-time or linear-time in the size of the file:

Method              XFS    btrfs  ext4   tmpfs
open()              O(1)   O(1)   O(n)   O(n)
commit()            O(1)   O(1)   O(n)   O(n)
commit_and_close()  O(1)   O(1)   O(1)   O(1)

See the method docs for more details.

If the file is on a reflink-capable filesystem, the overhead is so tiny that there’s really no reason not to snapshot it. However, although many distros now default to reflink-capable filesystems for new installs[1], ext4 will remain common in the wild for many years to come. So be aware that some of your users may experience stalls when mmapping large files.

§Platform support

We make the snapshot by cloning the original file into a private (unlinked) file. It’s impossible for anyone else to modify this file, which is what makes it safe to mmap. On Linux we use O_TMPFILE for this. I don’t know of a race-free way to create an unlinked file on macOS/Windows; if one exists, please open an issue to let me know!
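To illustrate what a private, unlinked file is, the sketch below uses only the standard library. It is deliberately not the crate's approach: it creates a named copy and unlinks it afterwards, which leaves a short window in which another process could open the file by name. That window is exactly the race O_TMPFILE avoids. The function names and the `scratch` parameter are hypothetical, and the example relies on Unix semantics, where an open file stays usable after its name is removed.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Read};
use std::path::Path;

/// Copies `src` to `scratch`, opens the copy, then removes its name.
/// The returned handle is the only remaining way to reach the data.
/// NOTE: racy between the copy and the unlink, unlike O_TMPFILE.
pub fn unlinked_copy(src: &Path, scratch: &Path) -> io::Result<File> {
    fs::copy(src, scratch)?;
    let file = OpenOptions::new().read(true).write(true).open(scratch)?;
    fs::remove_file(scratch)?;
    Ok(file)
}

/// Reads everything from the file's current position to its end.
pub fn read_all(file: &mut File) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    file.read_to_end(&mut buf)?;
    Ok(buf)
}
```

After `unlinked_copy` returns, `scratch` no longer exists on disk, and rewriting `src` does not affect what `read_all` returns from the handle.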


  1. The major exceptions are Debian and Ubuntu, which select ext4 by default in the installer. This is, frankly, a bad decision. From its creation, ext4 was intended as a “stop-gap” to give people more time to migrate away from the ext* family of filesystems. Encouraging its use on fresh installs is poor. 

§Structs

Mmap
A point-in-time snapshot of a file
MmapMut
A mutable snapshot of a file