pub struct File { /* private fields */ }
io-filesystem only.
A file implementing Blocks.
These files can be used either directly, or via FileSequence. In both
cases, the behavior for creating and opening a file is the same.
Internally, it uses the std::fs::File API.
§Linux specifics
The implementation has Linux-specific tweaks, which are optimizations around the syscall API that the kernel provides.
When the file is created, it is pre-allocated with fallocate, so that
the actual space is guaranteed by the filesystem and blocks will not run
out of it during runtime.
Reads use normal vectored I/O via the p* syscalls, which saves a seek.
The kernel is advised that the read pattern is sequential, which
doubles the read-ahead page count. After every read, the page cache is
dropped for that range.
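To illustrate why a positional read saves a seek, here is a minimal sketch that uses only the standard library. It is not this crate's implementation: `read_block_seek` and `read_block_positional` are hypothetical helpers, and `FileExt::read_exact_at` (a single-buffer positional read) stands in for the vectored p* syscalls. The read-ahead advice and page cache dropping are left out.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
#[cfg(unix)]
use std::os::unix::fs::FileExt;

/// Portable variant: a seek to the block offset, followed by a read.
fn read_block_seek(
    file: &mut File,
    block: u64,
    block_shift: u32,
    buf: &mut [u8],
) -> io::Result<()> {
    file.seek(SeekFrom::Start(block << block_shift))?;
    file.read_exact(buf)
}

/// Unix variant: the offset travels with the read itself, so no seek is
/// needed and the file cursor is left untouched.
#[cfg(unix)]
fn read_block_positional(
    file: &File,
    block: u64,
    block_shift: u32,
    buf: &mut [u8],
) -> io::Result<()> {
    file.read_exact_at(buf, block << block_shift)
}
```

Note that the positional variant only needs `&File`, because it does not move the shared file cursor.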
Writes are normally done in chunks of 8 MiB; the actual chunk can be
smaller or larger, depending on the block size and the maximum iovec
length per syscall. Chunks are aligned at block size boundaries. If the
write fits into a single syscall, it is done with the RWF_DSYNC flag.
Otherwise, sync_file_range starts an asynchronous sync on the chunk just
written, the next chunk is written and its asynchronous sync is started,
and then the sync of the previous chunk is waited on. Since the pattern is
append-only, no overwrites are expected once the blocks are done, and reads
go into memory that is managed by a block stream, the page cache is dropped
every time a chunk has been written. So, in the ideal case, the page cache
should not consume more than a chunk size worth of memory.
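The chunking described above can be sketched as follows. The exact rule is not spelled out here, so this is only an assumption: the sketch takes the largest whole number of blocks that fits into 8 MiB (at least one block) and ignores the iovec limit entirely. `CHUNK_TARGET`, `chunk_len` and `chunks` are hypothetical names, not part of this crate.

```rust
/// Nominal chunk size taken from the description above.
const CHUNK_TARGET: u64 = 8 << 20; // 8 MiB

/// Assumed rule: as many whole blocks as fit into 8 MiB, but never less
/// than one block, so every chunk stays aligned at a block boundary.
fn chunk_len(block_shift: u32) -> u64 {
    let block_size = 1u64 << block_shift;
    (CHUNK_TARGET / block_size).max(1) * block_size
}

/// Splits `total_len` bytes into `(offset, length)` pairs, one per chunk.
fn chunks(total_len: u64, block_shift: u32) -> Vec<(u64, u64)> {
    let step = chunk_len(block_shift);
    let mut out = Vec::new();
    let mut offset = 0;
    while offset < total_len {
        let len = step.min(total_len - offset);
        out.push((offset, len));
        offset += len;
    }
    out
}
```

With 4 KiB blocks this yields chunks of exactly 8 MiB, while with 256 MiB blocks a chunk grows to one full block, which matches the "smaller or larger" wording only under the stated assumption.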
§Handling sync errors
This implementation takes a paranoid approach of failing any further
reads or writes if sync returns an error. The returned error is set to
FileSyncError to allow callers to detect this specific case. To recover,
the file has to be re-opened.
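The "fail everything after a failed sync" policy can be sketched with a sticky flag. This is an illustration under assumptions, not the crate's code: `SyncFailed` is a hypothetical stand-in for FileSyncError, `Poisoning` is a made-up helper, and wrapping the marker in an `io::Error` is only one possible way to let callers detect the case.

```rust
use std::io;

/// Hypothetical stand-in for the crate's `FileSyncError`.
#[derive(Debug)]
struct SyncFailed;

impl std::fmt::Display for SyncFailed {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.write_str("a previous sync failed; re-open the file")
    }
}

impl std::error::Error for SyncFailed {}

/// Hypothetical helper that remembers whether a sync has ever failed.
struct Poisoning {
    poisoned: bool,
}

impl Poisoning {
    fn new() -> Self {
        Self { poisoned: false }
    }

    /// Runs a read or write `op` unless an earlier sync failed.
    fn guard<T>(&self, op: impl FnOnce() -> io::Result<T>) -> io::Result<T> {
        if self.poisoned {
            return Err(io::Error::new(io::ErrorKind::Other, SyncFailed));
        }
        op()
    }

    /// Runs `sync`; on failure, poisons every further operation.
    fn sync(&mut self, sync: impl FnOnce() -> io::Result<()>) -> io::Result<()> {
        if self.guard(sync).is_err() {
            self.poisoned = true;
            return Err(io::Error::new(io::ErrorKind::Other, SyncFailed));
        }
        Ok(())
    }
}

/// How a caller could detect the sticky error and decide to re-open.
fn is_sync_failure(err: &io::Error) -> bool {
    err.get_ref().map_or(false, |inner| inner.is::<SyncFailed>())
}
```

The flag is never cleared, which mirrors the requirement that the file has to be re-opened to recover.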
For more context, see the summary on the PostgreSQL wiki, which also links to the fsyncgate thread from 2018; it is an interesting read.
Implementations§
impl File
pub fn create<P: AsRef<Path>>(path: P, block_count: u64, block_shift: u32) -> Result<File>
Creates a file at path with the provided block_count and
block_shift.
The file length is set up front to the maximum implied by the input arguments, which avoids syncing metadata during operation. Additionally, on Linux the file is pre-allocated, which prevents it from running out of space.
§Errors
If block_count is 0, block_shift is less than 12 or greater than 28,
or the total file size is greater than i64::MAX, an error of
io::ErrorKind::InvalidInput kind is returned with a message explaining
the problem.
In other cases, returns the IO error from the underlying fs::File
API or the operating system.
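The argument checks listed above can be sketched as a standalone function. This is a guess at the shape of the validation rather than the crate's code: `checked_file_len` and the two constants are hypothetical names, and the error messages are invented.

```rust
use std::io;

const MIN_BLOCK_SHIFT: u32 = 12; // 4 KiB blocks
const MAX_BLOCK_SHIFT: u32 = 28; // 256 MiB blocks

/// Returns the total file length in bytes, or an `InvalidInput` error
/// for any of the documented invalid argument combinations.
fn checked_file_len(block_count: u64, block_shift: u32) -> io::Result<u64> {
    let invalid = |msg: &str| io::Error::new(io::ErrorKind::InvalidInput, msg.to_string());
    if block_count == 0 {
        return Err(invalid("block_count must not be 0"));
    }
    if !(MIN_BLOCK_SHIFT..=MAX_BLOCK_SHIFT).contains(&block_shift) {
        return Err(invalid("block_shift must be within 12..=28"));
    }
    // `checked_mul` catches u64 overflow; the filter enforces the i64 limit.
    block_count
        .checked_mul(1u64 << block_shift)
        .filter(|len| *len <= i64::MAX as u64)
        .ok_or_else(|| invalid("total file size exceeds i64::MAX"))
}
```

For example, 4 blocks with a shift of 12 give a 16384-byte file, while a shift of 11 is rejected before any file is touched.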
pub fn open<P: AsRef<Path>>(path: P, block_shift: u32) -> Result<File>
Opens a file at path with the given block_shift.
The block count is calculated from the file length, which must be aligned.
§Errors
If block_shift is less than 12 or greater than 28, an error of
io::ErrorKind::InvalidInput kind is returned with a message explaining
the problem.
If the file length is 0, larger than i64::MAX, or not divisible by
the block size, an error of io::ErrorKind::InvalidData is returned
along with a message.
In other cases, returns the IO error from the underlying fs::File
API or the operating system.
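Deriving the block count from the file length, together with the checks above, might look like the sketch below. `block_count_from_len` is a hypothetical helper and the messages are invented; only the error kinds follow the description.

```rust
use std::io;

/// Derives the block count from a file length, rejecting lengths that
/// are empty, too large, or not a multiple of the block size.
fn block_count_from_len(len: u64, block_shift: u32) -> io::Result<u64> {
    if !(12..=28).contains(&block_shift) {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "block_shift must be within 12..=28",
        ));
    }
    let block_size = 1u64 << block_shift;
    if len == 0 || len > i64::MAX as u64 || len % block_size != 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "file length is 0, too large, or not block-aligned",
        ));
    }
    Ok(len >> block_shift)
}
```

An 8192-byte file opened with a shift of 12 would report 2 blocks, whereas a 4097-byte file is rejected as misaligned.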
Trait Implementations§
impl Blocks for File
fn load_from(&mut self, block: u64, bufs: &mut [IoSliceMut<'_>]) -> Result<()>
Loads the data from a file into bufs starting from block.
There are two implementations: a generic one and one specialized for the
Linux kernel. The generic implementation seeks to a block, then does
a vectored read. For Linux specifics, check the
type documentation.
§Errors
If bufs length exceeds i32::MAX, an error of
io::ErrorKind::InvalidInput kind is returned.
If the end of file is reached before bufs has been filled, an
error of io::ErrorKind::UnexpectedEof is returned. In other cases,
returns the underlying IO error.
Note that sync errors persistently make the file unusable. See the type documentation for more details.
fn store_at(&mut self, block: u64, bufs: &mut [IoSlice<'_>]) -> Result<()>
Stores the data from bufs by writing them to a file starting at
block.
There are two implementations: a generic one and one specialized for the
Linux kernel. The generic implementation seeks to a block, then does
a vectored write of the buffers, and finishes with a single flush. This is
not the most efficient pattern, but it is portable.
For Linux specifics, check the type documentation.
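The generic pattern (seek, vectored write, single flush) can be sketched with the standard library alone. This is an approximation under assumptions: `store_generic` is a hypothetical function, it is generic over any `Write + Seek` target instead of a file, and it takes plain byte slices rather than `&mut [IoSlice<'_>]` so that partial writes can be handled with simple slicing.

```rust
use std::io::{self, IoSlice, Seek, SeekFrom, Write};

/// Seeks to `block`, writes all buffers with vectored writes, then flushes.
fn store_generic<W: Write + Seek>(
    out: &mut W,
    block: u64,
    block_shift: u32,
    bufs: &[&[u8]],
) -> io::Result<()> {
    out.seek(SeekFrom::Start(block << block_shift))?;
    // Buffers that still have unwritten bytes, in order.
    let mut remaining: Vec<&[u8]> = bufs.iter().copied().filter(|b| !b.is_empty()).collect();
    while !remaining.is_empty() {
        let slices: Vec<IoSlice<'_>> = remaining.iter().map(|b| IoSlice::new(*b)).collect();
        let mut written = out.write_vectored(&slices)?;
        if written == 0 {
            return Err(io::ErrorKind::WriteZero.into());
        }
        // Drop fully written buffers, then trim the partially written one.
        while written > 0 {
            let head: &[u8] = remaining[0];
            if written >= head.len() {
                written -= head.len();
                remaining.remove(0);
            } else {
                remaining[0] = &head[written..];
                written = 0;
            }
        }
    }
    out.flush()
}
```

A vectored write may accept fewer bytes than offered, which is why the loop re-submits whatever is left until every buffer is drained.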
§Errors
If the structure of bufs does not match what is guaranteed by the BlockStream
implementation, an error of io::ErrorKind::InvalidInput is returned.
See Blocks::store_at for details.
If the total length of bufs exceeds the capacity of the file, an error
of io::ErrorKind::OutOfMemory is returned. If the underlying file
writes zero bytes, an error of io::ErrorKind::WriteZero is returned.
In other cases, an IO error is returned.
Note that sync errors persistently make the file unusable. See the
File documentation for more details.
fn block_count(&self) -> u64
Stream.
fn block_shift(&self) -> u32
Stream.