// Unless explicitly stated otherwise all files in this repository are licensed
// under the MIT/Apache-2.0 License, at your convenience
//
// This product includes software developed at Datadog (https://www.datadoghq.com/). Copyright 2020 Datadog, Inc.
//! `glommio::io` provides data structures targeted towards File I/O.
//!
//! File I/O in Glommio comes in two kinds: Buffered and Direct I/O.
//!
//! Ideally an application would pick one of them according to its needs and not
//! mix both. However, if you do want to mix both, it is recommended that you do
//! not do so on the same device: kernel settings like I/O schedulers and merge
//! settings that are beneficial to one of them can be detrimental to the
//! other.
//!
//! If you absolutely must use both on the same device, avoid issuing both
//! Direct and Buffered I/O to the same file: at this point you are just trying
//! to drive Linux crazy.
//!
//! Buffered I/O
//! ============
//!
//! Buffered I/O will use the operating system page cache. It is ideal for
//! simpler applications that don't want to deal with caching policies and for
//! which I/O performance is important, but not a crucial part of their
//! performance story.
//!
//! Disadvantages of Buffered I/O:
//!  * Hard to know when resources are really used, which makes controlling
//!    resource usage almost impossible (the time of the write to the device is
//!    detached from the time of the write to the file).
//!  * More copies than necessary, as the data has to be copied from the device
//!    to the page cache, from the page cache to the internal file buffers, and,
//!    in abstract linear implementations like [`AsyncWriteExt`] and
//!    [`AsyncReadExt`], from user-provided buffers to the file internal buffers.
//!  * Advanced io_uring features like non-interrupt (polled) mode, registered
//!    files, and registered buffers will not work with Buffered I/O.
//!  * Read amplification for small random reads, as the OS is bounded by the
//!    page size (usually 4kB), even though modern NVMe devices are perfectly
//!    capable of issuing 512-byte I/O.
//!
//! The main structure to deal with Buffered I/O is the [`BufferedFile`] struct.
//! It is targeted at random I/O. Reads from and writes to it expect a position.
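//!
//! A minimal sketch of positional I/O with [`BufferedFile`] (error handling
//! elided; the file name is just for illustration):
//!
//! ```no_run
//! use glommio::{io::BufferedFile, LocalExecutor};
//!
//! let ex = LocalExecutor::default();
//! ex.run(async {
//!     let file = BufferedFile::create("test.txt").await.unwrap();
//!
//!     // Both reads and writes take an explicit position.
//!     file.write_at(vec![0, 1, 2, 3], 0).await.unwrap();
//!     let read = file.read_at(0, 4).await.unwrap();
//!     assert_eq!(*read, [0, 1, 2, 3]);
//!     file.close().await.unwrap();
//! });
//! ```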
//!
//! Direct I/O
//! ==========
//!
//! Direct I/O will not use the Operating System page cache and will always
//! touch the device directly. That works very well for stream-based workloads
//! (scanning a file much larger than memory, writing a buffer that will not be
//! read in the near future, etc.) but requires a user-provided cache for good
//! random-access performance.
//!
//! There are advantages to using a user-provided cache: files usually contain
//! serialized objects, and every read has to deserialize them. A user-provided
//! cache can cache the parsed objects, among other things. Still, not all
//! applications can or want to deal with that complexity.
//!
//! Disadvantages of Direct I/O:
//! * I/O needs to be aligned. Both the buffers and the file positions need
//!   specific alignments. The [`DmaBuffer`] should hide most of that
//!   complexity, but you may still end up with heavy read amplification if you
//!   are not careful.
//! * Without a user-provided cache, random performance can be bad.
//!
//! There are two main interfaces for file Direct I/O:
//!
//! [`DmaFile`] is targeted at random Direct I/O. Reads from and writes to it
//! expect a position.
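//!
//! A minimal sketch of its use (note the DMA buffer allocation, which takes
//! care of the alignment requirements discussed below):
//!
//! ```no_run
//! use glommio::{io::DmaFile, LocalExecutor};
//!
//! let ex = LocalExecutor::default();
//! ex.run(async {
//!     let file = DmaFile::create("test.txt").await.unwrap();
//!
//!     // Allocate a buffer suitably aligned for this file.
//!     let mut buf = file.alloc_dma_buffer(512);
//!     buf.as_bytes_mut()[0] = 12;
//!     file.write_at(buf, 0).await.unwrap();
//!
//!     let read = file.read_at(0, 512).await.unwrap();
//!     assert_eq!(read[0], 12);
//!     file.close().await.unwrap();
//! });
//! ```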
//!
//! [`DmaStreamWriter`] and [`DmaStreamReader`] perform sequential I/O, and their
//! interface is a lot closer to other mainstream Rust interfaces in `std::io`.
//!
//! However, despite being sequential, I/O for the two Stream structs is
//! parallel: [`DmaStreamWriter`] exposes a setting for write-behind, meaning
//! that it will keep accepting writes to its internal buffers even while older
//! writes are still in-flight. In turn, [`DmaStreamReader`] exposes a setting
//! for read-ahead, meaning it will initiate I/O ahead of time for positions
//! you will read in the near future.
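//!
//! A sketch of both streams in action (the write-behind and read-ahead depths
//! here are arbitrary):
//!
//! ```no_run
//! use futures_lite::io::{AsyncReadExt, AsyncWriteExt};
//! use glommio::{
//!     io::{DmaFile, DmaStreamReaderBuilder, DmaStreamWriterBuilder},
//!     LocalExecutor,
//! };
//!
//! let ex = LocalExecutor::default();
//! ex.run(async {
//!     let file = DmaFile::create("test.txt").await.unwrap();
//!     // Keep accepting writes while up to 4 older buffers are in flight.
//!     let mut writer = DmaStreamWriterBuilder::new(file)
//!         .with_write_behind(4)
//!         .build();
//!     writer.write_all(&[0u8; 4096]).await.unwrap();
//!     writer.close().await.unwrap();
//!
//!     let file = DmaFile::open("test.txt").await.unwrap();
//!     // Issue I/O for up to 4 buffers ahead of the current read position.
//!     let mut reader = DmaStreamReaderBuilder::new(file)
//!         .with_read_ahead(4)
//!         .build();
//!     let mut buf = [0u8; 4096];
//!     reader.read_exact(&mut buf).await.unwrap();
//!     reader.close().await.unwrap();
//! });
//! ```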
//!
//! [`BufferedFile`]: struct.BufferedFile.html
//! [`DmaFile`]: struct.DmaFile.html
//! [`DmaBuffer`]: struct.DmaBuffer.html
//! [`DmaStreamWriter`]: struct.DmaStreamWriter.html
//! [`DmaStreamReader`]: struct.DmaStreamReader.html
//! [`AsyncReadExt`]: https://docs.rs/futures-lite/1.11.2/futures_lite/io/trait.AsyncReadExt.html
//! [`AsyncWriteExt`]: https://docs.rs/futures-lite/1.11.2/futures_lite/io/trait.AsyncWriteExt.html

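// Wraps a fallible expression, converting any `Err` into a
// `GlommioError::EnhancedIoError` annotated with the operation name, the file
// path (if known), and the raw file descriptor (if known).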
macro_rules! enhanced_try {
    ($expr:expr, $op:expr, $path:expr, $fd:expr) => {{
        match $expr {
            Ok(val) => Ok(val),
            Err(source) => {
                let enhanced = crate::error::GlommioError::<()>::EnhancedIoError {
                    source,
                    op: $op,
                    path: $path.map(|x| x.to_path_buf()),
                    fd: $fd,
                };
                Err(enhanced)
            }
        }
    }};
    ($expr:expr, $op:expr, $obj:expr) => {{
        enhanced_try!(
            $expr,
            $op,
            $obj.path.as_ref().map(|x| x.as_path()),
            Some($obj.as_raw_fd())
        )
    }};
}

mod buffered_file;
mod buffered_file_stream;
mod directory;
mod dma_file;
mod dma_file_stream;
mod dma_open_options;
mod glommio_file;
mod read_result;

use crate::sys;
use std::path::Path;

pub(super) type Result<T> = crate::Result<T, ()>;

/// Rename an existing file.
///
/// Warning: synchronous operation, will block the reactor
pub async fn rename<P: AsRef<Path>, Q: AsRef<Path>>(old_path: P, new_path: Q) -> Result<()> {
    sys::rename_file(old_path.as_ref(), new_path.as_ref())?;
    Ok(())
}

/// Remove an existing file given its name.
///
/// Warning: synchronous operation, will block the reactor
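///
/// A minimal sketch of its use (it must run inside an executor, since it is
/// `async`):
///
/// ```no_run
/// use glommio::LocalExecutor;
///
/// let ex = LocalExecutor::default();
/// ex.run(async {
///     glommio::io::remove("some-file.txt").await.unwrap();
/// });
/// ```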
pub async fn remove<P: AsRef<Path>>(path: P) -> Result<()> {
    enhanced_try!(
        sys::remove_file(path.as_ref()),
        "Removing",
        Some(path.as_ref()),
        None
    )
}

pub use self::{
    buffered_file::BufferedFile,
    buffered_file_stream::{
        stdin,
        StreamReader,
        StreamReaderBuilder,
        StreamWriter,
        StreamWriterBuilder,
    },
    directory::Directory,
    dma_file::DmaFile,
    dma_file_stream::{
        DmaStreamReader,
        DmaStreamReaderBuilder,
        DmaStreamWriter,
        DmaStreamWriterBuilder,
    },
    dma_open_options::DmaOpenOptions,
    read_result::ReadResult,
};
pub use crate::sys::DmaBuffer;

#[cfg(test)]
mod test {
    use super::*;
    use crate::LocalExecutor;

    #[test]
    fn remove_nonexistent() {
        let local_ex = LocalExecutor::default();
        local_ex.run(async {
            let x = remove("/tmp/this_file_does_not_exist_and_if_you_created_just_to_mess_with_me_you_deserve_this_test_to_fail_and_I_am_not_even_sorry").await;
            assert_eq!(x.unwrap_err().raw_os_error().unwrap(), libc::ENOENT);
        });
    }
}