// SPDX-License-Identifier: Apache-2.0
// Copyright 2023-2025 SUSE LLC
// Author: Nicolai Stange <nstange@suse.de>
//! Definition of the [`NvBlkDev`] trait, a block device abstraction for
//! [`NvFs`](super::fs::NvFs) implementations to build on.
extern crate alloc;
use crateutils_common;
use ;
pub use ;
/// Error type returned by [`NvBlkDev`] primitives.
/// Debugging friendly helper for [`NvBlkDev`] implementations to instantiate
/// [`NvBlkDevIoError::Internal`].
///
/// Panics if `cfg!(debug_assertions)` is on, to allow for debugger examination
/// at the point the logic error has happened. Otherwise a
/// [`NvBlkDevIoError::Internal`] is returned.
/// Future trait implemented by all [`NvBlkDev`] related futures.
///
/// `NvBlkDevFuture` differs from the standard [Rust
/// `Future`](core::future::Future) only in that it takes an additional `dev`
/// argument, thereby potentially avoiding the need of creating and
/// storing additional [`SyncRcPtr`](crate::utils_async::sync_types::SyncRcPtr)
/// clones for the [`NvBlkDev`] instance.
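///
/// The following is a minimal sketch of that idea, not the actual trait
/// definition. `DevFuture`, `MemDev`, `QuerySize` and `WithDev` are hypothetical
/// names introduced for illustration only. The future borrows the device at
/// each `poll()` rather than owning a reference-counted pointer to it, and a
/// thin adaptor can still turn it into a standard `Future` where one is needed.
///
/// ```rust
/// use std::future::Future;
/// use std::pin::Pin;
/// use std::sync::Arc;
/// use std::task::{Context, Poll, Wake, Waker};
///
/// // Hypothetical stand-in for a block device implementation.
/// struct MemDev {
///     io_blocks: u64,
/// }
///
/// // Future-like trait whose `poll()` receives the device as an argument.
/// trait DevFuture<D> {
///     type Output;
///     fn poll(self: Pin<&mut Self>, dev: &D, cx: &mut Context<'_>) -> Poll<Self::Output>;
/// }
///
/// // Toy operation: report the device size. It stores no device reference.
/// struct QuerySize;
///
/// impl DevFuture<MemDev> for QuerySize {
///     type Output = u64;
///     fn poll(self: Pin<&mut Self>, dev: &MemDev, _cx: &mut Context<'_>) -> Poll<u64> {
///         Poll::Ready(dev.io_blocks)
///     }
/// }
///
/// // Adaptor pairing a `DevFuture` with a borrowed device, yielding a `Future`.
/// struct WithDev<'a, D, F> {
///     dev: &'a D,
///     fut: F,
/// }
///
/// impl<'a, D, F: DevFuture<D> + Unpin> Future for WithDev<'a, D, F> {
///     type Output = F::Output;
///     fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
///         let this = self.get_mut();
///         DevFuture::poll(Pin::new(&mut this.fut), this.dev, cx)
///     }
/// }
///
/// // Waker doing nothing, sufficient for futures which are ready immediately.
/// struct NoopWake;
///
/// impl Wake for NoopWake {
///     fn wake(self: Arc<Self>) {}
/// }
///
/// // Poll the toy operation once directly and once through the adaptor.
/// fn demo() -> (Poll<u64>, Poll<u64>) {
///     let dev = MemDev { io_blocks: 1024 };
///     let waker = Waker::from(Arc::new(NoopWake));
///     let mut cx = Context::from_waker(&waker);
///     let mut direct = QuerySize;
///     let a = DevFuture::poll(Pin::new(&mut direct), &dev, &mut cx);
///     let mut adapted = WithDev { dev: &dev, fut: QuerySize };
///     let b = Future::poll(Pin::new(&mut adapted), &mut cx);
///     (a, b)
/// }
/// ```
///
/// In this sketch both ways of polling yield the same result. Only the adaptor
/// needs to hold on to the device, and only for as long as it exists.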
/// Trait defining an interface to block device like storage backends for
/// [`NvFs`](super::fs::NvFs) implementations to build on.
///
/// Define primitives for querying a physical storage backend about its
/// dimensions and characteristics, as well as for reading, writing and trimming
/// contiguous, [block](Self::io_block_size_128b_log2) aligned regions.
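///
/// As a rough illustration of the block alignment requirement, the sketch below
/// assumes that `io_block_size_128b_log2` denotes the base-2 logarithm of the
/// Device IO Block size in units of 128 bytes, as the method name suggests. The
/// helper functions are hypothetical and not part of this trait.
///
/// ```rust
/// // Size of a Device IO Block in bytes, under the assumed encoding.
/// fn io_block_size_bytes(io_block_size_128b_log2: u32) -> u64 {
///     128u64 << io_block_size_128b_log2
/// }
///
/// // Whether a region specified in units of 128 bytes both begins and ends on
/// // a Device IO Block boundary.
/// fn is_io_block_aligned(begin_128b: u64, len_128b: u64, io_block_size_128b_log2: u32) -> bool {
///     let mask = (1u64 << io_block_size_128b_log2) - 1;
///     ((begin_128b | len_128b) & mask) == 0
/// }
/// ```
///
/// Under that assumption, a value of `2` corresponds to 512 byte blocks, and a
/// region starting at unit `8` with a length of `16` units is aligned, whereas
/// one starting at unit `6` is not.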
///
/// Most of the API is specified in terms of Rust's `async` [`Future`] concept in
/// order to enable dependent [`NvFs`](super::fs::NvFs) implementations to
/// target a wide range of possible execution environments with different
/// characteristics.
///
/// In general, a storage backend `NvBlkDev` implementation is tightly coupled
/// to the target `async` execution environment by nature though. `NvBlkDev`
/// implementations may therefore assume a specific `async` executor
/// implementation to be deployed with. For example, if targeting some minimal
/// executor like [`Pollster`](https://docs.rs/pollster/latest/pollster/), it would be
/// absolutely legitimate to block the current thread's execution for IO.
///
/// The `NvBlkDev` methods don't in fact return [`Future`]s, but
/// [`NvBlkDevFuture`]s. The latter differ from the former only in that they
/// take an additional `dev` argument, thereby potentially avoiding the need of
/// creating and storing additional
/// [`SyncRcPtr`](crate::utils_async::sync_types::SyncRcPtr) clones for the
/// `NvBlkDev` instance.
///
/// # Coherence considerations
/// ## Intra-power-cycle coherence
///
/// By the very nature of the `async` execution model, there can be concurrent
/// reads and writes to overlapping regions on storage. In what follows, it is
/// assumed there's a total order on all points in time where some IO operation
/// is initiated or [polled](NvBlkDevFuture::poll) to completion.
/// The following coherence rules apply for any sequence of operations initiated
/// from the same power cycle, in order of their priority:
///
/// * *Superseding pending writes* - Polling a [write
/// barrier](Self::WriteBarrierFuture) to completion implicitly completes all
/// pending [write](Self::WriteFuture) or [trim](Self::TrimFuture)
/// operations initiated prior to it with unspecified result. Note that this
/// only affects the aforementioned total order within a power-cycle for the
/// rules that follow, no promises are being made regarding the state of the
/// physical backing storage. Polling further on a [write](Self::WriteFuture)
/// or [trim](Self::TrimFuture) future implicitly completed this way results
/// in implementation defined behavior -- that is, it's an `unreachable()`
/// condition.
/// * *Conflicting writes* - In the absence of any [write
/// barriers](Self::write_barrier), [initiating a write](Self::write) to a
/// region overlapping with an already pending
/// [write](Self::WriteFuture) or [trim](Self::TrimFuture) not yet
/// polled to completion results in implementation defined behavior. That is,
/// it's an `unreachable()` condition.
/// * *Read-write conflicts* - Further polling on a [read
/// future](Self::ReadFuture) after a [write](Self::write) or
/// [trim](Self::trim) to some region overlapping with it has been initiated
/// results in implementation defined behavior. That is, it's an
/// `unreachable()` condition.
/// * *Write-read conflicts* - [Initiating a read](Self::read) from a region
/// overlapping with a pending [write](Self::WriteFuture) or
/// [trim](Self::TrimFuture) not yet polled to completion results in
/// implementation defined behavior. That is, it's an `unreachable()`
/// condition.
/// * *Reading from failed writes or trimmed regions* - [Initiating a
/// read](Self::read) from a region overlapping with a prior
/// [write](Self::WriteFuture) that had been completed with error, or with a
/// [trim](Self::TrimFuture) completed with either status, neither of which
/// had been superseded by a subsequent successfully completed write since,
/// results in implementation defined behavior. That is, it's an
/// `unreachable()` condition.
/// * *Cache coherence* - Reading from a region overlapping with a previous
/// [write](Self::WriteFuture) from a future polled to *successful* completion
/// by the time the read operation started, and which had not been superseded
/// by a later write to or trim of that region overlap, must return the
/// updated data from the most recent write for the overlap.
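///
/// Callers are responsible for never running into the conflicts listed above.
/// One conceivable way to check for that during development is a small tracker
/// of pending writes and trims, sketched below. `Region` and `PendingIo` are
/// hypothetical names, regions are half-open ranges in units of Device IO
/// Blocks, and the sketch covers only the conflicting writes, write-read
/// conflicts and write barrier rules.
///
/// ```rust
/// // Half-open range `[begin, end)` in units of Device IO Blocks.
/// #[derive(Clone, Copy, Debug, PartialEq, Eq)]
/// struct Region {
///     begin: u64,
///     end: u64,
/// }
///
/// impl Region {
///     fn overlaps(&self, other: &Region) -> bool {
///         self.begin < other.end && other.begin < self.end
///     }
/// }
///
/// // Writes and trims initiated, but not yet polled to completion.
/// #[derive(Default)]
/// struct PendingIo {
///     pending: Vec<Region>,
/// }
///
/// impl PendingIo {
///     // Conflicting writes: refuse a write or trim overlapping a pending one
///     // and report the region it conflicts with.
///     fn try_begin_write(&mut self, r: Region) -> Result<(), Region> {
///         match self.pending.iter().find(|p| p.overlaps(&r)).copied() {
///             Some(conflict) => Err(conflict),
///             None => {
///                 self.pending.push(r);
///                 Ok(())
///             }
///         }
///     }
///
///     // Write-read conflicts: a read must not overlap a pending write or trim.
///     fn can_begin_read(&self, r: Region) -> bool {
///         !self.pending.iter().any(|p| p.overlaps(&r))
///     }
///
///     // The write or trim future for `r` has been polled to completion.
///     fn complete_write(&mut self, r: Region) {
///         if let Some(pos) = self.pending.iter().position(|p| *p == r) {
///             self.pending.swap_remove(pos);
///         }
///     }
///
///     // Superseding pending writes: a completed write barrier implicitly
///     // completes everything initiated before it.
///     fn write_barrier_completed(&mut self) {
///         self.pending.clear();
///     }
/// }
/// ```
///
/// Note that after a write barrier the superseded writes count as completed
/// with unspecified result, so the tracker only permits new writes to those
/// regions. It makes no statement about whether reading from them is valid.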
///
/// ## Inter-power-cycle coherence
///
/// Inter-power-cycle coherence concerns the order in which writes become
/// effective on physical storage, more specifically how reads after a power
/// cycling event relate to writes and trims issued at some point before it.
///
/// It is assumed that the minimum unit of IO, i.e. a ["Device IO
/// Block"](Self::io_block_size_128b_log2), has the following semantics:
/// * [Writes](Self::write) to or [trims](Self::trim) of one ["Device IO
/// Block"](Self::io_block_size_128b_log2) do not affect any other [Device
/// IO Blocks](Self::io_block_size_128b_log2).
/// * [Writes](Self::write) to a single [Device IO
/// Block](Self::io_block_size_128b_log2) are not necessarily atomic, but
/// -- assuming the absence of any power cycling events -- there is a point in
/// time when its physical state fully reflects the to be written state. It is
/// said that "a write becomes effective on physical storage" at that point in
/// time. Starting from when a write was initiated up to when it possibly
/// becomes effective on physical storage, the [Device IO
/// Block](Self::io_block_size_128b_log2) "is under write". In
/// particular, if a [Device IO Block](Self::io_block_size_128b_log2) is
/// under write at the time a power cycle event happens, it remains so until
/// eventually overwritten again (or trimmed) in a later power cycle.
/// * [Trim](Self::trim) requests are at some point getting transmitted to the
/// physical storage backend, from when on they're said to have "commenced".
/// * For a single given [Device IO Block](Self::io_block_size_128b_log2),
/// there is a total order on the writes and trims. That is, a given [Device IO
/// Block](Self::io_block_size_128b_log2) can be either under write, a
/// write to it may have become effective on physical storage or a trim may
/// have commenced.
/// * Reading from a [Device IO Block](Self::io_block_size_128b_log2) under
/// write returns arbitrary data.
/// * Reading from a [Device IO Block](Self::io_block_size_128b_log2) for
/// which a trim has commenced results in implementation defined behavior.
/// That is, it's an `unreachable()` condition.
/// * Power cycle events behave as if a virtual [write
/// barrier](Self::write_barrier) had been issued and polled to completion at
/// that point.
/// * A [write sync](Self::write_sync) operation has implicit write barrier
/// semantics. Furthermore, once the corresponding
/// [future](Self::WriteSyncFuture) has been polled to a successful
/// completion, it is guaranteed that any writes initiated prior to it have
/// become effective on physical storage.
/// * In the absence of any [write barrier](Self::write_barrier), writes to and
/// trims of *different* [Device IO Blocks](Self::io_block_size_128b_log2)
/// may become effective on physical storage or commence respectively in any
/// order.
/// - [Writes](Self::write) issued after a [write
/// barrier](Self::WriteBarrierFuture) has been polled to completion must
/// not become effective on physical storage before any
/// [writes](Self::WriteFuture) polled to completion before the [write
/// barrier request](Self::write_barrier) had been issued.
/// - [Trims](Self::trim) issued after a [write
/// barrier](Self::WriteBarrierFuture) has been polled to completion must
/// not commence before any [writes](Self::WriteFuture) polled to completion
/// before the [write barrier request](Self::write_barrier) became effective
/// on physical storage.
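///
/// The write barrier ordering rule can be restated as a check on what may be
/// found on storage after a power cut. The sketch below is a simplified model
/// with hypothetical names. Writes are identified by numeric IDs and grouped
/// into "epochs" separated by write barriers. It is assumed that every write
/// targets a distinct Device IO Block, that all writes of an epoch had been
/// polled to completion before the next barrier was issued, and that the next
/// epoch's writes were issued only after that barrier completed.
///
/// ```rust
/// use std::collections::HashSet;
///
/// // Returns whether the set of writes found effective on physical storage is
/// // consistent with the barrier ordering: no write of a later epoch may be
/// // effective while some write of an earlier epoch is not.
/// fn is_valid_crash_state(epochs: &[&[u32]], effective: &HashSet<u32>) -> bool {
///     let mut earlier_epoch_incomplete = false;
///     for epoch in epochs {
///         let n_effective = epoch.iter().filter(|id| effective.contains(*id)).count();
///         if earlier_epoch_incomplete && n_effective > 0 {
///             return false;
///         }
///         if n_effective < epoch.len() {
///             earlier_epoch_incomplete = true;
///         }
///     }
///     true
/// }
/// ```
///
/// With writes `1` and `2` before a barrier and write `3` after it, finding
/// only `1` effective is acceptable in this model, because writes within an
/// epoch may become effective in any order. Finding `1` and `3` but not `2` is
/// not.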
/// Trait defining the common interface to [`NvBlkDev`] write requests to be
/// submitted to [`write()`](NvBlkDev::write).
///
/// The `NvBlkDevWriteRequest` interface is intended to provide a means to
/// obtain all required information about the write destination location as well
/// as access to the source data buffers in a generic way. Note that the
/// [`NvBlkDevWriteRequest`] instance is always getting returned again one way
/// or the other out of [`write()`](NvBlkDev::write) or the associated
/// [`WriteFuture`](NvBlkDev::WriteFuture) respectively, enabling temporary
/// ownership transfers of any required resources, e.g. the source
/// buffers, for the duration of the write request.
///
/// The write request source data may be split across equally sized buffers,
/// so-called "chunks", whose layout is described alongside the physical write
/// destination location by means of the [`ChunkedIoRegion`] returned by
/// [`region()`](Self::region). The region is required to be
/// [aligned](ChunkedIoRegion::is_aligned) to the [Device IO
/// Block](NvBlkDev::io_block_size_128b_log2) size.
///
/// Access to the chunked source buffers is provided by making the
/// [`NvBlkDevWriteRequest`] instance indexable with
/// [`ChunkedIoRegionChunkRange`] "indices" emitted by the aforementioned
/// [`ChunkedIoRegion`]'s iterators.
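///
/// To illustrate the indexing scheme, here is a self-contained sketch that does
/// not use the actual [`ChunkedIoRegion`] API. `ChunkRange`, `WriteRequest`,
/// `io_block_ranges()` and `gather()` are hypothetical stand-ins, and the chunk
/// size is assumed to be a multiple of the Device IO Block size.
///
/// ```rust
/// use std::ops::{Index, Range};
///
/// // Hypothetical stand-in for a `ChunkedIoRegionChunkRange` "index".
/// #[derive(Clone, Debug, PartialEq, Eq)]
/// struct ChunkRange {
///     chunk: usize,
///     range: Range<usize>,
/// }
///
/// // Toy write request owning equally sized source buffers, the "chunks".
/// struct WriteRequest {
///     chunks: Vec<Vec<u8>>,
/// }
///
/// impl Index<ChunkRange> for WriteRequest {
///     type Output = [u8];
///     fn index(&self, idx: ChunkRange) -> &[u8] {
///         &self.chunks[idx.chunk][idx.range]
///     }
/// }
///
/// // Enumerate Device IO Block sized ranges over all chunks, in order.
/// fn io_block_ranges(n_chunks: usize, chunk_size: usize, io_block_size: usize) -> Vec<ChunkRange> {
///     assert!(io_block_size != 0 && chunk_size % io_block_size == 0);
///     let mut ranges = Vec::new();
///     for chunk in 0..n_chunks {
///         let mut begin = 0;
///         while begin < chunk_size {
///             ranges.push(ChunkRange { chunk, range: begin..begin + io_block_size });
///             begin += io_block_size;
///         }
///     }
///     ranges
/// }
///
/// // Collect the source data in the order a device would consume it.
/// fn gather(req: &WriteRequest, ranges: &[ChunkRange]) -> Vec<u8> {
///     let mut out = Vec::new();
///     for r in ranges {
///         out.extend_from_slice(&req[r.clone()]);
///     }
///     out
/// }
/// ```
///
/// A device implementation written against such an interface never needs to
/// know how the caller allocated its buffers. It only walks the emitted ranges
/// and indexes the request with each of them.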
/// Trait defining the common interface to [`NvBlkDev`] read requests to be
/// submitted to [`read()`](NvBlkDev::read).
///
/// The `NvBlkDevReadRequest` interface is intended to provide a means to obtain
/// all required information about the read source location as well as access to
/// the destination data buffers in a generic way. Note that the
/// [`NvBlkDevReadRequest`] instance is always getting returned again one way or
/// the other out of [`read()`](NvBlkDev::read) or the associated
/// [`ReadFuture`](NvBlkDev::ReadFuture) respectively, enabling temporary
/// ownership transfers of any required resources, e.g. the destination
/// buffers, for the duration of the read request.
///
/// The read request destination memory may be split across equally sized
/// buffers, so-called "chunks", whose layout is described alongside the
/// physical read source location by means of the [`ChunkedIoRegion`] returned
/// by [`region()`](Self::region). The region is required to be
/// [aligned](ChunkedIoRegion::is_aligned) to the [Device IO
/// Block](NvBlkDev::io_block_size_128b_log2) size.
///
/// Access to the chunked destination buffers is provided by making the
/// [`NvBlkDevReadRequest`] instance indexable with
/// [`ChunkedIoRegionChunkRange`] "indices" emitted by the aforementioned
/// [`ChunkedIoRegion`]'s iterators.
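///
/// The temporary ownership transfer can be pictured with a synchronous toy
/// model, sketched below under stated assumptions. `ChunkRange`, `ReadRequest`
/// and `read_from()` are hypothetical names and do not reflect the actual
/// signatures. The request is moved into the read operation, its destination
/// buffers are filled through mutable indexing, and the request is handed back
/// to the caller on success as well as on failure.
///
/// ```rust
/// use std::ops::{Index, IndexMut, Range};
///
/// // Hypothetical stand-in for a `ChunkedIoRegionChunkRange` "index".
/// #[derive(Clone, Debug)]
/// struct ChunkRange {
///     chunk: usize,
///     range: Range<usize>,
/// }
///
/// // Toy read request owning the destination buffers and their layout.
/// #[derive(Debug)]
/// struct ReadRequest {
///     chunks: Vec<Vec<u8>>,
///     ranges: Vec<ChunkRange>,
/// }
///
/// impl Index<ChunkRange> for ReadRequest {
///     type Output = [u8];
///     fn index(&self, idx: ChunkRange) -> &[u8] {
///         &self.chunks[idx.chunk][idx.range]
///     }
/// }
///
/// impl IndexMut<ChunkRange> for ReadRequest {
///     fn index_mut(&mut self, idx: ChunkRange) -> &mut [u8] {
///         &mut self.chunks[idx.chunk][idx.range]
///     }
/// }
///
/// // Fill the request's buffers from `storage`, starting at byte `offset`.
/// // The request is returned in either case, so the caller gets its buffers
/// // back even if the read failed.
/// fn read_from(
///     storage: &[u8],
///     mut offset: usize,
///     mut req: ReadRequest,
/// ) -> Result<ReadRequest, (ReadRequest, &'static str)> {
///     for r in req.ranges.clone() {
///         let len = r.range.len();
///         if len > storage.len() || offset > storage.len() - len {
///             return Err((req, "read beyond end of storage"));
///         }
///         req[r].copy_from_slice(&storage[offset..offset + len]);
///         offset += len;
///     }
///     Ok(req)
/// }
/// ```
///
/// In this model a failed read may leave some destination buffers partially
/// updated. The caller regains ownership of them regardless and can reuse them
/// for a retry.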