iroh_blobs/protocol.rs
//! Protocol for transferring content-addressed blobs over [`iroh`] p2p QUIC connections.
//!
//! # Participants
//!
//! The protocol is a request/response protocol with two parties, a *provider* that
//! serves blobs and a *getter* that requests blobs.
//!
//! # Goals
//!
//! - Be paranoid about data integrity.
//!
//! Data integrity is considered more important than performance. Data will be validated both on
//! the provider and getter side. A well-behaved provider will never send invalid data. Responses
//! to range requests contain sufficient information to validate the data.
//!
//! Note: Validation using blake3 is extremely fast, so in almost all scenarios the validation
//! will not be the bottleneck even if we validate both on the provider and getter side.
//!
//! - Do not limit the size of blobs or collections.
//!
//! Blobs can be of arbitrary size, up to terabytes. Likewise, collections can contain an
//! arbitrary number of links. A well-behaved implementation will not require the entire blob or
//! collection to be in memory at once.
//!
//! - Be efficient when transferring large blobs, including range requests.
//!
//! It is possible to request entire blobs or ranges of blobs, where the minimum granularity is a
//! chunk group of 16 KiB or 16 blake3 chunks. The worst case overhead when doing range requests
//! is about two chunk groups per range.
//!
//! - Be efficient when transferring multiple tiny blobs.
//!
//! For tiny blobs the overhead of sending the blob hashes and the round-trip time for each blob
//! would be prohibitive.
//!
//! To avoid roundtrips, the protocol allows grouping multiple blobs into *collections*.
//! The semantic meaning of a collection is up to the application. For the purpose
//! of this protocol, a collection is just a grouping of related blobs.
//!
//! # Non-goals
//!
//! - Do not attempt to be generic in terms of the used hash function.
//!
//! The protocol makes extensive use of the [blake3](https://crates.io/crates/blake3) hash
//! function and its special properties, such as blake3 verified streaming.
//!
//! - Do not support graph traversal.
//!
//! The protocol only supports collections that directly contain blobs. If you have deeply nested
//! graph data, you will need to either do multiple requests or flatten the graph into a single
//! temporary collection.
//!
//! - Do not support discovery.
//!
//! The protocol does not yet have a discovery mechanism for asking the provider what ranges are
//! available for a given blob. Currently you have to have some out-of-band knowledge about which
//! node has data for a given hash, or you can just try to retrieve the data and see if it is
//! available.
//!
//! A discovery protocol is planned for the future.
//!
//! # Requests
//!
//! ## Getter defined requests
//!
//! In this case the getter knows the hash of the blob it wants to retrieve and
//! whether it wants to retrieve a single blob or a collection.
//!
//! The getter needs to define exactly what it wants to retrieve and send the
//! request to the provider.
//!
//! The provider will then respond with the bao encoded bytes for the requested
//! data and then close the connection. It will immediately close the connection
//! in case some data is not available or invalid.
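//!
//! On the wire, a request is a single postcard-encoded [`Request`] message written to a
//! bidirectional stream. A rough sketch of the getter side, assuming an already established
//! [`iroh`] connection named `conn` and a `request: GetRequest` (the names are illustrative,
//! error handling is elided):
//!
//! ```rust,ignore
//! let (mut send, mut recv) = conn.open_bi().await?;
//! send.write_all(&postcard::to_stdvec(&Request::from(request))?).await?;
//! send.finish()?;
//! // `recv` now yields the bao encoded response for the requested ranges.
//! ```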
//!
//! ## Provider defined requests
//!
//! In this case the getter sends a blob to the provider. This blob can contain
//! some kind of query. The exact details of the query are up to the application.
//!
//! The provider evaluates the query and responds with a serialized request in
//! the same format as the getter defined requests, followed by the bao encoded
//! data. From then on the protocol is the same as for getter defined requests.
//!
//! ## Specifying the required data
//!
//! A [`GetRequest`] contains a hash and a specification of what data related to
//! that hash is required. The specification uses a [`ChunkRangesSeq`], which
//! has a compact representation on the wire but is otherwise identical to a
//! sequence of sets of ranges.
//!
//! In the following, we describe how the [`GetRequest`] is to be created for
//! different common scenarios.
//!
//! Under the hood, this uses the [`ChunkRangesSeq`] type, but the most
//! convenient way to create a [`GetRequest`] is to use the builder API.
//!
//! Ranges are always given in terms of 1024-byte blake3 chunks, *not* in terms
//! of bytes or chunk groups. The reason for this is that chunks are the fundamental
//! unit of hashing in BLAKE3. Addressing anything smaller than a chunk is not
//! possible, and combining multiple chunks is merely an optimization to reduce
//! metadata overhead.
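//!
//! For example, byte 2500 of a blob lives in chunk 2 (a plain arithmetic sketch
//! for illustration, not an API of this crate):
//!
//! ```rust
//! let byte_offset = 2500u64;
//! let chunk = byte_offset / 1024; // blake3 chunks are 1024 bytes
//! assert_eq!(chunk, 2);
//! ```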
//!
//! ### Individual blobs
//!
//! In the easiest case, the getter just wants to retrieve a single blob. In this
//! case, the getter specifies a [`ChunkRangesSeq`] that contains a single element.
//! This element is the set of all chunks, indicating that we want the entire
//! blob, no matter how many chunks it has.
//!
//! Since this is a very common case, there is a convenience method
//! [`GetRequest::blob`] that only requires the hash of the blob.
//!
//! ```rust
//! # use iroh_blobs::protocol::GetRequest;
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::blob(hash);
//! ```
//!
//! ### Ranges of blobs
//!
//! In this case, we have a (possibly large) blob and we want to retrieve only
//! some ranges of chunks. This is useful in cases similar to HTTP range requests.
//!
//! We still need just a single element in the [`ChunkRangesSeq`], since we are
//! still only interested in a single blob. However, this element contains all
//! the chunk ranges we want to retrieve.
//!
//! For example, if we want to retrieve the first 10 chunks of a blob, we would
//! create a [`ChunkRangesSeq`] like this:
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .root(ChunkRanges::chunks(..10))
//!     .build(hash);
//! ```
//!
//! While not that common, it is also possible to request multiple ranges of a
//! single blob. For example, if we want to retrieve chunks `0..10` and `100..110`
//! of a large file, we would create a [`GetRequest`] like this:
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt, ChunkRangesSeq};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .root(ChunkRanges::chunks(..10) | ChunkRanges::chunks(100..110))
//!     .build(hash);
//! ```
//!
//! This is all great, but in most cases we are not interested in chunks but
//! in bytes. The [`ChunkRanges`] type has a constructor that allows providing
//! byte ranges instead of chunk ranges. These will be rounded up to the
//! nearest chunk.
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt, ChunkRangesSeq};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .root(ChunkRanges::bytes(..1000) | ChunkRanges::bytes(10000..11000))
//!     .build(hash);
//! ```
//!
//! There are also methods to request a single chunk or a single byte offset,
//! as well as a special constructor for the last chunk of a blob.
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt, ChunkRangesSeq};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .root(ChunkRanges::offset(1) | ChunkRanges::last_chunk())
//!     .build(hash);
//! ```
//!
//! To specify chunk ranges, we use the [`ChunkRanges`] type alias.
//! This is actually the [`RangeSet`] type from the
//! [range_collections](https://crates.io/crates/range_collections) crate. This
//! type supports efficient boolean operations on sets of non-overlapping ranges.
//!
//! The [`RangeSet2`] type is a type alias for [`RangeSet`] that can store up to
//! 2 boundaries without allocating. This is sufficient for most use cases.
//!
//! [`RangeSet`]: range_collections::range_set::RangeSet
//! [`RangeSet2`]: range_collections::range_set::RangeSet2
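//!
//! For example, unioning two overlapping chunk ranges yields a single normalized
//! range set (a small sketch using only the operations shown above):
//!
//! ```rust
//! # use iroh_blobs::protocol::{ChunkRanges, ChunkRangesExt};
//! let ranges = ChunkRanges::chunks(0..20) | ChunkRanges::chunks(10..30);
//! assert_eq!(ranges, ChunkRanges::chunks(0..30));
//! ```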
//!
//! ### Hash sequences
//!
//! In this case the provider has a hash sequence that refers to multiple blobs.
//! We want to retrieve all blobs in the hash sequence.
//!
//! When used for hash sequences, the first element of a [`ChunkRangesSeq`] refers
//! to the hash seq itself, and all subsequent elements refer to the blobs
//! in the hash seq. When a [`ChunkRangesSeq`] specifies ranges for more than
//! one blob, the provider will interpret this as a request for a hash seq.
//!
//! One thing to note is that we might not yet know how many blobs are in the
//! hash sequence. Therefore, it is not possible to download an entire hash seq
//! by just specifying [`ChunkRanges::all()`] for all children.
//!
//! Instead, [`ChunkRangesSeq`] allows defining infinite sequences of range sets.
//! The [`ChunkRangesSeq::all()`] method returns a [`ChunkRangesSeq`] that, when iterated
//! over, will yield [`ChunkRanges::all()`] forever.
//!
//! So a get request to download a hash sequence blob and all its children
//! would look like this:
//!
//! ```rust
//! # use iroh_blobs::protocol::{ChunkRanges, ChunkRangesExt, GetRequest};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .root(ChunkRanges::all())
//!     .build_open(hash); // repeats the last range forever
//! ```
//!
//! Downloading an entire hash seq is also a very common case, so there is a
//! convenience method [`GetRequest::all`] that only requires the hash of the
//! hash sequence blob.
//!
//! ```rust
//! # use iroh_blobs::protocol::{ChunkRanges, ChunkRangesExt, GetRequest};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::all(hash);
//! ```
//!
//! ### Parts of hash sequences
//!
//! The most complex common case is when we have retrieved a hash seq and
//! its children, but were interrupted before we could retrieve all children.
//!
//! In this case we need to specify the hash seq we want to retrieve, but
//! exclude the children and parts of children that we already have.
//!
//! For example, suppose we have a hash seq with 3 children, and we already have
//! the first child and the first 1000000 chunks of the second child.
//!
//! We would create a [`GetRequest`] like this:
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .child(1, ChunkRanges::chunks(1000000..)) // the rest of the second child; we already have the first child
//!     .next(ChunkRanges::all()) // the third and all subsequent children completely
//!     .build_open(hash);
//! ```
//!
//! ### Requesting chunks for each child
//!
//! The [`ChunkRangesSeq`] allows some scenarios that are not covered above. E.g. you
//! might want to request a hash seq and the first chunk of each child blob to
//! do something like mime type detection.
//!
//! You do not know how many children the collection has, so you need to use
//! an infinite sequence.
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt, ChunkRangesSeq};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .root(ChunkRanges::all())
//!     .next(ChunkRanges::chunk(0)) // the first chunk of each child
//!     .build_open(hash);
//! ```
//!
//! ### Requesting a single child
//!
//! It is of course possible to request a single child of a collection. E.g.
//! the following would download the second child of a collection:
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .child(1, ChunkRanges::all()) // we need the second child completely
//!     .build(hash);
//! ```
//!
//! However, if you already have the collection, you might as well locally
//! look up the hash of the child and request it directly.
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesSeq};
//! # let child_hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::blob(child_hash);
//! ```
//!
//! ### Why ChunkRanges and ChunkRangesSeq?
//!
//! You might wonder why we have [`ChunkRangesSeq`], when a simple
//! sequence of [`ChunkRanges`] might also do.
//!
//! The [`ChunkRangesSeq`] type exists to provide an efficient
//! representation of the request on the wire. In the wire encoding of [`ChunkRangesSeq`],
//! [`ChunkRanges`] are encoded as alternating intervals of selected and non-selected chunks.
//! This results in smaller numbers, which take fewer bytes on the wire when using
//! the [postcard](https://crates.io/crates/postcard) encoding format with its
//! variable-length integers.
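//!
//! As an illustration only (this is not the exact wire format), a chunk selection
//! like `{0..4, 6..10}` can be described by its boundaries and then delta-encoded,
//! so the numbers stay small even for chunks deep inside a large blob:
//!
//! ```rust
//! let boundaries = [0u64, 4, 6, 10]; // alternating selected/unselected boundaries
//! let deltas: Vec<u64> = boundaries
//!     .iter()
//!     .scan(0, |prev, &b| {
//!         let delta = b - *prev;
//!         *prev = b;
//!         Some(delta)
//!     })
//!     .collect();
//! assert_eq!(deltas, [0, 4, 2, 4]);
//! ```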
//!
//! Likewise, the [`ChunkRangesSeq`] type uses run-length encoding to remove
//! repeating elements. It also allows infinite sequences of [`ChunkRanges`] to
//! be encoded, unlike a simple sequence of [`ChunkRanges`].
//!
//! [`ChunkRangesSeq`] should be efficient even in case of very fragmented availability
//! of chunks, like a download from multiple providers that was frequently interrupted.
//!
//! # Responses
//!
//! The response stream contains the bao encoded bytes for the requested data.
//! The data will be sent in the order in which it was requested, so ascending
//! chunks for each blob, and blobs in the order in which they appear in the
//! hash seq.
//!
//! For details on the bao encoding, see the [bao specification](https://github.com/oconnor663/bao/blob/master/docs/spec.md)
//! and the [bao-tree](https://crates.io/crates/bao-tree) crate. The bao-tree crate
//! is identical to the bao crate, except that it allows combining multiple BLAKE3
//! chunks into chunk groups for efficiency.
//!
//! For a complete response, the chunks are guaranteed to completely cover the
//! requested ranges.
//!
//! There are two reasons for not retrieving a complete response:
//!
//! - The connection to the provider was interrupted, or the provider encountered
//!   an internal error. In this case the provider will close the entire quinn connection.
//! - The provider does not have the requested data, or discovered while sending that the
//!   requested data is not valid. In this case the provider will close just the stream
//!   used to send the response. The exact location of the missing data can be retrieved
//!   from the error.
//!
//! # Requesting multiple unrelated blobs
//!
//! Let's say you don't have a hash sequence on the provider side, but you
//! nevertheless want to request multiple unrelated blobs in a single request.
//!
//! For this, there is the [`GetManyRequest`] type, which also comes with a
//! builder API.
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetManyRequest, ChunkRanges, ChunkRangesExt};
//! # let hash1: iroh_blobs::Hash = [0; 32].into();
//! # let hash2: iroh_blobs::Hash = [1; 32].into();
//! GetManyRequest::builder()
//!     .hash(hash1, ChunkRanges::all())
//!     .hash(hash2, ChunkRanges::all())
//!     .build();
//! ```
//!
//! If you accidentally or intentionally request ranges for the same hash
//! multiple times, they will be merged into a single [`ChunkRanges`].
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetManyRequest, ChunkRanges, ChunkRangesExt};
//! # let hash1: iroh_blobs::Hash = [0; 32].into();
//! # let hash2: iroh_blobs::Hash = [1; 32].into();
//! GetManyRequest::builder()
//!     .hash(hash1, ChunkRanges::chunk(1))
//!     .hash(hash2, ChunkRanges::all())
//!     .hash(hash1, ChunkRanges::last_chunk())
//!     .build();
//! ```
//!
//! This is mostly useful for requesting multiple tiny blobs in a single request.
//! For large or even medium-sized blobs, multiple requests are not expensive.
//! Multiple requests just create multiple streams on the same connection,
//! which is *very* cheap in QUIC.
//!
//! In case nodes are permanently exchanging data, it is somewhat valuable to
//! keep a connection open and reuse it for multiple requests. However, creating
//! a new connection is also very cheap, so you would only do this to optimize
//! a large existing system that has demonstrated performance issues.
//!
//! If in doubt, just use multiple requests and multiple connections.
use std::io;

use builder::GetRequestBuilder;
use derive_more::From;
use iroh::endpoint::VarInt;
use irpc::util::AsyncReadVarintExt;
use postcard::experimental::max_size::MaxSize;
use serde::{Deserialize, Serialize};
mod range_spec;
pub use bao_tree::ChunkRanges;
pub use range_spec::{ChunkRangesSeq, NonEmptyRequestRangeSpecIter, RangeSpec};
use snafu::{GenerateImplicitData, Snafu};
use tokio::io::AsyncReadExt;

pub use crate::util::ChunkRangesExt;
use crate::{api::blobs::Bitfield, provider::CountingReader, BlobFormat, Hash, HashAndFormat};

/// Maximum message size is limited to 1 MiB for now.
pub const MAX_MESSAGE_SIZE: usize = 1024 * 1024;

/// The ALPN used with quic for the iroh blobs protocol.
pub const ALPN: &[u8] = b"/iroh-bytes/4";

#[derive(Deserialize, Serialize, Debug, PartialEq, Eq, Clone, From)]
/// A request to the provider
pub enum Request {
    /// A get request for a blob or collection
    Get(GetRequest),
    Observe(ObserveRequest),
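    // Slot2..=Slot7 appear to be placeholders that reserve wire tags for future
    // request types, keeping the tags of the later variants stable.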
    Slot2,
    Slot3,
    Slot4,
    Slot5,
    Slot6,
    Slot7,
    /// The inverse of a get request - push data to the provider
    ///
    /// Note that providers will in many cases reject this request, e.g. if
    /// they don't have write access to the store or don't want to ingest
    /// unknown data.
    Push(PushRequest),
    /// Get multiple blobs in a single request, from a single provider
    ///
    /// This is identical to a [`GetRequest`] for a [`crate::hashseq::HashSeq`], but the provider
    /// does not need to have the hash seq.
    GetMany(GetManyRequest),
}

/// This must contain the request types in the same order as the full requests
#[derive(Deserialize, Serialize, Debug, PartialEq, Eq, Clone, Copy, MaxSize)]
pub enum RequestType {
    Get,
    Observe,
    Slot2,
    Slot3,
    Slot4,
    Slot5,
    Slot6,
    Slot7,
    Push,
    GetMany,
}

impl Request {
    pub async fn read_async(
        reader: &mut CountingReader<&mut iroh::endpoint::RecvStream>,
    ) -> io::Result<Self> {
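        // The first byte on the stream is the postcard-encoded request type,
        // followed by the body of that request.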
        let request_type = reader.read_u8().await?;
        let request_type: RequestType = postcard::from_bytes(std::slice::from_ref(&request_type))
            .map_err(|_| {
                io::Error::new(
                    io::ErrorKind::InvalidData,
                    "failed to deserialize request type",
                )
            })?;
        Ok(match request_type {
            RequestType::Get => reader
                .read_to_end_as::<GetRequest>(MAX_MESSAGE_SIZE)
                .await?
                .into(),
            RequestType::GetMany => reader
                .read_to_end_as::<GetManyRequest>(MAX_MESSAGE_SIZE)
                .await?
                .into(),
            RequestType::Observe => reader
                .read_to_end_as::<ObserveRequest>(MAX_MESSAGE_SIZE)
                .await?
                .into(),
            RequestType::Push => reader
                .read_length_prefixed::<PushRequest>(MAX_MESSAGE_SIZE)
                .await?
                .into(),
            _ => {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidData,
                    "failed to deserialize request type",
                ));
            }
        })
    }
}

/// A get request
#[derive(Deserialize, Serialize, Debug, PartialEq, Eq, Clone, Hash)]
pub struct GetRequest {
    /// blake3 hash
    pub hash: Hash,
    /// The range of data to request
    ///
    /// The first element is the parent, all subsequent elements are children.
    pub ranges: ChunkRangesSeq,
}

impl From<HashAndFormat> for GetRequest {
    fn from(value: HashAndFormat) -> Self {
        match value.format {
            BlobFormat::Raw => Self::blob(value.hash),
            BlobFormat::HashSeq => Self::all(value.hash),
        }
    }
}

impl GetRequest {
    pub fn builder() -> GetRequestBuilder {
        GetRequestBuilder::default()
    }

    pub fn content(&self) -> HashAndFormat {
        HashAndFormat {
            hash: self.hash,
            format: if self.ranges.is_blob() {
                BlobFormat::Raw
            } else {
                BlobFormat::HashSeq
            },
        }
    }

    /// Request a blob or collection with specified ranges
    pub fn new(hash: Hash, ranges: ChunkRangesSeq) -> Self {
        Self { hash, ranges }
    }

    /// Request a collection and all its children
    pub fn all(hash: impl Into<Hash>) -> Self {
        Self {
            hash: hash.into(),
            ranges: ChunkRangesSeq::all(),
        }
    }

    /// Request just a single blob
    pub fn blob(hash: impl Into<Hash>) -> Self {
        Self {
            hash: hash.into(),
            ranges: ChunkRangesSeq::from_ranges([ChunkRanges::all()]),
        }
    }

    /// Request ranges from a single blob
    pub fn blob_ranges(hash: Hash, ranges: ChunkRanges) -> Self {
        Self {
            hash,
            ranges: ChunkRangesSeq::from_ranges([ranges]),
        }
    }
}

/// A push request contains a description of what to push, and is followed
/// by the data to push.
#[derive(
    Deserialize, Serialize, Debug, PartialEq, Eq, Clone, derive_more::From, derive_more::Deref,
)]
pub struct PushRequest(GetRequest);

impl PushRequest {
    pub fn new(hash: Hash, ranges: ChunkRangesSeq) -> Self {
        Self(GetRequest::new(hash, ranges))
    }
}

/// A GetMany request is a request to get multiple blobs via a single request.
///
/// It is identical to a [`GetRequest`] for a HashSeq, but the HashSeq is provided
/// by the requester.
#[derive(Deserialize, Serialize, Debug, PartialEq, Eq, Clone)]
pub struct GetManyRequest {
    /// The hashes of the blobs to get
    pub hashes: Vec<Hash>,
    /// The ranges of data to request
    ///
    /// There is no range request for the parent, since we just sent the hashes
    /// and therefore have the parent already.
    pub ranges: ChunkRangesSeq,
}

impl<I: Into<Hash>> FromIterator<I> for GetManyRequest {
    fn from_iter<T: IntoIterator<Item = I>>(iter: T) -> Self {
        let mut res = iter.into_iter().map(Into::into).collect::<Vec<Hash>>();
        res.sort();
        res.dedup();
        let n = res.len() as u64;
        Self {
            hashes: res,
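            // all chunks for each of the `n` hashes, and nothing beyond them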
            ranges: ChunkRangesSeq(smallvec::smallvec![
                (0, ChunkRanges::all()),
                (n, ChunkRanges::empty())
            ]),
        }
    }
}

impl GetManyRequest {
    pub fn new(hashes: Vec<Hash>, ranges: ChunkRangesSeq) -> Self {
        Self { hashes, ranges }
    }

    pub fn builder() -> builder::GetManyRequestBuilder {
        builder::GetManyRequestBuilder::default()
    }
}

/// A request to observe a raw blob bitfield.
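///
/// A minimal usage sketch, observing the full bitfield for a hash:
///
/// ```rust
/// # use iroh_blobs::protocol::ObserveRequest;
/// # let hash: iroh_blobs::Hash = [0; 32].into();
/// let request = ObserveRequest::new(hash);
/// ```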
#[derive(Deserialize, Serialize, Debug, PartialEq, Eq, Clone, Hash)]
pub struct ObserveRequest {
    /// blake3 hash
    pub hash: Hash,
    /// ranges to observe.
    pub ranges: RangeSpec,
}

impl ObserveRequest {
    pub fn new(hash: Hash) -> Self {
        Self {
            hash,
            ranges: RangeSpec::all(),
        }
    }
}

#[derive(Deserialize, Serialize, Debug, PartialEq, Eq)]
pub struct ObserveItem {
    pub size: u64,
    pub ranges: ChunkRanges,
}

impl From<&Bitfield> for ObserveItem {
    fn from(value: &Bitfield) -> Self {
        Self {
            size: value.size,
            ranges: value.ranges.clone(),
        }
    }
}

impl From<&ObserveItem> for Bitfield {
    fn from(value: &ObserveItem) -> Self {
        Self {
            size: value.size,
            ranges: value.ranges.clone(),
        }
    }
}

/// Reasons to close connections or stop streams.
///
/// A QUIC **connection** can be *closed* and a **stream** can request the other side to
/// *stop* sending data. Both closing and stopping have an associated `error_code`; closing
/// also adds a `reason` as some arbitrary bytes.
///
/// This enum exists so we have a single namespace for `error_code`s used.
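///
/// A hedged usage sketch, assuming an `iroh::endpoint::Connection` named `conn`:
///
/// ```ignore
/// conn.close(Closed::RequestReceived.into(), Closed::RequestReceived.reason());
/// ```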
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
#[repr(u16)]
pub enum Closed {
    /// The [`RecvStream`] was dropped.
    ///
    /// Used implicitly when a [`RecvStream`] is dropped without explicit call to
    /// [`RecvStream::stop`]. We don't use this explicitly but this is here as
    /// documentation as to what happened to `0`.
    ///
    /// [`RecvStream`]: iroh::endpoint::RecvStream
    /// [`RecvStream::stop`]: iroh::endpoint::RecvStream::stop
    StreamDropped = 0,
    /// The provider is terminating.
    ///
    /// When a provider terminates, all connections and associated streams are closed.
    ProviderTerminating = 1,
    /// The provider has received the request.
    ///
    /// Only a single request is allowed on a stream; if more data is received after this,
    /// a provider may send this error code in a STOP_SENDING frame.
    RequestReceived = 2,
}

impl Closed {
    /// The close reason as bytes. This is a valid UTF-8 string describing the reason.
    pub fn reason(&self) -> &'static [u8] {
        match self {
            Closed::StreamDropped => b"stream dropped",
            Closed::ProviderTerminating => b"provider terminating",
            Closed::RequestReceived => b"request received",
        }
    }
}

impl From<Closed> for VarInt {
    fn from(source: Closed) -> Self {
        VarInt::from(source as u16)
    }
}

/// Unknown error_code, cannot be converted into [`Closed`].
#[derive(Debug, Snafu)]
#[snafu(display("Unknown error_code: {code}"))]
pub struct UnknownErrorCode {
    code: u64,
    backtrace: Option<snafu::Backtrace>,
}

impl UnknownErrorCode {
    pub(crate) fn new(code: u64) -> Self {
        Self {
            code,
            backtrace: GenerateImplicitData::generate(),
        }
    }
}

impl TryFrom<VarInt> for Closed {
    type Error = UnknownErrorCode;

    fn try_from(value: VarInt) -> std::result::Result<Self, Self::Error> {
        match value.into_inner() {
            0 => Ok(Self::StreamDropped),
            1 => Ok(Self::ProviderTerminating),
            2 => Ok(Self::RequestReceived),
            val => Err(UnknownErrorCode::new(val)),
        }
    }
}

pub mod builder {
    use std::collections::BTreeMap;

    use bao_tree::ChunkRanges;

    use super::ChunkRangesSeq;
    use crate::{
        protocol::{GetManyRequest, GetRequest},
        Hash,
    };

    #[derive(Debug, Clone, Default)]
    pub struct ChunkRangesSeqBuilder {
        ranges: BTreeMap<u64, ChunkRanges>,
    }

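    /// Builder for a [`GetRequest`].
    ///
    /// A minimal usage sketch:
    ///
    /// ```rust
    /// # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt};
    /// # let hash: iroh_blobs::Hash = [0; 32].into();
    /// let request = GetRequest::builder()
    ///     .root(ChunkRanges::all())
    ///     .child(0, ChunkRanges::chunk(0))
    ///     .build(hash);
    /// ```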
    #[derive(Debug, Clone, Default)]
    pub struct GetRequestBuilder {
        builder: ChunkRangesSeqBuilder,
    }

    impl GetRequestBuilder {
        /// Add a range to the request.
        pub fn offset(mut self, offset: u64, ranges: impl Into<ChunkRanges>) -> Self {
            self.builder = self.builder.offset(offset, ranges);
            self
        }

        /// Specify ranges for the child at the given index (0 is the first child).
        pub fn child(mut self, child: u64, ranges: impl Into<ChunkRanges>) -> Self {
            self.builder = self.builder.offset(child + 1, ranges);
            self
        }

        /// Specify ranges for the root blob (the HashSeq)
        pub fn root(mut self, ranges: impl Into<ChunkRanges>) -> Self {
            self.builder = self.builder.offset(0, ranges);
            self
        }

        /// Specify ranges for the next offset.
        pub fn next(mut self, ranges: impl Into<ChunkRanges>) -> Self {
            self.builder = self.builder.next(ranges);
            self
        }

        /// Build a get request for the given hash, with the ranges specified in the builder.
        pub fn build(self, hash: impl Into<Hash>) -> GetRequest {
            let ranges = self.builder.build();
            GetRequest::new(hash.into(), ranges)
        }

        /// Build a get request for the given hash, with the ranges specified in the builder
        /// and the last non-empty range repeating indefinitely.
        pub fn build_open(self, hash: impl Into<Hash>) -> GetRequest {
            let ranges = self.builder.build_open();
            GetRequest::new(hash.into(), ranges)
        }
    }

    impl ChunkRangesSeqBuilder {
        /// Add a range to the request.
        pub fn offset(self, offset: u64, ranges: impl Into<ChunkRanges>) -> Self {
            self.at_offset(offset, ranges.into())
        }

        /// Specify ranges for the next offset.
        pub fn next(self, ranges: impl Into<ChunkRanges>) -> Self {
            let offset = self.next_offset_value();
            self.at_offset(offset, ranges.into())
        }

        /// Build a [`ChunkRangesSeq`] with the ranges specified in the builder.
        pub fn build(self) -> ChunkRangesSeq {
            ChunkRangesSeq::from_ranges(self.build0())
        }

        /// Build a [`ChunkRangesSeq`] with the ranges specified in the builder
        /// and the last non-empty range repeating indefinitely.
        pub fn build_open(self) -> ChunkRangesSeq {
            ChunkRangesSeq::from_ranges_infinite(self.build0())
        }

        /// Add ranges at the given offset.
        fn at_offset(mut self, offset: u64, ranges: ChunkRanges) -> Self {
            self.ranges
                .entry(offset)
                .and_modify(|v| *v |= ranges.clone())
                .or_insert(ranges);
            self
        }

        /// Build the request.
        fn build0(mut self) -> impl Iterator<Item = ChunkRanges> {
            let mut ranges = Vec::new();
            self.ranges.retain(|_, v| !v.is_empty());
            let until_key = self.next_offset_value();
            for offset in 0..until_key {
                ranges.push(self.ranges.remove(&offset).unwrap_or_default());
            }
            ranges.into_iter()
        }

        /// Get the next offset value.
        fn next_offset_value(&self) -> u64 {
            self.ranges
                .last_key_value()
                .map(|(k, _)| *k + 1)
                .unwrap_or_default()
        }
    }

    #[derive(Debug, Clone, Default)]
    pub struct GetManyRequestBuilder {
        ranges: BTreeMap<Hash, ChunkRanges>,
    }

    impl GetManyRequestBuilder {
        /// Specify ranges for the given hash.
        ///
        /// Note that if you specify a hash that is already in the request, the ranges will be
        /// merged with the existing ranges.
        pub fn hash(mut self, hash: impl Into<Hash>, ranges: impl Into<ChunkRanges>) -> Self {
            let ranges = ranges.into();
            let hash = hash.into();
            self.ranges
                .entry(hash)
                .and_modify(|v| *v |= ranges.clone())
                .or_insert(ranges);
            self
        }

        /// Build a `GetManyRequest`.
        pub fn build(self) -> GetManyRequest {
            let (hashes, ranges): (Vec<Hash>, Vec<ChunkRanges>) = self
                .ranges
                .into_iter()
                .filter(|(_, v)| !v.is_empty())
                .unzip();
            let ranges = ChunkRangesSeq::from_ranges(ranges);
            GetManyRequest { hashes, ranges }
        }
    }

    #[cfg(test)]
    mod tests {
        use bao_tree::ChunkNum;

        use super::*;
        use crate::{protocol::GetManyRequest, util::ChunkRangesExt};

        #[test]
        fn chunk_ranges_ext() {
            let ranges = ChunkRanges::bytes(1..2)
                | ChunkRanges::chunks(100..=200)
                | ChunkRanges::offset(1024 * 10)
                | ChunkRanges::chunk(1024)
                | ChunkRanges::last_chunk();
            assert_eq!(
                ranges,
                ChunkRanges::from(ChunkNum(0)..ChunkNum(1)) // byte range 1..2
                    | ChunkRanges::from(ChunkNum(10)..ChunkNum(11)) // chunk at byte offset 1024*10
                    | ChunkRanges::from(ChunkNum(100)..ChunkNum(201)) // chunk range 100..=200
                    | ChunkRanges::from(ChunkNum(1024)..ChunkNum(1025)) // chunk 1024
                    | ChunkRanges::last_chunk() // last chunk
            );
        }

        #[test]
        fn get_request_builder() {
            let hash = [0; 32];
            let request = GetRequest::builder()
                .root(ChunkRanges::all())
                .next(ChunkRanges::all())
                .next(ChunkRanges::bytes(0..100))
                .build(hash);
            assert_eq!(request.hash.as_bytes(), &hash);
            assert_eq!(
                request.ranges,
                ChunkRangesSeq::from_ranges([
                    ChunkRanges::all(),
                    ChunkRanges::all(),
                    ChunkRanges::from(..ChunkNum(1)),
                ])
            );

            let request = GetRequest::builder()
                .root(ChunkRanges::all())
                .child(2, ChunkRanges::bytes(0..100))
                .build(hash);
            assert_eq!(request.hash.as_bytes(), &hash);
            assert_eq!(
                request.ranges,
                ChunkRangesSeq::from_ranges([
                    ChunkRanges::all(),               // root
                    ChunkRanges::empty(),             // child 0
                    ChunkRanges::empty(),             // child 1
                    ChunkRanges::from(..ChunkNum(1)), // child 2
                ])
            );

            let request = GetRequest::builder()
                .root(ChunkRanges::all())
                .next(ChunkRanges::bytes(0..1024) | ChunkRanges::last_chunk())
                .build_open(hash);
            assert_eq!(request.hash.as_bytes(), &[0; 32]);
            assert_eq!(
                request.ranges,
                ChunkRangesSeq::from_ranges_infinite([
                    ChunkRanges::all(),
                    ChunkRanges::from(..ChunkNum(1)) | ChunkRanges::last_chunk(),
                ])
            );
        }

        #[test]
        fn get_many_request_builder() {
            let hash1 = [0; 32];
            let hash2 = [1; 32];
            let hash3 = [2; 32];
            let request = GetManyRequest::builder()
                .hash(hash1, ChunkRanges::all())
                .hash(hash2, ChunkRanges::empty()) // will be ignored!
                .hash(hash3, ChunkRanges::bytes(0..100))
                .build();
            assert_eq!(
                request.hashes,
                vec![Hash::from([0; 32]), Hash::from([2; 32])]
            );
            assert_eq!(
                request.ranges,
                ChunkRangesSeq::from_ranges([
                    ChunkRanges::all(),               // hash 0
                    ChunkRanges::from(..ChunkNum(1)), // hash 2
                ])
            );
        }
    }
}

#[cfg(test)]
mod tests {
    use iroh_test::{assert_eq_hex, hexdump::parse_hexdump};
    use postcard::experimental::max_size::MaxSize;

    use super::{GetRequest, Request, RequestType};
    use crate::Hash;

    #[test]
    fn request_wire_format() {
        let hash: Hash = [0xda; 32].into();
        let cases = [
            (
                Request::from(GetRequest::blob(hash)),
                r"
                    00 # enum variant for GetRequest
                    dadadadadadadadadadadadadadadadadadadadadadadadadadadadadadadada # the hash
                    020001000100 # the ChunkRangesSeq
                ",
            ),
            (
                Request::from(GetRequest::all(hash)),
                r"
                    00 # enum variant for GetRequest
                    dadadadadadadadadadadadadadadadadadadadadadadadadadadadadadadada # the hash
                    01000100 # the ChunkRangesSeq
                ",
            ),
        ];
        for (case, expected_hex) in cases {
            let expected = parse_hexdump(expected_hex).unwrap();
            let bytes = postcard::to_stdvec(&case).unwrap();
            assert_eq_hex!(bytes, expected);
        }
    }

    #[test]
    fn request_type_size() {
        assert_eq!(RequestType::POSTCARD_MAX_SIZE, 1);
    }
}