//! Protocol for transferring content-addressed blobs over [`iroh`] p2p QUIC connections.
//!
//! # Participants
//!
//! The protocol is a request/response protocol with two parties, a *provider* that
//! serves blobs and a *getter* that requests blobs.
//!
//! # Goals
//!
//! - Be paranoid about data integrity.
//!
//!   Data integrity is considered more important than performance. Data will be validated both on
//!   the provider and getter side. A well behaved provider will never send invalid data. Responses
//!   to range requests contain sufficient information to validate the data.
//!
//!   Note: Validation using blake3 is extremely fast, so in almost all scenarios the validation
//!   will not be the bottleneck even if we validate both on the provider and getter side.
//!
//! - Do not limit the size of blobs or collections.
//!
//!   Blobs can be of arbitrary size, up to terabytes. Likewise, collections can contain an
//!   arbitrary number of links. A well behaved implementation will not require the entire blob or
//!   collection to be in memory at once.
//!
//! - Be efficient when transferring large blobs, including range requests.
//!
//!   It is possible to request entire blobs or ranges of blobs, where the minimum granularity is a
//!   chunk group of 16 KiB, i.e. 16 BLAKE3 chunks. The worst case overhead when doing range requests
//!   is about two chunk groups per range.
//!
//! - Be efficient when transferring multiple tiny blobs.
//!
//!   For tiny blobs the overhead of sending the blob hashes and the round-trip time for each blob
//!   would be prohibitive.
//!
//! To avoid roundtrips, the protocol allows grouping multiple blobs into *collections*.
//! The semantic meaning of a collection is up to the application. For the purpose
//! of this protocol, a collection is just a grouping of related blobs.
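//!
//! As a rough illustration of the chunk group granularity mentioned in the goals above,
//! the following sketch rounds a byte range outward to chunk group boundaries. The
//! constants and the `round_to_chunk_groups` helper are hypothetical and not part of
//! this crate. The sketch ignores the hashes that are sent along with the data and does
//! not clamp the result to the size of the blob.
//!
//! ```rust
//! /// Size of a BLAKE3 chunk in bytes.
//! const CHUNK_SIZE: u64 = 1024;
//! /// Number of chunks per chunk group (assumed from the text above).
//! const CHUNKS_PER_GROUP: u64 = 16;
//! /// Size of a chunk group in bytes (16 KiB).
//! const GROUP_SIZE: u64 = CHUNK_SIZE * CHUNKS_PER_GROUP;
//!
//! /// Rounds the half-open byte range `start..end` outward to chunk group boundaries.
//! fn round_to_chunk_groups(start: u64, end: u64) -> (u64, u64) {
//!     // Round the start down and the end up to a multiple of the group size.
//!     let aligned_start = start / GROUP_SIZE * GROUP_SIZE;
//!     let aligned_end = (end + GROUP_SIZE - 1) / GROUP_SIZE * GROUP_SIZE;
//!     (aligned_start, aligned_end)
//! }
//! ```
//!
//! With this helper, `round_to_chunk_groups(16_350, 16_450)` returns `(0, 32_768)`: a
//! 100 byte range that straddles a group boundary is widened to two full chunk groups.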
//!
//! # Non-goals
//!
//! - Do not attempt to be generic in terms of the used hash function.
//!
//!   The protocol makes extensive use of the [blake3](https://crates.io/crates/blake3) hash
//!   function and its special properties such as blake3 verified streaming.
//!
//! - Do not support graph traversal.
//!
//!   The protocol only supports collections that directly contain blobs. If you have deeply nested
//!   graph data, you will need to either do multiple requests or flatten the graph into a single
//!   temporary collection.
//!
//! - Do not support discovery.
//!
//!   The protocol does not yet have a discovery mechanism for asking the provider what ranges are
//!   available for a given blob. Currently you have to have some out-of-band knowledge about what
//!   node has data for a given hash, or you can just try to retrieve the data and see if it is
//!   available.
//!
//! A discovery protocol is planned in the future though.
//!
//! # Requests
//!
//! ## Getter defined requests
//!
//! In this case the getter knows the hash of the blob it wants to retrieve and
//! whether it wants to retrieve a single blob or a collection.
//!
//! The getter needs to define exactly what it wants to retrieve and send the
//! request to the provider.
//!
//! The provider will then respond with the bao encoded bytes for the requested
//! data and then close the connection. It will immediately close the connection
//! in case some data is not available or invalid.
//!
//! ## Provider defined requests
//!
//! In this case the getter sends a blob to the provider. This blob can contain
//! some kind of query. The exact details of the query are up to the application.
//!
//! The provider evaluates the query and responds with a serialized request in
//! the same format as the getter defined requests, followed by the bao encoded
//! data. From then on the protocol is the same as for getter defined requests.
//!
//! ## Specifying the required data
//!
//! A [`GetRequest`] contains a hash and a specification of what data related to
//! that hash is required. The specification is using a [`ChunkRangesSeq`] which
//! has a compact representation on the wire but is otherwise identical to a
//! sequence of sets of ranges.
//!
//! In the following, we describe how the [`GetRequest`] is to be created for
//! different common scenarios.
//!
//! Under the hood, this is using the [`ChunkRangesSeq`] type, but the most
//! convenient way to create a [`GetRequest`] is to use the builder API.
//!
//! Ranges are always given in terms of 1024 byte blake3 chunks, *not* in terms
//! of bytes or chunk groups. The reason for this is that chunks are the fundamental
//! unit of hashing in BLAKE3. Addressing anything smaller than a chunk is not
//! possible, and combining multiple chunks is merely an optimization to reduce
//! metadata overhead.
//!
//! ### Individual blobs
//!
//! In the easiest case, the getter just wants to retrieve a single blob. In this
//! case, the getter specifies a [`ChunkRangesSeq`] that contains a single element.
//! This element is the set of all chunks, to indicate that we
//! want the entire blob, no matter how many chunks it has.
//!
//! Since this is a very common case, there is a convenience method
//! [`GetRequest::blob`] that only requires the hash of the blob.
//!
//! ```rust
//! # use iroh_blobs::protocol::GetRequest;
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::blob(hash);
//! ```
//!
//! ### Ranges of blobs
//!
//! In this case, we have a (possibly large) blob and we want to retrieve only
//! some ranges of chunks. This is useful in similar cases as HTTP range requests.
//!
//! We still need just a single element in the [`ChunkRangesSeq`], since we are
//! still only interested in a single blob. However, this element contains all
//! the chunk ranges we want to retrieve.
//!
//! For example, if we want to retrieve the first 10 chunks (chunks 0 through 9)
//! of a blob, we would create a [`GetRequest`] like this:
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .root(ChunkRanges::chunks(..10))
//!     .build(hash);
//! ```
//!
//! While not that common, it is also possible to request multiple ranges of a
//! single blob. For example, if we want to retrieve chunks `0..10` and `100..110`
//! of a large file, we would create a [`GetRequest`] like this:
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt, ChunkRangesSeq};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .root(ChunkRanges::chunks(..10) | ChunkRanges::chunks(100..110))
//!     .build(hash);
//! ```
//!
//! This is all great, but in most cases we are not interested in chunks but
//! in bytes. The [`ChunkRanges`] type has a constructor that allows providing
//! byte ranges instead of chunk ranges. These will be rounded up to the
//! nearest chunk.
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt, ChunkRangesSeq};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .root(ChunkRanges::bytes(..1000) | ChunkRanges::bytes(10000..11000))
//!     .build(hash);
//! ```
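//!
//! To make the rounding concrete, here is a minimal sketch of how a byte range could be
//! widened to whole 1024 byte chunks. It assumes that the start is rounded down and the
//! end is rounded up; `byte_range_to_chunk_range` is a hypothetical helper for
//! illustration, not a function of this crate.
//!
//! ```rust
//! /// Size of a BLAKE3 chunk in bytes.
//! const CHUNK_SIZE: u64 = 1024;
//!
//! /// Widens the half-open byte range `start..end` to whole chunks and returns
//! /// the half-open range of chunk numbers that covers it.
//! fn byte_range_to_chunk_range(start: u64, end: u64) -> (u64, u64) {
//!     // The chunk containing the first byte.
//!     let first_chunk = start / CHUNK_SIZE;
//!     // One past the chunk containing the last byte.
//!     let end_chunk = (end + CHUNK_SIZE - 1) / CHUNK_SIZE;
//!     (first_chunk, end_chunk)
//! }
//! ```
//!
//! Under these assumptions, bytes `0..1000` map to chunks `0..1` and bytes
//! `10000..11000` map to chunks `9..11`.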
//!
//! There are also methods to request a single chunk or a single byte offset,
//! as well as a special constructor for the last chunk of a blob.
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt, ChunkRangesSeq};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .root(ChunkRanges::offset(1) | ChunkRanges::last_chunk())
//!     .build(hash);
//! ```
//!
//! To specify chunk ranges, we use the [`ChunkRanges`] type alias.
//! This is actually the [`RangeSet`] type from the
//! [range_collections](https://crates.io/crates/range_collections) crate. This
//! type supports efficient boolean operations on sets of non-overlapping ranges.
//!
//! The [`RangeSet2`] type is a type alias for [`RangeSet`] that can store up to
//! 2 boundaries without allocating. This is sufficient for most use cases.
//!
//! [`RangeSet`]: range_collections::range_set::RangeSet
//! [`RangeSet2`]: range_collections::range_set::RangeSet2
//!
//! ### Hash sequences
//!
//! In this case the provider has a hash sequence that refers to multiple blobs.
//! We want to retrieve all blobs in the hash sequence.
//!
//! When used for hash sequences, the first element of a [`ChunkRangesSeq`] refers
//! to the hash seq itself, and all subsequent elements refer to the blobs
//! in the hash seq. When a [`ChunkRangesSeq`] specifies ranges for more than
//! one blob, the provider will interpret this as a request for a hash seq.
//!
//! One thing to note is that we might not yet know how many blobs are in the
//! hash sequence. Therefore, it is not possible to download an entire hash seq
//! by just specifying [`ChunkRanges::all()`] for all children.
//!
//! Instead, [`ChunkRangesSeq`] allows defining infinite sequences of range sets.
//! The [`ChunkRangesSeq::all()`] method returns a [`ChunkRangesSeq`] that, when iterated
//! over, will yield [`ChunkRanges::all()`] forever.
//!
//! So a get request to download a hash sequence blob and all its children
//! would look like this:
//!
//! ```rust
//! # use iroh_blobs::protocol::{ChunkRanges, ChunkRangesExt, GetRequest};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .root(ChunkRanges::all())
//!     .build_open(hash); // repeats the last range forever
//! ```
//!
//! Downloading an entire hash seq is also a very common case, so there is a
//! convenience method [`GetRequest::all`] that only requires the hash of the
//! hash sequence blob.
//!
//! ```rust
//! # use iroh_blobs::protocol::{ChunkRanges, ChunkRangesExt, GetRequest};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::all(hash);
//! ```
//!
//! ### Parts of hash sequences
//!
//! The most complex common case is when we have retrieved a hash seq and
//! its children, but were interrupted before we could retrieve all children.
//!
//! In this case we need to specify the hash seq we want to retrieve, but
//! exclude the children and parts of children that we already have.
//!
//! For example, suppose we have a hash seq with 3 children, and we already have
//! the hash seq itself, the first child, and the first 1000000 chunks of the
//! second child.
//!
//! We would create a [`GetRequest`] like this:
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .child(1, ChunkRanges::chunks(1000000..)) // skip the first child, get the rest of the second
//!     .next(ChunkRanges::all()) // we need the third child and all subsequent children completely
//!     .build_open(hash);
//! ```
//!
//! ### Requesting chunks for each child
//!
//! [`ChunkRangesSeq`] allows some scenarios that are not covered above. E.g. you
//! might want to request a hash seq and the first chunk of each child blob to
//! do something like mime type detection.
//!
//! You do not know how many children the collection has, so you need to use
//! an infinite sequence.
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt, ChunkRangesSeq};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .root(ChunkRanges::all())
//!     .next(ChunkRanges::chunk(0)) // the first chunk of each child
//!     .build_open(hash);
//! ```
//!
//! ### Requesting a single child
//!
//! It is of course possible to request a single child of a collection. E.g.
//! the following would download the second child of a collection:
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesExt};
//! # let hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::builder()
//!     .child(1, ChunkRanges::all()) // we need the second child completely
//!     .build(hash);
//! ```
//!
//! However, if you already have the collection, you might as well locally
//! look up the hash of the child and request it directly.
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetRequest, ChunkRanges, ChunkRangesSeq};
//! # let child_hash: iroh_blobs::Hash = [0; 32].into();
//! let request = GetRequest::blob(child_hash);
//! ```
//!
//! ### Why ChunkRanges and ChunkRangesSeq?
//!
//! You might wonder why we have [`ChunkRangesSeq`], when a simple
//! sequence of [`ChunkRanges`] might also do.
//!
//! The [`ChunkRangesSeq`] type exists to provide an efficient
//! representation of the request on the wire. In the wire encoding of [`ChunkRangesSeq`],
//! [`ChunkRanges`] are encoded as alternating intervals of selected and non-selected chunks.
//! This results in smaller numbers, which in turn take fewer bytes on the wire when using
//! the [postcard](https://crates.io/crates/postcard) encoding format that uses variable
//! length integers.
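//!
//! The effect of encoding interval lengths instead of absolute positions can be sketched
//! as follows. This is only an illustration of the idea, not the encoder used by this
//! crate: `encode_boundaries`, `varint_len` and `encoded_len` are hypothetical helpers,
//! and the size estimate assumes a LEB128-style varint with 7 payload bits per byte.
//!
//! ```rust
//! /// Delta-encodes the sorted boundaries of a set of half-open ranges.
//! ///
//! /// The ranges `0..10` and `100..110` have the boundaries `[0, 10, 100, 110]`.
//! /// The deltas `[0, 10, 90, 10]` are the lengths of the alternating
//! /// non-selected and selected intervals.
//! fn encode_boundaries(boundaries: &[u64]) -> Vec<u64> {
//!     let mut previous = 0;
//!     boundaries
//!         .iter()
//!         .map(|&boundary| {
//!             // Boundaries are assumed to be sorted in ascending order.
//!             let delta = boundary - previous;
//!             previous = boundary;
//!             delta
//!         })
//!         .collect()
//! }
//!
//! /// Number of bytes needed for a varint with 7 payload bits per byte.
//! fn varint_len(value: u64) -> usize {
//!     let significant_bits = (64 - value.leading_zeros()).max(1) as usize;
//!     (significant_bits + 6) / 7
//! }
//!
//! /// Total varint size of a slice of integers.
//! fn encoded_len(values: &[u64]) -> usize {
//!     values.iter().map(|&value| varint_len(value)).sum()
//! }
//! ```
//!
//! For the single range `1000000..1000010`, the absolute boundaries need 6 bytes under
//! this estimate, while the deltas `[1000000, 10]` need only 4.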
//!
//! Likewise, the [`ChunkRangesSeq`] type
//! does run length encoding to remove repeating elements. It also allows infinite
//! sequences of [`ChunkRanges`] to be encoded, unlike a simple sequence of
//! [`ChunkRanges`]s.
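//!
//! A minimal sketch of such a run length encoded, infinitely repeating sequence is shown
//! below. `RunLengthSeq` is a hypothetical type used only for illustration; it stores
//! `(start_index, value)` pairs and is not the representation used by this crate.
//!
//! ```rust
//! /// A sequence stored as `(start_index, value)` runs. Each value applies from
//! /// its start index up to the start of the next run; the last run repeats forever.
//! struct RunLengthSeq<T> {
//!     runs: Vec<(u64, T)>,
//! }
//!
//! impl<T: PartialEq> RunLengthSeq<T> {
//!     /// Builds the runs from explicit items, followed by `tail` repeated forever.
//!     fn new(items: Vec<T>, tail: T) -> Self {
//!         let mut runs: Vec<(u64, T)> = Vec::new();
//!         for (index, item) in items.into_iter().chain(std::iter::once(tail)).enumerate() {
//!             // Only start a new run when the value changes.
//!             if runs.last().map_or(true, |(_, last)| *last != item) {
//!                 runs.push((index as u64, item));
//!             }
//!         }
//!         Self { runs }
//!     }
//!
//!     /// Returns the value at `index`, which may be arbitrarily large.
//!     fn get(&self, index: u64) -> &T {
//!         // The first run always starts at index 0, so `position` is at least 1.
//!         let position = self.runs.partition_point(|(start, _)| *start <= index);
//!         &self.runs[position - 1].1
//!     }
//! }
//! ```
//!
//! For example, `RunLengthSeq::new(vec!["all", "head", "head"], "head")` stores just the
//! two runs `(0, "all")` and `(1, "head")`, and `get` returns `"head"` for every index
//! from 1 onwards.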
//!
//! [`ChunkRangesSeq`] should be efficient even in case of very fragmented availability
//! of chunks, like a download from multiple providers that was frequently interrupted.
//!
//! # Responses
//!
//! The response stream contains the bao encoded bytes for the requested data.
//! The data will be sent in the order in which it was requested, so ascending
//! chunks for each blob, and blobs in the order in which they appear in the
//! hash seq.
//!
//! For details on the bao encoding, see the [bao specification](https://github.com/oconnor663/bao/blob/master/docs/spec.md)
//! and the [bao-tree](https://crates.io/crates/bao-tree) crate. The bao-tree crate
//! is identical to the bao crate, except that it allows combining multiple BLAKE3
//! chunks to chunk groups for efficiency.
//!
//! For a complete response, the chunks are guaranteed to completely cover the
//! requested ranges.
//!
//! There are two reasons for not receiving a complete response:
//!
//! - The connection to the provider was interrupted, or the provider encountered
//!   an internal error. In this case the provider will close the entire quinn connection.
//!
//! - The provider does not have the requested data, or discovered on send that the
//!   requested data is not valid.
//!
//!   In this case the provider will close just the stream used to send the response.
//!   The exact location of the missing data can be retrieved from the error.
//!
//! # Requesting multiple unrelated blobs
//!
//! Let's say you don't have a hash sequence on the provider side, but you
//! nevertheless want to request multiple unrelated blobs in a single request.
//!
//! For this, there is the [`GetManyRequest`] type, which also comes with a
//! builder API.
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetManyRequest, ChunkRanges, ChunkRangesExt};
//! # let hash1: iroh_blobs::Hash = [0; 32].into();
//! # let hash2: iroh_blobs::Hash = [1; 32].into();
//! GetManyRequest::builder()
//!     .hash(hash1, ChunkRanges::all())
//!     .hash(hash2, ChunkRanges::all())
//!     .build();
//! ```
//!
//! If you accidentally or intentionally request ranges for the same hash
//! multiple times, they will be merged into a single [`ChunkRanges`].
//!
//! ```rust
//! # use iroh_blobs::protocol::{GetManyRequest, ChunkRanges, ChunkRangesExt};
//! # let hash1: iroh_blobs::Hash = [0; 32].into();
//! # let hash2: iroh_blobs::Hash = [1; 32].into();
//! GetManyRequest::builder()
//!     .hash(hash1, ChunkRanges::chunk(1))
//!     .hash(hash2, ChunkRanges::all())
//!     .hash(hash1, ChunkRanges::last_chunk())
//!     .build();
//! ```
//!
//! This is mostly useful for requesting multiple tiny blobs in a single request.
//! For large or even medium sized blobs, multiple requests are not expensive.
//! Multiple requests just create multiple streams on the same connection,
//! which is *very* cheap in QUIC.
//!
//! In case nodes are permanently exchanging data, it is somewhat valuable to
//! keep a connection open and reuse it for multiple requests. However, creating
//! a new connection is also very cheap, so you would only do this to optimize
//! a large existing system that has demonstrated performance issues.
//!
//! If in doubt, just use multiple requests and multiple connections.
use std::io;

use builder::GetRequestBuilder;
use derive_more::From;
use iroh::endpoint::VarInt;
use irpc::util::AsyncReadVarintExt;
use postcard::experimental::max_size::MaxSize;
use serde::{Deserialize, Serialize};
mod range_spec;
pub use bao_tree::ChunkRanges;
pub use range_spec::{ChunkRangesSeq, NonEmptyRequestRangeSpecIter, RangeSpec};
use snafu::{GenerateImplicitData, Snafu};
use tokio::io::AsyncReadExt;

pub use crate::util::ChunkRangesExt;
use crate::{api::blobs::Bitfield, provider::CountingReader, BlobFormat, Hash, HashAndFormat};

/// Maximum message size is limited to 1 MiB for now.
pub const MAX_MESSAGE_SIZE: usize = 1024 * 1024;

/// The ALPN used with QUIC for the iroh blobs protocol.
pub const ALPN: &[u8] = b"/iroh-bytes/4";

#[derive(Deserialize, Serialize, Debug, PartialEq, Eq, Clone, From)]
/// A request to the provider
pub enum Request {
    /// A get request for a blob or collection
    Get(GetRequest),
    Observe(ObserveRequest),
    Slot2,
    Slot3,
    Slot4,
    Slot5,
    Slot6,
    Slot7,
    /// The inverse of a get request - push data to the provider
    ///
    /// Note that providers will in many cases reject this request, e.g. if
    /// they don't have write access to the store or don't want to ingest
    /// unknown data.
    Push(PushRequest),
    /// Get multiple blobs in a single request, from a single provider
    ///
    /// This is identical to a [`GetRequest`] for a [`crate::hashseq::HashSeq`], but the provider
    /// does not need to have the hash seq.
    GetMany(GetManyRequest),
}

/// This must contain the request types in the same order as the full requests
#[derive(Deserialize, Serialize, Debug, PartialEq, Eq, Clone, Copy, MaxSize)]
pub enum RequestType {
    Get,
    Observe,
    Slot2,
    Slot3,
    Slot4,
    Slot5,
    Slot6,
    Slot7,
    Push,
    GetMany,
}

impl Request {
    pub async fn read_async(
        reader: &mut CountingReader<&mut iroh::endpoint::RecvStream>,
    ) -> io::Result<Self> {
        let request_type = reader.read_u8().await?;
        let request_type: RequestType = postcard::from_bytes(std::slice::from_ref(&request_type))
            .map_err(|_| {
            io::Error::new(
                io::ErrorKind::InvalidData,
                "failed to deserialize request type",
            )
        })?;
        Ok(match request_type {
            RequestType::Get => reader
                .read_to_end_as::<GetRequest>(MAX_MESSAGE_SIZE)
                .await?
                .into(),
            RequestType::GetMany => reader
                .read_to_end_as::<GetManyRequest>(MAX_MESSAGE_SIZE)
                .await?
                .into(),
            RequestType::Observe => reader
                .read_to_end_as::<ObserveRequest>(MAX_MESSAGE_SIZE)
                .await?
                .into(),
            RequestType::Push => reader
                .read_length_prefixed::<PushRequest>(MAX_MESSAGE_SIZE)
                .await?
                .into(),
            _ => {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidData,
                    "failed to deserialize request type",
                ));
            }
        })
    }
}

/// A get request
#[derive(Deserialize, Serialize, Debug, PartialEq, Eq, Clone, Hash)]
pub struct GetRequest {
    /// blake3 hash
    pub hash: Hash,
    /// The range of data to request
    ///
    /// The first element is the parent, all subsequent elements are children.
    pub ranges: ChunkRangesSeq,
}

impl From<HashAndFormat> for GetRequest {
    fn from(value: HashAndFormat) -> Self {
        match value.format {
            BlobFormat::Raw => Self::blob(value.hash),
            BlobFormat::HashSeq => Self::all(value.hash),
        }
    }
}

impl GetRequest {
    pub fn builder() -> GetRequestBuilder {
        GetRequestBuilder::default()
    }

    pub fn content(&self) -> HashAndFormat {
        HashAndFormat {
            hash: self.hash,
            format: if self.ranges.is_blob() {
                BlobFormat::Raw
            } else {
                BlobFormat::HashSeq
            },
        }
    }

    /// Request a blob or collection with specified ranges
    pub fn new(hash: Hash, ranges: ChunkRangesSeq) -> Self {
        Self { hash, ranges }
    }

    /// Request a collection and all its children
    pub fn all(hash: impl Into<Hash>) -> Self {
        Self {
            hash: hash.into(),
            ranges: ChunkRangesSeq::all(),
        }
    }

    /// Request just a single blob
    pub fn blob(hash: impl Into<Hash>) -> Self {
        Self {
            hash: hash.into(),
            ranges: ChunkRangesSeq::from_ranges([ChunkRanges::all()]),
        }
    }

    /// Request ranges from a single blob
    pub fn blob_ranges(hash: Hash, ranges: ChunkRanges) -> Self {
        Self {
            hash,
            ranges: ChunkRangesSeq::from_ranges([ranges]),
        }
    }
}

/// A push request contains a description of what to push, but will be followed
/// by the data to push.
#[derive(
    Deserialize, Serialize, Debug, PartialEq, Eq, Clone, derive_more::From, derive_more::Deref,
)]
pub struct PushRequest(GetRequest);

impl PushRequest {
    pub fn new(hash: Hash, ranges: ChunkRangesSeq) -> Self {
        Self(GetRequest::new(hash, ranges))
    }
}

/// A GetMany request is a request to get multiple blobs via a single request.
///
/// It is identical to a [`GetRequest`] for a HashSeq, but the HashSeq is provided
/// by the requester.
#[derive(Deserialize, Serialize, Debug, PartialEq, Eq, Clone)]
pub struct GetManyRequest {
    /// The hashes of the blobs to get
    pub hashes: Vec<Hash>,
    /// The ranges of data to request
    ///
    /// There is no range request for the parent, since we just sent the hashes
    /// and therefore have the parent already.
    pub ranges: ChunkRangesSeq,
}

impl<I: Into<Hash>> FromIterator<I> for GetManyRequest {
    fn from_iter<T: IntoIterator<Item = I>>(iter: T) -> Self {
        let mut res = iter.into_iter().map(Into::into).collect::<Vec<Hash>>();
        res.sort();
        res.dedup();
        let n = res.len() as u64;
        Self {
            hashes: res,
            ranges: ChunkRangesSeq(smallvec::smallvec![
                (0, ChunkRanges::all()),
                (n, ChunkRanges::empty())
            ]),
        }
    }
}

impl GetManyRequest {
    pub fn new(hashes: Vec<Hash>, ranges: ChunkRangesSeq) -> Self {
        Self { hashes, ranges }
    }

    pub fn builder() -> builder::GetManyRequestBuilder {
        builder::GetManyRequestBuilder::default()
    }
}

/// A request to observe a raw blob bitfield.
#[derive(Deserialize, Serialize, Debug, PartialEq, Eq, Clone, Hash)]
pub struct ObserveRequest {
    /// blake3 hash
    pub hash: Hash,
    /// ranges to observe.
    pub ranges: RangeSpec,
}

impl ObserveRequest {
    pub fn new(hash: Hash) -> Self {
        Self {
            hash,
            ranges: RangeSpec::all(),
        }
    }
}

#[derive(Deserialize, Serialize, Debug, PartialEq, Eq)]
pub struct ObserveItem {
    pub size: u64,
    pub ranges: ChunkRanges,
}

impl From<&Bitfield> for ObserveItem {
    fn from(value: &Bitfield) -> Self {
        Self {
            size: value.size,
            ranges: value.ranges.clone(),
        }
    }
}

impl From<&ObserveItem> for Bitfield {
    fn from(value: &ObserveItem) -> Self {
        Self {
            size: value.size,
            ranges: value.ranges.clone(),
        }
    }
}

/// Reasons to close connections or stop streams.
///
/// A QUIC **connection** can be *closed* and a **stream** can request the other side to
/// *stop* sending data.  Both closing and stopping have an associated `error_code`, closing
/// also adds a `reason` as some arbitrary bytes.
///
/// This enum exists so we have a single namespace for `error_code`s used.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
#[repr(u16)]
pub enum Closed {
    /// The [`RecvStream`] was dropped.
    ///
    /// Used implicitly when a [`RecvStream`] is dropped without explicit call to
    /// [`RecvStream::stop`].  We don't use this explicitly but this is here as
    /// documentation as to what happened to `0`.
    ///
    /// [`RecvStream`]: iroh::endpoint::RecvStream
    /// [`RecvStream::stop`]: iroh::endpoint::RecvStream::stop
    StreamDropped = 0,
    /// The provider is terminating.
    ///
    /// When a provider terminates all connections and associated streams are closed.
    ProviderTerminating = 1,
    /// The provider has received the request.
    ///
    /// Only a single request is allowed on a stream. If more data is received after this, a
    /// provider may send this error code in a STOP_SENDING frame.
    RequestReceived = 2,
}

impl Closed {
    /// The close reason as bytes. This is a valid utf8 string describing the reason.
    pub fn reason(&self) -> &'static [u8] {
        match self {
            Closed::StreamDropped => b"stream dropped",
            Closed::ProviderTerminating => b"provider terminating",
            Closed::RequestReceived => b"request received",
        }
    }
}

impl From<Closed> for VarInt {
    fn from(source: Closed) -> Self {
        VarInt::from(source as u16)
    }
}

/// Unknown error_code, can not be converted into [`Closed`].
#[derive(Debug, Snafu)]
#[snafu(display("Unknown error_code: {code}"))]
pub struct UnknownErrorCode {
    code: u64,
    backtrace: Option<snafu::Backtrace>,
}

impl UnknownErrorCode {
    pub(crate) fn new(code: u64) -> Self {
        Self {
            code,
            backtrace: GenerateImplicitData::generate(),
        }
    }
}

impl TryFrom<VarInt> for Closed {
    type Error = UnknownErrorCode;

    fn try_from(value: VarInt) -> std::result::Result<Self, Self::Error> {
        match value.into_inner() {
            0 => Ok(Self::StreamDropped),
            1 => Ok(Self::ProviderTerminating),
            2 => Ok(Self::RequestReceived),
            val => Err(UnknownErrorCode::new(val)),
        }
    }
}

pub mod builder {
    use std::collections::BTreeMap;

    use bao_tree::ChunkRanges;

    use super::ChunkRangesSeq;
    use crate::{
        protocol::{GetManyRequest, GetRequest},
        Hash,
    };

    #[derive(Debug, Clone, Default)]
    pub struct ChunkRangesSeqBuilder {
        ranges: BTreeMap<u64, ChunkRanges>,
    }

    #[derive(Debug, Clone, Default)]
    pub struct GetRequestBuilder {
        builder: ChunkRangesSeqBuilder,
    }

    impl GetRequestBuilder {
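        /// Add ranges at the given offset in the sequence.
        ///
        /// Offset 0 is the root blob, offset `n + 1` is child `n`. Ranges added at an
        /// offset that already has ranges are merged with the existing ranges.
        pub fn offset(mut self, offset: u64, ranges: impl Into<ChunkRanges>) -> Self {
            self.builder = self.builder.offset(offset, ranges);
            self
        }

        /// Add ranges for the child with the given zero-based index.
        ///
        /// Child `n` is stored at offset `n + 1`, since offset 0 is the root blob.
        pub fn child(mut self, child: u64, ranges: impl Into<ChunkRanges>) -> Self {
            self.builder = self.builder.offset(child + 1, ranges);
            self
        }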

        /// Specify ranges for the root blob (the HashSeq)
        pub fn root(mut self, ranges: impl Into<ChunkRanges>) -> Self {
            self.builder = self.builder.offset(0, ranges);
            self
        }

        /// Specify ranges for the next offset.
        pub fn next(mut self, ranges: impl Into<ChunkRanges>) -> Self {
            self.builder = self.builder.next(ranges);
            self
        }

        /// Build a get request for the given hash, with the ranges specified in the builder.
        pub fn build(self, hash: impl Into<Hash>) -> GetRequest {
            let ranges = self.builder.build();
            GetRequest::new(hash.into(), ranges)
        }

        /// Build a get request for the given hash, with the ranges specified in the builder
        /// and the last non-empty range repeating indefinitely.
        pub fn build_open(self, hash: impl Into<Hash>) -> GetRequest {
            let ranges = self.builder.build_open();
            GetRequest::new(hash.into(), ranges)
        }
    }

    impl ChunkRangesSeqBuilder {
        /// Add ranges at the given offset, merging them with any ranges already present there.
        pub fn offset(self, offset: u64, ranges: impl Into<ChunkRanges>) -> Self {
            self.at_offset(offset, ranges.into())
        }

        /// Specify ranges for the next offset.
        pub fn next(self, ranges: impl Into<ChunkRanges>) -> Self {
            let offset = self.next_offset_value();
            self.at_offset(offset, ranges.into())
        }

        /// Build a [`ChunkRangesSeq`] from the ranges specified in the builder.
        pub fn build(self) -> ChunkRangesSeq {
            ChunkRangesSeq::from_ranges(self.build0())
        }

        /// Build a [`ChunkRangesSeq`] from the ranges specified in the builder,
        /// with the last non-empty range repeating indefinitely.
        pub fn build_open(self) -> ChunkRangesSeq {
            ChunkRangesSeq::from_ranges_infinite(self.build0())
        }
799
800        /// Add ranges at the given offset.
        fn at_offset(mut self, offset: u64, ranges: ChunkRanges) -> Self {
            self.ranges
                .entry(offset)
                .and_modify(|v| *v |= ranges.clone())
                .or_insert(ranges);
            self
        }

        /// Collect the ranges into a dense sequence with one entry per offset.
        ///
        /// Empty entries are dropped first. Offsets without ranges, up to the last
        /// non-empty offset, are then filled with empty ranges.
        fn build0(mut self) -> impl Iterator<Item = ChunkRanges> {
            let mut ranges = Vec::new();
            // Drop empty entries so that trailing empty offsets do not extend the sequence.
            self.ranges.retain(|_, v| !v.is_empty());
            let until_key = self.next_offset_value();
            for offset in 0..until_key {
                ranges.push(self.ranges.remove(&offset).unwrap_or_default());
            }
            ranges.into_iter()
        }

        /// Get the next offset value: one past the highest offset, or 0 if no ranges are set.
        fn next_offset_value(&self) -> u64 {
            self.ranges
                .last_key_value()
                .map(|(k, _)| *k + 1)
                .unwrap_or_default()
        }
    }

    /// Builder for a `GetManyRequest`.
    #[derive(Debug, Clone, Default)]
    pub struct GetManyRequestBuilder {
        ranges: BTreeMap<Hash, ChunkRanges>,
    }

    impl GetManyRequestBuilder {
        /// Specify ranges for the given hash.
        ///
        /// Note that if you specify a hash that is already in the request, the ranges will be
        /// merged with the existing ranges.
        pub fn hash(mut self, hash: impl Into<Hash>, ranges: impl Into<ChunkRanges>) -> Self {
            let ranges = ranges.into();
            let hash = hash.into();
            self.ranges
                .entry(hash)
                .and_modify(|v| *v |= ranges.clone())
                .or_insert(ranges);
            self
        }

        /// Build a `GetManyRequest`.
        ///
        /// Hashes whose ranges are empty are omitted from the request.
        pub fn build(self) -> GetManyRequest {
            let (hashes, ranges): (Vec<Hash>, Vec<ChunkRanges>) = self
                .ranges
                .into_iter()
                .filter(|(_, v)| !v.is_empty())
                .unzip();
            let ranges = ChunkRangesSeq::from_ranges(ranges);
            GetManyRequest { hashes, ranges }
        }
    }

    #[cfg(test)]
    mod tests {
        use bao_tree::ChunkNum;

        use super::*;
        use crate::{protocol::GetManyRequest, util::ChunkRangesExt};

        #[test]
        fn chunk_ranges_ext() {
            let ranges = ChunkRanges::bytes(1..2)
                | ChunkRanges::chunks(100..=200)
                | ChunkRanges::offset(1024 * 10)
                | ChunkRanges::chunk(1024)
                | ChunkRanges::last_chunk();
            assert_eq!(
                ranges,
                ChunkRanges::from(ChunkNum(0)..ChunkNum(1)) // byte range 1..2
                    | ChunkRanges::from(ChunkNum(10)..ChunkNum(11)) // chunk at byte offset 1024*10
                    | ChunkRanges::from(ChunkNum(100)..ChunkNum(201)) // chunk range 100..=200
                    | ChunkRanges::from(ChunkNum(1024)..ChunkNum(1025)) // chunk 1024
                    | ChunkRanges::last_chunk() // last chunk
            );
        }

        #[test]
        fn get_request_builder() {
            let hash = [0; 32];
            let request = GetRequest::builder()
                .root(ChunkRanges::all())
                .next(ChunkRanges::all())
                .next(ChunkRanges::bytes(0..100))
                .build(hash);
            assert_eq!(request.hash.as_bytes(), &hash);
            assert_eq!(
                request.ranges,
                ChunkRangesSeq::from_ranges([
                    ChunkRanges::all(),
                    ChunkRanges::all(),
                    ChunkRanges::from(..ChunkNum(1)),
                ])
            );

            let request = GetRequest::builder()
                .root(ChunkRanges::all())
                .child(2, ChunkRanges::bytes(0..100))
                .build(hash);
            assert_eq!(request.hash.as_bytes(), &hash);
            assert_eq!(
                request.ranges,
                ChunkRangesSeq::from_ranges([
                    ChunkRanges::all(),               // root
                    ChunkRanges::empty(),             // child 0
                    ChunkRanges::empty(),             // child 1
                    ChunkRanges::from(..ChunkNum(1)), // child 2
                ])
            );

            let request = GetRequest::builder()
                .root(ChunkRanges::all())
                .next(ChunkRanges::bytes(0..1024) | ChunkRanges::last_chunk())
                .build_open(hash);
            assert_eq!(request.hash.as_bytes(), &[0; 32]);
            assert_eq!(
                request.ranges,
                ChunkRangesSeq::from_ranges_infinite([
                    ChunkRanges::all(),
                    ChunkRanges::from(..ChunkNum(1)) | ChunkRanges::last_chunk(),
                ])
            );
        }

        #[test]
        fn get_many_request_builder() {
            let hash1 = [0; 32];
            let hash2 = [1; 32];
            let hash3 = [2; 32];
            let request = GetManyRequest::builder()
                .hash(hash1, ChunkRanges::all())
                .hash(hash2, ChunkRanges::empty()) // will be ignored!
                .hash(hash3, ChunkRanges::bytes(0..100))
                .build();
            assert_eq!(
                request.hashes,
                vec![Hash::from([0; 32]), Hash::from([2; 32])]
            );
            assert_eq!(
                request.ranges,
                ChunkRangesSeq::from_ranges([
                    ChunkRanges::all(),               // hash1
                    ChunkRanges::from(..ChunkNum(1)), // hash3
                ])
            );
        }
    }
}

#[cfg(test)]
mod tests {
    use iroh_test::{assert_eq_hex, hexdump::parse_hexdump};
    use postcard::experimental::max_size::MaxSize;

    use super::{GetRequest, Request, RequestType};
    use crate::Hash;

    #[test]
    fn request_wire_format() {
        let hash: Hash = [0xda; 32].into();
        let cases = [
            (
                Request::from(GetRequest::blob(hash)),
                r"
                    00 # enum variant for GetRequest
                    dadadadadadadadadadadadadadadadadadadadadadadadadadadadadadadada # the hash
                    020001000100 # the ChunkRangesSeq
            ",
            ),
            (
                Request::from(GetRequest::all(hash)),
                r"
                    00 # enum variant for GetRequest
                    dadadadadadadadadadadadadadadadadadadadadadadadadadadadadadadada # the hash
                    01000100 # the ChunkRangesSeq
            ",
            ),
        ];
        for (case, expected_hex) in cases {
            let expected = parse_hexdump(expected_hex).unwrap();
            let bytes = postcard::to_stdvec(&case).unwrap();
            assert_eq_hex!(bytes, expected);
        }
    }

    #[test]
    fn request_type_size() {
        assert_eq!(RequestType::POSTCARD_MAX_SIZE, 1);
    }
}