cboritem 0.1.3

Types for serialized CBOR items
//! # cboritem: A serialized CBOR item
//!
//! A [`CborItem<'a>`](CborItem) is a newtype around [`&'a [u8]`](u8) that upholds the invariant of
//! containing a single serialized CBOR item. A [`ThinCborItem<'a>`](ThinCborItem) is just its start
//! pointer; when accessing it, users rely on the invariant to stop reading at the end of the item.
//!
//! In a sense, the types are similar to [`&str`](str) and [`CStr`](core::ffi::CStr), respectively,
//! once the latter follows the [plan] of eventually becoming a thin pointer.
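//!
//! The size difference between fat and thin references can be observed with plain slices and
//! pointers. The following is a minimal sketch that does not use this crate; the function name
//! `fat_and_thin_sizes` is made up for illustration, and the factor of two reflects current Rust
//! implementations rather than a documented language guarantee:
//!
//! ```rust
//! use core::mem::size_of;
//!
//! /// Hypothetical helper: sizes in bytes of a fat slice reference and of a thin pointer.
//! fn fat_and_thin_sizes() -> (usize, usize) {
//!     // A slice reference carries an address and a length; a raw pointer only an address.
//!     (size_of::<&[u8]>(), size_of::<*const u8>())
//! }
//! ```
//!
//! A [`CborItem`] has the size of the former and a [`ThinCborItem`] the size of the latter.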
//!
//! ## Library use
//!
//! They are meant for efficiently storing pre-verified slices of CBOR items, e.g. when implementing
//! [packed CBOR]. They can also serve as an interface point between CBOR libraries. This eases
//! error handling because end-of-stream and bytes-after-the-data errors become panic-worthy
//! invariant violations; other parsing errors may still occur if there is any well-formed data
//! that the parser cannot process, or that violates basic validity requirements. Finally, they
//! can act as a marker type by which CBOR parsers can be instructed to process any item into a
//! slice for later detailed inspection.
//!
//! This crate is not a CBOR library, thus it contains no functions to safely create any of its
//! types (as that requires a CBOR parser). Instead, its types are intended to be produced by CBOR
//! parsers after they have verified that the invariant holds.
//!
//! ## Invariant definition
//!
//! The invariant upheld in this crate's types is that their bytes are exactly one well-formed CBOR
//! item as defined in [RFC8949]; in particular, that means they contain at least one byte.
//!
//! The invariants in this crate are **soundness invariants**: Receivers of a CBOR item are not
//! limited to panicking when they find invalid CBOR; they may even invoke undefined behavior
//! (e.g. by calling [`unreachable_unchecked`](core::hint::unreachable_unchecked)). Consequently,
//! creating a CBOR item requires use of the `unsafe` keyword, with the invariants being checked
//! by the parser that creates the item.
//!
//! This is a necessary consequence of providing thin pointers: The invariants are relied on by
//! users who read through a raw pointer, which on inaccurate data would result in reads beyond the
//! original allocated object, which is undefined behavior.
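//!
//! To illustrate how the content of earlier bytes justifies later reads, the following sketch
//! derives the total encoded length of an integer item (major type 0 or 1) from its initial byte
//! alone, following the head encoding of [RFC8949]. The function name `integer_item_len` is
//! hypothetical and not part of this crate, and a real consumer would have to cover all major
//! types:
//!
//! ```rust
//! /// Hypothetical helper: total length in bytes of an integer item, given its initial byte.
//! /// Returns `None` for other major types and for initial bytes that are not well-formed.
//! fn integer_item_len(initial: u8) -> Option<usize> {
//!     // The high 3 bits are the major type; only integers (0 and 1) are handled here.
//!     if initial >> 5 > 1 {
//!         return None;
//!     }
//!     // The low 5 bits are the additional information, which tells where the argument is.
//!     match initial & 0x1f {
//!         0..=23 => Some(1), // the argument is contained in the initial byte
//!         24 => Some(2),     // a 1-byte argument follows
//!         25 => Some(3),     // a 2-byte argument follows
//!         26 => Some(5),     // a 4-byte argument follows
//!         27 => Some(9),     // an 8-byte argument follows
//!         _ => None,         // 28..=30 are reserved, 31 is not valid for integers
//!     }
//! }
//! ```
//!
//! Given a thin item whose first byte is such an initial byte, the returned length bounds the
//! reads that are sound through the pointer.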
//!
//! ## Examples
//!
//! ```
//! use cboritem::{CborItem, ThinCborItem};
//! // 0x18: major type 0 (unsigned integer) with a one-byte argument following; 0x64 is 100.
//! let onehundred = [0x18, 0x64];
//! assert_eq!(onehundred[0], 0x18);
//! let onehundred = unsafe { CborItem::new(&onehundred) };
//! let onehundred: ThinCborItem<'_> = onehundred.as_thin();
//! assert_eq!(core::mem::size_of_val(&onehundred), core::mem::size_of::<&u8>());
//! // One byte can always be read, so we don't need a CBOR parser to tell us it is safe
//! if onehundred.first() == 0x18 {
//!     assert_eq!(unsafe { onehundred.offset(1).read() }, 100);
//! } else {
//!     panic!("Unexpected type or integer size");
//! }
//! ```
//!
//! ## Future development
//!
//! Later versions may add types or variants of the current types (by means of associated types
//! with a default), e.g. to describe additional constraints such as
//!
//! * Embedded strings are UTF-8
//! * No duplicate keys are present
//! * Adheres to the Common Deterministic Encoding
//! * Contains no indefinite-length items
//! * The CBOR item conforms to some particular CDDL structure
//!
//! If any extensions are made that change CBOR's well-formedness rules (e.g. additional
//! information value 28 being defined for 128-bit integer arguments), this crate would go through
//! a major release to support them.
//!
//! [plan]: https://internals.rust-lang.org/t/pre-rfc-make-cstr-a-thin-pointer/6258
//! [packed CBOR]: https://datatracker.ietf.org/doc/draft-ietf-cbor-packed/
//! [RFC8949]: https://datatracker.ietf.org/doc/html/rfc8949
#![doc = concat!("## ", "Feature flags")]
#![doc = document_features::document_features!()]
#![no_std]

/// A transparent newtype around a `&[u8]` with the invariant that the slice contains one encoded CBOR item
///
/// Its main use is through dereferencing it into a slice.
///
/// # Invariants
///
/// The slice contains exactly one well-formed CBOR item.
#[derive(PartialEq, Eq)]
#[repr(transparent)]
pub struct CborItem<'a>(&'a [u8]);

#[cfg(not(feature = "debug-edn"))]
impl core::fmt::Debug for CborItem<'_> {
    fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
        let mut iter = self.iter();
        if let Some(i) = iter.next() {
            write!(f, "{:02x}", i)?;
            for i in iter {
                write!(f, " {:02x}", i)?;
            }
        }
        Ok(())
    }
}

#[cfg(feature = "debug-edn")]
impl core::fmt::Debug for CborItem<'_> {
    fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
        let loaded = cbor_edn::Item::from_cbor(self.0).expect("Type invariant requires valid CBOR");
        // Here we could do arbitrarily fancy application literal or inner CBOR recognition.
        let serialized = loaded.serialize();
        f.write_str(&serialized)
    }
}

#[cfg(feature = "defmt")]
impl defmt::Format for CborItem<'_> {
    fn format(&self, fmt: defmt::Formatter) {
        defmt::write!(fmt, "{=[u8]:cbor}", self.0);
    }
}

impl<'a> CborItem<'a> {
    /// Annotate a slice to contain a CBOR item
    ///
    /// # Safety
    ///
    /// `val` must contain exactly one well-formed CBOR item.
    pub const unsafe fn new(val: &'a [u8]) -> Self {
        Self(val)
    }

    /// Discard the length information, returning a slimmer type.
    pub fn as_thin(&self) -> ThinCborItem<'a> {
        ThinCborItem {
            pointer: self.0.as_ptr(),
            _phantom: Default::default(),
        }
    }
}

impl AsRef<[u8]> for CborItem<'_> {
    fn as_ref(&self) -> &[u8] {
        // AsRef documentation: "Types that implement Deref should consider implementing AsRef<T>
        // as follows:" (they also have a `.as_ref()` in there, but that'd trigger clippy as it is
        // not needed in this concrete case)
        use core::ops::Deref;
        self.deref()
    }
}

// FIXME: This should be Target = [u8]. (The &self takes care not to
// produce slices that outlive self, and the produced type should also have the right variance, and
// this may even fix the missing `.as_ref()` in AsRef).
impl<'a> core::ops::Deref for CborItem<'a> {
    type Target = &'a [u8];

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

/// A thin reference to self-delimited memory that contains a single encoded CBOR item
///
/// # Invariants
///
/// The pointer is readable for the lifetime of the type. There is a single CBOR item behind the
/// pointer, which may be read through it. The position after the CBOR item can be calculated, but
/// reading it may not be allowed.
pub struct ThinCborItem<'a> {
    pointer: *const u8,
    _phantom: core::marker::PhantomData<&'a [u8]>,
}

impl<'a> ThinCborItem<'a> {
    /// Annotate a pointer to contain a CBOR item
    ///
    /// # Safety
    ///
    /// The data behind `pointer` must contain a well-formed CBOR item. It must be readable through
    /// the pointer up to the end of the CBOR item for `'a`.
    pub const unsafe fn new(pointer: *const u8) -> Self {
        Self {
            pointer,
            _phantom: core::marker::PhantomData,
        }
    }

    /// Return the ThinCborItem as a pointer
    ///
    /// This is primarily useful when used with any form of pointer compression or tagging;
    /// the pointer can safely be used with [`Self::new()`] later as long as the lifetimes match.
    pub fn as_ptr(&self) -> *const u8 {
        self.pointer
    }

    /// Access the first byte of the item.
    ///
    /// This is always possible without checks; later accesses are unsafe, and only sound if they
    /// are justified by the content of earlier bytes.
    pub fn first(&self) -> u8 {
        // UNSAFE: By invariant and definition of CBOR
        unsafe { self.pointer.read() }
    }

    /// Restore the pointer into a slice with length
    ///
    /// # Safety
    ///
    /// `length` must be the exact number of bytes that constitute the one well-formed CBOR item
    /// that is present behind self's pointer by the type's invariant.
    pub unsafe fn as_fat(&self, length: usize) -> CborItem<'a> {
        CborItem(core::slice::from_raw_parts(self.pointer, length))
    }
}

impl core::ops::Deref for ThinCborItem<'_> {
    type Target = *const u8;

    fn deref(&self) -> &Self::Target {
        &self.pointer
    }
}

#[cfg(test)]
mod tests {
    // This module is almost empty: All relevant tests (construct an item, read from it, turn it
    // into a thin item, read from there and check that it is thin) are already covered in the few
    // lines of example code in the documentation.

    use super::*;

    // This also tests const constructability
    static NONDETERMINISTIC_U64: CborItem<'static> =
        unsafe { CborItem::new(&[0x1b, 0, 0, 0, 0, 0, 0, 0, 0]) };

    #[test]
    fn as_fat() {
        let thin = NONDETERMINISTIC_U64.as_thin();
        let thin_ptr = thin.as_ptr();
        // UNSAFE: We've obtained thin_ptr from thin, and the ThinCborItem we obtained it from was
        // 'static.
        let thin = unsafe { ThinCborItem::new(thin_ptr) };
        // UNSAFE: Prior knowledge from above
        let fat: &[u8] = *unsafe { thin.as_fat(9) };
        assert_eq!(fat, *NONDETERMINISTIC_U64);
    }

    #[test]
    fn format() {
        extern crate std;
        extern crate alloc;
        assert_eq!(
            std::format!("{:?}", NONDETERMINISTIC_U64),
            alloc::string::String::from(if cfg!(feature = "debug-edn") {
                "0_3"
            } else {
                "1b 00 00 00 00 00 00 00 00"
            })
        );
    }
}