hugr_model/v0/
mod.rs

1//! Version 0 (unstable).
2//!
3//! **Warning**: This module is still under development and is expected to change.
4//! It is included in the library to allow for early experimentation, and for
5//! the core and model to converge incrementally.
6//!
7//! This module defines representations of the hugr IR as plain data, designed
8//! to be as independent of implementation details as feasible. It can be used
9//! by the core compiler, alternative implementations or tooling that does not
10//! need the power/complexity of the full compiler. We provide the following
11//! in-memory representations:
12//!
13//! - [Table]: Efficient intermediate data structure to facilitate conversions.
14//! - [AST]: Abstract syntax tree that uses direct references rather than table indices.
15//!
16//! The table and AST format are interconvertible and can be serialised to
17//! a binary and text format, respectively:
18//!
19//! - [Binary]: Binary serialisation format optimised for performance and size.
20//! - [Text]: Human readable s-expression based text format.
21//!
22//! # Logical Format
23//!
24//! The hugr IR is a hierarchical graph data structure. __Nodes__ represent both
25//! __instructions__ that manipulate runtime values and __symbols__ which
26//! represent named language objects. Instructions have __input__ and __output__ ports
27//! and runtime values flow between ports when they are connected by a __link__.
28//!
29//! Nodes are organised into __regions__ and do not have any explicit ordering
30//! between them; any schedule that respects the data dependencies between nodes
31//! is valid. Regions come in three different kinds. __Module regions__ form the
32//! top level of a module and can only contain symbols. __Dataflow regions__
33//! describe how data flows from the region's __source__ ports to the region's
34//! __target__ ports. __Controlflow regions__ are control flow graphs containing
35//! dataflow __blocks__, with control flow originating from the region's source
36//! ports and ending in the region's target ports.
37//!
38//! __Terms__ form a meta language that is used to describe types, parameters and metadata that
39//! are known statically. To allow types to be parameterized by values, types and values
40//! are treated uniformly as terms, enabling a restricted form of dependent typing.
41//! Terms are extensible declaratively via __constructors__.
42//! __Constraints__ can be used to express more complex validation rules.
43//!
44//! # Remaining Mismatch with `hugr-core`
45//!
46//! This data model was designed to encode as much of `hugr-core` as possible while also
47//! filling in conceptual gaps and providing a forward-compatible foundation for future
48//! development. However, there are still some mismatches with `hugr-core` that are not
49//! addressed by conversions in import/export:
50//!
51//! - Some static types can not yet be represented in `hugr-core` although they should be.
52//! - The model does not have types with a copy bound as `hugr-core` does, and instead uses
53//!   a more general form of type constraints ([#1556]). Similarly, the model does not have
54//!   bounded naturals. We perform a conversion for compatibility where possible, but this does
55//!   not fully cover all potential cases of bounds.
56//! - `hugr-model` allows to declare term constructors that serve as blueprints for constructing
57//!   runtime values. This allows constants to have potentially multiple representations,
58//!   which can be essential in case of very large constants that require efficient encodings.
59//!   `hugr-core` is more restricted, requiring a canonical representation for constant values.
60//! - `hugr-model` has support for passing closed regions as static parameters to operations,
61//!   which allows for higher-order operations that require their function arguments to be
62//!   statically known. We currently do not yet support converting this to `hugr-core`.
63//! - In a model module, ports are connected when they share the same link. This differs from
64//!   the type of port connectivity in the graph data structure used by `hugr-core`. However,
65//!   `hugr-core` restricts connectivity so that in any group of connected ports there is at
66//!   most one output port (for dataflow) or at most one input port (for control flow). In
67//!   these cases, there is no mismatch.
68//! - `hugr-core` only allows to define type aliases, but not aliases for other terms.
69//! - `hugr-model` has no concept of order edges, encoding a strong preference that ordering
70//!   requirements be encoded within the dataflow paradigm.
71//! - Both `hugr-model` and `hugr-core` support metadata, but they use different encodings.
72//!   `hugr-core` encodes metadata as JSON objects, while `hugr-model` uses terms. Using
73//!   terms has the advantage that metadata can be validated with the same type checking
74//!   mechanism as the rest of the model ([#1553]).
75//! - `hugr-model` have a root region that corresponds to a root `Module` in `hugr-core`.
76//!   `hugr-core` however can have nodes with different operations as their root ([#1554]).
77//!
78//! [#1556]: https://github.com/CQCL/hugr/discussions/1556
79//! [#1553]: https://github.com/CQCL/hugr/issues/1553
80//! [#1554]: https://github.com/CQCL/hugr/issues/1554
81//! [Text]: crate::v0::ast
82//! [Binary]: crate::v0::binary
83//! [Table]: crate::v0::table
84//! [AST]: crate::v0::ast
85use ordered_float::OrderedFloat;
86#[cfg(feature = "pyo3")]
87use pyo3::types::PyAnyMethods as _;
88#[cfg(feature = "pyo3")]
89use pyo3::PyTypeInfo as _;
90use smol_str::SmolStr;
91use std::sync::Arc;
92use table::LinkIndex;
93
94/// Core function types.
95///
96/// - **Parameter:** `?inputs : (core.list core.type)`
97/// - **Parameter:** `?outputs : (core.list core.type)`
98/// - **Result:** `core.type`
99pub const CORE_FN: &str = "core.fn";
100
101/// The type of runtime types.
102///
103/// Runtime types are the types of values that can flow between nodes at runtime.
104///
105/// - **Result:** `?type : core.static`
106pub const CORE_TYPE: &str = "core.type";
107
108/// The type of static types.
109///
110/// Static types are the types of statically known parameters.
111///
112/// This is the only term that is its own type.
113///
114/// - **Result:** `?type : core.static`
115pub const CORE_STATIC: &str = "core.static";
116
117/// The type of constraints.
118///
119/// - **Result:** `?type : core.static`
120pub const CORE_CONSTRAINT: &str = "core.constraint";
121
122/// The constraint for non-linear runtime data.
123///
124/// Runtime values are copied implicitly by connecting an output port to more
125/// than one input port. Similarly runtime values can be deleted implicitly when
126/// an output port is not connected to any input port. In either of these cases
127/// the type of the runtime value must satisfy this constraint.
128///
129/// - **Parameter:** `?type : core.type`
130/// - **Result:** `core.constraint`
131pub const CORE_NON_LINEAR: &str = "core.nonlinear";
132
133/// The type of metadata.
134///
135/// - **Result:** `?type : core.static`
136pub const CORE_META: &str = "core.meta";
137
138/// Runtime algebraic data types.
139///
140/// Algebraic data types are sums of products of other runtime types.
141///
142/// - **Parameter:** `?variants : (core.list (core.list core.type))`
143/// - **Result:** `core.type`
144pub const CORE_ADT: &str = "core.adt";
145
146/// Type of string literals.
147///
148/// - **Result:** `core.static`
149pub const CORE_STR_TYPE: &str = "core.str";
150
151/// Type of natural number literals.
152///
153/// - **Result:** `core.static`
154pub const CORE_NAT_TYPE: &str = "core.nat";
155
156/// Type of bytes literals.
157///
158/// - **Result:** `core.static`
159pub const CORE_BYTES_TYPE: &str = "core.bytes";
160
161/// Type of float literals.
162///
163/// - **Result:** `core.static`
164pub const CORE_FLOAT_TYPE: &str = "core.float";
165
166/// Type of a control flow edge.
167///
168/// - **Parameter:** `?types : (core.list core.type)`
169/// - **Result:** `core.ctrl_type`
170pub const CORE_CTRL: &str = "core.ctrl";
171
172/// The type of the types for control flow edges.
173///
174/// - **Result:** `?type : core.static`
175pub const CORE_CTRL_TYPE: &str = "core.ctrl_type";
176
177/// The type for runtime constants.
178///
179/// - **Parameter:** `?type : core.type`
180/// - **Result:** `core.static`
181pub const CORE_CONST: &str = "core.const";
182
183/// Constants for runtime algebraic data types.
184///
185/// - **Parameter:** `?variants : (core.list core.type)`
186/// - **Parameter:** `?types : (core.list core.static)`
187/// - **Parameter:** `?tag : core.nat`
188/// - **Parameter:** `?values : (core.tuple ?types)`
189/// - **Result:** `(core.const (core.adt ?variants))`
190pub const CORE_CONST_ADT: &str = "core.const.adt";
191
192/// The type for lists of static data.
193///
194/// Lists are finite sequences such that all elements have the same type.
195/// For heterogeneous sequences, see [`CORE_TUPLE_TYPE`].
196///
197/// - **Parameter:** `?type : core.static`
198/// - **Result:** `core.static`
199pub const CORE_LIST_TYPE: &str = "core.list";
200
201/// The type for tuples of static data.
202///
203/// Tuples are finite sequences that allow elements to have different types.
204/// For homogeneous sequences, see [`CORE_LIST_TYPE`].
205///
206/// - **Parameter:** `?types : (core.list core.static)`
207/// - **Result:** `core.static`
208pub const CORE_TUPLE_TYPE: &str = "core.tuple";
209
210/// Operation to call a statically known function.
211///
212/// - **Parameter:** `?inputs : (core.list core.type)`
213/// - **Parameter:** `?outputs : (core.list core.type)`
214/// - **Parameter:** `?func : (core.const (core.fn ?inputs ?outputs))`
215/// - **Result:** `(core.fn ?inputs ?outputs ?ext)`
216pub const CORE_CALL: &str = "core.call";
217
218/// Operation to call a functiion known at runtime.
219///
220/// - **Parameter:** `?inputs : (core.list core.type)`
221/// - **Parameter:** `?outputs : (core.list core.type)`
222/// - **Result:** `(core.fn [(core.fn ?inputs ?outputs) ?inputs ...] ?outputs)`
223pub const CORE_CALL_INDIRECT: &str = "core.call_indirect";
224
225/// Operation to load a constant value.
226///
227/// - **Parameter:** `?type : core.type`
228/// - **Parameter:** `?value : (core.const ?type)`
229/// - **Result:** `(core.fn [] [?type])`
230pub const CORE_LOAD_CONST: &str = "core.load_const";
231
232/// Operation to create a value of an algebraic data type.
233///
234/// - **Parameter:** `?variants : (core.list (core.list core.type))`
235/// - **Parameter:** `?types : (core.list core.type)`
236/// - **Parameter:** `?tag : core.nat`
237/// - **Result:** `(core.fn ?types [(core.adt ?variants)])`
238pub const CORE_MAKE_ADT: &str = "core.make_adt";
239
240/// Constructor for documentation metadata.
241///
242/// - **Parameter:** `?description : core.str`
243/// - **Result:** `core.meta`
244pub const CORE_META_DESCRIPTION: &str = "core.meta.description";
245
246/// Constructor for JSON encoded metadata.
247///
248/// This is included in the model to allow for compatibility with `hugr-core`.
249/// The intention is to deprecate this in the future in favor of metadata
250/// expressed with custom constructors.
251///
252/// - **Parameter:** `?name : core.str`
253/// - **Parameter:** `?json : core.str`
254/// - **Result:** `core.meta`
255pub const COMPAT_META_JSON: &str = "compat.meta_json";
256
257/// Constructor for JSON encoded constants.
258///
259/// This is included in the model to allow for compatibility with `hugr-core`.
260/// The intention is to deprecate this in the future in favor of constants
261/// expressed with custom constructors.
262///
263/// - **Parameter:** `?type : core.type`
264/// - **Parameter:** `?json : core.str`
265/// - **Result:** `(core.const ?type)`
266pub const COMPAT_CONST_JSON: &str = "compat.const_json";
267
268/// Metadata constructor for order hint keys.
269///
270/// Nodes in a dataflow region can be annotated with a key. Each node may have
271/// at most one key and the key must be unique among all nodes in the same
272/// dataflow region. The parent dataflow graph can then use the
273/// `order_hint.order` metadata to imply a desired ordering relation, referring
274/// to the nodes by their key.
275///
276/// - **Parameter:** `?key : core.nat`
277/// - **Result:** `core.meta`
278pub const ORDER_HINT_KEY: &str = "core.order_hint.key";
279
280/// Metadata constructor for order hints.
281///
282/// When this metadata is attached to a dataflow region, it can indicate a
283/// preferred ordering relation between child nodes. Code generation must take
284/// this into account when deciding on an execution order. The child nodes are
285/// identified by a key, using the `order_hint.key` metadata.
286///
287/// The graph consisting of both value dependencies between nodes and order
288/// hints must be directed acyclic.
289///
290/// - **Parameter:** `?before : core.nat`
291/// - **Parameter:** `?after : core.nat`
292/// - **Result:** `core.meta`
293pub const ORDER_HINT_ORDER: &str = "core.order_hint.order";
294
295pub mod ast;
296pub mod binary;
297pub mod scope;
298pub mod table;
299
300pub use bumpalo;
301
302/// Type to indicate whether scopes are open or closed.
303#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
304pub enum ScopeClosure {
305    /// A scope that is open and therefore not isolated from its parent scope.
306    #[default]
307    Open,
308    /// A scope that is closed and therefore isolated from its parent scope.
309    Closed,
310}
311
312#[cfg(feature = "pyo3")]
313impl<'py> pyo3::FromPyObject<'py> for ScopeClosure {
314    fn extract_bound(ob: &pyo3::Bound<'py, pyo3::PyAny>) -> pyo3::PyResult<Self> {
315        let value: usize = ob.getattr("value")?.extract()?;
316        match value {
317            0 => Ok(Self::Open),
318            1 => Ok(Self::Closed),
319            _ => Err(pyo3::exceptions::PyTypeError::new_err(
320                "Invalid ScopeClosure.",
321            )),
322        }
323    }
324}
325
326#[cfg(feature = "pyo3")]
327impl<'py> pyo3::IntoPyObject<'py> for ScopeClosure {
328    type Target = pyo3::PyAny;
329    type Output = pyo3::Bound<'py, Self::Target>;
330    type Error = pyo3::PyErr;
331
332    fn into_pyobject(self, py: pyo3::Python<'py>) -> Result<Self::Output, Self::Error> {
333        let py_module = py.import("hugr.model")?;
334        let py_class = py_module.getattr("ScopeClosure")?;
335
336        match self {
337            ScopeClosure::Open => py_class.getattr("OPEN"),
338            ScopeClosure::Closed => py_class.getattr("CLOSED"),
339        }
340    }
341}
342
343/// The kind of a region.
344#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
345pub enum RegionKind {
346    /// Data flow region.
347    #[default]
348    DataFlow = 0,
349    /// Control flow region.
350    ControlFlow = 1,
351    /// Module region.
352    Module = 2,
353}
354
355#[cfg(feature = "pyo3")]
356impl<'py> pyo3::FromPyObject<'py> for RegionKind {
357    fn extract_bound(ob: &pyo3::Bound<'py, pyo3::PyAny>) -> pyo3::PyResult<Self> {
358        let value: usize = ob.getattr("value")?.extract()?;
359        match value {
360            0 => Ok(Self::DataFlow),
361            1 => Ok(Self::ControlFlow),
362            2 => Ok(Self::Module),
363            _ => Err(pyo3::exceptions::PyTypeError::new_err(
364                "Invalid RegionKind.",
365            )),
366        }
367    }
368}
369
370#[cfg(feature = "pyo3")]
371impl<'py> pyo3::IntoPyObject<'py> for RegionKind {
372    type Target = pyo3::PyAny;
373    type Output = pyo3::Bound<'py, Self::Target>;
374    type Error = pyo3::PyErr;
375
376    fn into_pyobject(self, py: pyo3::Python<'py>) -> Result<Self::Output, Self::Error> {
377        let py_module = py.import("hugr.model")?;
378        let py_class = py_module.getattr("RegionKind")?;
379
380        match self {
381            RegionKind::DataFlow => py_class.getattr("DATA_FLOW"),
382            RegionKind::ControlFlow => py_class.getattr("CONTROL_FLOW"),
383            RegionKind::Module => py_class.getattr("MODULE"),
384        }
385    }
386}
387
388/// The name of a variable.
389#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
390pub struct VarName(SmolStr);
391
392impl VarName {
393    /// Create a new variable name.
394    pub fn new(name: impl Into<SmolStr>) -> Self {
395        Self(name.into())
396    }
397}
398
399impl AsRef<str> for VarName {
400    fn as_ref(&self) -> &str {
401        self.0.as_ref()
402    }
403}
404
405#[cfg(feature = "pyo3")]
406impl<'py> pyo3::FromPyObject<'py> for VarName {
407    fn extract_bound(ob: &pyo3::Bound<'py, pyo3::PyAny>) -> pyo3::PyResult<Self> {
408        let name: String = ob.extract()?;
409        Ok(Self::new(name))
410    }
411}
412
413#[cfg(feature = "pyo3")]
414impl<'py> pyo3::IntoPyObject<'py> for &VarName {
415    type Target = pyo3::types::PyString;
416    type Output = pyo3::Bound<'py, Self::Target>;
417    type Error = pyo3::PyErr;
418
419    fn into_pyobject(self, py: pyo3::Python<'py>) -> Result<Self::Output, Self::Error> {
420        Ok(self.as_ref().into_pyobject(py)?)
421    }
422}
423
424/// The name of a symbol.
425#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
426pub struct SymbolName(SmolStr);
427
428impl SymbolName {
429    /// Create a new symbol name.
430    pub fn new(name: impl Into<SmolStr>) -> Self {
431        Self(name.into())
432    }
433}
434
435impl AsRef<str> for SymbolName {
436    fn as_ref(&self) -> &str {
437        self.0.as_ref()
438    }
439}
440
441#[cfg(feature = "pyo3")]
442impl<'py> pyo3::FromPyObject<'py> for SymbolName {
443    fn extract_bound(ob: &pyo3::Bound<'py, pyo3::PyAny>) -> pyo3::PyResult<Self> {
444        let name: String = ob.extract()?;
445        Ok(Self::new(name))
446    }
447}
448
449/// The name of a link.
450#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
451pub struct LinkName(SmolStr);
452
453impl LinkName {
454    /// Create a new link name.
455    pub fn new(name: impl Into<SmolStr>) -> Self {
456        Self(name.into())
457    }
458
459    /// Create a new link name from a link index.
460    pub fn new_index(index: LinkIndex) -> Self {
461        // TODO: Should named and numbered links have different namespaces?
462        Self(format!("{}", index).into())
463    }
464}
465
466impl AsRef<str> for LinkName {
467    fn as_ref(&self) -> &str {
468        self.0.as_ref()
469    }
470}
471
472#[cfg(feature = "pyo3")]
473impl<'py> pyo3::FromPyObject<'py> for LinkName {
474    fn extract_bound(ob: &pyo3::Bound<'py, pyo3::PyAny>) -> pyo3::PyResult<Self> {
475        let name: String = ob.extract()?;
476        Ok(Self::new(name))
477    }
478}
479
480#[cfg(feature = "pyo3")]
481impl<'py> pyo3::IntoPyObject<'py> for &LinkName {
482    type Target = pyo3::types::PyString;
483    type Output = pyo3::Bound<'py, Self::Target>;
484    type Error = pyo3::PyErr;
485
486    fn into_pyobject(self, py: pyo3::Python<'py>) -> Result<Self::Output, Self::Error> {
487        Ok(self.as_ref().into_pyobject(py)?)
488    }
489}
490
491/// A static literal value.
492///
493/// Literal values may be large since they can include strings and byte
494/// sequences of arbitrary length. To enable cheap cloning and sharing,
495/// strings and byte sequences use reference counting.
496#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
497pub enum Literal {
498    /// String literal.
499    Str(SmolStr),
500    /// Natural number literal (unsigned 64 bit).
501    Nat(u64),
502    /// Byte sequence literal.
503    Bytes(Arc<[u8]>),
504    /// Floating point literal
505    Float(OrderedFloat<f64>),
506}
507
508#[cfg(feature = "pyo3")]
509impl<'py> pyo3::FromPyObject<'py> for Literal {
510    fn extract_bound(ob: &pyo3::Bound<'py, pyo3::PyAny>) -> pyo3::PyResult<Self> {
511        if pyo3::types::PyString::is_type_of(ob) {
512            let value: String = ob.extract()?;
513            Ok(Literal::Str(value.into()))
514        } else if pyo3::types::PyInt::is_type_of(ob) {
515            let value: u64 = ob.extract()?;
516            Ok(Literal::Nat(value))
517        } else if pyo3::types::PyFloat::is_type_of(ob) {
518            let value: f64 = ob.extract()?;
519            Ok(Literal::Float(value.into()))
520        } else if pyo3::types::PyBytes::is_type_of(ob) {
521            let value: Vec<u8> = ob.extract()?;
522            Ok(Literal::Bytes(value.into()))
523        } else {
524            Err(pyo3::exceptions::PyTypeError::new_err(
525                "Invalid literal value.",
526            ))
527        }
528    }
529}
530
531#[cfg(feature = "pyo3")]
532impl<'py> pyo3::IntoPyObject<'py> for &Literal {
533    type Target = pyo3::PyAny;
534    type Output = pyo3::Bound<'py, Self::Target>;
535    type Error = pyo3::PyErr;
536
537    fn into_pyobject(self, py: pyo3::Python<'py>) -> Result<Self::Output, Self::Error> {
538        Ok(match self {
539            Literal::Str(s) => s.as_str().into_pyobject(py)?.into_any(),
540            Literal::Nat(n) => n.into_pyobject(py)?.into_any(),
541            Literal::Bytes(b) => pyo3::types::PyBytes::new(py, b)
542                .into_pyobject(py)?
543                .into_any(),
544            Literal::Float(f) => f.0.into_pyobject(py)?.into_any(),
545        })
546    }
547}
548
549#[cfg(test)]
550mod test {
551    use super::*;
552    use proptest::{prelude::*, string::string_regex};
553
554    impl Arbitrary for Literal {
555        type Parameters = ();
556        type Strategy = BoxedStrategy<Self>;
557
558        fn arbitrary_with(_args: Self::Parameters) -> Self::Strategy {
559            prop_oneof![
560                any::<String>().prop_map(|s| Literal::Str(s.into())),
561                any::<u64>().prop_map(Literal::Nat),
562                prop::collection::vec(any::<u8>(), 0..100).prop_map(|v| Literal::Bytes(v.into())),
563                any::<f64>().prop_map(|f| Literal::Float(OrderedFloat(f)))
564            ]
565            .boxed()
566        }
567    }
568
569    impl Arbitrary for SymbolName {
570        type Parameters = ();
571        type Strategy = BoxedStrategy<Self>;
572
573        fn arbitrary_with(_args: Self::Parameters) -> Self::Strategy {
574            string_regex(r"[a-zA-Z\-_][0-9a-zA-Z\-_](\.[a-zA-Z\-_][0-9a-zA-Z\-_])*")
575                .unwrap()
576                .prop_map(Self::new)
577                .boxed()
578        }
579    }
580
581    impl Arbitrary for VarName {
582        type Parameters = ();
583        type Strategy = BoxedStrategy<Self>;
584
585        fn arbitrary_with(_args: Self::Parameters) -> Self::Strategy {
586            string_regex(r"[a-zA-Z\-_][0-9a-zA-Z\-_]")
587                .unwrap()
588                .prop_map(Self::new)
589                .boxed()
590        }
591    }
592
593    proptest! {
594        #[test]
595        fn test_literal_text(lit: Literal) {
596            let text = lit.to_string();
597            let parsed: Literal = text.parse().unwrap();
598            assert_eq!(lit, parsed);
599        }
600    }
601}