hugr_model/v0/mod.rs
1//! Version 0 (unstable).
2//!
3//! **Warning**: This module is still under development and is expected to change.
4//! It is included in the library to allow for early experimentation, and for
5//! the core and model to converge incrementally.
6//!
7//! This module defines representations of the hugr IR as plain data, designed
8//! to be as independent of implementation details as feasible. It can be used
9//! by the core compiler, alternative implementations or tooling that does not
10//! need the power/complexity of the full compiler. We provide the following
11//! in-memory representations:
12//!
13//! - [Table]: Efficient intermediate data structure to facilitate conversions.
14//! - [AST]: Abstract syntax tree that uses direct references rather than table indices.
15//!
16//! The table and AST format are interconvertible and can be serialised to
17//! a binary and text format, respectively:
18//!
19//! - [Binary]: Binary serialisation format optimised for performance and size.
20//! - [Text]: Human readable s-expression based text format.
21//!
22//! # Logical Format
23//!
24//! The hugr IR is a hierarchical graph data structure. __Nodes__ represent both
25//! __instructions__ that manipulate runtime values and __symbols__ which
26//! represent named language objects. Instructions have __input__ and __output__ ports
27//! and runtime values flow between ports when they are connected by a __link__.
28//!
29//! Nodes are organised into __regions__ and do not have any explicit ordering
30//! between them; any schedule that respects the data dependencies between nodes
31//! is valid. Regions come in three different kinds. __Module regions__ form the
32//! top level of a module and can only contain symbols. __Dataflow regions__
33//! describe how data flows from the region's __source__ ports to the region's
34//! __target__ ports. __Controlflow regions__ are control flow graphs containing
35//! dataflow __blocks__, with control flow originating from the region's source
36//! ports and ending in the region's target ports.
37//!
38//! __Terms__ form a meta language that is used to describe types, parameters and metadata that
39//! are known statically. To allow types to be parameterized by values, types and values
40//! are treated uniformly as terms, enabling a restricted form of dependent typing.
41//! Terms are extensible declaratively via __constructors__.
42//! __Constraints__ can be used to express more complex validation rules.
43//!
44//! # Remaining Mismatch with `hugr-core`
45//!
46//! This data model was designed to encode as much of `hugr-core` as possible while also
47//! filling in conceptual gaps and providing a forward-compatible foundation for future
48//! development. However, there are still some mismatches with `hugr-core` that are not
49//! addressed by conversions in import/export:
50//!
51//! - Some static types can not yet be represented in `hugr-core` although they should be.
52//! - The model does not have types with a copy bound as `hugr-core` does, and instead uses
53//! a more general form of type constraints ([#1556]). Similarly, the model does not have
54//! bounded naturals. We perform a conversion for compatibility where possible, but this does
55//! not fully cover all potential cases of bounds.
56//! - `hugr-model` allows to declare term constructors that serve as blueprints for constructing
57//! runtime values. This allows constants to have potentially multiple representations,
58//! which can be essential in case of very large constants that require efficient encodings.
59//! `hugr-core` is more restricted, requiring a canonical representation for constant values.
60//! - `hugr-model` has support for passing closed regions as static parameters to operations,
61//! which allows for higher-order operations that require their function arguments to be
62//! statically known. We currently do not yet support converting this to `hugr-core`.
63//! - In a model module, ports are connected when they share the same link. This differs from
64//! the type of port connectivity in the graph data structure used by `hugr-core`. However,
65//! `hugr-core` restricts connectivity so that in any group of connected ports there is at
66//! most one output port (for dataflow) or at most one input port (for control flow). In
67//! these cases, there is no mismatch.
68//! - `hugr-core` only allows to define type aliases, but not aliases for other terms.
69//! - `hugr-model` has no concept of order edges, encoding a strong preference that ordering
70//! requirements be encoded within the dataflow paradigm.
71//! - Both `hugr-model` and `hugr-core` support metadata, but they use different encodings.
72//! `hugr-core` encodes metadata as JSON objects, while `hugr-model` uses terms. Using
73//! terms has the advantage that metadata can be validated with the same type checking
74//! mechanism as the rest of the model ([#1553]).
75//! - `hugr-model` have a root region that corresponds to a root `Module` in `hugr-core`.
76//! `hugr-core` however can have nodes with different operations as their root ([#1554]).
77//!
78//! [#1556]: https://github.com/CQCL/hugr/discussions/1556
79//! [#1553]: https://github.com/CQCL/hugr/issues/1553
80//! [#1554]: https://github.com/CQCL/hugr/issues/1554
81//! [Text]: crate::v0::ast
82//! [Binary]: crate::v0::binary
83//! [Table]: crate::v0::table
84//! [AST]: crate::v0::ast
85use ordered_float::OrderedFloat;
86#[cfg(feature = "pyo3")]
87use pyo3::PyTypeInfo as _;
88#[cfg(feature = "pyo3")]
89use pyo3::types::PyAnyMethods as _;
90use smol_str::SmolStr;
91use std::sync::Arc;
92use table::LinkIndex;
93
94/// Core function types.
95///
96/// - **Parameter:** `?inputs : (core.list core.type)`
97/// - **Parameter:** `?outputs : (core.list core.type)`
98/// - **Result:** `core.type`
99pub const CORE_FN: &str = "core.fn";
100
101/// The type of runtime types.
102///
103/// Runtime types are the types of values that can flow between nodes at runtime.
104///
105/// - **Result:** `?type : core.static`
106pub const CORE_TYPE: &str = "core.type";
107
108/// The type of static types.
109///
110/// Static types are the types of statically known parameters.
111///
112/// This is the only term that is its own type.
113///
114/// - **Result:** `?type : core.static`
115pub const CORE_STATIC: &str = "core.static";
116
117/// The type of constraints.
118///
119/// - **Result:** `?type : core.static`
120pub const CORE_CONSTRAINT: &str = "core.constraint";
121
122/// The constraint for non-linear runtime data.
123///
124/// Runtime values are copied implicitly by connecting an output port to more
125/// than one input port. Similarly runtime values can be deleted implicitly when
126/// an output port is not connected to any input port. In either of these cases
127/// the type of the runtime value must satisfy this constraint.
128///
129/// - **Parameter:** `?type : core.type`
130/// - **Result:** `core.constraint`
131pub const CORE_NON_LINEAR: &str = "core.nonlinear";
132
133/// The type of metadata.
134///
135/// - **Result:** `?type : core.static`
136pub const CORE_META: &str = "core.meta";
137
138/// Runtime algebraic data types.
139///
140/// Algebraic data types are sums of products of other runtime types.
141///
142/// - **Parameter:** `?variants : (core.list (core.list core.type))`
143/// - **Result:** `core.type`
144pub const CORE_ADT: &str = "core.adt";
145
146/// Type of string literals.
147///
148/// - **Result:** `core.static`
149pub const CORE_STR_TYPE: &str = "core.str";
150
151/// Type of natural number literals.
152///
153/// - **Result:** `core.static`
154pub const CORE_NAT_TYPE: &str = "core.nat";
155
156/// Type of bytes literals.
157///
158/// - **Result:** `core.static`
159pub const CORE_BYTES_TYPE: &str = "core.bytes";
160
161/// Type of float literals.
162///
163/// - **Result:** `core.static`
164pub const CORE_FLOAT_TYPE: &str = "core.float";
165
166/// Type of a control flow edge.
167///
168/// - **Parameter:** `?types : (core.list core.type)`
169/// - **Result:** `core.ctrl_type`
170pub const CORE_CTRL: &str = "core.ctrl";
171
172/// The type of the types for control flow edges.
173///
174/// - **Result:** `?type : core.static`
175pub const CORE_CTRL_TYPE: &str = "core.ctrl_type";
176
177/// The type for runtime constants.
178///
179/// - **Parameter:** `?type : core.type`
180/// - **Result:** `core.static`
181pub const CORE_CONST: &str = "core.const";
182
183/// Constants for runtime algebraic data types.
184///
185/// - **Parameter:** `?variants : (core.list core.type)`
186/// - **Parameter:** `?types : (core.list core.static)`
187/// - **Parameter:** `?tag : core.nat`
188/// - **Parameter:** `?values : (core.tuple ?types)`
189/// - **Result:** `(core.const (core.adt ?variants))`
190pub const CORE_CONST_ADT: &str = "core.const.adt";
191
192/// The type for lists of static data.
193///
194/// Lists are finite sequences such that all elements have the same type.
195/// For heterogeneous sequences, see [`CORE_TUPLE_TYPE`].
196///
197/// - **Parameter:** `?type : core.static`
198/// - **Result:** `core.static`
199pub const CORE_LIST_TYPE: &str = "core.list";
200
201/// The type for tuples of static data.
202///
203/// Tuples are finite sequences that allow elements to have different types.
204/// For homogeneous sequences, see [`CORE_LIST_TYPE`].
205///
206/// - **Parameter:** `?types : (core.list core.static)`
207/// - **Result:** `core.static`
208pub const CORE_TUPLE_TYPE: &str = "core.tuple";
209
210/// Operation to call a statically known function.
211///
212/// - **Parameter:** `?inputs : (core.list core.type)`
213/// - **Parameter:** `?outputs : (core.list core.type)`
214/// - **Parameter:** `?func : (core.const (core.fn ?inputs ?outputs))`
215/// - **Result:** `(core.fn ?inputs ?outputs ?ext)`
216pub const CORE_CALL: &str = "core.call";
217
218/// Operation to call a functiion known at runtime.
219///
220/// - **Parameter:** `?inputs : (core.list core.type)`
221/// - **Parameter:** `?outputs : (core.list core.type)`
222/// - **Result:** `(core.fn [(core.fn ?inputs ?outputs) ?inputs ...] ?outputs)`
223pub const CORE_CALL_INDIRECT: &str = "core.call_indirect";
224
225/// Operation to load a constant value.
226///
227/// - **Parameter:** `?type : core.type`
228/// - **Parameter:** `?value : (core.const ?type)`
229/// - **Result:** `(core.fn [] [?type])`
230pub const CORE_LOAD_CONST: &str = "core.load_const";
231
232/// Operation to create a value of an algebraic data type.
233///
234/// - **Parameter:** `?variants : (core.list (core.list core.type))`
235/// - **Parameter:** `?types : (core.list core.type)`
236/// - **Parameter:** `?tag : core.nat`
237/// - **Result:** `(core.fn ?types [(core.adt ?variants)])`
238pub const CORE_MAKE_ADT: &str = "core.make_adt";
239
240/// Constructor for documentation metadata.
241///
242/// - **Parameter:** `?description : core.str`
243/// - **Result:** `core.meta`
244pub const CORE_META_DESCRIPTION: &str = "core.meta.description";
245
246/// Metadata to tag a node or region as the entrypoint of a module.
247///
248/// - **Result:** `core.meta`
249pub const CORE_ENTRYPOINT: &str = "core.entrypoint";
250
251/// Constructor for JSON encoded metadata.
252///
253/// This is included in the model to allow for compatibility with `hugr-core`.
254/// The intention is to deprecate this in the future in favor of metadata
255/// expressed with custom constructors.
256///
257/// - **Parameter:** `?name : core.str`
258/// - **Parameter:** `?json : core.str`
259/// - **Result:** `core.meta`
260pub const COMPAT_META_JSON: &str = "compat.meta_json";
261
262/// Constructor for JSON encoded constants.
263///
264/// This is included in the model to allow for compatibility with `hugr-core`.
265/// The intention is to deprecate this in the future in favor of constants
266/// expressed with custom constructors.
267///
268/// - **Parameter:** `?type : core.type`
269/// - **Parameter:** `?json : core.str`
270/// - **Result:** `(core.const ?type)`
271pub const COMPAT_CONST_JSON: &str = "compat.const_json";
272
273/// Metadata constructor for order hint keys.
274///
275/// Nodes in a dataflow region can be annotated with a key. Each node may have
276/// at most one key and the key must be unique among all nodes in the same
277/// dataflow region. The parent dataflow graph can then use the
278/// `order_hint.order` metadata to imply a desired ordering relation, referring
279/// to the nodes by their key.
280///
281/// - **Parameter:** `?key : core.nat`
282/// - **Result:** `core.meta`
283pub const ORDER_HINT_KEY: &str = "core.order_hint.key";
284
285/// Metadata constructor for order hints.
286///
287/// When this metadata is attached to a dataflow region, it can indicate a
288/// preferred ordering relation between child nodes. Code generation must take
289/// this into account when deciding on an execution order. The child nodes are
290/// identified by a key, using the `order_hint.key` metadata.
291///
292/// The graph consisting of both value dependencies between nodes and order
293/// hints must be directed acyclic.
294///
295/// - **Parameter:** `?before : core.nat`
296/// - **Parameter:** `?after : core.nat`
297/// - **Result:** `core.meta`
298pub const ORDER_HINT_ORDER: &str = "core.order_hint.order";
299
300pub mod ast;
301pub mod binary;
302pub mod scope;
303pub mod table;
304
305pub use bumpalo;
306
307/// Type to indicate whether scopes are open or closed.
308#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
309pub enum ScopeClosure {
310 /// A scope that is open and therefore not isolated from its parent scope.
311 #[default]
312 Open,
313 /// A scope that is closed and therefore isolated from its parent scope.
314 Closed,
315}
316
317#[cfg(feature = "pyo3")]
318impl<'py> pyo3::FromPyObject<'py> for ScopeClosure {
319 fn extract_bound(ob: &pyo3::Bound<'py, pyo3::PyAny>) -> pyo3::PyResult<Self> {
320 let value: usize = ob.getattr("value")?.extract()?;
321 match value {
322 0 => Ok(Self::Open),
323 1 => Ok(Self::Closed),
324 _ => Err(pyo3::exceptions::PyTypeError::new_err(
325 "Invalid ScopeClosure.",
326 )),
327 }
328 }
329}
330
331#[cfg(feature = "pyo3")]
332impl<'py> pyo3::IntoPyObject<'py> for ScopeClosure {
333 type Target = pyo3::PyAny;
334 type Output = pyo3::Bound<'py, Self::Target>;
335 type Error = pyo3::PyErr;
336
337 fn into_pyobject(self, py: pyo3::Python<'py>) -> Result<Self::Output, Self::Error> {
338 let py_module = py.import("hugr.model")?;
339 let py_class = py_module.getattr("ScopeClosure")?;
340
341 match self {
342 ScopeClosure::Open => py_class.getattr("OPEN"),
343 ScopeClosure::Closed => py_class.getattr("CLOSED"),
344 }
345 }
346}
347
348/// The kind of a region.
349#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
350pub enum RegionKind {
351 /// Data flow region.
352 #[default]
353 DataFlow = 0,
354 /// Control flow region.
355 ControlFlow = 1,
356 /// Module region.
357 Module = 2,
358}
359
360#[cfg(feature = "pyo3")]
361impl<'py> pyo3::FromPyObject<'py> for RegionKind {
362 fn extract_bound(ob: &pyo3::Bound<'py, pyo3::PyAny>) -> pyo3::PyResult<Self> {
363 let value: usize = ob.getattr("value")?.extract()?;
364 match value {
365 0 => Ok(Self::DataFlow),
366 1 => Ok(Self::ControlFlow),
367 2 => Ok(Self::Module),
368 _ => Err(pyo3::exceptions::PyTypeError::new_err(
369 "Invalid RegionKind.",
370 )),
371 }
372 }
373}
374
375#[cfg(feature = "pyo3")]
376impl<'py> pyo3::IntoPyObject<'py> for RegionKind {
377 type Target = pyo3::PyAny;
378 type Output = pyo3::Bound<'py, Self::Target>;
379 type Error = pyo3::PyErr;
380
381 fn into_pyobject(self, py: pyo3::Python<'py>) -> Result<Self::Output, Self::Error> {
382 let py_module = py.import("hugr.model")?;
383 let py_class = py_module.getattr("RegionKind")?;
384
385 match self {
386 RegionKind::DataFlow => py_class.getattr("DATA_FLOW"),
387 RegionKind::ControlFlow => py_class.getattr("CONTROL_FLOW"),
388 RegionKind::Module => py_class.getattr("MODULE"),
389 }
390 }
391}
392
393/// The name of a variable.
394#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
395pub struct VarName(SmolStr);
396
397impl VarName {
398 /// Create a new variable name.
399 pub fn new(name: impl Into<SmolStr>) -> Self {
400 Self(name.into())
401 }
402}
403
404impl AsRef<str> for VarName {
405 fn as_ref(&self) -> &str {
406 self.0.as_ref()
407 }
408}
409
410#[cfg(feature = "pyo3")]
411impl<'py> pyo3::FromPyObject<'py> for VarName {
412 fn extract_bound(ob: &pyo3::Bound<'py, pyo3::PyAny>) -> pyo3::PyResult<Self> {
413 let name: String = ob.extract()?;
414 Ok(Self::new(name))
415 }
416}
417
418#[cfg(feature = "pyo3")]
419impl<'py> pyo3::IntoPyObject<'py> for &VarName {
420 type Target = pyo3::types::PyString;
421 type Output = pyo3::Bound<'py, Self::Target>;
422 type Error = pyo3::PyErr;
423
424 fn into_pyobject(self, py: pyo3::Python<'py>) -> Result<Self::Output, Self::Error> {
425 Ok(self.as_ref().into_pyobject(py)?)
426 }
427}
428
429/// The name of a symbol.
430#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
431pub struct SymbolName(SmolStr);
432
433impl SymbolName {
434 /// Create a new symbol name.
435 pub fn new(name: impl Into<SmolStr>) -> Self {
436 Self(name.into())
437 }
438}
439
440impl AsRef<str> for SymbolName {
441 fn as_ref(&self) -> &str {
442 self.0.as_ref()
443 }
444}
445
446#[cfg(feature = "pyo3")]
447impl<'py> pyo3::FromPyObject<'py> for SymbolName {
448 fn extract_bound(ob: &pyo3::Bound<'py, pyo3::PyAny>) -> pyo3::PyResult<Self> {
449 let name: String = ob.extract()?;
450 Ok(Self::new(name))
451 }
452}
453
454/// The name of a link.
455#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
456pub struct LinkName(SmolStr);
457
458impl LinkName {
459 /// Create a new link name.
460 pub fn new(name: impl Into<SmolStr>) -> Self {
461 Self(name.into())
462 }
463
464 /// Create a new link name from a link index.
465 #[must_use]
466 pub fn new_index(index: LinkIndex) -> Self {
467 // TODO: Should named and numbered links have different namespaces?
468 Self(format!("{index}").into())
469 }
470}
471
472impl AsRef<str> for LinkName {
473 fn as_ref(&self) -> &str {
474 self.0.as_ref()
475 }
476}
477
478#[cfg(feature = "pyo3")]
479impl<'py> pyo3::FromPyObject<'py> for LinkName {
480 fn extract_bound(ob: &pyo3::Bound<'py, pyo3::PyAny>) -> pyo3::PyResult<Self> {
481 let name: String = ob.extract()?;
482 Ok(Self::new(name))
483 }
484}
485
486#[cfg(feature = "pyo3")]
487impl<'py> pyo3::IntoPyObject<'py> for &LinkName {
488 type Target = pyo3::types::PyString;
489 type Output = pyo3::Bound<'py, Self::Target>;
490 type Error = pyo3::PyErr;
491
492 fn into_pyobject(self, py: pyo3::Python<'py>) -> Result<Self::Output, Self::Error> {
493 Ok(self.as_ref().into_pyobject(py)?)
494 }
495}
496
497/// A static literal value.
498///
499/// Literal values may be large since they can include strings and byte
500/// sequences of arbitrary length. To enable cheap cloning and sharing,
501/// strings and byte sequences use reference counting.
502#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
503pub enum Literal {
504 /// String literal.
505 Str(SmolStr),
506 /// Natural number literal (unsigned 64 bit).
507 Nat(u64),
508 /// Byte sequence literal.
509 Bytes(Arc<[u8]>),
510 /// Floating point literal
511 Float(OrderedFloat<f64>),
512}
513
514#[cfg(feature = "pyo3")]
515impl<'py> pyo3::FromPyObject<'py> for Literal {
516 fn extract_bound(ob: &pyo3::Bound<'py, pyo3::PyAny>) -> pyo3::PyResult<Self> {
517 if pyo3::types::PyString::is_type_of(ob) {
518 let value: String = ob.extract()?;
519 Ok(Literal::Str(value.into()))
520 } else if pyo3::types::PyInt::is_type_of(ob) {
521 let value: u64 = ob.extract()?;
522 Ok(Literal::Nat(value))
523 } else if pyo3::types::PyFloat::is_type_of(ob) {
524 let value: f64 = ob.extract()?;
525 Ok(Literal::Float(value.into()))
526 } else if pyo3::types::PyBytes::is_type_of(ob) {
527 let value: Vec<u8> = ob.extract()?;
528 Ok(Literal::Bytes(value.into()))
529 } else {
530 Err(pyo3::exceptions::PyTypeError::new_err(
531 "Invalid literal value.",
532 ))
533 }
534 }
535}
536
537#[cfg(feature = "pyo3")]
538impl<'py> pyo3::IntoPyObject<'py> for &Literal {
539 type Target = pyo3::PyAny;
540 type Output = pyo3::Bound<'py, Self::Target>;
541 type Error = pyo3::PyErr;
542
543 fn into_pyobject(self, py: pyo3::Python<'py>) -> Result<Self::Output, Self::Error> {
544 Ok(match self {
545 Literal::Str(s) => s.as_str().into_pyobject(py)?.into_any(),
546 Literal::Nat(n) => n.into_pyobject(py)?.into_any(),
547 Literal::Bytes(b) => pyo3::types::PyBytes::new(py, b)
548 .into_pyobject(py)?
549 .into_any(),
550 Literal::Float(f) => f.0.into_pyobject(py)?.into_any(),
551 })
552 }
553}
554
555#[cfg(test)]
556mod test {
557 use super::*;
558 use proptest::{prelude::*, string::string_regex};
559
560 impl Arbitrary for Literal {
561 type Parameters = ();
562 type Strategy = BoxedStrategy<Self>;
563
564 fn arbitrary_with(_args: Self::Parameters) -> Self::Strategy {
565 prop_oneof![
566 any::<String>().prop_map(|s| Literal::Str(s.into())),
567 any::<u64>().prop_map(Literal::Nat),
568 prop::collection::vec(any::<u8>(), 0..100).prop_map(|v| Literal::Bytes(v.into())),
569 any::<f64>().prop_map(|f| Literal::Float(OrderedFloat(f)))
570 ]
571 .boxed()
572 }
573 }
574
575 impl Arbitrary for SymbolName {
576 type Parameters = ();
577 type Strategy = BoxedStrategy<Self>;
578
579 fn arbitrary_with(_args: Self::Parameters) -> Self::Strategy {
580 string_regex(r"[a-zA-Z\-_][0-9a-zA-Z\-_](\.[a-zA-Z\-_][0-9a-zA-Z\-_])*")
581 .unwrap()
582 .prop_map(Self::new)
583 .boxed()
584 }
585 }
586
587 impl Arbitrary for VarName {
588 type Parameters = ();
589 type Strategy = BoxedStrategy<Self>;
590
591 fn arbitrary_with(_args: Self::Parameters) -> Self::Strategy {
592 string_regex(r"[a-zA-Z\-_][0-9a-zA-Z\-_]")
593 .unwrap()
594 .prop_map(Self::new)
595 .boxed()
596 }
597 }
598
599 proptest! {
600 #[test]
601 fn test_literal_text(lit: Literal) {
602 let text = lit.to_string();
603 let parsed: Literal = text.parse().unwrap();
604 assert_eq!(lit, parsed);
605 }
606 }
607}