hugr_model/v0/mod.rs
1//! Version 0 (unstable).
2//!
3//! **Warning**: This module is still under development and is expected to change.
4//! It is included in the library to allow for early experimentation, and for
5//! the core and model to converge incrementally.
6//!
7//! This module defines representations of the hugr IR as plain data, designed
8//! to be as independent of implementation details as feasible. It can be used
9//! by the core compiler, alternative implementations or tooling that does not
10//! need the power/complexity of the full compiler. We provide the following
11//! in-memory representations:
12//!
13//! - [Table]: Efficient intermediate data structure to facilitate conversions.
14//! - [AST]: Abstract syntax tree that uses direct references rather than table indices.
15//!
16//! The table and AST format are interconvertible and can be serialised to
17//! a binary and text format, respectively:
18//!
19//! - [Binary]: Binary serialisation format optimised for performance and size.
20//! - [Text]: Human readable s-expression based text format.
21//!
22//! # Logical Format
23//!
24//! The hugr IR is a hierarchical graph data structure. __Nodes__ represent both
25//! __instructions__ that manipulate runtime values and __symbols__ which
26//! represent named language objects. Instructions have __input__ and __output__ ports
27//! and runtime values flow between ports when they are connected by a __link__.
28//!
29//! Nodes are organised into __regions__ and do not have any explicit ordering
30//! between them; any schedule that respects the data dependencies between nodes
31//! is valid. Regions come in three different kinds. __Module regions__ form the
32//! top level of a module and can only contain symbols. __Dataflow regions__
33//! describe how data flows from the region's __source__ ports to the region's
34//! __target__ ports. __Controlflow regions__ are control flow graphs containing
35//! dataflow __blocks__, with control flow originating from the region's source
36//! ports and ending in the region's target ports.
37//!
38//! __Terms__ form a meta language that is used to describe types, parameters and metadata that
39//! are known statically. To allow types to be parameterized by values, types and values
40//! are treated uniformly as terms, enabling a restricted form of dependent typing.
41//! Terms are extensible declaratively via __constructors__.
42//! __Constraints__ can be used to express more complex validation rules.
43//!
44//! # Remaining Mismatch with `hugr-core`
45//!
46//! This data model was designed to encode as much of `hugr-core` as possible while also
47//! filling in conceptual gaps and providing a forward-compatible foundation for future
48//! development. However, there are still some mismatches with `hugr-core` that are not
49//! addressed by conversions in import/export:
50//!
51//! - Some static types can not yet be represented in `hugr-core` although they should be.
52//! - The model does not have types with a copy bound as `hugr-core` does, and instead uses
53//! a more general form of type constraints ([#1556]). Similarly, the model does not have
54//! bounded naturals. We perform a conversion for compatibility where possible, but this does
55//! not fully cover all potential cases of bounds.
56//! - `hugr-model` allows to declare term constructors that serve as blueprints for constructing
57//! runtime values. This allows constants to have potentially multiple representations,
58//! which can be essential in case of very large constants that require efficient encodings.
59//! `hugr-core` is more restricted, requiring a canonical representation for constant values.
60//! - `hugr-model` has support for passing closed regions as static parameters to operations,
61//! which allows for higher-order operations that require their function arguments to be
62//! statically known. We currently do not yet support converting this to `hugr-core`.
63//! - In a model module, ports are connected when they share the same link. This differs from
64//! the type of port connectivity in the graph data structure used by `hugr-core`. However,
65//! `hugr-core` restricts connectivity so that in any group of connected ports there is at
66//! most one output port (for dataflow) or at most one input port (for control flow). In
67//! these cases, there is no mismatch.
68//! - `hugr-core` only allows to define type aliases, but not aliases for other terms.
69//! - `hugr-model` has no concept of order edges, encoding a strong preference that ordering
70//! requirements be encoded within the dataflow paradigm.
71//! - Both `hugr-model` and `hugr-core` support metadata, but they use different encodings.
72//! `hugr-core` encodes metadata as JSON objects, while `hugr-model` uses terms. Using
73//! terms has the advantage that metadata can be validated with the same type checking
74//! mechanism as the rest of the model ([#1553]).
75//! - `hugr-model` have a root region that corresponds to a root `Module` in `hugr-core`.
76//! `hugr-core` however can have nodes with different operations as their root ([#1554]).
77//!
78//! [#1556]: https://github.com/CQCL/hugr/discussions/1556
79//! [#1553]: https://github.com/CQCL/hugr/issues/1553
80//! [#1554]: https://github.com/CQCL/hugr/issues/1554
81//! [Text]: crate::v0::ast
82//! [Binary]: crate::v0::binary
83//! [Table]: crate::v0::table
84//! [AST]: crate::v0::ast
85use ordered_float::OrderedFloat;
86use smol_str::SmolStr;
87use std::sync::Arc;
88use table::LinkIndex;
89
90/// Core function types.
91///
92/// - **Parameter:** `?inputs : (core.list core.type)`
93/// - **Parameter:** `?outputs : (core.list core.type)`
94/// - **Parameter:** `?ext : core.ext-set`
95/// - **Result:** `core.type`
96pub const CORE_FN: &str = "core.fn";
97
98/// The type of runtime types.
99///
100/// Runtime types are the types of values that can flow between nodes at runtime.
101///
102/// - **Result:** `?type : core.static`
103pub const CORE_TYPE: &str = "core.type";
104
105/// The type of static types.
106///
107/// Static types are the types of statically known parameters.
108///
109/// This is the only term that is its own type.
110///
111/// - **Result:** `?type : core.static`
112pub const CORE_STATIC: &str = "core.static";
113
114/// The type of constraints.
115///
116/// - **Result:** `?type : core.static`
117pub const CORE_CONSTRAINT: &str = "core.constraint";
118
119/// The constraint for non-linear runtime data.
120///
121/// Runtime values are copied implicitly by connecting an output port to more
122/// than one input port. Similarly runtime values can be deleted implicitly when
123/// an output port is not connected to any input port. In either of these cases
124/// the type of the runtime value must satisfy this constraint.
125///
126/// - **Parameter:** `?type : core.type`
127/// - **Result:** `core.constraint`
128pub const CORE_NON_LINEAR: &str = "core.nonlinear";
129
130/// The type of metadata.
131///
132/// - **Result:** `?type : core.static`
133pub const CORE_META: &str = "core.meta";
134
135/// Runtime algebraic data types.
136///
137/// Algebraic data types are sums of products of other runtime types.
138///
139/// - **Parameter:** `?variants : (core.list (core.list core.type))`
140/// - **Result:** `core.type`
141pub const CORE_ADT: &str = "core.adt";
142
143/// Type of string literals.
144///
145/// - **Result:** `core.static`
146pub const CORE_STR_TYPE: &str = "core.str";
147
148/// Type of natural number literals.
149///
150/// - **Result:** `core.static`
151pub const CORE_NAT_TYPE: &str = "core.nat";
152
153/// Type of bytes literals.
154///
155/// - **Result:** `core.static`
156pub const CORE_BYTES_TYPE: &str = "core.bytes";
157
158/// Type of float literals.
159///
160/// - **Result:** `core.static`
161pub const CORE_FLOAT_TYPE: &str = "core.float";
162
163/// Type of a control flow edge.
164///
165/// - **Parameter:** `?types : (core.list core.type)`
166/// - **Result:** `core.ctrl_type`
167pub const CORE_CTRL: &str = "core.ctrl";
168
169/// The type of the types for control flow edges.
170///
171/// - **Result:** `?type : core.static`
172pub const CORE_CTRL_TYPE: &str = "core.ctrl_type";
173
174/// The type of extension sets.
175///
176/// - **Result:** `?type : core.static`
177pub const CORE_EXT_SET: &str = "core.ext_set";
178
179/// The type for runtime constants.
180///
181/// - **Parameter:** `?type : core.type`
182/// - **Parameter:** `?ext : core.ext_set`
183/// - **Result:** `core.static`
184pub const CORE_CONST: &str = "core.const";
185
186/// Constants for runtime algebraic data types.
187///
188/// - **Parameter:** `?variants : (core.list core.type)`
189/// - **Parameter:** `?ext : core.ext_set`
190/// - **Parameter:** `?types : (core.list core.static)`
191/// - **Parameter:** `?tag : core.nat`
192/// - **Parameter:** `?values : (core.tuple ?types)`
193/// - **Result:** `(core.const (core.adt ?variants) ?ext)`
194pub const CORE_CONST_ADT: &str = "core.const.adt";
195
196/// The type for lists of static data.
197///
198/// Lists are finite sequences such that all elements have the same type.
199/// For heterogeneous sequences, see [`CORE_TUPLE_TYPE`].
200///
201/// - **Parameter:** `?type : core.static`
202/// - **Result:** `core.static`
203pub const CORE_LIST_TYPE: &str = "core.list";
204
205/// The type for tuples of static data.
206///
207/// Tuples are finite sequences that allow elements to have different types.
208/// For homogeneous sequences, see [`CORE_LIST_TYPE`].
209///
210/// - **Parameter:** `?types : (core.list core.static)`
211/// - **Result:** `core.static`
212pub const CORE_TUPLE_TYPE: &str = "core.tuple";
213
214/// Operation to call a statically known function.
215///
216/// - **Parameter:** `?inputs : (core.list core.type)`
217/// - **Parameter:** `?outputs : (core.list core.type)`
218/// - **Parameter:** `?ext : core.ext_set`
219/// - **Parameter:** `?func : (core.const (core.fn ?inputs ?outputs ?ext) ?ext)`
220/// - **Result:** `(core.fn ?inputs ?outputs ?ext)`
221pub const CORE_CALL: &str = "core.call";
222
223/// Operation to call a functiion known at runtime.
224///
225/// - **Parameter:** `?inputs : (core.list core.type)`
226/// - **Parameter:** `?outputs : (core.list core.type)`
227/// - **Parameter:** `?ext : core.ext_set`
228/// - **Result:** `(core.fn [(core.fn ?inputs ?outputs ?ext) ?inputs ...] ?outputs ?ext)`
229pub const CORE_CALL_INDIRECT: &str = "core.call_indirect";
230
231/// Operation to load a constant value.
232///
233/// - **Parameter:** `?type : core.type`
234/// - **Parameter:** `?ext : core.ext_set`
235/// - **Parameter:** `?value : (core.const ?type ?ext)`
236/// - **Result:** `(core.fn [] [?type] ?ext)`
237pub const CORE_LOAD_CONST: &str = "core.load_const";
238
239/// Operation to create a value of an algebraic data type.
240///
241/// - **Parameter:** `?variants : (core.list (core.list core.type))`
242/// - **Parameter:** `?types : (core.list core.type)`
243/// - **Parameter:** `?tag : core.nat`
244/// - **Result:** `(core.fn ?types [(core.adt ?variants)] (ext))`
245pub const CORE_MAKE_ADT: &str = "core.make_adt";
246
247/// Constructor for documentation metadata.
248///
249/// - **Parameter:** `?description : core.str`
250/// - **Result:** `core.meta`
251pub const CORE_META_DESCRIPTION: &str = "core.meta.description";
252
253/// Constructor for JSON encoded metadata.
254///
255/// This is included in the model to allow for compatibility with `hugr-core`.
256/// The intention is to deprecate this in the future in favor of metadata
257/// expressed with custom constructors.
258///
259/// - **Parameter:** `?name : core.str`
260/// - **Parameter:** `?json : core.str`
261/// - **Result:** `core.meta`
262pub const COMPAT_META_JSON: &str = "compat.meta_json";
263
264/// Constructor for JSON encoded constants.
265///
266/// This is included in the model to allow for compatibility with `hugr-core`.
267/// The intention is to deprecate this in the future in favor of constants
268/// expressed with custom constructors.
269///
270/// - **Parameter:** `?type : core.type`
271/// - **Parameter:** `?ext : core.ext_set`
272/// - **Parameter:** `?json : core.str`
273/// - **Result:** `(core.const ?type ?ext)`
274pub const COMPAT_CONST_JSON: &str = "compat.const_json";
275
276pub mod ast;
277pub mod binary;
278pub mod scope;
279pub mod table;
280
281pub use bumpalo;
282
283/// Type to indicate whether scopes are open or closed.
284#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
285pub enum ScopeClosure {
286 /// A scope that is open and therefore not isolated from its parent scope.
287 #[default]
288 Open,
289 /// A scope that is closed and therefore isolated from its parent scope.
290 Closed,
291}
292
293/// The kind of a region.
294#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
295pub enum RegionKind {
296 /// Data flow region.
297 #[default]
298 DataFlow = 0,
299 /// Control flow region.
300 ControlFlow = 1,
301 /// Module region.
302 Module = 2,
303}
304
305/// The name of a variable.
306#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
307pub struct VarName(SmolStr);
308
309impl VarName {
310 /// Create a new variable name.
311 pub fn new(name: impl Into<SmolStr>) -> Self {
312 Self(name.into())
313 }
314}
315
316impl AsRef<str> for VarName {
317 fn as_ref(&self) -> &str {
318 self.0.as_ref()
319 }
320}
321
322/// The name of a symbol.
323#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
324pub struct SymbolName(SmolStr);
325
326impl SymbolName {
327 /// Create a new symbol name.
328 pub fn new(name: impl Into<SmolStr>) -> Self {
329 Self(name.into())
330 }
331}
332
333impl AsRef<str> for SymbolName {
334 fn as_ref(&self) -> &str {
335 self.0.as_ref()
336 }
337}
338
339/// The name of a link.
340#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
341pub struct LinkName(SmolStr);
342
343impl LinkName {
344 /// Create a new link name.
345 pub fn new(name: impl Into<SmolStr>) -> Self {
346 Self(name.into())
347 }
348
349 /// Create a new link name from a link index.
350 pub fn new_index(index: LinkIndex) -> Self {
351 // TODO: Should named and numbered links have different namespaces?
352 Self(format!("{}", index).into())
353 }
354}
355
356impl AsRef<str> for LinkName {
357 fn as_ref(&self) -> &str {
358 self.0.as_ref()
359 }
360}
361
362/// A static literal value.
363///
364/// Literal values may be large since they can include strings and byte
365/// sequences of arbitrary length. To enable cheap cloning and sharing,
366/// strings and byte sequences use reference counting.
367#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
368pub enum Literal {
369 /// String literal.
370 Str(SmolStr),
371 /// Natural number literal (unsigned 64 bit).
372 Nat(u64),
373 /// Byte sequence literal.
374 Bytes(Arc<[u8]>),
375 /// Floating point literal
376 Float(OrderedFloat<f64>),
377}
378
379#[cfg(test)]
380mod test {
381 use super::*;
382 use proptest::{prelude::*, string::string_regex};
383
384 impl Arbitrary for Literal {
385 type Parameters = ();
386 type Strategy = BoxedStrategy<Self>;
387
388 fn arbitrary_with(_args: Self::Parameters) -> Self::Strategy {
389 prop_oneof![
390 any::<String>().prop_map(|s| Literal::Str(s.into())),
391 any::<u64>().prop_map(Literal::Nat),
392 prop::collection::vec(any::<u8>(), 0..100).prop_map(|v| Literal::Bytes(v.into())),
393 any::<f64>().prop_map(|f| Literal::Float(OrderedFloat(f)))
394 ]
395 .boxed()
396 }
397 }
398
399 impl Arbitrary for SymbolName {
400 type Parameters = ();
401 type Strategy = BoxedStrategy<Self>;
402
403 fn arbitrary_with(_args: Self::Parameters) -> Self::Strategy {
404 string_regex(r"[a-zA-Z\-_][0-9a-zA-Z\-_](\.[a-zA-Z\-_][0-9a-zA-Z\-_])*")
405 .unwrap()
406 .prop_map(Self::new)
407 .boxed()
408 }
409 }
410
411 impl Arbitrary for VarName {
412 type Parameters = ();
413 type Strategy = BoxedStrategy<Self>;
414
415 fn arbitrary_with(_args: Self::Parameters) -> Self::Strategy {
416 string_regex(r"[a-zA-Z\-_][0-9a-zA-Z\-_]")
417 .unwrap()
418 .prop_map(Self::new)
419 .boxed()
420 }
421 }
422
423 proptest! {
424 #[test]
425 fn test_literal_text(lit: Literal) {
426 let text = lit.to_string();
427 let parsed: Literal = text.parse().unwrap();
428 assert_eq!(lit, parsed);
429 }
430 }
431}