typeshare_model/
language.rs

use std::{borrow::Cow, fmt::Debug, io::Write, path::Path};

use anyhow::Context;
use itertools::Itertools;

use crate::parsed_data::{
    CrateName, Id, RustConst, RustEnum, RustEnumVariant, RustStruct, RustType, RustTypeAlias,
    SpecialRustType, TypeName,
};

/// If we're in multifile mode, this enum contains the crate name for the
/// specific file
#[derive(Debug, Clone, Copy)]
#[non_exhaustive]
pub enum FilesMode<T> {
    Single,
    Multi(T),
    // We've had requests for java support, which means we'll need a
    // 1-file-per-type mode
}

impl<T> FilesMode<T> {
    pub fn map<U>(self, op: impl FnOnce(T) -> U) -> FilesMode<U> {
        match self {
            FilesMode::Single => FilesMode::Single,
            FilesMode::Multi(value) => FilesMode::Multi(op(value)),
        }
    }

    pub fn is_multi(&self) -> bool {
        matches!(*self, Self::Multi(_))
    }
}

/**
*The* trait you need to implement in order to have your own implementation of
typeshare. The whole world revolves around this trait.

In general, to implement this correctly, you *must* implement:

- `new_from_config`, which instantiates your `Language` struct from
  configuration which was read from a config file or the command line
- `output_filename_for_crate`, which (in multi-file mode) produces a file
  name from a crate name. All of the typeshared types from that crate will
  be written to that file.
- The `write_*` methods, which output the actual type definitions. These
  methods *should* call `format_type` to format the actual types contained
  in the type definitions, which will in turn dispatch to the relevant
  `format_*` method, depending on what kind of type it is.
- The `format_special_type` method, which outputs things like integer types,
  arrays, and other builtin or primitive types. This method is only ever called
  by `format_type`, which is only called if you choose to call it in your
  `write_*` implementations.

Additionally, you must provide a `Config` associated type, which must implement
`Serialize + Deserialize`. This type will be used to load configuration from
a config file and from the command line arguments for your language, which will
be passed to `new_from_config`. This type should provide defaults for *all* of
its fields; it should always tolerate being loaded from an empty config file.
When *serializing*, this type should always output all of its fields, even if
they're defaulted.

It's also very common to implement:

- `mapped_type`, to define certain types as having specialized handling in your
  language.
- `begin_file`, `end_file`, and `write_additional_files`, to add additional
  per-file or per-directory content to your output.
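
As a rough illustration of `mapped_type`, the sketch below keeps user-supplied
overrides in a map and consults it before falling back to the Rust name. It is
self-contained and uses plain strings instead of `TypeName`; the
`type_mappings` field and the method bodies are assumptions made for the
example, not part of typeshare.

```rust
use std::{borrow::Cow, collections::HashMap};

// Hypothetical language struct; `type_mappings` would normally be filled
// from the language's config.
struct Kotlin {
    type_mappings: HashMap<String, String>,
}

impl Kotlin {
    // Same shape as `Language::mapped_type`, but keyed by `&str`.
    fn mapped_type(&self, type_name: &str) -> Option<Cow<'_, str>> {
        self.type_mappings
            .get(type_name)
            .map(|mapped| Cow::Borrowed(mapped.as_str()))
    }

    // Mirrors the default `format_simple_type`: prefer the mapping,
    // otherwise print the name verbatim.
    fn format_simple_type(&self, base: &str) -> String {
        match self.mapped_type(base) {
            Some(mapped) => mapped.into_owned(),
            None => base.to_string(),
        }
    }
}
```

Returning `Cow` lets an implementation hand back either a borrowed entry from
its own config or a freshly built string without changing the signature.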

If your language spells type names in an unusual way (here, defined as the C++
descended convention, where a type might be spelled `Foo<Bar, Baz<Zed>>`),
you'll want to implement the `format_*` methods.
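
A minimal sketch of that kind of specialization, assuming a language that
writes generics with square brackets. `GenericSyntax` is a hypothetical,
cut-down stand-in for the two relevant `Language` methods, and parameters are
passed as already-formatted strings to keep the example self-contained.

```rust
// Cut-down stand-in for `Language`: both methods have defaults, as the real
// `format_generic_type` and `format_generic_parameters` do.
trait GenericSyntax {
    fn format_generic_parameters(&self, parameters: &[String]) -> String {
        format!("<{}>", parameters.join(", "))
    }

    fn format_generic_type(&self, base: &str, parameters: &[String]) -> String {
        format!("{}{}", base, self.format_generic_parameters(parameters))
    }
}

// Uses the defaults unchanged.
struct Angle;
impl GenericSyntax for Angle {}

// Overrides only the parameter list; the composite name still comes from
// the default `format_generic_type`.
struct Square;
impl GenericSyntax for Square {
    fn format_generic_parameters(&self, parameters: &[String]) -> String {
        format!("[{}]", parameters.join(", "))
    }
}
```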

Other methods can be specialized as needed.

# Typeshare execution flow.

This is the detailed flow of how the `Language` trait is actually used by
typeshare. It includes references to all of the methods that are called, and
in what order. For these examples, we're assuming a hypothetical implementation
for Kotlin, which means that there must be `impl Language<'_> for Kotlin`
somewhere.

1. The language's config is loaded from the config file and command line
arguments:

```ignore
let config = Kotlin::Config::deserialize(config_file)?;
```

2. The language is constructed from the loaded config via `new_from_config`.
This is where the implementation has the opportunity to report any
configuration errors that weren't detected during deserialization.

```ignore
let language = Kotlin::new_from_config(config)?;
```

3. If we're in multi-file mode, we call `output_filename_for_crate` for each Rust
crate being typeshared to determine the _filename_ for the output file that
will contain that crate's types.

```ignore
let files = crate_names
    .iter()
    .map(|crate_name| {
        let filename = language.output_filename_for_crate(crate_name);
        File::create(output_directory.join(filename))
    });
```

4. We call `begin_file` on the output file to print any headers or preamble
appropriate for this language. In multi-file mode, `begin_file` is called once
for each output file; in this case, the `mode` argument will include the crate
name.

```ignore
language.begin_file(&mut file, mode)?;
```

5. In multi-file mode only, we call `write_imports` with a list of all the
types that are being imported from other typeshare'd crates. This allows the
language to emit appropriate import statements for its own language.

```ignore
// Only in multi-file mode
language.write_imports(&mut file, crate_name, computed_imports)?;
```

6. For EACH item being typeshared, we call `write_enum`,
`write_struct`, `write_type_alias`, or `write_const`, as appropriate.

```ignore
language.write_struct(&mut file, parsed_struct)?;
language.write_enum(&mut file, parsed_enum)?;
```

6a. In your implementations of these methods, we recommend that you call
`format_type` for the fields of these types. `format_type` will in turn call
`format_simple_type`, `format_generic_type`, or `format_special_type`, as
appropriate; usually it is only necessary for you to implement
`format_special_type` yourself, and use the default implementations for the
others. The `format_*` methods will otherwise never be called by typeshare.
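
The dispatch described in 6a can be sketched with simplified stand-in types.
`Ty` and `Special` below are hypothetical, much smaller versions of `RustType`
and `SpecialRustType`, and the Kotlin-style spellings are only an illustration
of what a `format_special_type` implementation might choose.

```rust
// Simplified stand-in for `SpecialRustType`; the real enum has more variants.
enum Special {
    String,
    I32,
    Vec(Box<Ty>),
    Option(Box<Ty>),
}

// Simplified stand-in for `RustType`, using `String` instead of `TypeName`.
enum Ty {
    Simple(String),
    Generic(String, Vec<Ty>),
    Special(Special),
}

// Mirrors the default `format_type`: one arm per kind of type.
fn format_type(ty: &Ty) -> String {
    match ty {
        Ty::Simple(name) => name.clone(),
        Ty::Generic(name, parameters) => {
            let parameters: Vec<String> = parameters.iter().map(format_type).collect();
            format!("{}<{}>", name, parameters.join(", "))
        }
        Ty::Special(special) => format_special_type(special),
    }
}

// The language-specific part: spellings for builtin types. Nested types are
// formatted by calling back into `format_type`.
fn format_special_type(special: &Special) -> String {
    match special {
        Special::String => "String".to_string(),
        Special::I32 => "Int".to_string(),
        Special::Vec(inner) => format!("List<{}>", format_type(inner)),
        Special::Option(inner) => format!("{}?", format_type(inner)),
    }
}
```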

6b. If your language doesn't natively support data-containing enums, we
recommend that you call `write_struct_types_for_enum_variants` in your
implementation of `write_enum`; this will call `write_struct` for each
anonymous struct variant of the enum.
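
As a rough sketch of the idea in 6b, the function below turns each anonymous
struct variant into a standalone definition and skips unit variants. `Field`,
`Variant`, and the Kotlin-like output are hypothetical simplifications; the
real helper works on `RustEnum` and delegates to your `write_struct`.

```rust
// Hypothetical, simplified stand-ins for typeshare's parsed field and
// variant types.
struct Field {
    name: &'static str,
    ty: &'static str,
}

enum Variant {
    Unit { name: &'static str },
    AnonymousStruct { name: &'static str, fields: Vec<Field> },
}

// Produces one struct definition per anonymous struct variant. The caller
// decides how variant names become struct names via `make_struct_name`.
fn struct_types_for_variants(
    variants: &[Variant],
    make_struct_name: impl Fn(&str) -> String,
) -> Vec<String> {
    variants
        .iter()
        .filter_map(|variant| match variant {
            Variant::AnonymousStruct { name, fields } => {
                let fields: Vec<String> = fields
                    .iter()
                    .map(|field| format!("val {}: {}", field.name, field.ty))
                    .collect();
                Some(format!(
                    "data class {}({})",
                    make_struct_name(*name),
                    fields.join(", ")
                ))
            }
            // Unit variants carry no data, so no struct is generated.
            Variant::Unit { .. } => None,
        })
        .collect()
}
```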

7. After all the types are written, we call `end_file`, with the same
arguments that were passed to `begin_file`.

```ignore
language.end_file(&mut file, mode)?;
```

8. In multi-file mode only, after ALL files are written, we call
`write_additional_files` with the output directory. This gives the language an
opportunity to create any files resembling `mod.rs` or `index.js` as it might
require.

```ignore
// Only in multi-file mode
language.write_additional_files(&output_directory, generated_files.iter())?;
```
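
Taken together, the per-file part of this flow (steps 4, 6, and 7) can be
sketched as below. This is a simplified model, not typeshare's actual driver:
the `Kotlin` methods take fewer arguments than the real trait methods, the
emitted text is made up, and `generate` is a hypothetical helper.

```rust
use std::io::{self, Write};

// Hypothetical language with cut-down versions of three trait methods.
struct Kotlin;

impl Kotlin {
    fn begin_file(&self, w: &mut impl Write) -> io::Result<()> {
        writeln!(w, "// Generated by typeshare")
    }

    fn write_struct(&self, w: &mut impl Write, name: &str) -> io::Result<()> {
        writeln!(w, "class {}", name)
    }

    fn end_file(&self, w: &mut impl Write) -> io::Result<()> {
        writeln!(w, "// End of generated code")
    }
}

// Calls the methods in the documented order: header, every item, footer.
fn generate(language: &Kotlin, struct_names: &[&str]) -> io::Result<String> {
    let mut out = Vec::new();
    language.begin_file(&mut out)?;
    for name in struct_names {
        language.write_struct(&mut out, name)?;
    }
    language.end_file(&mut out)?;
    Ok(String::from_utf8(out).expect("only UTF-8 text was written"))
}
```

Writing into any `impl Write` means the same methods work for files and for
in-memory buffers, which is convenient when testing an implementation.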

NOTE: at this stage, multi-file output is still work-in-progress, as the
algorithms that compute import sets are being rewritten. The API presented
here is stable, but output might be buggy while issues with import detection
are resolved.

In the future, we hope to make multi-file mode multithreaded, capable of
writing multiple files concurrently from a shared `Language` instance.
`Language` therefore has a `Sync` bound to keep this possibility available.
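
To illustrate what the `Sync` bound makes possible, here is a minimal sketch
in which several scoped threads share one language instance by reference.
Nothing here reflects how typeshare is implemented today; `Kotlin`, `render`,
and `render_all` are hypothetical names.

```rust
use std::thread;

// Hypothetical language; it is `Sync` because all of its fields are.
#[derive(Debug)]
struct Kotlin {
    indent: usize,
}

impl Kotlin {
    // Stand-in for writing one crate's output file.
    fn render(&self, crate_name: &str) -> String {
        format!("{}// types from {}", " ".repeat(self.indent), crate_name)
    }
}

// Spawns one scoped thread per crate. Each thread borrows the same
// `&Kotlin`, which is only allowed because `Kotlin: Sync`.
fn render_all(language: &Kotlin, crate_names: &[&str]) -> Vec<String> {
    thread::scope(|scope| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = crate_names
            .iter()
            .map(|&name| scope.spawn(move || language.render(name)))
            .collect();

        // Joining in spawn order keeps the output order stable.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```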
*/
pub trait Language<'config>: Sized + Sync + Debug {
    /**
    The configuration for this language. This configuration will be loaded
    from a config file and, where possible, from the command line, via
    `serde`.

    It is important that this type include `#[serde(default)]` or something
    equivalent, so that a config can be loaded with default settings even
    if this language isn't present in the config file.

    The `serialize` implementation for this type should NOT skip keys, if
    possible.
    */
    type Config: serde::Deserialize<'config> + serde::Serialize;

    /**
    The lowercase conventional name for this language. This should be a
    single identifier. It will be used as a prefix for various things;
    for instance, it will identify this language in the config file, and
    be used as a prefix when generating CLI parameters.
    */
    const NAME: &'static str;

    /// Create an instance of this language from the loaded configuration.
    fn new_from_config(config: Self::Config) -> anyhow::Result<Self>;

    /**
    Most languages provide manual overrides for specific types. When a type
    is formatted with a name that matches a mapped type, the mapped type
    name is formatted instead.

    By default this returns `None` for all types.
    */
    fn mapped_type(&self, type_name: &TypeName) -> Option<Cow<'_, str>> {
        let _ = type_name;
        None
    }

    /**
    In multi-file mode, typeshare will output one separate file with this
    name for each crate in the input set. These file names should have the
    appropriate naming convention and extension for this language.

    This method isn't used in single-file mode.
    */
    fn output_filename_for_crate(&self, crate_name: &CrateName) -> String;

    /**
    Convert a Rust type into a type from this language. By default this
    calls `format_simple_type`, `format_generic_type`, or
    `format_special_type`, depending on the type. There should only rarely
    be a need to specialize this.

    This method should be called by the `write_*` methods to write the types
    contained by type definitions.

    The `generic_context` is the set of generic types being provided by
    the enclosing type definition; this allows languages that do type
    renaming to be able to distinguish concrete type names (like `int`)
    from generic type names (like `T`).
    */
    fn format_type(&self, ty: &RustType, generic_context: &[TypeName]) -> anyhow::Result<String> {
        match ty {
            RustType::Simple { id } => self.format_simple_type(id, generic_context),
            RustType::Generic { id, parameters } => {
                self.format_generic_type(id, parameters.as_slice(), generic_context)
            }
            RustType::Special(special) => self.format_special_type(special, generic_context),
        }
    }

    /**
    Format a simple type with no generic parameters.

    By default, this first checks `self.mapped_type` to see if there's an
    alternative way this type should be formatted, and otherwise prints the
    `base` verbatim.

    The `generic_context` is the set of generic types being provided by
    the enclosing type definition; this allows languages that do type
    renaming to be able to distinguish concrete type names (like `int`)
    from generic type names (like `T`).
    */
    fn format_simple_type(
        &self,
        base: &TypeName,
        generic_context: &[TypeName],
    ) -> anyhow::Result<String> {
        let _ = generic_context;
        Ok(match self.mapped_type(base) {
            Some(mapped) => mapped.to_string(),
            None => base.to_string(),
        })
    }

    /**
    Format a generic type that takes in generic arguments, which
    may be recursive.

    By default, this creates a composite type name by concatenating the
    results of `self.format_simple_type` and
    `self.format_generic_parameters`. With their default implementations,
    this will print `name<parameters>`, which is a common syntax used by
    many languages for generics.

    If `parameters` is empty, this defers to `self.format_simple_type`. If
    `self.mapped_type` returns a mapping for `base`, the mapped name is
    used on its own and the generic parameters are not printed.

    The `generic_context` is the set of generic types being provided by
    the enclosing type definition; this allows languages that do type
    renaming to be able to distinguish concrete type names (like `int`)
    from generic type names (like `T`).
    */
    fn format_generic_type(
        &self,
        base: &TypeName,
        parameters: &[RustType],
        generic_context: &[TypeName],
    ) -> anyhow::Result<String> {
        match parameters.is_empty() {
            true => self.format_simple_type(base, generic_context),
            false => Ok(match self.mapped_type(base) {
                Some(mapped) => mapped.to_string(),
                None => format!(
                    "{}{}",
                    self.format_simple_type(base, generic_context)?,
                    self.format_generic_parameters(parameters, generic_context)?,
                ),
            }),
        }
    }

    /**
    Format generic parameters into a syntax used by this language. By
    default, this returns `<A, B, C, ...>`, since that's a common syntax
    used by most languages.

    This method is only used when `format_generic_type` calls it.

    The `generic_context` is the set of generic types being provided by
    the enclosing type definition; this allows languages that do type
    renaming to be able to distinguish concrete type names (like `int`)
    from generic type names (like `T`).
    */
    fn format_generic_parameters(
        &self,
        parameters: &[RustType],
        generic_context: &[TypeName],
    ) -> anyhow::Result<String> {
        parameters
            .iter()
            .map(|ty| self.format_type(ty, generic_context))
            .process_results(|mut formatted| format!("<{}>", formatted.join(", ")))
    }

    /**
    Format a special type. This will handle things like arrays, primitives,
    options, and so on. Every language has different spellings for these types,
    so this is one of the key methods that a language implementation needs to
    deal with.
    */
    fn format_special_type(
        &self,
        special_ty: &SpecialRustType,
        generic_context: &[TypeName],
    ) -> anyhow::Result<String>;

    /**
    Write a header for typeshared code. This is called unconditionally
    at the start of the output file (or at the start of all files, if in
    multi-file mode).

    By default this does nothing.
    */
    fn begin_file(&self, w: &mut impl Write, mode: FilesMode<&CrateName>) -> anyhow::Result<()> {
        let _ = (w, mode);
        Ok(())
    }

    /**
    For generating import statements. This is called only in multi-file
    mode, after `begin_file` and before any other writes.

    `imports` includes an ordered list of type names that typeshare
    believes are being imported by this file, grouped by the crates they
    come from. `typeshare` guarantees that these will be passed in some stable
    order, so that your output remains consistent.

    NOTE: Currently this is bugged and doesn't receive correct imports.
    This will be fixed in a future release.
    */
    fn write_imports<'a, Crates, Types>(
        &self,
        writer: &mut impl Write,
        crate_name: &CrateName,
        imports: Crates,
    ) -> anyhow::Result<()>
    where
        Crates: IntoIterator<Item = (&'a CrateName, Types)>,
        Types: IntoIterator<Item = &'a TypeName>;

    /**
    Write a footer for typeshared code. This is called unconditionally
    at the end of the output file (or at the end of all files, if in
    multi-file mode).

    By default this does nothing.
    */
    fn end_file(&self, w: &mut impl Write, mode: FilesMode<&CrateName>) -> anyhow::Result<()> {
        let _ = (w, mode);
        Ok(())
    }

    /**
    Write a type alias definition.

    Example of a type alias:
    ```
    type MyTypeAlias = String;
    ```

    Generally this method will call `self.format_type` to produce the
    aliased type name in the output definition.
    */
    fn write_type_alias(&self, w: &mut impl Write, t: &RustTypeAlias) -> anyhow::Result<()>;

    /**
    Write a struct definition.

    Example of a struct:
    ```ignore
    #[typeshare]
    #[derive(Serialize, Deserialize)]
    struct Foo {
        bar: String
    }
    ```

    Generally this method will call `self.format_type` to produce the types
    of the individual fields.
    */
    fn write_struct(&self, w: &mut impl Write, rs: &RustStruct) -> anyhow::Result<()>;

    /**
    Write an enum definition.

    Example of an enum:
    ```ignore
    #[typeshare]
    #[derive(Serialize, Deserialize)]
    #[serde(tag = "type", content = "content")]
    enum Foo {
        Fizz,
        Buzz { yep_this_works: bool }
    }
    ```

    Generally this will call `self.format_type` to produce the types of
    the individual fields. If this enum is an algebraic sum type, and this
    language doesn't really support those, it should consider calling
    `write_struct_types_for_enum_variants` to produce struct types matching
    those variants, which can be used for this language's abstraction for
    data like this.
    */
    fn write_enum(&self, w: &mut impl Write, e: &RustEnum) -> anyhow::Result<()>;

    /**
    Write a constant variable.

    Example of a constant variable:
    ```
    const ANSWER_TO_EVERYTHING: u32 = 42;
    ```

    If the language requires it, this will generally call
    `self.format_type` to produce the type of this constant (though some
    languages allow the type to be omitted).
    */
    fn write_const(&self, w: &mut impl Write, c: &RustConst) -> anyhow::Result<()>;

    /**
    Write out named types to represent anonymous struct enum variants.

    Take the following enum as an example:

    ```
    enum AlgebraicEnum {
        AnonymousStruct {
            field: String,
            another_field: bool,
        },

        Variant2 {
            field: i32,
        }
    }
    ```

    This function will write out a pair of struct types resembling:

    ```
    struct AnonymousStruct {
        field: String,
        another_field: bool,
    }

    struct Variant2 {
        field: i32,
    }
    ```

    Except that it will use `make_struct_name` to compute the names of these
    structs based on the names of the variants.

    This method isn't called by default; it is instead provided as a helper
    for your implementation of `write_enum`, since many languages don't have
    a specific notion of an algebraic sum type, and have to emulate it with
    subclasses, tagged unions, or something similar.
    */
    fn write_struct_types_for_enum_variants(
        &self,
        w: &mut impl Write,
        e: &RustEnum,
        make_struct_name: &impl Fn(&TypeName) -> String,
    ) -> anyhow::Result<()> {
        let variants = match e {
            RustEnum::Unit { .. } => return Ok(()),
            RustEnum::Algebraic { variants, .. } => variants.iter().filter_map(|v| match v {
                RustEnumVariant::AnonymousStruct { fields, shared } => Some((fields, shared)),
                _ => None,
            }),
        };

        for (fields, variant) in variants {
            let struct_name = make_struct_name(&variant.id.original);

            // Builds the list of generic types (e.g [T, U, V]), by digging
            // through the fields recursively and comparing against the
            // enclosing enum's list of generic parameters.
            let generic_types = fields
                .iter()
                .flat_map(|field| {
                    e.shared()
                        .generic_types
                        .iter()
                        .filter(|g| field.ty.contains_type(g))
                })
                .unique()
                .cloned()
                .collect();

            self.write_struct(
                w,
                &RustStruct {
                    id: Id {
                        original: TypeName::new_string(struct_name.clone()),
                        renamed: TypeName::new_string(struct_name),
                    },
                    fields: fields.clone(),
                    generic_types,
                    comments: vec![format!(
                        "Generated type representing the anonymous struct \
                        variant `{}` of the `{}` Rust enum",
                        &variant.id.original,
                        &e.shared().id.original,
                    )],
                    decorators: e.shared().decorators.clone(),
                },
            )
            .with_context(|| {
                format!(
                    "failed to write struct type for the \
                    `{}` variant of the `{}` enum",
                    variant.id.original,
                    e.shared().id.original
                )
            })?;
        }

        Ok(())
    }

    /**
    If a type with this name appears in a type definition, it will be
    unconditionally excluded from cross-file import analysis. Usually this will
    be the types handled by `mapped_type`, since those are types with special
    behavior (for instance, a datetime type provided as a standard type by
    your language).

    This is mostly a performance optimization. By default it returns `false`
    for all types.
    */
    fn exclude_from_import_analysis(&self, name: &TypeName) -> bool {
        let _ = name;
        false
    }

    /**
    In multi-file mode, this method is called after all of the individual
    typeshare files are completely generated. Use it to generate any
    additional files your language might need in this directory to
    function correctly, such as a `mod.rs`, `__init__.py`, `index.js`, or
    anything else like that.

    It is passed a list of crate names, one for each crate that was
    typeshared, together with the associated file paths, indicating all of
    the files that were generated by typeshare.

    By default, this does nothing.
    */
    fn write_additional_files<'a>(
        &self,
        output_folder: &Path,
        output_files: impl IntoIterator<Item = (&'a CrateName, &'a Path)>,
    ) -> anyhow::Result<()> {
        let _ = (output_folder, output_files);
        Ok(())
    }
}