use std::{borrow::Cow, io::Write, path::Path};

use anyhow::Context;
use itertools::Itertools;

use crate::parsed_data::{
    CrateName, Id, RustConst, RustEnum, RustEnumVariant, RustStruct, RustType, RustTypeAlias,
    SpecialRustType, TypeName,
};

/// If we're in multifile mode, this enum contains the crate name for the
/// specific file
#[derive(Debug, Clone, Copy)]
pub enum FilesMode<T> {
    Single,
    Multi(T),
}

impl<T> FilesMode<T> {
    pub fn map<U>(self, op: impl FnOnce(T) -> U) -> FilesMode<U> {
        match self {
            FilesMode::Single => FilesMode::Single,
            FilesMode::Multi(value) => FilesMode::Multi(op(value)),
        }
    }

    pub fn is_multi(&self) -> bool {
        matches!(*self, Self::Multi(_))
    }
}

/**
*The* trait you need to implement in order to have your own implementation of
typeshare. The whole world revolves around this trait.

In general, to implement this correctly, you *must* implement:

- `new_from_config`, which instantiates your `Language` struct from
  configuration that was read from a config file or the command line
- `output_filename_for_crate`, which (in multi-file mode) produces a file
  name from a crate name. All of the typeshared types from that crate will
  be written to that file.
- The `write_*` methods, which output the actual type definitions. These
  methods *should* call `format_type` to format the actual types contained
  in the type definitions, which will in turn dispatch to the relevant
  `format_*` method, depending on what kind of type it is.
- The `format_special_type` method, which outputs things like integer types,
  arrays, and other builtin or primitive types. This method is only ever called
  by `format_type`, which is only called if you choose to call it in your
  `write_*` implementations.

Additionally, you must provide a `Config` associated type, which must implement
`Serialize + Deserialize`. This type will be used to load configuration from
a config file and from the command line arguments for your language, which will
be passed to `new_from_config`. This type should provide defaults for *all* of
its fields; it should always tolerate being loaded from an empty config file.
When *serializing*, this type should always output all of its fields, even if
they're defaulted.

It's also very common to implement:

- `mapped_type`, to define certain types as having specialized handling in your
  language.
- `begin_file`, `end_file`, and `write_additional_files`, to add additional
  per-file or per-directory content to your output.

If your language spells type names in an unusual way (that is, anything other
than the C++-descended convention, where a type might be spelled
`Foo<Bar, Baz<Zed>>`), you'll want to implement the `format_*` methods.

Other methods can be specialized as needed.
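
As a rough illustration of why you might override a `format_*` method, the
sketch below models a language that spells generics with square brackets.
The `Ty` enum, the `Formatter` trait, and the `sample` helper are simplified,
hypothetical stand-ins invented for this example; they are not the real
`typeshare_model` types or the real `Language` trait.

```rust
// Simplified stand-ins for illustration only; NOT the real typeshare types.
enum Ty {
    Simple(&'static str),
    Generic(&'static str, Vec<Ty>),
}

trait Formatter {
    // Default spelling: a C++-style `<A, B>` parameter list.
    fn format_generic_parameters(&self, params: &[Ty]) -> String {
        let parts: Vec<String> = params.iter().map(|p| self.format_type(p)).collect();
        format!("<{}>", parts.join(", "))
    }

    // Prints `name` for simple types and `name` + parameters for generics.
    fn format_type(&self, ty: &Ty) -> String {
        match ty {
            Ty::Simple(name) => name.to_string(),
            Ty::Generic(name, params) => {
                format!("{}{}", name, self.format_generic_parameters(params))
            }
        }
    }
}

// Uses every default, so it prints the conventional spelling.
struct AngleBrackets;
impl Formatter for AngleBrackets {}

// A hypothetical language that spells generics as `Foo[Bar, Baz[Zed]]`;
// only the parameter-list method needs to be overridden.
struct SquareBrackets;
impl Formatter for SquareBrackets {
    fn format_generic_parameters(&self, params: &[Ty]) -> String {
        let parts: Vec<String> = params.iter().map(|p| self.format_type(p)).collect();
        format!("[{}]", parts.join(", "))
    }
}

// Builds the type `Foo<Bar, Baz<Zed>>` in the stand-in representation.
fn sample() -> Ty {
    Ty::Generic(
        "Foo",
        vec![Ty::Simple("Bar"), Ty::Generic("Baz", vec![Ty::Simple("Zed")])],
    )
}
```

Because the override is reached through the default `format_type`, nested
parameters pick up the new spelling as well.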

# Typeshare execution flow

This is the detailed flow of how the `Language` trait is actually used by
typeshare. It includes references to all of the methods that are called, and
in what order. For these examples, we're assuming a hypothetical implementation
for Kotlin, which means that there must be `impl Language<'_> for Kotlin`
somewhere.

1. The language's config is loaded from the config file and command line
arguments:

```ignore
let config = Kotlin::Config::deserialize(config_file)?;
```

2. The language is instantiated from that config via `new_from_config`. This is
where the implementation has the opportunity to report any configuration errors
that weren't detected during deserialization.

```ignore
let language = Kotlin::new_from_config(config)?;
```

3. If we're in multi-file mode, we call `output_filename_for_crate` for each Rust
crate being typeshared to determine the _filename_ for the output file that
will contain that crate's types.

```ignore
let files = crate_names
    .iter()
    .map(|crate_name| {
        let filename = language.output_filename_for_crate(crate_name);
        File::create(output_directory.join(filename))
    });
```

4. We call `begin_file` for the output file to print any headers or preamble
appropriate for this language. In multi-file mode, `begin_file` is called once
for each output file; in this case, the `mode` argument will include the crate
name.

```ignore
language.begin_file(&mut file, mode)
```

5. In multi-file mode only, we call `write_imports` with a list of all the
types that are being imported from other typeshare'd crates. This allows the
implementation to emit appropriate import statements for its language.

```ignore
// Only in multi-file mode
language.write_imports(&mut file, crate_name, computed_imports)
```

6. For EACH item being typeshared, we call `write_enum`, `write_struct`,
`write_type_alias`, or `write_const`, as appropriate.

```ignore
language.write_struct(&mut file, parsed_struct);
language.write_enum(&mut file, parsed_enum);
```

6a. In your implementations of these methods, we recommend that you call
`format_type` for the fields of these types. `format_type` will in turn call
`format_simple_type`, `format_generic_type`, or `format_special_type`, as
appropriate; usually it is only necessary for you to implement
`format_special_type` yourself, and use the default implementations for the
others. The `format_*` methods will otherwise never be called by typeshare.
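
The dispatch described in 6a can be sketched with a reduced model. Everything
here is an assumption made for illustration: `Special`, `Ty`, and
`MiniLanguage` are hypothetical stand-ins rather than the real
`SpecialRustType`, `RustType`, and `Language`, and the sketch leaves out
`generic_context`, `mapped_type`, and error handling.

```rust
// Simplified stand-ins for illustration only; NOT the real typeshare types.
enum Special {
    I32,
    Str,
    List(Box<Ty>),
    Optional(Box<Ty>),
}

enum Ty {
    Simple(String),
    Generic(String, Vec<Ty>),
    Special(Special),
}

trait MiniLanguage {
    // The only formatting method an implementor has to provide.
    fn format_special_type(&self, special: &Special) -> String;

    // Default: print the name verbatim.
    fn format_simple_type(&self, name: &str) -> String {
        name.to_string()
    }

    // Default: `name<parameters>`.
    fn format_generic_type(&self, name: &str, params: &[Ty]) -> String {
        let params: Vec<String> = params.iter().map(|p| self.format_type(p)).collect();
        format!("{}<{}>", self.format_simple_type(name), params.join(", "))
    }

    // Dispatches to one of the three methods above based on the kind of type.
    fn format_type(&self, ty: &Ty) -> String {
        match ty {
            Ty::Simple(name) => self.format_simple_type(name),
            Ty::Generic(name, params) => self.format_generic_type(name, params),
            Ty::Special(special) => self.format_special_type(special),
        }
    }
}

struct Kotlin;

impl MiniLanguage for Kotlin {
    fn format_special_type(&self, special: &Special) -> String {
        match special {
            Special::I32 => "Int".to_string(),
            Special::Str => "String".to_string(),
            // Recurse through `format_type` so nested types are dispatched too.
            Special::List(inner) => format!("List<{}>", self.format_type(inner)),
            Special::Optional(inner) => format!("{}?", self.format_type(inner)),
        }
    }
}
```

The point of the design is that a language supplies spellings only for the
builtin types, while user-defined and generic type names fall through to the
defaults.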

6b. If your language doesn't natively support data-containing enums, we
recommend that you call `write_struct_types_for_enum_variants` in your
implementation of `write_enum`; this will call `write_struct` for each
anonymous struct variant of the enum.
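
A minimal sketch of the idea behind 6b, under stated assumptions: `Field`,
`Variant`, `Enum`, and `write_variant_structs` are hypothetical stand-ins
invented for this example, and the Kotlin-like output syntax is only
illustrative. The real helper works on `RustEnum` and delegates to your
`write_struct` rather than printing directly.

```rust
use std::io::Write;

// Simplified stand-ins for illustration only; NOT the real typeshare types.
struct Field {
    name: String,
    ty: String,
}

enum Variant {
    Unit { name: String },
    AnonymousStruct { name: String, fields: Vec<Field> },
}

struct Enum {
    name: String,
    variants: Vec<Variant>,
}

// Emits one named type per anonymous-struct variant and skips every other
// kind of variant. `make_struct_name` decides how variant names become
// type names.
fn write_variant_structs(
    w: &mut impl Write,
    e: &Enum,
    make_struct_name: impl Fn(&str) -> String,
) -> std::io::Result<()> {
    for variant in &e.variants {
        if let Variant::AnonymousStruct { name, fields } = variant {
            writeln!(w, "// Generated from variant `{}` of `{}`", name, e.name)?;
            writeln!(w, "data class {}(", make_struct_name(name))?;
            for field in fields {
                writeln!(w, "    val {}: {},", field.name, field.ty)?;
            }
            writeln!(w, ")")?;
        }
    }
    Ok(())
}
```

A `write_enum` implementation would typically call a helper like this first,
then emit the enum itself in terms of the generated type names.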

7. After all the types are written, we call `end_file`, with the same
arguments that were passed to `begin_file`.

```ignore
language.end_file(&mut file, mode)
```

8. In multi-file mode only, after ALL files are written, we call
`write_additional_files` with the output directory. This gives the language an
opportunity to create any files resembling `mod.rs` or `index.js` as it might
require.

```ignore
// Only in multi-file mode
language.write_additional_files(&output_directory, generated_files.iter())
```
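
To make the ordering concrete, here is a rough sketch of steps 4, 6, and 7
for a single output file. It assumes a heavily reduced, hypothetical trait
(`MiniLanguage`) and a stand-in `Mode` enum in place of `FilesMode`; the
`generate` driver is likewise invented for this example and is not how
typeshare itself is structured.

```rust
use std::io::Write;

// Stand-in for `FilesMode<&CrateName>`; only what this sketch needs.
#[derive(Clone, Copy)]
enum Mode<'a> {
    Single,
    Multi(&'a str),
}

// Hypothetical, reduced trait: a header hook, one writer, a footer hook.
trait MiniLanguage {
    fn begin_file(&self, w: &mut impl Write, mode: Mode<'_>) -> std::io::Result<()>;
    fn write_struct(&self, w: &mut impl Write, name: &str) -> std::io::Result<()>;
    fn end_file(&self, w: &mut impl Write, mode: Mode<'_>) -> std::io::Result<()>;
}

struct Kotlin;

impl MiniLanguage for Kotlin {
    fn begin_file(&self, w: &mut impl Write, mode: Mode<'_>) -> std::io::Result<()> {
        // In multi-file mode the header can mention the originating crate.
        match mode {
            Mode::Single => writeln!(w, "// generated"),
            Mode::Multi(crate_name) => writeln!(w, "// generated from {}", crate_name),
        }
    }

    fn write_struct(&self, w: &mut impl Write, name: &str) -> std::io::Result<()> {
        writeln!(w, "class {}", name)
    }

    fn end_file(&self, w: &mut impl Write, _mode: Mode<'_>) -> std::io::Result<()> {
        writeln!(w, "// end")
    }
}

// Header first, then every item, then the footer with the same `mode`.
fn generate(
    lang: &impl MiniLanguage,
    mode: Mode<'_>,
    structs: &[&str],
) -> std::io::Result<String> {
    let mut out: Vec<u8> = Vec::new();
    lang.begin_file(&mut out, mode)?;
    for name in structs {
        lang.write_struct(&mut out, name)?;
    }
    lang.end_file(&mut out, mode)?;
    Ok(String::from_utf8(out).expect("this sketch only writes UTF-8"))
}
```

Writing into a `Vec<u8>` instead of a file keeps the sketch easy to test; any
`Write` implementor would work in its place.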

NOTE: at this stage, multi-file output is still work-in-progress, as the
algorithms that compute import sets are being rewritten. The API presented
here is stable, but output might be buggy while issues with import detection
are resolved.
*/
pub trait Language<'config>: Sized {
    /**
    The configuration for this language. This configuration will be loaded
    from a config file and, where possible, from the command line, via
    `serde`.

    It is important that this type include `#[serde(default)]` or something
    equivalent, so that a config can be loaded with default settings even
    if this language isn't present in the config file.

    The `serialize` implementation for this type should NOT skip keys, if
    possible.
    */
    type Config: serde::Deserialize<'config> + serde::Serialize;

    /**
    The lowercase conventional name for this language. This should be a
    single identifier. It will be used as a prefix for various things;
    for instance, it will identify this language in the config file, and
    be used as a prefix when generating CLI parameters.
    */
    const NAME: &'static str;

    /// Create an instance of this language from the loaded configuration.
    fn new_from_config(config: Self::Config) -> anyhow::Result<Self>;

    /**
    Most languages provide manual overrides for specific types. When a type
    is formatted with a name that matches a mapped type, the mapped type
    name is formatted instead.

    By default this returns `None` for all types.
    */
    fn mapped_type(
        &self,
        #[expect(unused_variables)] type_name: &TypeName,
    ) -> Option<Cow<'_, str>> {
        None
    }

    /**
    In multi-file mode, typeshare will output one separate file with this
    name for each crate in the input set. These file names should have the
    appropriate naming convention and extension for this language.

    This method isn't used in single-file mode.
    */
    fn output_filename_for_crate(&self, crate_name: &CrateName) -> String;

    /**
    Convert a Rust type into a type from this language. By default this
    calls `format_simple_type`, `format_generic_type`, or
    `format_special_type`, depending on the type. There should only rarely
    be a need to specialize this.

    This method should be called by the `write_*` methods to write the types
    contained by type definitions.

    The `generic_context` is the set of generic types being provided by
    the enclosing type definition; this allows languages that do type
    renaming to be able to distinguish concrete type names (like `int`)
    from generic type names (like `T`).
    */
    fn format_type(&self, ty: &RustType, generic_context: &[TypeName]) -> anyhow::Result<String> {
        match ty {
            RustType::Simple { id } => self.format_simple_type(id, generic_context),
            RustType::Generic { id, parameters } => {
                self.format_generic_type(id, parameters.as_slice(), generic_context)
            }
            RustType::Special(special) => self.format_special_type(special, generic_context),
        }
    }

    /**
    Format a simple type with no generic parameters.

    By default, this first checks `self.mapped_type` to see if there's an
    alternative way this type should be formatted, and otherwise prints the
    `base` verbatim.

    The `generic_context` is the set of generic types being provided by
    the enclosing type definition; this allows languages that do type
    renaming to be able to distinguish concrete type names (like `int`)
    from generic type names (like `T`).
    */
    fn format_simple_type(
        &self,
        base: &TypeName,
        #[expect(unused_variables)] generic_context: &[TypeName],
    ) -> anyhow::Result<String> {
        Ok(match self.mapped_type(base) {
            Some(mapped) => mapped.to_string(),
            None => base.to_string(),
        })
    }

    /**
    Format a generic type that takes in generic arguments, which
    may be recursive.

    By default, this creates a composite type name by concatenating the
    results of `self.format_simple_type` and
    `self.format_generic_parameters`. With their default implementations,
    this will print `name<parameters>`, which is a common syntax used by
    many languages for generics. If `self.mapped_type` returns a mapping
    for `base`, the mapped name is printed on its own, without parameters.

    The `generic_context` is the set of generic types being provided by
    the enclosing type definition; this allows languages that do type
    renaming to be able to distinguish concrete type names (like `int`)
    from generic type names (like `T`).
    */
    fn format_generic_type(
        &self,
        base: &TypeName,
        parameters: &[RustType],
        generic_context: &[TypeName],
    ) -> anyhow::Result<String> {
        match parameters.is_empty() {
            true => self.format_simple_type(base, generic_context),
            false => Ok(match self.mapped_type(base) {
                Some(mapped) => mapped.to_string(),
                None => format!(
                    "{}{}",
                    self.format_simple_type(base, generic_context)?,
                    self.format_generic_parameters(parameters, generic_context)?,
                ),
            }),
        }
    }

    /**
    Format generic parameters into a syntax used by this language. By
    default, this returns `<A, B, C, ...>`, since that's a common syntax
    used by most languages.

    This method is only used when `format_generic_type` calls it.

    The `generic_context` is the set of generic types being provided by
    the enclosing type definition; this allows languages that do type
    renaming to be able to distinguish concrete type names (like `int`)
    from generic type names (like `T`).
    */
    fn format_generic_parameters(
        &self,
        parameters: &[RustType],
        generic_context: &[TypeName],
    ) -> anyhow::Result<String> {
        parameters
            .iter()
            .map(|ty| self.format_type(ty, generic_context))
            .process_results(|mut formatted| format!("<{}>", formatted.join(", ")))
    }

    /**
    Format a special type. This will handle things like arrays, primitives,
    options, and so on. Every language has different spellings for these types,
    so this is one of the key methods that a language implementation needs to
    deal with.
    */
    fn format_special_type(
        &self,
        special_ty: &SpecialRustType,
        generic_context: &[TypeName],
    ) -> anyhow::Result<String>;

    /**
    Write a header for typeshared code. This is called unconditionally
    at the start of the output file (or at the start of all files, if in
    multi-file mode).

    By default this does nothing.
    */
    fn begin_file(
        &self,
        #[expect(unused_variables)] w: &mut impl Write,
        #[expect(unused_variables)] mode: FilesMode<&CrateName>,
    ) -> anyhow::Result<()> {
        Ok(())
    }

    /**
    For generating import statements. This is called only in multi-file
    mode, after `begin_file` and before any other writes.

    `imports` includes an ordered list of type names that typeshare
    believes are being imported by this file, grouped by the crates they
    come from. `typeshare` guarantees that these will be passed in some stable
    order, so that your output remains consistent.

    NOTE: Currently this is bugged and doesn't receive correct imports.
    This will be fixed in a future release.
    */
    fn write_imports<'a, Crates, Types>(
        &self,
        writer: &mut impl Write,
        crate_name: &CrateName,
        imports: Crates,
    ) -> anyhow::Result<()>
    where
        Crates: IntoIterator<Item = (&'a CrateName, Types)>,
        Types: IntoIterator<Item = &'a TypeName>;

    /**
    Write a footer for typeshared code. This is called unconditionally
    at the end of the output file (or at the end of all files, if in
    multi-file mode).

    By default this does nothing.
    */
    fn end_file(
        &self,
        #[expect(unused_variables)] w: &mut impl Write,
        #[expect(unused_variables)] mode: FilesMode<&CrateName>,
    ) -> anyhow::Result<()> {
        Ok(())
    }

    /**
    Write a type alias definition.

    Example of a type alias:
    ```
    type MyTypeAlias = String;
    ```

    Generally this method will call `self.format_type` to produce the
    aliased type name in the output definition.
    */
    fn write_type_alias(&self, w: &mut impl Write, t: &RustTypeAlias) -> anyhow::Result<()>;

    /**
    Write a struct definition.

    Example of a struct:
    ```ignore
    #[typeshare]
    #[derive(Serialize, Deserialize)]
    struct Foo {
        bar: String
    }
    ```

    Generally this method will call `self.format_type` to produce the types
    of the individual fields.
    */
    fn write_struct(&self, w: &mut impl Write, rs: &RustStruct) -> anyhow::Result<()>;

    /**
    Write an enum definition.

    Example of an enum:
    ```ignore
    #[typeshare]
    #[derive(Serialize, Deserialize)]
    #[serde(tag = "type", content = "content")]
    enum Foo {
        Fizz,
        Buzz { yep_this_works: bool }
    }
    ```

    Generally this will call `self.format_type` to produce the types of
    the individual fields. If this enum is an algebraic sum type, and this
    language doesn't really support those, the implementation should
    consider calling `write_struct_types_for_enum_variants` to produce
    struct types matching those variants, which can be used for this
    language's abstraction for data like this.
    */
    fn write_enum(&self, w: &mut impl Write, e: &RustEnum) -> anyhow::Result<()>;

    /**
    Write a constant definition.

    Example of a constant:
    ```
    const ANSWER_TO_EVERYTHING: u32 = 42;
    ```

    If necessary, this will generally call `self.format_type` to produce
    the type of this constant (some languages allow the type to be
    omitted).
    */
    fn write_const(&self, w: &mut impl Write, c: &RustConst) -> anyhow::Result<()>;

    /**
    Write out named types to represent anonymous struct enum variants.

    Take the following enum as an example:

    ```
    enum AlgebraicEnum {
        AnonymousStruct {
            field: String,
            another_field: bool,
        },

        Variant2 {
            field: i32,
        }
    }
    ```

    This function will write out a pair of struct types resembling:

    ```compile_fail
    struct AnonymousStruct {
        field: String,
        another_field: bool,
    }

    struct Variant2 {
        field: i32,
    }
    ```

    Except that it will use `make_struct_name` to compute the names of these
    structs based on the names of the variants.

    This method isn't called by default; it is instead provided as a helper
    for your implementation of `write_enum`, since many languages don't have
    a specific notion of an algebraic sum type, and have to emulate it with
    subclasses, tagged unions, or something similar.
    */
    fn write_struct_types_for_enum_variants(
        &self,
        w: &mut impl Write,
        e: &RustEnum,
        make_struct_name: &impl Fn(&TypeName) -> String,
    ) -> anyhow::Result<()> {
        let variants = match e {
            RustEnum::Unit { .. } => return Ok(()),
            RustEnum::Algebraic { variants, .. } => variants.iter().filter_map(|v| match v {
                RustEnumVariant::AnonymousStruct { fields, shared } => Some((fields, shared)),
                _ => None,
            }),
        };

        for (fields, variant) in variants {
            let struct_name = make_struct_name(&variant.id.original);

            // Builds the list of generic types (e.g [T, U, V]), by digging
            // through the fields recursively and comparing against the
            // enclosing enum's list of generic parameters.
            let generic_types = fields
                .iter()
                .flat_map(|field| {
                    e.shared()
                        .generic_types
                        .iter()
                        .filter(|g| field.ty.contains_type(g))
                })
                .unique()
                .cloned()
                .collect();

            self.write_struct(
                w,
                &RustStruct {
                    id: Id {
                        original: TypeName::new_string(struct_name.clone()),
                        renamed: TypeName::new_string(struct_name),
                    },
                    fields: fields.clone(),
                    generic_types,
                    comments: vec![format!(
                        "Generated type representing the anonymous struct \
                        variant `{}` of the `{}` Rust enum",
                        &variant.id.original,
                        &e.shared().id.original,
                    )],
                    decorators: e.shared().decorators.clone(),
                },
            )
            .with_context(|| {
                format!(
                    "failed to write struct type for the \
                    `{}` variant of the `{}` enum",
                    variant.id.original,
                    e.shared().id.original
                )
            })?;
        }

        Ok(())
    }

    /**
    If a type with this name appears in a type definition, it will be
    unconditionally excluded from cross-file import analysis. Usually this will
    be the types handled by `mapped_type`, since those are types with special
    behavior (for instance, a datetime type provided as a standard type by your
    language).

    This is mostly a performance optimization. By default it returns `false`
    for all types.
    */
    fn exclude_from_import_analysis(&self, #[expect(unused_variables)] name: &TypeName) -> bool {
        false
    }

    /**
    In multi-file mode, this method is called after all of the individual
    typeshare files are completely generated. Use it to generate any
    additional files your language might need in this directory to
    function correctly, such as a `mod.rs`, `__init__.py`, `index.js`, or
    anything else like that.

    It is passed a list of crate names, one for each crate that was
    typeshared, and the associated file paths, indicating all of the files
    that were generated by typeshare.

    By default, this does nothing.
    */
    fn write_additional_files<'a>(
        &self,
        #[expect(unused_variables)] output_folder: &Path,
        #[expect(unused_variables)] output_files: impl IntoIterator<Item = (&'a CrateName, &'a Path)>,
    ) -> anyhow::Result<()> {
        Ok(())
    }
}