typeshare_model/language.rs
1use std::{borrow::Cow, io::Write, path::Path};
2
3use anyhow::Context;
4use itertools::Itertools;
5
6use crate::parsed_data::{
7 CrateName, Id, RustConst, RustEnum, RustEnumVariant, RustStruct, RustType, RustTypeAlias,
8 SpecialRustType, TypeName,
9};
10
11/// If we're in multifile mode, this enum contains the crate name for the
12/// specific file
13#[derive(Debug, Clone, Copy)]
14pub enum FilesMode<T> {
15 Single,
16 Multi(T),
17}
18
19impl<T> FilesMode<T> {
20 pub fn map<U>(self, op: impl FnOnce(T) -> U) -> FilesMode<U> {
21 match self {
22 FilesMode::Single => FilesMode::Single,
23 FilesMode::Multi(value) => FilesMode::Multi(op(value)),
24 }
25 }
26
27 pub fn is_multi(&self) -> bool {
28 matches!(*self, Self::Multi(_))
29 }
30}
31
32/**
33*The* trait you need to implement in order to have your own implementation of
34typeshare. The whole world revolves around this trait.
35
36In general, to implement this correctly, you *must* implement:
37
38- `new_from_config`, which instantiates your `Language` struct from
39 configuration which was read from a config file or the command line
40- `output_filename_for_crate`, which (in multi-file mode) produces a file
41 name from a crate name. All of the typeshared types from that crate will
42 be written to that file.
43- The `write_*` methods, which output the actual type definitions. These
44 methods *should* call `format_type` to format the actual types contained
45 in the type definitions, which will in turn dispatch to the relevant
46 `format_*` method, depending on what kind of type it is.
47- The `format_special_type` method, which outputs things like integer types,
48 arrays, and other builtin or primitive types. This method is only ever called
49 by `format_type`, which is only called if you choose to call it in your
50 `write_*` implementations.
51
52Additionally, you must provide a `Config` associated type, which must implement
53`Serialize + Deserialize`. This type will be used to load configuration from
54a config file and from the command line arguments for your language, which will
55be passed to `new_from_config`. This type should provide defaults for *all* of
56its fields; it should always tolerate being loaded from an empty config file.
57When *serializing*, this type should always output all of its fields, even if
58they're defaulted.
59
60It's also very common to implement:
61
62- `mapped_type`, to define certain types as having specialized handling in your
63 language.
64- `begin_file`, `end_file`, and `write_additional_files`, to add additional
65 per-file or per-directory content to your output.
66
67If your language spells type names in an unusual way (here, "usual" means the
68C++-descended convention, where a type might be spelled `Foo<Bar, Baz<Zed>>`),
69you'll want to implement the `format_*` methods.
70
71Other methods can be specialized as needed.
72
73# Typeshare execution flow.
74
75This is the detailed flow of how the `Language` trait is actually used by
76typeshare. It includes references to all of the methods that are called, and
77in what order. For these examples, we're assuming a hypothetical implementation
78for Kotlin, which means that there must be `impl Language<'_> for Kotlin`
79somewhere.
80
811. The language's config is loaded from the config file and command line
82arguments:
83
84```ignore
85let config = Kotlin::Config::deserialize(config_file)?;
86```
87
882. The language is instantiated from the loaded config via `new_from_config`. This is
89where the implementation has the opportunity to report any configuration errors
90that weren't detected during deserialization.
91
92```ignore
93let language = Kotlin::new_from_config(config)?;
94```
95
963. If we're in multi-file mode, we call `output_filename_for_crate` for each Rust
97crate being typeshared to determine the _filename_ for the output file that
98will contain that crate's types.
99
100```ignore
101let files = crate_names
102 .iter()
103 .map(|crate_name| {
104 let filename = language.output_filename_for_crate(crate_name);
105 File::create(output_directory.join(filename))
106 });
108```
109
1104. We call `begin_file` on the output file to print any headers or preamble
111appropriate for this language. In multi-file mode, `begin_file` is called once
112for each output file; in this case, the `mode` argument will include the crate
113name.
114
115```ignore
116language.begin_file(&mut file, mode)
117```
118
1195. In multi-file mode only, we call `write_imports` with a list of all the
120types that are being imported from other typeshare'd crates. This allows the
121language to emit appropriate import statements for its own language.
122
123```ignore
124// Only in multi-file mode
125language.write_imports(&mut file, crate_name, computed_imports)
126```
127
1286. For EACH item being typeshared, we call `write_enum`,
129`write_struct`, `write_type_alias`, or `write_const`, as appropriate.
130
131```ignore
132language.write_struct(&mut file, parsed_struct);
133language.write_enum(&mut file, parsed_enum);
134```
135
1366a. In your implementations of these methods, we recommend that you call
137`format_type` for the fields of these types. `format_type` will in turn call
138`format_simple_type`, `format_generic_type`, or `format_special_type`, as
139appropriate; usually it is only necessary for you to implement
140`format_special_type` yourself, and use the default implementations for the
141others. The `format_*` methods will otherwise never be called by typeshare.
142
1436b. If your language doesn't natively support data-containing enums, we
144recommend that you call `write_struct_types_for_enum_variants` in your
145implementation of `write_enum`; this will call `write_struct` for each
146anonymous struct variant of the enum.
147
1487. After all the types are written, we call `end_file`, with the same
149arguments that were passed to `begin_file`.
150
151```ignore
152language.end_file(&mut file, mode)
153```
154
1558. In multi-file mode only, after ALL files are written, we call
156`write_additional_files` with the output directory. This gives the language an
157opportunity to create any files resembling `mod.rs` or `index.js` as it might
158require.
159
160```ignore
161// Only in multi-file mode
162language.write_additional_files(&output_directory, generated_files.iter())
163```
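
The steps above can be sketched end to end as follows. This is a simplified,
std-only model rather than the real driver: `MiniLanguage`, this `Kotlin`
struct, and `run_single_file` are hypothetical stand-ins, errors use
`std::io::Result` instead of `anyhow`, and only struct names are written.

```rust
use std::io::{self, Write};

// Hypothetical, heavily reduced stand-in for the `Language` trait.
trait MiniLanguage {
    fn begin_file(&self, w: &mut impl Write) -> io::Result<()>;
    fn write_struct(&self, w: &mut impl Write, name: &str) -> io::Result<()>;
    fn end_file(&self, w: &mut impl Write) -> io::Result<()>;
}

struct Kotlin;

impl MiniLanguage for Kotlin {
    fn begin_file(&self, w: &mut impl Write) -> io::Result<()> {
        writeln!(w, "// Generated by typeshare")
    }

    fn write_struct(&self, w: &mut impl Write, name: &str) -> io::Result<()> {
        writeln!(w, "class {name}")
    }

    fn end_file(&self, _w: &mut impl Write) -> io::Result<()> {
        Ok(())
    }
}

// Mirrors steps 4, 6, and 7 for a single in-memory output file.
fn run_single_file(lang: &impl MiniLanguage, structs: &[&str]) -> io::Result<String> {
    let mut out: Vec<u8> = Vec::new();
    lang.begin_file(&mut out)?;
    for name in structs {
        lang.write_struct(&mut out, name)?;
    }
    lang.end_file(&mut out)?;
    Ok(String::from_utf8(out).expect("only UTF-8 text was written"))
}
```

Writing into a `Vec<u8>` instead of a `File` is a convenient way to test a
language implementation, since every method only requires `impl Write`.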
164
165NOTE: at this stage, multi-file output is still work-in-progress, as the
166algorithms that compute import sets are being rewritten. The API presented
167here is stable, but output might be buggy while issues with import detection
168are resolved.
169*/
170pub trait Language<'config>: Sized {
171 /**
172 The configuration for this language. This configuration will be loaded
173 from a config file and, where possible, from the command line, via
174 `serde`.
175
176 It is important that this type include `#[serde(default)]` or something
177 equivalent, so that a config can be loaded with default settings even
178 if this language isn't present in the config file.
179
180 The `Serialize` implementation for this type should NOT skip keys, if
181 possible.
182 */
183 type Config: serde::Deserialize<'config> + serde::Serialize;
184
185 /**
186 The lowercase conventional name for this language. This should be a
187 single identifier. It will be used as a prefix for various things;
188 for instance, it will identify this language in the config file, and
189 be used as a prefix when generating CLI parameters.
190 */
191 const NAME: &'static str;
192
193 /// Create an instance of this language from the loaded configuration.
194 fn new_from_config(config: Self::Config) -> anyhow::Result<Self>;
195
196 /**
197 Most languages provide manual overrides for specific types. When a type
198 is formatted with a name that matches a mapped type, the mapped type
199 name is formatted instead.
200
201 By default this returns `None` for all types.
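
A minimal sketch of an override, assuming the language keeps its overrides in
a map loaded from configuration. The `type_mappings` field, the
`example_language` helper, and the `DateTime` entry are hypothetical, and the
sketch takes `&str` where the trait takes `&TypeName` so that it stays
self-contained.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

// Hypothetical language struct holding user-configured type overrides.
struct Kotlin {
    type_mappings: HashMap<String, String>,
}

impl Kotlin {
    // Same shape as `mapped_type`: borrow the mapped name if one exists.
    fn mapped_type(&self, type_name: &str) -> Option<Cow<'_, str>> {
        self.type_mappings
            .get(type_name)
            .map(|mapped| Cow::Borrowed(mapped.as_str()))
    }
}

// Builds a language with a single illustrative override.
fn example_language() -> Kotlin {
    let mut type_mappings = HashMap::new();
    type_mappings.insert("DateTime".to_owned(), "java.time.Instant".to_owned());
    Kotlin { type_mappings }
}
```

Returning `Cow::Borrowed` avoids allocating on every lookup; `Cow::Owned`
remains available if the mapped name has to be computed.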
202 */
203 fn mapped_type(
204 &self,
205 #[expect(unused_variables)] type_name: &TypeName,
206 ) -> Option<Cow<'_, str>> {
207 None
208 }
209
210 /**
211 In multi-file mode, typeshare will output one separate file with this
212 name for each crate in the input set. These file names should have the
213 appropriate naming convention and extension for this language.
214
215 This method isn't used in single-file mode.
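
As an illustration only, a language whose files are conventionally named in
PascalCase might derive the file name as below. The `.kt` extension and the
PascalCase rule are assumptions, and the sketch takes `&str` where the trait
takes `&CrateName`.

```rust
// Turns a crate name such as `my_shared_types` into `MySharedTypes.kt`.
fn output_filename_for_crate(crate_name: &str) -> String {
    let pascal: String = crate_name
        .split(|c: char| c == '_' || c == '-')
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            match chars.next() {
                // Uppercase the first character and keep the rest unchanged.
                Some(first) => first.to_uppercase().chain(chars).collect::<String>(),
                None => String::new(),
            }
        })
        .collect();
    format!("{pascal}.kt")
}
```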
216 */
217 fn output_filename_for_crate(&self, crate_name: &CrateName) -> String;
218
219 /**
220 Convert a Rust type into a type from this language. By default this
221 calls `format_simple_type`, `format_generic_type`, or
222 `format_special_type`, depending on the type. There should only rarely
223 be a need to specialize this.
224
225 This method should be called by the `write_*` methods to write the types
226 contained by type definitions.
227
228 The `generic_context` is the set of generic types being provided by
229 the enclosing type definition; this allows languages that do type
230 renaming to be able to distinguish concrete type names (like `int`)
231 from generic type names (like `T`)
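
The role of `generic_context` can be sketched with a reduced model. Here `Ty`
is a hypothetical stand-in for `RustType`, plain `&str` replaces `TypeName`,
and the `Shared` prefix is an invented renaming rule; the point is only that
a renaming language must leave generic parameters such as `T` untouched.

```rust
// Hypothetical, reduced model of `RustType`.
enum Ty {
    Simple(String),
    Generic(String, Vec<Ty>),
    // Stands in for `RustType::Special`; only a list type is modelled.
    List(Box<Ty>),
}

// Prefixes every concrete type name, but leaves names found in
// `generic_context` alone.
fn format_simple_type(name: &str, generic_context: &[&str]) -> String {
    if generic_context.contains(&name) {
        name.to_owned()
    } else {
        format!("Shared{name}")
    }
}

// Dispatches on the kind of type, in the same spirit as the default
// `format_type`.
fn format_type(ty: &Ty, generic_context: &[&str]) -> String {
    match ty {
        Ty::Simple(name) => format_simple_type(name, generic_context),
        Ty::Generic(name, parameters) => {
            let parameters: Vec<String> = parameters
                .iter()
                .map(|p| format_type(p, generic_context))
                .collect();
            format!(
                "{}<{}>",
                format_simple_type(name, generic_context),
                parameters.join(", ")
            )
        }
        Ty::List(element) => format!("List<{}>", format_type(element, generic_context)),
    }
}
```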
232 */
233 fn format_type(&self, ty: &RustType, generic_context: &[TypeName]) -> anyhow::Result<String> {
234 match ty {
235 RustType::Simple { id } => self.format_simple_type(id, generic_context),
236 RustType::Generic { id, parameters } => {
237 self.format_generic_type(id, parameters.as_slice(), generic_context)
238 }
239 RustType::Special(special) => self.format_special_type(special, generic_context),
240 }
241 }
242
243 /**
244 Format a simple type with no generic parameters.
245
246 By default, this first checks `self.mapped_type` to see if there's an
247 alternative way this type should be formatted, and otherwise prints the
248 `base` verbatim.
249
250 The `generic_context` is the set of generic types being provided by
251 the enclosing type definition; this allows languages that do type
252 renaming to be able to distinguish concrete type names (like `int`)
253 from generic type names (like `T`).
254 */
255 fn format_simple_type(
256 &self,
257 base: &TypeName,
258 #[expect(unused_variables)] generic_context: &[TypeName],
259 ) -> anyhow::Result<String> {
260 Ok(match self.mapped_type(base) {
261 Some(mapped) => mapped.to_string(),
262 None => base.to_string(),
263 })
264 }
265
266 /**
267 Format a generic type that takes in generic arguments, which
268 may be recursive.
269
270 By default, this creates a composite type name by concatenating the results
271 of `self.format_simple_type` and `self.format_generic_parameters`. With
272 their default implementations, this will print `name<parameters>`, a common
273 generics syntax. If `mapped_type` maps `base`, the mapped name is returned as-is, without parameters.
274
275 The `generic_context` is the set of generic types being provided by
276 the enclosing type definition; this allows languages that do type
277 renaming to be able to distinguish concrete type names (like `int`)
278 from generic type names (like `T`).
279 */
280 fn format_generic_type(
281 &self,
282 base: &TypeName,
283 parameters: &[RustType],
284 generic_context: &[TypeName],
285 ) -> anyhow::Result<String> {
286 match parameters.is_empty() {
287 true => self.format_simple_type(base, generic_context),
288 false => Ok(match self.mapped_type(base) {
289 Some(mapped) => mapped.to_string(),
290 None => format!(
291 "{}{}",
292 self.format_simple_type(base, generic_context)?,
293 self.format_generic_parameters(parameters, generic_context)?,
294 ),
295 }),
296 }
297 }
298
299 /**
300 Format generic parameters into a syntax used by this language. By
301 default, this returns `<A, B, C, ...>`, since that's a common syntax
302 used by most languages.
303
304 This method is only used when `format_generic_type` calls it.
305
306 The `generic_context` is the set of generic types being provided by
307 the enclosing type definition; this allows languages that do type
308 renaming to be able to distinguish concrete type names (like `int`)
309 from generic type names (like `T`).
310 */
311 fn format_generic_parameters(
312 &self,
313 parameters: &[RustType],
314 generic_context: &[TypeName],
315 ) -> anyhow::Result<String> {
316 parameters
317 .iter()
318 .map(|ty| self.format_type(ty, generic_context))
319 .process_results(|mut formatted| format!("<{}>", formatted.join(", ")))
320 }
321
322 /**
323 Format a special type. This will handle things like arrays, primitives,
324 options, and so on. Every language has different spellings for these types,
325 so this is one of the key methods that a language implementation needs to
326 deal with.
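
A rough sketch of what an implementation tends to look like, using
Kotlin-flavoured spellings. `Special` is a hypothetical subset of
`SpecialRustType`: the real enum has more variants, and its container variants
hold a full `RustType` that would be formatted through `format_type` rather
than by direct recursion.

```rust
// Hypothetical subset of `SpecialRustType`.
enum Special {
    Vec(Box<Special>),
    Option(Box<Special>),
    String,
    Bool,
    I32,
    F64,
}

// Maps each special type to the spelling assumed for the target language.
fn format_special_type(special: &Special) -> String {
    match special {
        Special::Vec(inner) => format!("List<{}>", format_special_type(inner)),
        Special::Option(inner) => format!("{}?", format_special_type(inner)),
        Special::String => "String".to_owned(),
        Special::Bool => "Boolean".to_owned(),
        Special::I32 => "Int".to_owned(),
        Special::F64 => "Double".to_owned(),
    }
}
```

Matching exhaustively, without a wildcard arm, means the compiler flags every
implementation when a new special type is added.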
327 */
328 fn format_special_type(
329 &self,
330 special_ty: &SpecialRustType,
331 generic_context: &[TypeName],
332 ) -> anyhow::Result<String>;
333
334 /**
335 Write a header for typeshared code. This is called unconditionally
336 at the start of the output file (or at the start of each file, if in
337 multi-file mode).
338
339 By default this does nothing.
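
A minimal sketch of a header that depends on `mode`. The `FilesMode` enum is
copied locally so the example stands alone, `&str` replaces `&CrateName`, and
the `com.example` package names are invented for illustration.

```rust
use std::io::{self, Write};

// Local copy of the `FilesMode` enum defined in this module.
enum FilesMode<T> {
    Single,
    Multi(T),
}

// Writes a banner, then a package line that names the crate in multi-file mode.
fn begin_file(w: &mut impl Write, mode: FilesMode<&str>) -> io::Result<()> {
    writeln!(w, "// Generated by typeshare. Do not edit.")?;
    match mode {
        FilesMode::Single => writeln!(w, "package com.example.shared"),
        FilesMode::Multi(crate_name) => writeln!(w, "package com.example.{crate_name}"),
    }
}
```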
340 */
341 fn begin_file(
342 &self,
343 #[expect(unused_variables)] w: &mut impl Write,
344 #[expect(unused_variables)] mode: FilesMode<&CrateName>,
345 ) -> anyhow::Result<()> {
346 Ok(())
347 }
348
349 /**
350 For generating import statements. This is called only in multi-file
351 mode, after `begin_file` and before any other writes.
352
353 `imports` includes an ordered list of type names that typeshare
354 believes are being imported by this file, grouped by the crates they
355 come from. `typeshare` guarantees that these will be passed in some stable
356 order, so that your output remains consistent.
357
358 NOTE: Currently this is bugged and doesn't receive correct imports.
359 This will be fixed in a future release.
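
The nested iterator shape of `imports` can be consumed with two loops. The
sketch below assumes one import statement per type and an invented
`com.example` package prefix; it uses `&str` in place of `CrateName` and
`TypeName`, and omits the `crate_name` argument.

```rust
use std::io::{self, Write};

// Emits one import line per (crate, type) pair, in the order received.
fn write_imports<'a, Crates, Types>(w: &mut impl Write, imports: Crates) -> io::Result<()>
where
    Crates: IntoIterator<Item = (&'a str, Types)>,
    Types: IntoIterator<Item = &'a str>,
{
    for (crate_name, types) in imports {
        for type_name in types {
            writeln!(w, "import com.example.{crate_name}.{type_name}")?;
        }
    }
    Ok(())
}
```

Because the input order is stable, writing lines in iteration order is enough
to keep the generated imports deterministic.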
360 */
361 fn write_imports<'a, Crates, Types>(
362 &self,
363 writer: &mut impl Write,
364 crate_name: &CrateName,
365 imports: Crates,
366 ) -> anyhow::Result<()>
367 where
368 Crates: IntoIterator<Item = (&'a CrateName, Types)>,
369 Types: IntoIterator<Item = &'a TypeName>;
370
371 /**
372 Write a footer for typeshared code. This is called unconditionally
373 at the end of the output file (or at the end of each file, if in
374 multi-file mode).
375
376 By default this does nothing.
377 */
378 fn end_file(
379 &self,
380 #[expect(unused_variables)] w: &mut impl Write,
381 #[expect(unused_variables)] mode: FilesMode<&CrateName>,
382 ) -> anyhow::Result<()> {
383 Ok(())
384 }
385
386 /**
387 Write a type alias definition.
388
389 Example of a type alias:
390 ```
391 type MyTypeAlias = String;
392 ```
393
394 Generally this method will call `self.format_type` to produce the
395 aliased type name in the output definition.
396 */
397 fn write_type_alias(&self, w: &mut impl Write, t: &RustTypeAlias) -> anyhow::Result<()>;
398
399 /**
400 Write a struct definition.
401
402 Example of a struct:
403 ```ignore
404 #[typeshare]
405 #[derive(Serialize, Deserialize)]
406 struct Foo {
407 bar: String
408 }
409 ```
410
411 Generally this method will call `self.format_type` to produce the types
412 of the individual fields.
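
A sketch of the writing side, assuming the field types have already been
formatted (the job of `format_type`). `Field` and `Struct` are hypothetical,
reduced stand-ins for the real parsed data, and the Kotlin `data class` layout
is only one possible output.

```rust
use std::io::{self, Write};

// Hypothetical stand-ins for `RustStruct` and its fields.
struct Field {
    name: String,
    formatted_ty: String,
}

struct Struct {
    name: String,
    fields: Vec<Field>,
}

// Writes one constructor parameter per field.
fn write_struct(w: &mut impl Write, rs: &Struct) -> io::Result<()> {
    writeln!(w, "data class {} (", rs.name)?;
    for field in &rs.fields {
        writeln!(w, "\tval {}: {},", field.name, field.formatted_ty)?;
    }
    writeln!(w, ")")
}
```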
413 */
414 fn write_struct(&self, w: &mut impl Write, rs: &RustStruct) -> anyhow::Result<()>;
415
416 /**
417 Write an enum definition.
418
419 Example of an enum:
420 ```ignore
421 #[typeshare]
422 #[derive(Serialize, Deserialize)]
423 #[serde(tag = "type", content = "content")]
424 enum Foo {
425 Fizz,
426 Buzz { yep_this_works: bool }
427 }
428 ```
429
430 Generally this will call `self.format_type` to produce the types of
431 the individual fields. If this enum is an algebraic sum type, and this
432 language doesn't really support those, it should consider calling
433 `write_struct_types_for_enum_variants` to produce struct types matching
434 those variants, which can be used for this language's abstraction for
435 data like this.
436 */
437 fn write_enum(&self, w: &mut impl Write, e: &RustEnum) -> anyhow::Result<()>;
438
439 /**
440 Write a constant variable.
441
442 Example of a constant variable:
443 ```
444 const ANSWER_TO_EVERYTHING: u32 = 42;
445 ```
446
447 Where the target language requires it, this will generally call
448 `self.format_type` to produce the type of this constant (some languages
449 allow it to be omitted).
450 */
451 fn write_const(&self, w: &mut impl Write, c: &RustConst) -> anyhow::Result<()>;
452
453 /**
454 Write out named types to represent anonymous struct enum variants.
455
456 Take the following enum as an example:
457
458 ```
459 enum AlgebraicEnum {
460 AnonymousStruct {
461 field: String,
462 another_field: bool,
463 },
464
465 Variant2 {
466 field: i32,
467 }
468 }
469 ```
470
471 This function will write out a pair of struct types resembling:
472
473 ```compile_fail
474 struct AnonymousStruct {
475 field: String,
476 another_field: bool,
477 }
478
479 struct Variant2 {
480 field: i32,
481 }
482 ```
483
484 Except that it will use `make_struct_name` to compute the names of these
485 structs based on the names of the variants.
486
487 This method isn't called by default; it is instead provided as a helper
488 for your implementation of `write_enum`, since many languages don't have
489 a specific notion of an algebraic sum type, and have to emulate it with
490 subclasses, tagged unions, or something similar.
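
Each generated struct only receives the generic parameters that its own
fields use. That selection can be sketched as below; `Ty` and `generics_used`
are hypothetical, and a manual `contains` check replaces the `unique` adapter
from `itertools` so that the sketch needs only the standard library.

```rust
// Hypothetical reduced field type: a name plus nested type parameters.
struct Ty {
    name: String,
    parameters: Vec<Ty>,
}

impl Ty {
    // True if `generic` appears anywhere inside this type.
    fn contains_type(&self, generic: &str) -> bool {
        self.name == generic || self.parameters.iter().any(|p| p.contains_type(generic))
    }
}

// Collects the enum's generic parameters that the variant's fields use,
// keeping first-seen order and dropping duplicates.
fn generics_used<'a>(fields: &[Ty], enum_generics: &[&'a str]) -> Vec<&'a str> {
    let mut used = Vec::new();
    for field in fields {
        for generic in enum_generics {
            if field.contains_type(generic) && !used.contains(generic) {
                used.push(*generic);
            }
        }
    }
    used
}
```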
491 */
492 fn write_struct_types_for_enum_variants(
493 &self,
494 w: &mut impl Write,
495 e: &RustEnum,
496 make_struct_name: &impl Fn(&TypeName) -> String,
497 ) -> anyhow::Result<()> {
498 let variants = match e {
499 RustEnum::Unit { .. } => return Ok(()),
500 RustEnum::Algebraic { variants, .. } => variants.iter().filter_map(|v| match v {
501 RustEnumVariant::AnonymousStruct { fields, shared } => Some((fields, shared)),
502 _ => None,
503 }),
504 };
505
506 for (fields, variant) in variants {
507 let struct_name = make_struct_name(&variant.id.original);
508
509 // Builds the list of generic types (e.g. [T, U, V]) by digging
510 // through the fields recursively and comparing against the
511 // enclosing enum's list of generic parameters.
512 let generic_types = fields
513 .iter()
514 .flat_map(|field| {
515 e.shared()
516 .generic_types
517 .iter()
518 .filter(|g| field.ty.contains_type(g))
519 })
520 .unique()
521 .cloned()
522 .collect();
523
524 self.write_struct(
525 w,
526 &RustStruct {
527 id: Id {
528 original: TypeName::new_string(struct_name.clone()),
529 renamed: TypeName::new_string(struct_name),
530 },
531 fields: fields.clone(),
532 generic_types,
533 comments: vec![format!(
534 "Generated type representing the anonymous struct \
535 variant `{}` of the `{}` Rust enum",
536 &variant.id.original,
537 &e.shared().id.original,
538 )],
539 decorators: e.shared().decorators.clone(),
540 },
541 )
542 .with_context(|| {
543 format!(
544 "failed to write struct type for the \
545 `{}` variant of the `{}` enum",
546 variant.id.original,
547 e.shared().id.original
548 )
549 })?;
550 }
551
552 Ok(())
553 }
554
555 /**
556 If a type with this name appears in a type definition, it will be
557 unconditionally excluded from cross-file import analysis. Usually this will
558 be the types handled by `mapped_type`, since those are types with special
559 behavior (for instance, a datetime type provided as a standard type by your
560 language).
561
562 This is mostly a performance optimization. By default it returns `false`
563 for all types.
564 */
565 fn exclude_from_import_analysis(&self, #[expect(unused_variables)] name: &TypeName) -> bool {
566 false
567 }
568
569 /**
570 In multi-file mode, this method is called after all of the individual
571 typeshare files are completely generated. Use it to generate any
572 additional files your language might need in this directory to
573 function correctly, such as a `mod.rs`, `__init__.py`, `index.js`, or
574 anything else like that.
575
576 It is passed a list of crate names, one for each crate that was typeshared, and
577 the associated file paths, indicating all of the files that were generated
578 by typeshare.
579
580 By default, this does nothing.
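
One possible use is an index file that re-exports every generated module. The
sketch below only builds the text of a hypothetical `index.js`; it returns a
`String` instead of touching the filesystem, takes `&str` in place of
`&CrateName`, and assumes crate names are valid identifiers in the target
language.

```rust
use std::path::Path;

// Builds one re-export line per generated file. A real implementation would
// write the result into `output_folder`.
fn index_contents<'a>(
    output_files: impl IntoIterator<Item = (&'a str, &'a Path)>,
) -> String {
    let mut contents = String::new();
    for (crate_name, path) in output_files {
        // Skip paths without a UTF-8 file name rather than failing.
        if let Some(file_name) = path.file_name().and_then(|name| name.to_str()) {
            contents.push_str(&format!(
                "export * as {crate_name} from \"./{file_name}\";\n"
            ));
        }
    }
    contents
}
```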
581 */
582 fn write_additional_files<'a>(
583 &self,
584 #[expect(unused_variables)] output_folder: &Path,
585 #[expect(unused_variables)] output_files: impl IntoIterator<Item = (&'a CrateName, &'a Path)>,
586 ) -> anyhow::Result<()> {
587 Ok(())
588 }
589}