// typeshare_model/language.rs
use std::{borrow::Cow, fmt::Debug, io::Write, path::Path};

use anyhow::Context;
use itertools::Itertools;

use crate::parsed_data::{
    CrateName, Id, RustConst, RustEnum, RustEnumVariant, RustStruct, RustType, RustTypeAlias,
    SpecialRustType, TypeName,
};

/// If we're in multi-file mode, this enum contains the crate name for the
/// specific file.
#[derive(Debug, Clone, Copy)]
#[non_exhaustive]
pub enum FilesMode<T> {
    Single,
    Multi(T),
    // We've had requests for Java support, which means we'll need a
    // 1-file-per-type mode
}

impl<T> FilesMode<T> {
    pub fn map<U>(self, op: impl FnOnce(T) -> U) -> FilesMode<U> {
        match self {
            FilesMode::Single => FilesMode::Single,
            FilesMode::Multi(value) => FilesMode::Multi(op(value)),
        }
    }

    pub fn is_multi(&self) -> bool {
        matches!(*self, Self::Multi(_))
    }
}

/**
*The* trait you need to implement in order to have your own implementation of
typeshare. The whole world revolves around this trait.

In general, to implement this correctly, you *must* implement:

- `new_from_config`, which instantiates your `Language` struct from
  configuration which was read from a config file or the command line
- `output_filename_for_crate`, which (in multi-file mode) produces a file
  name from a crate name. All of the typeshared types from that crate will
  be written to that file.
- The `write_*` methods, which output the actual type definitions. These
  methods *should* call `format_type` to format the actual types contained
  in the type definitions, which will in turn dispatch to the relevant
  `format_*` method, depending on what kind of type it is.
- The `format_special_type` method, which outputs things like integer types,
  arrays, and other builtin or primitive types. This method is only ever called
  by `format_type`, which is only called if you choose to call it in your
  `write_*` implementations.

Additionally, you must provide a `Config` associated type, which must implement
`Serialize + Deserialize`. This type will be used to load configuration from
a config file and from the command line arguments for your language, which will
be passed to `new_from_config`. This type should provide defaults for *all* of
its fields; it should always tolerate being loaded from an empty config file.
When *serializing*, this type should always output all of its fields, even if
they're defaulted.

It's also very common to implement:

- `mapped_type`, to define certain types as having specialized handling in your
  language.
- `begin_file`, `end_file`, and `write_additional_files`, to add additional
  per-file or per-directory content to your output.

If your language spells type names in an unusual way (here meaning anything
other than the C++-descended convention, where a type might be spelled
`Foo<Bar, Baz<Zed>>`), you'll want to implement the `format_*` methods.

Other methods can be specialized as needed.

# Typeshare execution flow

This is the detailed flow of how the `Language` trait is actually used by
typeshare. It includes references to all of the methods that are called, and
in what order. For these examples, we're assuming a hypothetical implementation
for Kotlin, which means that there must be `impl Language<'_> for Kotlin`
somewhere.

1. The language's config is loaded from the config file and command line
arguments:

```ignore
let config = <Kotlin as Language>::Config::deserialize(config_file)?;
```

2. The language is instantiated from the loaded config via `new_from_config`.
This is where the implementation has the opportunity to report any
configuration errors that weren't detected during deserialization.

```ignore
let language = Kotlin::new_from_config(config)?;
```
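
As a rough illustration of step 2, the sketch below validates a value that
deserialization alone cannot reject. Everything in it is an assumption made
for the example: `KotlinConfig` and its `package` field are hypothetical, the
serde derives are left out, and the error type is a plain `String` rather than
`anyhow::Error`.

```rust
/// Hypothetical configuration. `Default` stands in for the "every field has
/// a default" requirement; a real config would also derive the serde traits.
#[derive(Debug, Default)]
struct KotlinConfig {
    package: String,
}

/// Hypothetical language struct built from the config.
#[derive(Debug)]
struct Kotlin {
    package: String,
}

impl Kotlin {
    /// Rejects values that parse fine but make no sense for the language.
    fn new_from_config(config: KotlinConfig) -> Result<Self, String> {
        if config.package.chars().any(char::is_whitespace) {
            return Err(format!("invalid package name: {:?}", config.package));
        }
        Ok(Kotlin {
            package: config.package,
        })
    }
}
```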

3. If we're in multi-file mode, we call `output_filename_for_crate` for each Rust
crate being typeshared to determine the _filename_ for the output file that
will contain that crate's types.

```ignore
let files = crate_names
    .iter()
    .map(|crate_name| {
        let filename = language.output_filename_for_crate(crate_name);
        File::create(output_directory.join(filename))
    });
```
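
One possible filename rule is sketched below. It is a simplified free function
rather than the trait method: the crate name is a plain `&str` instead of a
`CrateName`, and the PascalCase-plus-`.kt` convention is an assumption for the
example, not something typeshare prescribes.

```rust
/// Hypothetical filename rule: `my_crate` becomes `MyCrate.kt`.
fn output_filename_for_crate(crate_name: &str) -> String {
    let pascal: String = crate_name
        // Crate names may use either `_` or `-` as a separator.
        .split(|c: char| c == '_' || c == '-')
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            match chars.next() {
                // Uppercase the first character and keep the rest as-is.
                Some(first) => first.to_uppercase().chain(chars).collect::<String>(),
                None => String::new(),
            }
        })
        .collect();
    format!("{pascal}.kt")
}
```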

4. We call `begin_file` to print any headers or preamble appropriate for this
language. In multi-file mode, `begin_file` is called once for each output
file; in this case, the `mode` argument will include the crate name.

```ignore
language.begin_file(&mut file, mode)?;
```

5. In multi-file mode only, we call `write_imports` with a list of all the
types that are being imported from other typeshare'd crates. This allows the
language to emit appropriate import statements for its own language.

```ignore
// Only in multi-file mode
language.write_imports(&mut file, crate_name, computed_imports)?;
```
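
A minimal sketch of what a `write_imports` implementation might emit is shown
below. It is not the real trait method: crate and type names are plain `&str`
rather than `CrateName` and `TypeName`, errors are `std::io::Error` rather
than `anyhow::Error`, and the `com.example` package prefix is an assumption
made for the example.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for `write_imports`: emits one import line per
/// imported type, in the order the crates and types are passed in.
fn write_imports_sketch<'a, Crates, Types>(
    writer: &mut impl Write,
    imports: Crates,
) -> io::Result<()>
where
    Crates: IntoIterator<Item = (&'a str, Types)>,
    Types: IntoIterator<Item = &'a str>,
{
    for (crate_name, types) in imports {
        for type_name in types {
            // `com.example` is an assumed prefix, not provided by typeshare.
            writeln!(writer, "import com.example.{crate_name}.{type_name}")?;
        }
    }
    // Blank line separating the imports from the definitions that follow.
    writeln!(writer)
}
```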

6. For EACH item being typeshared, we call `write_enum`, `write_struct`,
`write_type_alias`, or `write_const`, as appropriate.

```ignore
language.write_struct(&mut file, parsed_struct)?;
language.write_enum(&mut file, parsed_enum)?;
```
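
Inside a `write_*` implementation, each field's type still has to be rendered
in the target language's spelling. The sketch below illustrates the three-way
dispatch that `format_type` performs, under simplifying assumptions:
`SketchType` and `SketchSpecial` are hypothetical stand-ins for `RustType` and
`SpecialRustType`, the functions are free functions rather than trait methods,
the `generic_context` argument and error handling are omitted, and the Kotlin
spellings are only examples.

```rust
/// Simplified stand-in for `RustType` (hypothetical).
enum SketchType {
    Simple(String),
    Generic(String, Vec<SketchType>),
    Special(SketchSpecial),
}

/// Simplified stand-in for `SpecialRustType`, covering only three cases.
enum SketchSpecial {
    String,
    I32,
    Vec(Box<SketchType>),
}

/// Mirrors the default dispatch: one branch per kind of type.
fn format_type(ty: &SketchType) -> String {
    match ty {
        SketchType::Simple(name) => name.clone(),
        SketchType::Generic(name, parameters) => {
            // Format each parameter recursively, then join as `Name<A, B>`.
            let formatted: Vec<String> = parameters.iter().map(format_type).collect();
            format!("{}<{}>", name, formatted.join(", "))
        }
        SketchType::Special(special) => format_special_type(special),
    }
}

/// The part a language normally writes itself; spellings here are Kotlin's.
fn format_special_type(special: &SketchSpecial) -> String {
    match special {
        SketchSpecial::String => "String".to_string(),
        SketchSpecial::I32 => "Int".to_string(),
        SketchSpecial::Vec(inner) => format!("List<{}>", format_type(inner)),
    }
}
```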

6a. In your implementations of these methods, we recommend that you call
`format_type` for the fields of these types. `format_type` will in turn call
`format_simple_type`, `format_generic_type`, or `format_special_type`, as
appropriate; usually it is only necessary for you to implement
`format_special_type` yourself, and use the default implementations for the
others. The `format_*` methods will otherwise never be called by typeshare.

6b. If your language doesn't natively support data-containing enums, we
recommend that you call `write_struct_types_for_enum_variants` in your
implementation of `write_enum`; this will call `write_struct` for each
anonymous struct variant of the enum.

7. After all the types are written, we call `end_file`, with the same
arguments that were passed to `begin_file`.

```ignore
language.end_file(&mut file, mode)?;
```

8. In multi-file mode only, after ALL files are written, we call
`write_additional_files` with the output directory. This gives the language an
opportunity to create any files resembling `mod.rs` or `index.js` as it might
require.

```ignore
// Only in multi-file mode
language.write_additional_files(&output_directory, generated_files.iter())?;
```
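
Steps 4 through 7 for a single output file can be sketched as follows. This
only illustrates the call order, under several assumptions: `SketchLanguage`
is a cut-down stand-in for `Language`, names are plain `&str`, `None` plays
the role of `FilesMode::Single`, errors are `std::io::Error`, and `write_file`
is a hypothetical driver rather than a typeshare function.

```rust
use std::io::{self, Write};

/// Cut-down stand-in for `Language`; only the call order matters here.
trait SketchLanguage {
    fn begin_file(&self, w: &mut impl Write, crate_name: Option<&str>) -> io::Result<()>;
    fn write_imports(&self, w: &mut impl Write, imports: &[&str]) -> io::Result<()>;
    fn write_struct(&self, w: &mut impl Write, name: &str) -> io::Result<()>;
    fn end_file(&self, w: &mut impl Write, crate_name: Option<&str>) -> io::Result<()>;
}

/// Hypothetical Kotlin backend that emits one line per call.
struct Kotlin;

impl SketchLanguage for Kotlin {
    fn begin_file(&self, w: &mut impl Write, crate_name: Option<&str>) -> io::Result<()> {
        writeln!(w, "package {}", crate_name.unwrap_or("shared"))
    }

    fn write_imports(&self, w: &mut impl Write, imports: &[&str]) -> io::Result<()> {
        for import in imports {
            writeln!(w, "import {import}")?;
        }
        Ok(())
    }

    fn write_struct(&self, w: &mut impl Write, name: &str) -> io::Result<()> {
        writeln!(w, "class {name}")
    }

    fn end_file(&self, w: &mut impl Write, _crate_name: Option<&str>) -> io::Result<()> {
        writeln!(w, "// end of generated code")
    }
}

/// Hypothetical driver: writes one output file in the documented order.
fn write_file(
    language: &impl SketchLanguage,
    w: &mut impl Write,
    crate_name: Option<&str>,
    imports: &[&str],
    structs: &[&str],
) -> io::Result<()> {
    language.begin_file(w, crate_name)?;
    // Imports are only written in multi-file mode.
    if crate_name.is_some() {
        language.write_imports(w, imports)?;
    }
    for name in structs {
        language.write_struct(w, name)?;
    }
    language.end_file(w, crate_name)
}
```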

NOTE: at this stage, multi-file output is still work-in-progress, as the
algorithms that compute import sets are being rewritten. The API presented
here is stable, but output might be buggy while issues with import detection
are resolved.

In the future, we hope to make multi-file mode multithreaded, capable of
writing multiple files concurrently from a shared `Language` instance.
`Language` therefore has a `Sync` bound to keep this possibility available.
*/
pub trait Language<'config>: Sized + Sync + Debug {
    /**
    The configuration for this language. This configuration will be loaded
    from a config file and, where possible, from the command line, via
    `serde`.

    It is important that this type include `#[serde(default)]` or something
    equivalent, so that a config can be loaded with default settings even
    if this language isn't present in the config file.

    The `Serialize` implementation for this type should NOT skip keys, if
    possible.
    */
    type Config: serde::Deserialize<'config> + serde::Serialize;

    /**
    The lowercase conventional name for this language. This should be a
    single identifier. It will be used as a prefix for various things;
    for instance, it will identify this language in the config file, and
    be used as a prefix when generating CLI parameters.
    */
    const NAME: &'static str;

    /// Create an instance of this language from the loaded configuration.
    fn new_from_config(config: Self::Config) -> anyhow::Result<Self>;

    /**
    Most languages provide manual overrides for specific types. When a type
    is formatted with a name that matches a mapped type, the mapped type
    name is formatted instead.

    By default this returns `None` for all types.
    */
    fn mapped_type(&self, type_name: &TypeName) -> Option<Cow<'_, str>> {
        let _ = type_name;
        None
    }

    /**
    In multi-file mode, typeshare will output one separate file with this
    name for each crate in the input set. These file names should have the
    appropriate naming convention and extension for this language.

    This method isn't used in single-file mode.
    */
    fn output_filename_for_crate(&self, crate_name: &CrateName) -> String;

    /**
    Convert a Rust type into a type from this language. By default this
    calls `format_simple_type`, `format_generic_type`, or
    `format_special_type`, depending on the type. There should only rarely
    be a need to specialize this.

    This method should be called by the `write_*` methods to write the types
    contained by type definitions.

    The `generic_context` is the set of generic types being provided by
    the enclosing type definition; this allows languages that do type
    renaming to be able to distinguish concrete type names (like `int`)
    from generic type names (like `T`).
    */
    fn format_type(&self, ty: &RustType, generic_context: &[TypeName]) -> anyhow::Result<String> {
        match ty {
            RustType::Simple { id } => self.format_simple_type(id, generic_context),
            RustType::Generic { id, parameters } => {
                self.format_generic_type(id, parameters.as_slice(), generic_context)
            }
            RustType::Special(special) => self.format_special_type(special, generic_context),
        }
    }

    /**
    Format a simple type with no generic parameters.

    By default, this first checks `self.mapped_type` to see if there's an
    alternative way this type should be formatted, and otherwise prints the
    `base` verbatim.

    The `generic_context` is the set of generic types being provided by
    the enclosing type definition; this allows languages that do type
    renaming to be able to distinguish concrete type names (like `int`)
    from generic type names (like `T`).
    */
    fn format_simple_type(
        &self,
        base: &TypeName,
        generic_context: &[TypeName],
    ) -> anyhow::Result<String> {
        let _ = generic_context;
        Ok(match self.mapped_type(base) {
            Some(mapped) => mapped.to_string(),
            None => base.to_string(),
        })
    }

    /**
    Format a generic type that takes in generic arguments, which
    may be recursive.

    By default, this creates a composite type name by concatenating the
    results of `self.format_simple_type` and
    `self.format_generic_parameters`. With their default implementations,
    this will print `name<parameters>`, which is a common syntax used by
    many languages for generics. If `self.mapped_type` has an entry for
    `base`, the mapped name is returned on its own, without parameters.

    The `generic_context` is the set of generic types being provided by
    the enclosing type definition; this allows languages that do type
    renaming to be able to distinguish concrete type names (like `int`)
    from generic type names (like `T`).
    */
    fn format_generic_type(
        &self,
        base: &TypeName,
        parameters: &[RustType],
        generic_context: &[TypeName],
    ) -> anyhow::Result<String> {
        match parameters.is_empty() {
            true => self.format_simple_type(base, generic_context),
            false => Ok(match self.mapped_type(base) {
                Some(mapped) => mapped.to_string(),
                None => format!(
                    "{}{}",
                    self.format_simple_type(base, generic_context)?,
                    self.format_generic_parameters(parameters, generic_context)?,
                ),
            }),
        }
    }

    /**
    Format generic parameters into a syntax used by this language. By
    default, this returns `<A, B, C, ...>`, since that's a common syntax
    used by most languages.

    This method is only used when `format_generic_type` calls it.

    The `generic_context` is the set of generic types being provided by
    the enclosing type definition; this allows languages that do type
    renaming to be able to distinguish concrete type names (like `int`)
    from generic type names (like `T`).
    */
    fn format_generic_parameters(
        &self,
        parameters: &[RustType],
        generic_context: &[TypeName],
    ) -> anyhow::Result<String> {
        parameters
            .iter()
            .map(|ty| self.format_type(ty, generic_context))
            .process_results(|mut formatted| format!("<{}>", formatted.join(", ")))
    }

    /**
    Format a special type. This will handle things like arrays, primitives,
    options, and so on. Every language has different spellings for these types,
    so this is one of the key methods that a language implementation needs to
    deal with.
    */
    fn format_special_type(
        &self,
        special_ty: &SpecialRustType,
        generic_context: &[TypeName],
    ) -> anyhow::Result<String>;

    /**
    Write a header for typeshared code. This is called unconditionally
    at the start of the output file (or at the start of each file, if in
    multi-file mode).

    By default this does nothing.
    */
    fn begin_file(&self, w: &mut impl Write, mode: FilesMode<&CrateName>) -> anyhow::Result<()> {
        let _ = (w, mode);
        Ok(())
    }

    /**
    For generating import statements. This is called only in multi-file
    mode, after `begin_file` and before any other writes.

    `imports` includes an ordered list of type names that typeshare
    believes are being imported by this file, grouped by the crates they
    come from. `typeshare` guarantees that these will be passed in some stable
    order, so that your output remains consistent.

    NOTE: Currently this is buggy and doesn't receive correct imports.
    This will be fixed in a future release.
    */
    fn write_imports<'a, Crates, Types>(
        &self,
        writer: &mut impl Write,
        crate_name: &CrateName,
        imports: Crates,
    ) -> anyhow::Result<()>
    where
        Crates: IntoIterator<Item = (&'a CrateName, Types)>,
        Types: IntoIterator<Item = &'a TypeName>;

    /**
    Write a footer for typeshared code. This is called unconditionally
    at the end of the output file (or at the end of each file, if in
    multi-file mode).

    By default this does nothing.
    */
    fn end_file(&self, w: &mut impl Write, mode: FilesMode<&CrateName>) -> anyhow::Result<()> {
        let _ = (w, mode);
        Ok(())
    }

    /**
    Write a type alias definition.

    Example of a type alias:
    ```
    type MyTypeAlias = String;
    ```

    Generally this method will call `self.format_type` to produce the
    aliased type name in the output definition.
    */
    fn write_type_alias(&self, w: &mut impl Write, t: &RustTypeAlias) -> anyhow::Result<()>;

    /**
    Write a struct definition.

    Example of a struct:
    ```ignore
    #[typeshare]
    #[derive(Serialize, Deserialize)]
    struct Foo {
        bar: String
    }
    ```

    Generally this method will call `self.format_type` to produce the types
    of the individual fields.
    */
    fn write_struct(&self, w: &mut impl Write, rs: &RustStruct) -> anyhow::Result<()>;

    /**
    Write an enum definition.

    Example of an enum:
    ```ignore
    #[typeshare]
    #[derive(Serialize, Deserialize)]
    #[serde(tag = "type", content = "content")]
    enum Foo {
        Fizz,
        Buzz { yep_this_works: bool }
    }
    ```

    Generally this will call `self.format_type` to produce the types of
    the individual fields. If this enum is an algebraic sum type, and this
    language doesn't really support those, it should consider calling
    `write_struct_types_for_enum_variants` to produce struct types matching
    those variants, which can be used for this language's abstraction for
    data like this.
    */
    fn write_enum(&self, w: &mut impl Write, e: &RustEnum) -> anyhow::Result<()>;

    /**
    Write a constant variable.

    Example of a constant variable:
    ```
    const ANSWER_TO_EVERYTHING: u32 = 42;
    ```

    Where needed, this will generally call `self.format_type` to produce
    the type of this constant (though some languages allow the type to be
    omitted).
    */
    fn write_const(&self, w: &mut impl Write, c: &RustConst) -> anyhow::Result<()>;

    /**
    Write out named types to represent anonymous struct enum variants.

    Take the following enum as an example:

    ```
    enum AlgebraicEnum {
        AnonymousStruct {
            field: String,
            another_field: bool,
        },

        Variant2 {
            field: i32,
        }
    }
    ```

    This function will write out a pair of struct types resembling:

    ```
    struct AnonymousStruct {
        field: String,
        another_field: bool,
    }

    struct Variant2 {
        field: i32,
    }
    ```

    Except that it will use `make_struct_name` to compute the names of these
    structs based on the names of the variants.

    This method isn't called by default; it is instead provided as a helper
    for your implementation of `write_enum`, since many languages don't have
    a specific notion of an algebraic sum type, and have to emulate it with
    subclasses, tagged unions, or something similar.
    */
    fn write_struct_types_for_enum_variants(
        &self,
        w: &mut impl Write,
        e: &RustEnum,
        make_struct_name: &impl Fn(&TypeName) -> String,
    ) -> anyhow::Result<()> {
        let variants = match e {
            RustEnum::Unit { .. } => return Ok(()),
            RustEnum::Algebraic { variants, .. } => variants.iter().filter_map(|v| match v {
                RustEnumVariant::AnonymousStruct { fields, shared } => Some((fields, shared)),
                _ => None,
            }),
        };

        for (fields, variant) in variants {
            let struct_name = make_struct_name(&variant.id.original);

            // Builds the list of generic types (e.g. [T, U, V]), by digging
            // through the fields recursively and comparing against the
            // enclosing enum's list of generic parameters.
            let generic_types = fields
                .iter()
                .flat_map(|field| {
                    e.shared()
                        .generic_types
                        .iter()
                        .filter(|g| field.ty.contains_type(g))
                })
                .unique()
                .cloned()
                .collect();

            self.write_struct(
                w,
                &RustStruct {
                    id: Id {
                        original: TypeName::new_string(struct_name.clone()),
                        renamed: TypeName::new_string(struct_name),
                    },
                    fields: fields.clone(),
                    generic_types,
                    comments: vec![format!(
                        "Generated type representing the anonymous struct \
                        variant `{}` of the `{}` Rust enum",
                        &variant.id.original,
                        &e.shared().id.original,
                    )],
                    decorators: e.shared().decorators.clone(),
                },
            )
            .with_context(|| {
                format!(
                    "failed to write struct type for the \
                    `{}` variant of the `{}` enum",
                    variant.id.original,
                    e.shared().id.original
                )
            })?;
        }

        Ok(())
    }

    /**
    If a type with this name appears in a type definition, it will be
    unconditionally excluded from cross-file import analysis. Usually this will
    be the types handled by `mapped_type`, since those are types with special
    behavior (for instance, a datetime type provided as a standard type by
    your language).

    This is mostly a performance optimization. By default it returns `false`
    for all types.
    */
    fn exclude_from_import_analysis(&self, name: &TypeName) -> bool {
        let _ = name;
        false
    }

    /**
    In multi-file mode, this method is called after all of the individual
    typeshare files are completely generated. Use it to generate any
    additional files your language might need in this directory to
    function correctly, such as a `mod.rs`, `__init__.py`, `index.js`, or
    anything else like that.

    It is passed a list of crate names, one for each crate that was
    typeshared, along with the associated file paths; together these
    indicate all of the files that were generated by typeshare.

    By default, this does nothing.
    */
    fn write_additional_files<'a>(
        &self,
        output_folder: &Path,
        output_files: impl IntoIterator<Item = (&'a CrateName, &'a Path)>,
    ) -> anyhow::Result<()> {
        let _ = (output_folder, output_files);
        Ok(())
    }
}
591}