Struct crowbook_text_processing::FrenchFormatter

pub struct FrenchFormatter { /* fields omitted */ }

French typographic formatter.

The purpose of this struct is to make a text more typographically correct, according to French typographic rules. This means:

  • making spaces before ?, ! and ; narrow non-breaking spaces;
  • making spaces before : non-breaking spaces;
  • making the space after the em dash (—) used for dialog a demi em space;
  • making spaces after « and before » non-breaking or narrow non-breaking spaces, depending on the context (dialog or a few quoted words);
  • making spaces in numbers, e.g. 80 000 or 50 €, narrow and non-breaking.
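The first two rules can be illustrated with a minimal, self-contained sketch. This is not the crate's implementation: `space_before_punctuation` and the two constants are hypothetical names introduced here, and the sketch only handles a plain ASCII space directly followed by one of the four punctuation marks.

```rust
// Hypothetical helper, not part of crowbook_text_processing.
const NBSP: char = '\u{00A0}'; // non-breaking space
const NARROW_NBSP: char = '\u{202F}'; // narrow non-breaking space

/// Replaces a plain space before `?`, `!` or `;` with a narrow
/// non-breaking space, and a plain space before `:` with a
/// non-breaking space. All other characters are copied unchanged.
fn space_before_punctuation(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == ' ' {
            // Look at the next character without consuming it.
            match chars.peek().copied() {
                Some('?') | Some('!') | Some(';') => out.push(NARROW_NBSP),
                Some(':') => out.push(NBSP),
                _ => out.push(c),
            }
        } else {
            out.push(c);
        }
    }
    out
}
```

A real formatter also has to insert missing spaces and deal with existing non-breaking spaces, which this sketch leaves out.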

Additionally, this struct uses functions that are "generic" (not specific to the French language) in order to:

  • replace straight quotes (' and ") with curly, typographic ones;
  • replace ellipsis (...) with the unicode character (…).

As some of these features require a bit of guessing, there are some parameters that can be set if you want better results.

Example

use crowbook_text_processing::FrenchFormatter;
let input = "Un texte à 'formater', n'est-ce pas ?";
let output = FrenchFormatter::new()
             .typographic_ellipsis(false) // don't replace ellipsis
             .format_tex(input); // format to tex (so non-breaking
                                 // spaces are visible in assert_eq!)
assert_eq!(&output, "Un texte à ‘formater’, n’est-ce pas~?");

Methods

impl FrenchFormatter

Create a new FrenchFormatter with default settings

Sets the threshold for currency.

After that number of characters, assume it's not a currency.

Default is 3.

Sets the threshold for unit.

After that number of characters, assume it's not a unit.

Default is 2.

Sets the threshold for quote.

After that number of characters, assume it's not a quote of a single word or a few words, but a dialog.

Default is 20.

Sets the threshold for real word.

After that number of characters, assume it's not an abbreviation but a real word (used to determine whether . marks the end of a sentence or just follows a title, as in M. Dupuis).

Default is 3.
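The threshold settings all follow the same idea: compare a length against a limit and guess accordingly. A minimal sketch for the real-word case follows. `looks_like_real_word` is a hypothetical name, and whether the crate compares with `>` or `>=` is an assumption made for this illustration.

```rust
/// Hypothetical helper, not part of crowbook_text_processing.
/// Guesses whether the token before a period is a real word
/// (so the period probably ends a sentence) or an abbreviation
/// such as "M" in "M. Dupuis".
fn looks_like_real_word(token: &str, threshold: usize) -> bool {
    // Count characters rather than bytes, so that accented
    // letters such as 'é' count as one character each.
    token.chars().count() > threshold
}
```

With the default threshold of 3, "M" would be treated as an abbreviation and "Dupuis" as a real word under this assumption.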

Enables the typographic quotes replacement.

If true, "L'" will be replaced by "L’".

Default is true.

Enables typographic ellipsis replacement.

If true, "..." will be replaced by "…".

Default is true.
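As an illustration of what this option controls, here is a minimal sketch of an ellipsis replacement that can be switched off. `replace_ellipsis` is a hypothetical function written for this example, not the crate's own code. It returns a `Cow` so that no allocation happens when the text is left unchanged.

```rust
use std::borrow::Cow;

/// Hypothetical helper, not part of crowbook_text_processing.
/// Replaces "..." with the single character '…' (U+2026) when
/// `enabled` is true; otherwise returns the input untouched.
fn replace_ellipsis(input: &str, enabled: bool) -> Cow<'_, str> {
    if enabled && input.contains("...") {
        Cow::Owned(input.replace("...", "\u{2026}"))
    } else {
        // Nothing to change: borrow the input instead of copying it.
        Cow::Borrowed(input)
    }
}
```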

If set to true, replaces -- with an en dash (–) and --- with an em dash (—).

Default is false.

If set to true, replaces << with « and >> with ».

Default is false.
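Both ligature options amount to plain substitutions. The sketch below shows one way to write them, assuming a hypothetical `apply_ligatures` function that is not part of the crate. The only subtle point is ordering: the three-hyphen pattern has to be handled before the two-hyphen one.

```rust
/// Hypothetical helper, not part of crowbook_text_processing.
/// Applies the dash and guillemet ligatures when the matching
/// flag is set.
fn apply_ligatures(input: &str, dashes: bool, guillemets: bool) -> String {
    let mut out = input.to_string();
    if dashes {
        // Longest pattern first: replacing "--" first would turn
        // "---" into an en dash followed by a stray hyphen.
        out = out.replace("---", "\u{2014}").replace("--", "\u{2013}");
    }
    if guillemets {
        out = out.replace("<<", "\u{00AB}").replace(">>", "\u{00BB}");
    }
    out
}
```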

(Try to) Format a string according to French typographic rules.

This method should be called once per paragraph, as it assumes that the beginning of the string is also the beginning of a line.

This method calls remove_whitespaces internally, as it relies on it.

Example

use crowbook_text_processing::FrenchFormatter;
let f = FrenchFormatter::new();
let s = f.format("« Est-ce bien formaté ? » se demandait-elle — les espaces \
                  insécables étaient tellement compliquées à gérer,
                  dans cette langue !");
println!("{}", s);

(Try to) Format a string according to French typographic rules, and use '~' for non-breaking spaces so it works correctly with LaTeX output.

Example

use crowbook_text_processing::FrenchFormatter;
let f = FrenchFormatter::new();
let s = f.format_tex("« Est-ce bien formaté ? »");
assert_eq!(&s, "«~Est-ce bien formaté~?~»");
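To make the role of '~' concrete, the sketch below maps Unicode non-breaking spaces to a tilde. `nbsp_to_tilde` is a hypothetical function, and treating the narrow non-breaking space the same way as the regular one is an assumption based only on the example outputs shown on this page. The actual method may handle other cases differently.

```rust
/// Hypothetical helper, not part of crowbook_text_processing.
/// Replaces non-breaking (U+00A0) and narrow non-breaking (U+202F)
/// spaces with '~', the LaTeX non-breaking space.
fn nbsp_to_tilde(input: &str) -> String {
    input
        .chars()
        .map(|c| match c {
            '\u{00A0}' | '\u{202F}' => '~',
            other => other,
        })
        .collect()
}
```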

Trait Implementations

impl Debug for FrenchFormatter

Formats the value using the given formatter.

impl Default for FrenchFormatter

Returns the "default value" for a type.