Struct crowbook_text_processing::FrenchFormatter

pub struct FrenchFormatter { /* fields omitted */ }

French typographic formatter.

This struct tries to make a text more typographically correct according to French typographic rules. This means:

  • making spaces before ?, !, ; narrow non-breaking spaces;
  • making spaces before : non-breaking spaces;
  • making the space after the dash (—) that introduces dialog a demi em space;
  • making spaces after « and before » non-breaking or narrow non-breaking spaces, depending on the circumstances (dialog or a few quoted words);
  • making spaces in numbers, e.g. 80 000 or 50 €, narrow and non-breaking.

Additionally, this formatter uses functions that are "generic" (not specific to the French language) in order to:

  • replace straight quotes (' and ") with curly, typographic ones;
  • replace ellipsis (...) with the Unicode character (…).

As some of these features sometimes require a bit of guessing, there are some parameters that can be set if you want better results.

Example

use crowbook_text_processing::FrenchFormatter;
let input = "Un texte à 'formater', n'est-ce pas ?";
let output = FrenchFormatter::new()
             .typographic_ellipsis(false) // don't replace ellipsis
             .format_tex(input); // format to tex (so non-breaking
                                 // spaces are visible in assert_eq!)
assert_eq!(&output, "Un texte à ‘formater’, n’est-ce pas~?");
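
To make the spacing rules listed above more concrete, here is a minimal standalone sketch of the first two of them. It is not the crate's implementation: the function name space_before_punctuation and the two constants are hypothetical, and the sketch only handles an ordinary space that is directly followed by ?, !, ; or :.

```rust
/// Narrow non-breaking space (U+202F), assumed here for `?`, `!` and `;`.
const NARROW_NB_SPACE: char = '\u{202F}';
/// Non-breaking space (U+00A0), assumed here for `:`.
const NB_SPACE: char = '\u{00A0}';

/// Hypothetical helper, for illustration only: replaces an ordinary space
/// that directly precedes `?`, `!` or `;` with a narrow non-breaking space,
/// and one that directly precedes `:` with a non-breaking space.
fn space_before_punctuation(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == ' ' {
            // Look at the next character without consuming it.
            match chars.peek().copied() {
                Some('?') | Some('!') | Some(';') => out.push(NARROW_NB_SPACE),
                Some(':') => out.push(NB_SPACE),
                _ => out.push(c),
            }
        } else {
            out.push(c);
        }
    }
    out
}
```

Spaces that follow a punctuation mark are left untouched by this sketch; only the space in front of the mark changes.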

Methods

impl FrenchFormatter

pub fn new() -> Self

Creates a new FrenchFormatter with default settings.

pub fn threshold_currency(&mut self, t: usize) -> &mut Self

Sets the threshold for currency.

After that number of characters, assume it's not a currency.

Default is 3.

pub fn threshold_unit(&mut self, t: usize) -> &mut Self

Sets the threshold for unit.

After that number of characters, assume it's not a unit.

Default is 2.

pub fn threshold_quote(&mut self, t: usize) -> &mut Self

Sets the threshold for quote.

After that number of characters, assume it's not a quote of a single word or a few words, but a dialog.

Default is 20.
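
The threshold idea can be pictured with a small standalone sketch. Everything in it is an assumption made for illustration: the QuoteKind enum and the classify_quote function are hypothetical names, and the crate may well measure the quoted text differently (for example, by scanning for the closing » rather than receiving the inner text).

```rust
/// Hypothetical classification of the text found between « and ».
#[derive(Debug, PartialEq)]
enum QuoteKind {
    /// A single word or a few quoted words.
    ShortQuote,
    /// Longer than the threshold: treated as dialog.
    Dialog,
}

/// Illustrative only: compares the length of the quoted text, counted in
/// characters (not bytes), with `threshold`.
fn classify_quote(inner: &str, threshold: usize) -> QuoteKind {
    if inner.trim().chars().count() > threshold {
        QuoteKind::Dialog
    } else {
        QuoteKind::ShortQuote
    }
}
```

Under these assumptions and a threshold of 20, a quoted word such as "mot" is a short quote, while a full quoted sentence is treated as dialog.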

pub fn threshold_real_word(&mut self, t: usize) -> &mut Self

Sets the threshold for real word.

After that number of characters, assume it's not an abbreviation but a real word (used to determine whether . marks the end of a sentence or just follows a title, as in M. Dupuis).

Default is 3.

pub fn typographic_quotes(&mut self, b: bool) -> &mut Self

Enables the typographic quotes replacement.

If true, "L'" will be replaced by "L’".

Default is true.

pub fn typographic_ellipsis(&mut self, b: bool) -> &mut Self

Enables typographic ellipsis replacement.

If true, "..." will be replaced by "…".

Default is true.

pub fn ligature_dashes(&mut self, b: bool) -> &mut Self

If set to true, replaces -- with an en dash (–) and --- with an em dash (—).

Default is false.

pub fn ligature_guillemets(&mut self, b: bool) -> &mut Self

If set to true, replaces << with « and >> with ».

Default is false.
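
The two ligature options can be illustrated with a standalone sketch. It assumes the usual TeX-style convention in which -- stands for an en dash and --- for an em dash; the function name replace_ligatures is hypothetical, and the crate's own replacement logic may differ (for instance in how it treats surrounding spaces).

```rust
/// Hypothetical helper, for illustration only: applies the dash and
/// guillemet ligatures with plain string replacement.
fn replace_ligatures(input: &str) -> String {
    input
        // The longer sequence must be replaced first; otherwise "---"
        // would be consumed as "--" followed by a stray "-".
        .replace("---", "\u{2014}") // em dash
        .replace("--", "\u{2013}") // en dash
        .replace("<<", "\u{00AB}") // left guillemet «
        .replace(">>", "\u{00BB}") // right guillemet »
}
```

The order of the first two replacements is the only subtle point in this sketch.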

pub fn format<'a, S: Into<Cow<'a, str>>>(&self, input: S) -> Cow<'a, str>

(Try to) Format a string according to French typographic rules.

This method should be called on each paragraph separately, as it assumes that the beginning of the string is also the beginning of a line.

This method calls remove_whitespaces internally, as it relies on it.

Example

use crowbook_text_processing::FrenchFormatter;
let f = FrenchFormatter::new();
let s = f.format("« Est-ce bien formaté ? » se demandait-elle — les espaces \
                  insécables étaient tellement compliquées à gérer,
                  dans cette langue !");
println!("{}", s);
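
The signature of format accepts anything convertible into a Cow<str> and returns a Cow<str>. A common reason for this shape is to avoid allocating when the input needs no change. The sketch below shows that pattern on a much simpler transformation; the function name ellipsis is hypothetical, and whether the crate's format returns a borrowed value for unchanged input is an assumption, not something stated above.

```rust
use std::borrow::Cow;

/// Hypothetical helper, for illustration only: replaces "..." with the
/// Unicode ellipsis character, allocating only when a replacement happens.
fn ellipsis<'a, S: Into<Cow<'a, str>>>(input: S) -> Cow<'a, str> {
    let input = input.into();
    if input.contains("...") {
        // A change is needed, so a new String is built.
        Cow::Owned(input.replace("...", "\u{2026}"))
    } else {
        // Nothing to change: hand the input back as-is.
        input
    }
}
```

Called with a &str that contains no "...", this sketch returns Cow::Borrowed and performs no allocation; called with a String, it reuses the existing allocation in the unchanged case.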

pub fn format_tex<'a, S: Into<Cow<'a, str>>>(&self, input: S) -> Cow<'a, str>

(Try to) Format a string according to French typographic rules, and use '~' so it works correctly with LaTeX output.

Example

use crowbook_text_processing::FrenchFormatter;
let f = FrenchFormatter::new();
let s = f.format_tex("« Est-ce bien formaté ? »");
assert_eq!(&s, "«~Est-ce bien formaté~?~»");

Trait Implementations

impl Default for FrenchFormatter

impl Debug for FrenchFormatter

Blanket Implementations

impl<T, U> Into<U> for T where
    U: From<T>,

impl<T> From<T> for T

impl<T, U> TryFrom<U> for T where
    U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> Any for T where
    T: 'static + ?Sized