Struct human_name::Name

pub struct Name {
    pub hash: u64,
    /* private fields */
}

Represents a parsed human name.

Guaranteed to contain (what we think is) a surname and a first initial; nothing else is guaranteed. May also contain given and middle names, middle initials, and/or a generational suffix.

Construct a Name using parse:

use human_name::Name;

let name = Name::parse("Jane Doe").unwrap();

Once you have a Name, you may extract its components, convert it to JSON, or compare it with another Name to see if they are consistent with representing the same person (see docs on consistent_with for details).

Fields

hash: u64

Implementations

Might this name represent the same person as another name?

Examples
use human_name::Name;

let j_doe = Name::parse("J. Doe").unwrap();
let jane_doe = Name::parse("Jane Doe").unwrap();
let john_m_doe = Name::parse("John M. Doe").unwrap();
let john_l_doe = Name::parse("John L. Doe").unwrap();

assert!(j_doe.consistent_with(&john_m_doe));
assert!(j_doe.consistent_with(&john_l_doe));
assert!(j_doe.consistent_with(&jane_doe));
assert!(j_doe.consistent_with(&j_doe));
assert!(!john_m_doe.consistent_with(&john_l_doe));
assert!(!jane_doe.consistent_with(&john_l_doe));

let zheng_he = Name::parse("Zheng He").unwrap();
let han_chars = Name::parse("鄭和").unwrap();
assert!(han_chars.consistent_with(&zheng_he));
Defining “consistency”

Requires that all known parts are consistent, which means at minimum, the final words of the surnames match, and one ordered set of first and middle initials is a superset of the other. If given and/or middle names and/or suffixes are present in both names, they must match as well.
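The initials rule can be illustrated with a small standalone sketch. It does not use this crate's internals: `is_ordered_subset` and `initials_consistent` are hypothetical helpers, and reading "superset" as "the shorter set appears in the longer one in the same order" is an assumption made for illustration.

```rust
/// Returns true if every initial in `shorter` appears in `longer`,
/// in the same relative order (ASCII case is ignored).
fn is_ordered_subset(shorter: &str, longer: &str) -> bool {
    let mut rest = longer.chars();
    // For each initial we need, consume `longer` until we find it.
    shorter
        .chars()
        .all(|needed| rest.any(|have| have.eq_ignore_ascii_case(&needed)))
}

/// Hypothetical helper: two sets of initials are treated as consistent
/// when either one contains the other with order preserved.
fn initials_consistent(a: &str, b: &str) -> bool {
    is_ordered_subset(a, b) || is_ordered_subset(b, a)
}
```

Under this reading, "J" is consistent with both "JM" and "JL", while "JM" and "JL" are not consistent with each other, which mirrors the J. Doe / John M. Doe / John L. Doe assertions above.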

Transliterates everything to ASCII before comparison using the naive algorithm of unidecode (which ignores context), and ignores case, accents and combining marks.

In the case of given and middle names, allows one name to be a prefix of the other, without requiring the prefix end at a word boundary as we do with surname suffix matches, and supports matching a small number of common nicknames and nickname patterns based on the root name.
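As a rough illustration of the prefix rule for given names, here is a standalone sketch. `given_names_consistent` is a hypothetical helper, not part of this crate, and it deliberately leaves out transliteration and nickname handling.

```rust
/// Hypothetical helper: two given names are treated as consistent when
/// one is a prefix of the other, ignoring case. The prefix does not have
/// to end at a word boundary.
fn given_names_consistent(a: &str, b: &str) -> bool {
    let a = a.to_lowercase();
    let b = b.to_lowercase();
    a.starts_with(b.as_str()) || b.starts_with(a.as_str())
}
```

With this rule "Jan" is consistent with "Jane", because the shorter name is a prefix of the longer one, while "Jane" and "John" are not consistent.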

Limitations

There will be false positives (“Jan Doe” is probably not “Jane Doe”), and false negatives (“James Hanson” might be “James Hansen”). And, of course, even identical names do not necessarily represent the same person.

Given limited information, we err on the side of false positives. This kind of matching will be most useful in cases where we already have reason to believe that a single individual’s name appears twice, and we are trying to figure out exactly where, e.g. a particular author’s index in the list of authors of a co-authored paper.

Does this name appear to match a munged string such as an email localpart or URL slug, where whitespace has been removed?

Examples
use human_name::Name;
let name = Name::parse("Jane A. Doe").unwrap();

assert!(name.matches_slug_or_localpart("jane.doe"));
assert!(!name.matches_slug_or_localpart("john.doe"));

assert!(name.matches_slug_or_localpart("janedoe"));
assert!(!name.matches_slug_or_localpart("johndoe"));

assert!(name.matches_slug_or_localpart("jad"));
assert!(!name.matches_slug_or_localpart("jd"));

assert!(name.matches_slug_or_localpart("janed"));
assert!(!name.matches_slug_or_localpart("jane"));
assert!(!name.matches_slug_or_localpart("johnd"));

Parses a string representing a single person’s full name into a canonical representation.

Examples
use human_name::Name;

let name = Name::parse("Jane Doe").unwrap();
assert_eq!("Doe", name.surname());
assert_eq!(Some("Jane"), name.given_name());

let name = Name::parse("Doe, J").unwrap();
assert_eq!("Doe", name.surname());
assert_eq!(None, name.given_name());
assert_eq!('J', name.first_initial());

let name = Name::parse("Dr. Juan Alberto T. Velasquez y Garcia III").unwrap();
assert_eq!("Velasquez y Garcia", name.surname());
assert_eq!(Some("Juan"), name.given_name());
assert_eq!(Some("AT"), name.middle_initials());
assert_eq!(Some("III"), name.suffix());
Supported formats

Supports a variety of formats, including prefix and postfix titles, parenthesized nicknames, initials with and without periods, and sort order (“Doe, Jane”). Makes use of heuristics based on case when applicable (e.g., “AL Doe” is parsed as “A. L. Doe”, while “Al Doe” is parsed as a given name and surname), as well as small sets of known particles, conjunctions, titles, etc.
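To give a feel for what a case-based heuristic can look like, here is a toy sketch. It is not this crate's parsing logic: `is_mixed_case` and `looks_like_initials` are hypothetical helpers, and the rule they encode (an all-uppercase token counts as initials only when the input as a whole is mixed-case) is an assumption chosen to reproduce the "AL Doe" versus "Al Doe" example.

```rust
/// True if the input contains both uppercase and lowercase letters,
/// i.e. case carries some information.
fn is_mixed_case(input: &str) -> bool {
    input.chars().any(|c| c.is_uppercase()) && input.chars().any(|c| c.is_lowercase())
}

/// Toy heuristic (an assumption, not the crate's rule): a token is read
/// as a run of initials if it is entirely uppercase and the surrounding
/// input is mixed-case.
fn looks_like_initials(token: &str, input: &str) -> bool {
    is_mixed_case(input) && !token.is_empty() && token.chars().all(|c| c.is_uppercase())
}
```

Here "AL" in "AL Doe" is read as initials and "Al" in "Al Doe" is not. In an all-uppercase input such as "JOHN DOE" case carries no signal, so the sketch does not treat "JOHN" as initials.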

Limitations

Errs on the side of producing parse output rather than giving up, so this function is not suitable as a way of guessing whether a given string actually represents a name.

However, success requires at least an apparent surname and first initial. Single-word names cannot be parsed (you may or may not wish to assume they are given names).

Does not preserve titles (other than generational suffixes such as “III”) or nicknames. Does not handle plural forms specially: “Mr. & Mrs. John Doe” will be parsed as “John Doe”, and “Jane Doe, et al” will be parsed as “Jane Doe”.

Works best on Latin names - i.e., data from North or South America or Europe. Does not understand surname-first formats without commas: “Kim Il-sung” will be parsed as having the first name “Kim”.

Handles non-Latin Unicode strings, but without any particular intelligence. Attempts at least to fail nicely, such that either parse returns None, or calling display_full() on the parsed result returns the input, plus or minus whitespace.

Of course, there is no perfect algorithm for canonicalizing names. The goal here is to do the best we can without large statistical models.

First initial (always present)

Given name as a string, if present

use human_name::Name;

let name = Name::parse("Jane Doe").unwrap();
assert_eq!(Some("Jane"), name.given_name());

let name = Name::parse("J. Doe").unwrap();
assert_eq!(None, name.given_name());

Does this person use a middle name in place of their given name?

use human_name::Name;

let name = Name::parse("Jane Doe").unwrap();
assert!(!name.goes_by_middle_name());

let name = Name::parse("J. Doe").unwrap();
assert!(!name.goes_by_middle_name());

let name = Name::parse("T Boone Pickens").unwrap();
assert!(name.goes_by_middle_name());

First and middle initials as a string (always present)

use human_name::Name;

let name = Name::parse("Jane Doe").unwrap();
assert_eq!("J", name.initials());

let name = Name::parse("James T. Kirk").unwrap();
assert_eq!("JT", name.initials());

Middle names as an array of words, if present

Middle names as a string, if present

use human_name::Name;

let name = Name::parse("Jane Doe").unwrap();
assert_eq!(None, name.middle_name());

let name = Name::parse("James T. Kirk").unwrap();
assert_eq!(None, name.middle_name());

let name = Name::parse("James Tiberius Kirk").unwrap();
assert_eq!("Tiberius", name.middle_name().unwrap());

let name = Name::parse("Able Baker Charlie Delta").unwrap();
assert_eq!("Baker Charlie", name.middle_name().unwrap());

Middle initials as a string, if present

use human_name::Name;

let name = Name::parse("Jane Doe").unwrap();
assert_eq!(None, name.middle_initials());

let name = Name::parse("James T. Kirk").unwrap();
assert_eq!("T", name.middle_initials().unwrap());

let name = Name::parse("James Tiberius Kirk").unwrap();
assert_eq!("T", name.middle_initials().unwrap());

let name = Name::parse("Able Baker Charlie Delta").unwrap();
assert_eq!("BC", name.middle_initials().unwrap());

Surname as a slice of words (always present)

Surname as a string (always present)

use human_name::Name;

let name = Name::parse("Jane Doe").unwrap();
assert_eq!("Doe", name.surname());

let name = Name::parse("JOHN ALLEN Q DE LA MACDONALD JR").unwrap();
assert_eq!("de la MacDonald", name.surname());

Generational suffix, if present

First initial (with period) and surname.

use human_name::Name;

let name = Name::parse("J. Doe").unwrap();
assert_eq!("J. Doe", name.display_initial_surname());

let name = Name::parse("James T. Kirk").unwrap();
assert_eq!("J. Kirk", name.display_initial_surname());

let name = Name::parse("JOHN ALLEN Q DE LA MACDONALD JR").unwrap();
assert_eq!("J. de la MacDonald", name.display_initial_surname());

Given name and surname, if given name is known, otherwise first initial and surname.

use human_name::Name;

let name = Name::parse("J. Doe").unwrap();
assert_eq!("J. Doe", name.display_first_last());

let name = Name::parse("Jane Doe").unwrap();
assert_eq!("Jane Doe", name.display_first_last());

let name = Name::parse("James T. Kirk").unwrap();
assert_eq!("James Kirk", name.display_first_last());

let name = Name::parse("JOHN ALLEN Q DE LA MACDONALD JR").unwrap();
assert_eq!("John de la MacDonald", name.display_first_last());

Number of bytes in the full name as UTF-8 in NFKD normal form, including spaces and punctuation.

use human_name::Name;

let name = Name::parse("JOHN ALLEN Q DE LA MACDÖNALD JR").unwrap();
assert_eq!("John Allen Q. de la MacDönald, Jr.".len(), name.byte_len());

The full name, or as much of it as was preserved from the input, including given name, middle names, surname and suffix.

use human_name::Name;

let name = Name::parse("JOHN ALLEN Q DE LA MACDONALD JR").unwrap();
assert_eq!("John Allen Q. de la MacDonald, Jr.", name.display_full());

Implements a hash for a name that is always identical for two names that may be consistent according to our matching algorithm.

WARNING

This hash function is prone to collisions!

We can only use the last four alphabetical characters of the surname, because that’s all we’re guaranteed to use in the consistency test. That means if names are ASCII, we only have 19 bits of variability.
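The 19-bit figure can be checked with a short calculation, assuming an alphabet of 26 case-insensitive ASCII letters (an assumption consistent with the case-insensitive comparison described for consistent_with). The helper names below are hypothetical.

```rust
/// Number of distinct four-letter strings over a 26-letter alphabet.
fn four_letter_suffixes() -> u32 {
    26u32.pow(4)
}

/// Smallest number of bits that can distinguish `distinct_values` values,
/// i.e. ceil(log2(distinct_values)). Requires at least one value.
fn bits_needed(distinct_values: u32) -> u32 {
    assert!(distinct_values >= 1);
    u32::BITS - (distinct_values - 1).leading_zeros()
}
```

There are 26^4 = 456,976 possible four-letter suffixes, which lies between 2^18 = 262,144 and 2^19 = 524,288, so `bits_needed(four_letter_suffixes())` is 19.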

As a result, if you are working with a lot of names and you expect surnames to be similar or identical, you might be better off avoiding hash-based data structures (or using a custom hash and matching algorithm).

We can’t use more characters of the surname because we treat names as equal when one surname ends with the other and the smaller is at least four characters, to catch cases like “Iria Gayo” == “Iria del Río Gayo”.

We can’t use the first initial because we might ignore it if someone goes by a middle name or nickname, or due to transliteration.
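The constraints above can be sketched as a standalone key function. `surname_key` is a hypothetical helper, not this crate's hash: it keeps only the last four alphabetical characters of the surname, lowercased, and skips the transliteration step.

```rust
/// Hypothetical hash key: the last four alphabetical characters of the
/// surname, lowercased. Spaces and punctuation are ignored.
fn surname_key(surname: &str) -> String {
    let letters: Vec<char> = surname
        .chars()
        .filter(|c| c.is_alphabetic())
        .flat_map(|c| c.to_lowercase())
        .collect();
    // Keep at most the final four letters.
    let start = letters.len().saturating_sub(4);
    letters[start..].iter().collect()
}
```

Both "Gayo" and "del Río Gayo" map to the key "gayo", so two names that may be consistent land in the same bucket. The cost is that unrelated surnames sharing a four-letter ending (for example "Hanson" and "Johnson", both "nson") collide as well.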

Trait Implementations

Returns a copy of the value.
Performs copy-assignment from source.
Formats the value using the given formatter.

Implements a hash for a name that is always identical for two names that may be equal.

WARNING

This hash function is prone to collisions!

See docs on surname_hash for details.

Feeds this value into the given Hasher.
Feeds a slice of this type into the given Hasher.
This method tests for self and other values to be equal, and is used by ==.
This method tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

Serializes a name into parsed components.

use human_name::Name;
use rustc_serialize::json::ToJson;

let name = Name::parse("JOHN ALLEN Q MACDONALD JR").unwrap();
assert_eq!(
  r#"{"first_initial":"J","given_name":"John","middle_initials":"AQ","middle_names":"Allen","suffix":"Jr.","surname":"MacDonald"}"#,
  name.to_json().to_string()
);

Might this name represent the same person as another name?

WARNING

This is technically an invalid implementation of PartialEq because it is not transitive - “J. Doe” == “Jane Doe”, and “J. Doe” == “John Doe”, but “Jane Doe” != “John Doe”. (It is, however, symmetric and reflexive.)

Use with caution! See consistent_with docs for details.
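The non-transitivity is easy to reproduce with a toy type. `ToyName` below is a hypothetical stand-in, not this crate's Name, and its equality rule is a simplified assumption: surnames and first initials must match, and given names must match only when both are known.

```rust
/// Toy stand-in for a parsed name (not the crate's `Name` type).
#[derive(Debug, Clone, Copy)]
struct ToyName {
    initial: char,
    given: Option<&'static str>,
    surname: &'static str,
}

impl PartialEq for ToyName {
    fn eq(&self, other: &Self) -> bool {
        // Given names are compared only when both sides have one.
        let given_ok = match (self.given, other.given) {
            (Some(a), Some(b)) => a == b,
            _ => true,
        };
        self.surname == other.surname && self.initial == other.initial && given_ok
    }
}
```

With `j` = J. Doe, `jane` = Jane Doe and `john` = John Doe, both `j == jane` and `j == john` hold while `jane != john`. Code that relies on transitive equality, such as grouping or deduplicating by `==`, can therefore produce results that depend on the order in which names are seen.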
