Struct human_name::Name
pub struct Name { /* private fields */ }
Represents a parsed human name.
The only components guaranteed to be present are (what we think is) a surname and a first initial. A Name may also contain given and middle names, middle initials, and/or a generational suffix.
Construct a Name using parse:
use human_name::Name;
let name = Name::parse("Jane Doe").unwrap();
Once you have a Name, you may extract its components, convert it to JSON,
or compare it with another Name to see if they are consistent with representing
the same person (see docs on consistent_with for details).
Implementations§
impl Name
pub fn consistent_with(&self, other: &Name) -> bool
Might this name represent the same person as another name?
§Examples
use human_name::Name;
let j_doe = Name::parse("J. Doe").unwrap();
let jane_doe = Name::parse("Jane Doe").unwrap();
let john_m_doe = Name::parse("John M. Doe").unwrap();
let john_l_doe = Name::parse("John L. Doe").unwrap();
assert!(j_doe.consistent_with(&john_m_doe));
assert!(j_doe.consistent_with(&john_l_doe));
assert!(j_doe.consistent_with(&jane_doe));
assert!(j_doe.consistent_with(&j_doe));
assert!(!john_m_doe.consistent_with(&john_l_doe));
assert!(!jane_doe.consistent_with(&john_l_doe));
let zheng_he = Name::parse("Zheng He").unwrap();
let han_chars = Name::parse("鄭和").unwrap();
assert!(han_chars.consistent_with(&zheng_he));
§Defining “consistency”
Requires that all known parts are consistent, which means at minimum, the final words of the surnames match, and one ordered set of first and middle initials is a superset of the other. If given and/or middle names and/or suffixes are present in both names, they must match as well.
Transliterates everything to ASCII before comparison using the naive algorithm of unidecode (which ignores context), and ignores case, accents and combining marks.
In the case of given and middle names, allows one name to be a prefix of the other, without requiring the prefix end at a word boundary as we do with surname suffix matches, and supports matching a small number of common nicknames and nickname patterns based on the root name.
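The two simplest rules above, the ordered-initials rule and the given-name prefix rule, can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: the function names are hypothetical, "ordered superset" is read here as "one string of initials is a subsequence of the other", and transliteration, nicknames, and the surname rules are all left out.

```rust
/// Returns true if every character of `short` appears in `long`, in the same order.
fn is_ordered_subset(short: &str, long: &str) -> bool {
    let mut remaining = long.chars();
    // `any` advances `remaining`, so each match must come after the previous one.
    short.chars().all(|c| remaining.any(|l| l == c))
}

/// Hypothetical check: one string of initials contains the other, in order.
fn initials_consistent(a: &str, b: &str) -> bool {
    is_ordered_subset(a, b) || is_ordered_subset(b, a)
}

/// Hypothetical check: one given name is a case-insensitive prefix of the other.
/// The prefix does not have to end at a word boundary.
fn given_names_consistent(a: &str, b: &str) -> bool {
    let (a, b) = (a.to_lowercase(), b.to_lowercase());
    a.starts_with(b.as_str()) || b.starts_with(a.as_str())
}
```

Under this sketch, the initials "J" and "JM" are consistent while "JM" and "JL" are not, and "Jan" is consistent with "Jane" because it is a prefix of it.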
§Limitations
There will be false positives (“Jan Doe” is probably not “Jane Doe”), and false negatives (“James Hanson” might be “James Hansen”). And, of course, even identical names do not necessarily represent the same person.
Given limited information, we err on the side of false positives. This kind of matching will be most useful in cases where we already have reason to believe that a single individual’s name appears twice, and we are trying to figure out exactly where, e.g. a particular author’s index in the list of authors of a co-authored paper.
impl Name
pub fn parse(name: &str) -> Option<Name>
Parses a string representing a single person’s full name into a canonical representation.
§Examples
use human_name::Name;
let name = Name::parse("Jane Doe").unwrap();
assert_eq!("Doe", name.surname());
assert_eq!(Some("Jane"), name.given_name());
let name = Name::parse("Doe, J").unwrap();
assert_eq!("Doe", name.surname());
assert_eq!(None, name.given_name());
assert_eq!('J', name.first_initial());
let name = Name::parse("Dr. Juan Alberto T. Velasquez y Garcia III").unwrap();
assert_eq!("Velasquez y Garcia", name.surname());
assert_eq!(Some("Juan"), name.given_name());
assert_eq!(Some("AT"), name.middle_initials());
assert_eq!(Some("III"), name.generational_suffix());
assert_eq!(Some("Dr."), name.honorific_prefix());
§Supported formats
Supports a variety of formats, including prefix and postfix titles, parenthesized nicknames, initials with and without periods, and sort order (“Doe, Jane”). Makes use of heuristics based on case when applicable (e.g., “AL Doe” is parsed as “A. L. Doe”, while “Al Doe” is parsed as a given name and surname), as well as small sets of known particles, conjunctions, titles, etc.
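The case heuristic can be pictured roughly as follows. This is only a sketch of the idea under a stated assumption, namely that an all-uppercase word counts as initials only when the rest of the input shows that the author used lowercase at all. The function `looks_like_initials` is hypothetical, and the crate's actual rules are more involved.

```rust
/// Hypothetical heuristic: treat `word` as a run of initials when it is entirely
/// uppercase while the full input also contains lowercase letters.
fn looks_like_initials(word: &str, full_input: &str) -> bool {
    let word_is_upper = !word.is_empty() && word.chars().all(|c| c.is_uppercase());
    let input_has_lower = full_input.chars().any(|c| c.is_lowercase());
    word_is_upper && input_has_lower
}
```

With this rule, "AL" in "AL Doe" is read as initials, "Al" in "Al Doe" is not, and "JOHN" in the all-caps input "JOHN DOE" is not, since case carries no information there.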
§Limitations
Errs on the side of producing parse output rather than giving up, so this function is not suitable as a way of guessing whether a given string actually represents a name.
However, success requires at least an apparent surname and first initial. Single-word names cannot be parsed (you may or may not wish to assume they are given names).
Does not preserve titles (other than generational suffixes such as “III”) or nicknames. Does not handle plural forms specially: “Mr. & Mrs. John Doe” will be parsed as “John Doe”, and “Jane Doe, et al” will be parsed as “Jane Doe”.
Works best on Latin names - i.e., data from North or South America or Europe. Does not understand surname-first formats without commas: “Kim Il-sung” will be parsed as having the first name “Kim”.
Handles non-Latin unicode strings, but without any particular intelligence.
Attempts at least to fail nicely, such that either parse returns None, or calling display_full() on the parsed result returns the input, plus or minus whitespace.
Of course, there is no perfect algorithm for canonicalizing names. The goal here is to do the best we can without large statistical models.
pub fn first_initial(&self) -> char
First initial (always present)
pub fn given_name(&self) -> Option<&str>
Given name as a string, if present
use human_name::Name;
let name = Name::parse("Jane Doe").unwrap();
assert_eq!(Some("Jane"), name.given_name());
let name = Name::parse("J. Doe").unwrap();
assert_eq!(None, name.given_name());
pub fn goes_by_middle_name(&self) -> bool
Does this person use a middle name in place of their given name?
use human_name::Name;
let name = Name::parse("Jane Doe").unwrap();
assert!(!name.goes_by_middle_name());
let name = Name::parse("J. Doe").unwrap();
assert!(!name.goes_by_middle_name());
let name = Name::parse("T Boone Pickens").unwrap();
assert!(name.goes_by_middle_name());
pub fn initials(&self) -> &str
First and middle initials as a string (always present)
use human_name::Name;
let name = Name::parse("Jane Doe").unwrap();
assert_eq!("J", name.initials());
let name = Name::parse("James T. Kirk").unwrap();
assert_eq!("JT", name.initials());
pub fn middle_names(&self) -> Option<SmallVec<[&str; 3]>>
Middle names as an array of words, if present
pub fn middle_name(&self) -> Option<Cow<'_, str>>
Middle names as a string, if present
use human_name::Name;
let name = Name::parse("Jane Doe").unwrap();
assert_eq!(None, name.middle_name());
let name = Name::parse("James T. Kirk").unwrap();
assert_eq!(None, name.middle_name());
let name = Name::parse("James Tiberius Kirk").unwrap();
assert_eq!("Tiberius", name.middle_name().unwrap());
let name = Name::parse("Able Baker Charlie Delta").unwrap();
assert_eq!("Baker Charlie", name.middle_name().unwrap());
pub fn middle_initials(&self) -> Option<&str>
Middle initials as a string, if present
use human_name::Name;
let name = Name::parse("Jane Doe").unwrap();
assert_eq!(None, name.middle_initials());
let name = Name::parse("James T. Kirk").unwrap();
assert_eq!("T", name.middle_initials().unwrap());
let name = Name::parse("James Tiberius Kirk").unwrap();
assert_eq!("T", name.middle_initials().unwrap());
let name = Name::parse("Able Baker Charlie Delta").unwrap();
assert_eq!("BC", name.middle_initials().unwrap());
pub fn surname(&self) -> &str
Surname as a string (always present)
use human_name::Name;
let name = Name::parse("Jane Doe").unwrap();
assert_eq!("Doe", name.surname());
let name = Name::parse("JOHN ALLEN Q DE LA MACDONALD JR").unwrap();
assert_eq!("de la MacDonald", name.surname());
pub fn generational_suffix(&self) -> Option<&str>
Generational suffix, if present
use human_name::Name;
let name = Name::parse("Gary Payton II").unwrap();
assert_eq!(Some("Jr."), name.generational_suffix());
pub fn honorific_prefix(&self) -> Option<&str>
Honorific prefix(es), if present
use human_name::Name;
let name = Name::parse("Rev. Dr. Martin Luther King, Jr.").unwrap();
assert_eq!(Some("Rev. Dr."), name.honorific_prefix());
pub fn honorific_suffix(&self) -> Option<&str>
Honorific suffix(es), if present
use human_name::Name;
let name = Name::parse("Stephen Strange, MD").unwrap();
assert_eq!(Some("MD"), name.honorific_suffix());
pub fn display_initial_surname(&self) -> Cow<'_, str>
First initial (with period) and surname.
use human_name::Name;
let name = Name::parse("J. Doe").unwrap();
assert_eq!("J. Doe", name.display_initial_surname());
let name = Name::parse("James T. Kirk").unwrap();
assert_eq!("J. Kirk", name.display_initial_surname());
let name = Name::parse("JOHN ALLEN Q DE LA MACDONALD JR").unwrap();
assert_eq!("J. de la MacDonald", name.display_initial_surname());
pub fn display_first_last(&self) -> Cow<'_, str>
Given name and surname, if given name is known, otherwise first initial and surname.
use human_name::Name;
let name = Name::parse("J. Doe").unwrap();
assert_eq!("J. Doe", name.display_first_last());
let name = Name::parse("Jane Doe").unwrap();
assert_eq!("Jane Doe", name.display_first_last());
let name = Name::parse("James T. Kirk").unwrap();
assert_eq!("James Kirk", name.display_first_last());
let name = Name::parse("JOHN ALLEN Q DE LA MACDONALD JR").unwrap();
assert_eq!("John de la MacDonald", name.display_first_last());
pub fn byte_len(&self) -> usize
Number of bytes in the full name as UTF-8 in NFKD normal form, including spaces and punctuation.
Includes generational suffix, but does not include honorifics.
use human_name::Name;
let name = Name::parse("JOHN ALLEN Q DE LA MACDÖNALD JR").unwrap();
assert_eq!("John Allen Q. de la MacDönald, Jr.".len(), name.byte_len());
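Because the length is counted in bytes of the NFKD form, it can differ both from the number of characters and from the byte length of the same text in precomposed form. The standard-library sketch below shows the difference for a single "ö"; the helper `byte_and_char_len` and the two constants are illustrative names, not part of this crate.

```rust
/// Byte length and character (code point) count of a string slice.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// "ö" as a single precomposed code point (U+00F6).
const COMPOSED: &str = "\u{00F6}";

/// "ö" as "o" followed by a combining diaeresis (U+0308), its decomposed form.
const DECOMPOSED: &str = "o\u{0308}";
```

The precomposed form is two bytes and one code point, while the decomposed form is three bytes and two code points, so the normal form a string is in matters when comparing its `len()` against byte_len().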
pub fn display_full(&self) -> Cow<'_, str>
The full name, or as much of it as was preserved from the input, including given name, middle names, surname and generational suffix.
Includes generational suffix, but does not include honorifics.
use human_name::Name;
let name = Name::parse("DR JOHN ALLEN Q DE LA MACDONALD JR").unwrap();
assert_eq!("John Allen Q. de la MacDonald, Jr.", name.display_full());
let name = Name::parse("Air Chief Marshal Sir Harrieta ('Harry') Keōpūolani Nāhiʻenaʻena, GBE, KCB, ADC").unwrap();
assert_eq!("Harrieta Keōpūolani Nāhiʻenaʻena", name.display_full());
pub fn display_full_with_honorifics(&self) -> Cow<'_, str>
The full name, or as much of it as was preserved from the input, including given name, middle names, surname, generational suffix, and honorifics.
use human_name::Name;
let name = Name::parse("DR JOHN ALLEN Q DE LA MACDONALD JR").unwrap();
assert_eq!("Dr. John Allen Q. de la MacDonald, Jr.", name.display_full_with_honorifics());
let name = Name::parse("Air Chief Marshal Sir Harrieta ('Harry') Keōpūolani Nāhiʻenaʻena, GBE, KCB, ADC").unwrap();
assert_eq!("Air Chief Marshal Sir Harrieta Keōpūolani Nāhiʻenaʻena GBE KCB ADC", name.display_full_with_honorifics());
pub fn surname_hash(&self) -> u64
Implements a hash for a name that is always identical for two names that may be consistent according to our matching algorithm.
§WARNING
This hash function is prone to collisions!
We can only use the last four alphabetical characters of the surname, because that’s all we’re guaranteed to use in the consistency test. We also attempt to convert them to lowercase ASCII, which leaves only about 19 bits of variability.
That means if you are working with a lot of names and you expect surnames to be similar or identical, you might be better off avoiding hash-based data structures (or using a custom hash and matching algorithm).
We can’t use more characters of the surname because we treat names as equal when one surname ends with the other and the smaller is at least four characters, to catch cases like “Iria Gayo” == “Iria del Río Gayo”.
We can’t use the first initial because we might ignore it if someone goes by a middle name or nickname, or due to transliteration.
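The 19-bit figure follows from simple counting: four positions with 26 possible lowercase letters each give 26^4 = 456,976 combinations, which lies between 2^18 and 2^19. The sketch below illustrates the idea with a hypothetical `surname_key` function. It is not the crate's hash, and it assumes the surname has already been transliterated to ASCII.

```rust
/// Hypothetical key: the last four ASCII letters of a surname, lowercased.
/// Assumes the surname is already transliterated to ASCII.
fn surname_key(surname: &str) -> String {
    let mut tail: Vec<char> = surname
        .chars()
        .rev()
        .filter(|c| c.is_ascii_alphabetic())
        .take(4)
        .map(|c| c.to_ascii_lowercase())
        .collect();
    // The letters were collected back to front, so restore their original order.
    tail.reverse();
    tail.into_iter().collect()
}

/// Upper bound on the number of distinct keys: 26 letters in each of 4 positions.
fn max_distinct_keys() -> u32 {
    26u32.pow(4)
}
```

Here "Gayo" and "del Rio Gayo" map to the same key, "gayo", which is what lets surnames that differ only by a leading portion land in the same bucket, at the cost of frequent collisions between unrelated surnames with the same ending.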