cusip/lib.rs
1#![warn(missing_docs)]
2//! # cusip
3//!
4//! `cusip` provides a `CUSIP` type for working with validated Committee on Uniform Security
5//! Identification Procedures (CUSIP) identifiers as defined in [ANSI X9.6-2020 Financial Services -
6//! Committee on Uniform Security Identification Procedures Securities Identification CUSIP](https://webstore.ansi.org/standards/ascx9/ansix92020)
7//! ("The Standard").
8//!
9//! [CUSIP Global Services (CGS)](https://www.cusip.com/) has [a page describing CUSIP
10//! identifiers](https://www.cusip.com/identifiers.html).
11//!
//! A CUSIP "number" (so-called by The Standard because originally they were composed only of
//! decimal digits, but now they can also use letters) consists of 9 ASCII characters with the
//! following parts, in order (Section 3.1 "CUSIP number length" of The Standard):
//!
//! 1. A six-character uppercase alphanumeric _Issuer Number_.
//! 2. A two-character uppercase alphanumeric _Issue Number_.
//! 3. A single decimal digit representing the _Check Digit_ computed using what The Standard calls
//!    the "modulus 10 'double-add-double' technique".
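/*!

The "double-add-double" technique named in item 3 can be sketched as below. This is a minimal,
self-contained illustration of the algorithm as it is commonly described, not this crate's own
implementation (see the `checksum` module). The names `char_value` and `check_digit_sketch` are
hypothetical and are not part of the crate's API.

```rust
/// Map an uppercase ASCII alphanumeric byte to its numeric value:
/// '0'..='9' => 0..=9 and 'A'..='Z' => 10..=35. Returns None for any other byte.
fn char_value(b: u8) -> Option<u32> {
    match b {
        b'0'..=b'9' => Some(u32::from(b - b'0')),
        b'A'..=b'Z' => Some(u32::from(b - b'A') + 10),
        _ => None,
    }
}

/// Hypothetical helper: compute the check digit (as an ASCII byte) for an
/// 8-byte payload, or None if the payload has the wrong length or characters.
fn check_digit_sketch(payload: &[u8]) -> Option<u8> {
    if payload.len() != 8 {
        return None;
    }
    let mut sum = 0u32;
    for (i, &b) in payload.iter().enumerate() {
        let mut v = char_value(b)?;
        // Double the value at every second position (2nd, 4th, 6th, 8th).
        if i % 2 == 1 {
            v *= 2;
        }
        // Add the tens digit and the ones digit of the (possibly doubled) value.
        sum += v / 10 + v % 10;
    }
    // The check digit brings the total up to the next multiple of 10.
    Some(b'0' + ((10 - sum % 10) % 10) as u8)
}
```

With this sketch, the payload `09739D10` yields the check digit `0`, which agrees with the
`09739D100` example used elsewhere in these docs.

*/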
20//!
21//! Note: The Standard does not specify uppercase for the alphabetic characters but uniformly
22//! presents examples only using uppercase. Therefore this implementation treats uppercase as
23//! required for both parsing and validation, while offering a `parse_loose()` alternative that
24//! allows mixed case. There is no "loose" version of validation because of the risk of confusion
25//! if it were used to validate a set of strings -- the number of distinct string values could
26//! differ from the number of distinct CUSIP identifiers because each identifier could have multiple
27//! string representations in the set, potentially resulting in data integrity problems.
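/*!

The data-integrity concern in the note above can be illustrated with a small sketch. It assumes
the same normalization that `parse_loose()` applies (uppercase ASCII letters, then trim
whitespace); `normalize_loose` and `distinct_counts` are hypothetical helpers, not part of this
crate.

```rust
use std::collections::HashSet;

/// Hypothetical helper mirroring the normalization applied before strict parsing:
/// uppercase ASCII letters, then trim leading and trailing whitespace.
fn normalize_loose(value: &str) -> String {
    value.to_ascii_uppercase().trim().to_string()
}

/// Count the distinct raw strings and the distinct normalized identifiers in `inputs`.
fn distinct_counts(inputs: &[&str]) -> (usize, usize) {
    let raw: HashSet<&str> = inputs.iter().copied().collect();
    let normalized: HashSet<String> = inputs.iter().map(|s| normalize_loose(s)).collect();
    (raw.len(), normalized.len())
}
```

Three spellings such as `"09739D100"`, `"09739d100"` and `" 09739D100 "` count as three distinct
strings but only one distinct identifier, which is why validation stays strict.

*/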
28//!
29//! Although The Standard asserts that CUSIP numbers are not assigned using alphabetic 'I' and 'O'
30//! nor using digits '1' and '0' to avoid confusion, digits '1' and '0' are common in current
31//! real-world CUSIP numbers. A survey of a large set of values turned up none using letter 'I' or
32//! letter 'O', so it is plausible that 'I' and 'O' are indeed not used. In any case, this crate
33//! does _not_ treat any of these four character values as invalid.
34//!
35//! CUSIP number "issuance and dissemination" are managed by
36//! [CUSIP Global Services (CGS)](https://www.cusip.com/) per section B.1 "Registration Authority"
37//! of The Standard. In addition, there are provisions for privately assigned identifiers (see
38//! below).
39//!
40//! ## Usage
41//!
//! Use `CUSIP::parse()` or `CUSIP::parse_loose()` to convert a string to a validated CUSIP (the
//! free functions `cusip::parse()` and `cusip::parse_loose()` are deprecated in their favor):
//!
//! ```
//! # let some_string = "09739D100";
//! match cusip::CUSIP::parse(some_string) {
//!     Ok(cusip) => { /* ... */ }
//!     Err(err) => { /* ... */ }
//! }
//! ```
51//!
//! or take advantage of CUSIP's implementation of the `FromStr` trait, which parses loosely like
//! `CUSIP::parse_loose()`, and use the `parse()` method on the `str` type:
54//!
55//! ```
56//! # let some_string = "09739D100";
57//! let cusip: cusip::CUSIP = some_string.parse().unwrap();
58//! ```
59//!
60//! If you just want to check if a string value is in a valid CUSIP format (with the correct _Check
61//! Digit_), use `validate()`.
62//!
63//! ```
64//! # let some_string = "09739D100";
65//! let is_valid_cusip = cusip::validate(some_string);
66//! ```
67//!
68//! ## Optional features
69//!
70//! This crate supports the following optional features:
71//!
72//! * `serde` - Enables serialization and deserialization support via [serde](https://crates.io/crates/serde).
73//! * `schemars` - Enables JSON Schema generation via [schemars](https://crates.io/crates/schemars).
74//! Generates schemas that match the serde deserialization behavior.
75//!
76//! ## CUSIP
77//!
78//! Since its adoption in 1968, CUSIP has been the standard security identifier for:
79//!
80//! * United States of America
81//! * Canada
82//! * Bermuda
83//! * Cayman Islands
84//! * British Virgin Islands
85//! * Jamaica
86//!
87//! Since the introduction of the ISIN standard
88//! ([ISO 6166](https://www.iso.org/standard/78502.html)), CUSIP has been adopted as the ISIN
89//! _Security Identifier_ for many more territories in the creation of ISIN identifiers.
90//!
91//! ## Private use
92//!
93//! The CUSIP code space has allocations for both private _Issuer Numbers_ and private _Issue
94//! Numbers_.
95//!
96//! You can determine whether or not a CUSIP is intended for private use by using the
97//! `CUSIP::is_private_use()` method. A private use CUSIP is one that either `has_private_issuer()`
98//! or `is_private_issue()`. The has/is distinction is because a CUSIP represents ("is") an Issue
99//! (Security) offered by an "Issuer" (the Security "has" an Issuer).
100//!
//! ### Private Issuer Numbers
//!
//! In Section 3.2 "Issuer Number" of The Standard, "privately assigned identifiers" are defined as
//! those having an _Issuer Number_ ending in "990" through "999".
//!
//! Section C.8.1.3 "Issuer Numbers Reserved for Internal Use" of The Standard expands that set
//! with the following additional _Issuer Numbers_:
//!
//! * those ending in "99A" through "99Z"
//! * those from "990000" through "999999"
//! * those from "99000A" through "99999Z"
//!
//! Such CUSIPs are reserved for this use only, and will not be assigned by the Registration
//! Authority.
//!
//! You can use the `CUSIP::has_private_issuer()` method to detect this case.
//!
//! Note that The Standard says that in all cases a "Z" in the "5th and 6th position has been
//! reserved for use by the Canadian Depository for Securities." There are no examples given, and it
//! is not clear whether this means literally "and" ("0000ZZ005" would be reserved but "0000Z0002"
//! and "00000Z003" would not) or if it actually means "and/or" (all of "0000ZZ005", "0000Z0002" and
//! "00000Z003" would be reserved). Because this is not clear from the text of The Standard, this
//! rule is not represented in this crate.
//!
//! ### Private Issue Numbers
//!
//! In Section C.8.2.6 "Issue Numbers Reserved for Internal Use", The Standard specifies that
//! _Issue Numbers_ "90" through "99" and "9A" through "9Y" are reserved for private use,
//! potentially in combination with non-private-use _Issuer Numbers_.
//!
//! You can use the `CUSIP::is_private_issue()` method to detect this case.
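/*!

The private-use ranges described in the two subsections above can be sketched as plain byte
checks. This is an illustration under the ranges quoted in this section only; `issuer_is_private`
and `issue_is_private` are hypothetical names, and the "Z" rule for the Canadian Depository for
Securities is deliberately left out, as it is in this crate.

```rust
/// Hypothetical check on a 6-byte Issuer Number, following the ranges listed above.
fn issuer_is_private(issuer: &[u8; 6]) -> bool {
    // Ends in "990".."999" or "99A".."99Z": the 4th and 5th characters are both '9'.
    let ends_in_99x = issuer[3] == b'9' && issuer[4] == b'9';
    // "990000".."999999" and "99000A".."99999Z": starts with "99", then three digits.
    let starts_with_99 = issuer[0] == b'9'
        && issuer[1] == b'9'
        && issuer[2..5].iter().all(u8::is_ascii_digit);
    ends_in_99x || starts_with_99
}

/// Hypothetical check on a 2-byte Issue Number: "90".."99" or "9A".."9Y".
fn issue_is_private(issue: &[u8; 2]) -> bool {
    issue[0] == b'9' && (issue[1].is_ascii_digit() || (b'A'..=b'Y').contains(&issue[1]))
}
```

A CUSIP is then intended for private use when either check holds for its respective part.

*/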
130//!
131//! ## CUSIP International Numbering System (CINS)
132//!
133//! While the primary motivation for the creation of the CUSIP standard was representation of U.S.
134//! and Canadian securities, it was extended in 1989 for non-North American issues through definition
135//! of a CUSIP International Numbering System (CINS). On 1991-01-01 CINS became the only allowed way
136//! of issuing CUSIP identifiers for non-North American securities.
137//!
138//! A CUSIP with a letter in the first position is a CINS number, and that letter identifies the
139//! country or geographic region of the _Issuer_.
140//!
//! Use the `CUSIP::is_cins()` method to discriminate between CINS and conventional CUSIPs. The
//! `CUSIP::cins_country_code()` method, which returns the CINS Country Code as an `Option<char>`,
//! is deprecated in favor of `CINS::country_code`.
//!
//! This crate provides a `CINS` type for working with CINS identifiers. You can convert a `CUSIP`
//! to a `CINS` using `CINS::new`, `TryFrom<&CUSIP>`, or `CUSIP::as_cins`. Once you have a `CINS`,
//! you can access the CINS _Country Code_ using `CINS::country_code`, and the (one character
//! shorter) CINS _Issuer Number_ using `CINS::issuer_num`. You can also get the _Issue Number_
//! via `CINS::issue_num`, though it's the same as for the CUSIP. See the `CINS` documentation for
//! more details.
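/*!

To make the layout of a CINS concrete, the sketch below splits a 9-character string into the
parts named above. It is only an illustration: `split_cins` is a hypothetical function, it does
not verify the check digit, and it is not how the `CINS` type is implemented.

```rust
/// Hypothetical split of a 9-character CINS string into
/// (country code, CINS issuer number, issue number, check digit).
/// Returns None if the string is not 9 ASCII characters starting with an uppercase letter.
fn split_cins(cins: &str) -> Option<(char, &str, &str, char)> {
    if cins.len() != 9 || !cins.is_ascii() {
        return None;
    }
    let country = cins.chars().next()?;
    if !country.is_ascii_uppercase() {
        return None;
    }
    let check = cins.chars().last()?;
    // Byte slicing is safe here because the string was confirmed to be ASCII.
    Some((country, &cins[1..6], &cins[6..8], check))
}
```

For `"S08000AA9"` this gives country code `S`, CINS issuer number `08000`, issue number `AA` and
check digit `9`, while a conventional CUSIP such as `"037833100"` gives `None`.

*/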
150//!
151//! The country codes are:
152//!
153//! |code|region |code|region |code|region |code|region |
154//! |----|--------------|----|-----------|----|-------------|----|---------------|
155//! |`A` |Austria |`H` |Switzerland|`O` |(Unused) |`V` |Africa - Other |
156//! |`B` |Belgium |`I` |(Unused) |`P` |South America|`W` |Sweden |
157//! |`C` |Canada |`J` |Japan |`Q` |Australia |`X` |Europe - Other |
158//! |`D` |Germany |`K` |Denmark |`R` |Norway |`Y` |Asia |
159//! |`E` |Spain |`L` |Luxembourg |`S` |South Africa |`Z` |(Unused) |
160//! |`F` |France |`M` |Mid-East |`T` |Italy | | |
161//! |`G` |United Kingdom|`N` |Netherlands|`U` |United States| | |
162//!
//! Even though country codes `I`, `O` and `Z` are unused, this crate reports CUSIPs starting
//! with those letters as being in the CINS format via `CUSIP::is_cins()` and returns those letters
//! from `CINS::country_code` because The Standard says CINS numbers are those CUSIPs starting
//! with a letter. If you care about the distinction between assigned and unused codes, use
//! `CINS::is_base()` and `CINS::is_extended()` (the `CUSIP::is_cins_base()` and
//! `CUSIP::is_cins_extended()` methods are deprecated).
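/*!

The three-way distinction described above (conventional, base CINS, extended CINS) can be
sketched as a match on the first character. The `FirstChar` enum and `classify_first_char`
function are hypothetical names used only for this illustration.

```rust
/// Hypothetical classification of the first character of a CUSIP.
#[derive(Debug, PartialEq, Eq)]
enum FirstChar {
    /// A digit: a conventional (non-CINS) CUSIP.
    Conventional,
    /// A letter with an assigned country or region in the table above.
    CinsBase(char),
    /// One of the unused letters 'I', 'O' or 'Z'.
    CinsExtended(char),
}

/// Classify the first character, or return None if it is not an ASCII digit
/// or an uppercase ASCII letter.
fn classify_first_char(c: char) -> Option<FirstChar> {
    match c {
        '0'..='9' => Some(FirstChar::Conventional),
        // The unused letters must be matched before the general letter range.
        'I' | 'O' | 'Z' => Some(FirstChar::CinsExtended(c)),
        'A'..='Z' => Some(FirstChar::CinsBase(c)),
        _ => None,
    }
}
```

Under this sketch `'S'` (South Africa) is a base code, `'Z'` is an extended code, and `'0'`
marks a conventional CUSIP.

*/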
168//!
169//! See section C.7.2 "Non-North American Issues -- CUSIP International Numbering System" of The
170//! Standard.
171//!
172//! ## Private Placement Number (PPN)
173//!
//! In Section C.7.2 "Private Placements", The Standard defines three non-alphanumeric character
//! values to support a special use for the "PPN System". They are '`*`' (value 36), '`@`'
//! (value 37) and '`#`' (value 38) (see Section A.3 "Treatment of Alphabetic Characters").
178//!
179//! CUSIPs using these extended characters are not supported by this crate because the extended
180//! characters are not supported by ISINs, and CUSIPs are incorporated as the _Security Identifier_
181//! for ISINs for certain _Country Codes_.
182//!
183//! ## Related crates
184//!
185//! This crate is part of the Financial Identifiers series:
186//!
187//! * [CIK](https://crates.io/crates/cik): Central Index Key (SEC EDGAR)
188//! * [CUSIP](https://crates.io/crates/cusip): Committee on Uniform Security Identification Procedures (ANSI X9.6-2020)
189//! * [ISIN](https://crates.io/crates/isin): International Securities Identification Number (ISO 6166:2021)
190//! * [LEI](https://crates.io/crates/lei): Legal Entity Identifier (ISO 17442:2020)
191//!
192
193use std::fmt;
194use std::str::from_utf8_unchecked;
195use std::str::FromStr;
196
197pub mod checksum;
198
199use checksum::checksum_table;
200
201pub mod error;
202pub use error::CUSIPError;
203
204#[cfg(feature = "schemars")]
205pub mod schemars;
206
/// Compute the _Check Digit_ for a byte slice, returned as an ASCII digit byte. No attempt is made
/// to ensure the input is in the CUSIP payload format or length. If an illegal character (not an
/// ASCII digit and not an ASCII uppercase letter) is encountered, this function will panic.
pub fn compute_check_digit(s: &[u8]) -> u8 {
    let sum = checksum_table(s);
    b'0' + sum
}
214
/// Check whether or not the passed _Issuer Number_ has a valid format.
///
/// Panics if `num` is not exactly 6 bytes long; callers are expected to check the length first.
fn validate_issuer_num_format(num: &[u8]) -> Result<(), CUSIPError> {
    if num.len() != 6 {
        panic!("Expected 6 bytes for Issuer Num, but got {}", num.len());
    }

    for b in num {
        // `is_ascii_uppercase()` is only true for ASCII letters, so no separate alphabetic check.
        if !(b.is_ascii_digit() || b.is_ascii_uppercase()) {
            let mut id_copy: [u8; 6] = [0; 6];
            id_copy.copy_from_slice(num);
            return Err(CUSIPError::InvalidIssuerNum { was: id_copy });
        }
    }
    Ok(())
}
230
/// Check whether or not the passed _Issue Number_ has a valid format.
///
/// Panics if `num` is not exactly 2 bytes long; callers are expected to check the length first.
fn validate_issue_num_format(num: &[u8]) -> Result<(), CUSIPError> {
    if num.len() != 2 {
        panic!("Expected 2 bytes for Issue Num, but got {}", num.len());
    }

    for b in num {
        // `is_ascii_uppercase()` is only true for ASCII letters, so no separate alphabetic check.
        if !(b.is_ascii_digit() || b.is_ascii_uppercase()) {
            let mut id_copy: [u8; 2] = [0; 2];
            id_copy.copy_from_slice(num);
            return Err(CUSIPError::InvalidIssueNum { was: id_copy });
        }
    }
    Ok(())
}
246
247/// Check whether or not the passed _Check Digit_ has a valid format.
248fn validate_check_digit_format(cd: u8) -> Result<(), CUSIPError> {
249 if !cd.is_ascii_digit() {
250 Err(CUSIPError::InvalidCheckDigit { was: cd })
251 } else {
252 Ok(())
253 }
254}
255
256/// Parse a string to a valid CUSIP or an error, requiring the string to already be only
257/// uppercase alphanumerics with no leading or trailing whitespace in addition to being the
258/// right length and format.
259#[deprecated(note = "Use CUSIP::parse instead.")]
260#[inline]
261pub fn parse(value: &str) -> Result<CUSIP, CUSIPError> {
262 CUSIP::parse(value)
263}
264
/// Parse a string to a valid CUSIP or an error, allowing the string to contain leading
/// or trailing whitespace and/or lowercase letters as long as it is otherwise the right length
/// and format.
268#[deprecated(note = "Use CUSIP::parse_loose instead.")]
269#[inline]
270pub fn parse_loose(value: &str) -> Result<CUSIP, CUSIPError> {
271 CUSIP::parse_loose(value)
272}
273
274/// Build a CUSIP from a _Payload_ (an already-concatenated _Issuer Number_ and _Issue Number_). The
275/// _Check Digit_ is automatically computed.
276pub fn build_from_payload(payload: &str) -> Result<CUSIP, CUSIPError> {
277 if payload.len() != 8 {
278 return Err(CUSIPError::InvalidPayloadLength { was: payload.len() });
279 }
280 let b = &payload.as_bytes()[0..8];
281
282 let issuer_num = &b[0..6];
283 validate_issuer_num_format(issuer_num)?;
284
285 let issue_num = &b[6..8];
286 validate_issue_num_format(issue_num)?;
287
288 let mut bb = [0u8; 9];
289
290 bb[0..8].copy_from_slice(b);
291 bb[8] = compute_check_digit(b);
292
293 Ok(CUSIP(bb))
294}
295
296/// Build a CUSIP from its parts: an _Issuer Number_ and an _Issue Number_. The _Check Digit_ is
297/// automatically computed.
298pub fn build_from_parts(issuer_num: &str, issue_num: &str) -> Result<CUSIP, CUSIPError> {
299 if issuer_num.len() != 6 {
300 return Err(CUSIPError::InvalidIssuerNumLength {
301 was: issuer_num.len(),
302 });
303 }
304 let issuer_num: &[u8] = &issuer_num.as_bytes()[0..6];
305 validate_issuer_num_format(issuer_num)?;
306
307 if issue_num.len() != 2 {
308 return Err(CUSIPError::InvalidIssueNumLength {
309 was: issue_num.len(),
310 });
311 }
312 let issue_num: &[u8] = &issue_num.as_bytes()[0..2];
313 validate_issue_num_format(issue_num)?;
314
315 let mut bb = [0u8; 9];
316
317 bb[0..6].copy_from_slice(issuer_num);
318 bb[6..8].copy_from_slice(issue_num);
319 bb[8] = compute_check_digit(&bb[0..8]);
320
321 Ok(CUSIP(bb))
322}
323
324/// Test whether or not the passed string is in valid CUSIP format, without producing a CUSIP struct
325/// value.
326pub fn validate(value: &str) -> bool {
327 if value.len() != 9 {
328 return false;
329 }
330
331 // We make the preliminary assumption that the string is pure ASCII, so we work with the
332 // underlying bytes. If there is Unicode in the string, the bytes will be outside the
333 // allowed range and format validations will fail.
334
335 let b = value.as_bytes();
336
337 // We slice out the three fields and validate their formats.
338
339 let issuer_num: &[u8] = &b[0..6];
340 if validate_issuer_num_format(issuer_num).is_err() {
341 return false;
342 }
343
344 let issue_num: &[u8] = &b[6..8];
345 if validate_issue_num_format(issue_num).is_err() {
346 return false;
347 }
348
349 let check_digit = b[8];
350 if validate_check_digit_format(check_digit).is_err() {
351 return false;
352 }
353
    let payload = &b[0..8];

    let computed_check_digit = compute_check_digit(payload);

    // The string is valid only if the supplied check digit matches the computed one.
    check_digit == computed_check_digit
361}
362
363/// Returns true if this CUSIP number is actually a CUSIP International Numbering System
364/// (CINS) number, false otherwise (i.e., that it has a letter as the first character of its
365/// _issuer number_). See also `is_cins_base()` and `is_cins_extended()`.
366fn is_cins(byte: u8) -> bool {
367 match byte {
368 (b'0'..=b'9') => false,
369 (b'A'..=b'Z') => true,
370 x => panic!("It should not be possible to have a non-ASCII-alphanumeric value here: {x:?}"),
371 }
372}
373
374/// Returns true if this CUSIP identifier is actually a CUSIP International Numbering System
375/// (CINS) identifier (with the further restriction that it *does not* use 'I', 'O' or 'Z' as
376/// its country code), false otherwise. See also `is_cins()` and `is_cins_extended()`.
377fn is_cins_base(byte: u8) -> bool {
378 match byte {
379 (b'0'..=b'9') => false,
380 (b'A'..=b'H') => true,
381 b'I' => false,
382 (b'J'..=b'N') => true,
383 b'O' => false,
384 (b'P'..=b'Y') => true,
385 b'Z' => false,
386 x => panic!("It should not be possible to have a non-ASCII-alphanumeric value here: {x:?}"),
387 }
388}
389
390/// Returns true if this CUSIP identifier is actually a CUSIP International Numbering System
391/// (CINS) identifier (with the further restriction that it *does* use 'I', 'O' or 'Z' as its
392/// country code), false otherwise.
393fn is_cins_extended(byte: u8) -> bool {
394 match byte {
395 (b'0'..=b'9') => false,
396 (b'A'..=b'H') => false,
397 b'I' => true,
398 (b'J'..=b'N') => false,
399 b'O' => true,
400 (b'P'..=b'Y') => false,
401 b'Z' => true,
402 x => panic!("It should not be possible to have a non-ASCII-alphanumeric value here: {x:?}"),
403 }
404}
405
406/// Returns Some(c) containing the first character of the CUSIP if it is actually a CUSIP
407/// International Numbering System (CINS) identifier, None otherwise.
408fn cins_country_code(byte: u8) -> Option<char> {
409 match byte {
410 (b'0'..=b'9') => None,
411 x @ (b'A'..=b'Z') => Some(x as char),
412 x => panic!("It should not be possible to have a non-ASCII-alphanumeric value here: {x:?}"),
413 }
414}
415
416#[doc = include_str!("../README.md")]
417#[cfg(doctest)]
418pub struct ReadmeDoctests;
419
420/// A CUSIP in confirmed valid format.
421///
422/// You cannot construct a CUSIP value manually. This does not compile:
423///
424/// ```compile_fail
425/// use cusip;
426/// let cannot_construct = cusip::CUSIP([0_u8; 9]);
427/// ```
428#[derive(Eq, PartialEq, Ord, PartialOrd, Clone, Copy, Hash)]
429#[repr(transparent)]
430#[allow(clippy::upper_case_acronyms)]
431pub struct CUSIP([u8; 9]);
432
433impl AsRef<str> for CUSIP {
434 fn as_ref(&self) -> &str {
435 unsafe { from_utf8_unchecked(&self.0[..]) } // This is safe because we know it is ASCII
436 }
437}
438
439impl fmt::Display for CUSIP {
440 fn fmt(&self, f: &mut fmt::Formatter<'_>) -> std::fmt::Result {
441 let temp = unsafe { from_utf8_unchecked(self.as_bytes()) }; // This is safe because we know it is ASCII
442 write!(f, "{temp}")
443 }
444}
445
446impl fmt::Debug for CUSIP {
447 fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
448 let temp = unsafe { from_utf8_unchecked(self.as_bytes()) }; // This is safe because we know it is ASCII
449 write!(f, "CUSIP({temp})")
450 }
451}
452
453#[cfg(feature = "serde")]
454impl<'de> serde::Deserialize<'de> for CUSIP {
455 fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
456 where
457 D: serde::Deserializer<'de>,
458 {
459 struct Visitor;
460
461 impl serde::de::Visitor<'_> for Visitor {
462 type Value = CUSIP;
463
464 fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
465 formatter.write_str("a CUSIP")
466 }
467
468 fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
469 where
470 E: serde::de::Error,
471 {
472 CUSIP::parse(v).map_err(E::custom)
473 }
474 }
475
476 deserializer.deserialize_str(Visitor)
477 }
478}
479
480#[cfg(feature = "serde")]
481impl serde::Serialize for CUSIP {
482 fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
483 where
484 S: serde::Serializer,
485 {
486 serializer.serialize_str(self.as_ref())
487 }
488}
489
490impl FromStr for CUSIP {
491 type Err = CUSIPError;
492
493 fn from_str(s: &str) -> Result<Self, Self::Err> {
494 Self::parse_loose(s)
495 }
496}
497
498impl CUSIP {
    /// Constructs a `CUSIP` from a byte slice of length 9.
    ///
    /// The slice must contain only ASCII digits and uppercase ASCII letters.
    /// The first 8 bytes represent the issuer and issue numbers,
    /// and the 9th byte is the check digit, which must be an ASCII digit.
    ///
    /// # Errors
    ///
    /// Returns `CUSIPError` if the byte slice is not a valid CUSIP.
    ///
    /// # Examples
    ///
    /// ```
    /// use cusip::CUSIP;
    ///
    /// let bytes = *b"037833100";
    /// let cusip = CUSIP::from_bytes(&bytes).unwrap();
    /// assert_eq!(cusip.to_string(), "037833100");
    ///
    /// let invalid_bytes = *b"invalid!!";
    /// assert!(CUSIP::from_bytes(&invalid_bytes).is_err());
    /// ```
521 pub fn from_bytes(bytes: &[u8]) -> Result<Self, CUSIPError> {
522 if bytes.len() != 9 {
523 return Err(CUSIPError::InvalidCUSIPLength { was: bytes.len() });
524 }
525
526 // We slice out the three fields and validate their formats.
527
528 let issuer_num: &[u8] = &bytes[0..6];
529 validate_issuer_num_format(issuer_num)?;
530
531 let issue_num: &[u8] = &bytes[6..8];
532 validate_issue_num_format(issue_num)?;
533
534 let cd = bytes[8];
535 validate_check_digit_format(cd)?;
536
537 // Now, we need to compute the correct _Check Digit_ value from the "payload" (everything except
538 // the _Check Digit_).
539
540 let payload = &bytes[0..8];
541
542 let computed_check_digit = compute_check_digit(payload);
543
544 let incorrect_check_digit = cd != computed_check_digit;
545 if incorrect_check_digit {
546 return Err(CUSIPError::IncorrectCheckDigit {
547 was: cd,
548 expected: computed_check_digit,
549 });
550 }
551
552 let mut bb = [0u8; 9];
553 bb.copy_from_slice(bytes);
554 Ok(CUSIP(bb))
555 }
556
557 /// Parse a string to a valid CUSIP or an error, requiring the string to already be only
558 /// uppercase alphanumerics with no leading or trailing whitespace in addition to being the
559 /// right length and format.
560 pub fn parse(value: &str) -> Result<CUSIP, CUSIPError> {
561 let bytes = value.as_bytes();
562
563 Self::from_bytes(bytes)
564 }
565
    /// Parse a string to a valid CUSIP or an error, allowing the string to contain leading
    /// or trailing whitespace and/or lowercase letters as long as it is otherwise the right length
    /// and format.
569 #[inline]
570 pub fn parse_loose(value: &str) -> Result<CUSIP, CUSIPError> {
571 let uc = value.to_ascii_uppercase();
572 let temp = uc.trim();
573 Self::parse(temp)
574 }
575
576 /// Internal convenience function for treating the ASCII characters as a byte-array slice.
577 fn as_bytes(&self) -> &[u8] {
578 &self.0[..]
579 }
580
    /// Returns a `CINS` view borrowing this `CUSIP`, or `None` if this `CUSIP` is not a
    /// CINS identifier.
583 ///
584 /// # Examples
585 ///
586 /// ```
587 /// use cusip::{CUSIP, CINS};
588 ///
589 /// let cusip = CUSIP::parse("S08000AA9").unwrap();
590 /// if let Some(cins) = cusip.as_cins() {
591 /// assert_eq!(cins.country_code(), 'S');
592 /// assert_eq!(cins.issuer_num(), "08000");
593 /// } else {
594 /// println!("Not a CINS");
595 /// }
596 ///
597 /// let non_cins_cusip = CUSIP::parse("037833100").unwrap();
598 /// assert!(non_cins_cusip.as_cins().is_none());
599 /// ```
600 pub fn as_cins(&self) -> Option<CINS> {
601 CINS::new(self)
602 }
603
604 /// Returns true if this CUSIP number is actually a CUSIP International Numbering System
605 /// (CINS) number, false otherwise (i.e., that it has a letter as the first character of its
606 /// _issuer number_). See also `is_cins_base()` and `is_cins_extended()`.
607 pub fn is_cins(&self) -> bool {
608 is_cins(self.as_bytes()[0])
609 }
610
611 /// Returns true if this CUSIP identifier is actually a CUSIP International Numbering System
612 /// (CINS) identifier (with the further restriction that it *does not* use 'I', 'O' or 'Z' as
613 /// its country code), false otherwise. See also `is_cins()` and `is_cins_extended()`.
    #[deprecated(note = "Use CUSIP::as_cins and CINS::is_base.")]
615 pub fn is_cins_base(&self) -> bool {
616 is_cins_base(self.as_bytes()[0])
617 }
618
619 /// Returns true if this CUSIP identifier is actually a CUSIP International Numbering System
620 /// (CINS) identifier (with the further restriction that it *does* use 'I', 'O' or 'Z' as its
621 /// country code), false otherwise.
    #[deprecated(note = "Use CUSIP::as_cins and CINS::is_extended.")]
623 pub fn is_cins_extended(&self) -> bool {
624 is_cins_extended(self.as_bytes()[0])
625 }
626
627 /// Returns Some(c) containing the first character of the CUSIP if it is actually a CUSIP
628 /// International Numbering System (CINS) identifier, None otherwise.
629 #[deprecated(note = "Use CUSIP::as_cins and CINS::country_code.")]
630 pub fn cins_country_code(&self) -> Option<char> {
631 cins_country_code(self.as_bytes()[0])
632 }
633
634 /// Return just the _Issuer Number_ portion of the CUSIP.
635 pub fn issuer_num(&self) -> &str {
636 unsafe { from_utf8_unchecked(&self.as_bytes()[0..6]) } // This is safe because we know it is ASCII
637 }
638
639 /// Returns true if the _Issuer Number_ is reserved for private use.
640 pub fn has_private_issuer(&self) -> bool {
641 let bs = self.as_bytes();
642
643 // "???99?"
644 let case1 = bs[3] == b'9' && bs[4] == b'9';
645
646 // "99000?" to "99999?"
647 let case2 = bs[0] == b'9'
648 && bs[1] == b'9'
649 && (bs[2].is_ascii_digit())
650 && (bs[3].is_ascii_digit())
651 && (bs[4].is_ascii_digit());
652
653 case1 || case2
654 }
655
656 /// Return just the _Issue Number_ portion of the CUSIP.
657 pub fn issue_num(&self) -> &str {
658 unsafe { from_utf8_unchecked(&self.as_bytes()[6..8]) } // This is safe because we know it is ASCII
659 }
660
661 /// Returns true if the _Issue Number_ is reserved for private use.
662 pub fn is_private_issue(&self) -> bool {
663 let bs = self.as_bytes();
664 let nine_tens = bs[6] == b'9';
665 let digit_ones = bs[7].is_ascii_digit();
666 let letter_ones = (b'A'..=b'Y').contains(&bs[7]);
667 nine_tens && (digit_ones || letter_ones)
668 }
669
670 /// Returns true if the CUSIP is reserved for private use (i.e., either it has a private issuer
671 /// or it is a private issue).
672 pub fn is_private_use(&self) -> bool {
673 self.has_private_issuer() || self.is_private_issue()
674 }
675
676 /// Return the _Payload_ — everything except the _Check Digit_.
677 pub fn payload(&self) -> &str {
678 unsafe { from_utf8_unchecked(&self.as_bytes()[0..8]) } // This is safe because we know it is ASCII
679 }
680
681 /// Return just the _Check Digit_ portion of the CUSIP.
682 pub fn check_digit(&self) -> char {
683 self.as_bytes()[8] as char
684 }
685
686 /// Validates a string as a valid CUSIP without constructing an instance.
687 ///
688 /// This method performs the same validation as [`parse`](CUSIP::parse) but returns
689 /// a boolean result instead of constructing a CUSIP instance. It validates the format,
690 /// length, and check digit of the input string.
691 ///
692 /// # Examples
693 ///
694 /// ```
695 /// use cusip::CUSIP;
696 ///
697 /// assert!(CUSIP::validate("037833100")); // Apple Inc
698 /// assert!(CUSIP::validate("09739D100")); // Boise Cascade
699 /// assert!(!CUSIP::validate("invalid")); // Invalid format
700 /// assert!(!CUSIP::validate("037833101")); // Wrong check digit
701 /// ```
702 pub fn validate(s: &str) -> bool {
703 crate::validate(s)
704 }
705
706 /// Returns the complete CUSIP identifier as a string.
707 ///
708 /// This method returns the full 9-character CUSIP identifier including
709 /// the issuer number, issue number, and check digit.
710 ///
711 /// # Examples
712 ///
713 /// ```
714 /// use cusip::CUSIP;
715 ///
716 /// let cusip = CUSIP::parse("037833100").unwrap();
717 /// assert_eq!(cusip.value(), "037833100");
718 /// ```
719 pub fn value(&self) -> &str {
720 self.as_ref()
721 }
722}
723
724/// A CINS (CUSIP International Numbering System) identifier.
725///
726/// CINS is a subset of CUSIP used for international securities.
727/// It is distinguished by having a letter as the first character.
728///
729/// # Creating CINS instances
730///
731/// There are several ways to create a `CINS` instance from a `CUSIP`:
732///
733/// 1. Using `CINS::new`:
734///
735/// ```
736/// use cusip::{CUSIP, CINS};
737///
738/// let cusip = CUSIP::parse("S08000AA9").unwrap();
739/// if let Some(cins) = CINS::new(&cusip) {
740/// println!("CINS: {}", cins);
741/// } else {
742/// println!("Not a valid CINS");
743/// }
744/// ```
745///
746/// 2. Using `TryFrom<&CUSIP>`:
747///
748/// ```
749/// use cusip::{CUSIP, CINS};
750/// use std::convert::TryFrom;
751///
752/// let cusip = CUSIP::parse("S08000AA9").unwrap();
753/// match CINS::try_from(&cusip) {
754/// Ok(cins) => println!("CINS: {}", cins),
755/// Err(err) => println!("Error: {}", err),
756/// }
757/// ```
758///
759/// 3. Using `CUSIP::as_cins`:
760///
761/// ```
762/// use cusip::{CUSIP, CINS};
763///
764/// let cusip = CUSIP::parse("S08000AA9").unwrap();
765/// if let Some(cins) = cusip.as_cins() {
766/// println!("CINS: {}", cins);
767/// } else {
768/// println!("Not a valid CINS");
769/// }
770/// ```
771///
772/// # Accessing the underlying CUSIP
773///
774/// You can call `as_cusip` on a `CINS` instance to access the underlying `CUSIP`:
775///
776/// ```
777/// use cusip::{CUSIP, CINS};
778///
779/// let cusip = CUSIP::parse("S08000AA9").unwrap();
780/// let cins = CINS::new(&cusip).unwrap();
781/// println!("CUSIP: {}", cins.as_cusip());
782/// ```
783#[derive(Clone, Eq, PartialEq, Ord, PartialOrd, Hash)]
784#[allow(clippy::upper_case_acronyms)]
785pub struct CINS<'a>(&'a CUSIP);
786
787impl fmt::Display for CINS<'_> {
788 fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
789 self.0.fmt(f)
790 }
791}
792
793impl fmt::Debug for CINS<'_> {
794 fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
795 write!(f, "CINS({})", self.0) // The wrapped CUSIP is written as a string not in debug form
796 }
797}
798
799impl<'a> TryFrom<&'a CUSIP> for CINS<'a> {
800 type Error = &'static str;
801
802 fn try_from(cusip: &'a CUSIP) -> Result<Self, Self::Error> {
803 CINS::new(cusip).ok_or("Not a valid CINS")
804 }
805}
806
807impl<'a> CINS<'a> {
808 /// Constructs a new `CINS` from a reference to a `CUSIP`.
809 ///
810 /// Returns `Some(CINS)` if the given `CUSIP` is a valid CINS identifier,
811 /// i.e., its first character is a letter (A-Z). Otherwise, returns `None`.
812 ///
813 /// # Examples
814 ///
815 /// ```
816 /// use cusip::{CUSIP, CINS};
817 ///
818 /// let cusip = CUSIP::parse("S08000AA9").unwrap();
819 /// let cins = CINS::new(&cusip).unwrap();
820 ///
821 /// let non_cins_cusip = CUSIP::parse("037833100").unwrap();
822 /// assert!(CINS::new(&non_cins_cusip).is_none());
823 /// ```
824 pub fn new(cusip: &'a CUSIP) -> Option<Self> {
825 if is_cins(cusip.as_bytes()[0]) {
826 Some(CINS(cusip))
827 } else {
828 None
829 }
830 }
831
832 /// Returns a reference to the underlying `CUSIP`.
833 ///
834 /// # Examples
835 ///
836 /// ```
837 /// use cusip::{CUSIP, CINS};
838 ///
839 /// let cusip = CUSIP::parse("S08000AA9").unwrap();
840 /// let cins = CINS::new(&cusip).unwrap();
841 /// assert_eq!(cins.as_cusip().to_string(), "S08000AA9");
842 /// ```
843 pub fn as_cusip(&self) -> &CUSIP {
844 self.0
845 }
846
847 /// Returns the CINS country code.
848 ///
849 /// The country code is the first character of the CINS identifier,
850 /// which is always a letter (A-Z).
851 ///
852 /// # Examples
853 ///
854 /// ```
855 /// use cusip::{CUSIP, CINS};
856 ///
857 /// let cusip = CUSIP::parse("S08000AA9").unwrap();
858 /// let cins = CINS::new(&cusip).unwrap();
859 /// assert_eq!(cins.country_code(), 'S');
860 /// ```
861 pub fn country_code(&self) -> char {
862 self.0.as_bytes()[0] as char
863 }
864
    /// Returns true if this CINS identifier uses an assigned country code, i.e., its country code
    /// is *not* one of the unused letters 'I', 'O' or 'Z'; false otherwise. See also
    /// `CUSIP::is_cins()` and `is_extended()`.
868 pub fn is_base(&self) -> bool {
869 is_cins_base(self.0.as_bytes()[0])
870 }
871
    /// Returns true if this CINS identifier uses one of the unused letters 'I', 'O' or 'Z' as its
    /// country code, false otherwise. See also `is_base()`.
875 pub fn is_extended(&self) -> bool {
876 is_cins_extended(self.0.as_bytes()[0])
877 }
878
    /// Returns the CINS issuer number.
    ///
    /// The issuer number is the 5 characters following the country code
    /// in the CINS identifier.
    ///
    /// # Examples
    ///
    /// ```
    /// use cusip::{CUSIP, CINS};
    ///
    /// let cusip = CUSIP::parse("S08000AA9").unwrap();
    /// let cins = CINS::new(&cusip).unwrap();
    /// assert_eq!(cins.issuer_num(), "08000");
    /// ```
    pub fn issuer_num(&self) -> &str {
        // SAFETY: the underlying CUSIP is known to be ASCII, so this slice is valid UTF-8.
        unsafe { from_utf8_unchecked(&self.0.as_bytes()[1..6]) }
    }

    /// Returns just the _Issue Number_ portion of the CINS.
    pub fn issue_num(&self) -> &str {
        // SAFETY: the underlying CUSIP is known to be ASCII, so this slice is valid UTF-8.
        unsafe { from_utf8_unchecked(&self.0.as_bytes()[6..8]) }
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use proptest::prelude::*;

    #[test]
    fn parse_cusip_for_bcc_strict() {
        match CUSIP::parse("09739D100") {
            Ok(cusip) => {
                assert_eq!(cusip.to_string(), "09739D100");
                assert_eq!(cusip.issuer_num(), "09739D");
                assert_eq!(cusip.issue_num(), "10");
                assert_eq!(cusip.check_digit(), '0');
                assert!(!cusip.is_cins());
            }
            Err(err) => panic!("Did not expect parsing to fail: {}", err),
        }
    }

    #[test]
    fn parse_cusip_for_bcc_loose() {
        match CUSIP::parse_loose("\t09739d100    ") {
            Ok(cusip) => {
                assert_eq!(cusip.to_string(), "09739D100");
                assert_eq!(cusip.issuer_num(), "09739D");
                assert_eq!(cusip.issue_num(), "10");
                assert_eq!(cusip.check_digit(), '0');
                assert!(!cusip.is_cins());
            }
            Err(err) => panic!("Did not expect parsing to fail: {}", err),
        }
    }

    #[test]
    fn validate_cusip_for_bcc() {
        // Boise Cascade
        assert!(validate("09739D100"))
    }

    #[test]
    fn validate_cusip_for_dfs() {
        // Discover Financial Services
        assert!(validate("254709108"))
    }

    #[test]
    fn parse_cins() {
        match CUSIP::parse("S08000AA9") {
            Ok(cusip) => {
                assert_eq!(cusip.to_string(), "S08000AA9");
                assert_eq!(cusip.issuer_num(), "S08000");
                assert_eq!(cusip.issue_num(), "AA");
                assert_eq!(cusip.check_digit(), '9');
                assert!(cusip.is_cins());
            }
            Err(err) => panic!("Did not expect parsing to fail: {}", err),
        }
    }

    /// This test case appears on page 3 of ANSI X9.6-2020, in the section "Annex A (Normative):
    /// Modulus 10 Double-Add-Double Technique".
    #[test]
    fn parse_example_from_standard() {
        match CUSIP::parse("837649128") {
            Ok(cusip) => {
                assert_eq!(cusip.to_string(), "837649128");
                assert_eq!(cusip.issuer_num(), "837649");
                assert_eq!(cusip.issue_num(), "12");
                assert_eq!(cusip.check_digit(), '8');
                assert!(!cusip.is_cins());
            }
            Err(err) => panic!("Did not expect parsing to fail: {}", err),
        }
    }

    /// This test case appears on page 3 of ANSI X9.6-2020, in the section "Annex A (Normative):
    /// Modulus 10 Double-Add-Double Technique".
    #[test]
    fn validate_example_from_standard() {
        assert!(validate("837649128"))
    }
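
    // A minimal, self-contained sketch of the "modulus 10 double-add-double" technique that the
    // two tests above exercise. It is an illustration only, not this crate's implementation.
    // Assumptions: digits map to 0-9, letters 'A'-'Z' map to 10-35, and '*', '@', '#' map to
    // 36, 37 and 38. `sketch_check_digit` is a hypothetical helper name.
    fn sketch_check_digit(payload: &str) -> Option<u8> {
        let bytes = payload.as_bytes();
        // The payload is the 8 characters that precede the check digit.
        if bytes.len() != 8 {
            return None;
        }
        let mut sum: u32 = 0;
        for (i, &b) in bytes.iter().enumerate() {
            let mut v: u32 = match b {
                b'0'..=b'9' => u32::from(b - b'0'),
                b'A'..=b'Z' => u32::from(b - b'A') + 10,
                b'*' => 36,
                b'@' => 37,
                b'#' => 38,
                _ => return None,
            };
            // "Double": every second character (2nd, 4th, 6th, 8th) has its value doubled.
            if i % 2 == 1 {
                v *= 2;
            }
            // "Add": sum the tens and units digits of the (possibly doubled) value.
            sum += v / 10 + v % 10;
        }
        // The check digit is the tens complement of the sum, modulo 10.
        Some(((10 - sum % 10) % 10) as u8)
    }

    #[test]
    fn sketch_check_digit_matches_known_examples() {
        assert_eq!(sketch_check_digit("83764912"), Some(8)); // Example from The Standard
        assert_eq!(sketch_check_digit("09739D10"), Some(0)); // Boise Cascade
        assert_eq!(sketch_check_digit("S08000AA"), Some(9)); // CINS example
        assert_eq!(sketch_check_digit("short"), None); // Wrong length
    }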

    #[test]
    fn reject_empty_string() {
        let res = CUSIP::parse("");
        assert!(res.is_err());
    }

    #[test]
    fn reject_lowercase_issuer_id_if_strict() {
        match CUSIP::parse("99999zAA5") {
            Err(CUSIPError::InvalidIssuerNum { was: _ }) => {} // Ok
            Err(err) => {
                panic!(
                    "Expected Err(InvalidIssuerNum {{ ... }}), but got: Err({:?})",
                    err
                )
            }
            Ok(cusip) => {
                panic!(
                    "Expected Err(InvalidIssuerNum {{ ... }}), but got: Ok({:?})",
                    cusip
                )
            }
        }
    }

    #[test]
    fn reject_lowercase_issue_id_if_strict() {
        match CUSIP::parse("99999Zaa5") {
            Err(CUSIPError::InvalidIssueNum { was: _ }) => {} // Ok
            Err(err) => {
                panic!(
                    "Expected Err(InvalidIssueNum {{ ... }}), but got: Err({:?})",
                    err
                )
            }
            Ok(cusip) => {
                panic!(
                    "Expected Err(InvalidIssueNum {{ ... }}), but got: Ok({:?})",
                    cusip
                )
            }
        }
    }

    #[test]
    fn parse_cusip_with_0_check_digit() {
        CUSIP::parse("09739D100").unwrap(); // BCC aka Boise Cascade
    }

    #[test]
    fn parse_cusip_with_1_check_digit() {
        CUSIP::parse("00724F101").unwrap(); // ADBE aka Adobe
    }

    #[test]
    fn parse_cusip_with_2_check_digit() {
        CUSIP::parse("02376R102").unwrap(); // AAL aka American Airlines
    }

    #[test]
    fn parse_cusip_with_3_check_digit() {
        CUSIP::parse("053015103").unwrap(); // ADP aka Automatic Data Processing
    }

    #[test]
    fn parse_cusip_with_4_check_digit() {
        CUSIP::parse("457030104").unwrap(); // IMKTA aka Ingles Markets
    }

    #[test]
    fn parse_cusip_with_5_check_digit() {
        CUSIP::parse("007800105").unwrap(); // AJRD aka Aerojet Rocketdyne Holdings
    }

    #[test]
    fn parse_cusip_with_6_check_digit() {
        CUSIP::parse("98421M106").unwrap(); // XRX aka Xerox
    }

    #[test]
    fn parse_cusip_with_7_check_digit() {
        CUSIP::parse("007903107").unwrap(); // AMD aka Advanced Micro Devices
    }

    #[test]
    fn parse_cusip_with_8_check_digit() {
        CUSIP::parse("921659108").unwrap(); // VNDA aka Vanda Pharmaceuticals
    }

    #[test]
    fn parse_cusip_with_9_check_digit() {
        CUSIP::parse("020772109").unwrap(); // APT aka AlphaProTec
    }

    /// A bunch of test cases obtained from public SEC data via a PDF at
    /// https://www.sec.gov/divisions/investment/13flists.htm
    #[test]
    fn parse_bulk() {
        let cases = [
            "25470F104",
            "254709108",
            "254709108",
            "25470F104",
            "25470F302",
            "25470M109",
            "25490H106",
            "25490K273",
            "25490K281",
            "25490K323",
            "25490K331",
            "25490K596",
            "25490K869",
            "25525P107",
            "255519100",
            "256135203",
            "25614T309",
            "256163106",
            "25659T107",
            "256677105",
            "256746108",
            "25746U109",
            "25754A201",
            "257554105",
            "257559203",
            "257651109",
            "257701201",
            "257867200",
            "25787G100",
            "25809K105",
            "25820R105",
            "258278100",
            "258622109",
            "25960P109",
            "25960R105",
            "25985W105",
            "260003108",
            "260174107",
            "260557103",
            "26140E600",
            "26142R104",
            "26152H301",
            "262037104",
            "262077100",
            "26210C104",
            "264120106",
            "264147109",
            "264411505",
            "26441C204",
            "26443V101",
            "26484T106",
            "265504100",
            "26614N102",
            "266605104",
            "26745T101",
            "267475101",
            "268150109",
            "268158201",
            "26817Q886",
            "268311107",
            "26856L103",
            "268603107",
            "26874R108",
            "26884L109",
            "26884U109",
            "268948106",
            "26922A230",
            "26922A248",
            "26922A289",
            "26922A305",
        ];
        for case in cases.iter() {
            CUSIP::parse(case).unwrap();
            assert!(
                validate(case),
                "Successfully parsed {:?} but got false from validate()!",
                case
            );
        }
    }

    #[test]
    fn static_validate_method() {
        // Test valid CUSIPs
        assert!(CUSIP::validate("037833100")); // Apple Inc
        assert!(CUSIP::validate("09739D100")); // Boise Cascade
        assert!(CUSIP::validate("S08000AA9")); // CINS example
        assert!(CUSIP::validate("837649128")); // Example from standard
        assert!(CUSIP::validate("254709108")); // Discover Financial

        // Test invalid CUSIPs
        assert!(!CUSIP::validate("")); // Empty string
        assert!(!CUSIP::validate("037833101")); // Wrong check digit
        assert!(!CUSIP::validate("037833")); // Too short
        assert!(!CUSIP::validate("0378331000")); // Too long
        assert!(!CUSIP::validate("037833!00")); // Invalid character
        assert!(!CUSIP::validate("037833a00")); // Lowercase letter
        assert!(!CUSIP::validate("invalid!!")); // Invalid format

        // Test consistency with module-level validate function
        let test_cases = [
            "037833100",
            "09739D100",
            "S08000AA9",
            "837649128",
            "254709108",
            "",
            "037833101",
            "037833",
            "0378331000",
            "037833!00",
            "037833a00",
            "invalid!!",
        ];

        for case in test_cases.iter() {
            assert_eq!(
                CUSIP::validate(case),
                crate::validate(case),
                "CUSIP::validate() and module validate() should return same result for {:?}",
                case
            );
        }
    }

    #[test]
    fn value_accessor_method() {
        // Test value() returns complete CUSIP string
        let cusip1 = CUSIP::parse("037833100").unwrap();
        assert_eq!(cusip1.value(), "037833100");

        let cusip2 = CUSIP::parse("09739D100").unwrap();
        assert_eq!(cusip2.value(), "09739D100");

        let cusip3 = CUSIP::parse("S08000AA9").unwrap();
        assert_eq!(cusip3.value(), "S08000AA9");

        let cusip4 = CUSIP::parse("837649128").unwrap();
        assert_eq!(cusip4.value(), "837649128");

        // Test consistency with other string representation methods
        let test_cusips = [
            "037833100",
            "09739D100",
            "S08000AA9",
            "837649128",
            "254709108",
        ];

        for cusip_str in test_cusips.iter() {
            let cusip = CUSIP::parse(cusip_str).unwrap();

            // value() should match the original string
            assert_eq!(cusip.value(), *cusip_str);

            // value() should match to_string()
            assert_eq!(cusip.value(), cusip.to_string());

            // value() should match as_ref()
            assert_eq!(cusip.value(), cusip.as_ref());

            // value() should match Display formatting
            assert_eq!(cusip.value(), format!("{}", cusip));
        }
    }

    #[test]
    fn new_methods_non_breaking() {
        // Verify that existing functionality still works and new methods are additions only
        let cusip = CUSIP::parse("037833100").unwrap();

        // All existing methods should still work
        assert_eq!(cusip.issuer_num(), "037833");
        assert_eq!(cusip.issue_num(), "10");
        assert_eq!(cusip.check_digit(), '0');
        assert!(!cusip.is_cins());
        assert!(!cusip.is_private_use());
        assert_eq!(cusip.payload(), "03783310");

        // New methods should work alongside existing ones
        assert_eq!(cusip.value(), "037833100");
        assert!(CUSIP::validate("037833100"));

        // Verify backward compatibility - all existing string representations work
        assert_eq!(cusip.to_string(), "037833100");
        assert_eq!(cusip.as_ref(), "037833100");
        assert_eq!(format!("{}", cusip), "037833100");
    }

    proptest! {
        #[test]
        #[allow(unused_must_use)]
        fn doesnt_crash(s in "\\PC*") {
            CUSIP::parse(&s);
        }
    }
}