indicator_extractor/
lib.rs

//! A fast indicator extractor based on a parser combinator framework ([nom](https://github.com/rust-bakery/nom)) and a PDF parser ([pdf-extract](https://github.com/benjeffrey/pdf-extract)).
//!
//! The goal is to extract indicators whether or not they are defanged with `[.]`, `(.)`, `[:]`, or `(:)`. The exhaustive list of types can be found in the [`parser::Indicator`] enum. Here is an overview of the supported types:
//! - IPv4
//! - IPv6
//! - Domains
//! - URLs
//! - Emails
//! - Hashes
//! - Filenames
//! - Bitcoin addresses
//! - Litecoin addresses
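//!
//! As a rough illustration of what the defanging markers mean, the sketch below rewrites them back to plain `.` and `:`. The `refang` helper is hypothetical and is not part of this crate's API; the crate's own nom-based parser may handle defanged input differently.
//!
//! ```rust
//! // Hypothetical helper (an assumption, not part of this crate):
//! // undo the four defanging markers `[.]`, `(.)`, `[:]`, and `(:)`.
//! fn refang(input: &str) -> String {
//!     input
//!         .replace("[.]", ".")
//!         .replace("(.)", ".")
//!         .replace("[:]", ":")
//!         .replace("(:)", ":")
//! }
//!
//! fn main() {
//!     assert_eq!(refang("github[.]com"), "github.com");
//!     assert_eq!(refang("https[:]//example(.)com"), "https://example.com");
//! }
//! ```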
//!
//! Currently, the project only supports parsing PDF files, but the goal is to add support for other file types and extraction methods through the [`data::DataExtractor`] trait.
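//!
//! To give an idea of how another extraction method might plug in, here is a minimal sketch of a plain-text extractor. The trait below is a local stand-in whose signature is only inferred from the PDF usage example (bytes in, text out); the real `DataExtractor` trait may differ, and `PlainTextExtractor` is a hypothetical name.
//!
//! ```rust
//! // Local stand-in for the crate's trait. The signature is an assumption.
//! trait DataExtractor {
//!     fn extract(&self, data: &[u8]) -> String;
//! }
//!
//! // Hypothetical extractor that treats the input as UTF-8 text.
//! struct PlainTextExtractor;
//!
//! impl DataExtractor for PlainTextExtractor {
//!     fn extract(&self, data: &[u8]) -> String {
//!         // Invalid UTF-8 sequences are replaced with U+FFFD instead of failing.
//!         String::from_utf8_lossy(data).into_owned()
//!     }
//! }
//!
//! fn main() {
//!     let text = PlainTextExtractor.extract(b"contact: admin@example.com");
//!     assert_eq!(text, "contact: admin@example.com");
//! }
//! ```
//!
//! The extracted text would then be passed to `extract_indicators` as bytes, in the same way as the output of `PdfExtractor`.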
//!
//! The project is still in its early stages, so expect some breaking changes.
//!
//! # Usage
//!
//! To extract indicators from a string or bytes:
//!
//! ```
//! use indicator_extractor::parser::extract_indicators;
//!
//! let result = extract_indicators("https://github.com".as_bytes());
//! println!("{:?}", result); // Ok(([], [Indicator::Url("https://github.com")]))
//! ```
//!
//! To extract indicators from a PDF file:
//!
//! ```
//! use indicator_extractor::{data::{PdfExtractor, DataExtractor}, parser::extract_indicators};
//!
//! let pdf_data = std::fs::read("./resources/pdfs/aa23-131a_malicious_actors_exploit_cve-2023-27350_in_papercut_mf_and_ng_1.pdf").unwrap();
//! let pdf_string = PdfExtractor.extract(&pdf_data);
//! let result = extract_indicators(pdf_string.as_bytes());
//! ```
//!
//! # WebAssembly
//!
//! The project is written in Rust and can be used as a Rust library or as a WebAssembly build. To use the WebAssembly build, install the [indicator-extractor](https://www.npmjs.com/package/indicator-extractor) npm package.

pub mod data;
pub mod parser;

// The WebAssembly bindings are only compiled for wasm targets.
#[cfg(target_family = "wasm")]
mod wasm;

#[cfg(target_family = "wasm")]
pub use crate::wasm::*;