Crate indicator_extractor

Expand description

A fast indicator extractor based on a paser combinator framework (nom) and a PDF parser (pdf-extract).

The goal is to be able to extract indicators either defanged or not with [.], (.), [:], or (:). The exhaustive list of types can be found in the parser::Indicator enum. Here’s an overview of the types

IPv4
IPv6
Domains
URLs
Emails
Hashes
Filenames
Bicoin addresses
Litecoin addresses

Currently the project only supports parsing of PDF files, but the goal is to add support for other file types and extraction methods thanks to the DataExtractor trait.

The project is still in its early stages, so expect some breaking changes.

§Usage

To extract indicators from a string/bytes:

use indicator_extractor::parser::extract_indicators;

let result = extract_indicators("https://github.com".as_bytes());
println!("{:?}", result); // Ok(([], [Indicator::Url("https://github.com")])

To extract indicators from a PDF file:

use indicator_extractor::{data::{PdfExtractor, DataExtractor}, parser::extract_indicators};

let pdf_data = std::fs::read("./resources/pdfs/aa23-131a_malicious_actors_exploit_cve-2023-27350_in_papercut_mf_and_ng_1.pdf").unwrap();
let pdf_string = PdfExtractor.extract(&pdf_data);
let result = extract_indicators(pdf_string.as_bytes());

§WebAssembly

The project is written in Rust and can be used in a WebAssembly build or as a Rust library. To use the WebAssembly build, you can install the package indicator-extractor npm package.

Modules§

data: Various data extractors to then parse indicators from.
parser: Parser to extract indicators from a byte array.

Crate indicator_extractorCopy item path

§Usage

§WebAssembly

Modules§

Crate indicator_extractor