Crate ip_extract

High-performance IP address extraction and tagging engine.

ip-extract provides a fast, configurable extractor for finding IPv4 and IPv6 addresses in unstructured text. Its throughput comes from three design choices:

  • Compile-time DFA: IP patterns are converted to dense forward DFAs at build time, eliminating runtime regex compilation and heap allocation.
  • Zero-overhead scanning: The DFA scans the input in O(n) time with no backtracking; validation runs only on candidate matches.
  • Strict validation: Each candidate is fully validated to reject false positives (e.g., 1.2.3.4.5 is rejected).
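
The scan-then-validate idea in the list above can be illustrated with the standard library alone. The following is a minimal sketch, not the crate's implementation: it replaces the compile-time DFA with a hand-written byte scan and uses `Ipv4Addr`'s parser as the validator. The names `is_candidate_byte` and `find_ipv4_sketch` are hypothetical and exist only for this illustration.

```rust
use std::net::Ipv4Addr;
use std::ops::Range;

/// Bytes that may appear inside a dotted-quad candidate.
fn is_candidate_byte(b: u8) -> bool {
    b.is_ascii_digit() || b == b'.'
}

/// Hypothetical helper: walk the haystack once, collect maximal runs of
/// digits and dots, and keep only the runs that parse as a complete IPv4
/// address. This stands in for the DFA pass plus validation step.
fn find_ipv4_sketch(haystack: &[u8]) -> Vec<Range<usize>> {
    let mut hits = Vec::new();
    let mut i = 0;
    while i < haystack.len() {
        if !is_candidate_byte(haystack[i]) {
            i += 1;
            continue;
        }
        // Extend the run as far as candidate bytes continue.
        let run_start = i;
        while i < haystack.len() && is_candidate_byte(haystack[i]) {
            i += 1;
        }
        // Trim surrounding dots so sentence punctuation ("8.8.8.8.")
        // does not spoil an otherwise valid address.
        let (mut start, mut end) = (run_start, i);
        while start < end && haystack[start] == b'.' {
            start += 1;
        }
        while end > start && haystack[end - 1] == b'.' {
            end -= 1;
        }
        // Validation runs only on the candidate, never on ordinary text.
        if let Ok(text) = std::str::from_utf8(&haystack[start..end]) {
            if text.parse::<Ipv4Addr>().is_ok() {
                hits.push(start..end);
            }
        }
    }
    hits
}
```

Because the whole run `1.2.3.4.5` is handed to the validator, it fails to parse and is dropped instead of being reported as `1.2.3.4`. The sketch covers IPv4 only; IPv6 candidates would need a wider byte class and a different validator.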

§Quick Start

By default, all IP addresses are extracted:

use ip_extract::ExtractorBuilder;

// Extract all IPs (default: includes private, loopback, broadcast)
let extractor = ExtractorBuilder::new().build()?;

let input = b"Connect from 192.168.1.1 to 2001:db8::1";
for range in extractor.find_iter(input) {
    let ip = std::str::from_utf8(&input[range])?;
    println!("Found: {}", ip);
}

§Tagging and Output

For more structured output (e.g., JSON), use the Tagged and Tag types:

use ip_extract::{ExtractorBuilder, Tagged, Tag};

let extractor = ExtractorBuilder::new().build()?;
let data = b"Server at 8.8.8.8";
let mut tagged = Tagged::new(data);

for range in extractor.find_iter(data) {
    let ip = std::str::from_utf8(&data[range.clone()])?;
    let tag = Tag::new(ip, ip).with_range(range);
    tagged = tagged.tag(tag);
}

§Configuration

Use ExtractorBuilder to filter specific IP categories:

use ip_extract::ExtractorBuilder;

// Extract only publicly routable IPs
let extractor = ExtractorBuilder::new()
    .only_public()
    .build()?;

// Or use granular control
let extractor = ExtractorBuilder::new()
    .ipv4(true)            // Extract IPv4 (default: true)
    .ipv6(false)           // Skip IPv6
    .ignore_private()      // Skip RFC 1918 ranges
    .ignore_loopback()     // Skip loopback (127.0.0.1, ::1)
    .build()?;
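
To make the category filters concrete, here is a minimal sketch of how such filtering could be expressed with the classification methods on `std::net` address types. It is an assumption about the general approach, not the crate's code: `FilterSketch`, its fields, and `keep` are hypothetical names chosen for this example.

```rust
use std::net::IpAddr;

/// Hypothetical filter settings. The field names are illustrative and are
/// not taken from the crate.
#[derive(Clone, Copy, Default)]
struct FilterSketch {
    ignore_private: bool,
    ignore_loopback: bool,
}

/// Returns true when `addr` should be reported under `filter`.
fn keep(addr: IpAddr, filter: FilterSketch) -> bool {
    match addr {
        IpAddr::V4(v4) => {
            // `is_private` covers the RFC 1918 ranges; `is_loopback`
            // covers 127.0.0.0/8.
            !(filter.ignore_private && v4.is_private())
                && !(filter.ignore_loopback && v4.is_loopback())
        }
        // For IPv6 only the loopback check (::1) is applied in this sketch.
        IpAddr::V6(v6) => !(filter.ignore_loopback && v6.is_loopback()),
    }
}
```

With both flags set, `192.168.1.1`, `127.0.0.1`, and `::1` are dropped while `8.8.8.8` is kept; with the default (all flags off), every address passes through.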

§Performance

Typical throughput on modern hardware:

  • Dense IPs (mostly IP addresses): 160+ MiB/s
  • Sparse logs (IPs mixed with text): 360+ MiB/s
  • No IPs (pure scanning): 620+ MiB/s

See benches/ip_benchmark.rs for details.
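
For readers reproducing figures like these, a MiB/s value is simply bytes scanned divided by elapsed time, scaled by 1024 × 1024. The helper below is a small illustrative sketch; `mib_per_sec` is a hypothetical name and is not part of the crate or its benchmark.

```rust
use std::time::Duration;

/// Convert a byte count and elapsed time into MiB/s (1 MiB = 1024 * 1024 bytes).
fn mib_per_sec(bytes: usize, elapsed: Duration) -> f64 {
    (bytes as f64 / (1024.0 * 1024.0)) / elapsed.as_secs_f64()
}
```

For example, scanning 320 MiB of input in 2 seconds corresponds to 160 MiB/s.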

§Structs

  • Extractor: The main IP address extractor.
  • ExtractorBuilder: A builder for configuring IP extraction behavior.
  • IpMatch: A validated IP address match within a haystack.
  • Tag: A tag representing an IP address found in text.
  • Tagged: A line of text with tags.
  • TextData: Represents the text data for JSON serialization.

§Enums

  • IpKind: Whether a validated IP match is IPv4 or IPv6.

§Functions

  • extract: Extract all IPv4 and IPv6 addresses from input, returning them as strings.
  • extract_parsed: Extract all IPv4 and IPv6 addresses from input, returning them as parsed IpAddr objects.
  • extract_unique: Extract unique IPv4 and IPv6 addresses from input, returning them as strings.
  • extract_unique_parsed: Extract unique IPv4 and IPv6 addresses from input, returning them as parsed IpAddr objects.
  • parse_ipv4_bytes: Parse an IPv4 address from a byte slice.
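
The signatures of these functions are not shown on this page, so the following is only a standard-library sketch of the two underlying ideas: parsing an IPv4 address from raw bytes, and de-duplicating results. `parse_ipv4_sketch` and `unique_in_order` are hypothetical names. Keeping first-seen order is an assumption of this example, not a documented property of `extract_unique`.

```rust
use std::collections::HashSet;
use std::net::Ipv4Addr;

/// Hypothetical stand-in: parse a dotted-quad from raw bytes via the
/// standard library. Returns None for invalid UTF-8 or an invalid address.
fn parse_ipv4_sketch(bytes: &[u8]) -> Option<Ipv4Addr> {
    std::str::from_utf8(bytes).ok()?.parse().ok()
}

/// Keep the first occurrence of each address, preserving input order.
fn unique_in_order(addrs: &[Ipv4Addr]) -> Vec<Ipv4Addr> {
    let mut seen = HashSet::new();
    // `insert` returns false for values already in the set, so repeats
    // are filtered out.
    addrs.iter().copied().filter(|a| seen.insert(*a)).collect()
}
```

A dedicated byte-level parser can avoid the UTF-8 check that this sketch performs; the version here trades that speed for brevity.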