High-performance IP address extraction and tagging engine.
ip-extract provides a fast, configurable extractor for finding IPv4 and IPv6
addresses in unstructured text. It achieves high throughput through:
- Compile-time DFA: IP patterns are converted to dense Forward DFAs during build, eliminating runtime regex compilation and heap allocation.
- Zero-overhead scanning: The DFA scans in O(n) time with no backtracking; validation is performed only on candidates.
- Strict validation: Deep checks eliminate false positives (e.g., 1.2.3.4.5 is rejected).
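The scan-then-validate idea from the list above can be sketched with the standard library alone. This is not the crate's implementation: it swaps the compiled DFA for a plain byte loop, handles IPv4 only, and `find_ipv4` is a hypothetical name used for illustration. It shows a cheap pass that collects candidates and a strict check that runs only on those candidates.

```rust
use std::net::Ipv4Addr;

/// Hypothetical helper (not part of ip-extract): returns the IPv4
/// addresses found in `input` using a scan-then-validate approach.
fn find_ipv4(input: &[u8]) -> Vec<Ipv4Addr> {
    let mut found = Vec::new();
    let mut i = 0;
    while i < input.len() {
        // Cheap scan: skip bytes that cannot start a candidate.
        if !input[i].is_ascii_digit() {
            i += 1;
            continue;
        }
        // Candidate: the maximal run of ASCII digits and dots.
        let start = i;
        while i < input.len() && (input[i].is_ascii_digit() || input[i] == b'.') {
            i += 1;
        }
        // Strict validation runs only on the candidate, not on every byte.
        if let Ok(text) = std::str::from_utf8(&input[start..i]) {
            // Drop sentence punctuation such as the final dot in "8.8.8.8.".
            let text = text.trim_end_matches('.');
            if let Ok(ip) = text.parse::<Ipv4Addr>() {
                found.push(ip);
            }
        }
    }
    found
}
```

Under these assumptions, `find_ipv4(b"from 192.168.1.1 to 1.2.3.4.5")` keeps the first address and drops the five-part candidate, because the whole run must parse as a single address.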
§Quick Start
By default, all IP addresses are extracted:
use ip_extract::ExtractorBuilder;
// Extract all IPs (default: includes private, loopback, broadcast)
let extractor = ExtractorBuilder::new().build()?;
let input = b"Connect from 192.168.1.1 to 2001:db8::1";
for range in extractor.find_iter(input) {
    let ip = std::str::from_utf8(&input[range])?;
    println!("Found: {}", ip);
}
§Tagging and Output
For more structured output (e.g., JSON), use the Tagged and Tag types:
use ip_extract::{ExtractorBuilder, Tagged, Tag};
let extractor = ExtractorBuilder::new().build()?;
let data = b"Server at 8.8.8.8";
let mut tagged = Tagged::new(data);
for range in extractor.find_iter(data) {
    let ip = std::str::from_utf8(&data[range.clone()])?;
    let tag = Tag::new(ip, ip).with_range(range);
    tagged = tagged.tag(tag);
}
§Configuration
Use ExtractorBuilder to filter specific IP categories:
use ip_extract::ExtractorBuilder;
// Extract only publicly routable IPs
let extractor = ExtractorBuilder::new()
    .only_public()
    .build()?;
// Or use granular control
let extractor = ExtractorBuilder::new()
    .ipv4(true)          // Extract IPv4 (default: true)
    .ipv6(false)         // Skip IPv6
    .ignore_private()    // Skip RFC 1918 ranges
    .ignore_loopback()   // Skip loopback (127.0.0.1, ::1)
    .build()?;
§Performance
Typical throughput on modern hardware:
- Dense IPs (mostly IP addresses): 160+ MiB/s
- Sparse logs (IPs mixed with text): 360+ MiB/s
- No IPs (pure scanning): 620+ MiB/s
See benches/ip_benchmark.rs for details.
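For readers who want to see how a MiB/s figure like those above is derived, here is a minimal timing sketch using only the standard library. It is not the crate's benchmark: `throughput_mib_per_s` is a hypothetical helper, it times a single pass with no warm-up, and the scan is passed in as a closure so that any extractor call can be substituted.

```rust
use std::time::Instant;

/// Hypothetical helper: runs `scan` once over `input` and returns
/// (value returned by `scan`, throughput in MiB/s).
fn throughput_mib_per_s<F>(input: &[u8], mut scan: F) -> (usize, f64)
where
    F: FnMut(&[u8]) -> usize,
{
    let start = Instant::now();
    let matches = scan(input);
    let secs = start.elapsed().as_secs_f64();
    // 1 MiB = 1024 * 1024 bytes.
    let mib = input.len() as f64 / (1024.0 * 1024.0);
    (matches, mib / secs)
}
```

A single timed pass is noisy, so a benchmark harness that repeats the measurement gives more trustworthy numbers than this sketch.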
Structs§
- Extractor
- The main IP address extractor.
- ExtractorBuilder
- A builder for configuring IP extraction behavior.
- IpMatch
- A validated IP address match within a haystack.
- Tag
- A tag representing an IP address found in text.
- Tagged
- A line of text with tags.
- TextData
- Represents the text data for JSON serialization.
Enums§
- IpKind
- Whether a validated IP match is IPv4 or IPv6.
Functions§
- extract
- Extract all IPv4 and IPv6 addresses from input, returning them as strings.
- extract_parsed
- Extract all IPv4 and IPv6 addresses from input, returning them as parsed IpAddr objects.
- extract_unique
- Extract unique IPv4 and IPv6 addresses from input, returning them as strings.
- extract_unique_parsed
- Extract unique IPv4 and IPv6 addresses from input, returning them as parsed IpAddr objects.
- parse_ipv4_bytes
- Parse an IPv4 address from a byte slice.
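To make "parse an IPv4 address from a byte slice" concrete, the sketch below does it by hand without converting to a string first. It is an illustration, not the crate's parse_ipv4_bytes: the name `parse_ipv4_sketch`, the `Option` return type, and the choice to reject leading zeros are all assumptions.

```rust
use std::net::Ipv4Addr;

/// Hypothetical stand-in for a byte-level IPv4 parser; the crate's
/// real signature and edge-case rules may differ.
fn parse_ipv4_sketch(bytes: &[u8]) -> Option<Ipv4Addr> {
    let mut octets = [0u8; 4];
    let mut count = 0;
    for part in bytes.split(|&b| b == b'.') {
        // Reject a fifth part, empty parts, and parts longer than 3 digits.
        if count == 4 || part.is_empty() || part.len() > 3 {
            return None;
        }
        // Assumption: leading zeros such as "01" are treated as invalid.
        if part.len() > 1 && part[0] == b'0' {
            return None;
        }
        let mut value: u16 = 0;
        for &b in part {
            if !b.is_ascii_digit() {
                return None;
            }
            value = value * 10 + u16::from(b - b'0');
        }
        // Each octet must fit in 0..=255.
        if value > 255 {
            return None;
        }
        octets[count] = value as u8;
        count += 1;
    }
    if count == 4 {
        Some(Ipv4Addr::from(octets))
    } else {
        None
    }
}
```

Working on bytes directly avoids a UTF-8 check and a string allocation per candidate, which is one plausible reason a byte-slice entry point is exposed.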