alog is a simple log file anonymizer.
alog just replaces the first word[1] on every line of any input stream with a
So "log file anonymizer" might be a bit of an overstatement, but alog can be used to (very efficiently) replace the $remote_addr part in many access log formats, e.g. Nginx's default combined log format:
    log_format combined '$remote_addr - $remote_user [$time_local] '
                        '"$request" $status $body_bytes_sent '
                        '"$http_referer" "$http_user_agent"';
By default, any parseable $remote_addr is replaced by its localhost representation:
- any valid IPv4 address is replaced by '127.0.0.1',
- any valid IPv6 address is replaced by '::1' and
- any other string (which might be a domain name) is replaced by 'localhost'.
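The replacement rules above can be sketched with the standard library alone. This is a minimal illustration, not alog's actual implementation: the helper names `replacement_for` and `anonymize_line` are hypothetical, and the handling of lines that have no space separator (returned unchanged) is an assumption.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Picks the localhost representation for the first word of a log line.
/// Hypothetical helper for illustration; not part of alog's API.
fn replacement_for(word: &str) -> &'static str {
    if word.parse::<Ipv4Addr>().is_ok() {
        "127.0.0.1" // valid IPv4 address
    } else if word.parse::<Ipv6Addr>().is_ok() {
        "::1" // valid IPv6 address
    } else {
        "localhost" // anything else, e.g. a domain name
    }
}

/// Replaces the first space-separated word of `line`.
/// Assumption: a line without a space, or one starting with a space,
/// has no $remote_addr part and is returned unchanged.
fn anonymize_line(line: &str) -> String {
    match line.split_once(' ') {
        Some((first, rest)) if !first.is_empty() => {
            format!("{} {}", replacement_for(first), rest)
        }
        _ => line.to_string(),
    }
}
```

Under these assumptions, calling `anonymize_line` on a combined-format line that starts with `203.0.113.7` yields the same line starting with `127.0.0.1`; everything after the first space is passed through untouched.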
Lines without a $remote_addr part will remain unchanged (but can be skipped with
alog::Config::set_skip() set to
The default configurations of popular web servers, including Apache Web Server and Nginx, collect and store at least two of the following three types of logs:
- access logs
- error logs (including processing-language logs like PHP)
- security audit logs
All of these logs contain personal information by default. IP addresses are specifically defined as personal data by the GDPR. The logs can also contain usernames if your web service uses them as part of its URL structure, and even the referral information that’s logged by default can contain personal information (or other sensitive data).
So keep in mind that just removing the IP / $remote_addr part might not be enough to fully anonymize any given log file.
[1] Any first substring separated by a b' ' (space) from the remainder of the line.
Collection of replacement strings / config flags
INPUT / OUTPUT config
Creates a reader (defaults to