Comma-separated values
FilterX is an experimental feature currently under development. Feedback is most welcome on Discord and GitHub.
Available in AxoSyslog 4.8.1 and later.
The parse_csv FilterX function can separate parts of log messages (that is, the contents of the ${MESSAGE} macro) along delimiter characters or strings into lists, or key-value pairs within dictionaries, using the CSV (comma-separated values) parser.
Usage: parse_csv(<input-string>, columns=json_array, delimiter=string, string_delimiters=json_array, dialect=string, strip_whitespace=boolean, greedy=boolean)
Only the input parameter is mandatory.
If the columns option is set, parse_csv returns a dictionary with the column names (as keys) and the parsed values. If the columns option isn’t set, parse_csv returns a list.
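The difference between the two return shapes can be modeled in a few lines of Python. This is only a rough sketch of the behavior described above, not how AxoSyslog implements parse_csv: the helper name parse_csv_sketch is hypothetical, Python's csv module stands in for the real parser, and what happens when the number of values differs from the number of columns is not modeled.

```python
import csv


def parse_csv_sketch(text, columns=None, delimiter=","):
    """Hypothetical model of the return shape: a list without columns, a dict with columns."""
    # csv.reader expects an iterable of lines, so wrap the single input string in a list.
    values = next(csv.reader([text], delimiter=delimiter))
    if columns is None:
        # No column names given: return the parsed values as a plain list.
        return values
    # Column names given: pair each name with the value in the same position.
    return dict(zip(columns, values))


print(parse_csv_sketch("example-1", delimiter="-"))
# ['example', '1']
print(parse_csv_sketch("example-1", columns=["NAME", "ID"], delimiter="-"))
# {'NAME': 'example', 'ID': '1'}
```

The second call mirrors the hostname example: naming the columns turns positional values into keys that can be referenced individually.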
The following example separates hostnames like example-1 and example-2 into two parts.
block filterx p_hostname_segmentation() {
cols = json_array(["NAME","ID"]);
HOSTNAME = parse_csv(${HOST}, delimiter="-", columns=cols);
# HOSTNAME is a json object containing parts of the hostname
# For example, for example-1 it contains:
# {"NAME":"example","ID":"1"}
# Set the important elements as name-value pairs so they can be referenced in the destination template
${HOSTNAME_NAME} = HOSTNAME.NAME;
${HOSTNAME_ID} = HOSTNAME.ID;
};
destination d_file {
file("/var/log/${HOSTNAME_NAME:-examplehost}/${HOSTNAME_ID}/messages.log");
};
log {
source(s_local);
filterx(p_hostname_segmentation());
destination(d_file);
};
Parse Apache log files
The following parser processes Apache web server logs and separates each message into fields. Apache log messages can be formatted like:
"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %T %v"
Here is a sample message:
192.168.1.1 - - [31/Dec/2007:00:17:10 +0100] "GET /cgi-bin/example.cgi HTTP/1.1" 200 2708 "-" "curl/7.15.5 (i486-pc-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8c zlib/1.2.3 libidn/0.6.5" 2 example.mycompany
To parse such logs, the delimiter character is set to a single whitespace (delimiter=" "). Excess leading and trailing whitespace characters are stripped.
block filterx p_apache() {
${APACHE} = json();
cols = [
"CLIENT_IP", "IDENT_NAME", "USER_NAME",
"TIMESTAMP", "REQUEST_URL", "REQUEST_STATUS",
"CONTENT_LENGTH", "REFERER", "USER_AGENT",
"PROCESS_TIME", "SERVER_NAME"
];
${APACHE} = parse_csv(${MESSAGE}, columns=cols, delimiter=" ", strip_whitespace=true, dialect="escape-double-char");
# Set the important elements as name-value pairs so they can be referenced in the destination template
${APACHE_USER_NAME} = ${APACHE.USER_NAME};
};
The results can be used, for example, to separate log messages into different files based on the APACHE.USER_NAME field. If the field is empty, the nouser string is used as the default.
log {
source(s_local);
filterx(p_apache());
destination(d_file);
};
destination d_file {
file("/var/log/messages-${APACHE_USER_NAME:-nouser}");
};
Segment a part of a message
You can use multiple parsers in a layered manner to split parts of an already parsed message into further segments. The following example splits the timestamp of a parsed Apache log message into separate fields. Note that the scoping of FilterX variables is important:
- If you add the new parser to the FilterX block used in the previous example, every variable is available.
- If you use a separate FilterX block, only global variables and name-value pairs (variables with names starting with the $ character) are accessible from the block.
block filterx p_apache_timestamp() {
cols = ["TIMESTAMP.DAY", "TIMESTAMP.MONTH", "TIMESTAMP.YEAR", "TIMESTAMP.HOUR", "TIMESTAMP.MIN", "TIMESTAMP.SEC", "TIMESTAMP.ZONE"];
${APACHE.TIMESTAMP} = parse_csv(${APACHE.TIMESTAMP}, columns=cols, delimiter="/: ", dialect="escape-none");
# Set the important elements as name-value pairs so they can be referenced in the destination template
${APACHE_TIMESTAMP_DAY} = ${APACHE.TIMESTAMP}["TIMESTAMP.DAY"];
};
destination d_file {
file("/var/log/messages-${APACHE_USER_NAME:-nouser}/${APACHE_TIMESTAMP_DAY}");
};
log {
source(s_local);
filterx(p_apache());
filterx(p_apache_timestamp());
destination(d_file);
};
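To see what splitting a timestamp along several delimiter characters produces, the following Python sketch models the idea. It rests on two assumptions that are not stated above: that every character of the delimiter string acts as a separate delimiter, and that the surrounding square brackets of the timestamp have already been removed. The function name split_on_any is hypothetical, and the sketch does not reproduce parse_csv's dialects or quoting rules.

```python
import re


def split_on_any(text, delimiter_chars):
    """Hypothetical helper: split text at every occurrence of any character in delimiter_chars."""
    # Build a character class such as [/:\ ] so that each character is its own delimiter.
    pattern = "[" + re.escape(delimiter_chars) + "]"
    return re.split(pattern, text)


cols = ["TIMESTAMP.DAY", "TIMESTAMP.MONTH", "TIMESTAMP.YEAR",
        "TIMESTAMP.HOUR", "TIMESTAMP.MIN", "TIMESTAMP.SEC", "TIMESTAMP.ZONE"]

# Assumption: the brackets around the timestamp were stripped in an earlier step.
parts = split_on_any("31/Dec/2007:00:17:10 +0100", "/: ")
print(parts)
# ['31', 'Dec', '2007', '00', '17', '10', '+0100']

timestamp = dict(zip(cols, parts))
print(timestamp["TIMESTAMP.DAY"])
# 31
```

Seven values line up with the seven column names, which is why the column list in the example has exactly one entry per timestamp component.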