FilterX function reference
FilterX is an experimental feature currently under development. Feedback is most welcome on Discord and GitHub.
Available in AxoSyslog 4.8.1 and later.
This page describes the functions you can use in FilterX blocks.
Functions have arguments that can be either mandatory or optional.
- Mandatory arguments are always positional, so you need to pass them in the correct order. You cannot set them in the arg=value format.
- Optional arguments are always named, like arg=value. You can pass optional arguments in any order.
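As a small illustration of the two argument styles, the following sketch uses the format_kv function described later on this page. The dictionary is passed by position, while the optional separators are named and can appear in either order. The variable names (kvs, first, second) and the sample data are made up for this example.

```
filterx {
    kvs = json({"app": "nginx", "role": "web"});
    # The mandatory argument (the dictionary) comes first, by position.
    # The optional arguments are named, so their order does not matter:
    first = format_kv(kvs, value_separator=":", pair_separator=";");
    second = format_kv(kvs, pair_separator=";", value_separator=":");
};
```

Both calls are expected to produce the same string, because only the order of the named arguments differs.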
cache_json_file
Load the contents of an external JSON file in an efficient manner. You can use this to look up contextual information. (Basically, this is a FilterX-specific implementation of the add-contextual-data() functionality.)
Usage: cache_json_file("/path/to/file.json")
For example, if your context-info-db.json file contains the following:
{
"nginx": "web",
"httpd": "web",
"apache": "web",
"mysql": "db",
"postgresql": "db"
}
Then the following FilterX expression selects only “web” traffic:
filterx {
    declare known_apps = cache_json_file("/context-info-db.json");
    ${app} = known_apps[${PROGRAM}] ?? "unknown";
    ${app} == "web"; # drop everything that's not a web server log
};
datetime
Cast a value into a datetime variable.
Usage: datetime(<string or expression to cast as datetime>)
For example:
date = datetime("1701350398.123000+01:00");
Usually, you use the strptime FilterX function to create datetime values. Alternatively, you can cast an integer, double, string, or isodate variable into datetime with the datetime() FilterX function. Note that:
- When casting from an integer, the integer is the number of microseconds elapsed since the UNIX epoch (January 1, 1970 12:00:00 AM).
- When casting from a double, the double is the number of seconds elapsed since the UNIX epoch (January 1, 1970 12:00:00 AM). (The part before the decimal point is the seconds, the part after the decimal point is the microseconds.)
- When casting from a string, the string (for example, 1701350398.123000+01:00) is interpreted as: <the number of seconds elapsed since the UNIX epoch>.<microseconds>+<timezone relative to UTC (GMT +00:00)>
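The three casting rules above can be sketched side by side as follows. The variable names are made up for this example, and the literal values are chosen so that all three describe the same number of seconds since the epoch. The string variant also carries a +01:00 timezone offset.

```
filterx {
    # Integer: microseconds elapsed since the UNIX epoch.
    from_int = datetime(1701350398123000);
    # Double: seconds since the UNIX epoch, the fraction carries the microseconds.
    from_double = datetime(1701350398.123);
    # String: <seconds>.<microseconds>+<timezone relative to UTC>.
    from_string = datetime("1701350398.123000+01:00");
};
```

Note the factor of one million between the integer and the double form. Passing a seconds-based integer would be interpreted as microseconds and yield a date shortly after the epoch.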
flatten
Flattens the nested elements of an object using the specified separator, similarly to the format-flat-json() template function. For example, you can use it to flatten nested JSON objects in the output if the receiving application cannot handle nested JSON objects.
Usage: flatten(dict, separator=".")
You can use multi-character separators, for example, =>. If you omit the separator, the default dot (.) separator is used.
sample-dict = json({"a": {"b": {"c": "1"}}});
${MESSAGE} = flatten(sample-dict);
The value of ${MESSAGE} will be: {"a.b.c": "1"}
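A multi-character separator can be passed in the same way. This is a minimal sketch based on the description above. The variable name sample_dict is made up for this example.

```
filterx {
    sample_dict = json({"a": {"b": {"c": "1"}}});
    # Join the nested keys with "=>" instead of the default dot.
    ${MESSAGE} = flatten(sample_dict, separator="=>");
};
```

Following the rule described above, the nested keys a, b, and c should be merged into a single key joined with the => separator.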
format_csv
Formats a dictionary or a list into a comma-separated string.
Usage: format_csv(<input-list-or-dict>, columns=<json-list>, delimiter=<delimiter-character>, default_value=<string>)
Only the input is mandatory, other arguments are optional. Note that the delimiter must be a single character.
By default, the delimiter is the comma (delimiter=","), the columns and default_value are empty.
If the columns option is set, AxoSyslog checks that the number of entries in the input data matches the number of columns. If there are fewer entries, it adds the default_value to the missing entries.
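The following sketch shows how the optional arguments might be combined. The field names and values are made up for this example, and the columns list is created with the json_array function described on this page. The exact output is not shown, because it depends on how AxoSyslog orders and fills the columns.

```
filterx {
    fields = json({"host": "web01", "program": "nginx"});
    # Three columns are requested, but the input has no "pid" entry,
    # so default_value is expected to be used for that position.
    ${MESSAGE} = format_csv(fields, columns=json_array(["host", "program", "pid"]), delimiter=";", default_value="-");
};
```

The delimiter here is a single character (;), as required.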
format_kv
Formats a dictionary into a string containing key=value pairs.
Usage: format_kv(kvs_dict, value_separator="<separator-character>", pair_separator="<separator-string>")
By default, format_kv uses = to separate values, and ", " (comma and space) to separate the pairs:
filterx {
    ${MESSAGE} = format_kv(<input-dictionary>);
};
The value_separator option must be a single character, the pair_separator can be a string. For example, to use the colon (:) as the value separator and the semicolon (;) as the pair separator, use:
format_kv(<input-dictionary>, value_separator=":", pair_separator=";")
format_json
Formats any value into a raw JSON string.
Usage: format_json($data)
isodate
Parses a string as a date in ISODATE format: %Y-%m-%dT%H:%M:%S%z
isset
Returns true if the argument exists and its value is not empty or null.
Usage: isset(<name of a variable, macro, or name-value pair>)
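For instance, a FilterX block can use isset() as a filter statement. This is a minimal sketch, using the ${HOST} macro as an arbitrary example.

```
filterx {
    # The block matches only if ${HOST} exists and is not empty or null.
    isset(${HOST});
};
```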
istype
Returns true if the object (first argument) has the specified type (second argument). The type must be a quoted string. (See List of type names.)
Usage: istype(object, "type_str")
For example:
istype({"key": "value"}, "json_object"); # True
istype(${PID}, "string");
istype(my-local-json-object.mylist, "json_array");
If the object doesn’t exist, istype() returns an error, causing the FilterX statement to become false, and logs an error message to the internal() source of AxoSyslog.
json, json_object
Cast a value into a JSON object. json_object() is an alias for json().
Usage: json(<string or expression to cast as json>)
For example:
js = json({"key": "value"});
json_array
Cast a value into a JSON array.
Usage: json_array(<string or expression to cast as json array>)
For example:
list = json_array(["first_element", "second_element", "third_element"]);
len
Returns the number of items in an object as an integer: the length (number of characters) of a string, the number of elements in a list, or the number of keys in an object.
Usage: len(object)
lower
Converts a string into lowercase characters.
Usage: lower(string)
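One possible use is to normalize a value before comparing it. The sketch below assumes that the program name can arrive in mixed case (for example, Nginx or NGINX).

```
filterx {
    # Compare the lowercased program name, so the match ignores case.
    lower(${PROGRAM}) == "nginx";
};
```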
otel_array
Creates a list represented as an OpenTelemetry array.
otel_kvlist
Creates a dictionary represented as an OpenTelemetry key-value list.
otel_logrecord
Creates an OpenTelemetry log record object.
otel_resource
Creates an OpenTelemetry resource object.
otel_scope
Creates an OpenTelemetry scope object.
parse_csv
Separate a comma-separated or similar string.
Usage: parse_csv(msg_str [, columns=json_array, delimiter=string, string_delimiters=json_array, dialect=string, strip_whitespace=boolean, greedy=boolean])
For details, see Comma-separated values.
parse_kv
Separate a string consisting of whitespace or comma-separated key=value pairs (for example, WELF-formatted messages).
Usage: parse_kv(msg, value_separator="=", pair_separator=", ", stray_words_key="stray_words")
The value_separator must be a single-character string. The pair_separator must be a string.
For details, see key=value pairs.
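As a sketch of the optional arguments, the following block parses a message that uses colons and semicolons instead of the defaults. The sample input in the comment and the ${PARSED} name-value pair are made up for this example.

```
filterx {
    # Assumed input in ${MESSAGE}: src:192.168.1.1;dst:10.0.0.1;proto:tcp
    ${PARSED} = parse_kv(${MESSAGE}, value_separator=":", pair_separator=";");
};
```

With such input, ${PARSED} would be expected to hold a dictionary with the src, dst, and proto keys.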
regexp_search
Searches a string and returns the matches of a regular expression as a list or a dictionary. If there are no matches, the list or dictionary is empty.
Usage: regexp_search("<string-to-search>", <regular-expression>)
For example:
# ${MESSAGE} = "ERROR: Sample error message string"
my-variable = regexp_search(${MESSAGE}, "ERROR");
You can also use unnamed match groups (()) and named match groups ((?<first>ERROR)(?<second>message)).
Note the following points:
- Regular expressions are case sensitive by default. For case insensitive matches, add (?i) to the beginning of your pattern.
- You can use regexp constants (slash-enclosed regexps) within FilterX blocks to simplify escaping special characters, for example, /^beginning and end$/.
- FilterX regular expressions are interpreted in “leave the backslash alone mode”, meaning that a backslash in a string before something that doesn’t need to be escaped is interpreted as a literal backslash character. For example, string\more-string is equivalent to string\\more-string.
Unnamed match groups
${MY-LIST} = json(); # Creates an empty JSON object
${MY-LIST}.unnamed = regexp_search("first-word second-part third", /(first-word)(second-part)(third)/);
${MY-LIST}.unnamed is a list containing: ["first-word second-part third", "first-word", "second-part", "third"]
Named match groups
${MY-LIST} = json(); # Creates an empty JSON object
${MY-LIST}.named = regexp_search("first-word second-part third", /(?<one>first-word)(?<two>second-part)(?<three>third)/);
${MY-LIST}.named is a dictionary with the names of the match groups as keys, and the corresponding matches as values: {"0": "first-word second-part third", "one": "first-word", "two": "second-part", "three": "third"}
Mixed match groups
If you use mixed (some named, some unnamed) groups, the output is a dictionary, where AxoSyslog automatically assigns a key to the unnamed groups. For example:
${MY-LIST} = json(); # Creates an empty JSON object
${MY-LIST}.mixed = regexp_search("first-word second-part third", /(?<one>first-word)(second-part)(?<three>third)/);
${MY-LIST}.mixed is: {"0": "first-word second-part third", "one": "first-word", "2": "second-part", "three": "third"}
regexp_subst
Rewrites a string using regular expressions. This function implements the subst rewrite rule functionality.
Usage: regexp_subst(<input-string>, <pattern-to-find>, <replacement>, flags)
The following example replaces the first IP in the text of the message with the IP-Address string.
regexp_subst(${MESSAGE}, "IP", "IP-Address");
To replace every occurrence, use the global=true flag:
regexp_subst(${MESSAGE}, "IP", "IP-Address", global=true);
Note the following points:
- Regular expressions are case sensitive by default. For case insensitive matches, add (?i) to the beginning of your pattern.
- You can use regexp constants (slash-enclosed regexps) within FilterX blocks to simplify escaping special characters, for example, /^beginning and end$/.
- FilterX regular expressions are interpreted in “leave the backslash alone mode”, meaning that a backslash in a string before something that doesn’t need to be escaped is interpreted as a literal backslash character. For example, string\more-string is equivalent to string\\more-string.
For a case insensitive search, use the ignorecase=true option.
Options
You can use the following flags with the regexp_subst function:
- global=true: Replace every occurrence of the search string.
- ignorecase=true: Do case insensitive search.
- jit=true: Enable just-in-time compilation for PCRE regular expressions.
- newline=true: When configured, it changes the newline definition used in PCRE regular expressions to accept any of the following:
  - a single carriage-return
  - a linefeed
  - the sequence carriage-return and linefeed
  (\r, \n and \r\n, respectively)
  This newline definition is used when the circumflex and dollar patterns (^ and $) are matched against an input. By default, PCRE interprets the linefeed character as indicating the end of a line. It does not affect the \r, \n or \R characters used in patterns.
- utf8=true: Use Unicode support for UTF-8 matches: UTF-8 character sequences are handled as single characters.
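Flags can be combined in a single call. The following sketch builds on the earlier IP example and only uses flags listed above. The second statement shows the inline (?i) alternative mentioned in the notes. The variable names are made up for this example.

```
filterx {
    # Replace every occurrence of "IP", regardless of case, using named flags.
    with_flags = regexp_subst(${MESSAGE}, "IP", "IP-Address", global=true, ignorecase=true);
    # The same intent with the case insensitive modifier inside the pattern.
    with_modifier = regexp_subst(${MESSAGE}, "(?i)IP", "IP-Address", global=true);
};
```

Both calls read the original ${MESSAGE} and store their results in separate local variables, so the second replacement does not operate on the output of the first.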
string
Cast a value into a string. Note that currently AxoSyslog evaluates strings and executes template functions and template expressions. In the future, template evaluation will be moved to a separate FilterX function.
Usage: string(<string or expression to cast>)
For example:
myvariable = string(${LEVEL_NUM});
Sometimes you have to explicitly cast values to strings, for example, when you want to concatenate them into a message using the + operator.
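A minimal sketch of such a concatenation follows. The "Severity" prefix is made up for this example.

```
filterx {
    # Cast the numeric level explicitly before joining it with other strings.
    ${MESSAGE} = "Severity " + string(${LEVEL_NUM}) + ": " + ${MESSAGE};
};
```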
strptime
Creates a datetime object from a string, similarly to the date-parser(). The first argument is the string containing the date. The second argument is a format string that specifies how to parse the date string. Optionally, you can specify additional format strings that are applied in order if the previous one doesn’t match the date string.
Usage: strptime(time_str, format_str_1, ..., format_str_N)
For example:
${MESSAGE} = strptime("2024-04-10T08:09:10Z", "%Y-%m-%dT%H:%M:%S%z");
If none of the format strings match, strptime returns the null value and logs an error message to the internal() source of AxoSyslog. If you want the FilterX block to explicitly return false in such cases, use the isset FilterX function on the result of strptime.
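The following sketch combines several format strings with an isset check. It assumes that ${MESSAGE} contains only a timestamp. The parsed variable and the ${PARSED_DATE} name-value pair are made up for this example.

```
filterx {
    # Try an ISO 8601 style layout first, then an Apache-style layout.
    parsed = strptime(${MESSAGE}, "%Y-%m-%dT%H:%M:%S%z", "%d/%b/%Y:%H:%M:%S %Z");
    # Make the block false if strptime returned null.
    isset(parsed);
    ${PARSED_DATE} = parsed;
};
```

The format strings are tried in the order they are listed, so put the most common layout first.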
You can use the following elements in the format string:
%% PERCENT
%a day of the week, abbreviated
%A day of the week
%b month abbr
%B month
%c MM/DD/YY HH:MM:SS
%C ctime format: Sat Nov 19 21:05:57 1994
%d numeric day of the month, with leading zeros (eg 01..31)
%e like %d, but a leading zero is replaced by a space (eg 1..31)
%f microseconds, leading 0's, extra digits are silently discarded
%D MM/DD/YY
%G GPS week number (weeks since January 6, 1980)
%h month, abbreviated
%H hour, 24 hour clock, leading 0's
%I hour, 12 hour clock, leading 0's
%j day of the year
%k hour
%l hour, 12 hour clock
%L month number, starting with 1
%m month number, starting with 01
%M minute, leading 0's
%n NEWLINE
%o ornate day of month -- "1st", "2nd", "25th", etc.
%p AM or PM
%P am or pm (Yes %p and %P are backwards :)
%q Quarter number, starting with 1
%r time format: 09:05:57 PM
%R time format: 21:05
%s seconds since the Epoch, UTC
%S seconds, leading 0's
%t TAB
%T time format: 21:05:57
%U week number, Sunday as first day of week
%w day of the week, numerically, Sunday == 0
%W week number, Monday as first day of week
%x date format: 11/19/94
%X time format: 21:05:57
%y year (2 digits)
%Y year (4 digits)
%Z timezone in ascii format (for example, PST), or in format -/+0000
%z timezone in ascii format (for example, PST), or in format -/+0000 (Required element)
When using the %z and %Z format elements, consider that while %z strictly expects a specified timezone, and triggers a warning if the timezone is not specified, %Z does not trigger a warning if the timezone is not specified.
For further information about the %z and %Z format elements, see the ‘DESCRIPTION’ section on the strptime(3) - NetBSD Manual Pages.
For example, for the date 01/Jan/2016:13:05:05 PST use the following format string: "%d/%b/%Y:%H:%M:%S %Z"
The isodate FilterX function is a specialized version of strptime that accepts only a fixed format.
unset
Deletes a variable, a name-value pair, or a key in a complex object (like JSON), for example: unset(${<name-value-pair-to-unset>});
You can also list multiple values to delete: unset(${<first-name-value-pair-to-unset>}, ${<second-name-value-pair-to-unset>});
See also Delete values.
unset_empties
Deletes (unsets) the empty fields of an object, for example, a JSON object or list. Use the recursive=true parameter to also delete the empty values in nested dictionaries and lists.
Usage: unset_empties(object, recursive=true)
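The following sketch shows a possible use on a nested object. It assumes that unset_empties modifies the object in place, which the description suggests but does not state explicitly. The variable name and the sample data are made up for this example.

```
filterx {
    data = json({"user": "alice", "session": "", "details": {"ip": "", "port": "443"}});
    # Assumption: the object is modified in place.
    # With recursive=true, the empty "ip" inside "details" is also a candidate for removal.
    unset_empties(data, recursive=true);
    ${MESSAGE} = data;
};
```

Under that assumption, the empty session and details.ip fields would be expected to disappear, while user and details.port remain.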
upper
Converts a string into uppercase characters.
Usage: upper(string)
vars
Returns the variables (including pipeline variables and name-value pairs) defined in the FilterX block as a JSON object.
For example:
filterx {
    ${logmsg_variable} = "foo";
    local_variable = "bar";
    declare pipeline_level_variable = "baz";
    ${MESSAGE} = vars();
};
The value of ${MESSAGE} will be: {"logmsg_variable":"foo","pipeline_level_variable":"baz"}