Enriching log messages with external data

To properly interpret the events that the log messages describe, you must be able to handle log messages as part of a system of events, instead of individual information chunks. The AxoSyslog application allows you to import data from external sources to include in the log messages, thus extending, enriching, and complementing the data found in the log message.

The AxoSyslog application currently provides the following possibilities to enrich log messages.

1 - Adding metadata from an external file

In AxoSyslog version 3.8 and later, you can use an external database file to add metadata to your log messages. For example, you can create a database (or export it from an existing tool) that contains a list of hostnames or IP addresses, and the department of your organization that the host belongs to, the role of the host (mailserver, webserver, and so on), or similar contextual information.

The database file is a simple text file in comma-separated value (CSV) format, where each line contains the following information:

  • A selector or ID that appears in the log messages, for example, the hostname. To use shell-style globbing (wildcards) in selectors, see Shell-style globbing in the selector. You can also reference the name of a filter that matches the messages, see Using filters as selector.

  • The name of the name-value pair that AxoSyslog adds to matching log messages.

  • The value of the name-value pair. Starting with AxoSyslog version 3.22, the value can be a template or a template function, for example, "selector3,name,$(echo $HOST_FROM)".

For example, the following CSV-file contains three entries, each identified by an IP address, and adds the host-role field to the matching log messages.

   192.168.1.1,host-role,webserver
    192.168.2.1,host-role,firewall
    192.168.3.1,host-role,mailserver

The database file:

The database file must comply with the RFC4180 CSV format, with the following exceptions and limitations:

  • The values of the CSV-file cannot contain line-breaks.

To add multiple name-value pairs to a message, include a separate line in the database for each name-value pair, for example:

   192.168.1.1,host-role,webserver
    192.168.1.1,contact-person,"John Doe"
    192.168.1.1,contact-email,johndoe@example.com
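
The row layout above can be illustrated with a minimal Python sketch. It is not how AxoSyslog loads the file. It only shows how rows that share a selector accumulate into several name-value pairs. The `DATABASE` text and the `load_database` helper are hypothetical names introduced for this illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical database content in the selector,name,value layout.
DATABASE = '''192.168.1.1,host-role,webserver
192.168.1.1,contact-person,"John Doe"
192.168.2.1,host-role,firewall
'''

def load_database(text):
    """Group the (name, value) pairs of every row under its selector."""
    entries = defaultdict(list)
    # csv.reader handles quoted values such as "John Doe".
    for selector, name, value in csv.reader(io.StringIO(text)):
        entries[selector].append((name, value))
    return dict(entries)

db = load_database(DATABASE)
print(db["192.168.1.1"])
```

Every row with the 192.168.1.1 selector contributes one pair, so a message from that host would receive both host-role and contact-person.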

Technically, add-contextual-data() is a parser in AxoSyslog so you have to define it as a parser object.

Declaration:

   parser p_add_context_data {
        add-contextual-data(
            selector("${HOST}"),
            database("context-info-db.csv"),
        );
    };

You can also add data to messages that do not have a matching selector entry in the database using the default-selector() option.

If you modify the database file, you have to reload AxoSyslog for the changes to take effect. If reloading AxoSyslog or the database file fails for some reason, AxoSyslog will keep using the last working database file.

Example: Adding metadata from a CSV file

The following example uses a CSV database to add the role of the host based on its IP address, and prefixes the added name-value pairs with .metadata. The destination includes a template that simply appends the added name-value pairs to the end of the log message.

   @include "scl.conf"
    
    source s_network {
        network(port(5555));
    };
    
    destination d_local {
        file("/tmp/test-msgs.log"
            template("$MSG Additional metadata:[${.metadata.host-role}]\n"));
    };
    
    parser p_add_context_data {
        add-contextual-data(selector("$SOURCEIP"), database("context-info-db.csv"), default-selector("unknown"), prefix(".metadata."));
    };
    
    log {
        source(s_network);
        parser(p_add_context_data);
        destination(d_local);
    };

The context-info-db.csv database file used in this example:

    192.168.1.1,host-role,webserver
    192.168.2.1,host-role,firewall
    192.168.3.1,host-role,mailserver
    unknown,host-role,unknown
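
To make the interplay of selector(), default-selector(), and prefix() in this example easier to follow, here is a rough Python sketch of the lookup. It is an illustration under stated assumptions, not AxoSyslog's implementation: the `add_contextual_data` function, its parameters, and the dictionary-based message are hypothetical.

```python
def add_contextual_data(message, database, selector_value,
                        default_selector=None, prefix=""):
    """Return a copy of message extended with the pairs of the matching entry."""
    pairs = database.get(selector_value)
    # Fall back to the default entry when the selector has no entry of its own.
    if pairs is None and default_selector is not None:
        pairs = database.get(default_selector)
    enriched = dict(message)
    for name, value in pairs or []:
        # The prefix is prepended to the name of every added pair.
        enriched[prefix + name] = value
    return enriched

database = {
    "192.168.1.1": [("host-role", "webserver")],
    "unknown": [("host-role", "unknown")],
}

known = add_contextual_data({"MSG": "test"}, database, "192.168.1.1",
                            default_selector="unknown", prefix=".metadata.")
other = add_contextual_data({"MSG": "test"}, database, "10.0.0.1",
                            default_selector="unknown", prefix=".metadata.")
print(known[".metadata.host-role"])
print(other[".metadata.host-role"])
```

The first call finds an entry for the source address. The second call has no entry, so it receives the pairs of the default entry instead.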

1.1 - Using filters as selector

To better control to which log messages you add contextual data, you can use filters as selectors. In this case, the first column of the CSV database file must contain the name of a filter. For each message, AxoSyslog evaluates the filters in the order they appear in the database file. If a filter matches the message, AxoSyslog adds the name-value pair related to the filter.

For example, the database file can contain the following entries. (For details on the accepted CSV-format, see database().)

   f_auth,domain,all
    f_localhost,source,localhost
    f_kern,domain,kernel

Note that AxoSyslog does not evaluate other filters after the first match. For example, if you use the previous database file, and a message matches both the f_auth and f_localhost filters, AxoSyslog adds only the name-value pair of f_auth to the message.

To add multiple name-value pairs to a message, include a separate line in the database for each name-value pair, for example:

   f_localhost,host-role,firewall
    f_localhost,contact-person,"John Doe"
    f_localhost,contact-email,johndoe@example.com
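
The evaluation order described above (stop at the first matching filter, then add every pair listed for that filter) can be sketched in Python as follows. The predicates in `FILTERS` are hypothetical stand-ins for real AxoSyslog filters, and the `lookup` helper is introduced only for this illustration.

```python
# Hypothetical stand-ins for filters; real filters are written in AxoSyslog syntax.
FILTERS = {
    "f_auth": lambda msg: msg["facility"] == 4,
    "f_localhost": lambda msg: msg["host"] == "mymachine.example.com",
    "f_kern": lambda msg: msg["facility"] == 0,
}

# Database rows in file order: (filter name, name, value).
ROWS = [
    ("f_auth", "domain", "all"),
    ("f_localhost", "source", "localhost"),
    ("f_localhost", "host-role", "firewall"),
    ("f_kern", "domain", "kernel"),
]

def lookup(msg, rows, filters):
    """Return the pairs of the first filter (in database order) that matches msg."""
    first = next((f for f, _, _ in rows if filters[f](msg)), None)
    if first is None:
        return []
    # Every row that belongs to the first matching filter contributes a pair.
    return [(name, value) for f, name, value in rows if f == first]

both = {"facility": 4, "host": "mymachine.example.com"}
print(lookup(both, ROWS, FILTERS))
```

The sample message matches both f_auth and f_localhost, but only the f_auth pair is returned, because f_auth appears first in the database.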

You can also add data to messages that do not have a matching selector entry in the database using the default-selector() option.

You must store the filters you reference in a database in a separate file. This file is similar to an AxoSyslog configuration file, but must contain only a version string and filters (and optionally comments). You can use the syslog-ng --syntax-only command to ensure that the file is valid. For example, the content of such a file can be:

   @version: 4.5.0
    filter f_localhost { host("mymachine.example.com") };
    filter f_auth { facility(4) };
    filter f_kern { facility(0) };

Declaration:

   parser p_add_context_data_filter {
        add-contextual-data(
            selector(filters("filters.conf")),
            database("context-info-db.csv"),
            prefix(".metadata.")
        );
    };

If you modify the database file, or the file that contains the filters, you have to reload AxoSyslog for the changes to take effect. If reloading AxoSyslog or the files fails for some reason, AxoSyslog will keep using the last working version of the file.

1.2 - Shell-style globbing in the selector

In AxoSyslog 3.24 and later, you can use shell-style globbing ('*' and '?' wildcards) in the selector.

To use globs in a selector:

  1. Use the glob() option within the selector() option in your AxoSyslog configuration file, for example:

        parser p_add_context_data {
            add-contextual-data(
                selector(glob("${HOST}"))
                database("context-info-db.csv")
            );
        };
    
  2. Use globs and wildcards in the selector column of your CSV-file, for example:

        example-glob-entry1*,sourcetype,:hec:user
        example-glob-entry2*,sourcetype,:hec:user
        postfix*,sourcetype,:hec:mta
    

Note the following points when using globbing in the selector:

  • The order of the entries in the CSV-file determines the order in which the patterns are matched.

  • The globs are matched against the expanded template string sequentially.

  • Put more specific patterns at the top of the CSV-file. The AxoSyslog application does not evaluate other entries after the first match.

  • In debug mode, AxoSyslog sends log messages to its internal() destination to help troubleshooting. For example:

        [2019-09-21T06:01:10.748237] add-contextual-data(): Evaluating glob against message; glob-template='$PROGRAM', string='postfix/smtpd', pattern='example-glob-entry1*', matched='0'
        [2019-09-21T06:01:10.748562] add-contextual-data(): Evaluating glob against message; glob-template='$PROGRAM', string='postfix/smtpd', pattern='example-glob-entry2*', matched='0'
        [2019-09-21T06:01:10.748697] add-contextual-data(): Evaluating glob against message; glob-template='$PROGRAM', string='postfix/smtpd', pattern='postfix*', matched='1'
        [2019-09-21T06:01:10.750084] add-contextual-data(): message lookup finished; message='almafa', resolved_selector='postfix*', selector='postfix*', msg='0x8e15320'
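
The first-match behavior in the points above can be approximated with a short Python sketch. It uses Python's fnmatch module as a stand-in for the glob matcher, which is an assumption: AxoSyslog's own matching may differ in details. The `first_glob_match` helper and the `:hec:generic` value are hypothetical.

```python
from fnmatch import fnmatchcase

# Rows in CSV order: (glob pattern, name, value).
ROWS = [
    ("example-glob-entry1*", "sourcetype", ":hec:user"),
    ("example-glob-entry2*", "sourcetype", ":hec:user"),
    ("postfix*", "sourcetype", ":hec:mta"),
]

def first_glob_match(value, rows):
    """Return the first row whose pattern matches value, or None."""
    for row in rows:
        if fnmatchcase(value, row[0]):
            return row  # later rows are not evaluated
    return None

print(first_glob_match("postfix/smtpd", ROWS))

# A broad pattern placed first shadows the more specific one below it.
shadowed = [
    ("post*", "sourcetype", ":hec:generic"),
    ("postfix*", "sourcetype", ":hec:mta"),
]
print(first_glob_match("postfix/smtpd", shadowed))
```

In the second lookup the postfix* row is never reached, which is why more specific patterns belong at the top of the file.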
    

1.3 - Options of add-contextual-data()

The add-contextual-data() parser has the following options.

Required options:

The following options are required: selector(), database().

database()

Type: path to a .csv file
Default:

Description: Specifies the path to the CSV file, for example, /opt/syslog-ng/my-csv-database.csv. The extension of the file must be .csv, and the file can use Windows-style (CRLF) or UNIX-style (LF) linebreaks. You can use an absolute path, or a path relative to the syslog-ng binary.

default-selector()

Synopsis: default-selector()

Description: Specifies the ID of the entry (line) that corresponds to log messages that do not have a selector that matches an entry in the database. For example, if you add name-value pairs from the database based on the hostname from the log message (selector("${HOST}")), then you can include a line for unknown hosts in the database, and set default-selector() to the ID of the line for unknown hosts. In the CSV file:

   unknown-hostname,host-role,unknown

In the AxoSyslog configuration file:

   add-contextual-data(
        selector("$HOST")
        database("context-info-db.csv")
        default-selector("unknown-hostname")
    );

ignore-case()

Synopsis: ignore-case()
Default: ignore-case(no)

Description: Specifies whether selectors are matched case-insensitively. If you set the ignore-case() option to yes, the case of the selector is ignored when it is compared to the entries of the database.
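
As a rough illustration of the difference, the following Python sketch compares a selector with the database entries with and without case sensitivity. The `find_entry` helper is hypothetical, and lowercasing both sides is an assumption about how the comparison could be done, not a description of AxoSyslog internals.

```python
def find_entry(selector_value, database, ignore_case=False):
    """Return the pairs stored for selector_value, or None if nothing matches."""
    if not ignore_case:
        return database.get(selector_value)
    wanted = selector_value.lower()
    for key, pairs in database.items():
        # Compare the lowercased forms so that only the letter case is ignored.
        if key.lower() == wanted:
            return pairs
    return None

database = {"WebServer01": [("host-role", "webserver")]}
print(find_entry("webserver01", database))
print(find_entry("webserver01", database, ignore_case=True))
```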

prefix()

Synopsis: prefix()

Description: Insert a prefix before the name part of the added name-value pairs (including the pairs added by the default-selector()) to help further processing.

selector()

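Synopsis: selector()

Description: Specifies the string or macro that AxoSyslog evaluates for each message. If its value matches the ID of an entry in the database, AxoSyslog adds the name-value pair of every matching database entry to the log message. You can use the following in the selector() option:

  • A macro or template string, for example, selector("${HOST}").

  • The filters() option, to use filters as selectors. For details, see Using filters as selector.

  • The glob() option, to use shell-style globbing. For details, see Shell-style globbing in the selector.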

2 - Looking up GeoIP data from IP addresses (DEPRECATED)

This parser is deprecated. Use Looking up GeoIP2 data from IP addresses instead.

The AxoSyslog application can look up IPv4 addresses from an offline GeoIP database, and make the retrieved data available in name-value pairs. IPv6 addresses are not supported. Depending on the database used, you can access country code, longitude, and latitude information.

You can refer to the separated parts of the message using the key of the value as a macro. For example, if the message contains KEY1=value1,KEY2=value2, you can refer to the values as ${KEY1} and ${KEY2}.

Declaration:

   parser parser_name {
        geoip(
            <macro-containing-the-IP-address-to-lookup>
            prefix()
            database("<path-to-database-file>")
        );
    };

Example: Using the GeoIP parser

In the following example, AxoSyslog retrieves the GeoIP data of the IP address contained in the ${HOST} field of the incoming message, and includes the data (prefixed with the geoip. string) in the output JSON message.

   @version: 3.7
    
    options {
        keep-hostname(yes);
    };
    
    source s_file {
        file("/tmp/input");
    };
    
    parser p_geoip { geoip( "${HOST}", prefix( "geoip." ) database( "/usr/share/GeoIP/GeoLiteCity.dat" ) ); };
    
    destination d_file {
        file( "/tmp/output" template("$(format-json --scope core --key geoip*)\n") );
    };
    
    
    log {
        source(s_file);
        parser(p_geoip);
        destination(d_file);
    };

For example, for the <38>Jan  1 14:45:22 192.168.1.1 prg00000[1234]: test message message, the output will look like:

   {"geoip":{"longitude":"47.460704","latitude":"19.049968","country_code":"HU"},"PROGRAM":"prg00000","PRIORITY":"info","PID":"1234","MESSAGE":"test message","HOST":"192.168.1.1","FACILITY":"auth","DATE":"Jan  1 14:45:22"}

If you are transferring your log messages into Elasticsearch, use the following rewrite rule to combine the longitude and latitude information into a single value (called geoip.location), and set the mapping in Elasticsearch accordingly. Do not forget to include the rewrite in your log path. For details on transferring your log messages to Elasticsearch, see elasticsearch2: DEPRECATED - Send messages directly to Elasticsearch version 2.0 or higher.

   rewrite r_geoip {
        set(
            "${geoip.latitude},${geoip.longitude}",
            value( "geoip.location" ),
            condition(not "${geoip.latitude}" == "")
        );
    };
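
The logic of this rewrite rule can be sketched in Python as follows. The `geoip_location` helper is a hypothetical name, and the coordinates are illustrative values only. The string form of an Elasticsearch geo_point lists the latitude first, then the longitude, which is why the rule concatenates the values in that order.

```python
def geoip_location(latitude, longitude):
    """Return "latitude,longitude", or None when no latitude was looked up."""
    # Mirrors the condition() of the rewrite rule: skip messages without GeoIP data.
    if latitude == "":
        return None
    return f"{latitude},{longitude}"

print(geoip_location("47.510200", "19.070200"))
print(geoip_location("", ""))
```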

In your Elasticsearch configuration, set the appropriate mappings:

   {
       "mappings" : {
          "_default_" : {
             "properties" : {
                "geoip" : {
                   "properties" : {
                      "country_code" : {
                         "index" : "not_analyzed",
                         "type" : "string",
                         "doc_values" : true
                      },
                      "latitude" : {
                         "index" : "not_analyzed",
                         "type" : "string",
                         "doc_values" : true
                      },
                      "longitude" : {
                         "type" : "string",
                         "doc_values" : true,
                         "index" : "not_analyzed"
                      },
                      "location" : {
                         "type" : "geo_point"
                      }
                   }
                }
             }
          }
       }
    }

2.1 - Options of geoip parsers

The geoip parser has the following options.

prefix()

Synopsis: prefix()

Description: Insert a prefix before the name part of the parsed name-value pairs to help further processing. For example:

  • To insert the my-parsed-data. prefix, use the prefix(my-parsed-data.) option.

  • To refer to a particular piece of data that has a prefix, use the prefix in the name of the macro, for example, ${my-parsed-data.name}.

  • If you forward the parsed messages using the IETF-syslog protocol, you can insert all the parsed data into the SDATA part of the message using the prefix(.SDATA.my-parsed-data.) option.

Names starting with a dot (for example, .example) are reserved for use by AxoSyslog. If you use such a macro name as the name of a parsed value, it will attempt to replace the original value of the macro (note that only soft macros can be overwritten, see Hard versus soft macros for details). To avoid such problems, use a prefix when naming the parsed values, for example, prefix(my-parsed-data.).

For example, to insert the geoip. prefix, use the prefix(geoip.) option. To refer to a particular piece of data when using a prefix, use the prefix in the name of the macro, for example, ${geoip.country_code}.

database()

Synopsis: database()
Default: /usr/share/GeoIP/GeoIP.dat

Description: The full path to the GeoIP database to use. Note that AxoSyslog must have the required privileges to read this file. Do not modify or delete this file while AxoSyslog is running, because doing so can crash AxoSyslog.

3 - Looking up GeoIP2 data from IP addresses

The AxoSyslog application can look up IP addresses from an offline GeoIP2 database, and make the retrieved data available in name-value pairs. Depending on the database used, you can access information such as the country code, longitude, and latitude.

The AxoSyslog application works with the Country and the City versions of the GeoIP2 database, both the free and the commercial editions. The AxoSyslog application works with the mmdb (GeoIP2) format of these databases. Other formats, like csv, are not supported.

3.1 - Referring to parts of the message as a macro

You can refer to the separated parts of the message using the key of the value as a macro. For example, if the message contains KEY1=value1,KEY2=value2, you can refer to the values as ${KEY1} and ${KEY2}.

For example, if the default prefix (.geoip2) is used, you can determine the country code using ${.geoip2.country.iso_code}.

To look up all keys:

  1. Install the mmdb-bin package.

    After installing this package, you will be able to use the mmdblookup command.

  2. Create a dump using the following command: mmdblookup --file GeoLite2-City.mmdb --ip <IP-address>

    The resulting dump file will contain the keys that you can use.

For a more complete list of keys, you can also check the GeoIP2 City and Country CSV Databases. However, note that the AxoSyslog application works with the mmdb (GeoIP2) format of these databases. Other formats, like csv, are not supported.

3.2 - Using the GeoIP2 parser

Declaration:

   parser parser_name {
        geoip2(
            <macro-containing-the-IP-address-to-lookup>
            prefix()
            database("<path-to-geoip2-database-file>")
        );
    };

In the following example, AxoSyslog retrieves the GeoIP2 data of the IP address contained in the ${HOST} field of the incoming message (assuming that in this case the ${HOST} field contains an IP address), and includes the data (prefixed with the geoip2. string) in the output JSON message.

   @version: 3.11
    
    options {
        keep-hostname(yes);
    };
    
    source s_file {
        file("/tmp/input");
    };
    
    parser p_geoip2 {
        geoip2(
            "${HOST}",
            prefix( "geoip2." )
            database( "/usr/share/GeoIP2/GeoLite2-City.mmdb" )
        );
    };
    
    destination d_file {
        file(
            "/tmp/output"
            flags(syslog-protocol)
            template("$(format-json --scope core --key geoip2*)\n")
        );
    };
    
    
    log {
        source(s_file);
        parser(p_geoip2);
        destination(d_file);
    };

For example, for the <38>2017-05-24T13:09:46 192.168.1.1 prg00000[1234]: test message message the output will look like:

   <38>1 2017-05-24T13:09:46+02:00 192.168.1.1 prg00000 1234 - [meta sequenceId="3"] {"geoip2":{"subdivisions":{"0":{"names":{"en":"Budapest"},"iso_code":"BU","geoname_id":"3054638"}},"registered_country":{"names":{"en":"Hungary"},"iso_code":"HU","geoname_id":"719819"},"postal":{"code":"1063"},"location":{"time_zone":"Europe/Budapest","longitude":"19.070200","latitude":"47.510200","accuracy_radius":"5"},"country":{"names":{"en":"Hungary"},"iso_code":"HU","geoname_id":"719819"},"continent":{"names":{"en":"Europe"},"geoname_id":"6255148","code":"EU"},"city":{"names":{"en":"Budapest"},"geoname_id":"3054643"}},"PROGRAM":"prg00000","PRIORITY":"info","PID":"1234","MESSAGE":"test message","HOST":"192.168.1.1","FACILITY":"auth","DATE":"May 24 13:09:46"}
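
The nesting of the JSON output mirrors the dotted names of the name-value pairs, so you can derive the macro names from it. The following Python sketch, with a hypothetical `dotted_names` helper and an excerpt of the output above as sample data, flattens the nested keys back into dotted names.

```python
import json

def dotted_names(value, prefix=""):
    """Flatten nested JSON objects into a list of dotted key names."""
    if isinstance(value, dict):
        names = []
        for key, child in value.items():
            child_prefix = f"{prefix}.{key}" if prefix else key
            names.extend(dotted_names(child, child_prefix))
        return names
    return [prefix]

# Excerpt of the example output, reduced to a few keys.
sample = json.loads(
    '{"geoip2":{"country":{"iso_code":"HU"},'
    '"location":{"latitude":"47.510200","longitude":"19.070200"}}}'
)

for name in dotted_names(sample):
    print("${" + name + "}")
```

Each printed name, for example ${geoip2.location.latitude}, is the macro that refers to the corresponding value when the geoip2. prefix is used.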

3.3 - Transferring your logs to Elasticsearch using GeoIP2

If you are transferring your log messages into Elasticsearch, use the following rewrite rule to combine the longitude and latitude information into a single value (called geoip2.location2), and set the mapping in Elasticsearch accordingly. Do not forget to include the rewrite in your log path. These examples assume that you used prefix("geoip2.") instead of the default for the geoip2 parser. For details on transferring your log messages to Elasticsearch, see elasticsearch2: DEPRECATED - Send messages directly to Elasticsearch version 2.0 or higher.

   rewrite r_geoip2 {
        set(
            "${geoip2.location.latitude},${geoip2.location.longitude}",
            value( "geoip2.location2" ),
            condition(not "${geoip2.location.latitude}" == "")
        );
    };

In your Elasticsearch configuration, set the appropriate mappings:

   {
       "mappings" : {
          "_default_" : {
             "properties" : {
                "geoip2" : {
                   "properties" : {
                      "location2" : {
                         "type" : "geo_point"
                      }
                   }
                }
             }
          }
       }
    }

3.4 - Options of geoip2 parsers

The geoip2 parser has the following options.

prefix()

Synopsis: prefix()

Description: Insert a prefix before the name part of the parsed name-value pairs to help further processing. For example:

  • To insert the my-parsed-data. prefix, use the prefix(my-parsed-data.) option.

  • To refer to a particular piece of data that has a prefix, use the prefix in the name of the macro, for example, ${my-parsed-data.name}.

  • If you forward the parsed messages using the IETF-syslog protocol, you can insert all the parsed data into the SDATA part of the message using the prefix(.SDATA.my-parsed-data.) option.

Names starting with a dot (for example, .example) are reserved for use by AxoSyslog. If you use such a macro name as the name of a parsed value, it will attempt to replace the original value of the macro (note that only soft macros can be overwritten, see Hard versus soft macros for details). To avoid such problems, use a prefix when naming the parsed values, for example, prefix(my-parsed-data.).

For example, to insert the .geoip2 prefix, use the prefix(.geoip2) option. To refer to a particular piece of data when using a prefix, use the prefix in the name of the macro, for example, ${.geoip2.country.iso_code}.

database()

Synopsis: database()
Default:

Description: Path to the GeoIP2 database to use. Both absolute and relative paths work. Note that AxoSyslog must have the required privileges to read this file. Do not modify or delete this file while AxoSyslog is running, because doing so can crash AxoSyslog.

Starting with version 3.24, AxoSyslog tries to automatically detect the location of the database. If that is successful, the database() option is not mandatory.