1 - The philosophy of AxoSyslog

Typically, AxoSyslog is used to manage log messages and implement centralized logging, where the aim is to collect the log messages of several devices on a single, central log server. The different devices — called AxoSyslog clients — all run AxoSyslog, and collect the log messages from the various applications, files, and other sources. The clients send all important log messages to the remote AxoSyslog server, which sorts and stores them.

2 - Logging with AxoSyslog

The AxoSyslog application reads incoming messages and forwards them to the selected destinations. The AxoSyslog application can receive messages from files, remote hosts, and other sources.

Log messages enter AxoSyslog in one of the defined sources, and are sent to one or more destinations.

Sources and destinations are independent objects, log paths define what AxoSyslog does with a message, connecting the sources to the destinations. A log path consists of one or more sources and one or more destinations: messages arriving from a source are sent to every destination listed in the log path. A log path defined in AxoSyslog is called a log statement.

Optionally, log paths can include filters. Filters are rules that select only certain messages, for example, selecting only messages sent by a specific application. If a log path includes filters, AxoSyslog sends only the messages satisfying the filter rules to the destinations set in the log path.

Other optional elements that can appear in log statements are parsers and rewriting rules. Parsers segment messages into different fields to help processing the messages, while rewrite rules modify the messages by adding, replacing, or removing parts of the messages.

2.1 - The route of a log message in AxoSyslog

Purpose:

The following procedure illustrates the route of a log message from its source on the AxoSyslog client to its final destination on the central AxoSyslog server.

The route of a log message

Steps:

  1. A device or application sends a log message to a source on the AxoSyslog client. For example, an Apache web server running on Linux enters a message into the /var/log/apache file.

  2. The AxoSyslog client running on the web server reads the message from its /var/log/apache source.

  3. The AxoSyslog client processes the first log statement that includes the /var/log/apache source.

  4. The AxoSyslog client performs optional operations (message filtering, parsing, and rewriting) on the message, for example, it compares the message to the filters of the log statement (if any). If the message complies with all filter rules, AxoSyslog sends the message to the destinations set in the log statement, for example, to the remote AxoSyslog server.

  5. The AxoSyslog client processes the next log statement that includes the /var/log/apache source, repeating Steps 3-4.

  6. The message sent by the AxoSyslog client arrives from a source set in the AxoSyslog server.

  7. The AxoSyslog server reads the message from its source and processes the first log statement that includes that source.

  8. The AxoSyslog server performs optional operations (message filtering, parsing, and rewriting) on the message, for example, it compares the message to the filters of the log statement (if any). If the message complies with all filter rules, AxoSyslog sends the message to the destinations set in the log statement.

  9. The AxoSyslog server processes the next log statement, repeating Steps 7-9.

3 - Modes of operation

The AxoSyslog application has three typical operation scenarios: Client, Server, and Relay.

3.1 - Client mode

Processing logs in client mode

In client mode, AxoSyslog collects the local logs generated by the host and forwards them through a network connection to the central AxoSyslog server or to a relay. Clients often also log the messages locally into files.

3.2 - Relay mode

Processing logs in relay mode

In relay mode, AxoSyslog receives logs through the network from AxoSyslog clients and forwards them to the central AxoSyslog server using a network connection. Relays also log the messages from the relay host into a local file, or forward these messages to the central AxoSyslog server.

Example relay use cases

The relay collects log messages through the network and after processing, but without writing them on the disk for storage, forwards them to one or more remote destinations.

You can use a relay for many different use cases as described in the examples below.

UDP-only source devices

Most network devices send log messages over UDP. However, UDP does not guarantee that all packets are delivered, which makes UDP unreliable.

To ensure at least a best effort level of reliability, Axoflow recommends that you deploy a relay on the network, close to the source devices. With the most reliable hops between the source and the relay, you can minimize the risk of losing UDP packets. Once the packet arrives at the relay, AxoSyslog ensures that the messages are delivered to the central server in a reliable manner.

Too many source devices

Depending on the hardware and configuration, an average instance can usually handle the following number of concurrent connections:

If the maximum message rate is lower than 200,000 messages per second:

  • maximum ca. 5,000 TCP connections

  • maximum ca. 1,000 TLS connections

If the message rate is higher than 200,000 messages per second, contact Axoflow.

If you have more source devices, you must deploy a relay machine at least per 5,000 sources and batch up all the logs into a single TCP connection that connects the relay to the server. If TLS is used, deploy relays per 1,000 source devices.

Collecting logs from remote sites (especially over public WAN)

If you need to collect log messages from geographically remote sites or over public WAN, Axoflow recommends that you install at least a relay node per each remote site. The relay can be the last outgoing hop for all the messages of the remote site, which has several benefits:

  • Maintenance: You only need to change the configuration of the relay if you want to re-route the logs of some or all sources of the remote site. Also you do not need to change each source’s configuration one by one.

  • Security: If you trust your internal network, it is not necessary to hold encrypted connections within the LAN of the remote site as the messages can get to the relay without encryption. Messages must be sent in an encrypted way over the public WAN, and it is enough to hold only a single TCP/TLS connection between the sites, that is, between the remote relay and the central server. This eliminates the wasting of resources as holding several TLS connections directly from the clients is more costly than holding a single connection from the relay.

  • Reliability: You can set up a main disk-buffer on the relay. The main disk-buffer is only responsible for buffering all the logs of the remote site if the central AxoSyslog server is temporarily unavailable. It is easier to maintain this single main disk-buffer instead of setting disk-buffers on individual client machines.

Separation, distribution, and balancing of message processing tasks

Most Linux applications have their own human readable, but difficult to handle, log messages. Without parsing and normalization it is difficult to alert and report on these log messages. Many AxoSyslog users use the message parsing tools of AxoSyslog to normalize their different log messages. Just like normalization, filtering can also be resource-heavy, depending on what the filtering is based on. In this case, it might be inefficient to perform all the message processing tasks on the server as it can result in decreased overall performance.

It is a typical setup to deploy relays in front of the central server operating as a receiver front-end. Most resource-heavy tasks, for example, parsing, filtering, and so on, are performed on this receiver layer. As all resource-heavy tasks are performed on the relay, the central server behind it only needs to get the messages from the relay and write them into the final text-based or tamper-proof (logstore) format. Since you can run several relays, you can balance the resource-heavy tasks between more relays, and a single server behind the relays can still be fast enough to write all the messages on the disk.

Acting as a relay also depends on the functionality. A relay does not have to be a dedicated relay machine at all. For log collection, it can be one of the clients with a relay configuration. Note that in a robust log collection infrastructure, the relays have their own purpose, and Axoflow recommends running dedicated relay machines.

You can run several parallel relays to ensure horizontal redundancy. For example, if each of the relays has the same configuration, when one relay goes down another relay can take over the processing. Distribution of the logs can be done by the built-in client-side failover functionality and also by a general load balancer. The load balancer is also used to serve N+1 redundant relay deployments. In this case, switching from one relay to another relay is done when there is an outage but also for real load balancing purposes.

What AxoSyslog relays are not good for

The purpose of the relay is to buffer the logs for short term, for example, a few minutes or a few hours long outages (depending on the log volume). It is not designed to buffer logs generated by the sources during a very long server or connection outage, for example, up to a few days long.

If you expect extended outages, Axoflow recommends that you deploy servers instead of relays. There are many deployments where long term storage and archiving are performed on the central AxoSyslog server, but relays also do short-term log storage.

3.3 - Server mode

Processing logs in server mode

In server mode, AxoSyslog acts as a central log-collecting server. It receives messages from AxoSyslog clients and relays over the network, and stores them locally in files, or passes them to other applications, for example, log analyzers.

4 - Global objects

The AxoSyslog application uses the following objects:

  • Source driver: A communication method used to receive log messages. For example, AxoSyslog can receive messages from a remote host via TCP/IP, or read the messages of a local application from a file. For details on source drivers, see source: Read, receive, and collect log messages.

  • Source: A named collection of configured source drivers.

  • Destination driver: A communication method used to send log messages. For example, AxoSyslog can send messages to a remote host via TCP/IP, or write the messages into a file or database. For details on destination drivers, see destination: Forward, send, and store log messages.

  • Destination: A named collection of configured destination drivers.

  • Filter: An expression to select messages. For example, a simple filter can select the messages received from a specific host. For details, see Customize message format using macros and templates.

  • Macro: An identifier that refers to a part of the log message. For example, the ${HOST} macro returns the name of the host that sent the message. Macros are often used in templates and filenames. For details, see Customize message format using macros and templates.

  • Parser: Parsers are objects that parse the incoming messages, or parts of a message. For example, the csv-parser() can segment messages into separate columns at a predefined separator character (for example, a comma). Every column has a unique name that can be used as a macro. For details, see parser: Parse and segment structured messages and db-parser: Process message content with a pattern database (patterndb).

  • Rewrite rule: A rule modifies a part of the message, for example, replaces a string, or sets a field to a specified value. For details, see Modifying messages using rewrite rules.

  • Log paths: A combination of sources, destinations, and other objects like filters, parsers, and rewrite rules. The AxoSyslog application sends messages arriving from the sources of the log paths to the defined destinations, and performs filtering, parsing, and rewriting of the messages. Log paths are also called log statements. Log statements can include other (embedded) log statements and junctions to create complex log paths. For details, see log: Filter and route log messages using log paths, flags, and filters.

  • Template: A template is a set of macros that can be used to restructure log messages or automatically generate file names. For example, a template can add the hostname and the date to the beginning of every log message. For details, see Customize message format using macros and templates.

  • Option: Options set global parameters of syslog-ng, like the parameters of name resolution and timezone handling. For details, see Global options.

For details on the above objects, see The configuration syntax in detail.

5 - Timezones and daylight saving

The AxoSyslog application receives the timezone and daylight saving information from the operating system it is installed on. If the operating system handles daylight saving correctly, so does syslog-ng.

The AxoSyslog application supports messages originating from different timezones. The original syslog protocol (RFC3164) does not include timezone information, but syslog-ng provides a solution by extending the syslog protocol to include the timezone in the log messages. The AxoSyslog application also enables administrators to supply timezone information for legacy devices which do not support the protocol extension.

5.1 - Assigning timezone to the message

When AxoSyslog receives a message, it assigns timezone information to the message using the following algorithm.

  1. The sender application or host specifies the timezone of the messages. If the incoming message includes a timezone it is associated with the message. Otherwise, the local timezone is assumed.

  2. Specify the time-zone() parameter for the source driver that reads the message. This timezone will be associated with the messages only if no timezone is specified within the message itself. Each source defaults to the value of the recv-time-zone() global option. It is not possible to override only the timezone information of the incoming message, but setting the keep-timestamp() option to no allows AxoSyslog to replace the full timestamp (timezone included) with the time the message was received.

  3. Specify the timezone in the destination driver using the time-zone() parameter. Each destination driver might have an associated timezone value: syslog-ng converts message timestamps to this timezone before sending the message to its destination (file or network socket). Each destination defaults to the value of the send-time-zone() global option.

  4. If the timezone is not specified, local timezone is used.

  5. When macro expansions are used in the destination filenames, the local timezone is used. (Also, if the timestamp of the received message does not contain the year of the message, AxoSyslog uses the local year.)

5.2 - A note on timezones and timestamps

If the clients run syslog-ng, then use the ISO timestamp, because it includes timezone information. That way you do not need to adjust the recv-time-zone() parameter of syslog-ng.

If you want to output timestamps in Unix (POSIX) time format, use the S_UNIXTIME and R_UNIXTIME macros. You do not need to change any of the timezone related parameters, because the timestamp information of incoming messages is converted to Unix time internally, and Unix time is a timezone-independent time representation. (Actually, Unix time measures the number of seconds elapsed since midnight of Coordinated Universal Time (UTC) January 1, 1970, but does not count leap seconds.)

6 - Product licensing

Starting with version 3.2, the AxoSyslog application is licensed under a combined LGPL+GPL license. The core of AxoSyslog is licensed under the GNU Lesser General Public License Version 2.1 license, while the rest of the codebase is licensed under the GNU General Public License Version 2 license.

For details about the LGPL and GPL licenses, see GNU Lesser General Public License and GNU General Public License, respectively.

The Documentation is licensed separately. For details, see The syslog-ng OSE Documentation License.

7 - High availability support

Multiple AxoSyslog servers can be run in fail-over mode. The AxoSyslog application does not include any internal support for this, as clustering support must be implemented on the operating system level. A tool that can be used to create UNIX clusters is Heartbeat (for details, see this page).

Starting with AxoSyslog version 3.2, AxoSyslog clients can be configured to send the log messages to failover servers in case the primary syslog server becomes unaccessible. For details on configuring failover servers, see the description of the failover-servers() destination option in destination: Forward, send, and store log messages.

8 - The structure of a log message

The following sections describe the structure of log messages. Currently there are two standard syslog message formats:

8.1 - BSD-syslog or legacy-syslog messages

This section describes the format of a syslog message, according to the legacy-syslog or BSD-syslog protocol. A syslog message consists of the following parts:

  • [PRI](/docs/axosyslog-core/chapter-concepts/concepts-message-structure/concepts-message-bsdsyslog/concepts-message-bsdsyslog-pri/)

  • [HEADER](/docs/axosyslog-core/chapter-concepts/concepts-message-structure/concepts-message-bsdsyslog/concepts-message-bsdsyslog-header/)

  • [MSG](/docs/axosyslog-core/chapter-concepts/concepts-message-structure/concepts-message-bsdsyslog/concepts-message-bsdsyslog-msg/)

The total message cannot be longer than 1024 bytes.

The following is a sample syslog message:

   <133>Feb 25 14:09:07 webserver syslogd: restart

The message corresponds to the following format:

   <priority>timestamp hostname application: message

The different parts of the message are explained in the following sections.

8.1.1 - The PRI message part

This section describes the PRI message part of a syslog message, according to the legacy-syslog or BSD-syslog protocol.

For further details about the HEADER and MSG parts of a syslog message, see the following sections:

  • [HEADER](/docs/axosyslog-core/chapter-concepts/concepts-message-structure/concepts-message-bsdsyslog/concepts-message-bsdsyslog-header/)

  • [MSG](/docs/axosyslog-core/chapter-concepts/concepts-message-structure/concepts-message-bsdsyslog/concepts-message-bsdsyslog-msg/)

The PRI message part

The PRI part of the syslog message (known as Priority value) represents the Facility and Severity of the message. Facility represents the part of the system sending the message, while Severity marks its importance.

PRI formula

The Priority value is calculated using the following formula:

   <PRI> = ( <facility> * 8) + <severity> 

That is, you first multiply the Facility number by 8, and then add the numerical value of the Severity to the multiplied sum.

Example: the correlation between facility value, severity value, and the Priority value in the PRI message part

The following example illustrates a sample syslog message with a sample PRI field (that is, Priority value):

   <133> Feb 25 14:09:07 webserver syslogd: restart

In this example, <133> represents the PRI field (Priority value). The syslog message’s Facility value is 16, and the Severity value is 5.

Substituting the numerical values into the <PRI> = ( <facility> * 8) + <severity> formula, the results match the Priority value in our example:

<133> = ( <16> * 8) + <5>.

Facility and Severity values

The possible Facility values (between 0 and 23) and Severity values (between 0 and 7) each correspond to a message type (see Table 1: syslog Message Facilities), or a message importance level (see Table 2: syslog Message Severities).

syslog Message Facilities

The following table lists possible Facility values.

Numerical CodeFacility
0kernel messages
1user-level messages
2mail system
3system daemons
4security/authorization messages
5messages generated internally by syslogd
6line printer subsystem
7network news subsystem
8UUCP subsystem
9clock daemon
10security/authorization messages
11FTP daemon
12NTP subsystem
13log audit
14log alert
15clock daemon
16-23locally used facilities (local0-local7)

syslog Message Severities

The following table lists possible Severity values.

Numerical CodeSeverity
0Emergency: system is unusable
1Alert: action must be taken immediately
2Critical: critical conditions
3Error: error conditions
4Warning: warning conditions
5Notice: normal but significant condition
6Informational: informational messages
7Debug: debug-level messages

8.1.2 - The HEADER message part

This section describes the HEADER message part of a syslog message, according to the legacy-syslog or BSD-syslog protocol.

For further details about the MSG and PRI parts of a syslog message, see the following sections:

The HEADER message part

The HEADER message part contains a timestamp and the hostname (without the domain name) or the IP address of the device. The timestamp field is the local time in the Mmm dd hh:mm:ss format, where:

  • Mmm is the English abbreviation of the month: Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec.

  • dd is the day of the month on two digits. If the day of the month is less than 10, the first digit is replaced with a space. (for example, Aug 7.)

  • hh:mm:ss is the local time. The hour (hh) is represented in a 24-hour format. Valid entries are between 00 and 23, inclusive. The minute (mm) and second (ss) entries are between 00 and 59 inclusive.

8.1.3 - The MSG message part

This section describes the MSG message part of a syslog message, according to the legacy-syslog or BSD-syslog protocol.

For further details about the HEADER and PRI message parts of a syslog message, see the following sections:

The MSG message part

The MSG part contains the name of the program or process that generated the message, and the text of the message itself. The MSG part is usually in the following format: program[pid]: message text.

8.2 - IETF-syslog messages

This section describes the format of a syslog message, according to the IETF-syslog protocol. A syslog message consists of the following parts:

  • HEADER (includes the PRI as well)

  • STRUCTURED-DATA

  • MSG

The following is a sample syslog message (source: https://tools.ietf.org/html/rfc5424):

   <34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - BOM'su root' failed for lonvick on /dev/pts/8

The message corresponds to the following format:

   <priority>VERSION ISOTIMESTAMP HOSTNAME APPLICATION PID MESSAGEID STRUCTURED-DATA MSG
  • Facility is 4, severity is 2, so PRI is 34.

  • The VERSION is 1.

  • The message was created on 11 October 2003 at 10:14:15pm UTC, 3 milliseconds into the next second.

  • The message originated from a host that identifies itself as “mymachine.example.com”.

  • The APP-NAME is “su” and the PROCID is unknown.

  • The MSGID is “ID47”.

  • The MSG is “‘su root’ failed for lonvick…”, encoded in UTF-8.

  • In this example, the encoding is defined by the BOM:

    The byte order mark (BOM) is a Unicode character used to signal the byte-order of the message text.

  • There is no STRUCTURED-DATA present in the message, this is indicated by “-” in the STRUCTURED-DATA field.

The HEADER part of the message must be in plain ASCII format, the parameter values of the STRUCTURED-DATA part must be in UTF-8, while the MSG part should be in UTF-8. The different parts of the message are explained in the following sections.

The PRI message part

The PRI part of the syslog message (known as Priority value) represents the Facility and Severity of the message. Facility represents the part of the system sending the message, while severity marks its importance. The Priority value is calculated by first multiplying the Facility number by 8 and then adding the numerical value of the Severity. The possible facility and severity values are presented below.

syslog Message Facilities

Numerical Code

Facility

0

kernel messages

1

user-level messages

2

mail system

3

system daemons

4

security/authorization messages

5

messages generated internally by syslogd

6

line printer subsystem

7

network news subsystem

8

UUCP subsystem

9

clock daemon

10

security/authorization messages

11

FTP daemon

12

NTP subsystem

13

log audit

14

log alert

15

clock daemon

16-23

locally used facilities (local0-local7)

The following table lists the severity values.

Numerical CodeSeverity
0Emergency: system is unusable
1Alert: action must be taken immediately
2Critical: critical conditions
3Error: error conditions
4Warning: warning conditions
5Notice: normal but significant condition
6Informational: informational messages
7Debug: debug-level messages

syslog Message Severities

The HEADER message part

The HEADER part contains the following elements:

  • VERSION: Version number of the syslog protocol standard. Currently this can only be 1.

  • ISOTIMESTAMP: The time when the message was generated in the ISO 8601 compatible standard timestamp format (yyyy-mm-ddThh:mm:ss+-ZONE), for example: 2006-06-13T15:58:00.123+01:00.

  • HOSTNAME: The machine that originally sent the message.

  • APPLICATION: The device or application that generated the message

  • PID: The process name or process ID of the syslog application that sent the message. It is not necessarily the process ID of the application that generated the message.

  • MESSAGEID: The ID number of the message.

The AxoSyslog application will truncate the following fields:

  • If APP-NAME is longer than 48 characters it will be truncated to 48 characters.

  • If PROC-ID is longer than 128 characters it will be truncated to 128 characters.

  • If MSGID is longer than 32 characters it will be truncated to 32 characters.

  • If HOSTNAME is longer than 255 characters it will be truncated to 255 characters.

The STRUCTURED-DATA message part

The STRUCTURED-DATA message part may contain meta- information about the syslog message, or application-specific information such as traffic counters or IP addresses. STRUCTURED-DATA consists of data blocks enclosed in brackets ([]). Every block includes the ID of the block, and one or more name=value pairs. The AxoSyslog application automatically parses the STRUCTURED-DATA part of syslog messages, which can be referenced in macros (for details, see Macros of AxoSyslog). An example STRUCTURED-DATA block looks like:

   [exampleSDID@0 iut="3" eventSource="Application" eventID="1011"][examplePriority@0 class="high"]

The MSG message part

The MSG part contains the text of the message itself. The encoding of the text must be UTF-8 if the BOM

The byte order mark (BOM) is a Unicode character used to signal the byte-order of the message text.

character is present in the message. If the message does not contain the BOM character, the encoding is treated as unknown. Usually messages arriving from legacy sources do not include the BOM character. CRLF characters will not be removed from the message.

8.3 - Enterprise-wide message model (EWMM)

The following section describes the structure of log messages using the Enterprise-wide message model or EWMM message format.

The Enterprise-wide message model or EWMM allows you to deliver structured messages from the initial receiving AxoSyslog component right up to the central log server, through any number of hops. It does not matter if you parse the messages on the client, on a relay, or on the central server, their structured results will be available where you store the messages. Optionally, you can also forward the original raw message as the first AxoSyslog component in your infrastructure has received it, which is important if you want to forward a message for example, to a SIEM system. To make use of the enterprise-wide message model, you have to use the syslog-ng() destination on the sender side, and the default-network-drivers() source on the receiver side.

The following is a sample log message in EWMM format.

   <13>1 2018-05-13T13:27:50.993+00:00 my-host @syslog-ng - - -
    {"MESSAGE":"<34>Oct 11 22:14:15 mymachine su: 'su root' failed for username on
    /dev/pts/8","HOST_FROM":"my-host","HOST":"my-host","FILE_NAME":"/tmp/in","._TAGS":".source.s_file"}

The message has the following parts:

  • The header of the complies with the RFC5424 message format, where the PROGRAM field is set to @syslog-ng, and the SDATA field is empty.

  • The MESSAGE part is in JSON format, and contains the actual message, as well as any name-value pairs that AxoSyslog has attached to or extracted from the message. The ${._TAGS} field contains the identifier of the AxoSyslog source that has originally received the message on the first AxoSyslog node.

To send a message in EWMM format, you can use the syslog-ng() destination driver, or the format-ewmm() template function.

To receive a message in EWMM format, you can use the default-destination-drivers() source driver, or the ewmm-parser() parser.

9 - Message representation in AxoSylog

When the AxoSyslog application receives a message, it automatically parses the message. The AxoSyslog application can automatically parse log messages that conform to the RFC3164 (BSD or legacy-syslog) or the RFC5424 (IETF-syslog) message formats. If AxoSyslog cannot parse a message, it results in an error.

A parsed syslog message has the following parts:

  • Timestamps

    Two timestamps are associated with every message: one is the timestamp contained within the message (that is, when the sender sent the message), the other is the time when AxoSyslog has actually received the message.

  • Severity

    The severity of the message.

  • Facility

    The facility that sent the message.

  • Tags

    Custom text labels added to the message that are mainly used for filtering. None of the current message transport protocols adds tags to the log messages. Tags can be added to the log message only within AxoSyslog. The AxoSyslog application automatically adds the id of the source as a tag to the incoming messages. Other tags can be added to the message by the pattern database, or using the tags() option of the source.

  • IP address of the sender

    The IP address of the host that sent the message. Note that the IP address of the sender is a hard macro and cannot be modified within AxoSyslog but the associated hostname can be modified, for example, using rewrite rules.

  • Hard macros

    Hard macros contain data that is directly derived from the log message, for example, the ${MONTH} macro derives its value from the timestamp. The most important consideration with hard macros is that they are read-only, meaning they cannot be modified using rewrite rules or other means.

  • Soft macros

    Soft macros (sometimes also called name-value pairs) are either built-in macros automatically generated from the log message (for example, ${HOST}), or custom user-created macros generated by using the AxoSyslog pattern database or a CSV-parser. The SDATA fields of RFC5424-formatted log messages become soft macros as well. In contrast with hard macros, soft macros are writable and can be modified within AxoSyslog, for example, using rewrite rules.

Message size and encoding

Internally, AxoSyslog represents every message as UTF-8. The maximal length of the log messages is limited by the log-msg-size() option: if a message is longer than this value, AxoSyslog truncates the message at the location it reaches the log-msg-size() value, and discards the rest of the message.

When encoding is set in a source (using the encoding() option) and the message is longer (in bytes) than log-msg-size() in UTF-8 representation, AxoSyslog splits the message at an undefined location (because the conversion between different encodings is not trivial).

10 - Structuring macros, metadata, and other value-pairs

Available in AxoSyslog 3.3 and later.

The AxoSyslog application allows you to select and construct name-value pairs from any information already available about the log message, or extracted from the message itself. You can directly use this structured information, for example, in the following places:

When using value-pairs, there are three ways to specify which information (that is, macros or other name-value pairs) to include in the selection.

  • Select groups of macros using the scope() parameter, and optionally remove certain macros from the group using the exclude() parameter.

  • List specific macros to include using the key() parameter.

  • Define new name-value pairs to include using the pair() parameter.

These parameters are detailed in value-pairs().

10.1 - Specifying data types in value-pairs

Prior to version 4.0, AxoSyslog handled every data as strings, and allowed you to convert the strings into other types of data that only certain destinations data formats supported. For example, SQL, MongoDB, JSON, or AMQP support data types like numbers or dates. The AxoSyslog application allows you to specify the data type in templates (this is also called type-hinting or type-casting). If the destination driver supports data types, AxoSyslog converts the incoming data to the specified data type. For example, this allows you to store integer numbers as numbers in MongoDB, instead of strings.

Starting with AxoSyslog 4.0, each name-value pair is a (name, type, value) triplet, and several components of AxoSyslog have typing support, for example, json-parser() and the $(format-json) template function. For details, see the list of supported data types.

Using explicit type-hinting

You can explicitly type-cast a AxoSyslog template to a specific type. To use type-hinting, enclose the macro or template containing the data with the type: <datatype>("<macro>"), for example: int("$PID"). See the Type-hinting examples and the list of supported data types for details.

Not every destination or other component supports data types. For details, see the list of AxoSyslog components that support data types.

Type-hinting examples

The following example stores the MESSAGE, PID, DATE, and PROGRAM fields of a log message in a MongoDB database. The DATE and PID parts are stored as numbers instead of strings.

   mongodb(
        value-pairs(pair("date", datetime("$UNIXTIME"))
            pair("pid", int64("$PID"))
            pair("program", "$PROGRAM"))
            pair("message", "$MESSAGE"))
        )
    );

The following example formats the same fields into JSON.

   $(format-json date=datetime($UNIXTIME) pid=int64($PID) program=$PROGRAM message=$MESSAGE)

The following example formats the MESSAGE field as a JSON list.

   $(format-json message=list($MESSAGE))"

Data types in AxoSyslog

The AxoSyslog application currently supports the following data-types.

  • boolean: Converts the data to a boolean value. Anything that begins with a t or 1 is converted to true, anything that begins with an f or 0 is converted to false.
  • datetime: Use it only with UNIX timestamps, anything else will likely result in an error. This means that currently you can use only the $UNIXTIME macro for this purpose.
  • double: A floating-point number.
  • json: A JSON snippet. (Available in AxoSyslog 4.0 and later.)
  • list: The data as a list. For details, see the list manipulation template functions in Template functions of AxoSyslog.
  • null: An unset value.
  • integer: A 32-bit or 64-bit integer, determined by the destination. For example, mongodb uses int32 if the number is less than MAXINT32 and int64 otherwise.
  • string: The data as a string.

Components that support data types

In AxoSyslog 4.0 and later, the following AxoSyslog components that support data types. Other components treat every data as strings.

  • Comparisons in filter expressions: the previously numeric operators are type-aware. The exact comparison depends on the types associated with the values you compare. For details, see Comparing macro values in filters.

  • json-parser() and the $(format-json) template function:

    When using the json-parser(), AxoSyslog converts all elements of the JSON object to name-value pairs. Any type information originally present in the incoming JSON object is retained, and automatically propagated to other AxoSyslog components (for example, a destination) if they support types.

    • Elements without a type are treated as strings.
    • JSON lists (arrays) are converted to AxoSyslog lists, so you can manipulate them using the $(list-*) template functions.
  • set(), groupset() rewrite rules:

    You can set the type of the field. Where you can use of templates in set() and groupset(), you can use type-casting, and the type information is properly promoted. For details, see Specifying data types in value-pairs.

  • db-parser(): The db-parser() rules can associate types with values using the "type" attribute, for example:

    <value name="foobar" type="integer">$PID</value>
    

    The integer is a type-cast that associates $foobar with an integer type. db-parser()’s internal parsers (for example, @NUMBER@) automatically associate type information to the parsed name-value pair.

  • add-contextual-data(): Name-value pairs that are populated using add-contextual-data() propagate type information, similarly to db-parser().

  • map-value-pairs(): map-value-pairs() propagates type information.

  • SQL type support: The sql() driver supports types, so that columns with specific types are stored as those types.

  • Template type support: You can cast templates explicitly to a specific type. Templates also propagate type information from macros, template functions, and values in the template string.

  • python() typing: All Python components (sources, destinations, parsers, and template functions) support all data types, except json().

  • On-disk serialized formats (that is, disk buffer): Version 4.0 and newer are compatible with messages serialized with an earlier version, and the format is compatible for downgrades as well. This means that even if a newer version of AxoSyslog serialized a message, older versions and associated tools are able to read it (but drop the type information of course).

10.2 - value-pairs()

Type:parameter list of the value-pairs() option
Default:empty string

Description: The value-pairs() option allows you to select specific information about a message easily using predefined macro groups. The selected information is represented as name-value pairs and can be used formatted to JSON format, or directly used in a mongodb() destination.

Example: Using the value-pairs() option

The following example selects every available information about the log message, except for the date-related macros (R_* and S_*), selects the .SDATA.meta.sequenceId macro, and defines a new value-pair called MSGHDR that contains the program name and PID of the application that sent the log message.

   value-pairs(
        scope(nv_pairs core syslog all_macros selected_macros everything)
        exclude("R_*")
        exclude("S_*")
        key(".SDATA.meta.sequenceId")
        pair("MSGHDR" "$PROGRAM[$PID]: ")
    )

The following example selects the same information as the previous example, but converts it into JSON format.

   $(format-json --scope nv_pairs,core,syslog,all_macros,selected_macros,everything \
        --exclude R_* --exclude S_* --key .SDATA.meta.sequenceId \
        --pair MSGHDR="$PROGRAM[$PID]: ")

value-pairs() parameters

The value-pairs() option has the following parameters. The parameters are evaluated in the following order:

  1. scope()
  2. exclude()
  3. key()
  4. pair()

exclude()

Type:Space-separated list of macros to remove from the selection created using the scope() option.
Default:empty string

Description: This option removes the specified macros from the selection. Use it to remove unneeded macros selected using the scope() parameter. For example, the following example removes the SDATA macros from the selection.

   value-pairs(
        scope(rfc5424 selected_macros)
        exclude(".SDATA*")
    )

The name of the macro to remove can include wildcards (*, ?). Regular expressions are not supported.

key()

Type:Space-separated list of macros to be included in the selection.
Default:empty string

Description: This option selects the specified macros. The selected macros will be included as MACRONAME = MACROVALUE, that is using key("HOST") will result in HOST = $HOST. You can use wildcards (*, ?) to select multiple macros. For example:

   value-pairs(
        scope(rfc3164)
        key("HOST")
    )
   value-pairs(
        scope(rfc3164)
        key("HOST", "PROGRAM")
    )

omit-empty-values()

Type:flag
Default:N/A

Description: If this option is specified, AxoSyslog does not include value-pairs with empty values in the output. For example: $(format-json --scope none --omit-empty-values) or

   value-pairs(
        scope(rfc3164 selected-macros)
        omit-empty-values()
    )

Available in AxoSyslog version 3.21 and later.

pair()

Type:name-value pairs in "<NAME>" "<VALUE>" format
Default:empty string

Description: This option defines a new name-value pair to be included in the message. The value part can include macros, templates, and template functions as well. For example:

   value-pairs(
        scope(rfc3164)
        pair("TIME" "$HOUR:$MIN")
        pair("MSGHDR" "$PROGRAM[$PID]: ")
    )

rekey()

Type:<pattern-to-select-names>, <list of transformations>
Default:empty string

Description: This option allows you to manipulate and modify the name of the value-pairs. You can define transformations, which are are applied to the selected name-value pairs. The first parameter of the rekey() option is a glob pattern that selects the name-value pairs to modify. If you omit the pattern, the transformations are applied to every key of the scope. For details on globs, see glob.

If you want to modify the names of several message fields, see also map-value-pairs: Rename value-pairs to normalize logs.

  • If rekey() is used within a key() option, the name-value pairs specified in the glob of the key() option are transformed.
  • If rekey() is used outside the key() option, every name-value pair of the scope() is transformed.

The following transformations are available:

  • add-prefix("<my-prefix>")

    Adds the specified prefix to every name. For example, rekey( add-prefix("my-prefix."))

  • replace-prefix("<prefix-to-replace>", "<new-prefix>")

    Replaces a substring at the beginning of the key with another string. Only prefixes can be replaced. For example, replace-prefix(".class", ".patterndb") changes the beginning tag .class to .patterndb

    This option was called replace() in AxoSyslog version 3.4.

  • lower

    Convert all keys to lowercase. Only supports US ASCII.

  • shift("<number>")

    Cuts the specified number of characters from the beginning of the name.

  • shift-levels("<number>")

    Similar to –shift, but instead of cutting characters, it cuts dot-delimited “levels” in the name (including the initial dot). For example, --shift-levels 2 deletes the prefix up to the second dot in the name of the key: .iptables.SRC becomes SRC

  • upper

    Convert all keys to uppercase. Only supports US ASCII.

Example: Using the rekey() option

The following sample selects every value-pair that begins with .cee., deletes this prefix by cutting 4 characters from the names, and adds a new prefix (events.).

   value-pairs(
        key(".cee.*"
            rekey(
                shift(4)
                add-prefix("events.")
            )
        )
    )

The rekey() option can be used with the format-json template-function as well, using the following syntax:

   $(format-json --rekey .cee.* --add-prefix events.)

scope()

Type:Space-separated list of macro groups to be included in the selection.
Default:empty string

Description: This option selects predefined groups of macros. The following groups are available:

  • nv-pairs: Every soft macro (name-value pair) associated with the message, except the ones that start with a dot (.) character. Macros starting with a dot character are generated within AxoSyslog and are not originally part of the message, therefore are not included in this group.

  • dot-nv-pairs: Every soft macro (name-value pair) associated with the message which starts with a dot (.) character. For example, .classifier.rule_id and .sdata.*. Macros starting with a dot character are generated within AxoSyslog and are not originally part of the message.

  • all-nv-pairs: Include every soft macro (name-value pair). Equivalent to using both nv-pairs and dot-nv-pairs.

  • rfc3164: The macros that correspond to the RFC3164 (legacy or BSD-syslog) message format: $FACILITY, $PRIORITY, $HOST, $PROGRAM, $PID, $MESSAGE, and $DATE.

  • rfc5424: The macros that correspond to the RFC5424 (IETF-syslog) message format: $FACILITY, $PRIORITY, $HOST, $PROGRAM, $PID, $MESSAGE, $MSGID, $R_DATE, and the metadata from the structured-data (SDATA) part of RFC5424-formatted messages, that is, every macro that starts with .SDATA..

    The rfc5424 group also has the following alias: syslog-proto. Note that the value of $R_DATE will be listed under the DATE key.

    The rfc5424 group does not contain any metadata about the message, only information that was present in the original message. To include the most commonly used metadata (for example, the $SOURCEIP macro), use the selected-macros group instead.

  • all-macros: Include every hard macro. This group is mainly useful for debugging, as it contains redundant information (for example, the date-related macros include the date-related information several times in various formats).

  • selected-macros: Include the macros of the rfc3164 groups, and the most commonly used metadata about the log message: the $TAGS, $SOURCEIP, and $SEQNUM macros.

  • sdata: The metadata from the structured-data (SDATA) part of RFC5424-formatted messages, that is, every macro that starts with .SDATA.

  • everything: Include every hard and soft macros. This group is mainly useful for debugging, as it contains redundant information (for example, the date-related macros include the date-related information several times in various formats).

  • none: Reset previously added scopes, for example, to delete automatically-added name-value pairs. The following example deletes every value-pair from the scope, and adds only the ones starting with iptables: $(format-welf --scope none .iptables.*)

For example:

   value-pairs(
        scope(rfc3164 selected-macros)
    )

11 - Things to consider when forwarding messages between AxoSyslog hosts

When you send your log messages from a AxoSyslog client through the network to a AxoSyslog server, you can use different protocols and options. Every combination has its advantages and disadvantages. The most important thing is to use matching protocols and options, so the server handles the incoming log messages properly.

In AxoSyslog you can change many aspects of the network communication. First of all, there is the structure of the messages itself. Currently, AxoSyslog supports two standard syslog protocols: the BSD (RFC3164) and the syslog (RFC5424) message format.

These RFCs describe the format and the structure of the log message, and add a (lightweight) framing around the messages. You can set this framing/structure by selecting the appropriate driver in AxoSyslog. There are two drivers you can use: the network() driver and the syslog() driver. The syslog() driver is for the syslog (RFC5424) protocol and the network() driver is for the BSD (RFC3164) protocol.

The tcp() and udp() drivers are now deprecated, they are essentially equivalent with the network(transport(tcp)) and network(transport(udp)) drivers.

In addition to selecting the driver to use, both drivers allow you to use different transport-layer protocols: TCP and UDP, and optionally also higher-level transport protocols: TLS (over TCP. To complicate things a bit more, you can configure the network() driver (corresponding to the BSD (RFC3164) protocol) to send the messages in the syslog (RFC5424) format (but without the framing used in RFC5424) using the flag(syslog-protocol) option.

Because some combination of drivers and options are invalid, you can use the following drivers and options as sources and as destinations:

  1. syslog(transport(tcp))

  2. syslog(transport(udp))

  3. syslog(transport(rltp))

  4. syslog(transport(tls))

  5. syslog(transport(rltp(tls-required(yes)))

  6. network(transport(tcp))

  7. network(transport(udp))

  8. network(transport(rltp))

  9. network(transport(tls))

  10. network(transport(rltp(tls-required(yes)))

  11. network(transport(tcp) flag(syslog-protocol))

  12. network(transport(udp) flag(syslog-protocol))

  13. network(transport(rltp)flag(syslog-protocol))

  14. network(transport(tls) flag(syslog-protocol))

  15. network(transport(rltp(tls-required(yes)) flag(syslog-protocol))

If you use the same driver and options in the destination of your AxoSyslog client and the source of your AxoSyslog server, everything should work as expected. Unfortunately there are some other combinations, that seem to work, but result in losing parts of the messages. The following table show the combinations:

Source \ Destinationsyslog/tcpsyslog/udpsyslog/tlsnetwork/tcpnetwork/udpnetwork/tlsnetwork/tcp/flagnetwork/udp/flagnetwork/tls/flag
syslog/tcp--!--!--
syslog/udp---!--!-
syslog/tls----!--!
network/tcp-----✔?--
network/udp-✔?----✔?-
network/tls-------✔?
network/tcp/flag!--!----
network/udp/flag-!--!---
network/tls/flag--!--!--

Source-destination driver combinations

  • - This method does not work. The logs will not get to the server.

  • This method works.

  • ! This method has some visible drawbacks. The logs go through, but some of the values are missing/misplaced/and so on.

  • ✔? This method seems to work, but it is not recommended because this can change in a future release.