Even though we released syslog-ng 4.1 only two months ago, we are excited to announce the syslog-ng 4.2 release, with new drivers, several incremental changes in the metrics and Python support, and lots of smaller improvements that make it easier for you to collect your logs and manage your observability supply chain, both in cloud-native and on-premises environments.
- Packages are available for various platforms.
- Our AxoSyslog project provides cloud-ready container images and Helm charts.
For the in-depth details of every change, read the release notes on the GitHub Releases page. Let’s see the highlights of the new release.
Sending messages to Splunk HEC
Splunk is one of the most popular solutions to collect logs and observability data. The new syslog-ng release includes the splunk-hec-event() and splunk-hec-raw() destinations that you can use to feed your observability data to Splunk via the HEC events API and the HEC raw API.
Their minimal configuration snippets look like:
destination d_splunk_hec_event {
splunk-hec-event(
url("https://localhost:8088")
token("70b6ae71-76b3-4c38-9597-0c5b37ad9630")
);
};
destination d_splunk_hec_raw {
splunk-hec-raw(
url("https://localhost:8088")
token("70b6ae71-76b3-4c38-9597-0c5b37ad9630")
channel("05ed4617-f186-4ccd-b4e7-08847094c8fd")
);
};
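To make the difference between the two APIs more concrete, here is a minimal Python sketch of the kind of HTTP request each HEC endpoint expects. It is based on Splunk's public HEC documentation, not on the syslog-ng implementation: the helper names are hypothetical, and the endpoint paths (/services/collector/event and /services/collector/raw), the "Authorization: Splunk <token>" header, and the X-Splunk-Request-Channel header are assumptions taken from that documentation. Nothing is sent over the network.

```python
import json

# Values reused from the configuration snippets above.
HEC_URL = "https://localhost:8088"
TOKEN = "70b6ae71-76b3-4c38-9597-0c5b37ad9630"
CHANNEL = "05ed4617-f186-4ccd-b4e7-08847094c8fd"


def build_event_request(message, host):
    """Describe a request for the HEC events API: the event is wrapped in JSON."""
    return {
        "url": HEC_URL + "/services/collector/event",
        "headers": {"Authorization": "Splunk " + TOKEN},
        "body": json.dumps({"event": message, "host": host}),
    }


def build_raw_request(message):
    """Describe a request for the HEC raw API: the body is the unwrapped message."""
    return {
        "url": HEC_URL + "/services/collector/raw",
        "headers": {
            "Authorization": "Splunk " + TOKEN,
            # Assumption: the raw endpoint identifies the sender by a channel ID.
            "X-Splunk-Request-Channel": CHANNEL,
        },
        "body": message,
    }
```

The sketch is only meant to show why the raw destination takes an extra channel() option while the event destination does not.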
For more details, see pull request #4462.
Smart multi-line for recognizing backtraces
Backtraces are important, as they can be signs of severe errors. The multi-line-mode(smart) option recognizes the inherently multi-line backtrace format in the logs, and converts the corresponding lines to a single message for easier analysis. The syslog-ng 4.2 release supports backtraces for: Python, Java, JavaScript, PHP, Go, Ruby, and Dart.
The regular expressions to recognize these programming languages are stored in an external file called /usr/share/syslog-ng/smart-multi-line.fsm (the installation path depends on the configure arguments), in a format that is described in that file.
To correlate multi-line messages, we added a new parser called group-lines(), which can merge multi-line messages received as separate but subsequent lines into a single log message. The received messages are first collected into streams of related messages (based on the key()), then they are assigned to correlation contexts for up to timeout() seconds. Multi-line messages are then identified within these message contexts during that time period.
group-lines(key("$FILE_NAME")
multi-line-mode("smart")
template("$MESSAGE")
timeout(10)
line-separator("\n")
);
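The idea of grouping by key and then merging backtrace lines can be illustrated with a toy Python model. This is only a sketch under simplifying assumptions: it recognizes Python tracebacks with two hard-coded rules instead of the patterns in smart-multi-line.fsm, it has no timeout handling, and the function name is hypothetical.

```python
from collections import defaultdict


def group_python_tracebacks(records):
    """Merge Python traceback lines that share a key into single messages.

    `records` is an iterable of (key, line) pairs, for example
    (file name, log line). Returns a list of (key, message) pairs.
    """
    pending = defaultdict(list)  # key -> lines of a traceback still being collected
    output = []

    for key, line in records:
        lines = pending[key]
        if lines:
            lines.append(line)
            # Indented lines are stack frames; the first non-indented line
            # is the exception itself and closes the traceback.
            if not line.startswith((" ", "\t")):
                output.append((key, "\n".join(lines)))
                pending[key] = []
        elif line.startswith("Traceback (most recent call last):"):
            lines.append(line)
        else:
            output.append((key, line))

    # Flush unfinished tracebacks; a real implementation would do this on timeout.
    for key, lines in pending.items():
        if lines:
            output.append((key, "\n".join(lines)))
    return output
```

Because the pending lines are tracked per key, a traceback in one file is still merged correctly when lines from another file arrive in between.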
For details, see the documentation of AxoSyslog, the cloud-native syslog-ng distribution.
ebpf() plugin and reuseport packet randomizer
A new eBPF plugin was added as a framework to leverage the kernel’s eBPF infrastructure to improve the performance and scalability of syslog-ng. The solution improves performance when a single sender (or very few senders) generates most of the inbound UDP traffic that syslog-ng needs to process. Normally, the kernel distributes load between so-reuseport sockets by keeping each flow (that is, packets with the same source/destination IP address and port) in its dedicated receiver. This fails to balance the sockets properly if only a few senders are responsible for most of the load. The ebpf(reuseport()) option replaces the original kernel algorithm with an alternative, so individual packets are assigned to one of the sockets randomly, thereby producing a more uniform load. For example:
source s_udp {
udp(so-reuseport(yes) port(2000) persist-name("udp1")
ebpf(reuseport(sockets(4)))
);
udp(so-reuseport(yes) port(2000) persist-name("udp2"));
udp(so-reuseport(yes) port(2000) persist-name("udp3"));
udp(so-reuseport(yes) port(2000) persist-name("udp4"));
};
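The effect of the two distribution strategies can be simulated in a few lines of Python. This is a simplified model, not the kernel or eBPF code: Python's built-in hash() stands in for the kernel's flow hash, and the function names are hypothetical.

```python
import random
from collections import Counter

SOCKETS = 4  # matches sockets(4) in the configuration above


def by_flow_hash(packets):
    """Flow-based distribution: every packet of a flow lands on the same socket."""
    return Counter(hash(flow) % SOCKETS for flow in packets)


def by_random(packets, rng):
    """Randomized distribution: each packet picks a socket independently."""
    return Counter(rng.randrange(SOCKETS) for _ in packets)


# One dominant sender: 10000 packets that all belong to the same flow.
packets = [("10.0.0.1", 5000, "10.0.0.9", 2000)] * 10000
flow_counts = by_flow_hash(packets)
random_counts = by_random(packets, random.Random(42))
```

In this model flow_counts contains a single socket that receives every packet, while random_counts spreads the same packets across all four sockets.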
NOTE: The ebpf() plugin is considered advanced usage, so its compilation is disabled by default. Use it only after trying all other avenues of configuration optimization. You will need a special toolchain and a recent kernel version to compile and run eBPF programs, or as an alternative, use the AxoSyslog container images, which have this option enabled.
To learn more about optimizing and scaling syslog-ng for syslog over UDP traffic, see our related blog posts:
- Why syslog over UDP loses messages and how to avoid that
- syslog over UDP: how to avoid losing messages
- Scaling syslog to 1M EPS with eBPF
Metrics-related improvements
As we are working on making the observability supply chain more transparent, manageable, and easy to monitor, we have added lots of different metrics that help monitor syslog-ng and the status of the pipeline, including:
- Destination-related metrics, for example:
syslogng_socket_connections{id="tcp_src#0",driver_instance="afsocket_sd.(stream,AF_INET(0.0.0.0:5555))",direction="input"} 3
syslogng_socket_max_connections{id="tcp_src#0",driver_instance="afsocket_sd.(stream,AF_INET(0.0.0.0:5555))",direction="input"} 10
syslogng_socket_rejected_connections_total{id="tcp_src#0",driver_instance="afsocket_sd.(stream,AF_INET(0.0.0.0:5555))",direction="input"} 96
syslogng_socket_receive_buffer_used_bytes{id="#anon-source0#3",direction="input",driver_instance="afsocket_sd.udp4"} 0
syslogng_socket_receive_buffer_max_bytes{id="#anon-source0#3",direction="input",driver_instance="afsocket_sd.udp4"} 268435456
syslogng_socket_receive_dropped_packets_total{id="#anon-source0#3",direction="input",driver_instance="afsocket_sd.udp4"} 619173
syslogng_socket_connections{id="#anon-source0#0",direction="input",driver_instance="afsocket_sd.(stream,AF_INET(0.0.0.0:2000))"} 1
- Configuration-related metrics (#4420), for example:
syslogng_last_config_reload_timestamp_seconds 1681309903
syslogng_last_successful_config_reload_timestamp_seconds 1681309758
syslogng_last_config_file_modification_timestamp_seconds 1681309877
- Queue-related metrics (#4392), where:
  - The corresponding driver is identified with the “id” and “driver_instance” labels.
  - Available counters are “memory_usage_bytes” and “events”.
  - Memory queue metrics are available with the “syslogng_memory_queue_” prefix, disk-buffer metrics are available with the “syslogng_disk_queue_” prefix.
  - disk-buffer metrics have an additional “path” label, pointing to the location of the disk-buffer file, and a “reliable” label, which can be either “true” or “false”.
  - Threaded destinations, like http, python, etc. have an additional “worker” label.
  - Metrics for monitoring the available space in disk-buffer dir()s.
These metrics look like:
syslogng_disk_queue_events{driver_instance="http,http://localhost:1239",id="d_http_disk_buffer#0",path="/var/syslog-ng/syslog-ng-00000.rqf",reliable="true",worker="0"} 80
syslogng_disk_queue_memory_usage_bytes{driver_instance="http,http://localhost:1239",id="d_http_disk_buffer#0",path="/var/syslog-ng/syslog-ng-00003.rqf",reliable="true",worker="3"} 2776
syslogng_memory_queue_events{driver_instance="tcp,localhost:1234",id="d_network#0"} 29
syslogng_memory_queue_memory_usage_bytes{driver_instance="http,http://localhost:1236",id="d_http#0",worker="1"} 5552
syslogng_memory_queue_memory_usage_bytes{driver_instance="tcp,localhost:1234",id="d_network#0"} 11448
- Byte-based metrics for incoming/outgoing events in the network(), syslog(), file(), http(), kubernetes() drivers. These metrics show the serialized message sizes (protocol-specific header/framing/etc. length is not included). For example:
syslogng_input_event_bytes_total{id="s_network#0",driver_instance="tcp,127.0.0.1"} 1925529600
syslogng_output_event_bytes_total{id="d_network#0",driver_instance="tcp,127.0.0.1:5555"} 565215232
syslogng_output_event_bytes_total{id="d_http#0",driver_instance="http,http://127.0.0.1:8080/"} 1024
- disk-buffer related metrics for capacity, disk_allocated and disk_usage (#4356), and abandoned disk-buffer files (#4402). These metrics are available from stats(level(1)). By default, the metrics are generated every 5 minutes, but this can be changed in the global options. Setting freq(0) disables this feature. For example:
options { disk-buffer( stats( freq(10) ) ); };
Example metrics:
syslogng_disk_queue_capacity_bytes{abandoned="true",path="/var/syslog-ng/syslog-ng-00000.rqf",reliable="true"} 104853504
syslogng_disk_queue_disk_allocated_bytes{abandoned="true",path="/var/syslog-ng/syslog-ng-00000.rqf",reliable="true"} 273408
syslogng_disk_queue_disk_usage_bytes{abandoned="true",path="/var/syslog-ng/syslog-ng-00000.rqf",reliable="true"} 269312
syslogng_disk_queue_events{abandoned="true",path="/var/syslog-ng/syslog-ng-00000.rqf",reliable="true"} 860
The meaning of the different disk-buffer metrics is:
- “capacity_bytes”: The theoretical maximal useful size of the disk-buffer. This is always smaller than disk-buf-size(), as there is some reserved space for metadata. The actual full disk-buffer file can be larger than this, as syslog-ng allows writing over this limit once, at the end of the file.
- “disk_allocated_bytes”: The current size of the disk-buffer file on the disk. Note that the disk-buffer file size does not strictly correlate with the number of messages, as it is a ring buffer implementation, and also syslog-ng optimizes the truncation of the file for performance reasons.
- “disk_usage_bytes”: The serialized size of the queued messages in the disk-buffer file. This counter is useful for calculating the disk usage percentage (disk_usage_bytes / capacity_bytes) or the remaining available space (capacity_bytes – disk_usage_bytes).
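The two calculations mentioned for “disk_usage_bytes” can be sketched in Python. This is a minimal example that assumes the metrics are available as text lines in the format shown above; the parser is deliberately simple (it does not handle escaped quotes in label values), and the function names are hypothetical.

```python
import re

# name{label="value",...} value  (the label block is optional)
METRIC_RE = re.compile(r'^(?P<name>\w+)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metric(line):
    """Split one metric line into (name, labels, value)."""
    match = METRIC_RE.match(line.strip())
    if match is None:
        raise ValueError("not a metric line: %r" % line)
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


def disk_buffer_usage(lines):
    """Return {path: (usage_ratio, remaining_bytes)} for each disk-buffer file."""
    capacity, usage = {}, {}
    for line in lines:
        name, labels, value = parse_metric(line)
        if name == "syslogng_disk_queue_capacity_bytes":
            capacity[labels["path"]] = value
        elif name == "syslogng_disk_queue_disk_usage_bytes":
            usage[labels["path"]] = value
    return {
        path: (usage[path] / capacity[path], capacity[path] - usage[path])
        for path in capacity
        if path in usage and capacity[path] > 0
    }
```

For the example metrics above, the usage ratio is 269312 / 104853504 (roughly 0.26%), which leaves 104584192 bytes of remaining capacity.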
Metrics-probe parser improvements
You can now set the stats level of the generated metrics using the level() option (#4453). Also, you can set a template using the increment() option, which resolves to a number that modifies the increment of the counter. If not set, the increment is 1 (#4447).
Changes in Python support
You can now use typed custom options in the python source, python-fetcher source, python destination, python parser, and python-http-header inner destination.
Note that this is a breaking change. Previously, values were converted to strings if possible; now they are passed to the Python class with their real type. Make sure to update your Python code accordingly! (#4354)
Example configuration snippet:
python(
class("TestClass")
options(
"string_option" => "example_string"
"bool_option" => True # supported values are: True, False, yes, no
"integer_option" => 123456789
"double_option" => 123.456789
"string_list_option" => ["string1", "string2", "string3"]
"template_option" => LogTemplate("${example_template}")
)
);
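On the Python side, a class receiving these options could look like the following sketch. It assumes that the options arrive in init() as a dictionary holding native Python values (bool, int, float, list), as described above. The syslogng module is only importable inside syslog-ng, so a stand-in base class is defined to keep the example runnable elsewhere; the attribute names are hypothetical.

```python
try:
    from syslogng import LogDestination  # only available inside syslog-ng
except ImportError:
    class LogDestination:  # stand-in so the sketch runs outside syslog-ng
        pass


class TestClass(LogDestination):
    def init(self, options):
        # Assumption: since 4.2 the values arrive with their real types,
        # so no string parsing is needed here.
        self.name = options["string_option"]
        self.enabled = options["bool_option"]
        self.limit = options["integer_option"]
        self.ratio = options["double_option"]
        self.tags = options["string_list_option"]

        # Code written for the old behavior may compare against strings,
        # for example options["bool_option"] == "yes". With a real bool
        # that comparison is always False, so check the types explicitly.
        if not isinstance(self.enabled, bool):
            return False
        return isinstance(self.limit, int) and isinstance(self.ratio, float)

    def send(self, msg):
        # A real destination would deliver the message here.
        return True
```

Returning False from init() on an unexpected type is one way to surface stale, string-based configuration handling early instead of failing later in send().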
There are also new LogMessage methods (#4410) for querying as string (with default values):
- get(key[, default]): Return the value for key if key exists, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
- get_as_str(key, default=None, encoding='utf-8', errors='strict', repr='internal'): Return the string value for key if key exists, else default. If default is not given, it defaults to None, so that this method never raises a KeyError. The string value is decoded using the codec registered for encoding. errors may be given to set the desired error handling scheme. Note that currently repr='internal' is the only available representation. We may implement another more Pythonic representation in the future, so please specify the repr argument explicitly if you want to avoid future representation changes in your code.
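A short sketch of how these methods might be used follows. The real LogMessage class only exists inside syslog-ng, so FakeLogMessage below is a hypothetical stand-in that mimics just the two method signatures described above (assuming values are stored as bytes); summarize() is the part that would carry over to real code.

```python
class FakeLogMessage:
    """Hypothetical stand-in mimicking only get() and get_as_str()."""

    def __init__(self, values):
        self._values = values  # name -> bytes

    def get(self, key, default=None):
        return self._values.get(key, default)

    def get_as_str(self, key, default=None, encoding="utf-8",
                   errors="strict", repr="internal"):
        if key not in self._values:
            return default
        return self._values[key].decode(encoding, errors)


def summarize(msg):
    """Build a one-line summary without risking a KeyError."""
    # repr is passed explicitly, as recommended above.
    host = msg.get_as_str("HOST", "unknown-host", repr="internal")
    text = msg.get_as_str("MESSAGE", "", errors="replace", repr="internal")
    return f"{host}: {text}"
```

Passing errors="replace" keeps the summary usable even when the message body is not valid UTF-8.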
The python() and python-fetcher() sources now support a mapping for the flags() option (#4455). The state of the flags() option is mapped to the self.flags variable, which is a Dict[str, bool], for example:
{
'parse': True,
'check-hostname': False,
'syslog-protocol': True,
'assume-utf8': False,
'validate-utf8': False,
'sanitize-utf8': False,
'multi-line': True,
'store-legacy-msghdr': True,
'store-raw-message': False,
'expect-hostname': True,
'guess-timezone': False,
'header': True,
'rfc3164-fallback': True,
}
HYPR Audit Trail source
The hypr-audit-trail() and hypr-app-audit-trail() source drivers allow you to monitor the audit trails for HYPR applications. For details, see the README.md file in the driver’s directory.
source s_hypr {
hypr-audit-trail(
url('https://<custom domain>.hypr.com')
bearer-token('<base64 encoded bearer token>')
page-size(<number of results to return in a single page>)
initial-hours(<number of hours to search backward on initial fetch>)
application-skip-list('HYPRDefaultApplication', 'HYPRDefaultWorkstationApplication')
log-level('INFO')
flags(<optional flags passed to the source>)
ignore-persistence(<yes/no>)
);
};
Other minor features
- network source: During a TLS handshake, syslog-ng now automatically sets the certificate_authorities field of the certificate request based on the ca-file() and ca-dir() options. The pkcs12-file() option already had this feature. (#4412)
- mongodb destination: Added support for list, JSON and null types. (#4437)
- add-contextual-data(): significantly reduce memory usage for large CSV files. (#4444)
- kubernetes() source: Added support for json-file logging driver format. (#4419)
- The new $RAWMSG_SIZE hard macro can be used to query the original size of the incoming message in bytes. This information may not be available for all source drivers. (#4440)
- syslog-ng configuration identifier (#4420): A new syslog-ng configuration keyword has been added, which allows specifying a config identifier. For example:
  @config-id: cfg-20230404-13-g02b0850fc
  This keyword can be used for config identification in managed environments, where syslog-ng instances and their configuration are deployed/generated automatically. syslog-ng-ctl config --id can be used to query the active configuration ID and the SHA256 hash of the full “preprocessed” syslog-ng configuration. For example:
  $ syslog-ng-ctl config --id
  cfg-20230404-13-g02b0850fc (08ddecfa52a3443b29d5d5aa3e5114e48dd465e195598062da9f5fc5a45d8a83)
- syslog-ng: add --config-id command line option: similarly to --syntax-only, this command line option parses the configuration and then prints its ID before exiting. It can be used to query the ID of the current configuration persisted on disk. (#4435)
- Health metrics and syslog-ng-ctl healthcheck: A new syslog-ng-ctl command has been introduced, which can be used to query a healthcheck status from syslog-ng. Currently, only 2 basic health values are reported. syslog-ng-ctl healthcheck --timeout <seconds> can be specified to use it as a boolean healthy/unhealthy check. Health checks are also published as periodically updated metrics. The frequency of these checks can be configured with the stats(healthcheck-freq()) option. The default is 5 minutes. (#4362)
- $(format-json) and template functions which support value-pairs expressions: new key transformations upper() and lower() have been added to translate the caps of keys while formatting the output template. For example:
  template("$(format-json test.* --upper)\n")
  Would convert all keys to uppercase. Only supports US ASCII. (#4452)
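In a managed environment, the output of syslog-ng-ctl config --id could be checked by a deployment script. The following Python sketch assumes the output has the "ID (hash)" shape shown in the example above; the function names are hypothetical, and running the command itself is left out.

```python
import re

# Assumption: the output looks like "<config-id> (<sha256 hex digest>)".
CONFIG_ID_RE = re.compile(r"^(?P<id>\S+) \((?P<sha256>[0-9a-f]+)\)$")


def parse_config_id(output):
    """Split the command output into (config_id, sha256_hex)."""
    match = CONFIG_ID_RE.match(output.strip())
    if match is None:
        raise ValueError("unexpected output: %r" % output)
    return match.group("id"), match.group("sha256")


def is_expected_config(output, expected_id):
    """Return True if the running configuration has the expected ID."""
    config_id, _ = parse_config_id(output)
    return config_id == expected_id
```

Comparing the hash as well as the ID would additionally detect a configuration that was edited without updating its @config-id.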
Summary
As you can see, this release is another important step in making syslog-ng a real cloud-native observability tool – and we haven’t even listed everything. For the complete list of smaller changes and bugfixes, see the release notes. Stay tuned for more exciting features in the upcoming releases!
Thank you to everyone contributing with bug reports, feature requests, or pull requests. Feedback and any kind of contribution are always appreciated. Visit the AxoSyslog-ng GitHub page or join Axoflow’s Discord server to reach out to us, or subscribe to the Axoflow newsletter to receive updates about syslog-ng and our observability and logging-related products.