Syslog is still the number one data source for most organizations collecting observability and log data. What’s more, syslog over UDP is still widely used to transport log messages, even though the protocol is infamously prone to losing them. From our previous post about syslog over UDP, you learned:

  • why and where message loss usually occurs when sending syslog over UDP, and
  • how you can find out which scenario affects you.

From this post, you’ll learn how to tune the reception parameters of the kernel and syslog-ng to decrease the risk of losing messages.

Change kernel and syslog-ng parameters for syslog over UDP

Let’s start with tuning syslog-ng’s udp() receiver.

This is what a udp() source looks like in the configuration:

source s_network {
    udp();
};

You can tune both syslog-ng and various kernel parameters through this source. For instance, you can add parameters like these:

source s_network {
    udp(so-rcvbuf(32MiB) log-iw-size(250k) log-fetch-limit(10k));
};

The settings above probably resolve most cases where message drops are a problem. But let’s investigate what each of them means and how we arrived at these figures:

  • so-rcvbuf(32MiB) sets the kernel receive buffer to 32MiB
  • log-iw-size(250k) sets the flow control window to 250,000 log messages
  • log-fetch-limit(10k) sets the number of messages syslog-ng fetches in a single main loop iteration to 10,000.

UDP receive buffer under normal operation

How to change the receive buffer

The default receive buffer sizes are too low for receiving UDP traffic at 10-20k messages/sec. Even worse, the maximum value the kernel permits a user-space process to set is also small. The default sizes for UDP socket buffers are:

# cat /proc/sys/net/core/rmem_default
212992
# cat /proc/sys/net/core/rmem_max
212992

To increase the receive buffer size, first you need to raise the maximum that the kernel allows any user-space process to set. Use this command as root to increase the limit to 32MiB:

echo 33554432 > /proc/sys/net/core/rmem_max

To persist this setting, you have to:

  1. Add the following line to /etc/sysctl.conf:
    net.core.rmem_max=33554432
  2. Run sysctl -p /etc/sysctl.conf as root (or reboot the host).
  3. Then set the so-rcvbuf() parameter of your udp source, like this:
source s_udp {
     udp(so-rcvbuf(32MiB));
};
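
To check that the kernel actually honored a buffer request, you can ask for a large SO_RCVBUF on a throwaway socket and read back what was granted. This is not part of syslog-ng; it is just a minimal Python sketch illustrating the capping behavior described above:

```python
import socket

# Request a 32 MiB receive buffer on a throwaway UDP socket and read
# back what the kernel actually granted. On Linux, the request is
# silently capped at net.core.rmem_max (and the kernel stores roughly
# double the requested value to account for bookkeeping overhead).
requested = 32 * 1024 * 1024
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
sock.close()

if granted < requested:
    print(f"capped: asked for {requested} bytes, got {granted} - raise rmem_max")
else:
    print(f"ok: receive buffer is {granted} bytes")
```

If you see the “capped” message, rmem_max is still limiting you and the so-rcvbuf() setting cannot take full effect.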

How to size the receive buffer

Your next question might be: what is the correct size for the receive buffer, and are the default sizes enough?

A typical log message received via UDP is around 300-500 bytes. The kernel adds significant overhead, roughly doubling this figure, so you can store roughly 1 message per kilobyte of buffer space. On my laptop, the default receive buffer is 212992 bytes, which can hold roughly 200 messages. How long this buffer lasts while user space is busy doing something else depends on the incoming message rate.

If the incoming rate is 1000 EPS, then you need to process each message within 1ms to keep up in the long term, and you have about 0.2 seconds of buffer space in total, considering that ~200 messages fit in the receive buffer. But if the incoming rate grows to 20000 EPS, that shrinks to 0.05ms per message and roughly 10ms of buffer in total.
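
To put numbers on this, here is a quick back-of-the-envelope calculation. The ~1 kB per buffered message figure is the assumption from above; substitute your own measurements:

```python
# Estimate how long a receive buffer lasts if the reader stalls,
# assuming ~1 kB of buffer space per message (payload plus kernel
# overhead, as discussed above).
def buffer_seconds(rcvbuf_bytes, eps, bytes_per_msg=1024):
    messages = rcvbuf_bytes // bytes_per_msg
    return messages / eps

default_buf = 212992  # typical net.core.rmem_default

print(buffer_seconds(default_buf, 1000))   # ~0.2 seconds at 1,000 EPS
print(buffer_seconds(default_buf, 20000))  # ~0.01 seconds (10ms) at 20,000 EPS
```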

The size of the receive buffer you need depends on how fast the syslog application is able to come back for the next message and how fast it processes messages in general. Syslog-ng is pretty performant when it comes to pure message processing, doing 200-300k messages per second per CPU core, as long as you don’t do heavy filtering with regexps and the like. If your configuration is more complex, this can drop to a tenth of that, around 20-30k/sec per CPU core.

But it’s not just the raw message processing performance that matters. Syslog-ng (or any other syslog application) is not continuously watching the socket buffer for new messages; rather, it checks every once in a while, as its main loop and worker threads are executing other tasks too. Also, the kernel is free to suspend an application while other processes are scheduled, and the hypervisor can do the same if the host runs in a virtualized environment, such as VMware or AWS.

This means that application performance, configuration complexity, kernel and hypervisor latency all affect how fast the application can respond to a packet entering its socket buffer.

At a 20k events/s incoming UDP rate you have a time budget of 0.05 ms to process a single message, and with a 200kB socket buffer, you have roughly 10ms worth of buffer. Let’s assume that syslog-ng can otherwise handle this rate (as it does when working with a TCP stream), so the key metric here is application/kernel/hypervisor latency.

You can probably see by now that the defaults for the socket buffer are clearly insufficient: kernel scheduling latencies for a server with an I/O-bound load (such as a syslog server) can easily reach 2-3 ms, eating a large share of that buffer. The solution is to increase your socket buffer to a size that can accommodate the received messages for as long as syslog-ng is busy doing something else. I recommend using a couple of seconds’ worth of socket buffer, with the assumption that syslog-ng has enough buffering of its own to store these messages.

For example, one second’s worth of buffer at 20k EPS: 20k/s * 1kB * 1s = 20MB.
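
The same calculation, wrapped in a small helper so you can plug in your own numbers. Both the event rate and the per-message overhead are assumptions to replace with your own measurements:

```python
# Size a receive buffer to absorb `headroom_seconds` worth of traffic,
# assuming ~1 kB of buffer space per message.
def recommended_rcvbuf(eps, headroom_seconds=1.0, bytes_per_msg=1024):
    return int(eps * bytes_per_msg * headroom_seconds)

print(recommended_rcvbuf(20000))                        # 20480000, ~20 MB
print(recommended_rcvbuf(20000, headroom_seconds=2.0))  # 40960000, ~40 MB
```

A couple of seconds of headroom at 20k EPS lands in the same ballpark as the 32MiB value used in the configuration above.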

Can the receive buffer be too large?

Interesting question, and the answer is yes.

  • On one hand, the receive buffer is stored in non-pageable kernel memory, which is a scarce resource. If your log server is a dedicated box, you may be able to afford this much RAM, but keep the cost in mind.
  • On the other hand, due to how the kernel performs bookkeeping on the receive buffer, the larger the buffer is, the slower the kernel gets out of the congested state, causing a lot more messages to be dropped than anticipated. This topic could be a post of its own, but for now, just believe me that you don’t want to oversize the socket buffer. Stick to a value that keeps you out of trouble, but not more than that.

UDP receive buffer still dropping messages

If messages are not stored in the receive buffer, where are they?

In a paragraph above, I recommended using a couple of seconds’ worth of socket buffer, with the assumption that syslog-ng has enough buffering of its own to store these messages. What does this assumption mean in practice?

syslog-ng generally attempts to balance its inputs and outputs. If the rate of an output is slow, there’s a feedback mechanism to propagate the slowness back to the source. With UDP, however, the only solution to avoid losses is to consume any packets from the receive buffer as fast as possible. But if the output is slow, syslog-ng needs to store the messages somewhere, right?

In syslog-ng, each destination has a dedicated output queue where syslog-ng stores the messages. In a sense, syslog-ng pulls them out of the kernel’s receive buffer and copies them into its own buffer space. This means that the output queue needs to be large enough to keep up with input rate even if it’s temporarily faster than the output.

This was the configuration I recommended a few sections earlier:

source s_network {
    udp(so-rcvbuf(32MiB) log-iw-size(250k) log-fetch-limit(10k));
};

The trick is in log-iw-size(250k), which allows syslog-ng to fetch up to 250,000 messages from this source and only starts the backpressure mechanism once this “window” is depleted.

With overhead, a log message takes roughly 1kB, so 250,000 messages would consume around 250MB. Once this fills up, reading of the UDP socket slows down, almost certainly causing messages to be dropped. You can increase 250k to 1M or more, but be conscious of the associated increase in memory consumption.
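
A quick sanity check on that memory figure, again assuming roughly 1 kB per queued message:

```python
# Approximate RAM consumed when the flow-control window is full,
# assuming ~1 kB per queued message (payload plus overhead).
def window_memory_mb(log_iw_size, bytes_per_msg=1024):
    return log_iw_size * bytes_per_msg / (1024 * 1024)

print(window_memory_mb(250_000))    # ~244 MB for log-iw-size(250k)
print(window_memory_mb(1_000_000))  # ~977 MB for log-iw-size(1M)
```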

You can also store these messages in a disk-based buffer, if you are certain that your disks and syslog-ng’s disk-based queueing mechanism are fast enough to handle your input rate.

UDP receive buffer returning to normal operation

How to check the current usage of your receive buffer

The memory used by packets in the receive buffer fluctuates as messages are received from the network and are consumed by syslog-ng. So the value you can query at any given time is just an approximation.

# netstat -anu
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
udp   162560      0 0.0.0.0:514             0.0.0.0:*

The second column in the output above (Recv-Q) shows how many bytes the buffered packets currently use.
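
If you want to track this over time, the same counter is exposed in /proc/net/udp on Linux, where the rx_queue field (in hexadecimal) corresponds to netstat’s Recv-Q column. A small parsing sketch, assuming the standard /proc/net/udp layout:

```python
import os

def recv_queue_bytes(proc_net_udp_text):
    """Extract (local_port, rx_queue_bytes) pairs from /proc/net/udp."""
    result = []
    for line in proc_net_udp_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 5:
            continue
        local_port = int(fields[1].split(":")[1], 16)  # local_address is hex ip:port
        rx_queue = int(fields[4].split(":")[1], 16)    # field 4 is tx_queue:rx_queue, in hex
        result.append((local_port, rx_queue))
    return result

if os.path.exists("/proc/net/udp"):  # Linux only
    with open("/proc/net/udp") as f:
        for port, backlog in recv_queue_bytes(f.read()):
            print(f"port {port}: {backlog} bytes queued")
```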

Summary

There’s more to scaling UDP-based syslog than fits into a single article. Our next article will explain the causes of message drops that happen when the CPU becomes the bottleneck.
