By default, AxoSyslog processes log messages arriving from a single connection sequentially. Sequential processing:
- preserves message ordering, and
- uses the CPU efficiently on a per-message basis.
Sequential processing performs well when there are many parallel connections, because together they keep all available CPU cores busy. However, if a small number of connections deliver a large number of messages, sequential processing becomes a bottleneck.
Starting with AxoSyslog version 4.3, AxoSyslog can distribute a stream of incoming messages across a set of workers, so that multiple threads process the stream in parallel. Depending on how you partition the stream, you might lose message ordering, but you can scale the incoming load across all CPUs in the system, even if the entire load comes from a single, chatty sender.
To enable this mode of execution, use the parallelize() element in your log path.
The following example takes the messages of the tcp() source and processes them with 4 parallel threads, regardless of the number of connections used to deliver the messages to the tcp() source.
    log {
        source {
            tcp(
                port(2000)
                log-iw-size(10M) max-connections(10) log-fetch-limit(100000)
            );
        };
        parallelize(workers(4));
        # from this part on, messages are processed in parallel even if
        # messages are originally coming from a single connection
        parser { ... };
        destination { ... };
    };
By default, parallelize() uses round-robin to allocate messages to workers (called partitions in versions 4.3 through 4.16), but you can retain ordering for a subset of messages with the worker-partition-key() option. The worker-partition-key() option specifies a template: messages for which the template expands to the same value are mapped to the same partition. For example, you can partition messages based on their sender host:
    log {
        source {
            tcp(
                port(2000)
                log-iw-size(10M) max-connections(10) log-fetch-limit(100000)
            );
        };
        parallelize(workers(4) worker-partition-key("$HOST"));
        # from this part on, messages are processed in parallel if their
        # $HOST value differs. Messages with the same $HOST will be mapped
        # to the same partition and are processed sequentially.
        parser { ... };
        destination { ... };
    };
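Because worker-partition-key() takes a template, the key can presumably combine several macros. The following is a sketch rather than a tested configuration: it assumes you want ordering preserved per host and program pair, it uses the standard $HOST and $PROGRAM macros, and the "/" separator is an arbitrary choice.

    log {
        source {
            tcp(
                port(2000)
                log-iw-size(10M) max-connections(10) log-fetch-limit(100000)
            );
        };
        # assumption: the key is an ordinary template, so macros can be combined
        parallelize(workers(4) worker-partition-key("${HOST}/${PROGRAM}"));
        # messages that share both $HOST and $PROGRAM expand to the same key,
        # so they are handled by the same worker and keep their relative order
        parser { ... };
        destination { ... };
    };

A finer-grained key like this can spread the traffic of a single busy host over several workers, at the cost of keeping order only within each host and program pair.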
Starting with AxoSyslog version 4.17, you can use the batch-size() option to specify how many consecutive messages a single parallelize() worker processes. This ensures that this many messages preserve their order on the destination side, and also improves parallelize() performance. A value around 100 is recommended for batch-size(). Default value: 0 (batching is disabled).
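As an illustration, batching could be added to a tcp() log path as follows. This is a sketch that assumes batch-size() is set inside parallelize() next to workers(), as the description above suggests. It requires AxoSyslog 4.17 or later, and 100 is simply the recommended starting value.

    log {
        source {
            tcp(
                port(2000)
                log-iw-size(10M) max-connections(10) log-fetch-limit(100000)
            );
        };
        # assumption: batch-size() is an option of parallelize(), like workers()
        parallelize(workers(4) batch-size(100));
        # each worker is handed runs of 100 consecutive messages, so ordering
        # is kept within a run but not necessarily between runs
        parser { ... };
        destination { ... };
    };

If ordering matters only within short bursts of messages, batching may be enough on its own. If all messages from one sender must stay in order, worker-partition-key() is the option described for that purpose.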