
Syslog Scaling and Performance Considerations
Syslog was designed to be lightweight and flexible, but modern environments generate many orders of magnitude more log data than early syslog implementations ever anticipated. High event rates, bursty traffic, and complex routing rules can quickly turn a simple logging setup into a performance bottleneck.
Scaling syslog is not just about adding more CPU or bandwidth. It requires understanding how ingestion, buffering, processing, and forwarding work and interact under load — and how architectural choices affect reliability and latency.
In this post, we’ll examine the key performance considerations when running syslog at scale, and the patterns used to create high-throughput, enterprise-grade logging pipelines.
Understanding Throughput and Event Rates
Syslog performance is often measured in events per second (EPS). In small environments, EPS may be in the hundreds or thousands. In larger infrastructures, it can reach hundreds of thousands or millions of messages per second.
Factors That Influence Throughput
- Number, type, and verbosity of log sources
- Message size and structure
- Transport protocol (UDP vs TCP/TLS vs OpenTelemetry)
- Parsing and filtering complexity
- Downstream destination performance
A common mistake is designing pipelines for average load rather than peak bursts, which often occur during incidents — precisely when logs matter most.
Transport Protocol Trade-Offs at Scale
Transport choice has a direct impact on performance and reliability.
UDP: Fast but Fragile
- Minimal overhead
- High throughput
- No flow control or retransmission
UDP can silently drop packets, especially on congested networks or overloaded receivers. For mitigation strategies, see Syslog over UDP: How to Avoid Losing Messages.
Compliance regulations may also forbid using UDP to collect sensitive data. In some cases, however, extremely high message volumes leave no practical alternative to UDP. For such cases, read how you can Scale syslog to 1M EPS with eBPF.
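If you must receive UDP, enlarging the socket receive buffer helps the receiver absorb bursts instead of dropping packets. A minimal sketch in syslog-ng/AxoSyslog configuration syntax (the source name and buffer size are illustrative, and the OS limit `net.core.rmem_max` must permit the requested size):

```
source s_udp {
  network(
    transport("udp")
    port(514)
    # Enlarge the kernel receive buffer (here ~16 MiB) so short
    # bursts queue in the kernel instead of being silently dropped.
    so-rcvbuf(16777216)
  );
};
```

This only buys headroom for bursts; sustained overload still requires scaling out receivers or moving to a flow-controlled transport.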
TCP and TLS: Reliable but Heavier
- Built-in flow control
- More reliable delivery and ordering
- Higher CPU and memory usage
At scale, TCP/TLS requires careful tuning to avoid connection exhaustion or head-of-line blocking. For ways to diagnose message loss in TCP-based syslog transport, see Detect TCP and UDP packet drops in syslog and telemetry pipelines.
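A TLS-enabled receiver sketch in syslog-ng/AxoSyslog syntax, with the two knobs most relevant at scale: the connection limit and the initial window (file paths, limits, and window size are illustrative):

```
source s_tls {
  network(
    transport("tls")
    port(6514)
    max-connections(5000)   # cap concurrent clients to avoid connection exhaustion
    log-iw-size(100000)     # total in-flight messages shared across all connections
    tls(
      key-file("/etc/syslog-ng/key.pem")
      cert-file("/etc/syslog-ng/cert.pem")
      ca-file("/etc/syslog-ng/ca.pem")
      peer-verify(required-trusted)
    )
  );
};
```

When raising `max-connections`, raise `log-iw-size` proportionally, otherwise each connection gets a tiny window and flow control throttles senders prematurely.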
OpenTelemetry for Syslog Data
While most network appliances and legacy applications can send their logs using only syslog, deploying an edge collector or relay that receives data using the syslog protocol and forwards it downstream using the OpenTelemetry protocol can significantly improve the reliability and performance of your data pipeline.
OTLP/gRPC supports flow control, retries, batching, and acknowledgments. Also, its binary encoding and built-in compression drastically reduce the payload size compared to syslog. Deploying edge collectors (for example, AxoRouter) that receive local syslog data and transport it using OTLP is a fast way to improve the reliability of your pipeline. Read a more detailed comparison of OTLP/gRPC vs. Traditional Syslog.
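Such an edge collector can be sketched as below, assuming AxoSyslog's `opentelemetry()` destination driver (check the current AxoSyslog documentation for the exact option names; the hostname is illustrative):

```
# Receive legacy syslog locally...
source s_syslog {
  network(transport("tcp") port(514));
};

# ...and forward it downstream over OTLP/gRPC.
destination d_otlp {
  opentelemetry(
    url("otel-collector.example.com:4317")
  );
};

log { source(s_syslog); destination(d_otlp); };
```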
Buffering and Queueing Strategies
Buffering is the foundation of reliable syslog at scale. Without it, any downstream slowdown can result in data loss.
Common Queue Types
- In-memory queues: Fast, but volatile - messages are lost if the host fails
- Disk-based buffers: Durable, but significantly slower
- Persistent queues: Balance performance and reliability
Disk-backed queues are especially important when forwarding logs to external systems like SIEMs or cloud services, which may throttle or become temporarily unavailable. Modern syslog implementations allow queues to be sized, and can persist or flush in-memory queues during graceful reloads and shutdowns. For an in-depth explanation of how disk buffering works in syslog-ng and AxoSyslog, see Disk buffering for a resilient syslog architecture.
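A disk-buffered destination sketch in syslog-ng/AxoSyslog syntax (hostname, port, and sizes are illustrative; see the disk-buffer documentation for your version's option names):

```
destination d_siem {
  network("siem.example.com" port(601) transport("tcp")
    disk-buffer(
      reliable(yes)              # survive crashes at the cost of some throughput
      disk-buf-size(2147483648)  # ~2 GiB on-disk queue for SIEM outages
      mem-buf-size(67108864)     # in-memory portion used before spilling to disk
    )
  );
};
```

With `reliable(yes)`, messages are written to disk before being acknowledged upstream; `reliable(no)` is faster but can lose the in-memory portion on a crash.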
Backpressure and Flow Control
Backpressure ensures that syslog pipelines degrade predictably under load, rather than failing catastrophically. For example, if a downstream component of the pipeline (such as a SIEM or central syslog server) cannot keep up with the data received from a syslog relay, the relay throttles its sources to give the server a chance to catch up.
Why Backpressure Matters
- Prevents memory and buffer exhaustion
- Protects upstream systems
- Can prevent message loss
Without proper flow control, overloaded syslog servers may drop messages indiscriminately or crash entirely.
Enterprise-grade implementations like AxoSyslog (a drop-in replacement for syslog-ng) combine flow control and backpressure management with effective buffering to avoid message loss.
It is important to ensure that the source devices can handle the backpressure, and buffer locally to account for the (temporary) server overload.
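In syslog-ng and AxoSyslog, backpressure is enabled per log path with the `flow-control` flag. A minimal sketch (the source and destination names are assumed to be defined elsewhere in the configuration):

```
log {
  source(s_net);
  destination(d_siem);
  # With flow-control, a slow or blocked destination throttles
  # reading from the source instead of dropping messages.
  flags(flow-control);
};
```

Combined with a disk buffer on the destination, this lets the relay absorb outages without losing data, as long as the sources tolerate being throttled.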
Parsing, Filtering, and Processing Costs
Every additional processing step impacts throughput.
Common Performance Pitfalls
- Complex regular expressions used to filter or parse messages
- Deeply nested parsing rules
- Over-filtering at ingestion time
A common scaling strategy is to:
- Perform minimal processing early
- Perform expensive parsing in the pipeline on aggregator/router nodes, so you can filter, reduce, and route the data before it reaches the SIEM
- Separate hot paths (critical logs) from cold paths (debug or verbose logs)
However, note that deferring processing like parsing and classification to downstream systems (like the SIEM) can seriously increase SIEM costs, as you don’t effectively filter data before ingestion. For more details, see the blogs Security Data Pipeline Management and How high-quality data saves you $$$$.
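The split between cheap edge processing and heavier aggregator processing can be sketched in syslog-ng/AxoSyslog syntax (source and destination names are illustrative):

```
# Edge collector: only a cheap severity filter, no regex or parsing.
filter f_no_debug { not level(debug); };
log { source(s_local); filter(f_no_debug); destination(d_relay); };

# Aggregator/router: structured parsing before routing to the SIEM,
# so data can be filtered and reduced prior to ingestion.
parser p_kv { kv-parser(prefix(".kv.")); };
log { source(s_from_edges); parser(p_kv); destination(d_siem); };
```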
Horizontal Scaling and Architectural Patterns
When vertical scaling reaches its limits, syslog pipelines are typically scaled horizontally. Let’s cover the most common approaches.
Multiple collectors behind a load balancer
Using load balancers is a common way to scale performance; however, the syslog protocol is poorly suited to load balancing. The reason is that the protocol itself (unlike HTTP/OTLP) has no application-level acknowledgment, and even when using TCP for transport, the messages are sent with no expectation of a response ("send and forget").
The best way to accomplish scale at the edge is to run syslog collectors as close (networking-wise) to the data sources themselves as possible, and scale horizontally by increasing the number of collectors. Once the syslog collector onboards the data, it can be forwarded using a much more reliable (and routable) protocol such as HTTP or OTLP, at which point load balancing is appropriate.
Regional or edge collectors (relays)
Deploying edge collectors or relays results in a hierarchical architecture that increases the reliability and performance of the pipeline. Instead of sending data directly to the SIEM/central syslog server, the local hosts and devices communicate with a local collector, which forwards the data. This architecture has several benefits:
- You can do filtering and routing at the relays
- The relays can buffer the messages in case the SIEM becomes unavailable (for example, because of a network error)
When using a proper relay, the communication between the relay and the central server is not restricted to the syslog protocol, but can use reliable, advanced protocols like OTLP. Read a more detailed comparison of OTLP/gRPC vs. Traditional Syslog. Plus, these protocols are better suited for using load balancers, as discussed in the previous section.
Sharded pipelines
Sharding pipelines by log type or source protects critical logs during overload, and reduces SIEM ingestion costs if you can implement proper filtering and routing in the router. For example, you can direct debug logs that have no security impact to temporary storage (like AxoStore), application data to the observability/analytics tool of the operations team, and all security or compliance-related logs to your SIEM.
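Such routing can be expressed with nested log paths in syslog-ng/AxoSyslog syntax. A sketch, assuming the source and the three destinations are defined elsewhere; the filters are simplified examples:

```
filter f_debug    { level(debug); };
filter f_security { facility(auth, authpriv); };

log {
  source(s_net);
  # Debug data with no security impact goes to low-cost storage.
  log { filter(f_debug);    destination(d_cheap_store); flags(final); };
  # Security and compliance logs go to the SIEM.
  log { filter(f_security); destination(d_siem);        flags(final); };
  # Everything else goes to the operations team's analytics tool.
  log { destination(d_analytics); };
};
```

The `final` flag stops matching messages from falling through to the later branches, so each message lands in exactly one shard.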
Measuring and Monitoring Syslog Performance
Scaling syslog requires visibility into the logging pipeline itself.
Key Metrics to Track
- Ingest rate (EPS)
- Queue size and processing delay (latency)
- Dropped or throttled messages
- CPU, memory, and disk I/O usage
Without monitoring, performance issues often go unnoticed until logs are missing — usually during an incident or audit. For details on syslog monitoring, see our Observability and the Telemetry Pipeline whitepaper.
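In syslog-ng, internal statistics can be emitted periodically and collected like any other log stream. A sketch using the classic global options (newer versions may prefer the `stats()` options block; the file path and intervals are illustrative):

```
options {
  stats-freq(60);    # emit statistics messages every 60 seconds
  stats-level(1);    # per-source/destination counters
};

source s_internal { internal(); };
destination d_stats { file("/var/log/syslog-ng-stats.log"); };
log { source(s_internal); destination(d_stats); };
```

Counters such as dropped and queued message counts can also be queried on demand with `syslog-ng-ctl stats`.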
Performance vs Reliability: Finding the Balance
High-performance syslog deployments always involve trade-offs.
- Maximum throughput often means less processing
- Strong reliability requires buffering and flow control
- Compliance-driven environments favor durability over latency
Successful designs explicitly choose where to sit on this spectrum, rather than relying on defaults.
Key Takeaways
- Syslog pipelines must be intentionally designed to scale under modern workloads.
- Transport choice affects both performance and data loss risk.
- Buffering and backpressure are essential for reliability.
- Parsing complexity directly impacts throughput.
- Horizontal scaling is the preferred model for large environments.
For a complete list of topics covered in this series, see the Comprehensive Guide to Syslog.