Observability and Data Filtering Case Study – Large Healthcare Company
One of the biggest US healthcare companies
- Data volume: 10 TB/day
- Multiple data centers
- Thousands of log sources
Problem
Storage costs for log data were getting out of hand, and identifying the total data contribution of individual sources and departments was difficult. When the team tried to reduce the amount of data, finding out which data sources were sending which kind of data to Splunk proved onerous and error-prone.
Goals
- Visibility into the data pipeline
- Understanding of the data flows: which router is sending how much data, and what this data consists of
- Identify and drop high-volume, low-value data sources and log messages
- The solution must integrate with the existing syslog-ng based logging pipeline, without replacing the routers or their configuration
Deployment
The customer deployed Axolet agents on their syslog-ng nodes to get insight into their data flows. Axolet pulls real-time analytics and metrics from the syslog-ng servers without disrupting the data flow or accessing the data itself, and sends these metrics to Axoflow Console (SaaS). On the Console, operators get an overview of the pipeline topology and the data flow, and can analyze “what data goes where” by using the instrumented labels. Pipeline health, bottlenecks, and issues are also apparent from the dashboards.
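To give a feel for the kind of per-source counters that can be read from a syslog-ng node without touching the log payloads themselves, here is a minimal sketch. It is not Axolet's actual mechanism; it only assumes the real `syslog-ng-ctl stats` command and its semicolon-separated output, whose exact column set can vary between syslog-ng versions.

```python
#!/usr/bin/env python3
"""Illustration only: aggregate per-source "processed" counters from
`syslog-ng-ctl stats` output. This is not how Axolet works internally;
it just shows the kind of metrics a syslog-ng node exposes without
accessing the log data itself."""

import csv
import subprocess
from collections import defaultdict


def per_source_processed():
    # `syslog-ng-ctl stats` prints semicolon-separated counters with a header
    # line; the exact column names below are assumptions that may differ
    # between syslog-ng versions.
    lines = subprocess.run(
        ["syslog-ng-ctl", "stats"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()

    totals = defaultdict(int)
    for row in csv.DictReader(lines, delimiter=";"):
        # Sum only the "processed" counters of source-side objects.
        if row.get("Type") == "processed" and (row.get("SourceName") or "").startswith("src"):
            totals[row.get("SourceInstance") or "unknown"] += int(row.get("Number") or 0)
    return totals


if __name__ == "__main__":
    # Print the top talkers first.
    for instance, count in sorted(per_source_processed().items(), key=lambda kv: -kv[1]):
        print(f"{count:>12}  {instance}")
```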
Tech stack
- syslog-ng
- Splunk
Axoflow products used
- Axolet
- Axoflow Console
Benefits
Installing Axolet on a syslog-ng instance, adding it to the Axoflow Console, and instrumenting the syslog-ng configuration file for metrics takes only a few minutes. After that, the customer immediately has access to metrics about the data flows, including data volume per source host and details about the data each Splunk index is receiving.
- Visualize and analyze the data transported by the individual syslog-ng nodes.
- See which sources send data to specific Splunk indexes, and the ratio of their data volume.
- Tap into logs that are sent to a fallback Splunk index and quickly identify, filter, or route them.
- Identify the total data contribution to the SIEM by business unit and source (see the sketch after this list).
- See the performance, load, and possible bottlenecks of the syslog-ng nodes.
- Identify and mitigate data drops.
- Troubleshoot a specific log source belonging to any individual business unit.
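As a rough illustration of the "who sends what where" analysis above, the sketch below aggregates labeled volume metrics by business unit and Splunk index. The label names (business_unit, splunk_index, host), the sample values, and the data shape are hypothetical stand-ins for the instrumented labels the Console exposes, not an Axoflow API.

```python
"""Illustration only: computing data contribution by business unit and
Splunk index from labeled volume metrics. All names and numbers here are
hypothetical."""

from collections import defaultdict

# Hypothetical samples: bytes forwarded, broken down by instrumented labels.
samples = [
    {"business_unit": "radiology", "splunk_index": "os_linux", "host": "rad-app01", "bytes": 4_200_000_000},
    {"business_unit": "radiology", "splunk_index": "netfw",    "host": "rad-fw01",  "bytes": 1_100_000_000},
    {"business_unit": "billing",   "splunk_index": "os_linux", "host": "bil-db02",  "bytes": 2_600_000_000},
    {"business_unit": "billing",   "splunk_index": "main",     "host": "bil-app03", "bytes":   300_000_000},
]

by_unit = defaultdict(int)   # total contribution per business unit
by_index = defaultdict(int)  # total volume per Splunk index
for s in samples:
    by_unit[s["business_unit"]] += s["bytes"]
    by_index[s["splunk_index"]] += s["bytes"]

grand_total = sum(by_index.values())
for unit, volume in sorted(by_unit.items(), key=lambda kv: -kv[1]):
    print(f"{unit:<12} {volume / grand_total:6.1%} of total volume")

# Logs landing in the catch-all "main" index are candidates for re-routing or filtering.
print("fallback index volume:", by_index["main"], "bytes")
```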
This deployment not only solved the customer's immediate problems, but also provides a future-proof solution with many possibilities for further improvement. For example, upgrading the current syslog-ng router nodes to AxoRouter would enable centralized management, automatic classification, normalization, and data reduction. Better source and data labeling would also make data attribution easier.
Results
- Effective pipeline monitoring, with easy-to-spot outliers and top talkers.
- Significantly decreased time to identify and filter unneeded log messages.
- Decreased data volume by 25% (roughly 2.5 TB/day at the stated 10 TB/day volume).
- Decreased operational costs by 30%.
- Adding Axoflow to the existing observability pipeline required only minimal instrumentation.
- After the initial tests, the production deployment was completed in days.