# Python parser

The Python log parser (available in AxoSyslog version 3.10 and later) allows you to write your own parser in Python. This lets you process the log message (or parts of it) any way you need. For example, you can import external Python modules to process the messages, or query databases to enrich the messages with additional data.

The following points apply to using Python blocks in AxoSyslog in general:

  * Python parsers and template functions are available in AxoSyslog version 3.10 and later. Python destinations and sources are available in AxoSyslog version 3.18 and later.

  * Supported Python versions: 2.7 and 3.4+ (if you are using pre-built binaries, check the dependencies of the package to find out which Python version it was compiled with).

  * The Python block must be a top-level block in the AxoSyslog configuration file.

  * If you store the Python code in a separate Python file and only include it in the AxoSyslog configuration file, make sure that the PYTHONPATH environment variable includes the path to the Python file, and that the variable is exported. For example, if you start AxoSyslog manually from a terminal and you store your Python files in the `/opt/syslog-ng/etc` directory, use the following command: `export PYTHONPATH=/opt/syslog-ng/etc`.

In production, when AxoSyslog starts on boot, you must configure your startup script to include the Python path. The exact method depends on your operating system. For recent Red Hat Enterprise Linux, Fedora, and CentOS distributions that use systemd, the `systemctl` command sources the `/etc/sysconfig/syslog-ng` file before starting AxoSyslog. (On openSUSE and SLES, it sources the `/etc/sysconfig/syslog` file instead.) Append the following line to the end of this file: `PYTHONPATH="<path-to-your-python-file>"`, for example, `PYTHONPATH="/opt/syslog-ng/etc"`.

  * The Python object is initialized every time AxoSyslog is started or reloaded.

Warning: If you reload AxoSyslog, existing Python objects are destroyed, so the context and state information of Python blocks is lost. Log rotation and updating the configuration of AxoSyslog typically involve a reload.

  * The Python block can contain multiple Python functions.

  * Using Python code in AxoSyslog can significantly decrease the performance of AxoSyslog, especially if the Python code is slow. In general, the features of AxoSyslog are implemented in C, and are faster than implementations of the same or similar features in Python.

  * Validate and lint the Python code before using it. The AxoSyslog application does not validate or lint the code for you.

  * Python error messages are available in the `internal()` source of AxoSyslog.

  * You can access the name-value pairs of AxoSyslog directly through a message object or a dictionary.

  * To help debug and troubleshoot your Python code, you can send log messages to the `internal()` source of AxoSyslog. For details, see [Logging from your Python code](../../docs/axosyslog-core/chapter-configuration-file/python-code-logging/index.md).
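
Part of the validation mentioned in the list above can be automated before the code ever reaches AxoSyslog. The following is a minimal sketch, not an AxoSyslog feature: it uses the standard-library `ast` module to check that a piece of Python source is syntactically valid and defines a class with a `parse()` method. The `check_parser_source` helper and the two sample sources are hypothetical names used only for illustration.

```python
import ast


def check_parser_source(source, class_name):
    """Return a list of problems found in the source; an empty list means OK.

    Hypothetical helper: it only checks the syntax and the presence of a
    parse() method. It does not execute or lint the code.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        return ["syntax error on line %s: %s" % (err.lineno, err.msg)]

    for node in tree.body:
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            methods = {item.name for item in node.body
                       if isinstance(item, ast.FunctionDef)}
            if "parse" not in methods:
                return ["class %s has no parse() method" % class_name]
            return []
    return ["class %s not found" % class_name]


GOOD_SOURCE = """
class MyParser(object):
    def parse(self, msg):
        return True
"""

# The colon after the method signature is missing on purpose.
BAD_SOURCE = """
class MyParser(object):
    def parse(self, msg)
        return True
"""

print(check_parser_source(GOOD_SOURCE, "MyParser"))
print(check_parser_source(BAD_SOURCE, "MyParser"))
```

A check like this only catches structural mistakes. A real linter and a few unit tests are still advisable for anything beyond a trivial parser.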




## Declaration

Python parsers consist of two parts. The first is an AxoSyslog parser object that you use in your AxoSyslog configuration, for example, in the log path. This parser references a Python class, which is the second part of the Python parser. The Python class processes the log messages it receives, and can do virtually anything that you can code in Python.
```
parser <name_of_the_python_parser> {
    python(
        class("<name_of_the_python_class_executed_by_the_parser>")
    );
};

python {
class MyParser(object):
    def init(self, options):
        '''Optional. This method is executed when syslog-ng is started or reloaded.'''
        return True

    def deinit(self):
        '''Optional. This method is executed when syslog-ng is stopped or reloaded.'''
        pass

    def parse(self, msg):
        '''Required. This method receives and processes the log message.'''
        return True
};
```
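
Because the class is ordinary Python, you can exercise it outside AxoSyslog while developing. The following is a rough sketch of the lifecycle described above (`init()`, then `parse()` for each message, then `deinit()`). The `run_parser` helper is a hypothetical name, and plain dicts stand in for the AxoSyslog message object, which is an assumption made only so that the example runs on its own.

```python
class MyParser(object):
    def init(self, options):
        # State set here lives until the object is destroyed.
        self.seen = 0
        return True

    def deinit(self):
        pass

    def parse(self, msg):
        self.seen += 1
        msg["seen"] = str(self.seen)
        return True


def run_parser(parser, options, messages):
    """Call init(), parse() for every message, then deinit().

    Assumption: plain dicts stand in for the AxoSyslog message object.
    Returns the messages for which parse() returned True.
    """
    if not parser.init(options):
        raise RuntimeError("init() did not return True")
    try:
        return [msg for msg in messages if parser.parse(msg)]
    finally:
        parser.deinit()


kept = run_parser(MyParser(), {}, [{"MESSAGE": "first"}, {"MESSAGE": "second"}])
print(kept)
```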

## Methods of the python() parser

## The init(self, options) method (optional)

The AxoSyslog application initializes Python objects only when it is started or reloaded. That means it keeps the state of internal variables while AxoSyslog is running. The `init` method is executed as part of the initialization. You can perform any initialization steps that are necessary for your parser to work. For example, if you want to perform a lookup from a file or a database, you can open the file or connect to the database here, or you can initialize a counter that you will increase in the `parse()` method.

The return value of the `init()` method must be `True`. If it returns `False`, or raises an exception, AxoSyslog will not start.

`options`: This optional argument contains the contents of the `options()` parameter of the parser object as a Python dict.
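```
parser my_python_parser {
    python(
        class("MyParser")
        options("regex", "seq: (?P<seq>\\d+), thread: (?P<thread>\\d+), runid: (?P<runid>\\d+), stamp: (?P<stamp>[^ ]+) (?P<padding>.*$)")
    );
};

python {
import re

class MyParser(object):
    def init(self, options):
        # options() arrives as a dict, here {"regex": "<pattern>"}
        pattern = options["regex"]
        self.regex = re.compile(pattern)
        self.counter = 0
        return True

    def parse(self, log_message):
        # Required method; the actual parsing is omitted in this fragment.
        return True
};
```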
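
Since a `False` return value or an exception in `init()` prevents AxoSyslog from starting, it can help to make the failure reason explicit. The following is a minimal sketch, assuming that `options` behaves like a regular Python dict as described above. The `print()` calls are only stand-ins for whatever logging you use, and the class is run here as plain Python outside AxoSyslog.

```python
import re


class MyParser(object):
    def init(self, options):
        # Return False with a clear reason instead of letting an exception escape.
        pattern = options.get("regex")
        if pattern is None:
            print("init failed: the 'regex' option is missing")
            return False
        try:
            self.regex = re.compile(pattern)
        except re.error as err:
            print("init failed: invalid regex: %s" % err)
            return False
        self.counter = 0
        return True


print(MyParser().init({"regex": r"seq: (?P<seq>\d+)"}))  # valid pattern
print(MyParser().init({}))                                # option missing
print(MyParser().init({"regex": "(unclosed"}))            # pattern does not compile
```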

## The parse(self, log_message) method

The `parse()` method processes the log messages it receives, and can do virtually anything that you can code in Python. This method is required: without it, AxoSyslog will not start.

The return value of the `parse()` method must be `True`. If it returns `False`, or raises an exception, AxoSyslog will drop the message.

  * To reference a name-value pair or a macro in the Python code, index the message object with the name. For example, if the first argument in the definition of the function is called `log_message`, the value of the HOST macro is `log_message['HOST']`, and so on. (The `log_message` argument contains the entire log message, not just the text body, in a structure similar to a Python dict, but it is actually an object.)

  * You can define new name-value pairs in the Python function. For example, if the first argument in the definition of the function is called `log_message`, you can create a new name-value pair like this: `log_message["new-macro-name"]="value"`. This is useful when you parse a part of the message from Python, or look up a value based on data extracted from the log message.

Note that the names of the name-value pairs are case-sensitive. If you create a new name-value pair called `new-macro-name` in Python, and want to reference it in another part of the AxoSyslog configuration file (for example, in a template), use the `${new-macro-name}` macro.

  * You cannot override hard macros (see [Hard versus soft macros](../../docs/axosyslog-core/chapter-manipulating-messages/customizing-message-format/macros-hard-vs-soft/index.md)).

  * To list all available keys (names of name-value pairs), use the `log_message.keys()` method.
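
The points above can be sketched as a small `parse()` method that reads one name-value pair and creates new ones. This is an illustration only: a plain dict stands in for the message object, the `KeyValueParser` class, the `as_text` helper, and the `kv.` prefix are hypothetical, and the helper accepts both `bytes` and `str` values because the type you receive may depend on your Python and AxoSyslog versions (an assumption, not something stated above).

```python
def as_text(value):
    """Return the value as str, decoding bytes as UTF-8 (assumed encoding)."""
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    return value


class KeyValueParser(object):
    def parse(self, log_message):
        text = as_text(log_message["MESSAGE"])
        # Turn "user=alice action=login" style tokens into name-value pairs.
        for token in text.split():
            key, sep, value = token.partition("=")
            if sep and key:
                log_message["kv." + key] = value
        return True


# A plain dict stands in for the AxoSyslog message object.
message = {"HOST": "localhost", "MESSAGE": b"user=alice action=login"}
KeyValueParser().parse(message)
print(sorted(message.keys()))
```

In a template, the new pairs from this sketch would be referenced with the same case-sensitive names, for example `${kv.user}`.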




## The deinit(self) method (optional)

This method is executed when AxoSyslog is stopped or reloaded.

Warning: If you reload AxoSyslog, existing Python objects are destroyed, so the context and state information of Python blocks is lost. Log rotation and updating the configuration of AxoSyslog typically involve a reload.
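
If some state must survive a reload, one possible approach is to save it in `deinit()` and restore it in `init()`. The following is a sketch under stated assumptions, not a built-in AxoSyslog mechanism: the `state_file` option name and the `CountingParser` class are hypothetical, plain dicts stand in for the message object, and the reload is simulated by creating a second object.

```python
import os
import tempfile


class CountingParser(object):
    def init(self, options):
        self.state_file = options["state_file"]  # hypothetical option name
        try:
            with open(self.state_file) as handle:
                self.counter = int(handle.read().strip() or 0)
        except (IOError, ValueError):
            # No saved state yet, or the file is unreadable: start from zero.
            self.counter = 0
        return True

    def parse(self, log_message):
        self.counter += 1
        log_message["MY_COUNTER"] = str(self.counter)
        return True

    def deinit(self):
        # Persist the counter so that the next init() can pick it up.
        with open(self.state_file, "w") as handle:
            handle.write(str(self.counter))


state_path = os.path.join(tempfile.mkdtemp(), "counter.state")
options = {"state_file": state_path}

first = CountingParser()
first.init(options)
first.parse({"MESSAGE": "a"})
first.parse({"MESSAGE": "b"})
first.deinit()  # simulated reload: the first object goes away

second = CountingParser()
second.init(options)
message = {"MESSAGE": "c"}
second.parse(message)
print(message["MY_COUNTER"])  # the counter continues from the saved value
```

Writing the state only in `deinit()` keeps the per-message cost low, but the state is lost if the process is killed before `deinit()` runs.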

## Example: Parse loggen logs

The following sample code parses the messages of the `loggen` tool (for details, see [The loggen manual page](../../docs/axosyslog-core/app-man-syslog-ng/loggen.1/index.md)). The following is a sample loggen message:
```
 
       <38>2017-04-05T12:16:46 localhost prg00000[1234]: seq: 0000000000, thread: 0000, runid: 1491387406, stamp: 2017-04-05T12:16:46 PADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADD
    
```

The AxoSyslog parser object references the LoggenParser class and passes a regular expression to it in the `regex` option. The `init()` method of the LoggenParser class compiles this expression into a pattern. The `parse()` method uses this pattern to extract the fields of the message into name-value pairs, and also adds a counter. The destination template of the AxoSyslog log statement uses the extracted fields to format the output message.
```
@version: 4.24
@include "scl.conf"

parser my_python_parser {
    python(
        class("LoggenParser")
        options("regex", "seq: (?P<seq>\\d+), thread: (?P<thread>\\d+), runid: (?P<runid>\\d+), stamp: (?P<stamp>[^ ]+) (?P<padding>.*$)")
    );
};

log {
    source { tcp(port(5555)); };
    parser(my_python_parser);
    destination {
        file("/tmp/regexparser.log.txt" template("seq: $seq thread: $thread runid: $runid stamp: $stamp my_counter: $MY_COUNTER"));
    };
};

python {
import re

class LoggenParser(object):
    def init(self, options):
        pattern = options["regex"]
        self.regex = re.compile(pattern)
        self.counter = 0
        return True

    def deinit(self):
        pass

    def parse(self, log_message):
        match = self.regex.match(log_message['MESSAGE'])
        if match:
            for key, value in match.groupdict().items():
                log_message[key] = value
            log_message['MY_COUNTER'] = self.counter
            self.counter += 1
            return True
        return False
};
```
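
To see what the regular expression extracts, you can try it in plain Python before putting it into the configuration. The sketch below applies the same pattern to the message part of the sample loggen message, with the padding shortened for readability. Note that the configuration file doubles the backslashes (`\\d`) because the pattern sits in a double-quoted string, while a Python raw string needs only one (`\d`).

```python
import re

PATTERN = (r"seq: (?P<seq>\d+), thread: (?P<thread>\d+), runid: (?P<runid>\d+), "
           r"stamp: (?P<stamp>[^ ]+) (?P<padding>.*$)")

# Message part of the sample loggen message, with shortened padding.
SAMPLE = ("seq: 0000000000, thread: 0000, runid: 1491387406, "
          "stamp: 2017-04-05T12:16:46 PADDPADDPADD")

fields = re.compile(PATTERN).match(SAMPLE).groupdict()
for name in ("seq", "thread", "runid", "stamp"):
    print(name, "=", fields[name])
```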

## Example: Parse Windows eventlogs in Python - performance

The following example uses regular expressions to process Windows log messages received in XML format. The parser extracts different fields from messages received from the Security and the Application eventlog containers. Using the following configuration file, AxoSyslog could process about 25000 real-life Windows log messages per second.
```
@version: 4.24

options {
    keep-hostname(yes);
    keep-timestamp(no);
    stats-level(2);
    use-dns(no);
};

source s_network_aa5fdf25c39d4017a8e504cdb641b477 {
    network(
        flags(no-parse)
        ip(0.0.0.0)
        log-fetch-limit(1000)
        log-iw-size(100000)
        max-connections(100)
        port(514)
    );
};

parser p_python_parser_79c31da44bb64de6b5de84be4ae15a15 {
    python(
        options("regex_for_security", ".* Security ID:  (?P<security_id>\\S+)   Account Name:  (?P<account_name>\\S+)   Account Domain:  (?P<account_domain>\\S+)   Logon ID:  (?P<logon_id>\\S+).*Process Name: (?P<process_name>\\S+).*EventID (?P<event_id>\\d+)", "regex_others", "(.*)EventID (?P<event_id>\\d+)")
        class("EventlogParser")
    );
};

destination d_file_78363e1dd90c4ebcbb0ee1eff5a2e310 {
    file(
        "/var/testdb_working_dir/fcd713a2-d48e-4025-9192-ec4a9852cafa.$HOST"
        flush-lines(1000)
        log-fifo-size(200000)
    );
};

log {
    source(s_network_aa5fdf25c39d4017a8e504cdb641b477);
    parser(p_python_parser_79c31da44bb64de6b5de84be4ae15a15);
    destination(d_file_78363e1dd90c4ebcbb0ee1eff5a2e310);
    flags(flow-control);
};

python {
import re

class EventlogParser(object):
    def init(self, options):
        self.regex_security = re.compile(options["regex_for_security"])
        self.regex_others = re.compile(options["regex_others"])
        return True

    def deinit(self):
        pass

    def parse(self, log_message):
        security_match = self.regex_security.match(log_message['MESSAGE'])
        if security_match:
            for key, value in security_match.groupdict().items():
                log_message[key] = value
        else:
            others_match = self.regex_others.match(log_message['MESSAGE'])
            if others_match:
                for key, value in others_match.groupdict().items():
                    log_message[key] = value
        return True
};
```
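
When performance matters, it can be useful to get a rough feel for how expensive a regular expression is before deploying it. The following sketch times the `regex_others` pattern in plain Python. It measures only the regex matching on one machine, not the throughput of AxoSyslog, so the result is not comparable to the figure quoted above. The sample message and the `messages_per_second` helper are hypothetical.

```python
import re
import time

REGEX_OTHERS = re.compile(r"(.*)EventID (?P<event_id>\d+)")

# Hypothetical message, made up for this sketch; real eventlog messages differ.
SAMPLE = "<Event><System>Application EventID 1000</System></Event>"


def messages_per_second(regex, text, iterations=20000):
    """Return a rough matches-per-second figure for one precompiled regex."""
    start = time.perf_counter()
    for _ in range(iterations):
        regex.match(text)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return iterations / max(elapsed, 1e-9)


rate = messages_per_second(REGEX_OTHERS, SAMPLE)
print("about %d regex matches per second" % rate)
```

The pattern is compiled once, outside the loop, for the same reason the example compiles its patterns in `init()` rather than in `parse()`: the compilation cost is then paid once instead of once per message.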

Last modified October 16, 2025: [Fix @version config numbers in examples (89688d8)](<https://github.com/axoflow/axosyslog-core-docs/commit/89688d8719a35ac2c048319e8fa82c11c6cad085>)