# python-fetcher: writing fetcher-style Python sources

The Python source allows you to write your own source in Python. You can import external Python modules to receive or fetch the messages. Since many services have a Python library, the Python source makes integrating AxoSyslog very easy and quick.

You can write two different types of sources in Python:

  * Server-style sources that receive messages. Write server-style sources if you want to use an event-loop based, nonblocking server framework in Python, or if you want to implement a custom loop.

  * Fetcher-style sources that actively fetch messages. In general, write fetcher-style sources (for example, when using simple blocking APIs), unless you explicitly need a server-style source.




This section describes fetcher-style sources. For details on server-style sources, see [python: writing server-style Python sources](../../docs/axosyslog-core/chapter-sources/python-source/index.md).

The following points apply to using Python blocks in AxoSyslog in general:

  * Python parsers and template functions are available in AxoSyslog version 3.10 and later. Python destinations and sources are available in AxoSyslog version 3.18 and later.

  * Supported Python versions: 2.7 and 3.4+ (if you are using pre-built binaries, check the dependencies of the package to find out which Python version it was compiled with).

  * The Python block must be a top-level block in the AxoSyslog configuration file.

  * If you store the Python code in a separate Python file and only include it in the AxoSyslog configuration file, make sure that the `PYTHONPATH` environment variable includes the path to the Python file, and that the variable is exported. For example, if you start AxoSyslog manually from a terminal and you store your Python files in the `/opt/syslog-ng/etc` directory, use the following command: `export PYTHONPATH=/opt/syslog-ng/etc`.

In production, when AxoSyslog starts on boot, you must configure your startup script to include the Python path. The exact method depends on your operating system. For recent Red Hat Enterprise Linux, Fedora, and CentOS distributions that use systemd, the `systemctl` command sources the `/etc/sysconfig/syslog-ng` file before starting AxoSyslog. (On openSUSE and SLES, it is the `/etc/sysconfig/syslog` file.) Append the following line to the end of this file: `PYTHONPATH="<path-to-your-python-file>"`, for example, `PYTHONPATH="/opt/syslog-ng/etc"`.

  * The Python object is initialized every time AxoSyslog is started or reloaded.

Warning: If you reload AxoSyslog, existing Python objects are destroyed, therefore the context and state information of Python blocks is lost. Log rotation and updating the configuration of AxoSyslog typically involve a reload.

  * The Python block can contain multiple Python functions.

  * Using Python code in AxoSyslog can significantly decrease the performance of AxoSyslog, especially if the Python code is slow. In general, the features of AxoSyslog are implemented in C, and are faster than implementations of the same or similar features in Python.

  * Validate and lint the Python code before using it. The AxoSyslog application does not do any of this.

  * Python error messages are available in the `internal()` source of AxoSyslog.

  * You can access the name-value pairs of AxoSyslog directly through a message object or a dictionary.

  * To help debugging and troubleshooting your Python code, you can send log messages to the `internal()` source of AxoSyslog. For details, see [Logging from your Python code](../../docs/axosyslog-core/chapter-configuration-file/python-code-logging/index.md).




## Declaration:

Python sources consist of two parts. The first is an AxoSyslog source object that you define in your AxoSyslog configuration and use in the log path. This object references a Python class, which is the second part of the Python source. The Python class receives or fetches the log messages, and can do virtually anything that you can code in Python. You can either embed the Python class into your AxoSyslog configuration file, or [store it in an external Python file](../../docs/axosyslog-core/chapter-configuration-file/python-code-external-file/index.md).
```
source <name_of_the_python_source> {
    python-fetcher(
        class("<name_of_the_python_class_executed_by_the_source>")
    );
};

python {
from syslogng import LogFetcher
from syslogng import LogMessage

class <name_of_the_python_class_executed_by_the_source>(LogFetcher):
    def init(self, options): # optional
        print("init")
        print(options)
        return True

    def deinit(self): # optional
        print("deinit")

    def open(self): # optional
        print("open")
        return True

    def fetch(self): # mandatory
        print("fetch")
        # Other possible return values (note the trailing comma):
        # return LogFetcher.FETCH_ERROR,
        # return LogFetcher.FETCH_NOT_CONNECTED,
        # return LogFetcher.FETCH_TRY_AGAIN,
        # return LogFetcher.FETCH_NO_DATA,
        msg = LogMessage("example message")  # placeholder message body
        return LogFetcher.FETCH_SUCCESS, msg

    def request_exit(self): # optional
        print("request_exit")
        # If your fetching method is blocking, do something to break it
        # For example, if it reads a socket: socket.shutdown()

    def close(self): # optional
        print("close")
};
```

## Methods of the python-fetcher() source

Fetcher-style Python sources must inherit from the `syslogng.LogFetcher` class, and must implement at least the `fetch` method. Multiple inheritance is allowed, but only for pure Python superclasses.

For fetcher-style Python sources, AxoSyslog handles the event loop and the scheduling automatically. You can use simple blocking server/client libraries to receive or fetch logs.

You can retrieve messages using the `fetch()` method.

## init(self, options) method (optional)

The AxoSyslog application initializes Python objects every time it is started or reloaded. The `init` method is executed as part of the initialization. You can perform any initialization steps that are necessary for your source to work.

Warning: If you reload AxoSyslog, existing Python objects are destroyed, therefore the context and state information of Python blocks is lost. Log rotation and updating the configuration of AxoSyslog typically involve a reload.

If this method returns False, AxoSyslog does not start. Use it to check the options and return False if they would prevent the source from starting successfully.

`options`: This optional argument contains the contents of the `options()` parameter of the AxoSyslog configuration object as a Python dictionary.
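The option check described above could look roughly like the following sketch. The option names `url` and `timeout` are made up for illustration, and the sketch assumes that option values arrive as strings. The fallback `LogFetcher` class is only a stand-in so that the code can be run outside AxoSyslog.

```python
try:
    from syslogng import LogFetcher  # provided by AxoSyslog at runtime
except ImportError:
    class LogFetcher:  # stand-in so the sketch runs outside AxoSyslog
        pass


class ValidatingFetcher(LogFetcher):
    """Hypothetical fetcher that validates its options in init()."""

    def init(self, options):
        # "url" and "timeout" are hypothetical option names.
        url = options.get("url")
        if not url:
            return False  # mandatory option is missing: refuse to start
        try:
            self.timeout = float(options.get("timeout", "5"))
        except (TypeError, ValueError):
            return False  # timeout is not a number: refuse to start
        self.url = url
        return True

    def fetch(self):
        raise NotImplementedError("fetch() is omitted from this sketch")
```

Returning False early keeps a misconfigured source from starting at all, rather than failing later inside `fetch()`.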

## open(self) method (optional)

The `open(self)` method opens the resources required for the source, for example, it initiates a connection to the target service. It is called after `init()` when AxoSyslog is started or reloaded. If `fetch()` returns with an error, AxoSyslog calls the `close()` and `open()` methods before trying to fetch a new message.

If `open()` fails, it should return False. In this case, AxoSyslog retries it every `time-reopen()` seconds. For Python sources and destinations, `time-reopen()` defaults to 1 second, and its value is not inherited from the global option. For details, see [Error handling in the python() destination](../../docs/axosyslog-core/chapter-destinations/python-destination/index.md).
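As an illustration, the following sketch shows an `open()` method that reports a failed connection by returning False, paired with a `close()` that releases the socket. The `host` and `port` option names and the default port are assumptions made for this example, and the fallback `LogFetcher` class only exists so that the code runs outside AxoSyslog.

```python
import socket

try:
    from syslogng import LogFetcher  # provided by AxoSyslog at runtime
except ImportError:
    class LogFetcher:  # stand-in so the sketch runs outside AxoSyslog
        pass


class TcpFetcher(LogFetcher):
    """Hypothetical fetcher that reads from a TCP service."""

    def init(self, options):
        # "host" and "port" are hypothetical option names.
        self.host = options.get("host", "127.0.0.1")
        self.port = int(options.get("port", "5140"))
        self.sock = None
        return True

    def open(self):
        try:
            self.sock = socket.create_connection(
                (self.host, self.port), timeout=5)
        except OSError:
            self.sock = None
            return False  # signal failure so that open() is retried later
        return True

    def close(self):
        if self.sock is not None:
            self.sock.close()
            self.sock = None

    def fetch(self):
        raise NotImplementedError("fetch() is omitted from this sketch")
```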

## fetch(self) method (mandatory)

Use the `fetch` method to fetch messages and pass them to the log paths.

For details on parsing messages, see [Python LogMessage API](../../docs/axosyslog-core/chapter-sources/python-source/python-source-logmessage/index.md).

The `fetch` method must return one of the following values:

  * `LogFetcher.FETCH_ERROR`: Fetching new messages failed. AxoSyslog calls the `close` and `open` methods.

  * `LogFetcher.FETCH_NO_DATA`: No data was available. The source waits before calling the fetch method again. The wait time is equal to `time-reopen()` by default, but you can override it by setting the `fetch-no-data-delay()` option in the source.

  * `LogFetcher.FETCH_NOT_CONNECTED`: Could not access the source. AxoSyslog calls the `open` method.

  * `LogFetcher.FETCH_SUCCESS, msg`: Post the message returned as the second argument.

  * `LogFetcher.FETCH_TRY_AGAIN`: The fetcher could not provide a message this time, but will make the source call the fetch method as soon as possible.
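The return values above can be combined as in the following sketch, which serves lines from an in-memory queue. The queue, the `connected` flag, and the blank-line rule are illustrative assumptions. Status-only results are returned as one-element tuples, mirroring the trailing-comma form used in the declaration example. The fallback classes are stand-ins so that the code runs outside AxoSyslog, and their constant values are placeholders, not the real ones.

```python
import queue

try:
    from syslogng import LogFetcher, LogMessage  # provided by AxoSyslog
except ImportError:
    class LogFetcher:  # stand-in; the numeric values are placeholders
        (FETCH_SUCCESS, FETCH_ERROR, FETCH_NOT_CONNECTED,
         FETCH_TRY_AGAIN, FETCH_NO_DATA) = range(5)

    class LogMessage:  # stand-in that only stores the message text
        def __init__(self, msg=""):
            self.msg = msg


class QueueFetcher(LogFetcher):
    """Hypothetical fetcher that serves lines from an in-memory queue."""

    def init(self, options):
        self.pending = queue.Queue()
        self.connected = False
        return True

    def open(self):
        self.connected = True
        return True

    def fetch(self):
        if not self.connected:
            return (LogFetcher.FETCH_NOT_CONNECTED,)
        try:
            line = self.pending.get_nowait()
        except queue.Empty:
            return (LogFetcher.FETCH_NO_DATA,)  # wait before the next call
        if not line.strip():
            return (LogFetcher.FETCH_TRY_AGAIN,)  # skip blank line, call again soon
        return (LogFetcher.FETCH_SUCCESS, LogMessage(line))
```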




## request_exit(self) method (optional)

If you use blocking operations within the `fetch()` method, use `request_exit()` to interrupt those operations (for example, to shut down a socket), otherwise AxoSyslog is not able to stop. Note that AxoSyslog calls the `request_exit` method from a thread different from the source thread.
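One possible way to make a blocking `fetch()` interruptible is a `threading.Event` that `request_exit()` sets from the other thread. In the sketch below, the 30-second wait stands in for a slow blocking call, and the class name and message text are illustrative. The fallback classes are stand-ins so that the code runs outside AxoSyslog.

```python
import threading

try:
    from syslogng import LogFetcher, LogMessage  # provided by AxoSyslog
except ImportError:
    class LogFetcher:  # stand-in; the numeric values are placeholders
        (FETCH_SUCCESS, FETCH_ERROR, FETCH_NOT_CONNECTED,
         FETCH_TRY_AGAIN, FETCH_NO_DATA) = range(5)

    class LogMessage:  # stand-in that only stores the message text
        def __init__(self, msg=""):
            self.msg = msg


class PollingFetcher(LogFetcher):
    """Hypothetical fetcher whose blocking wait can be interrupted."""

    def init(self, options):
        self.stop_requested = threading.Event()
        return True

    def fetch(self):
        # The wait stands in for a blocking API call. It returns True
        # as soon as request_exit() sets the event.
        if self.stop_requested.wait(timeout=30):
            return (LogFetcher.FETCH_NO_DATA,)
        return (LogFetcher.FETCH_SUCCESS, LogMessage("tick"))

    def request_exit(self):
        # Runs on a different thread than fetch(); Event is thread-safe.
        self.stop_requested.set()
```

An event works here because the wait itself is interruptible. For a blocking socket read, shutting down the socket in `request_exit()` plays the same role.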

## close(self) method (optional)

Closes the connection to the target service. It is usually called right before `deinit()` when stopping or reloading AxoSyslog. It is also called when `fetch()` fails.

## close_batch(self)

Closes the current source-side batch. Source-side batching helps AxoSyslog process a larger chunk of messages efficiently, instead of processing messages one by one. For example, when feeding a destination queue, AxoSyslog takes the lock on the queue only once per batch instead of once per message, which would cause contention.

The native drivers built into AxoSyslog typically close batches once every mainloop iteration, allowing a single iteration to process multiple messages. For instance, when receiving multiple messages in a single TCP datagram, all of those messages can be processed as a part of the same batch.

In Python-based log sources, a batch will automatically be closed after every message posted via `post_message()`, except if `self.auto_close_batches` is set to `False` during initialization. In case `self.auto_close_batches` is set to `False`, the driver has to call `close_batch()` explicitly, preferably at a natural boundary between incoming batches of messages. A good example is when we retrieve several messages via the same HTTP REST call, then the right time to close the batch would be after the last message in the response is posted.
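The following sketch illustrates one way to close batches manually in a fetcher. It rests on several assumptions: `auto_close_batches` is set in `init()`, `poll_service()` is a hypothetical helper that returns the lines of one REST response, and the batch is closed at the start of the first `fetch()` call after the last buffered message has been handed over. The fallback classes are stand-ins so that the code runs outside AxoSyslog.

```python
try:
    from syslogng import LogFetcher, LogMessage  # provided by AxoSyslog
except ImportError:
    class LogFetcher:  # stand-in; the numeric values are placeholders
        (FETCH_SUCCESS, FETCH_ERROR, FETCH_NOT_CONNECTED,
         FETCH_TRY_AGAIN, FETCH_NO_DATA) = range(5)

        def close_batch(self):
            pass

    class LogMessage:  # stand-in that only stores the message text
        def __init__(self, msg=""):
            self.msg = msg


class BatchingFetcher(LogFetcher):
    """Hypothetical fetcher that closes one batch per REST response."""

    def init(self, options):
        self.auto_close_batches = False  # batches are closed manually below
        self.buffer = []
        self.batch_open = False
        return True

    def poll_service(self):
        # Hypothetical helper: return the lines of one REST response.
        return []

    def fetch(self):
        if not self.buffer:
            if self.batch_open:
                # Every message of the previous response has been returned.
                self.close_batch()
                self.batch_open = False
            self.buffer = list(self.poll_service())
            if not self.buffer:
                return (LogFetcher.FETCH_NO_DATA,)
        self.batch_open = True
        return (LogFetcher.FETCH_SUCCESS, LogMessage(self.buffer.pop(0)))
```

Closing the batch on the following call, rather than just before returning the last message, is meant to ensure that the last message of a response is already handed over when the batch closes.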

## deinit(self) method (optional)

This method is executed when AxoSyslog is stopped or reloaded. This method does not return a value.

Warning: If you reload AxoSyslog, existing Python objects are destroyed, therefore the context and state information of Python blocks is lost. Log rotation and updating the configuration of AxoSyslog typically involve a reload.

For the list of available optional parameters, see [python() and python-fetcher() source options](../../docs/axosyslog-core/chapter-sources/python-source/reference-source-python/index.md).

Last modified January 5, 2024: [[4.5][python] Adds close batch to fetcher source (741c193c)](<https://github.com/axoflow/axosyslog-core-docs/commit/741c193cdc299eeb6633c9907a7f0d9b84f061fb>)