# hdfs: Store messages on the Hadoop Distributed File System (HDFS)

Starting with version 3.7, AxoSyslog can send plain-text log files to the [Hadoop Distributed File System (HDFS)](<http://hadoop.apache.org/>), allowing you to store your log data on a distributed, scalable file system. This is especially useful if you have huge amounts of log messages that would be difficult to store otherwise, or if you want to process your messages using Hadoop tools (for example, Apache Pig).

Note the following limitations when using the AxoSyslog `hdfs` destination:

  * Because AxoSyslog uses the official Java HDFS client, the `hdfs` destination uses a significant amount of memory (about 400 MB).

  * You cannot control when log messages are flushed. Hadoop flushes data automatically, depending on its configured block size and the amount of data received, and AxoSyslog has no way to influence when messages are actually written to disk. Consequently, AxoSyslog cannot guarantee that a message sent to HDFS has been written to disk. When flow-control is used, AxoSyslog acknowledges a message as written to disk as soon as it passes the message to the HDFS client, so delivery is only as reliable as your HDFS environment.




## Declaration:
```
@include "scl.conf"

hdfs(
    client-lib-dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:<path-to-preinstalled-hadoop-libraries>")
    hdfs-uri("hdfs://NameNode:8020")
    hdfs-file("<path-to-logfile>")
);
```

## Example: Storing logfiles on HDFS

The following example defines an `hdfs` destination using only the required parameters.
```
@include "scl.conf"

destination d_hdfs {
    hdfs(
        client-lib-dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/hadoop/libs")
        hdfs-uri("hdfs://10.140.32.80:8020")
        hdfs-file("/user/log/logfile.txt")
    );
};
```
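
A destination has no effect until a log path references it. The following is a minimal sketch of such a log path, not a recommended setup. It assumes a hypothetical local source named `s_local`, built from the standard `system()` and `internal()` sources, and reuses the `d_hdfs` destination from the example. The `flags(flow-control)` line is optional and is included only to illustrate the acknowledgment behavior described in the limitations: with flow-control enabled, a message counts as delivered once it is handed to the HDFS client.

```
# Hypothetical source name, chosen for illustration only.
# system() collects local system logs, internal() collects
# the messages generated by AxoSyslog itself.
source s_local {
    system();
    internal();
};

# Route everything from s_local to the d_hdfs destination.
log {
    source(s_local);
    destination(d_hdfs);
    flags(flow-control);  # optional: enable flow-control on this path
};
```

Adjust the source to match the inputs you actually want to store on HDFS.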

  * To install the software required for the `hdfs` destination, see [Prerequisites](../../docs/axosyslog-core/chapter-destinations/configuring-destinations-hdfs/destination-hdfs-prerequisites/index.md).

  * For details on how the `hdfs` destination works, see [How AxoSyslog interacts with HDFS](../../docs/axosyslog-core/chapter-destinations/configuring-destinations-hdfs/destination-hdfs-interaction/index.md).

  * For details on using MapR-FS, see [Storing messages with MapR-FS](../../docs/axosyslog-core/chapter-destinations/configuring-destinations-hdfs/destination-hdfs-maprfs/index.md).

  * For details on using Kerberos authentication, see [Kerberos authentication with the hdfs() destination](../../docs/axosyslog-core/chapter-destinations/configuring-destinations-hdfs/destination-hdfs-kerberos-authentication/index.md).

  * For the list of options, see [HDFS destination options](../../docs/axosyslog-core/chapter-destinations/configuring-destinations-hdfs/reference-destination-hdfs/index.md).




The `hdfs()` driver is actually a reusable configuration snippet configured to receive log messages using the Java language-binding of AxoSyslog. For details on using or writing such configuration snippets, see [Reusing configuration blocks](../../docs/axosyslog-core/chapter-configuration-file/large-configs/config-blocks/index.md). You can find the source of the hdfs configuration snippet on [GitHub](<https://github.com/axoflow/axosyslog/blob/master/scl/hdfs/plugin.conf>).

Note: If you delete all Java destinations from your configuration and reload `syslog-ng`, the JVM is no longer used, but it keeps running. To stop the JVM, stop `syslog-ng` and then start it again.

* * *

Last modified November 20, 2024: [Broken link updates (5644de9)](<https://github.com/axoflow/axosyslog-core-docs/commit/5644de9a8069da37e3bebf0ed5a4e73cf958a66b>)