journalpump |BuildStatus|_
==========================

.. |BuildStatus| image:: https://github.com/aiven/journalpump/actions/workflows/build.yml/badge.svg?branch=master
.. _BuildStatus: https://github.com/aiven/journalpump/actions

.. image:: https://codecov.io/gh/aiven/journalpump/branch/master/graph/badge.svg?token=nLr7M7hvCx
   :target: https://codecov.io/gh/aiven/journalpump

journalpump is a daemon that takes log messages from journald and pumps them to a given output. Currently supported outputs are Elasticsearch, OpenSearch, Apache Kafka®, logplex, rsyslog, websocket and AWS CloudWatch. It reads messages from journald and optionally checks if they match a config rule and forwards them as JSON messages to the desired output.

Building
--------

To build an installation package for your distribution, go to the root directory of a journalpump Git checkout and then run:

Debian::

  make deb

This will produce a .deb package into the parent directory of the Git checkout.

Fedora::

  make rpm

This will produce an RPM in rpm/RPMS/noarch/.

Other::

  python3 setup.py bdist_egg

This will produce an egg file into a dist directory within the same folder.

For a source install the dependency `python-systemd <https://github.com/systemd/python-systemd>`_ has to be installed through your distribution's package manager (the PyPI ``systemd`` package is not the same!).

journalpump requires Python 3.4 or newer.

Installation
------------

To install it run as root:

Debian::

  dpkg -i ../journalpump*.deb

Fedora::

  su -c 'dnf install rpm/RPMS/noarch/*'

On Fedora it is recommended to simply run journalpump under systemd::

  systemctl enable journalpump.service

and eventually after the setup section, you can just run::

  systemctl start journalpump.service

Other::

  python3 setup.py install

On systems without systemd it is recommended that you run journalpump within supervisord_ or similar process control system.

.. _supervisord: http://supervisord.org

Setup
-----

After installation you need to create a suitable JSON configuration file for your installation.

General notes
~~~~~~~~~~~~~

If correctly installed, journalpump comes with a single executable, ``journalpump``, which takes as an argument the path to journalpump's JSON configuration file.

journalpump is the main process that should be run under systemd or supervisord.

While journalpump is running it may be useful to read the JSON state file, which is created as ``journalpump_state.json`` in the current working directory. The JSON state file is human readable and gives an understandable description of the current state of journalpump.

Top level configuration
~~~~~~~~~~~~~~~~~~~~~~~

Example::

  {
      "log_level": "INFO",
      "field_filters": { ... },
      "unit_log_levels": { ... },
      "json_state_file_path": "/var/lib/journalpump/journalpump_state.json",
      "readers": { ... },
      "statsd": {
          "host": "127.0.0.1",
          "port": 12345,
          "prefix": "user-",
          "tags": {
              "sometag": "somevalue"
          }
      }
  }

json_state_file_path (default "journalpump_state.json")

Location of a JSON state file which describes the state of the journalpump process.

statsd (default null)

Enables metrics sending to a statsd daemon that supports the influxdb-statsd / telegraf syntax with tags.

The tags setting can be used to enter optional tag values for the metrics.

The prefix setting can be used to enter an optional prefix for all metric names.

Metrics sending follows the `Telegraf spec`_.

.. _Telegraf spec: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/statsd
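As an illustration of the influx-statsd / Telegraf syntax with tags described above, here is a minimal sketch of how a metric line could be formatted with the ``prefix`` and ``tags`` settings applied. The ``format_metric`` helper is hypothetical, not journalpump's actual code:

```python
# Sketch (hypothetical helper, not journalpump's implementation):
# format one metric line in the influx-statsd / Telegraf syntax,
# applying the "prefix" and "tags" settings from the statsd config.

def format_metric(name, value, metric_type="c", prefix="", tags=None):
    """Build one statsd line, e.g. 'user-msgs,sometag=somevalue:1|c'."""
    tag_part = "".join(
        ",{}={}".format(k, v) for k, v in sorted((tags or {}).items())
    )
    return "{}{}{}:{}|{}".format(prefix, name, tag_part, value, metric_type)

# With prefix "user-" and tags {"sometag": "somevalue"} from the example:
format_metric("msgs", 1, prefix="user-", tags={"sometag": "somevalue"})
# 'user-msgs,sometag=somevalue:1|c'
```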

log_level (default "INFO")

Determines the log level of journalpump. See the `available log levels <https://docs.python.org/3/library/logging.html#logging-levels>`_.

Field filter configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

Field filters can be used to restrict the journald fields that journalpump sends forward. Field filter configuration structure::

  {
      "field_filters": {
          "filter_name": {
              "type": "whitelist|blacklist",
              "fields": ["field1", "field2"]
          }
      }
  }

filter_name

Name of the filter. The filters can be configured per sender and depending on the use case the filters for different senders may vary.

type (default whitelist)

Specifies whether the listed fields will be included (whitelist) or excluded (blacklist).

fields

The actual fields to include or exclude. Field name matching is case insensitive and leading underscores in field names are trimmed.
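The matching semantics above (case insensitive, leading underscores trimmed) can be sketched as follows; ``apply_field_filter`` is a hypothetical helper for illustration, not journalpump's API:

```python
# Sketch of whitelist/blacklist field filtering as described above.
# apply_field_filter is a hypothetical name, not journalpump's API.

def _norm(name):
    # Case-insensitive match; leading underscores are trimmed,
    # so "pid" in a filter matches the journald field "_PID".
    return name.lstrip("_").lower()

def apply_field_filter(entry, fields, type_="whitelist"):
    """Return a copy of a journald entry with fields included/excluded."""
    wanted = {_norm(f) for f in fields}
    if type_ == "whitelist":
        return {k: v for k, v in entry.items() if _norm(k) in wanted}
    return {k: v for k, v in entry.items() if _norm(k) not in wanted}

entry = {"_PID": "123", "MESSAGE": "hello"}
apply_field_filter(entry, ["message"])  # {'MESSAGE': 'hello'}
```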

Unit log levels configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Unit log levels can be used to specify which log levels you want to set on a per unit basis. Matching supports glob patterns. For example, to only process messages for a systemd unit called test-unit with severity WARNING or higher, your config could look like this::

  {
      "unit_log_levels": {
          "log_level_name": [
              {
                  "service_glob": "test-unit*",
                  "log_level": "WARNING"
              },
              {
                  "service_glob": "*-unit",
                  "log_level": "INFO"
              }
          ]
      }
  }

Note that if your unit matches multiple patterns (as "test-unit" would in the example above), the first match is used, i.e. "WARNING" in this case.
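The first-match-wins glob lookup can be sketched with Python's ``fnmatch``; ``resolve_log_level`` is a hypothetical helper, not journalpump's API:

```python
# Sketch of the first-match-wins glob matching described above.
from fnmatch import fnmatch

RULES = [
    {"service_glob": "test-unit*", "log_level": "WARNING"},
    {"service_glob": "*-unit", "log_level": "INFO"},
]

def resolve_log_level(unit, rules, default="INFO"):
    for rule in rules:  # rules are checked in order; first match wins
        if fnmatch(unit, rule["service_glob"]):
            return rule["log_level"]
    return default

resolve_log_level("test-unit", RULES)  # 'WARNING', not 'INFO'
```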

log_level_name

Name of the log level configuration. This can be configured per sender and depending on the use case the settings for different senders may vary.

Reader configuration
~~~~~~~~~~~~~~~~~~~~

Reader configuration structure::

  {
      "readers": {
          "some_reader": {
              "senders": {
                  "some_log": { ... },
                  "another_log": { ... }
              }
          },
          "another_reader": {
              "senders": {
                  "some_kafka": { ... }
              }
          }
      }
  }

Example configuration for a single reader::

  {
      "field_filters": {
          "drop_process_id": {
              "fields": ["process_id"],
              "type": "blacklist"
          }
      },
      "unit_log_levels": {
          "drop_everything_below_warning": [
              {
                  "service_glob": "*",
                  "log_level": "WARNING"
              }
          ]
      },
      "journal_path": "/var/lib/machines/container1/var/log/journal/b09ffd62229f4bd0829e883c6bb12c4e",
      "senders": {
          "k1": {
              "output_type": "kafka",
              "field_filter": "drop_process_id",
              "unit_log_level": "drop_everything_below_warning",
              "ca": "/etc/journalpump/ca-bundle.crt",
              "certfile": "/etc/journalpump/node.crt",
              "kafka_address": "kafka.somewhere.com:12345",
              "kafka_topic": "journals",
              "keyfile": "/etc/journalpump/node.key",
              "ssl": true
          }
      },
      "searches": [
          {
              "fields": {
                  "MESSAGE": "kernel: Out of memory: Kill process .+ \\((?P<process>[^ ]+)\\)"
              },
              "name": "journal.oom_killer"
          }
      ],
      "secret_filter_metrics": true,
      "secret_filters": [
          {
              "pattern": "SENSITIVE",
              "replacement": "[REDACTED]"
          }
      ],
      "threshold_for_metric_emit": 10,
      "tags": {
          "type": "container"
      }
  }
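To illustrate how a ``searches`` pattern like the ``journal.oom_killer`` example can match a message, here is a quick sketch in plain Python ``re`` (an assumption about intent; journalpump's runtime code may differ):

```python
import re

# The MESSAGE pattern from the journal.oom_killer search above,
# with a named group "process" capturing the killed process name.
pattern = re.compile(
    r"kernel: Out of memory: Kill process .+ \((?P<process>[^ ]+)\)"
)

msg = "kernel: Out of memory: Kill process 1234 (badproc) score 999"
m = pattern.search(msg)
if m:
    print(m.group("process"))  # badproc
```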

initial_position (default head)

Controls where the reader starts when journalpump is launched for the first time:

- head: First entry in the journal
- tail: Last entry in the journal
- <integer>: Seconds from current boot session
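A minimal sketch of interpreting the three forms of this setting; ``parse_initial_position`` is a hypothetical helper, and the actual seeking in the journal (done via python-systemd) is not shown:

```python
# Sketch: classify an initial_position value into a seek decision.
# parse_initial_position is a hypothetical name, not journalpump's API.

def parse_initial_position(value="head"):
    """Return ('head', None), ('tail', None) or ('boot_offset', seconds)."""
    if value in ("head", "tail"):
        return (value, None)
    # Anything else is an integer: seconds from the current boot session.
    return ("boot_offset", int(value))

parse_initial_position("tail")  # ('tail', None)
parse_initial_position(120)     # ('boot_offset', 120)
```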

match_key (default null)

If you want to match against a single journald field, this configuration key defines the key to match against.

match_value (default null)

If you want to match against a single journald field, this configuration key defines the value to match against. Currently only equality is allowed. Note that this means that if you specify match_key but not match_value, the reader will match all entries that do not contain the match_key.
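The matching rule above, including the noted edge case, can be sketched like this; ``entry_matches`` is a hypothetical name, not journalpump's API:

```python
# Sketch of the match_key / match_value semantics described above.

def entry_matches(entry, match_key=None, match_value=None):
    if match_key is None:
        return True  # no match filtering configured
    # Only equality is supported. If match_value is None (unset),
    # this matches exactly the entries that lack match_key.
    return entry.get(match_key) == match_value

entry_matches({"_SYSTEMD_UNIT": "a.service"}, "_SYSTEMD_UNIT", "a.service")
# True
entry_matches({"MESSAGE": "x"}, "_SYSTEMD_UNIT", None)
# True: the entry does not contain the match_key
```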

msg_buffer_max_length (default 50000)

How many journal entries to read at most into a memory buffer, from which journalpump feeds the configured log senders.

journal_path (default null)

Path to the directory containing journal files if you want to override the default one.

journal_namespace (default null - read from default systemd namespace)

Journal namespace to read logs from. This feature requires the latest version of python-systemd with `namespace support <https://github.com/systemd/python-systemd/pull/87>`_.

units_to_match (default [])

Require that log messages match only the given _SYSTEMD_UNITs. If not set, log events from all units are allowed.

flags (default LOCAL_ONLY)

"LOCAL_ONLY" opens the journal on the local machine only; "RUNTIME_ONLY" opens only volatile journal files; "SYSTEM" opens journal files of system services and the kernel; "CURRENT_USER" opens files of the current user; and "OS_ROOT" is used to open the journal from directories relative to the specified directory path or file descriptor. Multiple flags can be OR'ed together using a list: ["LOCAL_ONLY", "CURRENT_USER"].
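The OR-combination of a flag list can be sketched as a bitwise OR over named constants. The numeric values below are placeholders for illustration; the real constants come from ``systemd.journal`` in python-systemd:

```python
# Sketch: combine a configured list of flag names with bitwise OR.
# The numeric values are illustrative placeholders, not guaranteed
# to equal the actual systemd.journal constants.
FLAGS = {"LOCAL_ONLY": 1, "RUNTIME_ONLY": 2, "SYSTEM": 4, "CURRENT_USER": 8}

def combine_flags(names):
    value = 0
    for name in names:  # OR each configured flag into the mask
        value |= FLAGS[name]
    return value

combine_flags(["LOCAL_ONLY", "CURRENT_USER"])  # 9, i.e. 1 | 8
```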

secret_filters (default [])

Secret filters can be used to redact sensitive data matching known patterns in logs before forwarding the message to its final destination. To use them, add a list of filters to the reader configuration, as in the reader example above. Each pattern is a standard Python regex, and the matching substring will be replaced with the given replacement value.
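The redaction step can be sketched with ``re.sub`` applied per filter; ``apply_secret_filters`` is a hypothetical helper, not journalpump's API:

```python
import re

# Sketch of secret-filter redaction as described above, using the
# example filter {"pattern": "SENSITIVE", "replacement": "[REDACTED]"}.

def apply_secret_filters(message, filters):
    for f in filters:  # each pattern is a standard Python regex
        message = re.sub(f["pattern"], f["replacement"], message)
    return message

apply_secret_filters(
    "token=SENSITIVE ok",
    [{"pattern": "SENSITIVE", "replacement": "[REDACTED]"}],
)
# 'token=[REDACTED] ok'
```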
