SkillAgentSearch skills...

Ksml

Kafka Streams without Java

Install / Use

/learn @Axual/Ksml
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

Build and test

KSML – Kafka Streams without Java

KSML is a wrapper language and interpreter around Kafka Streams that lets you express any topology in a YAML syntax. Simply define your topology as a processing pipeline with a series of steps that your data passes through. Your custom functions can be expressed inline in Python. KSML will read your definition and construct the topology dynamically via the Kafka Streams DSL and run it in GraalVM.

KSML was started by Axual in early 2021 and open-sourced in May 2021.

Why KSML?

Kafka Streams is powerful but Java-centric. KSML eliminates the Java boiler-plate through:

  • Declarative YAML for topology wiring
  • User-defined functions in Python for customized business logic
  • One command to package and run (container image or in your own JVM)

Language

To quickly jump to the KSML specification, use this link: https://axual.github.io/ksml/

Examples

The following examples are provided in the examples directory:

| Filename | Description | |---------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | 01-example-inspect.yaml | Reads messages from Avro, CSV, JSON, Protobuf, and XML topics and outputs them on stdout |
| 02-example-copy.yaml | Provides two ways to copy all messages from one topic to another topic: using predefined stream references and inline topic definitions | | 03-example-filter.yaml | Reads messages from a topic, filters them and sends the results to an output topic | | 04-example-branch.yaml | Splits the stream into blue-sensor and red-sensor sub-flows, with a default branch for all other colors | | 05-example-route.yaml | Routes each message to sensor0, sensor1, or sensor2 topics by computing the target topic name from the record key at runtime | | 06-example-duplicate.yaml | Reads messages from a topic, duplicates them in-memory and sends the results to an output topic | | 07-example-convert.yaml | Walks through a multi-step conversion chain, from AVRO → JSON → String → JSON → XML → String → XML- then writes XML to the target topic | | 08-example-count.yaml | Demonstrates windowed aggregation: groups messages by owner, applies 20-second tumbling windows, and counts messages per owner per window with state store configuration | | 09-example-aggregate.yaml | Same as count above, but performs steps manually through the aggregate operation | | 10-example-queryable-table.yaml | Filters out records without a key and sinks the result into a queryable state store (table) that external services can poll for the latest value | | 11-example-field-modification.yaml | Shows field-level manipulation in AVRO messages: modifying the "owner" field and removing the "color" field by schema modification, producing a new schema type | | 12-example-byte-manipulation.yaml | Reads data from a binary input topic, modifies some bytes and writes to an output topic | | 13-example-join.yaml | Joins live sensor data (stream) with alert-settings (table) and produces individual alert records for each threshold breach | | 14-example-manual-state-store.yaml | Declares and accesses a custom in-memory key/value store inside a forEach processor to track the last value per sensor | | 15-example-pipeline-linking.yaml | Demonstrates pipeline chaining where the output of one pipeline becomes the input of the next, creating a sequential processing flow through five linked pipelines. Finally resulting in a terminal pipeline that logs the results | | 16-example-transform-metadata.yaml | Shows how to modify message metadata including timestamps and headers, adding custom headers to messages during processing | | 17-example-inspect-with-metrics.yaml | Metric-keeping version of first example above | | 18-example-timestamp-extractor.yaml | Demonstrates custom timestamp extraction from message content, using a counter-based timestamp extractor with global state | | 19-example-performance-measurement.yaml | Tracks total messages and runtime, logging average messages-per-second every 100 records to give a quick throughput overview |

Project Overview

The project is divided into modules based functionality in order to be included separately depending on the use case.

The submodules are as follows:

| Module | Description | |---------------------------------------------|--------------------------------------------------------------------------------------------------------------| | ksml-data | contains core data type and schema logic. | | ksml-data-avro | extension to the data library for AVRO support. | | ksml-data-binary | extension to the data library for BINARY support. | | ksml-data-csv | extension to the data library for CSV support. | | ksml-data-json | extension to the data library for JSON support. | | ksml-data-protobuf | extension to the data library for PROTOBUF support. | | ksml-data-soap | extension to the data library for SOAP support. | | ksml-data-xml | extension to the data library for XML support. | | ksml | the core component that parses KSML definitions and converts them to a Kafka Streams topology. | | ksml-kafka-clients | the set of Kafka clients for KSML, injected i

View on GitHub
GitHub Stars34
CategoryDevelopment
Updated4d ago
Forks14

Languages

Java

Security Score

95/100

Audited on Mar 27, 2026

No findings