# StreamHouse

S3-native event streaming. One binary replaces Kafka.

StreamHouse stores everything in S3 — no broker fleets, no disk replication, no JVM. Kafka protocol compatible. 1 TB of retention costs ~$23/month.
## Demo

Stream events to Postgres in 5 commands. No Kafka, no Flink, no glue code.

## Get Started

### Cloud (fastest)

Sign up at streamhouse.app, create an org, and grab your API key.
### CLI

```sh
brew install gbram1/tap/stm
stm auth login

# Create a topic
stm topic create events --partitions 3

# Produce
stm produce events --key user-1 --value '{"event":"signup","user":"alice"}'

# Consume
stm consume events --partition 0

# SQL
stm sql query "SELECT * FROM events LIMIT 10"

# Metrics
stm metrics overview
```
### Python SDK

```sh
pip install streamhouse
# On macOS, use a venv if pip is blocked:
# python3 -m venv .venv && source .venv/bin/activate && pip install streamhouse
```

```python
from streamhouse import StreamHouse

sh = StreamHouse(api_key="sk_live_...")

# Create a topic
sh.admin.create_topic("events", partitions=3)

# Produce
producer = sh.producer()
producer.send("events", key="user-1", value='{"event":"signup"}')

# Consume
consumer = sh.consumer()
records = consumer.poll("events", partition=0, offset=0)
for record in records:
    print(record.value_str)
```
### TypeScript SDK

```sh
npm install streamhouse
```

```ts
import { StreamHouse } from "streamhouse";

const sh = new StreamHouse({ apiKey: "sk_live_..." });

// Create a topic
await sh.admin.createTopic("events", { partitions: 3 });

// Produce
const producer = sh.producer();
await producer.send("events", { event: "signup", user: "alice" }, { key: "user-1" });

// Consume
const consumer = sh.consumer();
for await (const record of consumer.subscribe("events")) {
  console.log(record.value);
}
```
## SQL Queries

Query your streams with SQL — no external engine needed.

```sh
stm sql query "SELECT value->>'event' as event, COUNT(*) as cnt FROM events GROUP BY 1"
```

```python
result = sh.admin.get_metrics()  # or use the SQL endpoint
```

```ts
const result = await sh.query('SELECT * FROM events WHERE value->>\'event\' = \'signup\' LIMIT 10');
console.log(result.rows);
```
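The JSON operators follow Postgres syntax. As a plain-Python sketch (no StreamHouse dependency; the sample records are illustrative), the GROUP BY query above is equivalent to:

```python
import json
from collections import Counter

# Toy records standing in for the `events` topic (illustrative only)
raw = [
    '{"event":"signup","user":"alice"}',
    '{"event":"login","user":"alice"}',
    '{"event":"signup","user":"bob"}',
]

# Equivalent of: SELECT value->>'event' as event, COUNT(*) as cnt FROM events GROUP BY 1
counts = Counter(json.loads(v)["event"] for v in raw)
print(dict(counts))  # {'signup': 2, 'login': 1}
```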
## Pipelines & Sink Connectors

Stream data from topics to external systems with optional SQL transforms.

```sh
# Create a Postgres sink target
stm pipeline target create my-postgres \
  --target-type postgres \
  --url "postgres://user:pass@host:5432/mydb" \
  --table events_sink

# Create a pipeline with a SQL transform
stm pipeline create signup-pipeline \
  --source-topic events \
  --target my-postgres \
  --transform "SELECT value->>'user' as user_id, value->>'event' as event_type, timestamp FROM events WHERE value->>'event' = 'signup'"

# Start it
stm pipeline start signup-pipeline

# Check status
stm pipeline list
```

Supported sinks: PostgreSQL, S3 (Parquet/JSON/CSV), Elasticsearch.
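To make the transform concrete, here is a plain-Python sketch of what the SQL above does to each record. The field names follow the example; the real pipeline evaluates the SQL inside StreamHouse, not this code:

```python
import json

def transform(record):
    """Sketch of the signup-pipeline transform: filter on
    value->>'event' = 'signup', project user_id / event_type / timestamp."""
    value = json.loads(record["value"])
    if value.get("event") != "signup":
        return None  # filtered out by the WHERE clause
    return {
        "user_id": value.get("user"),
        "event_type": value.get("event"),
        "timestamp": record["timestamp"],
    }

row = transform({"value": '{"event":"signup","user":"alice"}', "timestamp": 1700000000})
print(row)  # {'user_id': 'alice', 'event_type': 'signup', 'timestamp': 1700000000}
```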
## Kafka Protocol

Works with any Kafka client — kcat, kafka-python, confluent-kafka, etc.

```sh
# Produce with kcat
echo '{"event":"signup"}' | kcat -P -b localhost:9092 -t events -k user-1

# Consume with kcat
kcat -C -b localhost:9092 -t events -o beginning
```
## Self-Hosted

```sh
docker compose up -d
```

| Service | URL |
|---------|-----|
| REST API | http://localhost:8080 |
| gRPC | localhost:50051 |
| Kafka | localhost:9092 |
| Swagger UI | http://localhost:8080/swagger-ui/ |
| Grafana | http://localhost:3001 |
## Why StreamHouse?

**Cost.** Kafka replicates data 3x on broker disks. StreamHouse stores data in S3 — retention is nearly free.

**Simplicity.** No JVM, no ZooKeeper, no KRaft, no broker fleet. One Rust binary. SQLite for dev, PostgreSQL for prod.

**Flexibility.** Choose durability per write:

| Mode | Latency | Best for |
|------|---------|----------|
| acks=buffered | ~1ms | High-throughput ingestion |
| acks=durable | ~150ms | Production workloads needing S3 durability |
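Conceptually, the two modes trade an in-memory buffer against a synchronous S3 flush. A toy sketch of that trade-off (not StreamHouse's actual implementation; only the `acks` values come from the table above):

```python
class ToyProducer:
    """Illustrates the acks=buffered vs acks=durable trade-off (conceptual only)."""

    def __init__(self):
        self.buffer = []   # records acknowledged but awaiting a background flush
        self.durable = []  # records already persisted to "S3"

    def send(self, record, acks="buffered"):
        if acks == "durable":
            # Synchronous path: wait for the S3 write (~150ms per the table)
            self.durable.append(record)
            return "persisted"
        # Fast path: acknowledge once buffered (~1ms); flush happens later
        self.buffer.append(record)
        return "buffered"

    def flush(self):
        # Background flush moves buffered records to durable storage
        self.durable.extend(self.buffer)
        self.buffer.clear()

p = ToyProducer()
p.send({"event": "signup"})                   # returns "buffered"
p.send({"event": "payment"}, acks="durable")  # returns "persisted"
p.flush()
print(len(p.durable))  # 2
```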
## Features

- Kafka protocol — 23 APIs, consumer groups, transactions, SASL auth
- REST + gRPC APIs — OpenAPI docs at `/swagger-ui/`
- SQL engine — Query streams with SQL, window aggregations, JSON operators
- Schema Registry — JSON Schema, Avro, Protobuf with compatibility checking
- Pipelines — Stream processing with SQL transforms, sink to Postgres/S3/Elasticsearch
- Multi-tenancy — Org-scoped isolation, API keys with permissions and topic scopes
- Log compaction — Tombstone handling, background compaction
- Observability — Prometheus metrics, Grafana dashboards, real-time WebSocket metrics
- Disaster recovery — S3 metadata snapshots, self-healing reconciliation
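The log-compaction bullet can be illustrated with a minimal sketch: keep only the latest value per key, and treat `None` as a tombstone that deletes the key. This shows the general technique, not StreamHouse's internal code:

```python
def compact(log):
    """Keep the last value per key; a None value is a tombstone deleting the key."""
    latest = {}
    for key, value in log:
        if value is None:
            latest.pop(key, None)  # tombstone: drop earlier writes for this key
        else:
            latest[key] = value    # later writes shadow earlier ones
    return latest

log = [
    ("user-1", "signup"),
    ("user-2", "signup"),
    ("user-1", "upgrade"),  # shadows the earlier user-1 record
    ("user-2", None),       # tombstone deletes user-2
]
print(compact(log))  # {'user-1': 'upgrade'}
```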
## Performance

| Metric | Throughput |
|--------|-----------|
| WAL writes | 2.21M records/sec |
| Full path (WAL → S3) | 769K records/sec |
| gRPC ProduceBatch | 100K+ messages/sec |
| Segment read (LZ4) | 3.10M records/sec |
## Docs

| Guide | Covers |
|---|---|
| Getting Started | Installation, first topic, Docker |
| Architecture | Segments, WAL, leases, rebalancing |
| API Reference | REST, gRPC, Kafka endpoints |
| Configuration | Env vars, tuning, durability |
| Authentication | API keys, SASL, multi-tenancy |
## Compared to Alternatives

| | StreamHouse | Kafka | WarpStream |
|---|---|---|---|
| Storage | S3 | Broker disks (3x) | S3 |
| Runtime | Single Rust binary | JVM + ZK/KRaft | Go agents |
| Retention (1 TB) | ~$23/mo | ~$3K+/mo | ~$23/mo |
| Built-in SQL | Yes | No | No |
| Schema Registry | Built-in | Separate | No |
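The retention figure in the table is straightforward S3 arithmetic, assuming S3 Standard pricing of roughly $0.023/GB-month (actual rates vary by region and tier):

```python
# Back-of-envelope check on the ~$23/mo retention figure
S3_STANDARD_PER_GB_MONTH = 0.023  # assumed US East S3 Standard rate, USD
retention_gb = 1000               # 1 TB

monthly_cost = retention_gb * S3_STANDARD_PER_GB_MONTH
print(f"${monthly_cost:.2f}/month")  # $23.00/month
```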
## Contributing

Issues and PRs welcome. See CONTRIBUTING.md.

## License