
☄️ Comet

High-performance embedded segmented log for edge observability. Built for single-digit microsecond latency and bounded resources.

Architecture Guide | Performance Guide | Troubleshooting | Security | API Reference

[!NOTE] This is very much an experiment in vibe coding. The ideas are sound and the test coverage is robust, but keep that in mind before depending on it in production for now.

What is Comet?

Comet is a segmented append-only log optimized for observability data (metrics, logs, traces) at edge locations. It implements the same pattern as Kafka's storage engine - append-only segments with time/size-based retention - but embedded directly in your service with aggressive deletion policies for resource-constrained environments.

Each shard maintains a series of immutable segment files that are rotated at size boundaries and deleted based on retention policies, ensuring predictable resource usage without the complexity of circular buffers or in-place overwrites.

Comet requires local filesystems (ext4, xfs, etc.) for its microsecond latency guarantees. It is unapologetically local. If you need distributed storage, use a proper distributed system like NATS JetStream or Kafka instead.

The Edge Storage Problem

Edge deployments need local observability buffering, but other solutions fall short:

  • Kafka: Requires clusters, complex ops, ~1-5ms latency
  • RocksDB: Single-threaded writes, 50-200μs writes, 100ms+ during compaction stalls
  • Redis: Requires separate server, memory-only without persistence config
  • Ring buffers: No persistence, no compression, data loss on overflow
  • Files + rotation: No indexing, no consumer tracking, manual everything

The gap: No embedded solution with Kafka's reliability at microsecond latencies.

Features

  • High throughput at ultra-low latency: up to 12,634,200 entries/sec with batching (2.4M ops/sec with optimal sharding)
    • Comet uses periodic checkpoints (default: every 1000 writes or 1 second) to persist data to disk. Between checkpoints, writes are acknowledged after being written to the OS page cache.
  • Predictable performance: No compaction stalls or write amplification like LSM-trees
  • True multi-process support: Hybrid coordination (mmap + file locks), crash-safe rotation, real OS processes
  • O(log n) lookups: Binary searchable index with bounded memory usage
  • Lock-free reads: Atomic pointers, zero-copy via mmap with memory safety
  • Automatic retention: Time and size-based cleanup, protects unconsumed data
  • Production ready: Crash recovery, built-in metrics, extensive testing
  • Smart sharding: Consistent hashing, automatic discovery, batch optimizations
  • Optional zstd compression: ~37% storage savings when needed

Multi-Process Coordination

Unlike other embedded solutions, Comet enables true multi-process coordination through memory-mapped state files. Perfect for prefork web servers and multi-process deployments.

  • Automatic shard ownership - Each process owns specific shards based on shardID % processCount == processID
  • Per-shard state files - Each shard has its own comet.state for metrics and recovery
  • Memory-mapped coordination - Lock-free operations through atomic memory access
  • Crash-safe design - State files enable automatic recovery on restart

How Does Comet Compare?

| Feature          | Comet                      | Kafka               | Redis Streams      | RocksDB            | Proof      |
| ---------------- | -------------------------- | ------------------- | ------------------ | ------------------ | ---------- |
| Write Latency    | 1.7μs (33μs multi-process) | 1-5ms               | 50-100μs           | 50-200μs           | Benchmarks |
| Multi-Process    | ✅ Real OS processes       | ✅ Distributed      | ❌ Single process  | ⚠️ Mutex locks     | Tests      |
| Resource Bounds  | ✅ Time & size limits      | ⚠️ JVM heap         | ⚠️ Memory only     | ⚠️ Manual compact  | Retention  |
| Crash Recovery   | ✅ Automatic               | ✅ Replicas         | ⚠️ AOF/RDB         | ✅ WAL             | Recovery   |
| Zero Copy Reads  | ✅ mmap                    | ❌ Network          | ❌ Serialization   | ❌ Deserialization | Reader     |
| Storage Overhead | ~12 bytes/entry            | ~50 bytes/entry     | ~20 bytes/entry    | ~30 bytes/entry    | Format     |
| Sharding         | ✅ Built-in                | ✅ Partitions       | ❌ Manual          | ❌ Manual          | Client     |
| Compression      | ✅ Optional zstd           | ✅ Multiple codecs  | ❌ None            | ✅ Multiple        | Config     |
| Embedded         | ✅ Native                  | ❌ Requires cluster | ❌ Requires server | ✅ Native          | -          |

Quick Start

The Easy Way™

Step 1: Create a client

client, err := comet.NewClient("/var/lib/comet")
defer client.Close()

Step 2: Write your events

// Pick which shard to write to based on a key (for consistent routing)
// High-cardinality keys (e.g. a UUID) are recommended for even distribution across shards
stream := client.PickShardStream("events:v1", event.ID, 256)
// This returns something like "events:v1:shard:00A7" based on hash(event.ID) % 256

ids, err := client.Append(ctx, stream, [][]byte{
    []byte(event.ToJSON()),
})

Step 3: Process events

consumer := comet.NewConsumer(client, comet.ConsumerOptions{
    Group: "my-processor",
})

// Process() is the main API - it handles everything for you!
// By default, it discovers and processes ALL shards automatically
err = consumer.Process(ctx, func(ctx context.Context, messages []comet.StreamMessage) error {
    for _, msg := range messages {
        processEvent(msg.Data)  // Your logic here
    }
    return nil  // Success = automatic progress tracking
})

That's it! Comet handles:

  • Compression - Large events compressed automatically
  • Sharding - Load distributed across 16 shards
  • Retries - Failed batches retry automatically
  • Progress - Consumer offsets tracked per shard
  • Cleanup - Old data deleted automatically
  • Recovery - Crash? Picks up where it left off

Want more control? Scale horizontally:

// Deploy this same code across 3 processes:
err = consumer.Process(ctx, processEvents,
    comet.WithStream("events:v1:shard:*"),
    comet.WithConsumerAssignment(workerID, numWorkers),  // This worker + total count
)
// Each worker processes different shards automatically!
// No coordination needed - Comet handles it

Production-Ready Example

// Define your processing function with context support
processEvents := func(ctx context.Context, messages []comet.StreamMessage) error {
    for _, msg := range messages {
        // Check for cancellation
        if ctx.Err() != nil {
            return ctx.Err()
        }
        // Process each message
        if err := handleEvent(msg.Data); err != nil {
            return err // Will trigger retry
        }
    }
    return nil
}

err = consumer.Process(ctx, processEvents,
    comet.WithStream("events:v1:shard:*"),
    comet.WithBatchSize(1000),
    comet.WithPollInterval(50 * time.Millisecond),

    // Optional: Add observability
    comet.WithErrorHandler(func(err error, retryCount int) {
        metrics.Increment("comet.errors", 1)
        log.Printf("Retry %d: %v", retryCount, err)
    }),
    comet.WithBatchCallback(func(size int, duration time.Duration) {
        metrics.Histogram("comet.batch.size", float64(size))
        metrics.Histogram("comet.batch.duration_ms", float64(duration.Milliseconds()))
    }),
)

Need to tweak something?

// Only override what you need:
config := comet.DefaultCometConfig()
config.Retention.MaxAge = 24 * time.Hour  // Keep data longer
client, err := comet.NewClient("/var/lib/comet", config)

Configuration Structure

type CometConfig struct {
    Compression CompressionConfig  // Controls compression behavior
    Indexing    IndexingConfig     // Controls indexing and lookup
    Storage     StorageConfig      // Controls file storage
    Concurrency ConcurrencyConfig  // Controls multi-process behavior
    Retention   RetentionConfig    // Controls data retention
}

Architecture

┌─────────────────┐     ┌─────────────────┐
│   Your Service  │     │  Your Service   │
│    Process 0    │     │    Process 1    │
│  ┌───────────┐  │     │  ┌───────────┐  │
│  │   Comet   │  │     │  │   Comet   │  │
│  │  Client   │  │     │  │  Client   │  │
│  └─────┬─────┘  │     │  └─────┬─────┘  │
└────────┼────────┘     └────────┼────────┘
         │                       │
         ▼                       ▼
    ┌──────────────────────────────────┐
    │      Segmented Log Storage       │
    │                                  │
    │  Shard 0: [seg0][seg1][seg2]→    │ ← Process 0
    │           [comet.state]          │
    │  Shard 1: [seg0][seg1]→          │ ← Process 1
    │           [comet.state]          │
    │  Shard 2: [seg0][seg1][seg2]→    │ ← Process 0
    │           [comet.state]          │
    │  Shard 3: [seg0][seg1]→          │ ← Process 1
    │           [comet.state]          │
    │  ...                             │
    │                                  │
    │  ↓ segments deleted by retention │
    └──────────────────────────────────┘

Performance Optimizations

Comet achieves microsecond-level latency through careful optimization:

  1. Lock-Free Reads: Memory-mapped files and atomic pointers give readers zero-copy access without taking locks
