
☄️ Comet

High-performance embedded segmented log for edge observability. Built for single-digit microsecond latency and bounded resources.

Architecture Guide | Performance Guide | Troubleshooting | Security | API Reference

[!NOTE] This is very much an experiment in vibe coding. The ideas are sound and the test coverage is robust, but keep that in mind before depending on it in production for now.

What is Comet?

Comet is a segmented append-only log optimized for observability data (metrics, logs, traces) at edge locations. It implements the same pattern as Kafka's storage engine - append-only segments with time/size-based retention - but embedded directly in your service with aggressive deletion policies for resource-constrained environments.

Each shard maintains a series of immutable segment files that are rotated at size boundaries and deleted based on retention policies, ensuring predictable resource usage without the complexity of circular buffers or in-place overwrites.

Comet requires local filesystems (ext4, xfs, etc.) for its microsecond latency guarantees. It is unapologetically local. If you need distributed storage, use a proper distributed system like NATS JetStream or Kafka instead.

The Edge Storage Problem

Edge deployments need local observability buffering, but other solutions fall short:

  • Kafka: Requires clusters, complex ops, ~1-5ms latency
  • RocksDB: Single-threaded writes, 50-200μs writes, 100ms+ during compaction stalls
  • Redis: Requires separate server, memory-only without persistence config
  • Ring buffers: No persistence, no compression, data loss on overflow
  • Files + rotation: No indexing, no consumer tracking, manual everything

The gap: No embedded solution with Kafka's reliability at microsecond latencies.

Features

  • High throughput at ultra-low latency: up to 12,634,200 entries/sec with batching (2.4M ops/sec with optimal sharding)
    • Comet uses periodic checkpoints (default: every 1000 writes or 1 second) to persist data to disk. Between checkpoints, writes are acknowledged after being written to the OS page cache.
  • Predictable performance: No compaction stalls or write amplification like LSM-trees
  • True multi-process support: Hybrid coordination (mmap + file locks), crash-safe rotation, real OS processes
  • O(log n) lookups: Binary searchable index with bounded memory usage
  • Lock-free reads: Atomic pointers, zero-copy via mmap with memory safety
  • Automatic retention: Time and size-based cleanup, protects unconsumed data
  • Production ready: Crash recovery, built-in metrics, extensive testing
  • Smart sharding: Consistent hashing, automatic discovery, batch optimizations
  • Optional zstd compression: ~37% storage savings when needed

Multi-Process Coordination

Unlike other embedded solutions, Comet enables true multi-process coordination through memory-mapped state files. Perfect for prefork web servers and multi-process deployments.

  • Automatic shard ownership - Each process owns specific shards based on shardID % processCount == processID
  • Per-shard state files - Each shard has its own comet.state for metrics and recovery
  • Memory-mapped coordination - Lock-free operations through atomic memory access
  • Crash-safe design - State files enable automatic recovery on restart

How Does Comet Compare?

| Feature          | Comet                      | Kafka               | Redis Streams      | RocksDB            | Proof      |
| ---------------- | -------------------------- | ------------------- | ------------------ | ------------------ | ---------- |
| Write Latency    | 1.7μs (33μs multi-process) | 1-5ms               | 50-100μs           | 50-200μs           | Benchmarks |
| Multi-Process    | ✅ Real OS processes       | ✅ Distributed      | ❌ Single process  | ⚠️ Mutex locks     | Tests      |
| Resource Bounds  | ✅ Time & size limits      | ⚠️ JVM heap         | ⚠️ Memory only     | ⚠️ Manual compact  | Retention  |
| Crash Recovery   | ✅ Automatic               | ✅ Replicas         | ⚠️ AOF/RDB         | ✅ WAL             | Recovery   |
| Zero Copy Reads  | ✅ mmap                    | ❌ Network          | ❌ Serialization   | ❌ Deserialization | Reader     |
| Storage Overhead | ~12 bytes/entry            | ~50 bytes/entry     | ~20 bytes/entry    | ~30 bytes/entry    | Format     |
| Sharding         | ✅ Built-in                | ✅ Partitions       | ❌ Manual          | ❌ Manual          | Client     |
| Compression      | ✅ Optional zstd           | ✅ Multiple codecs  | ❌ None            | ✅ Multiple        | Config     |
| Embedded         | ✅ Native                  | ❌ Requires cluster | ❌ Requires server | ✅ Native          | -          |

Quick Start

The Easy Way™

Step 1: Create a client

client, err := comet.NewClient("/var/lib/comet")
defer client.Close()

Step 2: Write your events

// Pick which shard to write to based on a key (for consistent routing)
// High-cardinality keys (e.g. a UUID) are recommended for even distribution across shards
stream := client.PickShardStream("events:v1", event.ID, 256)
// This returns something like "events:v1:shard:00A7" based on hash(event.ID) % 256

ids, err := client.Append(ctx, stream, [][]byte{
    []byte(event.ToJSON()),
})

Step 3: Process events

consumer := comet.NewConsumer(client, comet.ConsumerOptions{
    Group: "my-processor",
})

// Process() is the main API - it handles everything for you!
// By default, it discovers and processes ALL shards automatically
err = consumer.Process(ctx, func(ctx context.Context, messages []comet.StreamMessage) error {
    for _, msg := range messages {
        processEvent(msg.Data)  // Your logic here
    }
    return nil  // Success = automatic progress tracking
})

That's it! Comet handles:

  • Compression - Large events compressed automatically
  • Sharding - Load distributed across 16 shards
  • Retries - Failed batches retry automatically
  • Progress - Consumer offsets tracked per shard
  • Cleanup - Old data deleted automatically
  • Recovery - Crash? Picks up where it left off

Want more control? Scale horizontally:

// Deploy this same code across 3 processes:
err = consumer.Process(ctx, processEvents,
    comet.WithStream("events:v1:shard:*"),
    comet.WithConsumerAssignment(workerID, numWorkers),  // This worker + total count
)
// Each worker processes different shards automatically!
// No coordination needed - Comet handles it

Production-Ready Example

// Define your processing function with context support
processEvents := func(ctx context.Context, messages []comet.StreamMessage) error {
    for _, msg := range messages {
        // Check for cancellation
        if ctx.Err() != nil {
            return ctx.Err()
        }
        // Process each message
        if err := handleEvent(msg.Data); err != nil {
            return err // Will trigger retry
        }
    }
    return nil
}

err = consumer.Process(ctx, processEvents,
    comet.WithStream("events:v1:shard:*"),
    comet.WithBatchSize(1000),
    comet.WithPollInterval(50 * time.Millisecond),

    // Optional: Add observability
    comet.WithErrorHandler(func(err error, retryCount int) {
        metrics.Increment("comet.errors", 1)
        log.Printf("Retry %d: %v", retryCount, err)
    }),
    comet.WithBatchCallback(func(size int, duration time.Duration) {
        metrics.Histogram("comet.batch.size", float64(size))
        metrics.Histogram("comet.batch.duration_ms", float64(duration.Milliseconds()))
    }),
)

Need to tweak something?

// Only override what you need:
config := comet.DefaultCometConfig()
config.Retention.MaxAge = 24 * time.Hour  // Keep data longer
client, err := comet.NewClient("/var/lib/comet", config)

Configuration Structure

type CometConfig struct {
    Compression CompressionConfig  // Controls compression behavior
    Indexing    IndexingConfig     // Controls indexing and lookup
    Storage     StorageConfig      // Controls file storage
    Concurrency ConcurrencyConfig  // Controls multi-process behavior
    Retention   RetentionConfig    // Controls data retention
}

Architecture

┌─────────────────┐     ┌─────────────────┐
│   Your Service  │     │  Your Service   │
│    Process 0    │     │    Process 1    │
│  ┌───────────┐  │     │  ┌───────────┐  │
│  │   Comet   │  │     │  │   Comet   │  │
│  │  Client   │  │     │  │  Client   │  │
│  └─────┬─────┘  │     │  └─────┬─────┘  │
└────────┼────────┘     └────────┼────────┘
         │                       │
         ▼                       ▼
    ┌──────────────────────────────────┐
    │      Segmented Log Storage       │
    │                                  │
    │  Shard 0: [seg0][seg1][seg2]→    │ ← Process 0
    │           [comet.state]          │
    │  Shard 1: [seg0][seg1]→          │ ← Process 1
    │           [comet.state]          │
    │  Shard 2: [seg0][seg1][seg2]→    │ ← Process 0
    │           [comet.state]          │
    │  Shard 3: [seg0][seg1]→          │ ← Process 1
    │           [comet.state]          │
    │  ...                             │
    │                                  │
    │  ↓ segments deleted by retention │
    └──────────────────────────────────┘

Performance Optimizations

Comet achieves microsecond-level latency through careful optimization:

  1. Lock-Free Reads: Memory-mapped files and atomic pointers give readers zero-copy access without taking locks
