Snakepit
High-performance, generalized process pooler and session manager for external language integrations. Orchestrates and supervises languages like Python and JavaScript from Elixir.
<div align="center"> <img src="assets/snakepit-logo.svg" alt="Snakepit Logo" width="200" height="200"> </div>
Features
- High-performance process pooling with concurrent worker initialization
- Session affinity for stateful operations across requests (hint by default, strict modes available)
- gRPC streaming for real-time progress updates and large data transfers
- Bidirectional tool bridge allowing Python to call Elixir functions and vice versa
- Production-ready process management with automatic orphan cleanup
- Hardware detection for ML accelerators (CUDA, MPS, ROCm)
- Fault tolerance with circuit breakers, retry policies, and crash barriers
- Comprehensive telemetry with OpenTelemetry support
- Dual worker profiles (process isolation or threaded parallelism)
- Zero-copy data interop via DLPack and Arrow
Installation
Add snakepit to your dependencies in mix.exs:
def deps do
[
{:snakepit, "~> 0.13.0"}
]
end
Then run:
mix deps.get
mix snakepit.setup # Install Python dependencies and generate gRPC stubs
mix snakepit.doctor # Verify environment is correctly configured
Using with SnakeBridge (Recommended)
For higher-level Python integration with compile-time type generation, use SnakeBridge instead of snakepit directly. SnakeBridge handles Python environment setup automatically at compile time.
def deps do
[{:snakebridge, "~> 0.16.0"}]
end
def project do
[
...
compilers: [:snakebridge] ++ Mix.compilers()
]
end
Quick Start
# Execute a command on any available worker
{:ok, result} = Snakepit.execute("ping", %{})
# Execute with session affinity (prefer the same worker for related requests)
{:ok, result} = Snakepit.execute_in_session("session_123", "process_data", %{input: data})
# Stream results for long-running operations
Snakepit.execute_stream("batch_process", %{items: items}, fn chunk ->
IO.puts("Progress: #{chunk["progress"]}%")
end)
Configuration
Simple Configuration
# config/config.exs
config :snakepit,
pooling_enabled: true,
adapter_module: Snakepit.Adapters.GRPCPython,
adapter_args: ["--adapter", "your_adapter_module"],
pool_size: 10,
log_level: :error
In legacy single-pool mode, if both top-level :pool_size and
pool_config.pool_size are configured, the top-level :pool_size value wins.
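For example, with both values set, the top-level key takes effect:

```elixir
# config/config.exs — legacy single-pool mode
config :snakepit,
  pool_size: 10,                   # top-level value: this one wins
  pool_config: %{pool_size: 4}     # ignored when top-level :pool_size is set
```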
Multi-Pool Configuration (v0.6+)
config :snakepit,
pools: [
%{
name: :default,
worker_profile: :process,
pool_size: 10,
adapter_module: Snakepit.Adapters.GRPCPython,
adapter_args: ["--adapter", "my_app.adapters.MainAdapter"]
},
%{
name: :compute,
worker_profile: :thread,
pool_size: 4,
threads_per_worker: 8,
adapter_args: ["--adapter", "my_app.adapters.ComputeAdapter"]
}
]
If your adapter defines command_timeout/2, timeout selection is resolved from the
checked-out worker's pool adapter_module. The global :adapter_module value is
used only as a fallback when a pool does not declare one.
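As a sketch of per-pool timeout policy (assuming the optional `command_timeout/2` callback receives the command name and its args and returns a timeout in milliseconds; the module and command names here are hypothetical):

```elixir
defmodule MyApp.Adapters.ComputeAdapter do
  # Hypothetical adapter for illustration. The callback name comes from the
  # docs above; the exact behaviour it belongs to is assumed.

  # Give a known long-running command a larger budget...
  def command_timeout("train_model", _args), do: 600_000

  # ...and fall back to a conservative default for everything else.
  def command_timeout(_command, _args), do: 30_000
end
```

Because resolution uses the checked-out worker's pool `adapter_module`, each pool can carry its own timeout policy even when a global `:adapter_module` is also configured.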
Logging Configuration
Snakepit is silent by default (errors only):
config :snakepit, log_level: :error # Default - errors only
config :snakepit, log_level: :info # Include info messages
config :snakepit, log_level: :debug # Verbose debugging
config :snakepit, log_level: :none # Complete silence
# Filter to specific categories
config :snakepit, log_level: :debug, log_categories: [:grpc, :pool]
gRPC Listener Configuration
By default, Snakepit runs an internal-only gRPC listener on an ephemeral port and publishes the assigned port to Python workers at runtime:
config :snakepit,
grpc_listener: %{
mode: :internal
}
Explicit external bindings are opt-in and require host/port configuration:
config :snakepit,
grpc_listener: %{
mode: :external,
host: "localhost",
bind_host: "0.0.0.0",
port: 50051
}
For multi-instance deployments sharing a host, use the pooled external mode:
config :snakepit,
grpc_listener: %{
mode: :external_pool,
host: "localhost",
bind_host: "0.0.0.0",
base_port: 50051,
pool_size: 32
}
To isolate process registry state when sharing a deployment directory, set an explicit instance name, instance token, and data directory:
config :snakepit,
instance_name: "my-app-a",
instance_token: "node-a-01",
data_dir: "/var/lib/snakepit"
instance_name identifies an environment (for example prod-us-east-1).
instance_token identifies one running instance inside that environment.
When running multiple Snakepit VMs from the same checkout or host at the same
time, each VM must use a unique instance_token so cleanup logic never targets
another live instance.
Environment variables are also supported:
SNAKEPIT_INSTANCE_NAME=my-app SNAKEPIT_INSTANCE_TOKEN=job_1 mix run --no-start script_a.exs
SNAKEPIT_INSTANCE_NAME=my-app SNAKEPIT_INSTANCE_TOKEN=job_2 mix run --no-start script_b.exs
Runtime Configurable Defaults
All hardcoded timeout and sizing values are now configurable via Application.get_env/3.
Values are read at runtime, allowing configuration changes without recompilation.
# config/runtime.exs - Example customization
config :snakepit,
# Timeouts (all in milliseconds)
default_command_timeout: 30_000, # Default timeout for commands
pool_request_timeout: 60_000, # Pool execute timeout
pool_streaming_timeout: 300_000, # Pool streaming timeout
pool_startup_timeout: 10_000, # Worker startup timeout
pool_queue_timeout: 5_000, # Queue timeout
checkout_timeout: 5_000, # Worker checkout timeout
grpc_worker_execute_timeout: 30_000, # GRPCWorker execute timeout
grpc_worker_stream_timeout: 300_000, # GRPCWorker streaming timeout
grpc_worker_health_check_timeout_ms: 5_000, # Periodic worker health RPC timeout
graceful_shutdown_timeout_ms: 6_000, # Python process shutdown timeout
# Pool sizing
pool_max_queue_size: 1000, # Max pending requests in queue
pool_max_workers: 150, # Maximum workers per pool
pool_startup_batch_size: 10, # Workers started per batch
pool_startup_batch_delay_ms: 500, # Delay between startup batches
# Pool recovery
pool_reconcile_interval_ms: 1_000, # Reconcile worker count interval (0 disables)
pool_reconcile_batch_size: 2, # Max workers respawned per tick
# Worker supervisor restart intensity
worker_starter_max_restarts: 3,
worker_starter_max_seconds: 5,
worker_supervisor_max_restarts: 3,
worker_supervisor_max_seconds: 5,
# Retry policy
retry_max_attempts: 3,
retry_backoff_sequence: [100, 200, 400, 800, 1600],
retry_max_backoff_ms: 30_000,
retry_jitter_factor: 0.25,
# Circuit breaker
circuit_breaker_failure_threshold: 5,
circuit_breaker_reset_timeout_ms: 30_000,
circuit_breaker_half_open_max_calls: 1,
# Crash barrier
crash_barrier_taint_duration_ms: 60_000,
crash_barrier_max_restarts: 1,
crash_barrier_backoff_ms: [50, 100, 200],
# Health monitor
health_monitor_check_interval: 30_000,
health_monitor_crash_window_ms: 60_000,
health_monitor_max_crashes: 10,
# Heartbeat
heartbeat_ping_interval_ms: 2_000,
heartbeat_timeout_ms: 10_000,
heartbeat_max_missed: 3,
# Session store
session_cleanup_interval: 60_000,
session_default_ttl: 3600,
session_max_sessions: 10_000,
session_warning_threshold: 0.8,
# gRPC listener
grpc_listener: %{mode: :internal},
grpc_internal_host: "127.0.0.1",
grpc_port_pool_size: 32,
grpc_listener_ready_timeout_ms: 5_000,
grpc_listener_port_check_interval_ms: 25,
grpc_listener_reuse_attempts: 3,
grpc_listener_reuse_wait_timeout_ms: 500,
grpc_listener_reuse_retry_delay_ms: 100,
grpc_num_acceptors: 20,
grpc_max_connections: 1000,
grpc_socket_backlog: 512
See Snakepit.Defaults module documentation for the complete list of configurable values.
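Since these defaults are read through `Application.get_env/3` at runtime, an override takes effect without recompiling — for example, from a remote console on a running node:

```elixir
# Raise the default command timeout on a live node.
Application.put_env(:snakepit, :default_command_timeout, 60_000)

# Subsequent reads observe the new value immediately.
Application.get_env(:snakepit, :default_command_timeout)
# => 60_000
```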
Core API
Basic Execution
# Simple command execution
{:ok, result} = Snakepit.execute("command_name", %{param: "value"})
# With timeout
{:ok, result} = Snakepit.execute("slow_command", %{}, timeout: 30_000)
# Target specific pool
{:ok, result} = Snakepit.execute("ml_inference", %{}, pool: :compute)
Session Affinity
Sessions route related requests to the same worker when possible, enabling stateful operations:
session_id = "user_#{user.id}"
# First call establishes worker affinity
{:ok, _} = Snakepit.execute_in_session(session_id, "load_model", %{model: "gpt-4"})
# Subsequent calls prefer the same worker
{:ok, result} = Snakepit.execute_in_session(session_id, "generate", %{prompt: "Hello"})
{:ok, result} = Snakepit.execute_in_session(session_id, "generate", %{prompt: "Continue"})
By default, affinity is a hint. If the preferred worker is busy or tainted, Snakepit can fall back to another worker. For strict pinning, configure affinity modes at the pool level:
config :snakepit,
pools: [
%{name: :default, pool_size: 4, affinity: :strict_queue},
%{name: :latency_sensitive, pool_size: 4, affinity: :strict_fail_fast}
]
- `:strict_queue` queues requests for the preferred worker while it is busy.
- `:strict_fail_fast` returns an error immediately rather than waiting or falling back to another worker.