eXMC
Probabilistic programming on the BEAM, inference with a pulse
"One does not translate a minor work. One translates the work that would not leave one alone." — Translator's Foreword
A PPL environment on the BEAM, inspired by PyMC. eXMC is a from-scratch Elixir implementation of PyMC's architecture: declarative model specification, automatic constraint transforms, NUTS sampling, and Bayesian diagnostics — all on Nx tensors with optional EXLA acceleration.
With deep respect: this project builds on the ideas, rigor, and ergonomics pioneered by the PyMC community. The goal is not to replace PyMC. The goal is to preserve correctness and usability while exploring what changes when the runtime is the BEAM.

Why A New PPL Environment?
PyMC established a high bar for statistical correctness, extensibility, and user experience. eXMC asks a focused question:
What happens if that architecture runs on a fault-tolerant, massively concurrent runtime?
The BEAM gives us lightweight processes, isolation, and message passing. That changes how we think about multi-chain sampling, streaming diagnostics, and observability. eXMC keeps PyMC's model semantics and diagnostics philosophy, while rethinking execution.
What We Preserve From PyMC
- Model semantics and ergonomics: declarative RVs, clear constraints, sensible defaults.
- Statistical correctness: NUTS with Stan-style three-phase warmup, ESS/R-hat, WAIC/LOO.
- Composable diagnostics: traces, energy, autocorrelation, and predictive checks.
What The BEAM Enables
- Concurrency without copies. Four chains are four lightweight processes sharing one compiled model. No `cloudpickle`, no `multiprocessing.Pool`, no four copies of the interpreter. `Task.async_stream` dispatches them across all cores.
- Per-sample streaming. `sample_stream/4` sends each posterior sample as a message to any BEAM process: a Scenic window, a Phoenix LiveView, a GenServer computing running statistics. Nutpie sets the standard for live MCMC UX with rich terminal progress bars (per-chain draws, divergences, step size, gradients/draw), `blocking=False` with pause/resume/abort, and access to incomplete traces. eXMC takes a different approach: instead of a built-in terminal display, it streams individual samples as BEAM messages, composing with whatever visualization layer you choose, whether Scenic for native desktop, Phoenix LiveView for the browser, or a custom GenServer for online statistics.
- Fault isolation. A chain that hits a numerical singularity (NaN gradient, EXLA crash, memory fault) is caught and replaced with a divergent placeholder. The other chains keep running. The supervisor tree doesn't care.
- Distribution as a language primitive. `Distributed.sample_chains/2` sends model IR to remote `:peer` nodes via `:erpc`. Each node compiles independently (heterogeneous hardware). If a node dies, the chain retries on the coordinator automatically. Zero external infrastructure.
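As a sketch of the streaming-consumer pattern: a plain GenServer can fold `{:exmc_sample, point, stat}` messages into a running mean. The `RunningStats` module and its incremental-mean update are illustrative, not part of eXMC; any process that eXMC streams to could look like this.

```elixir
defmodule RunningStats do
  # Illustrative streaming consumer (not part of eXMC): maintains a
  # running mean of one posterior variable as `{:exmc_sample, point, stat}`
  # messages arrive. Pure stdlib; works with any process that sends
  # messages in that shape.
  use GenServer

  def start_link(var), do: GenServer.start_link(__MODULE__, var)
  def mean(pid), do: GenServer.call(pid, :mean)

  @impl true
  def init(var), do: {:ok, %{var: var, n: 0, mean: 0.0}}

  @impl true
  def handle_info({:exmc_sample, point, _stat}, %{var: var, n: n, mean: m} = s) do
    x = point[var]
    n = n + 1
    # Incremental mean update: m_n = m_{n-1} + (x - m_{n-1}) / n
    {:noreply, %{s | n: n, mean: m + (x - m) / n}}
  end

  @impl true
  def handle_call(:mean, _from, %{mean: m, n: n} = s), do: {:reply, {m, n}, s}
end
```

Pointing `sample_stream/4` at such a process (instead of `self()`) gives online statistics with no trace buffering.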
Performance
Seven-model benchmark against PyMC (1-chain, 5-seed medians, 1000 warmup + 1000 draws):
| Model | PyMC ESS/s | eXMC ESS/s | Ratio | Winner |
|-------|------------|------------|-------|--------|
| simple (d=2) | 576 | 469 | 0.81x | PyMC |
| medium (d=5) | 157 | 298 | 1.90x | eXMC |
| stress (d=8) | 185 | 215 | 1.16x | eXMC |
| eight_schools (d=10) | 5 | 12 | 2.55x | eXMC |
| funnel (d=10) | 6 | 2 | 0.40x | PyMC |
| logistic (d=21) | 336 | 69 | 0.21x | PyMC |
| sv (d=102) | 1 | 1 | 1.20x | eXMC |
eXMC wins 4 models to PyMC's 3, including the canonical Eight Schools benchmark (2.55x) and 102-dimensional stochastic volatility (1.20x). PyMC wins on throughput-bound models where compiled C++ per-step speed dominates. eXMC wins on adaptation-bound models where posterior geometry is hard.
With 5-node distribution, eXMC achieves 2.88x average scaling:
| Model | 1-chain ESS/s | 5-node ESS/s | PyMC 4-chain ESS/s | Dist vs PyMC |
|-------|---------------|--------------|--------------------|--------------|
| medium | 271 | 841 | 680 | 1.24x eXMC |
| funnel | 1.6 | 5.4 | 4.1 | 1.32x eXMC |
Quick Start
```elixir
alias Exmc.{Builder, Dist.Normal, Dist.HalfNormal}

# Define a hierarchical model
ir =
  Builder.new_ir()
  |> Builder.rv("mu", Normal, %{mu: Nx.tensor(0.0), sigma: Nx.tensor(5.0)})
  |> Builder.rv("sigma", HalfNormal, %{sigma: Nx.tensor(2.0)})
  |> Builder.rv("x", Normal, %{mu: "mu", sigma: "sigma"})
  |> Builder.obs("x_obs", "x",
    Nx.tensor([2.1, 1.8, 2.5, 2.0, 1.9, 2.3, 2.2, 1.7, 2.4, 2.6])
  )

# Sample
{trace, stats} = Exmc.NUTS.Sampler.sample(ir,
  %{"mu" => 2.0, "sigma" => 1.0},
  num_samples: 1000, num_warmup: 500
)

# Posterior mean
Nx.mean(trace["mu"]) |> Nx.to_number()
# => ~2.1
```
DSL Syntax
```elixir
use Exmc.DSL

ir =
  model do
    rv("mu", Normal, %{mu: Nx.tensor(0.0), sigma: Nx.tensor(5.0)})
    rv("sigma", HalfNormal, %{sigma: Nx.tensor(2.0)})
    rv("x", Normal, %{mu: "mu", sigma: "sigma"})
    obs("x_obs", "x", Nx.tensor([2.1, 1.8, 2.5]))
  end
```
Multi-Chain
```elixir
# Parallel chains — compile once, run on all cores
{traces, stats_list} = Exmc.NUTS.Sampler.sample_chains(ir, 4,
  init_values: %{"mu" => 2.0, "sigma" => 1.0}
)
```
Streaming
```elixir
# Stream samples to any process — LiveView, GenServer, Scenic window
Exmc.NUTS.Sampler.sample_stream(ir, self(), %{"mu" => 2.0, "sigma" => 1.0},
  num_warmup: 500, num_samples: 1000
)

# Receive samples as they arrive
receive do
  {:exmc_sample, point, _stat} -> IO.inspect(point["mu"])
end
```
Distributed
```elixir
# Spawn peer nodes and sample across them — zero infrastructure
{:ok, _pid, node1} = :peer.start_link(%{name: :worker1})
{:ok, _pid, node2} = :peer.start_link(%{name: :worker2})

{traces, stats_list} = Exmc.NUTS.Distributed.sample_chains(ir,
  nodes: [node(), node1, node2],
  init_values: %{"mu" => 2.0, "sigma" => 1.0}
)

# Node dies? Chain retries on coordinator automatically.
```
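The retry-on-coordinator behaviour can be sketched in a few lines. The `RetryOnCoordinator` module below is hypothetical, not eXMC's actual implementation: attempt the work on the remote node via `:erpc`, and fall back to running it locally if the call fails.

```elixir
defmodule RetryOnCoordinator do
  # Sketch (not the library's real code): run `fun` on a remote node via
  # :erpc; if the node is unreachable or the call crashes, `:erpc.call`
  # raises (e.g. with {:erpc, :noconnection}), and we rerun the work
  # locally on the coordinator instead.
  def run(node, fun) do
    try do
      :erpc.call(node, fun)
    catch
      _kind, _reason -> fun.()
    end
  end
end
```

Because the model IR is plain data and the closure is pure, rerunning locally is semantically identical to the remote run, modulo hardware speed.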
Inference Methods
| Method | Module | Use Case |
|--------|--------|----------|
| NUTS | Exmc.NUTS.Sampler | Gold standard. Stan-style three-phase warmup, multinomial trajectory sampling, rho-based U-turn criterion |
| ADVI | Exmc.ADVI | Fast approximate posterior. Mean-field normal in unconstrained space, stochastic gradient ELBO |
| SMC | Exmc.SMC | Multimodal posteriors. Likelihood tempering with Metropolis-Hastings transitions |
| Pathfinder | Exmc.Pathfinder | L-BFGS path toward mode with diagonal normal fit at each step. Fast initialization for NUTS |
Distributions
| Distribution | Support | Transform | Params |
|-------------|---------|-----------|--------|
| Normal | R | none | mu, sigma |
| HalfNormal | R+ | :log | sigma |
| Exponential | R+ | :log | rate |
| Gamma | R+ | :softplus | alpha, beta |
| Beta | (0,1) | :logit | alpha, beta |
| Uniform | (a,b) | :logit | low, high |
| StudentT | R | none | nu, mu, sigma |
| Cauchy | R | none | mu, sigma |
| LogNormal | R+ | :log | mu, sigma |
| Laplace | R | none | mu, b |
| MvNormal | R^d | none | mu (vector), cov (matrix) |
| GaussianRandomWalk | R^T | none | sigma |
| Dirichlet | Δ^K (simplex) | :stick_breaking | alpha (vector) |
| Custom | any | user-defined | user-defined closure |
| Mixture | any | component-based | weights, components |
| Censored | any | wraps base dist | dist, lower, upper |
Key Features
- Automatic Non-Centered Parameterization. Hierarchical Normals where both `mu` and `sigma` are parent references are rewritten to `z ~ N(0, 1)` with `x = mu + sigma * z`. Disable with `ncp: false` when data is informative.
- EXLA auto-detection. When EXLA is available, `value_and_grad` is JIT-compiled. Falls back to BinaryBackend transparently. GPU via `device: :cuda`.
- Vectorized observations. Pass `Nx.tensor([...])` to `Builder.obs` — reduction is handled automatically. No need to create one RV per data point.
- Model comparison. WAIC and LOO-CV via `Exmc.ModelComparison.compare/1`.
- Prior and posterior predictive. `Exmc.Predictive.prior_samples/2` and `posterior_predictive/2` for model checking.
- Custom distributions. `Exmc.Dist.Custom` takes a `logpdf` closure — any differentiable density. Used for Bernoulli likelihoods, random walk models, and domain-specific densities.
- Fault-tolerant tree building. Four layers: IEEE 754 NaN/Inf detection, subtree early termination, trajectory-level divergence tracking, process-level crash recovery via `try/rescue`.
- Deterministic seeding. Erlang `:rand` with explicit state threading. Every chain is reproducible given `{seed, tuning_params, ir}`.
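The non-centered rewrite rests on an affine change of variables: if `z ~ N(0, 1)` and `x = mu + sigma * z`, then `log N(x | mu, sigma) = log N(z | 0, 1) - log(sigma)`, the `-log(sigma)` term being the log-Jacobian. A pure-Elixir numeric check (the `NcpCheck` module is illustrative only, no Exmc calls) confirms the identity:

```elixir
defmodule NcpCheck do
  # Standard Normal log-density: -0.5*log(2*pi) - log(sigma) - 0.5*((x-mu)/sigma)^2
  def log_normal(x, mu, sigma) do
    -0.5 * :math.log(2 * :math.pi()) - :math.log(sigma) -
      0.5 * :math.pow((x - mu) / sigma, 2)
  end

  # Verify: centered density of x = mu + sigma*z equals the standard-normal
  # density of z minus the log-Jacobian log(sigma).
  def check(mu, sigma, z) do
    x = mu + sigma * z
    centered = log_normal(x, mu, sigma)
    non_centered = log_normal(z, 0.0, 1.0) - :math.log(sigma)
    abs(centered - non_centered) < 1.0e-12
  end
end
```

Sampling in `z`-space and deterministically reconstructing `x` is what breaks the funnel-shaped correlation between `sigma` and `x` that defeats centered NUTS.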
Architecture
```
Builder.new_ir()                        # 1. Declare
|> Builder.rv("mu", Normal, params)     #    your model
|> Builder.rv("sigma", HalfNormal, ...) #    as an IR graph
|> Builder.obs("y", "x", data)          #
                                        #
Rewrite.run(ir, passes)                 # 2. Rewrite passes:
#    affine -> meas_obs                 #    NCP, measurable ops,
#    non-centered parameterization      #    constraint transforms
                                        #
Compiler.compile_for_sampling(ir)       # 3. Compile to:
# => {vag_fn, step_fn, pm, ncp_info}    #    logp + gradient closure
                                        #    (EXLA JIT when available)
                                        #
Sampler.sample(ir, init, opts)          # 4. NUTS with Stan-style
# => {trace, stats}                     #    three-phase warmup
```
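To make step 3 concrete, here is a deliberately tiny stand-in for the compiler. `ToyCompiler` is hypothetical; the real `Compiler.compile_for_sampling/1` also emits gradients and applies constraint transforms. It folds a minimal "IR", just a list of `{name, mu, sigma}` Normal specs, into one logp closure over a map of values:

```elixir
defmodule ToyCompiler do
  # Normal log-density used by every spec in the toy IR.
  def log_normal(x, mu, sigma) do
    -0.5 * :math.log(2 * :math.pi()) - :math.log(sigma) -
      0.5 * :math.pow((x - mu) / sigma, 2)
  end

  # "Compile" the IR: return a closure that sums the log-densities of all
  # specs evaluated at a point, i.e. the joint logp of independent Normals.
  def compile(specs) do
    fn point ->
      Enum.reduce(specs, 0.0, fn {name, mu, sigma}, acc ->
        acc + log_normal(point[name], mu, sigma)
      end)
    end
  end
end
```

The closure is what a sampler needs: a function from a point in parameter space to a joint log-density, compiled once and shared by every chain process.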
Four layers, each a clean boundary:
| Layer | Modules | Responsibility |
|-------|---------|----------------|
| IR | Builder, DSL, IR, Node, Dist.* | Model as data. 16 distributions (3 vectorized) |
