regimetry

Mapping latent regimes in financial time series.

MIT License · Python 3.11+ · Built with TensorFlow · Visualize with Dash · Managed with Poetry · Made With ❤️



📘 Overview

regimetry is a modular, unsupervised regime detection engine for financial time series — originally developed as a personal research project to explore latent structure and behavioral transitions in markets.

It combines transformer-based embeddings with clustering and regime structure analysis to help identify and label recurring phases such as trends, reversals, and volatility shifts.

While built for exploratory analysis, regimetry may evolve into a foundational component of my broader trading strategy stack.

⚙️ Tech Highlights:

  • Transformer encoder with positional encoding
  • Attention-based temporal modeling (windowed)
  • Spectral clustering on learned embeddings
  • Regime structure modeling via Markov transitions, stickiness, and entropy
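The regime-structure metrics above can be sketched in a few lines of numpy. This is an illustrative implementation, not the regimetry internals: the function name, the row-normalized transition estimate, and the base-2 entropy are all assumptions.

```python
import numpy as np

def regime_structure(labels: np.ndarray, n_regimes: int):
    """Estimate a Markov transition matrix, stickiness, and occupancy entropy
    from a sequence of regime labels."""
    counts = np.zeros((n_regimes, n_regimes))
    for a, b in zip(labels[:-1], labels[1:]):
        counts[a, b] += 1
    # Row-normalize transition counts; empty rows become uniform to stay stochastic.
    rows = counts.sum(axis=1, keepdims=True)
    trans = np.where(rows > 0, counts / np.maximum(rows, 1), 1.0 / n_regimes)
    stickiness = np.diag(trans)                      # P(stay in the same regime)
    occ = np.bincount(labels, minlength=n_regimes) / len(labels)
    entropy = -np.sum(occ[occ > 0] * np.log2(occ[occ > 0]))  # bits
    return trans, stickiness, entropy

labels = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 0])
trans, stick, H = regime_structure(labels, n_regimes=3)
```

High stickiness (large diagonal mass) indicates persistent regimes; high entropy indicates the series spreads its time across many regimes.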

🔍 What is a Regime?

In regimetry, a regime is a latent, temporally structured pattern in market behavior — characterized by combinations of volatility, trend strength, momentum shifts, and signal alignment. These are not defined by hand, but emerge from patterns discovered in the data.

Formally:

  • Regimes are clusters in the embedding space of overlapping market windows (e.g., 30 bars).
  • Each embedding is generated via a Transformer encoder that learns internal structure within each window using attention over time.
  • Spectral clustering then groups these embeddings into recurring behavioral states the market tends to revisit.

🧠 How It Works

1. Data Ingestion

  • Load daily bar data per instrument
  • Normalize features (Close, AHMA, LP, LC, etc.)
  • Features are typically sourced from ConvolutionLab,
    but regimetry is not dependent on that specific pipeline — any compatible feature set can be used.
  • Slice into overlapping windows (default: 30 bars, stride 1)
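The windowing step above can be sketched with numpy's `sliding_window_view`, which produces overlapping views without copying. The function name and the `(T, F)` feature-matrix layout are assumptions for illustration.

```python
import numpy as np

def make_windows(features: np.ndarray, window_size: int = 30, stride: int = 1):
    """Slice a (T, F) feature matrix into overlapping (N, window_size, F) windows."""
    if window_size > len(features):
        return np.empty((0, window_size, features.shape[1]))
    view = np.lib.stride_tricks.sliding_window_view(features, window_size, axis=0)
    # view has shape (T - window_size + 1, F, window_size); reorder to (N, W, F)
    return view.transpose(0, 2, 1)[::stride]

X = np.random.rand(100, 5)                    # 100 daily bars, 5 features
windows = make_windows(X, window_size=30, stride=1)
# windows.shape == (71, 30, 5): one window per bar from bar 30 onward
```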

2. Embedding Pipeline

  • Each rolling window is passed through a Transformer encoder that uses positional encoding to preserve temporal structure and self-attention to learn nonlinear dependencies within the window.
  • This produces a dense, contextualized embedding that reflects local market dynamics.
  • The architecture is modular and can be swapped with alternatives such as autoencoders, SimCLR, or CNN-based encoders.

3. Clustering

  • Standardize the embeddings
  • Cluster them using Spectral Clustering (or another method)
  • Assign each window a regime_id
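The clustering step above can be sketched with scikit-learn; `StandardScaler` and `SpectralClustering` stand in for regimetry's internals, and the synthetic blobs stand in for real transformer embeddings.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Stand-in for (N, D) window embeddings: three well-separated blobs in 16-D.
embeddings = np.vstack([rng.normal(c, 0.3, size=(40, 16)) for c in (-2.0, 0.0, 2.0)])

scaled = StandardScaler().fit_transform(embeddings)   # standardize the embeddings
regime_ids = SpectralClustering(
    n_clusters=3, affinity="nearest_neighbors", random_state=0
).fit_predict(scaled)
# regime_ids[i] is the regime label assigned to window i
```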

4. Visualization & Interpretation

  • Use t-SNE or UMAP to project embeddings
  • Visualize regime transitions over time
  • Map regimes back to chart or signal data for strategy insights
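The projection step can be sketched with scikit-learn's t-SNE; random embeddings stand in for real ones, and the parameter choices are illustrative.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(1)
embeddings = rng.normal(size=(80, 16))        # stand-in for (N, D) window embeddings
xy = TSNE(n_components=2, perplexity=10, random_state=1).fit_transform(embeddings)
# xy holds one 2-D point per window; color points by regime_id to see transitions
```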

🚀 Getting Started

See the full step-by-step guide: 📖 docs/GETTING_STARTED_README.md

Includes:

  • Git clone instructions
  • Poetry or manual install
  • Data ingestion
  • Embedding generation
  • Regime clustering
  • Optional Dash dashboard launch

📘 Regime Detection Window Delay

📄 See: docs/REGIME_DETECTION_README.md

Because regime labels are assigned per rolling window, a bar only receives a cluster ID once a complete window of data ending at that bar exists.

For example, with a window_size = 30:

  • The first 29 bars never receive a regime ID, since no complete 30-bar window ends within them
  • The labels on the most recent bars cannot reflect a regime change still in progress, since there are no forward windows yet to reclassify them

This introduces a natural lag in regime detection:

  • New regimes will only appear after enough time has passed for the model to “observe” a full window in the new market condition.

👉 For more details, see the full explanation: REGIME_DETECTION_README.md


📚 Documentation


📟 Command Line Usage

Run regimetry pipelines directly from the command line with optional overrides.

🔹 Ingest Data

python launch_host.py ingest \
  --signal-input-path examples/EUR_USD_processed_signals.csv

This will:

  • Parse the input CSV
  • Normalize and structure features
  • Save the result to artifacts/data/processed/

🔹 Generate Embeddings

python launch_host.py embed \
  --signal-input-path examples/EUR_USD_processed_signals.csv \
  --output-name EUR_USD_embeddings.npy \
  --window-size 30 \
  --stride 1 \
  --encoding-method sinusoidal \
  --encoding-style interleaved

This will:

  • Apply a rolling window (default: 30 bars, stride: 1 unless overridden)
  • Use positional encoding and Transformer to generate embeddings
  • Save the result to embeddings/EUR_USD_embeddings.npy

⚠️ Note: Ensure that window_size is smaller than your dataset length. If window_size >= len(data), no embeddings will be produced.



🛠 Available CLI Arguments for embed

| Argument | Description |
| --------------------- | ------------------------------------------------------------------------------- |
| `--signal-input-path` | Path to the CSV file with feature-enriched signal data |
| `--output-name` | Optional output file name for the `.npy` embeddings (default: `embeddings.npy`) |
| `--window-size` | Number of time steps per rolling window (default: 30) |
| `--stride` | Step size between rolling windows (default: 1) |
| `--encoding-method` | Positional encoding method: `sinusoidal` (default) or `learnable` |
| `--encoding-style` | Sinusoidal encoding format: `interleaved` (default) or `stacked` |
| `--embedding-dim` | Embedding dimension to use for both sinusoidal and learnable encodings |
| `--config` | Optional YAML config path to override pipeline settings |
| `--debug` | Enable debug logging |

ℹ️ Note: --embedding-dim applies to both sinusoidal and learnable encodings. For sinusoidal, it sets the generated frequency embedding size. For learnable, it defines the trainable positional embedding dimension.
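The sinusoidal option selected by `--encoding-method`, and the two `--encoding-style` layouts, can be sketched as below. This follows the standard Transformer formulation; the exact layout regimetry uses is an assumption.

```python
import numpy as np

def sinusoidal_pe(seq_len: int, dim: int, style: str = "interleaved"):
    """Sinusoidal positional encoding of shape (seq_len, dim)."""
    pos = np.arange(seq_len)[:, None]                          # (seq_len, 1)
    freqs = np.exp(-np.log(10000.0) * np.arange(dim // 2) / (dim // 2))
    angles = pos * freqs[None, :]                              # (seq_len, dim/2)
    if style == "interleaved":                                 # sin, cos, sin, cos, ...
        pe = np.zeros((seq_len, dim))
        pe[:, 0::2], pe[:, 1::2] = np.sin(angles), np.cos(angles)
    else:                                                      # "stacked": all sins, then all cos
        pe = np.concatenate([np.sin(angles), np.cos(angles)], axis=1)
    return pe

pe = sinusoidal_pe(seq_len=30, dim=64)    # dim would come from --embedding-dim
```

Both styles encode the same frequencies; they differ only in how the sine and cosine channels are laid out along the embedding dimension.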

🔹 Cluster Regimes

python launch_host.py cluster \
  --embedding-path embeddings/EUR_USD_embeddings.npy \
  --regime-data-path data/processed/regime_input.csv \
  --output-dir reports/EUR_USD \
  --window-size 30 \
  --n-clusters 3

This will:

  • Load precomputed transformer embeddings
  • Apply spectral clustering to assign regime IDs
  • Align cluster labels with the original time-series data
