mchmm
Markov Chains and Hidden Markov Models in Python
A Python package for statistical modeling with Markov chains and Hidden Markov models. Built on NumPy and SciPy, mchmm provides efficient implementations of core algorithms including Viterbi decoding and Baum-Welch parameter estimation. The package also includes visualization capabilities for understanding model structure and behavior.
Key Features
- Discrete Markov Chains: Build transition models from sequence data with automatic state inference
- Hidden Markov Models: Implement HMMs with customizable observation and state spaces
- Viterbi Algorithm: Find most likely state sequences for new observations
- Baum-Welch Algorithm: Learn HMM parameters from unlabeled sequence data
- Statistical Testing: Built-in chi-squared tests for model validation
- Visualization: Generate directed graphs of Markov models using Graphviz
- Simulation: Generate synthetic sequences from trained models
Donate
If you find this package useful, please consider donating any amount of money. This will help me spend more time on supporting open-source software.
<a href="https://www.buymeacoffee.com/maximtrp" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a>
Installation
Install from PyPI:
pip install mchmm
Or install from source:
git clone https://github.com/maximtrp/mchmm.git
cd mchmm
pip install . --user
Dependencies
- NumPy - Numerical computing
- SciPy - Scientific computing and statistics
- Graphviz - Graph visualization (optional)
Quick Start
Discrete Markov Chains
Initialize a Markov chain from a sequence of observed states:
import mchmm as mc
a = mc.MarkovChain().from_data('AABCABCBAAAACBCBACBABCABCBACBACBABABCBACBBCBBCBCBCBACBABABCBCBAAACABABCBBCBCBCBCBCBAABCBBCBCBCCCBABCBCBBABCBABCABCCABABCBABC')
Now, we can look at the observed transition frequency matrix:
a.observed_matrix
# array([[ 7., 18., 7.],
# [19., 5., 29.],
# [ 5., 30., 3.]])
And the observed transition probability matrix:
a.observed_p_matrix
# array([[0.21875 , 0.5625 , 0.21875 ],
# [0.35849057, 0.09433962, 0.54716981],
# [0.13157895, 0.78947368, 0.07894737]])
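The probability matrix is simply the frequency matrix with each row normalized to sum to 1. A quick NumPy check using the frequencies printed above:

```python
import numpy as np

# Observed transition frequencies from the example above
observed = np.array([[ 7., 18.,  7.],
                     [19.,  5., 29.],
                     [ 5., 30.,  3.]])

# Row-normalize: divide each row by its total outgoing count
p = observed / observed.sum(axis=1, keepdims=True)

# Every row of a transition probability matrix sums to 1
assert np.allclose(p.sum(axis=1), 1.0)
```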
You can visualize your Markov chain. First, build a directed graph with the graph_make() method of the MarkovChain object, then render() it:
graph = a.graph_make(
format="png",
graph_attr=[("rankdir", "LR")],
node_attr=[("fontname", "Roboto bold"), ("fontsize", "20")],
edge_attr=[("fontname", "Iosevka"), ("fontsize", "12")]
)
graph.render()
Here is the result: a directed graph whose nodes are the states A, B, and C, with edges labeled by transition probabilities (rendered image omitted here).
Pandas can help us annotate columns and rows:
import pandas as pd
pd.DataFrame(a.observed_matrix, index=a.states, columns=a.states, dtype=int)
# A B C
# A 7 18 7
# B 19 5 29
# C 5 30 3
Viewing the expected transition frequency matrix:
a.expected_matrix
# array([[ 8.06504065, 13.78861789, 10.14634146],
# [13.35772358, 22.83739837, 16.80487805],
# [ 9.57723577, 16.37398374, 12.04878049]])
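These are the frequencies you would expect if the next state were independent of the current one, as in a standard chi-squared contingency test: each cell is (row total × column total) / grand total. Reproducing the matrix with NumPy from the observed frequencies above:

```python
import numpy as np

observed = np.array([[ 7., 18.,  7.],
                     [19.,  5., 29.],
                     [ 5., 30.,  3.]])

# Expected counts under independence: outer product of the margins / total
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()
# expected[0, 0] = 32 * 31 / 123 ≈ 8.06504065, matching a.expected_matrix
```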
Calculating the Nth-order transition probability matrix:
a.n_order_matrix(a.observed_p_matrix, order=2)
# array([[0.2782854 , 0.34881028, 0.37290432],
# [0.1842357 , 0.64252707, 0.17323722],
# [0.32218957, 0.21081868, 0.46699175]])
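The Nth-order matrix is the transition matrix raised to the Nth power; entry (i, j) is then the probability of moving from state i to state j in exactly N steps. The result above can be reproduced with plain NumPy:

```python
import numpy as np

# First-order transition probabilities from the example above
p = np.array([[0.21875,    0.5625,     0.21875   ],
              [0.35849057, 0.09433962, 0.54716981],
              [0.13157895, 0.78947368, 0.07894737]])

p2 = np.linalg.matrix_power(p, 2)  # same as p @ p
# p2[0] ≈ [0.2782854, 0.34881028, 0.37290432], matching n_order_matrix
```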
Carrying out a chi-squared test:
a.chisquare(a.observed_matrix, a.expected_matrix, axis=None)
# Power_divergenceResult(statistic=47.89038802624337, pvalue=1.0367838347591701e-07)
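The same statistic can be reproduced directly with scipy.stats.chisquare, which appears to back this method (the returned Power_divergenceResult is SciPy's result type); axis=None flattens both matrices before computing the test:

```python
import numpy as np
from scipy.stats import chisquare

observed = np.array([[ 7., 18.,  7.],
                     [19.,  5., 29.],
                     [ 5., 30.,  3.]])
# Expected counts under independence, as in a contingency test
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

# axis=None treats the matrices as flat arrays of 9 cells
result = chisquare(observed, expected, axis=None)
```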
Finally, let's simulate a Markov chain given our data (NumPy is needed here for the seed argument):
import numpy as np
ids, states = a.simulate(10, start='A', seed=np.random.randint(0, 10, 10))
ids
# array([0, 2, 1, 0, 2, 1, 0, 2, 1, 0])
states
# array(['A', 'C', 'B', 'A', 'C', 'B', 'A', 'C', 'B', 'A'], dtype='<U1')
"".join(states)
# 'ACBACBACBA'
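As an aside, the chain's long-run (stationary) distribution is not shown above, but it can be computed from the transition matrix with plain NumPy: it is the left eigenvector of P associated with eigenvalue 1. A sketch using the probabilities from this example:

```python
import numpy as np

P = np.array([[0.21875,    0.5625,     0.21875   ],
              [0.35849057, 0.09433962, 0.54716981],
              [0.13157895, 0.78947368, 0.07894737]])

# Left eigenvector of P = eigenvector of P.T; eigenvalue 1 is the largest
# eigenvalue of a stochastic matrix
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi /= pi.sum()  # normalize to a probability distribution

# pi is unchanged by one more transition step
assert np.allclose(pi @ P, pi)
```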
Hidden Markov Models
Build HMMs from paired observation and state sequences. This example uses a DNA fragment with TATA box annotation:
import mchmm as mc
obs_seq = 'AGACTGCATATATAAGGGGCAGGCTG'
sts_seq = '00000000111111100000000000'
a = mc.HiddenMarkovModel().from_seq(obs_seq, sts_seq)
Unique states and observations are automatically inferred:
a.states
# ['0' '1']
a.observations
# ['A' 'C' 'G' 'T']
The transition probability matrix for all states can be accessed via the tp attribute:
a.tp
# [[0.94444444 0.05555556]
# [0.14285714 0.85714286]]
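These probabilities come straight from transition counts in sts_seq; you can verify them by counting consecutive state pairs:

```python
from collections import Counter

sts_seq = '00000000111111100000000000'

# Count each (current state, next state) pair
pairs = Counter(zip(sts_seq, sts_seq[1:]))

from_0 = pairs[('0', '0')] + pairs[('0', '1')]  # transitions leaving state 0
# pairs[('0', '0')] / from_0 = 17/18 ≈ 0.9444, matching a.tp[0, 0]
```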
The ep attribute holds the emission probability matrix over all states and observations:
a.ep
# [[0.21052632 0.21052632 0.47368421 0.10526316]
# [0.57142857 0. 0. 0.42857143]]
Converting the emission matrix to a Pandas DataFrame:
import pandas as pd
pd.DataFrame(a.ep, index=a.states, columns=a.observations)
# A C G T
# 0 0.210526 0.210526 0.473684 0.105263
# 1 0.571429 0.000000 0.000000 0.428571
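Emission probabilities can likewise be checked by counting which symbols were emitted while in each state:

```python
from collections import Counter

obs_seq = 'AGACTGCATATATAAGGGGCAGGCTG'
sts_seq = '00000000111111100000000000'

# Count each (state, emitted symbol) pair
counts = Counter(zip(sts_seq, obs_seq))

n0 = sum(v for (s, o), v in counts.items() if s == '0')  # symbols seen in state 0
# counts[('0', 'G')] / n0 = 9/19 ≈ 0.4737, matching the G column for state 0
```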
A directed graph of the hidden Markov model (rendered image omitted here) can be built with the graph_make method of the HiddenMarkovModel object and then rendered:
graph = a.graph_make(
format="png", graph_attr=[("rankdir", "LR"), ("ranksep", "1"), ("rank", "same")]
)
graph.render()
Viterbi Algorithm
Decode the most likely state sequence for new observations:
new_obs = "GGCATTGGGCTATAAGAGGAGCTTG"
vs, vsi = a.viterbi(new_obs)
# states sequence
print("VI", "".join(vs))
# observations
print("NO", new_obs)
# VI 0000000001111100000000000
# NO GGCATTGGGCTATAAGAGGAGCTTG
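For intuition, the core Viterbi recursion can be sketched in log space with NumPy. This is a generic illustration, not mchmm's internal code; observation symbols are assumed to be mapped to integer indices first:

```python
import numpy as np

def viterbi_path(obs_idx, tp, ep, pi):
    """Most likely state sequence for observation indices obs_idx,
    given transition (tp), emission (ep), and initial (pi) probabilities."""
    n_states, T = tp.shape[0], len(obs_idx)
    with np.errstate(divide='ignore'):      # log(0) -> -inf is fine here
        ltp, lep, lpi = np.log(tp), np.log(ep), np.log(pi)

    delta = np.zeros((T, n_states))         # best log-score ending in each state
    psi = np.zeros((T, n_states), int)      # backpointers
    delta[0] = lpi + lep[:, obs_idx[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + ltp   # scores[i, j]: from i to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + lep[:, obs_idx[t]]

    # Backtrack from the best final state
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]
```

With sticky self-transitions and near-diagonal emissions, the decoded path simply tracks the observations, which makes the recursion easy to sanity-check on a toy model.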
Baum-Welch Algorithm
Learn HMM parameters from unlabeled sequence data using expectation-maximization:
import mchmm as mc
obs_seq = 'AGACTGCATATATAAGGGGCAGGCTG'
a = mc.HiddenMarkovModel().from_baum_welch(obs_seq, states=['0', '1'])
# training log: KL divergence values for all iterations
a.log
# {
# 'tp': [0.008646969455670256, 0.0012397829805491124, 0.0003950986109761759],
# 'ep': [0.09078874423746826, 0.0022734816599056084, 0.0010118204023946836],
# 'pi': [0.009030829793043593, 0.016658391248503462, 0.0038894983546756065]
# }
The inferred transition (tp) and emission (ep) probability matrices and the initial state distribution (pi) can be accessed directly:
a.ep, a.tp, a.pi
This model can be decoded using the Viterbi algorithm:
new_obs = "GGCATTGGGCTATAAGAGGAGCTTG"
vs, vsi = a.viterbi(new_obs)
print("VI", "".join(vs))
print("NO", new_obs)
# VI 0011100001111100000001100
# NO GGCATTGGGCTATAAGAGGAGCTTG