
AutoMixRegPY

AutoMixRegPY is a Python-based framework for automated mixed-effects regression, designed to model complex hierarchical and grouped data efficiently.

Install / Use

/learn @Fadhaa/AutoMixRegPY

📊 AutoMixRegPY

AutoMixRegPY is a Python package for simulating and fitting mixtures of linear regressions using the Expectation-Maximization (EM) algorithm. It includes functionality for:

Generating synthetic data from known mixture models

Fitting mixture regression models using EM

Automatically selecting the optimal number of components using the Bayesian Information Criterion (BIC)

Running the full model-selection and fitting pipeline in a single line

This package is useful for unsupervised regression modeling, model-based clustering, and statistical learning in high-dimensional settings.

A lightweight Python package for fitting finite mixture models of linear regression using the Expectation-Maximization (EM) algorithm — built entirely from scratch.

This is useful for modeling data with latent subpopulations, such as in clinical electronic health records (EHR), stratified outcomes, or hidden classes in regression problems.
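To make the latent-subpopulation idea concrete, here is a minimal, self-contained EM sketch for a mixture of linear regressions using only NumPy. It is an illustrative re-implementation, not the package's internals; all function and variable names here are assumptions, not the package's API:

```python
# Minimal EM sketch for a k-component mixture of linear regressions.
# Illustrative only -- NOT the actual mixture_regression implementation.
import numpy as np

def fit_mixture_regression(X, y, k, n_iter=50):
    """Fit k linear-regression components to (X, y) by EM."""
    n = len(y)
    Xb = np.column_stack([np.ones(n), X])   # prepend an intercept column
    d = Xb.shape[1]

    # Initialise each component on a chunk of the data sorted by y,
    # which breaks symmetry deterministically.
    betas = np.empty((k, d))
    order = np.argsort(y)
    for j, chunk in enumerate(np.array_split(order, k)):
        betas[j], *_ = np.linalg.lstsq(Xb[chunk], y[chunk], rcond=None)
    sigmas = np.ones(k)
    pis = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: responsibility r[i, j] of component j for observation i.
        mu = Xb @ betas.T                                  # (n, k) means
        log_p = (np.log(pis) - np.log(sigmas)
                 - 0.5 * ((y[:, None] - mu) / sigmas) ** 2)
        log_p -= log_p.max(axis=1, keepdims=True)          # stabilise exp
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: weighted least squares for each component.
        for j in range(k):
            w = r[:, j] + 1e-12                            # guard empty components
            W = Xb * w[:, None]
            betas[j] = np.linalg.solve(Xb.T @ W, W.T @ y)
            resid = y - Xb @ betas[j]
            sigmas[j] = max(np.sqrt((w * resid ** 2).sum() / w.sum()), 1e-6)
        pis = r.mean(axis=0)

    return pis, betas, sigmas

# Demo: two well-separated sub-populations with known coefficients.
rng = np.random.default_rng(0)
x = rng.normal(size=(400, 1))
group = rng.random(400) < 0.5
y = np.where(group, 10 + 3 * x[:, 0], -10 - 2 * x[:, 0]) + 0.5 * rng.normal(size=400)
pis, betas, sigmas = fit_mixture_regression(x, y, k=2)
```

The E-step assigns each point a soft membership in every component; the M-step refits each regression with those memberships as weights, so points near a component's line pull its fit harder.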


🔧 Installation

Install the package directly from GitHub:

```bash
pip install git+https://github.com/Fadhaa/mixture_regression.git
```

📦 Module Structure
generate: Generate synthetic regression mixture data

model: Core EM algorithm (MixtureOfLinearRegressions)

bic_selector: Automatically selects the best number of components using BIC

runner: Full pipeline to generate, fit, select, and print results

📘 Full Application Example: Mixture of Linear Regressions with BIC Selection
🧮 Step 1: Import Modules
We begin by importing the necessary modules from the mixture_regression package:

```python
from mixture_regression import generate, model
from mixture_regression import bic_selector, runner
```
🎯 Step 2: Define Custom Regression Coefficients
Each cluster (component) will have its own set of regression coefficients (betas), including an intercept term. Here we define three clusters with known coefficients:

```python
custom_betas = [
    [0.5, 1.5, 2, 3],   # Cluster 0
    [1, 2, 3, 4],       # Cluster 1
    [2, 3, 4, 6]        # Cluster 2
]
```
🧪 Step 3: Generate Simulated Data
We generate 1,000 observations with 3 features and 3 clusters using the MixtureRegressionDataGenerator:

```python
generator = generate.MixtureRegressionDataGenerator(
    n_samples=1000,
    n_features=3,
    n_clusters=3,
    cluster_probs=[0.4, 0.3, 0.3],  # probability of each cluster
    noise_std=1.0,
    random_state=42,
    betas=custom_betas
)

df = generator.generate()
print(generator.get_true_betas())
```
Output:

[[0.5, 1.5, 2, 3], [1, 2, 3, 4], [2, 3, 4, 6]]
✅ This confirms that the true underlying structure has been used to simulate the data.

🧠 Step 4: Fit the Model and Select Best Number of Components
Now we fit the mixture model and let the algorithm automatically select the best number of components using the Bayesian Information Criterion (BIC):

```python
mm_runner = runner.MixtureModelRunner(df)  # renamed to avoid shadowing the runner module
mm_runner.run()
```
🖨️ Output: Model Fitting and BIC Scores


Components: 1, BIC: 4138.50
Components: 2, BIC: 3774.48
Components: 3, BIC: 3731.61
Components: 4, BIC: 3752.48
Components: 5, BIC: 3771.93

✅ Best number of components based on BIC: 3
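For reference, BIC trades fit against complexity as p·ln(n) − 2·ln(L̂). A hedged sketch of how it could be computed for a fitted Gaussian mixture of regressions (the function name and the parameter count p are assumptions, not the package's internals):

```python
import numpy as np

def mixture_bic(pis, betas, sigmas, X, y):
    """BIC = p * ln(n) - 2 * log-likelihood for a mixture of linear regressions.
    Illustrative sketch, not the package's bic_selector implementation."""
    n = len(y)
    Xb = np.column_stack([np.ones(n), X])     # intercept + features
    mu = Xb @ betas.T                         # (n, k) component means
    # Per-point log-density under each component, combined by log-sum-exp.
    log_comp = (np.log(pis) - 0.5 * np.log(2 * np.pi) - np.log(sigmas)
                - 0.5 * ((y[:, None] - mu) / sigmas) ** 2)
    m = log_comp.max(axis=1, keepdims=True)
    loglik = (m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))).sum()
    k, d = betas.shape
    p = k * d + k + (k - 1)   # coefficients + std devs + free mixture weights
    return p * np.log(n) - 2 * loglik
```

At equal complexity, a model that fits the data better attains a lower BIC, which is why the scores above bottom out at the true number of components.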

📊 Final Model (k = 3)
BIC: 3770.84

Component 1:
  Weight (π):     0.2815
  Coefficients (β): [1.9249101  3.10512153 3.91060425 6.04186013]
  Std Dev (σ):     0.9201
----------------------------------------
Component 2:
  Weight (π):     0.4283
  Coefficients (β): [0.42579598 1.46916441 1.98903705 3.1043965 ]
  Std Dev (σ):     0.9366
----------------------------------------
Component 3:
  Weight (π):     0.2901
  Coefficients (β): [1.19721701 1.99912275 3.15040332 3.97245209]
  Std Dev (σ):     0.9920
🔍 Interpretation
✅ The algorithm correctly identifies 3 components as the optimal number using BIC.

📉 BIC decreases until 3 components, then increases, indicating the optimal model complexity.

🧮 The estimated coefficients (β) are close to the true betas you defined:

[0.5, 1.5, 2, 3]

[1, 2, 3, 4]

[2, 3, 4, 6]

🧪 The weights (π) indicate the mixture proportions (close to your original [0.4, 0.3, 0.3]).

📐 The standard deviations (σ) reflect noise in each sub-model.
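Note that component labels in a fitted mixture are arbitrary, so comparing the estimates to the true betas requires matching components first. A small sketch using the numbers above (the helper name is illustrative):

```python
import numpy as np
from itertools import permutations

# True coefficients used in the simulation and the fitted output above.
true_betas = np.array([[0.5, 1.5, 2, 3], [1, 2, 3, 4], [2, 3, 4, 6]])
est_betas = np.array([
    [1.9249101, 3.10512153, 3.91060425, 6.04186013],
    [0.42579598, 1.46916441, 1.98903705, 3.1043965],
    [1.19721701, 1.99912275, 3.15040332, 3.97245209],
])

def best_match(true_b, est_b):
    """Return the component permutation minimising total squared error."""
    k = len(true_b)
    return min(permutations(range(k)),
               key=lambda p: float(((true_b - est_b[list(p)]) ** 2).sum()))

perm = best_match(true_betas, est_betas)
# perm[i] is the estimated component that corresponds to true cluster i.
```

Here true cluster 0 maps to estimated Component 2, cluster 1 to Component 3, and cluster 2 to Component 1, with every matched coefficient within about 0.2 of its true value.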
