📊 AutoMixRegPY
AutoMixRegPY is a Python package for simulating and fitting Mixtures of Linear Regressions using the Expectation-Maximization (EM) algorithm. It includes functionality for:
- Generating synthetic data from known mixture models
- Fitting mixture regression models using EM
- Automatically selecting the optimal number of components using the Bayesian Information Criterion (BIC)
- Running a full model selection and fitting pipeline in a single line

This package is useful for unsupervised regression modeling, model-based clustering, and statistical learning in high-dimensional settings.
A lightweight Python package for fitting finite mixture models of linear regression using the Expectation-Maximization (EM) algorithm — built entirely from scratch.
This is useful for modeling data with latent subpopulations, such as in clinical electronic health records (EHR), stratified outcomes, or hidden classes in regression problems.
🔧 Installation
Install the package directly from GitHub:
pip install git+https://github.com/Fadhaa/mixture_regression.git
📦 Module Structure
generate: Generate synthetic regression mixture data
model: Core EM algorithm (MixtureOfLinearRegressions)
bic_selector: Automatically selects the best number of components using BIC
runner: Full pipeline to generate, fit, select, and print results
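For intuition, the core EM loop behind `model.MixtureOfLinearRegressions` can be sketched as alternating between computing per-point component responsibilities (E-step) and weighted least-squares refits (M-step). The version below is an illustrative from-scratch sketch with our own function and variable names, not the package's actual internals:

```python
import numpy as np

def em_mixture_regression(X, y, k, n_iter=200, seed=0):
    """Illustrative EM for a k-component mixture of linear regressions.
    A from-scratch sketch, not the package's internal code."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xb = np.hstack([np.ones((n, 1)), X])        # prepend intercept column
    resp = rng.dirichlet(np.ones(k), size=n)    # random soft assignments to start
    betas = np.zeros((k, Xb.shape[1]))
    sigmas = np.ones(k)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # M-step: weighted least squares per component
        for j in range(k):
            w = resp[:, j]
            WX = Xb * w[:, None]
            betas[j] = np.linalg.solve(WX.T @ Xb, WX.T @ y)
            r = y - Xb @ betas[j]
            sigmas[j] = max(np.sqrt((w * r ** 2).sum() / w.sum()), 1e-6)
            weights[j] = w.mean()
        # E-step: responsibilities proportional to weight * Gaussian density
        dens = np.empty((n, k))
        for j in range(k):
            r = y - Xb @ betas[j]
            dens[:, j] = weights[j] / sigmas[j] * np.exp(-0.5 * (r / sigmas[j]) ** 2)
        resp = dens / np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)
    return weights, betas, sigmas
```

Random-responsibility initialization is the simplest choice; in practice restarting from several seeds and keeping the highest-likelihood fit is more robust.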
📘 Full Application Example: Mixture of Linear Regressions with BIC Selection
🧮 Step 1: Import Modules
We begin by importing the necessary modules from the mixture_regression package:
```python
from mixture_regression import generate, model
from mixture_regression import bic_selector, runner
```
🎯 Step 2: Define Custom Regression Coefficients
Each cluster (component) will have its own set of regression coefficients (betas), including an intercept term. Here we define three clusters with known coefficients:
```python
custom_betas = [
    [0.5, 1.5, 2, 3],  # Cluster 0
    [1, 2, 3, 4],      # Cluster 1
    [2, 3, 4, 6],      # Cluster 2
]
```
🧪 Step 3: Generate Simulated Data
We generate 1,000 observations with 3 features and 3 clusters using the MixtureRegressionDataGenerator:
```python
generator = generate.MixtureRegressionDataGenerator(
    n_samples=1000,
    n_features=3,
    n_clusters=3,
    cluster_probs=[0.4, 0.3, 0.3],  # probability of each cluster
    noise_std=1.0,
    random_state=42,
    betas=custom_betas,
)
df = generator.generate()
print(generator.get_true_betas())
```
Output:
```
[[0.5, 1.5, 2, 3], [1, 2, 3, 4], [2, 3, 4, 6]]
```
✅ This confirms that the true underlying structure has been used to simulate the data.
🧠 Step 4: Fit the Model and Select Best Number of Components
Now we fit the mixture model and let the algorithm automatically select the best number of components using the Bayesian Information Criterion (BIC):
```python
# use a new name so we don't shadow the imported runner module
pipeline = runner.MixtureModelRunner(df)
pipeline.run()
```
🖨️ Output: Model Fitting and BIC Scores
```
Components: 1, BIC: 4138.50
Components: 2, BIC: 3774.48
Components: 3, BIC: 3731.61
Components: 4, BIC: 3752.48
Components: 5, BIC: 3771.93
✅ Best number of components based on BIC: 3
```
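BIC balances goodness of fit against model complexity: BIC = p·ln(n) − 2·ln(L̂), so lower is better. A minimal sketch of the score, with one plausible free-parameter count for a k-component model with d features (the exact counting inside `bic_selector` is an assumption on our part):

```python
import numpy as np

def bic_score(log_likelihood, n_params, n_samples):
    # lower BIC is better: the penalty grows with parameters and sample size
    return n_params * np.log(n_samples) - 2.0 * log_likelihood

def mixture_param_count(k, d):
    # per component: (d + 1) coefficients + 1 sigma; plus k - 1 free weights
    return k * (d + 2) + (k - 1)

print(mixture_param_count(3, 3))  # → 17
```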
📊 Final Model (k = 3)
```
BIC: 3770.84
Component 1:
Weight (π): 0.2815
Coefficients (β): [1.9249101 3.10512153 3.91060425 6.04186013]
Std Dev (σ): 0.9201
----------------------------------------
Component 2:
Weight (π): 0.4283
Coefficients (β): [0.42579598 1.46916441 1.98903705 3.1043965 ]
Std Dev (σ): 0.9366
----------------------------------------
Component 3:
Weight (π): 0.2901
Coefficients (β): [1.19721701 1.99912275 3.15040332 3.97245209]
Std Dev (σ): 0.9920
```
🔍 Interpretation
✅ The algorithm correctly identifies 3 components as the optimal number using BIC.
📉 BIC decreases until 3 components, then increases, indicating the optimal model complexity.
🧮 The estimated coefficients (β) are close to the true betas you defined:
[0.5, 1.5, 2, 3]
[1, 2, 3, 4]
[2, 3, 4, 6]
🧪 The weights (π) indicate the mixture proportions (close to your original [0.4, 0.3, 0.3]).
📐 The standard deviations (σ) reflect noise in each sub-model.
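Note that component labels in a mixture model are arbitrary, so the fitted components come back in a different order than the true clusters. A small sketch, using the estimated coefficients printed above, that matches each estimate to its nearest true beta vector:

```python
from itertools import permutations
import numpy as np

true_betas = np.array([[0.5, 1.5, 2, 3], [1, 2, 3, 4], [2, 3, 4, 6]])
est_betas = np.array([
    [1.9249101, 3.10512153, 3.91060425, 6.04186013],   # Component 1
    [0.42579598, 1.46916441, 1.98903705, 3.1043965],   # Component 2
    [1.19721701, 1.99912275, 3.15040332, 3.97245209],  # Component 3
])

# try every ordering of estimated components, keep the smallest total abs error
best = min(permutations(range(3)),
           key=lambda p: np.abs(est_betas[list(p)] - true_betas).sum())
print(best)  # → (1, 2, 0): Components 2, 3, 1 match clusters 0, 1, 2
```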