# pysr3

A scikit-learn compatible library for training mixed-effects models.
## Quickstart with pysr3
SR3 is a relaxation method designed for accurate feature selection. It currently supports:
- Linear Models (L0, LASSO, A-LASSO, CAD, SCAD)
- Linear Mixed-Effect Models (L0, LASSO, A-LASSO, CAD, SCAD)
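To make the relaxation idea concrete, here is a minimal numpy sketch of SR3 for the LASSO case. This is an illustration, not pysr3's actual solver: the names `sr3_lasso` and `soft_threshold` are ours, and the coupling strength `kappa` plays a role analogous to the `el`/`ell` parameter of pysr3's models. SR3 couples the coefficients `x` to an auxiliary sparse variable `w` through a quadratic penalty, so each iteration alternates a smooth linear solve for `x` with a cheap soft-thresholding step for `w`:

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of t * ||.||_1 (componentwise shrinkage)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sr3_lasso(A, b, lam, kappa=1.0, n_iter=200):
    """Illustrative SR3 iteration for the LASSO (not pysr3's solver):
    minimize over (x, w):
        0.5*||Ax - b||^2 + lam*||w||_1 + 0.5*kappa*||x - w||^2
    """
    p = A.shape[1]
    H = A.T @ A + kappa * np.eye(p)  # matrix of the smooth x-subproblem
    Atb = A.T @ b
    x = np.zeros(p)
    w = np.zeros(p)
    for _ in range(n_iter):
        # x-update: exact minimizer of the smooth part,
        # (A^T A + kappa*I) x = A^T b + kappa*w
        x = np.linalg.solve(H, Atb + kappa * w)
        # w-update: proximal step, soft-thresholding at level lam/kappa
        w = soft_threshold(x, lam / kappa)
    return x, w

# Tiny demo: recover the support of a sparse signal
np.random.seed(0)
A = np.random.randn(100, 20)
x_true = np.zeros(20)
x_true[[2, 7, 15]] = 1.0
b = A @ x_true + 0.01 * np.random.randn(100)
x, w = sr3_lasso(A, b, lam=0.5)
print(sorted(np.flatnonzero(np.abs(w) > 0.1)))
```

Because the non-smooth regularizer only ever acts on `w` through its proximal operator, the same scheme extends to L0, A-LASSO, CAD, and SCAD by swapping the thresholding rule.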
## Installation

pysr3 can be installed via

```bash
pip install "pysr3>=0.3.5"
```

(the quotes prevent the shell from treating `>=` as a redirect).
```python
from pysr3.__about__ import __version__

print(f"This tutorial was generated using PySR3 v{__version__}\n"
      "You might see slightly different numerical results if you are using a different version of the library.")
```
```
This tutorial was generated using PySR3 v0.3.5
You might see slightly different numerical results if you are using a different version of the library.
```
## Requirements
Make sure that Python 3.6 or higher is installed. The package has the following dependencies, as listed in requirements.txt:
- numpy>=1.21.1
- pandas>=1.3.1
- scipy>=1.7.1
- PyYAML>=5.4.1
- scikit_learn>=0.24.2
## Usage
pysr3 models are fully compatible with the scikit-learn API, so you can use them as you would any other scikit-learn estimator.
### Linear Models
A simple example of using SR3-empowered LASSO for feature selection is shown below.
```python
import numpy as np

from pysr3.linear.problems import LinearProblem

# Create a sample dataset
seed = 42
num_objects = 300
num_features = 500
np.random.seed(seed)
# create a vector of the true model's coefficients
true_x = np.random.choice(2, size=num_features, p=np.array([0.9, 0.1]))
# create sample data
a = 10 * np.random.randn(num_objects, num_features)
b = a.dot(true_x) + np.random.randn(num_objects)
print(f"The dataset has {a.shape[0]} objects and {a.shape[1]} features; \n"
      f"The vector of true parameters contains {sum(true_x != 0)} non-zero elements out of {num_features}.")
```
```
The dataset has 300 objects and 500 features; 
The vector of true parameters contains 55 non-zero elements out of 500.
```
First, let's fit a model with a fixed parameter lambda:
```python
from pysr3.linear.models import LinearL1ModelSR3
from sklearn.metrics import confusion_matrix

lam = 0.1 * np.max(np.abs(a.T.dot(b)))
model = LinearL1ModelSR3(lam=lam, el=1e5)
```
```python
%%timeit
model.fit(a, b)
```
```
38.6 ms ± 236 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```
```python
maybe_x = model.coef_['x']
tn, fp, fn, tp = confusion_matrix(true_x, np.abs(maybe_x) > np.sqrt(model.tol_solver)).ravel()
print(f"The model found {tp} out of {tp + fn} features correctly, but also chose {fp} out of {tn + fp} extra irrelevant features. \n")
```
```
The model found 55 out of 55 features correctly, but also chose 5 out of 445 extra irrelevant features. 
```
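A note on the choice of `lam` above: for the plain LASSO, `np.max(np.abs(a.T.dot(b)))` is the classic `lambda_max`, the smallest regularization strength at which the all-zero solution is optimal, so `0.1 * lambda_max` is a common starting point. A generic illustration of that threshold with scikit-learn's own `Lasso` (which divides the data-fit term by the number of samples, hence the extra `/ n`):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.RandomState(0)
X = rng.randn(100, 30)
w_true = (rng.rand(30) < 0.2).astype(float)  # sparse 0/1 coefficients
y = X @ w_true + 0.1 * rng.randn(100)

n = X.shape[0]
# Smallest alpha for which the LASSO solution is exactly zero
alpha_max = np.max(np.abs(X.T @ y)) / n

coef_at_max = Lasso(alpha=alpha_max, fit_intercept=False).fit(X, y).coef_
coef_below = Lasso(alpha=0.1 * alpha_max, fit_intercept=False).fit(X, y).coef_

# At alpha_max everything is zeroed out; below it, features enter the model
print((coef_at_max != 0).sum(), (coef_below != 0).sum())
```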
Now let's see if we can improve on it with a randomized search over the regularization strength:
```python
# Automatic feature selection using an information criterion
from pysr3.linear.models import LinearL1ModelSR3
from sklearn.model_selection import RandomizedSearchCV
# Newer versions of scikit-learn removed this import;
# use `from scipy.stats import loguniform` instead.
from sklearn.utils.fixes import loguniform

# Here we use SR3-empowered LASSO, but many other popular regularizers are also available.
# See the glossary of models for more details.
model = LinearL1ModelSR3()

# We will search for the best model over a range of strengths for the regularizer
params = {
    "lam": loguniform(1e-1, 1e2)
}

selector = RandomizedSearchCV(estimator=model,
                              param_distributions=params,
                              n_iter=50,
                              # The function below evaluates an information criterion
                              # on the test portion of CV-splits.
                              scoring=lambda clf, x, y: -clf.get_information_criterion(x, y, ic='bic'))
selector.fit(a, b)
maybe_x = selector.best_estimator_.coef_['x']
tn, fp, fn, tp = confusion_matrix(true_x, np.abs(maybe_x) > np.sqrt(model.tol_solver)).ravel()
print(f"The model found {tp} out of {tp + fn} features correctly, but also chose {fp} out of {tn + fp} extra irrelevant features. \n"
      f"The best parameter is {selector.best_params_}")
```
```
The model found 55 out of 55 features correctly, but also chose 1 out of 445 extra irrelevant features. 
The best parameter is {'lam': 0.15055187290939537}
```
Note that the discovered coefficients will be biased downwards due to L1 regularization.
```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
indep = list(range(num_features))
ax.plot(indep, maybe_x, label='Discovered Coefficients')
ax.plot(indep, true_x, alpha=0.5, label='True Coefficients')
ax.legend(bbox_to_anchor=(1.05, 1))
plt.show()
```

You can get rid of the bias by refitting the model using only features that were selected.
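For example, here is a numpy-only sketch of such a refit (illustrative and independent of pysr3; we take the true support for simplicity, whereas in practice you would use the support recovered by the model). Ordinary least squares restricted to the selected columns gives unbiased estimates:

```python
import numpy as np

rng = np.random.RandomState(42)
n, p = 300, 500
true_x = (rng.rand(p) < 0.1).astype(float)  # sparse 0/1 coefficients, as above
a = 10 * rng.randn(n, p)
b = a @ true_x + rng.randn(n)

# Suppose the selection step returned this support
support = true_x != 0

# Refit with ordinary least squares on the selected columns only
x_refit = np.zeros(p)
x_refit[support] = np.linalg.lstsq(a[:, support], b, rcond=None)[0]

# The refit coefficients are close to the true values, with no shrinkage bias
print(np.max(np.abs(x_refit[support] - true_x[support])))
```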
### Linear Mixed-Effects Models
Below we show how to use Linear Mixed-Effects (LME) models for simultaneous selection of fixed and random effects.
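For reference, the model class being fitted here is the standard linear mixed-effects model, written below in our notation (see the pysr3 documentation for its exact conventions), with one block of observations per group:

```latex
% beta -- fixed effects (shared across groups)
% u_i  -- random effects for group i, with variances gamma
y_i = X_i \beta + Z_i u_i + \varepsilon_i, \qquad
u_i \sim \mathcal{N}\big(0, \operatorname{diag}(\gamma)\big), \qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma^2 I), \qquad i = 1, \dots, m
```

Selecting fixed effects means deciding which entries of beta are non-zero; selecting random effects means deciding which variances gamma are non-zero.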
```python
from pysr3.lme.models import L1LmeModelSR3
from pysr3.lme.problems import LMEProblem, LMEStratifiedShuffleSplit

# Here we generate a random linear mixed-effects problem.
# To use your own dataset, check LMEProblem.from_dataframe and LMEProblem.from_x_y
problem, true_parameters = LMEProblem.generate(
    groups_sizes=[10] * 8,  # 8 groups, 10 objects each
    features_labels=["fixed+random"] * 20,  # 20 features, each one having both fixed and random components
    beta=np.array([0, 1] * 10),  # true beta (fixed effects) has every other coefficient active
    gamma=np.array([0, 0, 0, 1] * 5),  # true gamma (variances of random effects) has every fourth coefficient active
    obs_var=0.1,  # the errors have standard deviations of sqrt(0.1) ~= 0.33
    seed=seed  # random seed, for reproducibility
)

# LMEProblem provides a very convenient representation
# of the problem. See the documentation for more details.
# It can also be converted to a more familiar representation:
x, y, columns_labels = problem.to_x_y()
# columns_labels describes the role of each column in x:
# fixed effect, random effect, or both, as well as group labels
# and observation standard deviations.

# You can also convert it to a pandas dataframe if you'd like:
pandas_dataframe = problem.to_dataframe()

# We use the SR3-empowered LASSO model, but many other popular models are also available.
# See the glossary of models for more details.
model = L1LmeModelSR3(practical=True)

# We're going to select features by varying the strength of the prior
# and choosing the model that yields the best information criterion
# on the validation set.
params = {
    "lam": loguniform(1e-3, 1e2),
    "ell": loguniform(1e-1, 1e2)
}

# We use the standard functionality of sklearn to perform the search.
selector = RandomizedSearchCV(estimator=model,
                              param_distributions=params,
                              n_iter=30,  # number of points from the parameter space to sample
                              # the class below implements CV-splits for LME models
                              cv=LMEStratifiedShuffleSplit(n_splits=2, test_size=0.5,
                                                           random_state=seed,
                                                           columns_labels=columns_labels),
                              # The function below evaluates the information criterion
                              # on the test sets during cross-validation.
                              # We use cAIC from Vaida, but other options (BIC, Muller's IC) are also available.
                              scoring=lambda clf, x, y: -clf.get_information_criterion(x,
                                                                                       y,
                                                                                       columns_labels=columns_labels,
                                                                                       ic="vaida_aic"),
                              random_state=seed,
                              n_jobs=20
                              )
selector.fit(x, y, columns_labels=columns_labels)
best_model = selector.best_estimator_

maybe_beta = best_model.coef_["beta"]
maybe_gamma = best_model.coef_["gamma"]

# Since the solver stops within sqrt(tol) of the minimum, we use that as the criterion
# for whether a feature is selected or not.
ftn, ffp, ffn, ftp = confusion_matrix(y_true=true_parameters["beta"],
                                      y_pred=abs(maybe_beta) > np.sqrt(best_model.tol_solver)
                                      ).ravel()
rtn, rfp, rfn, rtp = confusion_matrix(y_true=true_parameters["gamma"],
                                      y_pred=abs(maybe_gamma) > np.sqrt(best_model.tol_solver)
                                      ).ravel()
print(
    f"The model found {ftp} out of {ftp + ffn} correct fixed features, and also chose {ffp} out of {ftn + ffp} extra irrelevant fixed features. \n"
    f"It also identified {rtp} out of {rtp + rfn} random effects correctly, and got {rfp} out of {rtn + rfp} non-present random effects. \n"
    f"The best sparsity parameter is {selector.best_params_}")
```
```
The model found 10 out of 10 correct fixed features, and also chose
```