Distfit

distfit is a Python library for probability density fitting.

<div> <a href="https://erdogant.github.io/distfit/"><img src="https://github.com/erdogant/distfit/blob/master/docs/figs/logo.png" width="250" align="left" /></a> distfit is a Python package for probability density fitting of univariate distributions for random variables. The distfit library can determine the best fit for over 90 theoretical distributions. A goodness-of-fit test scores each candidate, and once the best-fitting theoretical distribution is found, its loc, scale, and arg parameters are returned. It can be used for parametric, non-parametric, and discrete distributions. ⭐️Star it if you like it⭐️ </div>

Key Features

| Feature | Description | Medium | Gumroad+Podcast |
|---------|-------------|--------|-----------------|
| Parametric Fitting | Fit distributions on empirical data X. | Link | Link |
| Non-Parametric Fitting | Fit distributions on empirical data X using non-parametric approaches (quantile, percentiles). | - | - |
| Multivariate Fitting | Fit multivariate distributions on empirical data X that contains multiple columns. | - | - |
| Discrete Fitting | Fit distributions on empirical data X using binomial distribution. | - | - |
| Predict | Compute probabilities for response variables y. | - | - |
| Outlier Detection | Detect anomalies using fitted distributions. | Link | Link |
| Synthetic Data | Generate synthetic data. | Link | Link |
| Plots | Various plotting functionalities. | - | - |

Resources and Links

Example Notebooks: Examples
Medium Blogs: Medium
Gumroad Blogs with podcast: GumRoad
Documentation: Website
Bug Reports and Feature Requests: GitHub Issues

Background

For the parametric approach, the distfit library can determine the best fit across 89 theoretical distributions. To score the fit, one of the goodness-of-fit statistics can be used, such as RSS/SSE, Wasserstein, Kolmogorov-Smirnov (KS), or Energy. After finding the best-fitting theoretical distribution, the loc, scale, and arg parameters are returned, such as the mean and standard deviation for a normal distribution.
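
To make the RSS scoring concrete, here is a minimal sketch using only the Python standard library. It illustrates the idea and is not distfit's implementation: the helper names (`histogram_density`, `rss_score`), the bin count, and the two candidate distributions are assumptions chosen for this example.

```python
import random
from statistics import NormalDist, fmean, pstdev


def histogram_density(data, bins=20):
    """Return (bin centers, density heights) for an equal-width histogram."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        idx = min(int((x - lo) / width), bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    density = [c / (len(data) * width) for c in counts]
    return centers, density


def rss_score(dist, centers, density):
    """Residual sum of squares between the empirical density and a candidate PDF."""
    return sum((d - dist.pdf(c)) ** 2 for c, d in zip(centers, density))


random.seed(0)
X = [random.gauss(0, 2) for _ in range(1000)]  # synthetic sample (assumption)

# Candidate 1: normal distribution with loc/scale estimated from the data.
fitted = NormalDist(mu=fmean(X), sigma=pstdev(X))
# Candidate 2: a deliberately poor normal distribution, for comparison.
poor = NormalDist(mu=5, sigma=1)

centers, density = histogram_density(X)
rss_fitted = rss_score(fitted, centers, density)
rss_poor = rss_score(poor, centers, density)

print(f"fitted: loc={fitted.mean:.3f} scale={fitted.stdev:.3f} RSS={rss_fitted:.6f}")
print(f"poor candidate: RSS={rss_poor:.6f}")
```

The candidate with the lower RSS follows the histogram more closely, which is the sense in which a lower score means a better fit.
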
For the non-parametric approach, the distfit library contains two methods: the quantile method and the percentile method. Both assume that the data does not follow a specific probability distribution. The quantile method models the quantiles of the data, whereas the percentile method models the percentiles.
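
The general idea of taking cut points directly from the sample can be sketched with the standard library. This is a hedged illustration, not how distfit computes its bounds internally: the function name `empirical_bounds`, the `alpha` value, and the exponential test sample are assumptions.

```python
import random
from statistics import quantiles


def empirical_bounds(data, alpha=0.05):
    """Lower/upper cut points read from the sample itself; no distribution is assumed.

    This sketch only supports alpha values that are whole percentages (0.01 to 0.49).
    """
    cuts = quantiles(data, n=100, method="inclusive")  # 99 cut points: 1% .. 99%
    lower = cuts[round(alpha * 100) - 1]               # alpha-th percentile
    upper = cuts[round((1 - alpha) * 100) - 1]         # (1 - alpha)-th percentile
    return lower, upper


random.seed(1)
X = [random.expovariate(1.0) for _ in range(2000)]  # skewed synthetic sample (assumption)

lower, upper = empirical_bounds(X, alpha=0.05)
flags = [x < lower or x > upper for x in X]  # True for values outside the bounds

print(f"lower={lower:.3f} upper={upper:.3f} flagged={sum(flags)}")
```

Because the bounds come from the data, the approach also works for skewed samples where a symmetric theoretical distribution would fit poorly.
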
If the dataset contains discrete values, the distfit library offers discrete fitting. The best fit is then derived using the binomial distribution.
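
A minimal sketch of fitting a binomial distribution to discrete data, assuming a simple grid search over `n` with `p` set by the method of moments and candidates ranked by RSS. The function names, the search range `n_max`, and the synthetic sample are assumptions; distfit's own procedure may differ.

```python
import random
from collections import Counter
from math import comb


def binom_pmf(k, n, p):
    """Probability of k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def fit_binomial(data, n_max=60):
    """Grid-search n; for each n choose p so that n * p equals the sample mean."""
    total = len(data)
    freq = Counter(data)
    mean = sum(data) / total
    best = None
    for n in range(max(data), n_max + 1):
        p = mean / n
        # RSS between the empirical frequencies and the candidate PMF.
        rss = sum((freq.get(k, 0) / total - binom_pmf(k, n, p)) ** 2 for k in range(n + 1))
        if best is None or rss < best[2]:
            best = (n, p, rss)
    return best


random.seed(2)
# Synthetic sample (assumption): 5000 draws from Binomial(n=8, p=0.5).
X = [sum(random.random() < 0.5 for _ in range(8)) for _ in range(5000)]

n_hat, p_hat, rss = fit_binomial(X)
print(f"n={n_hat} p={p_hat:.3f} RSS={rss:.6f}")
```
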

Installation

Install distfit from PyPI

pip install distfit

Install from GitHub source

pip install git+https://github.com/erdogant/distfit

Import Library

import distfit
print(distfit.__version__)

# Import library
from distfit import distfit

<hr>

Examples

Example: Quick start to find best fit for your input data


# [distfit] >INFO> fit
# [distfit] >INFO> transform
# [distfit] >INFO> [norm      ] [0.00 sec] [RSS: 0.00108326] [loc=-0.048 scale=1.997]
# [distfit] >INFO> [expon     ] [0.00 sec] [RSS: 0.404237] [loc=-6.897 scale=6.849]
# [distfit] >INFO> [pareto    ] [0.00 sec] [RSS: 0.404237] [loc=-536870918.897 scale=536870912.000]
# [distfit] >INFO> [dweibull  ] [0.06 sec] [RSS: 0.0115552] [loc=-0.031 scale=1.722]
# [distfit] >INFO> [t         ] [0.59 sec] [RSS: 0.00108349] [loc=-0.048 scale=1.997]
# [distfit] >INFO> [genextreme] [0.17 sec] [RSS: 0.00300806] [loc=-0.806 scale=1.979]
# [distfit] >INFO> [gamma     ] [0.05 sec] [RSS: 0.00108459] [loc=-1862.903 scale=0.002]
# [distfit] >INFO> [lognorm   ] [0.32 sec] [RSS: 0.00121597] [loc=-110.597 scale=110.530]
# [distfit] >INFO> [beta      ] [0.10 sec] [RSS: 0.00105629] [loc=-16.364 scale=32.869]
# [distfit] >INFO> [uniform   ] [0.00 sec] [RSS: 0.287339] [loc=-6.897 scale=14.437]
# [distfit] >INFO> [loggamma  ] [0.12 sec] [RSS: 0.00109042] [loc=-370.746 scale=55.722]
# [distfit] >INFO> Compute confidence intervals [parametric]
# [distfit] >INFO> Compute significance for 9 samples.
# [distfit] >INFO> Multiple test correction method applied: [fdr_bh].
# [distfit] >INFO> Create PDF plot for the parametric method.
# [distfit] >INFO> Mark 5 significant regions
# [distfit] >INFO> Estimated distribution: beta [loc:-16.364265, scale:32.868811]

Example: Multivariate distribution fitting

The distfit library provides multivariate distribution fitting that enables modeling complex dependencies between multiple variables using copula-based methods.

from distfit import distfit

# Initialize with multivariate mode
dfit = distfit(multivariate=True)

# Load example data
X = dfit.import_example(data='multi_normal')
# X = dfit.import_example(data='multi_t')

# Fit model
dfit.fit_transform(X)

# Access estimated correlation matrix (Gaussian copula)
print(dfit.model.corr)

# Evaluate joint density
results = dfit.evaluate_pdf(X)
print(results['score'])
print(results['copula_density'])

# Generate synthetic samples
Xnew = dfit.generate(n=10)

# Detect multivariate outliers
bool_outliers = dfit.predict_outliers(X)

<p align="left"> <a href="https://erdogant.github.io/distfit/pages/html/multivariate.html"> <img src="https://github.com/erdogant/distfit/blob/master/docs/figs/copulaDensity_uniformB.png" width="800" /> </a> </p>
<p align="left"> <a href="https://erdogant.github.io/distfit/pages/html/multivariate.html"> <img src="https://github.com/erdogant/distfit/blob/master/docs/figs/jointDensity.png" width="800" /> </a> </p>

Example: Plot summary of the tested distributions

After we have a fitted model, we can make some predictions using the theoretical distributions. After making some predictions, we can plot again but now the predictions are automatically included.

<p align="left"> <a href="https://erdogant.github.
