Statistics

PHP package that provides functions for calculating mathematical statistics of numeric data.

<p align="center"> <img src="https://repository-images.githubusercontent.com/445609326/e2539776-0f8f-4556-be1d-887ea2368813" alt="PHP package for Statistics"> </p> <h1 align="center"> Statistics PHP package </h1> <p align=center> <a href="https://packagist.org/packages/hi-folks/statistics"> <img src="https://img.shields.io/packagist/v/hi-folks/statistics.svg?style=for-the-badge" alt="Latest Version on Packagist"> </a> <a href="https://packagist.org/packages/hi-folks/statistics"> <img src="https://img.shields.io/packagist/dt/hi-folks/statistics.svg?style=for-the-badge" alt="Total Downloads"> </a> <br> <a href="https://github.com/Hi-Folks/statistics/blob/main/.github/workflows/static-code-analysis.yml"> <img src="https://img.shields.io/badge/PHPStan-level%208-brightgreen.svg?style=for-the-badge" alt="Static Code analysis"> </a> <img src="https://img.shields.io/packagist/l/hi-folks/statistics?style=for-the-badge" alt="Packagist License"> <br> <img src="https://img.shields.io/packagist/php-v/hi-folks/statistics?style=for-the-badge" alt="Packagist PHP Version Support"> <img src="https://img.shields.io/github/last-commit/hi-folks/statistics?style=for-the-badge" alt="GitHub last commit"> </p> <p align=center> <a href="https://github.com/hi-folks/statistics/actions/workflows/run-tests.yml"> <img src="https://github.com/hi-folks/statistics/actions/workflows/run-tests.yml/badge.svg?branch=main&style=for-the-badge" alt="Tests"> </a> </p> <p align=center> <i> A PHP package for descriptive statistics, normal distribution, outlier detection, and streaming analytics on numeric data. </i> </p>

This package provides a comprehensive set of statistical functions for PHP: descriptive statistics (mean, median, mode, standard deviation, variance, quantiles), robust measures (trimmed mean, weighted median, median absolute deviation), distribution modelling (normal distribution with PDF, CDF, and inverse CDF), outlier detection (z-score and IQR-based), z-scores, percentiles, coefficient of variation, frequency tables, correlation, regression (linear, logarithmic, power, and exponential), kernel density estimation, and O(1) memory streaming statistics.

It works with any numeric dataset — from sports telemetry and sensor data to race results, survey responses, and financial time series.

Articles and resources:

This package is inspired by Python's statistics module.

Installation

You can install the package via composer:

composer require hi-folks/statistics

Usage

Stat class

The Stat class provides methods for calculating mathematical statistics of numeric data, including averages and other typical values from a population or sample. The available methods are listed below:

| Mathematical Statistic | Description |
| ---------------------- | ----------- |
| mean() | arithmetic mean or "average" of data |
| fmean() | floating-point arithmetic mean, with optional weighting and precision |
| trimmedMean() | trimmed (truncated) mean — mean after removing outliers from each side |
| median() | median or "middle value" of data |
| weightedMedian() | weighted median — median with weights, where each value has a different importance |
| medianLow() | low median of data |
| medianHigh() | high median of data |
| medianGrouped() | median of grouped data, using interpolation |
| mode() | single mode (most common value) of discrete or nominal data |
| multimode() | list of modes (most common values) of discrete or nominal data |
| quantiles() | cut points dividing the range of a probability distribution into continuous intervals with equal probabilities (supports exclusive and inclusive methods) |
| thirdQuartile() | 3rd quartile, is the value at which 75 percent of the data is below it |
| firstQuartile() | first quartile, is the value at which 25 percent of the data is below it |
| percentile() | value at any percentile (0–100) with linear interpolation |
| pstdev() | Population standard deviation |
| stdev() | Sample standard deviation |
| sem() | Standard error of the mean (SEM) — measures precision of the sample mean |
| meanAbsoluteDeviation() | mean absolute deviation (MAD) — average distance from the mean |
| medianAbsoluteDeviation() | median absolute deviation — median distance from the median, robust to outliers |
| pvariance() | variance for a population (supports pre-computed mean via mu) |
| variance() | variance for a sample (supports pre-computed mean via xbar) |
| skewness() | adjusted Fisher-Pearson sample skewness |
| pskewness() | population (biased) skewness |
| kurtosis() | excess kurtosis (sample formula, 0 for normal distribution) |
| coefficientOfVariation() | coefficient of variation (CV%), relative dispersion as percentage |
| zscores() | z-scores for each value — how many standard deviations from the mean |
| outliers() | outlier detection based on z-score threshold |
| iqrOutliers() | outlier detection based on IQR method (box plot whiskers), robust for skewed data |
| geometricMean() | geometric mean |
| harmonicMean() | harmonic mean |
| correlation() | Pearson’s or Spearman’s rank correlation coefficient for two inputs |
| covariance() | the sample covariance of two inputs |
| linearRegression() | return the slope and intercept of simple linear regression parameters estimated using ordinary least squares (supports proportional: true for regression through the origin) |
| logarithmicRegression() | logarithmic regression — fits y = a × ln(x) + b, ideal for diminishing returns patterns (e.g., athletic improvement, learning curves) |
| powerRegression() | power regression — fits y = a × x^b, useful for power law relationships |
| exponentialRegression() | exponential regression — fits y = a × e^(b×x), useful for exponential growth or decay |
| rSquared() | coefficient of determination (R²) — proportion of variance explained by linear regression |
| confidenceInterval() | confidence interval for the mean using the normal (z) distribution |
| zTest() | one-sample Z-test — tests whether the sample mean differs significantly from a hypothesized population mean |
| tTest() | one-sample t-test — like z-test but appropriate for small samples where the population standard deviation is unknown |
| tTestTwoSample() | two-sample independent t-test (Welch's) — compares the means of two independent groups without assuming equal variances |
| tTestPaired() | paired t-test — tests whether the mean difference between paired observations is significantly different from zero |
| kde() | kernel density estimation — returns a closure that estimates the probability density (or CDF) at any point |
| kdeRandom() | random sampling from a kernel density estimate — returns a closure that generates random floats from the KDE distribution |
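The table lists population and sample variants of several measures (pvariance() and variance(), pstdev() and stdev()). As a rough illustration of the difference, the following plain-PHP sketch divides the sum of squared deviations by n for a population and by n - 1 for a sample. The function name varianceSketch and its $sample flag are hypothetical helpers for this example, not part of the package, and the package's own implementation may differ.

```php
<?php

/**
 * Illustrative variance calculation (hypothetical helper, not the package API).
 *
 * @param array<int|float> $data
 * @param bool $sample true: divide by n - 1 (sample); false: divide by n (population)
 */
function varianceSketch(array $data, bool $sample = true): float
{
    $n = count($data);
    if ($n < ($sample ? 2 : 1)) {
        throw new InvalidArgumentException('Not enough data points.');
    }

    $mean = array_sum($data) / $n;

    // Sum of squared deviations from the mean.
    $sumOfSquares = 0.0;
    foreach ($data as $value) {
        $sumOfSquares += ($value - $mean) ** 2;
    }

    return $sumOfSquares / ($sample ? $n - 1 : $n);
}

$data = [2, 4, 4, 4, 5, 5, 7, 9];

$populationVariance = varianceSketch($data, false); // 32 / 8 = 4.0
$sampleVariance = varianceSketch($data, true);      // 32 / 7, roughly 4.571
$populationStdDev = sqrt($populationVariance);      // 2.0
```

The sample form is the usual choice when the data is a subset drawn from a larger population.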

Stat::mean( array $data )

Return the sample arithmetic mean of the array $data. The arithmetic mean is the sum of the data divided by the number of data points. It is commonly called “the average”, although it is only one of many mathematical averages. It is a measure of the central location of the data.

use HiFolks\Statistics\Stat;
$mean = Stat::mean([1, 2, 3, 4, 4]);
// 2.8
$mean = Stat::mean([-1.0, 2.5, 3.25, 5.75]);
// 2.625
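For reference, the definition above (sum of the data divided by the number of data points) can be written in a few lines of plain PHP. This is only a sketch of the formula; meanSketch is a hypothetical name, and the empty-input check is an assumption rather than a statement about how Stat::mean() behaves.

```php
<?php

/**
 * Arithmetic mean: sum of the values divided by their count.
 * Hypothetical helper for illustration, not the package's implementation.
 *
 * @param array<int|float> $data
 */
function meanSketch(array $data): float
{
    if ($data === []) {
        // Assumption: reject empty input instead of dividing by zero.
        throw new InvalidArgumentException('The data must not be empty.');
    }

    return array_sum($data) / count($data);
}

$a = meanSketch([1, 2, 3, 4, 4]);          // 14 / 5 = 2.8
$b = meanSketch([-1.0, 2.5, 3.25, 5.75]);  // 10.5 / 4 = 2.625
```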

Stat::fmean( array $data, array|null $weights = null, int|null $precision = null )

Return the arithmetic mean of the array $data as a float, with optional weights and precision control. This function behaves like mean() but always returns a floating-point result and supports weighted datasets. If $weights is provided, it computes the weighted average. If $precision is provided, the result is rounded to that many decimal places. If $precision is null, no rounding is applied, which may produce results with long or unexpected decimal expansions because of how floating-point arithmetic works in PHP. Rounding gives cleaner, more predictable output.

use HiFolks\Statistics\Stat;

// Unweighted mean (same as mean but always float)
$fmean = Stat::fmean([3.5, 4.0, 5.25]);
// 4.25

// Weighted mean
$fmean = Stat::fmean([3.5, 4.0, 5.25], [1, 2, 1]);
// 4.1875

// Custom precision
$fmean = Stat::fmean([3.5, 4.0, 5.25], null, 2);
// 4.25

$fmean = Stat::fmean([3.5, 4.0, 5.25], [1, 2, 1], 3);
// 4.188

If the input is empty, or weights are invalid (e.g., length mismatch or sum is zero), an exception is thrown. Use this function when you need floating-point accuracy or to apply custom weighting and rounding to your average.
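To make the weighting and validation rules concrete, here is a minimal plain-PHP sketch of a weighted mean, computed as the sum of value × weight divided by the sum of the weights. The function weightedMeanSketch is hypothetical, and the exception type and messages are assumptions; the package may validate input and report errors differently.

```php
<?php

/**
 * Weighted arithmetic mean with optional rounding.
 * Hypothetical helper for illustration, not the package's implementation.
 *
 * @param array<int|float> $data
 * @param array<int|float>|null $weights
 */
function weightedMeanSketch(array $data, ?array $weights = null, ?int $precision = null): float
{
    if ($data === []) {
        throw new InvalidArgumentException('The data must not be empty.');
    }

    $data = array_values($data);
    // Without weights, every value counts equally (weight 1).
    $weights = $weights === null
        ? array_fill(0, count($data), 1)
        : array_values($weights);

    if (count($weights) !== count($data)) {
        throw new InvalidArgumentException('Data and weights must have the same length.');
    }

    $weightSum = array_sum($weights);
    if ($weightSum == 0) {
        throw new InvalidArgumentException('The sum of the weights must not be zero.');
    }

    $weightedSum = 0.0;
    foreach ($data as $i => $value) {
        $weightedSum += $value * $weights[$i];
    }

    $mean = $weightedSum / $weightSum;

    // Round only when a precision is requested.
    return $precision === null ? $mean : round($mean, $precision);
}

$plain = weightedMeanSketch([3.5, 4.0, 5.25]);                  // 12.75 / 3 = 4.25
$weighted = weightedMeanSketch([3.5, 4.0, 5.25], [1, 2, 1]);    // 16.75 / 4 = 4.1875
$rounded = weightedMeanSketch([3.5, 4.0, 5.25], [1, 2, 1], 3);  // 4.188
```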

Stat::trimmedMean( array $data, float $proportionToCut = 0.1, ?int $round = null )

Return the trimmed (truncated) mean of the data. Computes the mean after removing the lowest and highest fraction of values. This is a robust measure of central tendency, less sensitive to outliers than the regular mean.

The $proportionToCut parameter specifies the fraction to trim from each side (must be in the range [0, 0.5)). For example, 0.1 removes the bottom 10% and top 10%.
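The trimming step can be sketched in plain PHP as follows: sort the data, drop the same number of values from each end, and average what remains. This sketch assumes the number of values cut per side is floor(n × proportionToCut), a common convention that the package is not confirmed to use; trimmedMeanSketch is a hypothetical helper, not the package's implementation.

```php
<?php

/**
 * Trimmed mean: mean of the data after cutting a fraction from each end.
 * Hypothetical helper for illustration, not the package's implementation.
 *
 * @param array<int|float> $data
 */
function trimmedMeanSketch(array $data, float $proportionToCut = 0.1): float
{
    if ($data === []) {
        throw new InvalidArgumentException('The data must not be empty.');
    }
    if ($proportionToCut < 0 || $proportionToCut >= 0.5) {
        throw new InvalidArgumentException('proportionToCut must be in the range [0, 0.5).');
    }

    sort($data);
    $n = count($data);

    // Assumption: the number of values removed per side is rounded down.
    $cut = (int) floor($n * $proportionToCut);

    // Keep the middle part; it is never empty because $cut < $n / 2.
    $kept = array_slice($data, $cut, $n - 2 * $cut);

    return array_sum($kept) / count($kept);
}

$a = trimmedMeanSketch([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], 0.1); // mean of 2..9 = 5.5
$b = trimmedMeanSketch([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 0.2);  // mean of 3..8 = 5.5
```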

use HiFolks\Statistics\Stat;
$mean = Stat::trimmedMean([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], 0.1);
// 5.5 (outlier 100 and lowest value 1 removed)

$mean = Stat::trimmedMean([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 0.2);
// 5.5 (removes 2 values from each side)

$mean = Stat::trimmedMean([1,
