SkillAgentSearch skills...

EMpht.jl

[Julia Package] Fitting Phase-Type Distributions using an EM Algorithm

Install / Use

/learn @Pat-Laub/EMpht.jl
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

EMpht.jl

A Julia port of the EMpht.c program, used for fitting Phase-Type distributions via an EM algorithm.

The original C — which is available on Søren Asmussen's website — is well documented and has a decent performance for phase-type distributions with a small or moderate number of phases. However it is quite slow for when the number of phases is large (>= 20), and the UX is very old-school Unix.

This port is much simpler and faster. See the examples given below and the test cases.

Examples

To fit a phase-type distribution to some data:

using Distributions
using EMpht

data = rand(Exponential(1/10), 1_000)  # Generate some data to fit 
sample = EMpht.Sample(obs=data)        # Create an EMpht Sample object with this data
ph = empht(sample, p=5)                # Fit the data using p=5 phases

xGrid = range(0, 8, length=1_00)       # Create a grid to evaluate the density function over
fitPDFs = pdf.(ph, xGrid)              # The probability density function of the fitted phase-type

The default structure of the phase-type is "Coxian" (see below for details). For large values of p the "CanonicalForm1" is recommended. To impose no structure on the phase-type, use "General", though the results degrade quickly with p > 5 or so. Another available option is "GeneralisedCoxian".

phGen = empht(sample, p=20, ph_structure="General")
phCox = empht(sample, p=20, ph_structure="Coxian")
phCF1 = empht(sample, p=50, ph_structure="CanonicalForm1")

If the data is not fully observed, i.e. the data is binned (interval-censored), then the Sample object is updated like so:

# The intervals
int = [1.5 2.0; 2.0 2.5; 2.5 3.0; 3.0 3.5; 3.5 4.0; 4.0 4.5; 4.5 5.0; 5.0 5.5;
        5.5 6.0; 6.0 6.5; 6.5 7.0; 7.0 7.5]
# The number of observations falling into each interval
intweight = [4.0, 34.0, 107.0, 170.0, 202.0, 222.0, 140.0, 77.0, 24.0, 14.0,
        4.0, 2.0]
 # Create an EMpht Sample object with this data
sInt = EMpht.Sample(int=int, intweight=intweight)

# Fitting the interval-censored data
phCF1 = empht(sInt, p=100, ph_structure="CanonicalForm1")
xGrid = range(0, 8, length=1_000)
fitPDFs = pdf.(phCF1, xGrid)

To choose the algorithm used to fit the data (see papers below for details):

phunif = empht(sample, p=5, method=:unif)  # Fit using the uniformization technique (default)
phode = empht(sample, p=5, method=:ode)    # Fit using the more traditional ODE solving technique

EMpht.jl can read all necessary information from a JSON file (the number of phases to fit, the special structure of the phase-type, the sample to fit). For example, if you download the Coxian100.json file inside the test directory, the following will launch a fit based on those parameters:

ph100 = empht("Coxian100.json")

Resources

The relevant papers for the algorithms are:

  • S. Asmussen, O. Nerman & M. Olsson, Fitting phase-type distribution via the EM algorithm, Scandinavian Journal of Statistics 23, 419-441 (1996),
  • M. Olsson, Estimation of phase-type distributions from censored data, Scandinavian Journal of Statistics 23, 443-460 (1996).
  • H. Okamura, T. Dohi, K.S. Trivedi, A refined EM algorithm for PH distributions, Performance Evaluation 68, 938-954 (2011)
  • H. Okamura, T. Dohi, K.S. Trivedi, Improvement of expectation-maximization algorithm for phase-type distributions with grouped and truncated data, Appl. Stochastic Models Bus. Ind. 29, 141-156 (2013)

Some case studies using this package are:

  • S. Asmussen, P.J. Laub, H. Yang, Phase-type models in life insurance: fitting and valuation of equity-linked benefits, Risks 7(1), 17 pages (2019)
  • A. Vuorinen, The blockchain propagation process: a machine learning and matrix analytic approach, University of Melbourne Masters Thesis (2019), see website or thesis.

Related Skills

View on GitHub
GitHub Stars13
CategoryDevelopment
Updated8mo ago
Forks5

Languages

Julia

Security Score

67/100

Audited on Jul 12, 2025

No findings