SaMUraiS: StAtistical Models for the UnsupeRvised segmentAtIon of time-Series
samurais is an open-source toolbox (available in R and in Matlab) of original, flexible, and user-friendly statistical latent variable models and unsupervised algorithms to segment and represent time-series data (univariate or multivariate) and, more generally, longitudinal data that include regime changes.
Our samurais mainly wield the following efficient "sword" models to segment data: Regression with Hidden Logistic Process (RHLP), Hidden Markov Model Regression (HMMR), Piecewise Regression (PWR), Multivariate RHLP (MRHLP), and Multivariate HMMR (MHMMR).
The models and algorithms were developed and written in Matlab by Faicel Chamroukhi, and translated and packaged into R by Florian Lecocq, Marius Bartcus and Faicel Chamroukhi.
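To fix intuition for the parameters used in the examples below (`K`, `p`, `q`), here is a sketch of the RHLP conditional density following Chamroukhi et al.'s published formulation; the notation is ours and may differ from the package's internals:

```latex
% RHLP: mixture of K polynomial regressions whose mixing proportions
% follow a logistic process in time x (q = 1 gives a linear logit)
f(y_i \mid x_i; \theta)
  = \sum_{k=1}^{K} \pi_k(x_i; \mathbf{w})\,
    \mathcal{N}\!\left(y_i;\ \boldsymbol{\beta}_k^{\top}\mathbf{x}_i,\ \sigma_k^2\right),
\qquad
\pi_k(x_i; \mathbf{w})
  = \frac{\exp(w_{k,0} + w_{k,1} x_i)}
         {\sum_{\ell=1}^{K} \exp(w_{\ell,0} + w_{\ell,1} x_i)}
```

where $\mathbf{x}_i = (1, x_i, \dots, x_i^p)^\top$ is the polynomial regressor of order $p$, and the variances $\sigma_k^2$ are either regime-specific (heteroskedastic) or shared across regimes (homoskedastic).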
Installation
You can install the samurais package from GitHub with:
# install.packages("devtools")
devtools::install_github("fchamroukhi/SaMUraiS")
To also build the vignettes, which contain usage examples, run this instead:
# install.packages("devtools")
devtools::install_github("fchamroukhi/SaMUraiS",
                         build_opts = c("--no-resave-data", "--no-manual"),
                         build_vignettes = TRUE)
Use the following command to display vignettes:
browseVignettes("samurais")
Usage
library(samurais)
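Before fitting, it can help to plot the raw series and eyeball the regime changes. A minimal sketch using the toy dataset shipped with samurais (the plot call is plain base-R graphics, not part of the package API):

```r
# Visualise the toy series before segmentation (base R graphics only)
data("univtoydataset")
plot(univtoydataset$x, univtoydataset$y, type = "l",
     xlab = "x", ylab = "y", main = "univtoydataset")
```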
<details> <summary>RHLP</summary>
# Application to a toy data set
data("univtoydataset")
x <- univtoydataset$x
y <- univtoydataset$y
K <- 5 # Number of regimes (mixture components)
p <- 3 # Dimension of beta (order of the polynomial regressors)
q <- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE
rhlp <- emRHLP(X = x, Y = y, K, p, q, variance_type, n_tries,
               max_iter, threshold, verbose, verbose_IRLS)
#> EM: Iteration : 1 || log-likelihood : -2119.27308534609
#> EM: Iteration : 2 || log-likelihood : -1149.01040321999
#> EM: Iteration : 3 || log-likelihood : -1118.20384281234
#> EM: Iteration : 4 || log-likelihood : -1096.88260636121
#> EM: Iteration : 5 || log-likelihood : -1067.55719357295
#> EM: Iteration : 6 || log-likelihood : -1037.26620122646
#> EM: Iteration : 7 || log-likelihood : -1022.71743069484
#> EM: Iteration : 8 || log-likelihood : -1006.11825447077
#> EM: Iteration : 9 || log-likelihood : -1001.18491883952
#> EM: Iteration : 10 || log-likelihood : -1000.91250763556
#> EM: Iteration : 11 || log-likelihood : -1000.62280600209
#> EM: Iteration : 12 || log-likelihood : -1000.3030988811
#> EM: Iteration : 13 || log-likelihood : -999.932334880131
#> EM: Iteration : 14 || log-likelihood : -999.484219706691
#> EM: Iteration : 15 || log-likelihood : -998.928118038989
#> EM: Iteration : 16 || log-likelihood : -998.234244664472
#> EM: Iteration : 17 || log-likelihood : -997.359536276056
#> EM: Iteration : 18 || log-likelihood : -996.152654857298
#> EM: Iteration : 19 || log-likelihood : -994.697863447307
#> EM: Iteration : 20 || log-likelihood : -993.186583974542
#> EM: Iteration : 21 || log-likelihood : -991.81352379631
#> EM: Iteration : 22 || log-likelihood : -990.611295217008
#> EM: Iteration : 23 || log-likelihood : -989.539226273251
#> EM: Iteration : 24 || log-likelihood : -988.55311887915
#> EM: Iteration : 25 || log-likelihood : -987.539963690533
#> EM: Iteration : 26 || log-likelihood : -986.073920116541
#> EM: Iteration : 27 || log-likelihood : -983.263549878169
#> EM: Iteration : 28 || log-likelihood : -979.340492188909
#> EM: Iteration : 29 || log-likelihood : -977.468559852711
#> EM: Iteration : 30 || log-likelihood : -976.653534236095
#> EM: Iteration : 31 || log-likelihood : -976.5893387433
#> EM: Iteration : 32 || log-likelihood : -976.589338067237
rhlp$summary()
#> ---------------------
#> Fitted RHLP model
#> ---------------------
#>
#> RHLP model with K = 5 components:
#>
#> log-likelihood nu AIC BIC ICL
#> -976.5893 33 -1009.589 -1083.959 -1083.176
#>
#> Clustering table (Number of observations in each regimes):
#>
#> 1 2 3 4 5
#> 100 120 200 100 150
#>
#> Regression coefficients:
#>
#> Beta(K = 1) Beta(K = 2) Beta(K = 3) Beta(K = 4) Beta(K = 5)
#> 1 6.031875e-02 -5.434903 -2.770416 120.7699 4.027542
#> X^1 -7.424718e+00 158.705091 43.879453 -474.5888 13.194261
#> X^2 2.931652e+02 -650.592347 -94.194780 597.7948 -33.760603
#> X^3 -1.823560e+03 865.329795 67.197059 -244.2386 20.402153
#>
#> Variances:
#>
#> Sigma2(K = 1) Sigma2(K = 2) Sigma2(K = 3) Sigma2(K = 4) Sigma2(K = 5)
#> 1.220624 1.110243 1.079394 0.9779734 1.028332
rhlp$plot()
<img src="man/figures/README-unnamed-chunk-6-1.png" style="display: block; margin: auto;" />
<img src="man/figures/README-unnamed-chunk-6-2.png" style="display: block; margin: auto;" />
<img src="man/figures/README-unnamed-chunk-6-3.png" style="display: block; margin: auto;" />
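The fit above uses regime-specific variances (`"heteroskedastic"`). As a sketch, the shared-variance alternative mentioned in the comments can be fitted by reusing the objects defined above and changing only `variance_type`; its output will of course differ from the run shown, so none is reproduced here:

```r
# Same toy data, but a single variance shared across the K regimes
rhlp_homo <- emRHLP(X = x, Y = y, K, p, q, "homoskedastic", n_tries,
                    max_iter, threshold, verbose, verbose_IRLS)
rhlp_homo$summary()  # compare AIC/BIC/ICL against the heteroskedastic fit
```

Comparing the two fits' information criteria is one simple way to decide whether regime-specific variances are worth the extra parameters.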
# Application to a real data set
data("univrealdataset")
x <- univrealdataset$x
y <- univrealdataset$y2
K <- 5 # Number of regimes (mixture components)
p <- 3 # Dimension of beta (order of the polynomial regressors)
q <- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE
rhlp <- emRHLP(X = x, Y = y, K, p, q, variance_type, n_tries,
               max_iter, threshold, verbose, verbose_IRLS)
#> EM: Iteration : 1 || log-likelihood : -3321.6485760125
#> EM: Iteration : 2 || log-likelihood : -2286.48632282875
#> EM: Iteration : 3 || log-likelihood : -2257.60498391374
#> EM: Iteration : 4 || log-likelihood : -2243.74506764308
#> EM: Iteration : 5 || log-likelihood : -2233.3426635247
#> EM: Iteration : 6 || log-likelihood : -2226.89953345319
#> EM: Iteration : 7 || log-likelihood : -2221.77999023589
#> EM: Iteration : 8 || log-likelihood : -2215.81305295291
#> EM: Iteration : 9 || log-likelihood : -2208.25998029539
#> EM: Iteration : 10 || log-likelihood : -2196.27872403055
#> EM: Iteration : 11 || log-likelihood : -2185.40049009242
#> EM: Iteration : 12 || log-likelihood : -2180.13934245387
#> EM: Iteration : 13 || log-likelihood : -2175.4276274402
#> EM: Iteration : 14 || log-likelihood : -2170.86113669353
#> EM: Iteration : 15 || log-likelihood : -2165.34927170608
#> EM: Iteration : 16 || log-likelihood : -2161.12419211511
#> EM: Iteration : 17 || log-likelihood : -2158.63709280617
#> EM: Iteration : 18 || log-likelihood : -2156.19846850913
#> EM: Iteration : 19 || log-likelihood : -2154.04107470071
#> EM: Iteration : 20 || log-likelihood : -2153.24544245686
#> EM: Iteration : 21 || log-likelihood : -2151.74944795242
#> EM: Iteration : 22 || log-likelihood : -2149.90781423151
#> EM: Iteration : 23 || log-likelihood : -2146.40042232588
#> EM: Iteration : 24 || log-likelihood : -2142.37530025533
#> EM: Iteration : 25 || log-likelihood : -2134.85493291884
#> EM: Iteration : 26 || log-likelihood : -2129.67399002071
#> EM: Iteration : 27 || log-likelihood : -2126.44739300481
#> EM: Iteration : 28 || log-likelihood : -2124.94603052064
#> EM: Iteration : 29 || log-likelihood : -2122.51637426267
#> EM: Iteration : 30 || log-likelihood : -2121.01493646146
#> EM: Iteration : 31 || log-likelihood : -2118.45402063643
#> EM: Iteration : 32 || log-likelihood : -2116.9336204919
#> EM: Iteration : 33 || log-likelihood : -2114.34424563452
#> EM: Iteration : 34 || log-likelihood : -2112.84844186712
#> EM: Iteration : 35 || log-likelihood : -2110.34494568025
#> EM: Iteration : 36 || log-likelihood : -2108.81734757025
#> EM: Iteration : 37 || log-likelihood : -2106.26527191053
#> EM: Iteration : 38 || log-likelihood : -2104.96591147986
#> EM: Iteration : 39 || log-likelihood : -2102.43927829964
#> EM: Iteration : 40 || log-likelihood : -2101.27820194404
#> EM: Iteration : 41 || log-likelihood : -2098.81151697567
#> EM: Iteration : 42 || log-likelihood : -2097.48008514591
#> EM: Iteration : 43 || log-likelihood : -2094.98259556552
#> EM: Iteration : 44 || log-likelihood : -2093.66517040802
#> EM: Iteration : 45 || log-likelihood : -2091.23625905564
#> EM: Iteration : 46 || log-likelihood : -2089.91118603989
#> EM: Iteration : 47 || log-likelihood : -2087.67388435026
#> EM: Iteration : 48 || log-likelihood : -2086.11373786756
#> EM: Iteration : 49 || log-likelihood : -2083.84931461869
#> EM: Iteration : 50 || log-likelihood : -2082.16175664198
#> EM: Iteration : 51 || log-likelihood : -2080.45137011098
#> EM: Iteration : 52 || log-likelihood : -2078.37066132008
#> EM: Iteration : 53 || log-likelihood : -2077.06827662071
#> EM: Iteration : 54 || log-likelihood : -2074.66718553694
#> EM: Iteration : 55 || log-likelihood : -2073.68137124781
#> EM: Iteration : 56 || log-likelihood : -2071.20390017789
#> EM: Iteration : 57 || log-likelihood : -2069.88260759288
#> EM: Iteration : 58 || log-likelihood : -2067.30246728287
#> EM: Iteration : 59 || log-likelihood : -2066.08897944236