{msm.stacked}

The {msm.stacked} package can be used to simplify the calculation of state transition probabilities over time and the creation of stacked probabilities plot from multi-state model fits from the {msm} package.

Installation

You can install the development version of {msm.stacked} from GitHub with:

# install.packages("devtools")
devtools::install_github("RedDoorAnalytics/msm.stacked", build_vignettes = TRUE)

Example

To illustrate the functionality of {msm.stacked}, we build upon the documentation for the msm::msm() function. We will be using the heart transplant data:

library(msm)
head(cav)
#>    PTNUM      age    years dage sex pdiag cumrej state firstobs statemax
#> 1 100002 52.49589 0.000000   21   0   IHD      0     1        1        1
#> 2 100002 53.49863 1.002740   21   0   IHD      2     1        0        1
#> 3 100002 54.49863 2.002740   21   0   IHD      2     2        0        2
#> 4 100002 55.58904 3.093151   21   0   IHD      2     2        0        2
#> 5 100002 56.49589 4.000000   21   0   IHD      3     2        0        2
#> 6 100002 57.49315 4.997260   21   0   IHD      3     3        0        3

Further details on this example dataset are included in the vignette of the {msm} package.

We start with a matrix of possible transitions:

twoway4.q <- rbind(
  c(-0.5, 0.25, 0, 0.25),
  c(0.166, -0.498, 0.166, 0.166),
  c(0, 0.25, -0.5, 0.25),
  c(0, 0, 0, 0)
)
twoway4.q
#>        [,1]   [,2]   [,3]  [,4]
#> [1,] -0.500  0.250  0.000 0.250
#> [2,]  0.166 -0.498  0.166 0.166
#> [3,]  0.000  0.250 -0.500 0.250
#> [4,]  0.000  0.000  0.000 0.000

This is then used to provide starting values for the model without additional covariates:

cav.msm <- msm(
  formula = state ~ years,
  subject = PTNUM,
  data = cav,
  qmatrix = twoway4.q,
  deathexact = 4
)
cav.msm
#> 
#> Call:
#> msm(formula = state ~ years, subject = PTNUM, data = cav, qmatrix = twoway4.q,     deathexact = 4)
#> 
#> Maximum likelihood estimates
#> 
#> Transition intensities
#>                   Baseline                    
#> State 1 - State 1 -0.17037 (-0.19027,-0.15255)
#> State 1 - State 2  0.12787 ( 0.11135, 0.14684)
#> State 1 - State 4  0.04250 ( 0.03412, 0.05294)
#> State 2 - State 1  0.22512 ( 0.16755, 0.30247)
#> State 2 - State 2 -0.60794 (-0.70880,-0.52143)
#> State 2 - State 3  0.34261 ( 0.27317, 0.42970)
#> State 2 - State 4  0.04021 ( 0.01129, 0.14324)
#> State 3 - State 2  0.13062 ( 0.07952, 0.21457)
#> State 3 - State 3 -0.43710 (-0.55292,-0.34554)
#> State 3 - State 4  0.30648 ( 0.23822, 0.39429)
#> 
#> -2 * log-likelihood:  3968.798

We can use the plot.msm() function to plot survival curves from every transient state to the final, absorbing state (e.g., a state denoting death). This is denoted in the cav dataset by state 4:

plot(cav.msm, from = 1:3, to = 4)

The {msm} package also provides functionality to calculate state transition probabilities at a given point in time. Say we are interested in estimating the probability of being in a given state, from each state, five years after baseline; we can use the pmatrix.msm() function to obtain just that:

pmatrix.msm(x = cav.msm, t = 5)
#>            State 1    State 2    State 3   State 4
#> State 1 0.51965804 0.13851775 0.09119847 0.2506257
#> State 2 0.24386420 0.13881410 0.18090731 0.4364144
#> State 3 0.06121333 0.06897186 0.16909991 0.7007149
#> State 4 0.00000000 0.00000000 0.00000000 1.0000000

This shows that, for instance, study participants in State 1 at time zero have (approximately) a 52% probability of still being in State 1 after years, 14% probability of being in State 2, 9% probability of being in State 3, and 25% probability of being in State 4.

We can repeatedly call the pmatrix.msm() function to obtain probabilities over time, but that’s a bit tedious. This is where the {msm.stacked} package comes in handy.

Specifically, we can use the stacked.data.msm() function to calculate transition probabilities over time, say, at 1 to 5 years:

library(msm.stacked)
sdd <- stacked.data.msm(model = cav.msm, tstart = 0, tforward = 5, tseqn = 6)
str(sdd)
#> 'data.frame':    96 obs. of  5 variables:
#>  $ from  : Factor w/ 4 levels "State 1","State 2",..: 1 2 3 4 1 2 3 4 1 2 ...
#>  $ to    : Factor w/ 4 levels "State 1","State 2",..: 1 1 1 1 2 2 2 2 3 3 ...
#>  $ p     : num  1 0 0 0 0 1 0 0 0 0 ...
#>  $ tstart: num  0 0 0 0 0 0 0 0 0 0 ...
#>  $ t     : num  0 0 0 0 0 0 0 0 0 0 ...

This returns a tidy dataset with all transition probabilities, from and to every state, over tseqn = 6 equally-spaced time intervals between time zero and time five. Focussing on transitions from State 1 only:

subset(sdd, sdd$from == "State 1")
#>       from      to          p tstart t
#> 1  State 1 State 1 1.00000000      0 0
#> 5  State 1 State 2 0.00000000      0 0
#> 9  State 1 State 3 0.00000000      0 0
#> 13 State 1 State 4 0.00000000      0 0
#> 17 State 1 State 1 0.85395872      0 1
#> 21 State 1 State 2 0.08836953      0 1
#> 25 State 1 State 3 0.01475543      0 1
#> 29 State 1 State 4 0.04291632      0 1
#> 33 State 1 State 1 0.74313989      0 2
#> 37 State 1 State 2 0.12669585      0 2
#> 41 State 1 State 3 0.04053779      0 2
#> 45 State 1 State 4 0.08962646      0 2
#> 49 State 1 State 1 0.65472323      0 3
#> 53 State 1 State 2 0.14064466      0 3
#> 57 State 1 State 3 0.06380519      0 3
#> 61 State 1 State 4 0.14082692      0 3
#> 65 State 1 State 1 0.58161960      0 4
#> 69 State 1 State 2 0.14256253      0 4
#> 73 State 1 State 3 0.08072247      0 4
#> 77 State 1 State 4 0.19509541      0 4
#> 81 State 1 State 1 0.51965804      0 5
#> 85 State 1 State 2 0.13851775      0 5
#> 89 State 1 State 3 0.09119847      0 5
#> 93 State 1 State 4 0.25062574      0 5

Here we see, for instance, that the probability of still being in State 1, starting from State 1, is (approximately) 85% after one year, 74% after two years, 65% after three years, 58% after four years, and 52% after five years:

subset(sdd, sdd$from == "State 1" & sdd$to == "State 1")
#>       from      to         p tstart t
#> 1  State 1 State 1 1.0000000      0 0
#> 17 State 1 State 1 0.8539587      0 1
#> 33 State 1 State 1 0.7431399      0 2
#> 49 State 1 State 1 0.6547232      0 3
#> 65 State 1 State 1 0.5816196      0 4
#> 81 State 1 State 1 0.5196580      0 5

The package also provides functionality to automatically produce stacked probabilities plots, for transition probabilities from and to every state. This is implemented in the stacked.plot.msm() function:

stacked.plot.msm(model = cav.msm, tstart = 0, tforward = 5)

This relies on {ggplot2} functionality and returns a standard ggplot object, which can of course be further customised beyond the default settings:

library(ggplot2)

stacked.plot.msm(model = cav.msm, tstart = 0, tforward = 5) +
  scale_fill_viridis_d(option = "plasma") +
  theme_minimal() +
  theme(legend.position = "bottom") +
  labs(fill = "To:")

Model with Covariates

We can of course incorporate covariates in a multi-state model and obtain predictions for a specific covariates pattern; let’s demonstrate this by incorporating sex in the model above. First, we fit a second model:

cav.msm.cov <- msm(
  formula = state ~ years,
  subject = PTNUM,
  data = cav,
  covariates = ~sex,
  qmatrix = twoway4.q,
  deathexact = 4
)
cav.msm.cov
#> 
#> Call:
#> msm(formula = state ~ years, subject = PTNUM, data = cav, qmatrix = twoway4.q,     covariates = ~sex, deathexact = 4)
#> 
#> Maximum likelihood estimates
#> Baselines are with covariates set to their means
#> 
#> Transition intensities with hazard ratios for each covariate
#>                   Baseline                        
#> State 1 - State 1 -0.16938 (-1.894e-01,-1.515e-01)
#> State 1 - State 2  0.12745 ( 1.108e-01, 1.466e-01)
#> State 1 - State 4  0.04193 ( 3.354e-02, 5.241e-02)
#> State 2 - State 1  0.22645 ( 1.686e-01, 3.042e-01)
#> State 2 - State 2 -0.58403 (-1.053e+00,-3.238e-01)
#> State 2 - State 3  0.33693 ( 2.697e-01, 4.209e-01)
#> State 2 - State 4  0.02065 ( 2.196e-09, 1.941e+05)
#> State 3 - State 2  0.13050 ( 7.830e-02, 2.175e-01)
#> State 3 - State 3 -0.44178 (-5.582e-01,-3.497e-01)
#> State 3 - State 4  0.31128 ( 2.425e-01, 3.996e-01)
#>                   sex                            
#> State 1 - State 1                                
#> State 1 - State 2 0.5632779 (3.333e-01,9.518e-01)
#> State 1 - State 4 1.1289701 (6.262e-01,2.035e+00)
#> State 2 - State 1 1.2905854 (4.916e-01,3.388e+00)
#> State 2 - State 2                                
#> State 2 - State 3 1.0765518 (5.194e-01,2.231e+00)
#> State 2 - State 4 0.0003805 (7.241e-65,1.999e+57)
#> State 3 - State 2 1.0965531 (1.345e-01,8.937e+00)
#> State 3 - State 3                                
#> State 3 - State 4 2.4135380 (1.176e+00,4.952e+00)
#> 
#> -2 * log-likelihood:  3954.777

Then, we can use the same functionality as before to obtain stacked probabilities plots:

stacked.plot.msm(model = cav.msm.cov, tstart = 0, tforward = 5) +
  labs(title = "Predictions for average covariates")

By default, this will set all cova

Msm.stacked

Install / Use

README

{msm.stacked}

Installation

Example

Model with Covariates