`vignettes/Dynamic_Predictions.Rmd`

Based on the general framework of joint models presented earlier, we are interested in deriving cumulative risk probabilities for a new subject \(j\) that has survived up to time point \(t\) and has provided longitudinal measurements \(\mathcal Y_{kj}(t) = \{ y_{kj}(t_{jl}); 0 \leq t_{jl} \leq t, l = 1, \ldots, n_j, k = 1, \ldots, K\}\), with \(K\) denoting the number of longitudinal outcomes. The probabilities of interest are \[\begin{array}{l} \pi_j(u \mid t) = \mbox{Pr}\{T_j^* \leq u \mid T_j^* > t, \mathcal Y_j(t), \mathcal D_n\}\\\\ = \displaystyle 1 - \int\int \frac{S(u \mid b_j, \theta)}{S(t \mid b_j, \theta)} \; p\{b_j \mid T_j^* > t, \mathcal Y_j(t), \theta\} \; p(\theta \mid \mathcal D_n) \; db_j d\theta, \end{array}\] where \(S(\cdot)\) denotes the survival function conditional on the random effects and \(\mathcal Y_j(t) = \{\mathcal Y_{1j}(t), \ldots, \mathcal Y_{Kj}(t)\}\). Combining the three terms in the integrand, we can devise a Monte Carlo scheme to obtain estimates of these probabilities, namely,

1. Sample a value \(\tilde \theta\) from the posterior of the parameters \([\theta \mid \mathcal D_n]\).

2. Sample a value \(\tilde b_j\) from the posterior of the random effects \([b_j \mid T_j^* > t, \mathcal Y_j(t), \tilde \theta]\).

3. Compute the ratio of survival probabilities \(S(u \mid \tilde b_j, \tilde \theta) \Big / S(t \mid \tilde b_j, \tilde \theta)\).

Replicating these steps \(L\) times, we can estimate the conditional cumulative risk probabilities by \[1 - \frac{1}{L} \sum_{l=1}^L \frac{S(u \mid \tilde b_j^{(l)}, \tilde \theta^{(l)})}{S(t \mid \tilde b_j^{(l)}, \tilde \theta^{(l)})},\] and their standard error by calculating the standard deviation across the Monte Carlo samples.
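As a small illustration, the estimator above reduces to a few lines of R once the \(L\) draws of the two conditional survival probabilities are available. The vectors below are simulated stand-ins, not output from a fitted model; in practice they come from Steps 1-3:

```
set.seed(42)
L <- 200L
# toy stand-ins for S(t | b_j^(l), theta^(l)) and S(u | b_j^(l), theta^(l))
S_t <- runif(L, 0.7, 0.9)
S_u <- S_t * runif(L, 0.6, 0.95)  # S(u) <= S(t) because u > t
ratio <- S_u / S_t                # conditional survival from t to u
pi_hat <- 1 - mean(ratio)         # estimated cumulative risk pi_j(u | t)
se_hat <- sd(1 - ratio)           # SD across the Monte Carlo samples
```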

We will illustrate the calculation of dynamic predictions using package **JMbayes2** and a trivariate joint model fitted to the PBC dataset for the longitudinal outcomes `serBilir` (continuous), `prothrombin` time (continuous), and `ascites` (dichotomous). We start by fitting the univariate mixed models. For the two continuous outcomes, we allow for nonlinear subject-specific time effects using natural cubic splines. For `ascites`, we postulate linear subject-specific profiles for the log odds. The code is:

```
fm1 <- lme(log(serBilir) ~ ns(year, 3) * sex, data = pbc2,
           random = ~ ns(year, 3) | id, control = lmeControl(opt = 'optim'))
fm2 <- lme(prothrombin ~ ns(year, 2) * sex, data = pbc2,
           random = ~ ns(year, 2) | id, control = lmeControl(opt = 'optim'))
fm3 <- mixed_model(ascites ~ year * sex, data = pbc2,
                   random = ~ year | id, family = binomial())
```

Next, we fit the Cox model for the time to either transplantation or death. The first line defines the composite event indicator, and the second fits the Cox model, in which we have also included the baseline covariates `drug` and `age`. The code is:

```
pbc2.id$event <- as.numeric(pbc2.id$status != "alive")
CoxFit <- coxph(Surv(years, event) ~ drug + age, data = pbc2.id)
```

The joint model is fitted with the following call to `jm()`:
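A minimal sketch of that call, using only the required arguments (the actual chunk may also set MCMC control options): `jm()` takes the fitted survival model, the list of mixed models, and the name of the time variable in the longitudinal data.

```
# sketch: fitted Cox model, list of mixed models, longitudinal time variable
jointFit <- jm(CoxFit, list(fm1, fm2, fm3), time_var = "year")
```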

We want to calculate predictions for the longitudinal and survival outcomes for Patients 25 and 93. As a first step, we extract the data of these patients and store them in the data.frame `ND` with the code:
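A minimal sketch of that chunk, assuming the landmark time is stored in `t0` (line three keeps the first five years of follow-up; lines four and five set the patients as event-free up to `t0`):

```
t0 <- 5
ND <- pbc2[pbc2$id %in% c(25, 93), ]  # data of Patients 25 and 93
ND <- ND[ND$year < t0, ]              # measurements in the first five years
ND$status <- 0                        # event-free up to t0 ...
ND$years <- t0                        # ... i.e., censored at the landmark
```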

We will only use the first five years of follow-up (line three) and specify that the patients were event-free up to this point (lines four and five).

We start with predictions for the longitudinal outcomes. These are produced by the `predict()` method for class `jm` objects and follow the same lines as the procedure described above for cumulative risk probabilities. The only difference is in Step 3, where instead of calculating the cumulative risk, we calculate the predicted values for the longitudinal outcomes. There are two options controlled by the `type_pred` argument, namely predictions at the scale of the response/outcome (default) or at the linear predictor level. The `type` argument controls whether the predictions will be for the mean subject (i.e., including only the fixed effects) or subject-specific, including both the fixed and random effects. In the `newdata` argument we provide the available measurements of the two patients. These will be used to sample their random effects in Step 2, presented above. This is done with a Metropolis-Hastings algorithm that runs for `n_mcmc` iterations; all iterations but the last one are discarded as burn-in. Finally, argument `n_samples` corresponds to the value of \(L\) defined above and specifies the number of Monte Carlo samples:

`predLong1 <- predict(jointFit, newdata = ND, return_newdata = TRUE)`

Argument `return_newdata` specifies that the predictions are returned as extra columns of the `newdata` data.frame. By default, the 95% credible intervals are also included. Using the `plot()` method for objects returned by `predict.jm(..., return_newdata = TRUE)`, we can display the predictions. With the following code, we do that for the first longitudinal outcome:

`plot(predLong1)`

When we want to calculate predictions for other future time points, we can accordingly specify the `times` argument. In the following example, we calculate predictions from time `t0` to time 12:

```
predLong2 <- predict(jointFit, newdata = ND,
                     times = seq(t0, 12, length.out = 51),
                     return_newdata = TRUE)
```

We show these predictions for the second outcome and the second patient (i.e., Patient 93). This is achieved by suitably specifying the `outcomes` and `subject` arguments of the `plot()` method:

`plot(predLong2, outcomes = 2, subject = 93)`

We continue with the predictions for the event outcome. To let `predict()` know that we want the cumulative risk probabilities, we specify `process = "event"`:

```
predSurv <- predict(jointFit, newdata = ND, process = "event",
                    times = seq(t0, 12, length.out = 51),
                    return_newdata = TRUE)
```

The predictions are again included as extra columns in the corresponding data.frame. To depict the predictions of both the longitudinal and survival outcomes combined, we provide both objects to the `plot()` method:

`plot(predLong2, predSurv)`

Again, by default, the plot is for the predictions of the first subject (i.e., Patient 25) and the first longitudinal outcome (i.e., `log(serBilir)`). However, the `plot()` method has a series of arguments that allow users to customize the plot. We illustrate some of these capabilities with the following figure. First, we specify that we want to depict all three outcomes using `outcomes = 1:3` (note: a maximum of three outcomes can be simultaneously displayed). Next, we specify via the `subject` argument that we want to show the predictions of Patient 93. Note that for serum bilirubin, we used the log transformation in the specification of the linear mixed model. Hence, we receive predictions on the transformed scale. To show predictions on the original scale, we use the `fun_long` argument. Because we have three outcomes, this needs to be a list of three functions. The first one, corresponding to serum bilirubin, is `exp()`, and for the other two we use `identity()` because we do not wish to transform the predictions. Analogously, we also have the `fun_event` argument to transform the predictions for the event outcome; in the example below, we use it to obtain survival probabilities. Using the arguments `bg`, `col_points`, `col_line_long`, `col_line_event`, `fill_CI_long`, and `fill_CI_event`, we have changed the appearance of the plot to a dark theme. Finally, `pos_ylab_long` specifies the relative position of the y-axis labels for the three longitudinal outcomes.

```
cols <- c('#F25C78', '#D973B5', '#F28322')
plot(predLong2, predSurv, outcomes = 1:3, subject = 93,
     fun_long = list(exp, identity, identity),
     fun_event = function (x) 1 - x,
     ylab_event = "Survival Probabilities",
     ylab_long = c("Serum Bilirubin", "Prothrombin", "Ascites"),
     bg = '#132743', col_points = cols, col_line_long = cols,
     col_line_event = '#F7F7FF', col_axis = "white",
     fill_CI_long = c("#F25C7880", "#D973B580", "#F2832280"),
     fill_CI_event = "#F7F7FF80",
     pos_ylab_long = c(1.9, 1.9, 0.08))
```

We evaluate the discriminative capability of the model using ROC methodology. We calculate the components of the ROC curve using information up to year five, and we are interested in events occurring within a three-year window, that is, discriminating between patients who will experience the event in the interval `(t0, t0 + Dt]` (i.e., in our case \(T_j \in (5, 8]\)) and patients who will survive at least 8 years (i.e., \(T_j > 8\)). The calculations are performed with the following call to `tvROC()`:

```
pbc2$event <- as.numeric(pbc2$status != "alive")
roc <- tvROC(jointFit, newdata = pbc2, Tstart = t0, Dt = 3)
roc
#>
#> Time-dependent Sensitivity and Specificity for the Joint Model jointFit
#>
#> At time: 8
#> Using information up to time: 5 (202 subjects still at risk)
#> Accounting for censoring using model-based weights
#>
#> cut-off SN SP qSN qSP
#> 1 0.03 0.01513 0.99812 0.01023 0.620511
#> 2 0.08 0.03648 0.99812 0.02684 0.810255
#> 3 0.09 0.03648 0.99168 0.02195 0.439547
#> 4 0.11 0.10053 0.99168 0.07299 0.719773
#> 5 0.12 0.11871 0.99072 0.08707 0.732218
#> 6 0.13 0.11871 0.98428 0.08237 0.602957
#> 7 0.14 0.11871 0.97783 0.07761 0.502421
#> 8 0.16 0.15223 0.97506 0.10341 0.542026
#> 9 0.18 0.17358 0.97506 0.12139 0.580191
#> 10 0.20 0.21628 0.97506 0.15792 0.640163
#> 11 0.21 0.23763 0.97506 0.17648 0.664152
#> 12 0.25 0.25898 0.97506 0.19524 0.685143
#> 13 0.27 0.28033 0.97506 0.21420 0.703664
#> 14 0.30 0.32303 0.97506 0.25275 0.734857
#> 15 0.31 0.34438 0.96862 0.26832 0.698115
#> 16 0.33 0.37465 0.96486 0.29429 0.691398
#> 17 0.36 0.41735 0.95842 0.33127 0.676934
#> 18 0.37 0.41842 0.95230 0.32869 0.643114
#> 19 0.40 0.43977 0.95230 0.34962 0.655860
#> 20 0.44 0.47217 0.94919 0.38011 0.657861
#> 21 0.46 0.49352 0.94919 0.40170 0.668897
#> 22 0.48 0.53622 0.94919 0.44566 0.688964
#> 23 0.52 0.53622 0.94274 0.44236 0.659822
#> 24 0.53 0.53622 0.93630 0.43902 0.632345
#> 25 0.55 0.53622 0.92985 0.43564 0.606395
#> 26 0.56 0.57892 0.92985 0.48135 0.627111
#> 27 0.58 0.60027 0.92985 0.50463 0.636672
#> 28 0.65 0.60027 0.92341 0.50157 0.613208
#> 29 0.66 0.62162 0.92341 0.52526 0.622642 *
#> 30 0.67 0.62430 0.91133 0.52270 0.583445
#> 31 0.68 0.62430 0.90488 0.51968 0.563324
#> 32 0.69 0.62467 0.89855 0.51709 0.544598
#> 33 0.70 0.62467 0.89210 0.51400 0.526196
#> 34 0.71 0.65738 0.86975 0.54166 0.484122
#> 35 0.72 0.66080 0.85145 0.53703 0.444314
#> 36 0.73 0.68215 0.84501 0.56023 0.440912
#> 37 0.74 0.68626 0.83980 0.56292 0.432278
#> 38 0.75 0.68828 0.82108 0.55656 0.397622
#> 39 0.76 0.69336 0.80328 0.55438 0.369233
#> 40 0.78 0.71471 0.78394 0.57312 0.348598
#> 41 0.79 0.71977 0.77258 0.57439 0.334217
#> 42 0.80 0.72025 0.76628 0.57190 0.325547
#> 43 0.82 0.72770 0.74919 0.57360 0.305983
#> 44 0.83 0.72997 0.73699 0.57050 0.291619
#> 45 0.84 0.75693 0.71935 0.60082 0.282385
#> 46 0.85 0.77905 0.70025 0.62495 0.270479
#> 47 0.86 0.80262 0.68802 0.65628 0.267221
#> 48 0.87 0.80680 0.66995 0.65464 0.250905
#> 49 0.88 0.80680 0.66351 0.65155 0.244763
#> 50 0.89 0.83361 0.63938 0.68588 0.233199
#> 51 0.90 0.85678 0.60770 0.71356 0.215402
#> 52 0.91 0.86059 0.55729 0.69719 0.179569
#> 53 0.92 0.88371 0.52560 0.72999 0.166709
#> 54 0.93 0.88684 0.42987 0.68252 0.114110
#> 55 0.94 0.92954 0.40409 0.78434 0.114903
#> 56 0.95 0.95219 0.32715 0.81776 0.087809
#> 57 0.96 0.97587 0.24407 0.87503 0.063201
#> 58 0.97 0.97829 0.15457 0.82457 0.035157
#> 59 0.98 0.97829 0.09656 0.72589 0.018850
#> 60 0.99 0.99989 0.01286 0.98859 0.002984
```

In the first line we define the event indicator as we did in the `pbc2.id` data.frame. The cut-point marked with an asterisk maximizes Youden's index. To depict the ROC curve, we use the corresponding `plot()` method:
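The call itself is the `plot()` method applied to the `tvROC` object created above:

```
plot(roc)
```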

The area under the ROC curve is calculated with the `tvAUC()` function:

```
tvAUC(roc)
#>
#> Time-dependent AUC for the Joint Model jointFit
#>
#> Estimated AUC: 0.8282
#> At time: 8
#> Using information up to time: 5 (202 subjects still at risk)
#> Accounting for censoring using model-based weights
```

This function accepts either an object of class `tvROC` or of class `jm`. In the latter case, the user must also provide the `newdata`, `Tstart`, and `Dt` or `Thoriz` arguments. Here we have used the same dataset as the one used to fit the model but, in principle, discrimination could be (better) assessed in another dataset.
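For instance, a sketch of the equivalent call on the model object directly, reusing the objects defined earlier:

```
# same time window as above: information up to t0, three-year horizon
tvAUC(jointFit, newdata = pbc2, Tstart = t0, Dt = 3)
```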

To assess the accuracy of the predictions, we produce a calibration plot:

`calibration_plot(jointFit, newdata = pbc2, Tstart = t0, Dt = 3)`

The syntax of the `calibration_plot()` function is almost identical to that of `tvROC()`. The kernel density estimation is of the estimated probabilities \(\pi_j(t + \Delta t \mid t) = \pi_j(8 \mid 5)\) for all individuals at risk at year `t0` in the data frame provided in the `newdata` argument. Using the `calibration_metrics()` function, we can also calculate metrics for the accuracy of predictions:

```
calibration_metrics(jointFit, pbc2, Tstart = 5, Dt = 3)
#> ICI E50 E90
#> 0.06123482 0.05621041 0.11553538
```

The ICI is the mean absolute difference between the observed and predicted probabilities, E50 is the median absolute difference, and E90 is the 90th percentile of the absolute differences. Finally, we calculate the Brier score as an overall measure of predictive performance. This is computed with the `tvBrier()` function:

```
tvBrier(jointFit, newdata = pbc2, Tstart = t0, Dt = 3)
#>
#> Prediction Error for the Joint Model 'jointFit'
#>
#> Estimated Brier score: 0.1242
#> At time: 8
#> For the 202 subjects at risk at time 5
#> Number of subjects with an event in [5, 8): 40
#> Number of subjects with a censored time in [5, 8): 58
#> Accounting for censoring using model-based weights
```

The Brier score evaluates the predictive accuracy at time `Tstart + Dt`. To summarize the predictive accuracy in the interval `(t0, t0 + Dt]`, we can use the integrated Brier score. The corresponding integral is approximated using Simpson's rule:

```
tvBrier(jointFit, newdata = pbc2, Tstart = t0, Dt = 3, integrated = TRUE)
#>
#> Prediction Error for the Joint Model 'jointFit'
#>
#> Estimated Integrated Brier score: 0.0829
#> In the time interval: [5, 8)
#> For the 202 subjects at risk at time 5
#> Number of subjects with an event in [5, 8): 40
#> Number of subjects with a censored time in [5, 8): 58
#> Accounting for censoring using model-based weights
```

The `tvBrier()` and `tvROC()` functions also implement inverse probability of censoring weights to account for censoring in the interval `(t0, t0 + Dt]`, using the Kaplan-Meier estimate of the censoring distribution (however, see the note below):

```
tvBrier(jointFit, newdata = pbc2, Tstart = t0, Dt = 3, integrated = TRUE,
        type_weights = "IPCW")
#>
#> Prediction Error for the Joint Model 'jointFit'
#>
#> Estimated Integrated Brier score: 0.0834
#> In the time interval: [5, 8)
#> For the 202 subjects at risk at time 5
#> Number of subjects with an event in [5, 8): 40
#> Number of subjects with a censored time in [5, 8): 58
#> Accounting for censoring using inverse probability of censoring Kaplan-Meier weights
```

**Notes:**

- To obtain valid estimates of the predictive accuracy measures (i.e., time-varying sensitivity, specificity, and Brier score), we need to account for censoring. A popular method to achieve this is inverse probability of censoring weighting. For this approach to be valid, we need the model for the weights to be correctly specified. In standard survival analysis, this is achieved using either the Kaplan-Meier estimator or a Cox model for the censoring distribution. However, in the settings where joint models are used, it is often the case that the censoring mechanism depends on the history of the longitudinal outcomes in a complex manner. This is especially the case when we consider multiple longitudinal outcomes in the analysis. Also, these outcomes may be recorded at different time points per patient and have missing data. For these reasons, in these settings, Kaplan-Meier-based or Cox-based censoring weights may be difficult to derive or be biased. The functions in **JMbayes2** that calculate the predictive accuracy measures use joint-model-based weights to account for censoring. These weights allow censoring to depend in any possible manner on the history of the longitudinal outcomes. However, they require that the model is appropriately calibrated.
- The calibration curve, produced by `calibration_plot()`, and the calibration metrics, produced by `calibration_metrics()`, are calculated using the procedure described in Austin et al., 2020.