## Recurrent events

### Introduction

JMbayes2 also provides the capability to fit joint models with a recurrent event process, possibly combined with a terminating event. Recurrent events are correlated events that may occur more than once over the follow-up period for a given subject. Our current implementation allows for multiple longitudinal markers with different distributions and various functional forms to link these markers with the risk of recurrent and terminal events. Furthermore, it enables the risk intervals to be defined in terms of the gap or calendar timescales. The two timescales use a different zero-time reference. The calendar uses the study entry, while the gap uses the end of the previous event (Figure 1). Gap assumes a renewal after each event and resets the time to zero.

The model also accommodates discontinuous risk intervals, i.e., periods in which the subject is not at risk of experiencing the recurring event (Figure 2). For example, while a patient is in the hospital, they are not at risk of being hospitalized again.

A joint model with $$p$$ normally distributed longitudinal markers, a terminal event process, and a recurrent event process can be described as follows: $\small{\require{color} \begin{cases} y_{1_i}(t)= \colorbox{#79FCFD80}{ x_{1_i}(t)^\top\beta_1 + z_{1_i}(t)^\top b_{1_i}} + \varepsilon_1(t) = \colorbox{#79FCFD80}{ \eta_{1_i}(t)} + \varepsilon_1(t) & \text{Longitudinal marker 1}\\ \vdots \\ y_{p_i}(t)= \colorbox{#FCFF5580}{ x_{p_i}(t)^\top\beta_p + z_{p_i}(t)^\top b_{p_i}} + \varepsilon_p(t) = \colorbox{#FCFF5580}{ \eta_{p_i}(t)} + \varepsilon_p(t) & \text{Longitudinal marker p}\\ h_{T_i}(t)= h_{T_0}(t)\exp\left\{ w_{T_i}(t)^\top \gamma_T + \colorbox{#79FCFD80}{\mathcal{f}_{T_1}\left\{\eta_{1_i}(t)\right\}} \alpha_{T_1} + \dots + \colorbox{#FCFF5580}{\mathcal{f}_{T_p}\left\{\eta_{p_i}(t)\right\}} \alpha_{T_p} + \colorbox{#5FFFAB80}{b_{F_i}} \alpha_{F} \right\} & \text{Terminal event}\\ h_{R_i}(t)= h_{R_0}(t)\exp\left\{ w_{R_i}(t)^\top \gamma_R + \colorbox{#79FCFD80}{\mathcal{f}_{R_1}\left\{\eta_{2_i}(t)\right\}} \alpha_{R_1} + \dots + \colorbox{#FCFF5580}{\mathcal{f}_{R_p}\left\{\eta_{p_i}(t)\right\}} \alpha_{R_p} + \colorbox{#5FFFAB80}{b_{F_i}} \right\} & \text{Recurrent event}\\ \end{cases}, }$
$\begin{pmatrix} \colorbox{#79FCFD80}{b_{1_i}} \\ \vdots \\ \colorbox{#FCFF5580}{b_{p_i}} \\ \colorbox{#5FFFAB80}{b_{F_i}}\end{pmatrix} \sim MVN \left(0, \begin{pmatrix}D & 0 \\ & \sigma^2_F\end{pmatrix}\right), \qquad \varepsilon(t) \sim N \left(0, \sigma^2\right),$

where $$i = 1, \ldots, n$$. We specify linear mixed-effects models for the longitudinal outcomes, and for the terminal and recurrence processes, we use proportional hazard models. The longitudinal and event time processes are linked via a latent structure of random effects, highlighted by the same color in the equations above. The terms $$\mathcal{f}_{R_j}\left\{\eta_{j_i}(t)\right\}$$ and $$\mathcal{f}_{R_j}\left\{\eta_{j_i}(t)\right\}$$ describe the functional forms that link the longitudinal marker $$j$$ with the risk of the recurrent and terminal events, respectively. The frailty $$b_{F_i}$$ is a random effect that accounts for the correlations in the recurrent events. The coefficient $$\alpha_{F}$$ quantifies the strength of the association between the terminal and recurrent event processes. For notational simplicity, in the formulation presented above, we have shown normally distributed longitudinal outcomes; however, JMbayes2 provides the option to consider longitudinal outcomes with different distributions.

### Example

#### Data

We simulate data from a joint model with three outcomes: one longitudinal outcome, one terminal failure time, and one recurrent failure time. We assume that the underlying value of the longitudinal outcome is associated with both risk models and use the gap timescale. The reader can easily extend this example to accommodate multiple longitudinal markers with other forms of association, including competing risks, considering only the recurrent events process, or using a different timescale.

gen_data <- function(){
n <- 500 # desired number of subjects
n_i <- 15  # number of (planned) measurements per subject
tmax <- 7 # maximum follow-up time (type I censoring)
scale <- "gap" # hazard timescale
##############################################################################
n_scl <- 1.5
n_target <- n
n <- n * n_scl
# longitudinal outcome 1/2
## param true values
betas <- c("Intercept" = 6.94, "Time1" = 1.30, "Time2" = 1.84, "Time3" = 1.82)
sigma_y <- 0.6 # measurement error sd
D <- matrix(0, 4, 4)
D[lower.tri(D, TRUE)] <- c(0.71, 0.33, 0.07, 1.26, 2.68, 3.81, 4.35, 7.62, 5.4, 8)
D <- D + t(D)
diag(D) <- diag(D) * 0.5
b <- MASS::mvrnorm(n, rep(0, nrow(D)), D)
Bkn <- c(0, 7)
kn <- c(1, 3)
remove(D)
##############################################################################
# terminal outcome
## param true values
gammas_t <- c("(Intercept)" = -9, "Group" = 0.5, "Age" = 0.05) # phi = exp(Intercept)
sigma_t <- 2
alpha_t <- 0.5 # association biomarker
alphaF <- 0.25 # association frailty
sigmaF <- 0.25 # frailty SD
frailty <- rnorm(n, mean = 0, sd = sigmaF)
## terminal data
group <- rep(0:1, each = n/2)
age <- runif(n, 30, 70)
W_t <- cbind("(Intercept)" = 1, "Group" = group, "Age" = age)
eta_t <- as.vector(W_t %*% gammas_t + alphaF * frailty)
invS_t <- function(t, u, i) {
h <- function(s) {
NS <- splines::ns(s, knots = kn, Boundary.knots = Bkn)
X <- cbind(1, NS)
Z <- cbind(1, NS)
eta_y <- as.vector(X %*% betas + rowSums(Z * b[rep(i, nrow(Z)), ]))
exp(log(sigma_t) + (sigma_t - 1) * log(s) + eta_t[i] + eta_y * alpha_t)
}
integrate(h, lower = 0, upper = t)$value + log(u) } u_t <- runif(n) ter_times <- numeric(n) for(i in seq_len(n)) { root <- try(uniroot(invS_t, interval = c(1e-05, 250), # arbitrary upper limit u = u_t[i], i = i)$root, TRUE)
ter_times[i] <- if (!inherits(root, "try-error")) root else NA
}
ter_na <- !is.na(ter_times)
if(sum(ter_na) < n_target) stop("Not enough patients. Increase 'n_scl'.")
rmv_ids <- sample(which(ter_na), sum(ter_na) - n_target)
ter_na[rmv_ids] <- FALSE # remove the excess of subjects
ter <- data.frame(id    = seq_len(n)[ter_na],
tstop = ter_times[ter_na],
group = group[ter_na],
age   = age[ter_na])
frailty <- frailty[ter_na]
b <- b[ter_na, , drop = FALSE]
cens_times <- tmax
ter$status <- as.numeric(ter$tstop <= cens_times) # event indicator
ter$tstop <- pmin(ter$tstop, cens_times) # add censoring time
remove(gammas_t, sigma_t, group, W_t, eta_t, alpha_t, invS_t, u_t, i, root,
n_target, rmv_ids, ter_times, cens_times, n, alphaF, age, ter_na,
sigmaF)
##############################################################################
# recurring outcome
## param true values
gammas_r <- c("(Intercept)" = -9+3, "Group" = 0.5, "Age" = 0.05) # phi = exp(Intercept)
sigma_r <- 2
alpha_r <- 0.5 # association biomarker
## recurring data
W_r <- cbind("(Intercept)" = 1, "Group" = ter$group, "Age" = ter$age)
eta_r <- as.vector(W_r %*% gammas_r + frailty)
if(scale == "gap") {
invS_r <- function(t, u, i, tstart) {
h <- function(s) {
NS <- splines::ns(s + tstart, knots = kn, Boundary.knots = Bkn)
X <- cbind(1, NS)
Z <- cbind(1, NS)
eta_y <- as.vector(X %*% betas + rowSums(Z * b[rep(i, nrow(Z)), ]))
exp(log(sigma_r) + (sigma_r - 1) * log(s) + eta_r[i] + eta_y * alpha_r)
}
integrate(h, lower = 0, upper = t)$value + log(u) } } else if(scale == "calendar") { invS_r <- function(t, u, i, tstart) { h <- function(s) { NS <- splines::ns(s + tstart, knots = kn, Boundary.knots = Bkn) X <- cbind(1, NS) Z <- cbind(1, NS) eta_y <- as.vector(X %*% betas + rowSums(Z * b[rep(i, nrow(Z)), ])) exp(log(sigma_r) + (sigma_r - 1) * log(s + tstart) + eta_r[i] + eta_y * alpha_r) } integrate(h, lower = 0, upper = t)$value + log(u)
}
}
stop_times <- start_times <- id_times <- list()
j <- 1
for(i in seq_along(ter$id)) { tstart <- 0 while(!is.na(tstart) & tstart < ter$tstop[i]) {
u_r <- runif(1)
root <- try(uniroot(invS_r, interval = c(1e-05, 250), # arbitrary upper limit
u = u_r, i = i, tstart = tstart)$root, TRUE) tstop <- if(!inherits(root, "try-error")) root else NA start_times[[j]] <- tstart stop_times[[j]] <- tstart + tstop dur <- runif(1, 0, 0.1) # recurrent event duration tstart <- tstart + tstop + dur id_times[[j]] <- ter$id[i]
j <- j + 1
}
}
rec <- data.frame(id     = unlist(id_times),
tstart = unlist(start_times),
tstop  = unlist(stop_times))
rec$id <- match(rec$id, unique(rec$id)) # rename IDs rec$group <- ter$group[rec$id]
rec$age <- ter$age[rec$id] rec$Stime <- ter$tstop[rec$id]
rec$status <- as.numeric(!is.na(rec$tstop) & rec$tstop < rec$Stime)  # event indicator
rec$tstop <- pmin(rec$tstop, rec$Stime, na.rm = TRUE) # add cens time rec$Stime <- NULL
ter$id <- seq_along(ter$id)
remove(gammas_r, sigma_r, W_r, eta_r, alpha_r, invS_r, stop_times, start_times,
id_times, dur, j, i, tstart, u_r, root, tstop)
##############################################################################
# longitudinal outcome 2/2
long <- data.frame(id   = rep(ter$id, each = n_i), time = c(replicate(length(ter$id), c(0, sort(runif(n_i - 1, 1, tmax))))))
X <- model.matrix(~ 1 + splines::ns(time, knots = kn, Boundary.knots = Bkn),
data = long)
Z <- model.matrix(~ 1 + splines::ns(time, knots = kn, Boundary.knots = Bkn),
data = long)
eta_y <- as.vector(X %*% betas + rowSums(Z * b[long$id, ])) long$y <- rnorm(length(eta_y), eta_y, sigma_y)
long_cens <- long$time <= rep(ter$tstop, times = rle(long$id)$lengths)
long <- long[long_cens, , drop = FALSE] # drop censored encounters
remove(kn, Bkn, X, betas, Z, b, eta_y, sigma_y, n_i, tmax, long_cens, scale)
##############################################################################
# return
list(long = long, ter = ter, rec = rec)
}
set.seed(2022); fake_data <- gen_data()
term_data <- fake_data$ter # terminal event data recu_data <- fake_data$rec # recurrent events data
lme_data <- fake_data$long # longitudial marker data We now have three data frames, each one corresponding to a different outcome. To fit the joint model, the user must organize the failure-time data in the counting process formulation by combining the data for the terminal and recurrent events. Then, a strata variable is used to distinguish between the two processes. To facilitate this, we provide in the package the rc_setup() function: cox_data <- rc_setup(rc_data = recu_data, trm_data = term_data, idVar = "id", statusVar = "status", startVar = "tstart", stopVar = "tstop", trm_censLevel = 0, nameStrata = "strata", nameStatus = "status") Each subject has as many rows in the new data frame as the number of their recurrent risk periods plus one for the terminal event. The data frame follows the counting process formulation with the risk intervals delimited by start and stop variables. The strata variable denotes the type of event, 1 if recurrent, or 2 terminal. The status equals 1 if the subject had an event and 0 otherwise. As shown below and in Figure 3, subject 1 experienced seven recurrent events during the follow-up; the terminal event censored the eighth recurrent event. cox_data[cox_data$id == 1, c("id", "tstart", "tstop", "status", "strata")]
#>   id    tstart     tstop status strata
#> 1  1 0.0000000 0.3756627      1      R
#> 2  1 0.4275324 0.7832841      1      R
#> 3  1 0.8724938 1.0863887      1      R
#> 4  1 1.1212953 1.8741434      1      R
#> 5  1 1.9372355 2.7843451      1      R
#> 6  1 2.7906559 3.4618622      1      R
#> 7  1 3.5166929 3.5830169      1      R
#> 8  1 3.6251219 4.0375415      0      R
#> 9  1 0.0000000 4.0375415      1     T1

#### Fitting the model

The user then needs to use the nlme::lme() function first to fit the linear mixed model that describes the longitudinal outcome,

lme_fit <- lme(y ~ ns(time, k =  c(1, 3), B = c(0, 7)),
random = list(id = pdDiag(form = ~ ns(time, k = c(1, 3),
B = c(0, 7)))),
data = lme_data,
control = lmeControl(opt = "optim", niterEM = 45))

Then, we use the survival::coxph() function to fit a stratified Cox model using the transformed data,

cox_fit <- coxph(Surv(tstart, tstop, status) ~ (group + age):strata(strata),
data = cox_data)

These models are then provided as arguments in the jm() function. The user specifies the desired functional forms for the mixed model in each relative-risk model. And with the recurrent argument specifying the desired timescale,

jm_fit <- jm(cox_fit, lme_fit, time_var = "time", recurrent = "gap",
functional_forms =  ~ value(y):strata)

summary(jm_fit)
#>
#> Call:
#> jm(Surv_object = cox_fit, Mixed_objects = lme_fit, time_var = "time",
#>     recurrent = "gap", functional_forms = ~value(y):strata)
#>
#> Data Descriptives:
#> Number of Groups: 500        Number of events: 2250 (81.6%)
#> Number of Observations:
#>   y: 3075
#>
#>                  DIC     WAIC       LPML
#> marginal    13149.97 12983.05  -6736.434
#> conditional 19832.17 19095.66 -10201.566
#>
#> Random-effects covariance matrix:
#>
#>                        StdDev   Corr
#> (Intr)                 0.7423 (Intr) n(,k=c(1,3),B=c(0,7))1
#> n(,k=c(1,3),B=c(0,7))1 2.0832
#> n(,k=c(1,3),B=c(0,7))2 1.4450
#> n(,k=c(1,3),B=c(0,7))3 1.7567
#>
#>
#> (Intr)                 n(,k=c(1,3),B=c(0,7))2
#> n(,k=c(1,3),B=c(0,7))1
#> n(,k=c(1,3),B=c(0,7))2
#> n(,k=c(1,3),B=c(0,7))3
#>
#> Frailty standard deviation:
#>                 Mean   2.5%  97.5%
#> sigma_frailty 0.1928 0.0212 0.3105
#>
#> Survival Outcome:
#>                             Mean  StDev    2.5%  97.5%      P   Rhat
#> group:strata(strata)R     0.4638 0.0826  0.3032 0.6233 0.0000 1.0022
#> group:strata(strata)T1    0.4348 0.1071  0.2215 0.6346 0.0000 1.0204
#> age:strata(strata)R       0.0497 0.0029  0.0442 0.0555 0.0000 1.0395
#> age:strata(strata)T1      0.0475 0.0048  0.0383 0.0570 0.0000 1.2322
#> value(y):strataR          0.5071 0.0275  0.4563 0.5605 0.0000 1.0678
#> value(y):strataT1         0.4983 0.0402  0.4222 0.5762 0.0000 1.5801
#> frailty:strata(strata)T1 -0.6146 1.1373 -3.0913 2.2689 0.3878 1.1494
#>
#> Longitudinal Outcome: y (family = gaussian, link = identity)
#>                           Mean  StDev    2.5%  97.5%      P   Rhat
#> (Intercept)             6.9873 0.0427  6.9034 7.0701 0.0000 1.0228
#> n(,k=c(1,3),B=c(0,7))1  1.3662 0.1472  1.0788 1.6537 0.0000 1.0120
#> n(,k=c(1,3),B=c(0,7))2  0.6188 0.1477  0.3321 0.9173 0.0000 1.3360
#> n(,k=c(1,3),B=c(0,7))3 -0.1523 0.2027 -0.5384 0.2605 0.4333 1.2847
#> sigma                   0.6159 0.0100  0.5960 0.6356 0.0000 1.0121
#>
#> MCMC summary:
#> chains: 3
#> iterations per chain: 3500
#> burn-in per chain: 500
#> thinning: 1
#> time: 1.5 min

One can find the association parameters between the underlying value of the longitudinal outcome and the recurrent and terminating event processes in the summary output as value(y):strataRec ($$\alpha_{R_1}$$) and value(y):strataTer ($$\alpha_{T_1}$$), respectively. $$\exp\{\alpha_{R_1}\}$$ denotes the relative increase in the risk of the next recurrent event at time $$t$$ that results from one unit increase in $$\eta_{1_i}(t)$$ since the end of the previous event1. The association parameter for the frailty term in the terminal risk model, $$\alpha_{F}$$, is identified in the output as frailty. The sigma_frailty refers to the frailty standard deviation, $$\sigma_F$$.

1. This is the time reference because we are using the gap timescale. Alternatively, if we were using the calendar timescale, it would be the entry time in the study.↩︎