vignettes/Competing_Risks.Rmd
Competing_Risks.Rmd
The first step to fit a joint model for competing events in
JMbayes2 is to prepare the data for the event process.
If there are \(K\) competing events,
then each subject needs to have \(K\)
rows, one for each possible cause. The observed event time \(T_i\) of each subject is repeated \(K\) times, and there are two indicator
variables, namely one identifying the cause, and one indicating whether
the corresponding event type is the one that occurred. Standard survival
datasets that included a single row per patient, can be easily
transformed to the competing risks long format using function
crisk_setup()
. This function accepts as main arguments the
survival data in the standard format that has a single row per patient,
the name of the status variable, and the level in this status variable
that corresponds to censoring. We illustrate the use of this function in
the PBC data, in which we treat as competing risks transplantation and
death:
pbc2.id[pbc2.id$id %in% c(1, 2, 5), c("id", "years", "status")]
#> id years status
#> 1 1 1.095170 dead
#> 2 2 14.152338 alive
#> 5 5 4.120578 transplanted
pbc2.idCR <- crisk_setup(pbc2.id, statusVar = "status", censLevel = "alive",
nameStrata = "CR")
pbc2.idCR[pbc2.idCR$id %in% c(1, 2, 5),
c("id", "years", "status", "status2", "CR")]
#> id years status status2 CR
#> 1 1 1.095170 dead 1 dead
#> 1.1 1 1.095170 dead 0 transplanted
#> 2 2 14.152338 alive 0 dead
#> 2.1 2 14.152338 alive 0 transplanted
#> 5 5 4.120578 transplanted 0 dead
#> 5.1 5 4.120578 transplanted 1 transplanted
Note that each patient is now represented by two rows (we have two
possible causes of discontinuation from the study, death and
transplantation), the event time variable years
is
identical in both rows of each patient, variable CR
denotes
the cause for the specific line of the long dataset, and variable
status2
equals 1 if the corresponding event occurred.
For the event process, we specify cause-specific relative risks
models. Using dataset pbc2.idCR
, we fit the corresponding
cause-specific Cox regressions by including the interaction terms of age
and treatment with variable CR
, which is treated as a
stratification variable using the strata()
function:
CoxFit_CR <- coxph(Surv(years, status2) ~ (age + drug) * strata(CR),
data = pbc2.idCR)
For the longitudinal process, we include two longitudinal outcomes, namely serum bilirubin and the prothrombin time. For the former we use quadratic orthogonal polynomials in the fixed- and random-effects parts, and for the latter linear evolutions:
fm1 <- lme(log(serBilir) ~ poly(year, 2) * drug, data = pbc2,
random = ~ poly(year, 2) | id)
fm2 <- lme(prothrombin ~ year * drug, data = pbc2, random = ~ year | id)
To specify that each longitudinal outcome has a separate association coefficient per competing risk, we define the corresponding functional forms:
CR_forms <- list(
"log(serBilir)" = ~ value(log(serBilir)):CR,
"prothrombin" = ~ value(prothrombin):CR
)
Finally, the competing risks joint model is fitted with the following
call to jm()
(due to the complexity of the model, we have
increased the number of MCMC iterations and the burn-in period per
chain):
jFit_CR <- jm(CoxFit_CR, list(fm1, fm2), time_var = "year",
functional_forms = CR_forms,
n_iter = 25000L, n_burnin = 5000L, n_thin = 5L)
summary(jFit_CR)
#>
#> Call:
#> jm(Surv_object = CoxFit_CR, Mixed_objects = list(fm1, fm2), time_var = "year",
#> functional_forms = CR_forms, n_iter = 25000L, n_burnin = 5000L,
#> n_thin = 5L)
#>
#> Data Descriptives:
#> Number of Groups: 312 Number of events: 169 (27.1%)
#> Number of Observations:
#> log(serBilir): 1945
#> prothrombin: 1945
#>
#> DIC WAIC LPML
#> marginal 10830.90 11531.22 -6640.711
#> conditional 15751.34 15437.79 -8250.575
#>
#> Random-effects covariance matrix:
#>
#> StdDev Corr
#> (Intr) 1.3446 (Intr) p(,2)1 p(,2)2 (Intr)
#> p(,2)1 23.2447 0.7031
#> p(,2)2 12.4067 -0.2642 -0.1414
#> (Intr) 0.7875 0.6347 0.4392 -0.3304
#> year 0.3273 0.4322 0.3383 -0.0547 0.0294
#>
#> Survival Outcome:
#> Mean StDev 2.5% 97.5% P
#> age -0.0709 0.0250 -0.1237 -0.0237 0.0023
#> drugD-penicil -0.2762 0.4104 -1.1056 0.5185 0.4935
#> age:strata(CR)dead 0.1329 0.0244 0.0861 0.1857 0.0000
#> drugD-penicil:strata(CR)dead 0.2716 0.4435 -0.5849 1.1477 0.5357
#> value(log(serBilir)):CRtransplanted 1.0383 0.2243 0.6150 1.5046 0.0000
#> value(log(serBilir)):CRdead 1.4504 0.1167 1.2351 1.6891 0.0000
#> value(prothrombin):CRtransplanted 0.1138 0.1444 -0.1833 0.3892 0.4297
#> value(prothrombin):CRdead 0.1405 0.0498 0.0376 0.2346 0.0075
#> Rhat
#> age 1.0525
#> drugD-penicil 1.0264
#> age:strata(CR)dead 1.0642
#> drugD-penicil:strata(CR)dead 1.0246
#> value(log(serBilir)):CRtransplanted 1.0508
#> value(log(serBilir)):CRdead 1.0006
#> value(prothrombin):CRtransplanted 1.2420
#> value(prothrombin):CRdead 1.0323
#>
#> Longitudinal Outcome: log(serBilir) (family = gaussian, link = identity)
#> Mean StDev 2.5% 97.5% P Rhat
#> (Intercept) 1.1975 0.1142 0.9738 1.4255 0.0000 1.0003
#> poly(year, 2)1 27.9020 3.0486 22.1186 34.1021 0.0000 1.0026
#> poly(year, 2)2 1.2566 1.7449 -2.1592 4.6013 0.4690 1.0003
#> drugD-penicil -0.1904 0.1587 -0.5033 0.1192 0.2305 1.0002
#> p(,2)1 -3.1889 3.6345 -10.3643 3.7599 0.3857 1.0003
#> p(,2)2 -1.0593 2.1715 -5.2577 3.2234 0.6148 1.0002
#> sigma 0.3021 0.0062 0.2903 0.3145 0.0000 1.0005
#>
#> Longitudinal Outcome: prothrombin (family = gaussian, link = identity)
#> Mean StDev 2.5% 97.5% P Rhat
#> (Intercept) 10.6316 0.0823 10.4727 10.7934 0.0000 1.0026
#> year 0.2954 0.0396 0.2194 0.3755 0.0000 1.0033
#> drugD-penicil -0.0936 0.1152 -0.3226 0.1284 0.4225 1.0017
#> year:drugD-penicil -0.0246 0.0517 -0.1256 0.0780 0.6350 1.0007
#> sigma 1.0551 0.0205 1.0161 1.0964 0.0000 1.0029
#>
#> MCMC summary:
#> chains: 3
#> iterations per chain: 25000
#> burn-in per chain: 5000
#> thinning: 5
#> time: 6.4 min
Based on the fitted competing risks joint model, we will illustrate how (dynamic) predictions for the cause-specific cumulative risk probabilities can be calculated. As an example, we will show these calculations for Patient 81 from the PBC dataset. First, we extract the data of this subject.
ND_long <- pbc2[pbc2$id == 81, ]
ND_event <- pbc2.idCR[pbc2.idCR$id == 81, ]
ND_event$status2 <- 0
ND <- list(newdataL = ND_long, newdataE = ND_event)
The first line extracts the longitudinal measurements, and the second
line extracts the event times per cause (i.e., death and
transplantation). This particular patient died at 6.95 years, but to
make the calculation of cause-specific cumulative risk more relevant, we
presume that she did not have the event, and we set the event status
variable status2
to zero. The last line combines the two
datasets in a list. Note: this last step is a prerequisite from
the predict()
method for competing risks joint model. That
is, the datasets provided in the arguments newdata
and
newdata2
need to be named lists with two components. The
first component needs to be named newdataL
and contain the
dataset with the longitudinal measurements, and the second component
needs to be named newdataE
and contain the dataset with the
event information.
The predictions are calculated using the predict()
method. The first call to this function calculates the prediction for
the longitudinal outcomes at the times provided in the
times
argument, and the second call calculates the
cause-specific cumulative risk probabilities. By setting the argument
return_newdata
to TRUE
in both calls, we can
use the corresponding plot()
method to depict the
predictions:
predLong <- predict(jFit_CR, newdata = ND, return_newdata = TRUE,
times = seq(6.5, 15, length = 25))
predEvent <- predict(jFit_CR, newdata = ND, return_newdata = TRUE,
process = "event")
plot(predLong, predEvent, outcomes = 1:2, ylim_long_outcome_range = FALSE,
col_line_event = c("#03BF3D", "#FF0000"),
fill_CI_event = c("#03BF3D4D", "#FF00004D"), pos_ylab_long = c(1.5, 11.5))
legend(x = 8.1, y = 0.45, legend = levels(pbc2.idCR$CR),
lty = 1, lwd = 2, col = c("#03BF3D", "#FF0000"), bty = "n", cex = 0.8)