vignettes/Time_Varying_Effects.Rmd
Time_Varying_Effects.Rmd
The basic definition of the joint model assumes the coefficients that
quantify the association between the versions of the longitudinal
outcomes and the hazard of the event are time-constant (i.e., the
proportional hazards assumption). We can relax this assumption by
specifying time-varying coefficients via the
functional_forms
argument of function
jm()
.
We will illustrate this capability using the PBC dataset. We start by fitting a Cox model for the composite event transplantation or death, including sex as a baseline covariate:
pbc2.id$status2 <- as.numeric(pbc2.id$status != 'alive')
CoxFit <- coxph(Surv(years, status2) ~ sex, data = pbc2.id)
We aim to assess the strength of the association between the risk of
the composite event and the serum bilirubin level. We will describe the
patient-specific profiles over time for this biomarker using a linear
mixed-effects model, where we include an intercept in both the fixed and
random effects, as well as the linear and quadratic time effects. In the
fixed effects, we also include the interaction of the time effect and
sex. The syntax to fit this model with lme()
is:
fm <- lme(log(serBilir) ~ poly(year, 2) * sex, data = pbc2,
random = ~ poly(year, 2) | id, control = lmeControl(opt = 'optim'))
The default call to jm()
adds the subject-specific
linear predictor of the mixed model as a time-varying covariate in the
survival relative risk model:
jointFit1 <- jm(CoxFit, fm, time_var = "year")
summary(jointFit1)
#>
#> Call:
#> jm(Surv_object = CoxFit, Mixed_objects = fm, time_var = "year")
#>
#> Data Descriptives:
#> Number of Groups: 312 Number of events: 169 (54.2%)
#> Number of Observations:
#> log(serBilir): 1945
#>
#> DIC WAIC LPML
#> marginal 4346.937 6096.897 -3271.149
#> conditional 8713.787 8448.451 -4517.738
#>
#> Random-effects covariance matrix:
#>
#> StdDev Corr
#> (Intr) 1.3091 (Intr) p(,2)1
#> p(,2)1 21.6892 0.6786
#> p(,2)2 12.1450 -0.2361 -0.1267
#>
#> Survival Outcome:
#> Mean StDev 2.5% 97.5% P Rhat
#> sexfemale -0.1531 0.2625 -0.6448 0.3834 0.5444 1.0118
#> value(log(serBilir)) 1.2974 0.0989 1.1169 1.5105 0.0000 1.0469
#>
#> Longitudinal Outcome: log(serBilir) (family = gaussian, link = identity)
#> Mean StDev 2.5% 97.5% P Rhat
#> (Intercept) 1.4897 0.2236 1.0479 1.9216 0.0000 1.0008
#> poly(year, 2)1 29.5318 5.1216 19.7020 39.8127 0.0000 1.0106
#> poly(year, 2)2 -4.8101 3.1672 -11.1911 1.2785 0.1256 1.0055
#> sexfemale -0.4703 0.2386 -0.9310 0.0002 0.0502 1.0001
#> p(,2)1 -5.0308 5.3098 -15.6799 5.3492 0.3393 1.0088
#> p(,2)2 6.3008 3.2751 -0.1080 12.8040 0.0540 1.0120
#> sigma 0.3028 0.0061 0.2909 0.3149 0.0000 1.0077
#>
#> MCMC summary:
#> chains: 3
#> iterations per chain: 3500
#> burn-in per chain: 500
#> thinning: 1
#> time: 17 sec
To specify that the association of serum bilirubin may change over
time, we include an interaction of this time-varying covariate with a
natural cubic spline of time using function ns()
from the
splines package. Important Note: For
this to work correctly, we need to explicitly specify the internal and
boundary knots for the B-splines basis, i.e., in the following example,
we set the internal knots at 3, 6, and 9 years, and the boundary knots
at 0 and 14.5 years:
form_splines <- ~ value(log(serBilir)) * ns(year, k = c(3, 6, 9), B = c(0, 14.5))
jointFit2 <- update(jointFit1, functional_forms = form_splines,
n_iter = 6500L, n_burnin = 2500L)
summary(jointFit2)
#>
#> Call:
#> jm(Surv_object = CoxFit, Mixed_objects = fm, time_var = "year",
#> functional_forms = form_splines, n_iter = 6500L, n_burnin = 2500L)
#>
#> Data Descriptives:
#> Number of Groups: 312 Number of events: 169 (54.2%)
#> Number of Observations:
#> log(serBilir): 1945
#>
#> DIC WAIC LPML
#> marginal 4341.405 6355.209 -3701.785
#> conditional 8721.121 8458.791 -4518.159
#>
#> Random-effects covariance matrix:
#>
#> StdDev Corr
#> (Intr) 1.3165 (Intr) p(,2)1
#> p(,2)1 22.1410 0.6886
#> p(,2)2 12.1107 -0.2160 -0.1051
#>
#> Survival Outcome:
#> Mean StDev
#> sexfemale -0.1687 0.2778
#> value(log(serBilir)) 1.3764 0.2307
#> value(log(serBilir)):ns(year, k = c(3, 6, 9), B = c(0, 14.5))1 -0.2749 0.2747
#> value(log(serBilir)):ns(year, k = c(3, 6, 9), B = c(0, 14.5))2 -0.0072 0.3547
#> value(log(serBilir)):ns(year, k = c(3, 6, 9), B = c(0, 14.5))3 -0.6068 0.7512
#> value(log(serBilir)):ns(year, k = c(3, 6, 9), B = c(0, 14.5))4 -1.1876 0.9046
#> 2.5% 97.5%
#> sexfemale -0.7010 0.3837
#> value(log(serBilir)) 0.9567 1.8273
#> value(log(serBilir)):ns(year, k = c(3, 6, 9), B = c(0, 14.5))1 -0.8186 0.2543
#> value(log(serBilir)):ns(year, k = c(3, 6, 9), B = c(0, 14.5))2 -0.7200 0.7006
#> value(log(serBilir)):ns(year, k = c(3, 6, 9), B = c(0, 14.5))3 -2.0471 0.9432
#> value(log(serBilir)):ns(year, k = c(3, 6, 9), B = c(0, 14.5))4 -2.7763 0.6365
#> P Rhat
#> sexfemale 0.5503 1.0000
#> value(log(serBilir)) 0.0000 1.0511
#> value(log(serBilir)):ns(year, k = c(3, 6, 9), B = c(0, 14.5))1 0.3257 1.0029
#> value(log(serBilir)):ns(year, k = c(3, 6, 9), B = c(0, 14.5))2 0.9955 1.0172
#> value(log(serBilir)):ns(year, k = c(3, 6, 9), B = c(0, 14.5))3 0.4217 1.0656
#> value(log(serBilir)):ns(year, k = c(3, 6, 9), B = c(0, 14.5))4 0.2075 1.1052
#>
#> Longitudinal Outcome: log(serBilir) (family = gaussian, link = identity)
#> Mean StDev 2.5% 97.5% P Rhat
#> (Intercept) 1.4928 0.2264 1.0449 1.9387 0.0000 1.0071
#> poly(year, 2)1 30.0902 5.3321 19.8109 40.8594 0.0000 1.0643
#> poly(year, 2)2 -4.4062 3.2052 -10.5846 1.9815 0.1732 1.0559
#> sexfemale -0.4713 0.2410 -0.9480 0.0006 0.0505 1.0053
#> p(,2)1 -5.3315 5.6721 -16.7089 5.5721 0.3432 1.0534
#> p(,2)2 6.0298 3.3524 -0.5961 12.6127 0.0733 1.0530
#> sigma 0.3030 0.0062 0.2907 0.3151 0.0000 1.0019
#>
#> MCMC summary:
#> chains: 3
#> iterations per chain: 6500
#> burn-in per chain: 2500
#> thinning: 1
#> time: 35 sec
The spline coefficients do not have a straightforward interpretation. We, therefore, visualize the time-varying association of log serum bilirubin with the hazard of the composite event using the following piece of code:
x_times <- seq(0.001, 12, length = 501)
X <- cbind(1, ns(x_times, knots = c(3, 6, 9), B = c(0, 14.5)))
mcmc_alphas <- do.call('rbind', jointFit2$mcmc$alphas)
log_hr <- X %*% t(mcmc_alphas)
log_hr_mean <- rowMeans(log_hr)
log_hr_low <- apply(log_hr, 1, quantile, probs = 0.025)
log_hr_upp <- apply(log_hr, 1, quantile, probs = 0.975)
matplot(x_times, cbind(exp(log_hr_mean), exp(log_hr_low), exp(log_hr_upp)),
type = "l", col = c("red", "black", "black"), lty = c(1, 2, 2), lwd = 2,
xlab = "Follow-up Time (years)", ylab = "Hazard Ratio log serum Bilirubin",
ylim = c(0.5, 6.4))
abline(h = exp(coef(jointFit1)$association), lty = 2, col = "red")
abline(h = 1, lty = 2)
legend("topright", c("time-varying coefficient", "proportional hazards"),
lty = c(1, 2), lwd = c(2, 1), col = "red", bty = "n")
We observe that the 95% credible interval for the time-varying coefficient includes the horizontal line that corresponds to proportional hazards. This is also confirmed by comparing the two models:
compare_jm(jointFit1, jointFit2)
#>
#> DIC WAIC LPML
#> jointFit1 4346.937 6096.897 -3271.149
#> jointFit2 4341.405 6355.209 -3701.785
#>
#> The criteria are calculated based on the marginal log-likelihood.
The WAIC and LPML indicate that jointFit1
is a better
model than jointFit2
. The DIC has the same magnitude for
both models.