In this vignette we illustrate the use of a number of basic generic
functions for models fitted by the `mixed_model()`

function
of package **GLMMadaptive**.

We start by simulating some data for a binary longitudinal outcome:

```
set.seed(1234)
n <- 100 # number of subjects
K <- 8 # number of measurements per subject
t_max <- 15 # maximum follow-up time
# we construct a data frame with the design:
# everyone has a baseline measurement, and then measurements at random follow-up times
DF <- data.frame(id = rep(seq_len(n), each = K),
time = c(replicate(n, c(0, sort(runif(K - 1, 0, t_max))))),
sex = rep(gl(2, n/2, labels = c("male", "female")), each = K))
# design matrices for the fixed and random effects
X <- model.matrix(~ sex * time, data = DF)
Z <- model.matrix(~ time, data = DF)
betas <- c(-2.13, -0.25, 0.24, -0.05) # fixed effects coefficients
D11 <- 0.48 # variance of random intercepts
D22 <- 0.1 # variance of random slopes
# we simulate random effects
b <- cbind(rnorm(n, sd = sqrt(D11)), rnorm(n, sd = sqrt(D22)))
# linear predictor
eta_y <- as.vector(X %*% betas + rowSums(Z * b[DF$id, ]))
# we simulate binary longitudinal data
DF$y <- rbinom(n * K, 1, plogis(eta_y))
```

We continue by fitting the mixed effects logistic regression for
`y`

assuming random intercepts and random slopes for the
random-effects part.

```
fm <- mixed_model(fixed = y ~ sex * time, random = ~ time | id, data = DF,
family = binomial())
```

As in the majority of model-fitting functions in R, the
`print()`

and `summary()`

methods display a short
and a detailed output of the fitted model, respectively. For
`'MixMod'`

objects we obtain

```
fm
#>
#> Call:
#> mixed_model(fixed = y ~ sex * time, random = ~time | id, data = DF,
#> family = binomial())
#>
#>
#> Model:
#> family: binomial
#> link: logit
#>
#> Random effects covariance matrix:
#> StdDev Corr
#> (Intercept) 0.6707
#> time 0.2504 0.6773
#>
#> Fixed effects:
#> (Intercept) sexfemale time sexfemale:time
#> -2.054036030 -1.167231806 0.261163648 0.001917382
#>
#> log-Lik: -358.81
```

and

```
summary(fm)
#>
#> Call:
#> mixed_model(fixed = y ~ sex * time, random = ~time | id, data = DF,
#> family = binomial())
#>
#> Data Descriptives:
#> Number of Observations: 800
#> Number of Groups: 100
#>
#> Model:
#> family: binomial
#> link: logit
#>
#> Fit statistics:
#> log.Lik AIC BIC
#> -358.81 731.62 749.8562
#>
#> Random effects covariance matrix:
#> StdDev Corr
#> (Intercept) 0.6707
#> time 0.2504 0.6773
#>
#> Fixed effects:
#> Estimate Std.Err z-value p-value
#> (Intercept) -2.0540 0.3135 -6.5522 < 1e-04
#> sexfemale -1.1672 0.4673 -2.4976 0.012505
#> time 0.2612 0.0562 4.6463 < 1e-04
#> sexfemale:time 0.0019 0.0782 0.0245 0.980437
#>
#> Integration:
#> method: adaptive Gauss-Hermite quadrature rule
#> quadrature points: 11
#>
#> Optimization:
#> method: hybrid EM and quasi-Newton
#> converged: TRUE
```

The output is rather self-explanatory. However, just note that the
fixed-effects coefficients are on the linear predictor scale, and hence
are the corresponding log-odds for the intercept and log-odds ratios for
the rest of the parameters. The `summary()`

only shows the
estimated coefficients, standard errors and p-values, but no confidence
intervals. These can be separately obtained using the
`confint()`

method, i.e.,

```
exp(confint(fm))
#> 2.5 % Estimate 97.5 %
#> (Intercept) 0.06935892 0.1282164 0.2370198
#> sexfemale 0.12452870 0.3112273 0.7778321
#> time 1.16299270 1.2984401 1.4496624
#> sexfemale:time 0.85955354 1.0019192 1.1678646
```

By default the confidence intervals are produced for the fixed effects. Hence, taking the exp we obtain the confidence intervals for the corresponding odds-ratios. In addition, by default, the level of the confidence intervals is 95%. The following piece of code produces 90% confidence intervals for the variances of the random intercepts and slopes, and for their covariance:

```
confint(fm, parm = "var-cov", level = 0.90)
#> 5 % Estimate 95 %
#> var.(Intercept) 0.05214979 0.44984116 3.8803049
#> cov.(Int)_time -0.03350401 0.11374228 0.9571249
#> var.time 0.02225523 0.06269547 1.8129238
```

The estimated variance-covariance matrix of the maximum likelihood
estimates of all parameters is returned using the `vcov()`

method, e.g.,

```
vcov(fm)
#> (Intercept) sexfemale time sexfemale:time
#> (Intercept) 0.098274483 -0.074499996 -0.008216948 0.0061209157
#> sexfemale -0.074499996 0.218415104 0.005918094 -0.0175394670
#> time -0.008216948 0.005918094 0.003159419 -0.0029625166
#> sexfemale:time 0.006120916 -0.017539467 -0.002962517 0.0061144712
#> D_11 -0.087119578 -0.019450999 0.008223998 0.0018335034
#> D_12 0.016309404 0.003815306 -0.001720134 -0.0004513681
#> D_22 -0.086163936 -0.019241185 0.010700522 0.0001648898
#> D_11 D_12 D_22
#> (Intercept) -0.087119578 0.0163094042 -0.0861639364
#> sexfemale -0.019450999 0.0038153060 -0.0192411853
#> time 0.008223998 -0.0017201337 0.0107005218
#> sexfemale:time 0.001833503 -0.0004513681 0.0001648898
#> D_11 0.429031451 -0.0995215920 0.5054925071
#> D_12 -0.099521592 0.0369781665 -0.2113346392
#> D_22 0.505492507 -0.2113346392 1.3616186478
```

The elements of this covariance matrix that correspond to the
elements of the covariance matrix of the random effects (i.e., the
elements `D_xx`

) are on the log-Cholesky scale.

Robust standard errors using the sandwich estimator can be obtained
by setting the logical argument `sandwich`

to
`TRUE`

, i.e.,

```
vcov(fm, sandwich = TRUE)
#> (Intercept) sexfemale time sexfemale:time
#> (Intercept) 0.099375006 -0.070392849 -0.005229360 0.0024372130
#> sexfemale -0.070392849 0.213560962 0.003080019 -0.0183177219
#> time -0.005229360 0.003080019 0.002907846 -0.0026757783
#> sexfemale:time 0.002437213 -0.018317722 -0.002675778 0.0061654496
#> D_11 -0.080682266 -0.015269283 0.006084500 0.0034781044
#> D_12 0.017266666 -0.005005000 -0.001431995 -0.0002375471
#> D_22 -0.081773200 0.032220915 0.005444180 0.0024739539
#> D_11 D_12 D_22
#> (Intercept) -0.080682266 0.0172666661 -0.081773200
#> sexfemale -0.015269283 -0.0050049996 0.032220915
#> time 0.006084500 -0.0014319950 0.005444180
#> sexfemale:time 0.003478104 -0.0002375471 0.002473954
#> D_11 0.257244771 -0.0547470851 0.235884951
#> D_12 -0.054747085 0.0217713869 -0.122326509
#> D_22 0.235884951 -0.1223265091 0.831270814
```

The use of robust standard errors via the `sandwich`

argument is also available in the `summary()`

,
`confint()`

, `anova()`

,
`marginal_coefs()`

, `effectPlotData()`

,
`predict()`

, and `simulate()`

methods.

To extract the estimated fixed effects coefficients from a fitted
mixed model, we can use the `fixef()`

method. Similarly, the
empirical Bayes estimates of the random effects are extracted using the
`ranef()`

method, and finally the `coef()`

method
returns the subject-specific coefficients, i.e., the sum of the fixed
and random effects coefficients:

```
fixef(fm)
#> (Intercept) sexfemale time sexfemale:time
#> -2.054036030 -1.167231806 0.261163648 0.001917382
```

```
head(ranef(fm))
#> (Intercept) time
#> 1 -0.3667910 -0.123026294
#> 2 0.6656132 0.289323835
#> 3 0.4836942 0.215045666
#> 4 -0.1904426 -0.008732796
#> 5 0.2477730 0.160118862
#> 6 -0.3739774 -0.183618169
```

```
head(coef(fm))
#> (Intercept) sexfemale time sexfemale:time
#> 1 -2.420827 -1.167232 0.13813735 0.001917382
#> 2 -1.388423 -1.167232 0.55048748 0.001917382
#> 3 -1.570342 -1.167232 0.47620931 0.001917382
#> 4 -2.244479 -1.167232 0.25243085 0.001917382
#> 5 -1.806263 -1.167232 0.42128251 0.001917382
#> 6 -2.428013 -1.167232 0.07754548 0.001917382
```

The fixed effects estimates in mixed models with nonlinear link
functions have an interpretation conditional on the random effects.
However, often we wish to obtain parameters with a marginal / population
averaged interpretation, which leads many researchers to use generalized
estimating equations, and dealing with potential issues with missing
data. Nonetheless, recently Hedeker et al. have
proposed a nice solution to this problem. Their approach is implemented
in function `marginal_coefs()`

. For example, for model
`fm`

we obtain the marginalized coefficients using:

```
marginal_coefs(fm)
#> (Intercept) sexfemale time sexfemale:time
#> -1.6023 -1.0968 0.1766 0.0508
```

The function calculates the marginal log odds ratios in our case
(because we have a binary outcome) using a Monte Carlo procedure with
number of samples determined by the `M`

argument.

Standard errors for the marginalized coefficients are obtained by
setting `std_errors = TRUE`

in the call to
`marginal_coefs()`

, and require a double Monte Carlo
procedure for which argument `K`

comes also into play. To
speed up computations, the outer Monte Carlo procedure is performed in
parallel using package **parallel** and number of cores
specified in the `cores`

argument (results not shown):

`marginal_coefs(fm, std_errors = TRUE, cores = 5)`

The `fitted()`

method extracts fitted values from the
fitted mixed model. These are always on the scale of the response
variable. The `type`

argument of `fitted()`

specifies the type of fitted values computed. The default is
`type = "mean_subject"`

which corresponds to the fitted
values calculated using only the fixed-effects part of the linear
predictor; hence, for the subject who has random effects values equal to
0, i.e., the “mean subject”:

Setting `type = "subject_specific"`

will calculate the
fitted values using both the fixed and random effects parts, where for
the latter the empirical Bayes estimates of the random effects are
used:

```
head(fitted(fm, type = "subject_specific"))
#> 1 2 3 4 5 6
#> 0.08159826 0.08308495 0.10108500 0.23896304 0.24390557 0.24431851
```

Finally, setting `type = "marginal"`

will calculate the
fitted values based on the multiplication of the fixed-effects design
matrix with the marginalized coefficients described above (due to the
required computing time, these fitted values are not displayed):

The `residuals()`

method simply calculates the residuals
by subtracting the fitted values from the observed repeated measurements
outcome. Hence, this method also has a `type`

argument with
exactly the same options as the `fitted()`

method.

To display the estimated longitudinal evolution of the binary outcome we can use an effect plot. This is simply predictions from the models with the corresponding 95% pointwise confidence intervals.

As a first step we create a data frame the provides the setting based
on which the plot is to be produced; function `expand.grid()`

is helpful in this regard:

Next we use the `effectPlotData()`

function that does the
heavy lifting, i.e., calculates the predictions and confidence intervals
from a fitted mixed model for the data frame provided above, i.e.,

`plot_data <- effectPlotData(fm, nDF)`

Then we can produce the plot using for example the
`xyplot()`

function from package **lattice**,
e.g.,

```
library("lattice")
xyplot(pred + low + upp ~ time | sex, data = plot_data,
type = "l", lty = c(1, 2, 2), col = c(2, 1, 1), lwd = 2,
xlab = "Follow-up time", ylab = "log odds")
expit <- function (x) exp(x) / (1 + exp(x))
xyplot(expit(pred) + expit(low) + expit(upp) ~ time | sex, data = plot_data,
type = "l", lty = c(1, 2, 2), col = c(2, 1, 1), lwd = 2,
xlab = "Follow-up time", ylab = "Subject-Specific Probabilities")
```

The `effectPlotData()`

function also allows to compute
marginal predictions using the marginalized coefficients described
above. This is achieved by setting `marginal = TRUE`

in the
respective call (results not shown):

```
plot_data_m <- effectPlotData(fm, nDF, marginal = TRUE, cores = 2)
# we put the two groups in the same panel
my.panel.bands <- function(x, y, upper, lower, fill, col, subscripts, ..., font,
fontface) {
upper <- upper[subscripts]
lower <- lower[subscripts]
panel.polygon(c(x, rev(x)), c(upper, rev(lower)), col = fill, border = FALSE, ...)
}
xyplot(expit(pred) ~ time, group = sex, data = plot_data_m,
upper = expit(plot_data_m$upp), low = expit(plot_data_m$low),
type = "l", col = c("blue", "red"),
fill = c("#0000FF80", "#FF000080"),
panel = function (x, y, ...) {
panel.superpose(x, y, panel.groups = my.panel.bands, ...)
panel.xyplot(x, y, lwd = 2, ...)
}, xlab = "Follow-up time", ylab = "Marginal Probabilities")
```

In addition to using the `effectPlotData()`

function, the
same type of effect plots can be produced by the **effects**
package. For example, based on the `fm`

model we produce the
effect plot for `time`

and for the two groups using the call
to `predictorEffect()`

and its `plot()`

method:

```
library("effects")
plot(predictorEffect("time", fm), type = "link")
```

The `type`

argument controls the scale of the y-axis,
namely, `type = "link"`

corresponds to the log-odds scale. If
we would like to obtain the figure in the probability scale, we should
set `type = "response"`

, i.e.,

`plot(predictorEffect("time", fm), type = "response")`

For the additional functionality provided by the **effects**
package, check the vignette:
`vignette("predictor-effects-gallery", package = "effects")`

.

**Note:** Effects plots via the **effects**
package are currently only supported for the `binomial()`

and
`poisson()`

families.

The `anova()`

method can be used to compare two fitted
mixed models using a likelihood ratio test. For example, we test if we
can test the null hypothesis that the covariance between the random
intercepts and slopes is equal to zero using:

```
gm <- mixed_model(fixed = y ~ sex * time, random = ~ time || id, data = DF,
family = binomial())
anova(gm, fm)
#>
#> AIC BIC log.Lik LRT df p.value
#> gm 730.94 746.57 -359.47
#> fm 731.62 749.86 -358.81 1.32 1 0.2499
```

Using the `predict()`

method we can calculate predictions
for new subjects. As an example, we treat subject 1 from the
`DF`

dataset as a new patient (in the code below we change
his `id`

variable):

```
pred_DF <- DF[DF$id == 1, ][1:4, ]
pred_DF$id <- paste0("N", as.character(pred_DF$id))
pred_DF
#> id time sex y
#> 1 N1 0.0000000 male 0
#> 2 N1 0.1424363 male 0
#> 3 N1 1.7055512 male 0
#> 4 N1 9.1391210 male 0
```

We start by computing predictions based only on the fixed-effects part of the model; because of the nonlinear link function, these predictions are of subjects with random effects value equal to zero, which is not to the average predictions:

```
predict(fm, newdata = pred_DF, type_pred = "response",
type = "mean_subject", se.fit = TRUE)
#> $pred
#> 1 2 3 4
#> 0.1136452 0.1174465 0.1667820 0.5824332
#>
#> $se.fit
#> 1 2 3 4
#> 0.3134876 0.3098351 0.2818441 0.4604009
```

Population averaged predictions can be obtained by first calculating
the marginalized coefficients (using `marginal_coefs()`

) and
multiplying them with the fixed effects design matrix; this is achieved
using the option `type = "marginal"`

(due to the required
computing time, predictions not shown):

```
predict(fm, newdata = pred_DF, type_pred = "response",
type = "marginal", se.fit = FALSE)
```

Finally, we calculate subject-specific predictions; the standard errors are calculated with a Monte Carlo scheme (for details check the online help file):

```
predict(fm, newdata = pred_DF, type_pred = "response",
type = "subject_specific", se.fit = TRUE)
#> $pred
#> 1 2 3 4
#> 0.07937672 0.08042530 0.09278823 0.17716458
#>
#> $se.fit
#> 1 2 3 4
#> 0.02788083 0.02806700 0.03210400 0.11341043
#>
#> $low
#> 1 2 3 4
#> 0.04116652 0.04187969 0.04874345 0.04567313
#>
#> $upp
#> 1 2 3 4
#> 0.1509647 0.1524805 0.1693027 0.4959877
#>
#> $success_rate
#> [1] 0.5
```

Suppose now that we want predictions at time points at which no
responses `y`

have been recorded, e.g.,

```
future_Times <- pred_DF[1:3, c("id", "time", "sex")]
future_Times$time <- c(3, 4, 10)
future_Times
#> id time sex
#> 1 N1 3 male
#> 2 N1 4 male
#> 3 N1 10 male
```

Predictions at these time points can be calculated by provide this
data frame in the `newdata2`

argument of
`predict()`

:

```
predict(fm, newdata = pred_DF, newdata2 = future_Times, type_pred = "response",
type = "subject_specific", se.fit = TRUE)
#> $pred
#> 1 2 3 4
#> 0.07937672 0.08042530 0.09278823 0.17716458
#>
#> $pred2
#> 1 2 3
#> 0.1042907 0.1140225 0.1900830
#>
#> $se.fit
#> 1 2 3 4
#> 0.02788083 0.02806700 0.03210400 0.11341043
#>
#> $se.fit2
#> 1 2 3
#> 0.03874363 0.04616593 0.12788755
#>
#> $low
#> 1 2 3 4
#> 0.04116652 0.04187969 0.04874345 0.04567313
#>
#> $upp
#> 1 2 3 4
#> 0.1509647 0.1524805 0.1693027 0.4959877
#>
#> $low2
#> 1 2 3
#> 0.05089954 0.05126488 0.04379313
#>
#> $upp2
#> 1 2 3
#> 0.1952552 0.2214955 0.5575331
```

The `simulate()`

method can be used to simulate response
outcome data from a fitted mixed model. For example, we simulate two
realization of our dichotomous outcome:

```
head(simulate(fm, nsim = 2, seed = 123), 10)
#> [,1] [,2]
#> [1,] 0 0
#> [2,] 0 0
#> [3,] 0 0
#> [4,] 1 0
#> [5,] 1 0
#> [6,] 0 1
#> [7,] 0 0
#> [8,] 1 0
#> [9,] 0 0
#> [10,] 1 0
```

By setting `acount_MLEs_var = TRUE`

in the call to
`simulate()`

we also account for the variability in the
maximum likelihood estimates in the simulation of new responses. This is
achieved by simulating each time a new response vector using a
realization for the parameters from a multivariate normal distribution
with mean the MLEs and covariance matrix the covariance matrix of the
MLEs: