February 6

Motivation

Mixed models


  • Mixed effects models have numerous applications
    • Longitudinal data
    • Clustered data
    • Multilevel data

The basic idea


  • We have a grouped outcome \({\color{blue} Y}\)
    • measurements in the same group(s) are correlated


  • Correlations are modeled via random effects \({\color{red} b}\)
    • measurements in the same group share the same random effect

Mathematical definition - 1

  • A general definition of mixed models: \[ \left \{ \begin{array}{ccl} {\color{blue} Y} \mid {\color{red} b} & \sim & \mathcal F(\theta_y)\\&&\\ {\color{red} b} & \sim & \mathcal N(0, D)\\ \end{array} \right. \]
    • the distribution \(\mathcal F\) depends on the outcome
    • the random effects most often are assumed normally distributed
    • \(\theta = (\theta_y^\top, \mbox{vech}(D)^\top)^\top\) are the model parameters

Mathematical definition - 2

  • To estimate mixed models under maximum likelihood we need the marginal log-likelihood function \[ \ell(\theta) = \sum_{i = 1}^n \log \int p(y_i \mid b_i; \theta) \, p(b_i; \theta) \; db_i \]

    To estimate mixed models we need to solve the integral

Normal vs the rest - 1

  • The random effects \(b\) are assumed to have a (multivariate) normal distribution

  • When
    • the outcome \(Y\) also has a normal distribution,
    • the random effects enter linearly in the linear predictor

the integral can be solved analytically

\({\color{blue} \Rightarrow}\) Easy to program \({\color{blue} \Rightarrow}\) Stable algorithms available in most software

Normal vs the rest - 2


  • However, in all other cases these integrals cannot be solved analytically


\({\color{red} \Rightarrow}\) Much more difficult to program

\({\color{red} \Rightarrow}\) Several solutions available in software, however, many not optimal!!

Estimation

How to solve the integrals

  • Two general solutions


  1. Approximation of the integrand
    • make it look like a normal for which we know the solution

  2. Approximation of the integral
    • finite sum over a grid

Popular available options

Which to choose

  • Penalized Quasi Likelihood: generally not good


  • Laplace: needs many repeated measurements per subject
    • it does not work well with binary data and Poisson with low counts


  • Adaptive Gaussian quadrature: Gold standard
    • computationally intensive


  • Monte Carlo: requires fine tuning
    • no clear winner algorithm

What is available in R

Mixed model packages in R - 1

  • lme4::glmer()
    • Laplace approximation as default
    • adaptive Gaussian quadrature only for scalar random effects

  • glmmTMB::glmmTMB() more flexible than lme4
    • Laplace approximation

  • glmmsr::glmm()
    • Monte Carlo importance sampling

  • MASS::glmmPQL()
    • PQL based on nlme::lme()

Does it matter? - 1

  • The Toenail dataset data("toenail", package = "HSAUR2")
    • mixed effects logistic regression
    • PQL vs Laplace vs adaptive Gaussian quadrature
fm_PQL <- glmmPQL(outcome ~ time * treatment, random = ~ 1 | patientID, 
                  data = toenail, family = binomial())

fm_Laplace <- glmer(outcome ~ time * treatment + (1 | patientID), 
                  data = toenail, family = binomial())

fm_AGQ15 <-  glmer(outcome ~ time * treatment + (1 | patientID), 
                  data = toenail, family = binomial(), nAGQ = 15)

Does it matter? - 2

  • The Toenail dataset - results
cbind(PQL = fixef(fm_PQL), Laplace = fixef(fm_Laplace), AGQ_15 = fixef(fm_AGQ15))
##                                   PQL    Laplace     AGQ_15
## (Intercept)               -0.74324709 -2.5098642 -1.6473652
## time                      -0.29469102 -0.3997340 -0.3924231
## treatmentterbinafine      -0.03480231 -0.3048303 -0.1663130
## time:treatmentterbinafine -0.10017515 -0.1371352 -0.1371755

Does it matter? - 3

  • The Salamanders dataset data("Salamanders", package = "glmmTMB")
    • mixed effects truncated Poisson
    • (spoiler alert: Code)
gm_Laplace <- glmmTMB(count ~ spp + mined + (1 | site), ziformula = ~ spp + mined, 
                      family = truncated_poisson(), data = Salamanders)

gm_AGQ25 <- mixed_model(count ~ spp + mined, random = ~ 1 | site, 
                        zi_fixed = ~ spp + mined, 
                        family = hurdle.poisson(), data = Salamanders, nAGQ = 25)

Does it matter? - 4

  • The Salamanders dataset - results
cbind(Laplace = fixef(gm_Laplace)$cond, AGQ_25 = fixef(gm_AGQ25))
##                 Laplace     AGQ_25
## (Intercept) -0.06702286 -3.1900355
## sppPR       -0.52092708  0.1976119
## sppDM        0.22457540  0.9137361
## sppEC-A     -0.19548416  0.6538284
## sppEC-L      0.64672238  1.4999425
## sppDES-L     0.60513701  0.9589269
## sppDF        0.04602476  0.3426018
## minedno      1.01446593  0.7137289

Mixed model packages in R - 2





Need for adaptive Gaussian quadrature for \({\color{red} > 1}\) random effects

A new package


  • GLMMadaptive has been written to fill this gap


  • Principles
    • User-friendly
    • Satisfy the needs of most users
    • Give advanced users the option to extend it
    • Well documented

What GLMMadaptive offers

Model fitting

  • The basic model-fitting function is GLMMadaptive::mixed_model()


  • Required arguments
    • fixed a formula for the fixed effects
    • random a formula for the random effects
    • data a data frame containing the variables to use
    • family a family object specifying the model

Available models: Standard GLMMs

  • It fits the most frequently-used GLMMs
    • mixed effects logistic regression family = binomial()
    • mixed effects Poisson regression family = poisson()
  • A basic example:

    mixed_model(fixed = y ~ time * group, random = ~ time | id, data = DF,
            family = binomial(), nAGQ = 15)
vignette("GLMMadaptive_basics", package = "GLMMadaptive")

Available models: Specialized GLMMs - 1


  • It fits some non-standard GLMMs
    • negative binomial mixed model family = negative.binomial()
    • Student's-t mixed model family = students.t()
    • Beta mixed model family = beta.fam()

Available models: Specialized GLMMs - 2

  • Zero-inflated mixed models
    • zero-inflated Poisson family = zi.poisson()
    • zero-inflated negative binomial family = zi.negative.binomial()

  • Hurdle mixed models
    • hurdle Poisson family = hurdle.poisson()
    • hurdle negative binomial family = hurdle.negative.binomial()
    • hurdle log-normal family = hurdle.lognormal()
    • hurdle Beta family = hurdle.beta.fam()
vignette("ZeroInflated_and_TwoPart_Models", package = "GLMMadaptive")

User-defined models


  • The user can define its own mixed model using a user-defined family object
    • specify log-density function
    • specify derivatives (optional)
vignette("Custom_Models", package = "GLMMadaptive")

What the user needs - 1

  • summary(): Estimated coefficients, standard errors & p-values


  • anova(): Hypothesis testing
    • Wald tests
    • likelihood ratio tests
    • AIC/BIC

What the user needs - 2


  • Effects Plots
    • build-in effectPlotData()
    • support for effects
    • support for ggeffects
vignette("Methods_MixMod", package = "GLMMadaptive")

What the user needs - 3


  • Goodness of Fit
    • support for DHARMa
    • caveat: MAR missing data
vignette("Goodness_of_Fit", package = "GLMMadaptive")

What the user needs - 4


  • Multiple comparisons
    • support for emmeans
    • support for multcomp
vignette("Multiple_Comparisons", package = "GLMMadaptive")

What the user needs - 5

  • Predictions via predict() method
    • average subject
    • marginal
    • subject-specific
    • dynamic subject-specific
    • confidence & prediction intervals
    • proper scoring rules via scoring_rules()
vignette("Methods_MixMod", package = "GLMMadaptive")

vignette("Dynamic_Predictions", package = "GLMMadaptive")

What the user needs - 6

  • Methods for standard generics
    • confint(), vcov() including the sandwich estimator
    • coef(), fixef(), ranef()
    • logLik(), nobs()
    • fitted(), residuals()
    • simulate()

  • For developers
    • model.matrix(), model.frame()
    • terms(), formula(), family()

A Few extras - 1

A Few extras - 2


  • Marginalized coefficients
    • Heagerty et al. have proposed marginalized mixed models
      \({\color{red} \Rightarrow}\) Computationally intensive to fit
    • Hedeker et al. recently proposed an easier alternative solution
    • function marginal_coefs() implements Hedeker's et al. solution with standard errors

A Few extras - 3

Documentation

What the future holds…

Future plans

  • Implement nested random effects
    • using Matrix sparse matrices classes


  • More options for the covariance matrix of the random effects
    • auto-regressive
    • compound symmetry


  • Extra models
    • Conway-Maxwell-Poisson mixed model