Chapter 6: Multilevel Modeling

“Simplicity does not precede complexity, but follows it.” — Alan Perlis

Multilevel data are more complex and don’t meet the assumptions of regular linear or generalized linear models. But with the right modeling schemes, the results can be very interpretable and actionable. Two powerful forms of multilevel modeling are:

  1. Generalized Estimating Equations (GEE)
  2. Mixed effects (ME; i.e., hierarchical linear modeling, multilevel modeling)

Several similarities and differences should be noted briefly. As for similarities, they both attempt to control for the lack of independence within clusters, although they do it in different ways. Also, they are both built on linear regression which makes them flexible and powerful at finding relationships in the data.

The differences are subtle but important. First, the interpretation is somewhat different between the two. GEE is a population-averaged (e.g., marginal) model whereas ME is subject specific. In other words, GEE is the average effect while ME is the effect found in the average person. In a linear model, these coefficients are the same but when we use different forms such as logistic or poisson, these can be quite different (although in my experience they generally tell a similar story). Second, ME models are much more complex than the GEE models and can struggle with convergence compared to the GEE. This also means that GEE’s are generally fitted much more quickly. Still the choice of the modeling technique should be driven by your hypotheses and not totally dependent on speed of the computation.

First, if we needed to, we’d reshape our data so that it is ready for the analyses (see Chapter 8 for more on reshaping). For both modeling techniques we want our data in long form20. What this implies is that each row is an observation. What this actually means about the data depends on the data. For example, if you have repeated measures, then often data is stored in wide form—a row is an individual. To make this long, we want each time point within a person to be a row—a single individual can have multiple rows but each row is a unique observation.

Currently, our data is in long form since we are working within community clusters within this data. So, each row is an observation and each cluster has multiple rows. Note that although these analyses will be within community clusters instead of within subjects (i.e. repeated measures), the overall steps will be the exact same.

This chapter certainly does not cover all of multilevel modeling in R. Entire books are dedicated to that single subject. Rather, we are introducing the methods and the packages that can be used to start using these methods.

GEE

There are two packages, intimately related, that allow us to perform GEE modeling—gee and geepack. These have some great features and make running a fairly complex model pretty simple. However, as great as they are, there are some annoying shortcomings. We’ll get to a few of them throughout this section.

GEE’s, in general, want a few pieces of information from you. First, the outcome and predictors. This is just as in linear regression and GLM’s. Second, we need to provide a correlation structure. This tells the model the approximate pattern of correlations between the time points or clusters. It also wants a variable that tells the cluster ID’s. Finally, it also wants the family (i.e. the type of distribution).

Since this is not longitudinal, but rather clustered within communities, we’ll assume for this analysis an unstructured correlation structure. It is the most flexible and we have enough power for it here.

For geepack to work, we need to filter out the missing values for the variables that will be in the model.

df2 <- df %>%
  filter(complete.cases(dep, famsize, sed, race, asthma))

Now, we’ll build the model with both packages (just for demonstration). We predict depression with asthma, family size, minutes of sedentary behavior, and the subject’s race.

library(gee)
fit_gee <- gee(dep ~ asthma + famsize + sed + race,
               data = df2,
               id = df2$sdmvstra,
               corstr = "unstructured")
##       (Intercept)      asthmaAsthma           famsize               sed 
##       2.500022059       1.356081567      -0.042132178       0.001362226 
## raceOtherHispanic         raceWhite         raceBlack         raceOther 
##       1.184995689       0.113949209       0.100536695      -0.555478773
summary(fit_gee)$coef
##                       Estimate   Naive S.E.    Naive z  Robust S.E.
## (Intercept)        2.495509790 0.2867816215  8.7017773 0.2690426648
## asthmaAsthma       1.353039007 0.1867101195  7.2467363 0.2137975620
## famsize           -0.039489294 0.0461945052 -0.8548483 0.0457474654
## sed                0.001358042 0.0003362291  4.0390382 0.0003551901
## raceOtherHispanic  1.192481318 0.3075562837  3.8772783 0.3309608614
## raceWhite          0.116185743 0.2531554533  0.4589502 0.2279687738
## raceBlack          0.096800821 0.2625826864  0.3686489 0.2360498473
## raceOther         -0.555053605 0.2809301544 -1.9757708 0.2406566044
##                     Robust z
## (Intercept)        9.2755169
## asthmaAsthma       6.3285989
## famsize           -0.8632018
## sed                3.8234244
## raceOtherHispanic  3.6030886
## raceWhite          0.5096564
## raceBlack          0.4100864
## raceOther         -2.3064133
library(geepack)
fit_geeglm <- geeglm(dep ~ asthma + famsize + sed + race,
                     data = df2,
                     id = df2$sdmvstra,
                     corstr = "unstructured")
summary(fit_geeglm)
## 
## Call:
## geeglm(formula = dep ~ asthma + famsize + sed + race, data = df2, 
##     id = df2$sdmvstra, corstr = "unstructured")
## 
##  Coefficients:
##                     Estimate    Std.err   Wald Pr(>|W|)    
## (Intercept)        2.5579361  0.2700717 89.706  < 2e-16 ***
## asthmaAsthma       1.3492892  0.2156202 39.159 3.91e-10 ***
## famsize           -0.0446716  0.0457087  0.955 0.328415    
## sed                0.0013015  0.0003548 13.454 0.000244 ***
## raceOtherHispanic  1.1750373  0.3318983 12.534 0.000400 ***
## raceWhite          0.0806377  0.2295661  0.123 0.725392    
## raceBlack          0.0642028  0.2363255  0.074 0.785875    
## raceOther         -0.5902049  0.2413379  5.981 0.014463 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Estimated Scale Parameters:
##             Estimate Std.err
## (Intercept)    19.49  0.7843
## 
## Correlation: Structure = unstructured  Link = identity 
## 
## Estimated Correlation Parameters:
##           Estimate Std.err
## alpha.1:2  0.12480 0.01654
## alpha.1:3  0.42070 0.10339
## alpha.1:4  2.89640 1.06678
## alpha.1:5 -1.85447 0.20276
## alpha.2:3  0.12238 0.06330
## alpha.2:4 -0.08935 0.20229
## alpha.2:5  0.20541 0.03720
## alpha.3:4 -0.49597 0.11227
## alpha.3:5  0.25045 0.03879
## alpha.4:5 -0.66939 0.08761
## Number of clusters:   4109   Maximum cluster size: 5

The gee package doesn’t directly provide p-values but provides the z-scores, which can be used to find the p-values. The geepack provides the p-values in the way you’ll see in the lm() and glm() functions.

These models are interpreted just as the regular GLM. It has adjusted for the correlations within the clusters and provides valid standard errors and p-values.

Mixed Effects

Mixed effects models require a bit more thinking about the effects. It is called “mixed effects” because we include both fixed and random effects into the model simultaneously. The random effects are those that we don’t necessarily care about the specific values but want to control for it and/or estimate the variance. The fixed effects are those we are used to estimating in linear models and GLM’s.

These are a bit more clear with an example. We will do the same overall model as we did with the GEE but we’ll use ME. To do so, we’ll use the lme4 package. In the model below, we predict depression with asthma, family size, minutes of sedentary behavior, and the subject’s race. We have a random intercept (which allows the intercept to vary across clusters).

library(lme4)
fit_me <- lmer(dep ~ asthma + famsize + sed + race + (1 | cluster),
               data = df2,
               REML = FALSE)
summary(fit_me)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: dep ~ asthma + famsize + sed + race + (1 | cluster)
##    Data: df2
## 
##      AIC      BIC   logLik deviance df.resid 
##    25780    25844   -12880    25760     4427 
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -1.327 -0.635 -0.355  0.272  5.435 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  cluster  (Intercept)  0.105   0.324   
##  Residual             19.389   4.403   
## Number of obs: 4437, groups:  cluster, 14
## 
## Fixed effects:
##                    Estimate Std. Error t value
## (Intercept)        2.491678   0.302768    8.23
## asthmaAsthma       1.335445   0.186618    7.16
## famsize           -0.042857   0.046341   -0.92
## sed                0.001425   0.000337    4.23
## raceOtherHispanic  1.289890   0.320595    4.02
## raceWhite          0.008348   0.259449    0.03
## raceBlack          0.171658   0.273382    0.63
## raceOther         -0.552746   0.285512   -1.94
## 
## Correlation of Fixed Effects:
##             (Intr) asthmA famsiz sed    rcOthH racWht rcBlck
## asthmaAsthm -0.042                                          
## famsize     -0.510 -0.004                                   
## sed         -0.324 -0.044  0.051                            
## rcOthrHspnc -0.556 -0.032  0.051 -0.038                     
## raceWhite   -0.680 -0.038  0.135 -0.148  0.639              
## raceBlack   -0.643 -0.057  0.094 -0.131  0.624  0.775       
## raceOther   -0.580  0.000  0.048 -0.135  0.589  0.725  0.693

You’ll see that there are no p-values provided here. This is because p-values are not well-defined in the ME framework. A good way to test it can be through the anova() function, comparing models. Let’s compare a model with and without asthma to see if the model is significantly better with it in.

fit_me1 <- lmer(dep ~ famsize + sed + race + (1 | cluster),
               data = df2,
               REML = FALSE)

anova(fit_me, fit_me1)
## Data: df2
## Models:
## fit_me1: dep ~ famsize + sed + race + (1 | cluster)
## fit_me: dep ~ asthma + famsize + sed + race + (1 | cluster)
##         Df   AIC   BIC logLik deviance Chisq Chi Df Pr(>Chisq)    
## fit_me1  9 25829 25886 -12905    25811                            
## fit_me  10 25780 25844 -12880    25760  50.9      1    9.9e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This comparison strongly suggests that asthma is a significant predictor (\(\chi^2 = 50.5\), p < .001). We can do this with both fixed and random effects, as below:

fit_me2 <- lmer(dep ~ famsize + sed + race + (1 | cluster),
               data = df2,
               REML = TRUE)
fit_me3 <- lmer(dep ~ famsize + sed + race + (1 + asthma | cluster),
               data = df2,
               REML = TRUE)
anova(fit_me2, fit_me3, refit = FALSE)
## Data: df2
## Models:
## fit_me2: dep ~ famsize + sed + race + (1 | cluster)
## fit_me3: dep ~ famsize + sed + race + (1 + asthma | cluster)
##         Df   AIC   BIC logLik deviance Chisq Chi Df Pr(>Chisq)    
## fit_me2  9 25855 25912 -12918    25837                            
## fit_me3 11 25821 25892 -12900    25799  37.3      2      8e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Here, including random slopes for asthma appears to be significant (\(\chi^2 = 36.9\), p < .001).

Linear mixed effects models converge pretty well. You’ll see that the conclusions and estimates are very similar to that of the GEE. For generalized versions of ME, the convergence can be harder and more picky. As we’ll see below, it complains about large eigenvalues and tells us to rescale some of the variables.

library(lme4)
fit_gme <- glmer(dep2 ~ asthma + famsize + sed + race + (1 | cluster),
                 data = df2,
                 family = "binomial")
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00853519
## (tol = 0.001, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
##  - Rescale variables?

After a quick check, we can see that sed is huge compared to the other variables. If we simply rescale it, using the I() function within the model formula, we can rescale it by 1,000. Here, that is all it needed to converge.

library(lme4)
fit_gme <- glmer(dep2 ~ asthma + famsize + I(sed/1000) + race + (1 | cluster),
                 data = df2,
                 family = "binomial")
summary(fit_gme)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: dep2 ~ asthma + famsize + I(sed/1000) + race + (1 | cluster)
##    Data: df2
## 
##      AIC      BIC   logLik deviance df.resid 
##     2665     2722    -1323     2647     4428 
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -0.635 -0.329 -0.295 -0.258  5.032 
## 
## Random effects:
##  Groups  Name        Variance Std.Dev.
##  cluster (Intercept) 0.0232   0.152   
## Number of obs: 4437, groups:  cluster, 14
## 
## Fixed effects:
##                   Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        -2.6316     0.2435  -10.81  < 2e-16 ***
## asthmaAsthma        0.5619     0.1281    4.39  1.1e-05 ***
## famsize            -0.0336     0.0374   -0.90   0.3696    
## I(sed/1000)         0.5835     0.2618    2.23   0.0258 *  
## raceOtherHispanic   0.7564     0.2421    3.12   0.0018 ** 
## raceWhite           0.0955     0.2159    0.44   0.6581    
## raceBlack           0.0531     0.2277    0.23   0.8155    
## raceOther          -0.4950     0.2590   -1.91   0.0560 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) asthmA famsiz I(/100 rcOthH racWht rcBlck
## asthmaAsthm -0.057                                          
## famsize     -0.491 -0.012                                   
## I(sed/1000) -0.324 -0.042  0.031                            
## rcOthrHspnc -0.653 -0.031  0.044 -0.029                     
## raceWhite   -0.715 -0.037  0.132 -0.148  0.709              
## raceBlack   -0.684 -0.064  0.088 -0.124  0.715  0.781       
## raceOther   -0.571 -0.003  0.046 -0.122  0.606  0.688  0.653

Apply It

This link contains a folder complete with an Rstudio project file, an RMarkdown file, and a few data files. Download it and unzip it to do the following steps.

Step 1

Open the Chapter4.Rproj file. This will open up RStudio for you.

Step 2

Once RStudio has started, in the panel on the lower-right, there is a Files tab. Click on that to see the project folder. You should see the data files and the Chapter4.Rmd file. Click on the Chapter4.Rmd file to open it. In this file, import the data and run each type of statistical analysis presented in this chapter (there are others that are presented in Chapters 4 and 5 that you may have done already and methods from Chapter 7 that you have not done yet).

Once that code is in the file, click the knit button. This will create an HTML file with the code and output knitted together into one nice document. This can be read into any browser and can be used to show your work in a clean document.

Conclusions

This has been a really brief introduction into a thriving, large field of statistical analyses. These are the general methods for using R to analyze multilevel data. Our next chapter will discuss more modeling techniques in R, including mediation, mixture, and structural equation modeling.


  1. We discuss what this means in much more depth and demonstrate reshaping of data in Chapter 8. It is an important tool to understand if you are working with data in various forms. Although many reshape their data by copying-and-pasting in a spreadsheet, what we present in Chapter 8 is much more efficient, cleaner, less error-prone, and replicatable.