The Multilevel Model for Change (Ch 3 of ALDA)

Posted on July 19, 2013 Updated on November 17, 2024

ALDA stands for the book, Applied Longitudinal Data Analysis, by Singer and Willett. I’ve had this lying around for a while and decided to give it another go. If you know a little about regression and want to get started with data analysis using multilevel models, this is probably a good place to start. It’s light on the math and heavy on application. Well, maybe not that heavy. They don’t actually provide anything in the way of computer code when it comes time to fit a model. They just do it and produce a table of results. Fortunately, our friends at the UCLA Statistical Consulting Group have generously provided programs to reproduce the examples in the book using R, SAS, and Stata, among others. So that’s where I turned when I wanted to replicate the example running through chapter 3.

The example involves 103 African-American infants born into low-income families. When the children were 6 months old, about half were randomly assigned to an early intervention program designed to enhance cognitive functioning. The other half received no intervention and acted as a control. Cognitive performance was measured at ages 12, 18 and 24 months by way of a test. The researches wanted to examine the effects of program participation on the test results. There’s actually a lot more to this study. Here’s the abstract. But the authors scaled it back and took a portion of the data to create a friendly introductory example to multilevel modeling. And I have to say, they succeeded. It is indeed an easy-to-follow example. The chapter can be read in one sitting and gives you a sense of what multilevel modeling is all about. In this case we want to model individual change in cognitive ability over time with a straight line,

$$ y = \beta_{0} + \beta_{1}(time)$$

but then also model the intercept and slope of that line with respect to program:

$$ \beta_{0} = \gamma_{00} + \gamma_{01}(program)$$

$$ \beta_{1} = \gamma_{10} + \gamma_{11}(program)$$

So we have two levels of modeling going on. Hence, “multilevel” modeling.

Naturally I wanted to replicate the Chapter 3 results in R. That’s why I’m reading the book, so I can learn how to do this stuff. So I hit up the UCLA site to get the necessary R code. And there at the top the page, before I can even get started, is a roadblock:

Please note that the “early_int” data file (which is used in Chapter 3) is not included among the data files. This was done at the request of the researchers who contributed this data file to ensure the privacy of the participants in the study. Although the web page shows how to obtain the results with this data file, we regret that visitors do not have access to this file to be able to replicate the results for themselves.

Seriously? C’mon. Did the authors not think about this when writing the book? Couldn’t they come up with a different example that can be replicated? Very frustrating. So I did what any honorable statistician would do and ~~gave up, turned off the computer and went outside to enjoy nature~~ Googled the name of the dataset used in the R code on the UCLA web site (earlyint_pp.txt). The very first result takes you to this Harvard web page, where the data in question is available in SAS and Stata formats. Now the authors of ALDA are both Harvard professors. And here is the data – that I’m not supposed to have access to – available on a Harvard web page for an old statistics class taught by one of the authors of the book. I guess the researchers changed their mind? Anyway, I now had the data and could try to replicate the results. And you’ll be happy to know the data contained no identifying information. Nov 2024 update: see this GitHub repo for the data.

It had four variables: (1) an ID number for the infant, (2) the test score, (3) age at test, and (4) an indicator for program participation.

I downloaded the Stata dataset because I can read that into R using the foreign package:

library(foreign)
early.int <- read.dta("earlyint_pp.dta")

With the data loaded, I was ready to start replicating the results. This required me to copy and paste the R code from the UCLA web site. I don’t usually go for copying-and-pasting, but in this case I was OK with it. It was mostly exploratory analysis. And it worked like a charm. That is until I got to the part where you actually fit a multilevel model to the data. Running the following R code produced an error:

library(nlme)
lme(cog ~ time*program, data=early.int, random = ~ time | id, , method="ML")

Error in lme.formula(cog ~ time * program, data = early.int, random = ~time |  : 
  nlminb problem, convergence error code = 1
  message = iteration limit reached without convergence (10)

Hmm. Seemed to work for the UCLA person. What to do? Long story short, I needed to specify a different optimizer. The default optimizer for lme is “nlminb”. I needed to specify “optim”, like so:

model1 <- lme(cog ~ time*program, data=early.int, random = ~ time | id, 
              method="ML", control=list(opt = "optim"))
summary(model1)

That worked! Or at least it gave me some output. Upon closer inspection the random effects were slightly different, but everything else looked about the same. Why does “optim” work and not “nlminb”? I have no idea. Obviously they go about optimization in different ways. But I’m not too worried at the moment since I’m just getting my feet wet with multilevel modeling.

In researching the error above I discovered I could also fit the multilevel model using the following code:

library(lme4)
mod1 <- lmer(cog ~ time*program + (time | id), early.int)
summary(mod1)

The output from this is also slightly different from the book as well as the output from the lme call above using “optim”. The coefficients of the fixed effects are all the same, but the random effects are slightly different.

So what does all this mean? How do you interpret the output? What is multilevel modeling? Well, that wasn’t quite the point of this post. I wanted to document for posterity how to replicate the results of ALDA Chapter 3 using R despite the message to the contrary on the UCLA site. Besides, if you’ve read this far then you're probably also working through ALDA and don't need me to repeat what's in the book. However, if you insist, I point you to this solid PowerPoint presentation that boils Chapter 3 of ALDA to its essence. It tells you everything you need to know about modeling this example and interpreting the results.

Clay Ford

10 comments

Pingback: The Multilevel Model for Change (Ch 3 of ALDA) – revisited | Statistics you can Probably Trust
John Muschelli says:

January 19, 2015 at 11:22 pm

Thanks for the data link, nice post.

In case anyone comes looking around here,
lmer from lme4 uses REML = TRUE as default, you are fitting with ML, so you may different results for both. Moreover, you may get slight differences in the SE of the beta coefficients:

> model1
> model1.lme summary(model1)
Linear mixed-effects model fit by REML
Data: early.int
AIC BIC logLik
2374.784 2404.547 -1179.392

Random effects:
Formula: ~time | id
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 11.295632 (Intr)
time 3.650461 -0.91
Residual 8.646211

Fixed effects: cog ~ time * program
Value Std.Error DF t-value p-value
(Intercept) 107.84074 2.054203 204 52.49760 0.0000
time -21.13333 1.902278 204 -11.10949 0.0000
program 6.85466 2.737461 101 2.50402 0.0139
time:program 5.27126 2.535004 204 2.07939 0.0388
Correlation:
(Intr) time progrm
time -0.638
program -0.750 0.479
time:program 0.479 -0.750 -0.638

Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.30359362 -0.55634165 0.02384933 0.56399047 2.32448601

Number of Observations: 309
Number of Groups: 103
> summary(model1.lme)
Linear mixed model fit by REML [‘lmerMod’]
Formula: cog ~ time * program + (time | id)
Data: early.int

REML criterion at convergence: 2358.7

Scaled residuals:
Min 1Q Median 3Q Max
-2.32496 -0.56664 0.02557 0.56998 2.31593

Random effects:
Groups Name Variance Std.Dev. Corr
id (Intercept) 126.76 11.259
time 10.32 3.213 -1.00
Residual 75.49 8.689
Number of obs: 309, groups: id, 103

Fixed effects:
Estimate Std. Error t value
(Intercept) 107.841 2.053 52.53
time -21.133 1.893 -11.16
program 6.855 2.736 2.51
time:program 5.271 2.523 2.09

Correlation of Fixed Effects:
(Intr) time progrm
time -0.638
program -0.750 0.479
time:progrm 0.479 -0.750 -0.638

Reply
Lynette Bikos says:

May 17, 2019 at 12:08 am

Although a few years later, I am working through ALDA. And like you, am completely bummed that the data for Chapter 3 is not available. Unfortunately, the website you linked is no longer available. Is there any chance you would share the datset? Thanks.

Reply
Lynette Bikos says:

May 17, 2019 at 10:57 am

Me again! You can disregard the request. I found the dataset on yet another wonderful teaching lab on GitHub: https://github.com/nmldesjardins/MLM/tree/master/Lab%205_growth%20models I am so grateful for all of you who provide these demonstrations.

Reply
1. Rommel Bunuan says:
  
  May 23, 2020 at 6:17 pm
  
  Do you mind sharing the data with me Lynette? I had a hard time reading it into R.
  Thanks,
  Rommel
  
  Reply
  1. Clay Ford says:
    
    May 24, 2020 at 10:58 am
    
    The data on the GitHub site appears to be in Stata format (DTA). You can read that into R using the haven package. For example, to read it in and create a data frame called “dat”:
    
    install.packages("haven") library(haven) dat <- read_dta("earlyint_pp.dta")
    
    Just make sure you have the DTA file in your working directory. If that confuses you, then you can try using RStudio's Import dialog. Click the Import Dataset button and select "From Stata".
    
    Reply
    1. Rommel Bunuan says:
      
      May 28, 2020 at 12:23 am
      
      Hi Clay,
      
      Thank you so much! I got it.
      
      Rommel
      
      Reply

Hi, this is great. Have you done anything on Ch2? I am not able to apply the code from here (https://stats.idre.ucla.edu/r/examples/alda/r-applied-longitudinal-data-analysis-ch-2/) to my own dataset, as the r code isn’t explained at all. I am trying to create plots of fitted trajectories by level of potential predictor variables (e.g., males vs female). The code provided is here:

# fitting the linear model by id, males only
tolm <- tolerance.pp[male == 1 , ]
fitmlist <- by(tolm, tolm$id, function(bydata) fitted.values(lm(tolerance ~ time, data=bydata)))
fitm <- unlist(fitmlist)

#appending the average for the whole group
lm.m <- fitted( lm(tolerance ~ time, data=tolm) )
names(lm.m) <- NULL
fit.m2 <- c(fitm, lm.m[1:5])
age.m <- c(tolm$age, seq(11,15))
id.m <- c(tolm$id, rep(111, 5))

#plotting the linear fit by id, males
#id.m=111 denotes the average value for males
interaction.plot(age.m, id.m, fit.m2, ylim=c(0, 4), xlab="AGE", ylab="TOLERANCE", lwd=1)
title(main="Males")

I'm getting lost at the numbers – 111, 11, 15 etc – there's no indication of what these mean/do in this code (and thus what I should be inputting to run this code on my own dataset). Any help would be appreciated!

Hi Amber.

Yeah, the UCLA code is rather complex, though explicit, which can be good when you’re learning this stuff. There are easier ways to do recreate Fig 2.7, which I share below. First, I’ll try to explain what each line is doing.

# create a data frame of just the males

tolm <- tolerance.pp[male == 1 , ]  
  
# fit a linear model to each person (ie, by id) and get the fitted values  

fitmlist <- by(tolm, tolm$id, function(bydata) 
                              fitted.values(lm(tolerance ~ time, data=bydata)))   

# the result of the previous line of code is a list object. The unlist()
# function turns it into a vector.

fitm <- unlist(fitmlist)     


# now fit a linear model to the entire data set of males and get the fitted values
lm.m <- fitted( lm(tolerance ~ time, data=tolm) )


# strip out the "names" from the lm.m object. To see the names, enter lm.m in
# the R console before running the next line.

names(lm.m) <- NULL


# combine the fitm vector with the first 5 values of the lm.m vector. Why only
# the first 5? Because the first 5 are all the unique values. The just repeat
# after that. The fit.m2 vector now contains fitted values for each male as well
# as the overall average.

fit.m2 <- c(fitm, lm.m[1:5])


# combine the ages of the males with an extra vector of ages for the "average"
# male. seq(11,15) returns 11, 12, 13, 14, 15
age.m <- c(tolm$age, seq(11,15))


# combine the male id numbers with an arbitrary id, 111. The makes it appear in
# the middle of the legend, generated in the next line of code.

id.m <- c(tolm$id, rep(111, 5))


# Now make the plot. The first argument is the x-axis (age), the second argument
# is the grouping variable for the separate lines (id), the third argument is
# the y-axis. This isn't really what the interaction.plot() function was
# intended to do, but it works.

interaction.plot(age.m, id.m, fit.m2, ylim=c(0, 4), 
                 xlab="AGE", ylab="TOLERANCE", lwd=1)

An easier and modern way to make the plot is as follows:

# first load the ggplot2 and dplyr packages
library(ggplot2)
library(dplyr)

# get means by age for males and females
mean_tol1 <- tolerance.pp %>% 
  group_by(male, age) %>% 
  summarise(tolerance = mean(tolerance))

# create plot, grouped by id; let ggplot handle all the fitting of linear
# models. The first geom_smooth() plots the individuals, the second plots the
# mean. The facet_wrap() function breaks the plots into male versus female.

ggplot(tolerance.pp, aes(x = age, y = tolerance, group = id)) +
  geom_smooth(color = "black", size = 0.5, se = F, method = "lm") +
  geom_smooth(aes(x = age, y = tolerance, group = 1), 
              data = mean_tol1, 
              size = 2, se = F, method = "lm") +
  facet_wrap(~ male) +
  ylim(c(0,4))

Hope this helps!

Elbee says:

May 7, 2020 at 2:31 pm

Thanks for this post! Helped me though the bit of code in chapter 3 that didn’t work.

Reply

Clay Ford

10 comments

Leave a Reply Cancel reply