25 Repeated Measures ANOVA

Learning how to use Random Intercept ANOVA Models

Author

Affiliations

Melanie Freire

Florida International University

Robert Stempel College of Public Health and Social Work

25.1 Introduction

Repeated measures ANOVA is used when you have the same measure that participants were rated on at more than two time points. With only two time points a paired \(t\)-test will be sufficient, but for more times a repeated measures ANOVA is required. (2013-) There are many complex designs that can make use of repeated measures, but throughout this guide, we will be referring to the most simple case, that of a one-way repeated measures ANOVA. This particular test requires one independent variable and one dependent variable. The dependent variable needs to be continuous (interval or ratio) and the independent variable categorical (either nominal or ordinal). (2018-)

25.2 Neccessary packages

Make sure that you have installed the following R packages:

tidyverse for data manipulation and visualization.
ggpubr for creating easily publication ready plots.
rstatix provides pipe-friendly R functions for easy statistical analyses.(2018-)
datarium contains required data sets for this chapter.

Start by loading the following R packages

# Install packages first and then load the libraries. 
# install.packages("datarium")

library(tidyverse)
library(ggpubr)
library(rstatix)

25.3 Data source and description

For this example we will be using this dataset from the datarium package that contains 10 individuals’ self-esteem score on three time points during a specific diet to determine whether their self-esteem improved.

One-way repeated measures ANOVA can be performed in order to determine the effect of time on the self-esteem score.

# Data preparation; wide format
data("selfesteem", package = "datarium")
selfesteem

# A tibble: 10 × 4
      id    t1    t2    t3
   <int> <dbl> <dbl> <dbl>
 1     1  4.01  5.18  7.11
 2     2  2.56  6.91  6.31
 3     3  3.24  4.44  9.78
 4     4  3.42  4.71  8.35
 5     5  2.87  3.91  6.46
 6     6  2.05  5.34  6.65
 7     7  3.53  5.58  6.84
 8     8  3.18  4.37  7.82
 9     9  3.51  4.40  8.47
10    10  3.04  4.49  8.58

Now we “gather” columns t1, t2, and t3 into “long” format, then convert id and time into factor variables.

selfesteem_df <- 
  selfesteem %>%
  gather(key = "time", value = "score", t1, t2, t3) %>%
  convert_as_factor(id, time)

selfesteem_df

# A tibble: 30 × 3
   id    time  score
   <fct> <fct> <dbl>
 1 1     t1     4.01
 2 2     t1     2.56
 3 3     t1     3.24
 4 4     t1     3.42
 5 5     t1     2.87
 6 6     t1     2.05
 7 7     t1     3.53
 8 8     t1     3.18
 9 9     t1     3.51
10 10    t1     3.04
# ℹ 20 more rows

The one-way repeated measures ANOVA can be used to determine whether the means self-esteem scores are significantly different between the three time points.

Note: Whilst the repeated measures ANOVA is used when you have just “one” independent variable, if you have “two” independent variables (e.g., you measured time and condition), you will need to use a two-way repeated measures ANOVA. Two and Three-way Repeated Measures ANOVA examples with this data can be found here.

25.3.1 Summary statistics

Compute some summary statistics of the self-esteem score by groups (time): mean and sd (standard deviation)

# Statistics-summary
selfesteem_df %>%
  group_by(time) %>%
  get_summary_stats(score, type = "mean_sd")

# A tibble: 3 × 5
  time  variable     n  mean    sd
  <fct> <fct>    <dbl> <dbl> <dbl>
1 t1    score       10  3.14 0.552
2 t2    score       10  4.93 0.863
3 t3    score       10  7.64 1.14

25.3.2 Visualization

Create a box plot and add points corresponding to individual values:

bxp <- ggboxplot(selfesteem_df, x = "time", y = "score", add = "point")
bxp

25.4 Test Assumptions

Before computing repeated measures ANOVA test, you need to perform some preliminary tests to check if the assumptions are met.

25.4.1 Outiliers

Outliers can be easily identified using box plot methods, implemented in the R function identify_outliers() inside the rstatix package.

selfesteem_df %>%
  group_by(time) %>%
  identify_outliers(score)

# A tibble: 2 × 5
  time  id    score is.outlier is.extreme
  <fct> <fct> <dbl> <lgl>      <lgl>     
1 t1    6      2.05 TRUE       FALSE     
2 t2    2      6.91 TRUE       FALSE

There were no extreme outliers. In the situation where we have extreme outliers, we can include the outlier in the analysis anyway if we do not believe the result will be substantially affected. This can be evaluated by comparing the result of the ANOVA with and without the outlier. It’s also possible to keep the outliers in the data and perform robust ANOVA test using the WRS2 package. WRS2 Package

25.4.2 Normality Assumption

The outcome (or dependent) variable should be approximately normally distributed in each cell of the design. This can be checked using the Shapiro-Wilk normality test (shapiro_test() in rstatix package) or by visual inspection using QQ plot (ggqqplot() in the ggpubr package). If the data is normally distributed, the \(p\)-value should be greater than 0.05.

selfesteem_df %>%
  group_by(time) %>%
  shapiro_test(score)

# A tibble: 3 × 4
  time  variable statistic     p
  <fct> <chr>        <dbl> <dbl>
1 t1    score        0.967 0.859
2 t2    score        0.876 0.117
3 t3    score        0.923 0.380

The self-esteem score was normally distributed at each time point, as assessed by Shapiro-Wilk’s test (\(p > 0.05\)).

Note that, if your sample size is greater than 50, the normal QQ plot is preferred because at larger sample sizes the Shapiro-Wilk test becomes very sensitive even to a minor deviation from normality. QQ plot draws the correlation between a given data and the normal distribution. Create QQ plots for each time point:

ggqqplot(selfesteem_df, "score", facet.by = "time")

From the plot above, as all the points fall approximately along the reference line, we can assume normality.

25.4.3 Assumption of Sphericity

The variance of the differences between groups should be equal. This can be checked using the Mauchly’s test of sphericity. This assumption will be automatically checked during the computation of the ANOVA test using the R function anova_test() in rstatix package. The Mauchly’s test is internally used to assess the sphericity assumption. Click HERE to know more about the Assumption of Sphericity and the Mauchly’s Test and to understand why is important.

By using the function get_anova_table() to extract the ANOVA table, the Greenhouse-Geisser sphericity correction is automatically applied to factors violating the sphericity assumption.

res.aov <- anova_test(
  data = selfesteem_df, 
  # Selfesteem variable
  dv = score,
  # Sample individuals
  wid = id, 
  # Independent variable time 
  within = time
)

# Get table
get_anova_table(res.aov)

ANOVA Table (type III tests)

  Effect DFn DFd      F        p p<.05   ges
1   time   2  18 55.469 2.01e-08     * 0.829

The self-esteem score was statistically significantly different at the different time points during the diet, \(F_{(2, 18)} = 55.5\), \(p < 0.0001\), \(\eta^2_g = 0.83\). where,

F Indicates that we are comparing to an \(F\)-distribution (\(F\)-test),
(2, 18) indicates the degrees of freedom in the numerator (DFn) and the denominator (DFd), respectively,
55.5 indicates the obtained \(F\)-statistic value;
p specifies the \(p\)-value, and
\(\eta^2_g\) is the generalized effect size (amount of variability due to the within-subjects factor).

25.4.4 Post-hoc test

You can perform multiple pairwise paired \(t\)-tests between the levels of the within-subjects factor (here time). We adjust \(p\)-values using the Bonferroni multiple testing correction method.

# pairwise comparisons
pwc <- pairwise_t_test(
  data = selfesteem_df,
  formula = score ~ time,
  paired = TRUE,
  p.adjust.method = "bonferroni"
)

pwc

# A tibble: 3 × 10
  .y.   group1 group2    n1    n2 statistic    df           p p.adj p.adj.signif
* <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl>       <dbl> <dbl> <chr>       
1 score t1     t2        10    10     -4.97     9 0.000772     2e-3 **          
2 score t1     t3        10    10    -13.2      9 0.000000334  1e-6 ****        
3 score t2     t3        10    10     -4.87     9 0.000886     3e-3 **

All the pairwise differences are statistically significant.

25.5 Results

We could report the results of the post-hoc test as follows: post-hoc analyses with a Bonferroni adjustment revealed that all the pairwise differences, between time points, were statistically significantly different (\(p < 0.05\)).

pwc <- pwc %>% add_xy_position(x = "time")

bxp + 
  stat_pvalue_manual(pwc) +
  labs(
    subtitle = get_test_label(res.aov, detailed = TRUE),
    caption = get_pwc_label(pwc)
  )

25.6 Conclusion

This chapter describes how to compute, interpret and report repeated measures ANOVA in R, specifically one-way repeated measures ANOVA. We also explain the assumptions made by one-way repeated measures ANOVA tests and provide practical examples of R codes to check whether the test assumptions are met.

“ANOVA for Repeated Measures Designs Related Designs (Rationale for One-Way Repeated Measures ANOVA); Between Subjects and Between Conditions Variation; Data Assumptions for Repeated Measures ANOVA; Two-Way Related Design; ANOVA Mixed Design One Repeated Measure and One Unrelated Factor ; More Complex ANOVA Designs; Effect Size and Power; A Non- Parametric Equivalent the Friedman Test for Correlated Samples; SPSS Procedures for Repeated Measures ANOVA and Friedman.” 2013. In, 528–49. Routledge. https://doi.org/10.4324/9780203769669-23.

“ANOVA Repeated Measures.” 2018. In, 222–50. SAGE Publications, Inc. https://doi.org/10.4135/9781071802625.n9.