Here we continue the caffeine ANOVA example from the one-way ANOVA module. As I mentioned, I would ordinarily read the data from a .csv file, but here the values are entered directly. The dependent variable is the test score on an exam, and the independent variable is caffeine consumption (group 1 = control, group 2 = mild caffeine, group 3 = heavy caffeine).

Testscore <- c(75, 77, 79, 81, 83, 80, 82, 84, 86, 88, 70, 72, 74, 76, 78)
Group <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3)
main.data <- data.frame(Testscore, Group)
main.data$Group <- as.factor(main.data$Group) # Group must be a factor for the analysis
str(main.data)
## 'data.frame':    15 obs. of  2 variables:
##  $ Testscore: num  75 77 79 81 83 80 82 84 86 88 ...
##  $ Group    : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 2 2 2 2 2 ...
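
Since the data would ordinarily come from a .csv file, here is a minimal sketch of that route. The file path is hypothetical: a temporary file is written first only so that the example runs on its own. With a real data file, only the read.csv() and as.factor() lines would be needed.

```r
# Hypothetical file: write the 15 observations to a temporary .csv
# so that this example is self-contained
csv.path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    Testscore = c(75, 77, 79, 81, 83, 80, 82, 84, 86, 88, 70, 72, 74, 76, 78),
    Group = rep(1:3, each = 5)
  ),
  file = csv.path,
  row.names = FALSE
)

# Read the file back, as one would with a real data file
csv.data <- read.csv(csv.path)
csv.data$Group <- as.factor(csv.data$Group)  # Group is read as integer, so convert it
str(csv.data)
```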

Run the analysis of variance:

attach(main.data)
## The following objects are masked _by_ .GlobalEnv:
## 
##     Group, Testscore
caff.res <- aov(Testscore ~ factor(Group))
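
Before moving to the post hoc comparisons, it can help to look at the omnibus F test. The sketch below refits the same one-way model on its own copy of the data (the object names omnibus.fit and omnibus.table are just illustrative) and pulls out the ANOVA table. With group means of 79, 84, and 74, the between-groups sum of squares is 250 on 2 df and the within-groups sum of squares is 120 on 12 df, so F = 125 / 10 = 12.5.

```r
# Same data as above, re-entered so the example runs on its own
Testscore <- c(75, 77, 79, 81, 83, 80, 82, 84, 86, 88, 70, 72, 74, 76, 78)
Group <- factor(rep(1:3, each = 5))

# Fit the one-way ANOVA and extract the ANOVA table
omnibus.fit <- aov(Testscore ~ Group)
omnibus.table <- summary(omnibus.fit)[[1]]
omnibus.table

# F ratio for the Group effect (first row of the table): 125 / 10 = 12.5
f.value <- omnibus.table[["F value"]][1]
f.value
```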

Note that we used ‘aov’ instead of ‘lm’ to fit the model. If you pass an ‘lm’ fit to TukeyHSD, you will get an error, because TukeyHSD only has a method for ‘aov’ results. So use ‘aov’ for analysis of variance. The aov function produces Type I (sequential) sums of squares, so if you have a multiple-factor study with unbalanced (unequal) cell sizes, the results depend on the order in which the factors are entered, and you could have a problem. If you have a one-way ANOVA or a balanced (orthogonal) factorial, all the types of sums of squares are equal, so this is not a problem for this example.
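
The difference between the two fitting functions can be seen in a short sketch. It re-enters the data so that it runs on its own, and the object names (caff.lm, caff.aov, lm.error) are illustrative only. The tryCatch() call captures the error message rather than stopping the script.

```r
# Same data as above, re-entered so the example runs on its own
Testscore <- c(75, 77, 79, 81, 83, 80, 82, 84, 86, 88, 70, 72, 74, 76, 78)
Group <- factor(rep(1:3, each = 5))

# An lm fit has class "lm" only, and TukeyHSD() has no method for it
caff.lm <- lm(Testscore ~ Group)
lm.error <- tryCatch(TukeyHSD(caff.lm), error = function(e) conditionMessage(e))
lm.error  # the captured error message

# Refitting the same formula with aov() gives an object TukeyHSD() accepts
caff.aov <- aov(Testscore ~ Group)
class(caff.aov)
tukey.diffs <- TukeyHSD(caff.aov)$Group[, "diff"]
tukey.diffs
```

If a model has already been fitted with lm, refitting the same formula with aov, as here, is the simplest way to get the Tukey comparisons.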

Tukey.means <- TukeyHSD(x=caff.res, 'factor(Group)', conf.level=0.95)
Tukey.means
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Testscore ~ factor(Group))
## 
## $`factor(Group)`
##     diff         lwr        upr     p adj
## 2-1    5  -0.3357273 10.3357273 0.0670199
## 3-1   -5 -10.3357273  0.3357273 0.0670199
## 3-2  -10 -15.3357273 -4.6642727 0.0008342
plot(Tukey.means)

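The family-wise confidence level is 95% (alpha = .05), set here with conf.level=0.95, which is also the default. Tukey’s test adjusts the confidence intervals and p-values for multiple comparisons. The output gives the difference in means, the confidence interval, and the adjusted p-value for each comparison. Here only the 3-2 comparison (heavy versus mild caffeine) is significant: its interval excludes zero and its adjusted p-value is about .0008, while the other two comparisons have adjusted p = .067. The plot command produces a graph of the differences with their confidence intervals.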
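
To see where the adjusted intervals and p-values come from, the 2-1 comparison can be rebuilt by hand from the studentized range distribution. This is a sketch for the equal-n case only, and the object names are illustrative. The half-width of each interval is the critical value from qtukey() times sqrt(MSE / n), and the adjusted p-value is the upper tail of ptukey() at the observed difference divided by that same standard error.

```r
# Same data as above, re-entered so the example runs on its own
Testscore <- c(75, 77, 79, 81, 83, 80, 82, 84, 86, 88, 70, 72, 74, 76, 78)
Group <- factor(rep(1:3, each = 5))
fit <- aov(Testscore ~ Group)

n <- 5                                   # observations per group (balanced design)
k <- nlevels(Group)                      # number of group means
df.error <- df.residual(fit)             # 12
mse <- deviance(fit) / df.error          # residual SS / df = 120 / 12 = 10
se <- sqrt(mse / n)                      # standard error on the studentized range scale

# Critical value of the studentized range and the resulting half-width
q.crit <- qtukey(0.95, nmeans = k, df = df.error)
half.width <- q.crit * se

# Rebuild the 2-1 row of the TukeyHSD table
group.means <- tapply(Testscore, Group, mean)
diff.21 <- unname(group.means["2"] - group.means["1"])
p.adj.21 <- ptukey(abs(diff.21) / se, nmeans = k, df = df.error, lower.tail = FALSE)
c(diff = diff.21, lwr = diff.21 - half.width, upr = diff.21 + half.width, p.adj = p.adj.21)
```

The values should agree with the 2-1 row printed by TukeyHSD above.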