Planned comparisons and post hoc tests used to be treated as separate entities. You either ran planned comparisons instead of an overall F test for the ANOVA, or you ran a post hoc test comparing all the cell means and picked out the significant ones. Now these look more like two ends of a spectrum in which the degree of 'planning' varies. At any rate, a number of different statistical tests are available here; there is a voluminous literature on the topic, and smart people disagree about the best approach.
In the PowerPoint slideshow, there is an example with four different groups, all of whom take a common test at the end of instruction. The groups are (1) control (standard lecture), (2) control + computer tutor, (3) control + computer tutor + lab, and (4) control + computer tutor + lab + quiz. I would ordinarily read these data from a .csv file, but that is awkward with this software, so I have entered the data directly in the file.
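For reference, reading the same data from a file would look something like this (a sketch only; the file name PC_data.csv is hypothetical):

PC.data <- read.csv("PC_data.csv")               # hypothetical file with columns 'test' and 'instruct'
PC.data$instruct <- as.factor(PC.data$instruct)  # read.csv would leave 'instruct' as an integer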
test <- c(22, 15, 17, 18, 26, 27, 24, 23, 28, 31, 27, 26, 21, 21, 26, 20)  # 4 scores per group
instruct <- as.factor(c(rep(1, 4), rep(2, 4), rep(3, 4), rep(4, 4)))  # group labels 1-4
PC.data <- data.frame(test, instruct)
str(PC.data)
## 'data.frame': 16 obs. of 2 variables:
## $ test : num 22 15 17 18 26 27 24 23 28 31 ...
## $ instruct: Factor w/ 4 levels "1","2","3","4": 1 1 1 1 2 2 2 2 3 3 ...
The test scores and instruction methods are contained in the PC.data object. Note that I told R that ‘instruct’ was a factor rather than an integer.
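Before testing anything, it helps to have the group means in hand; the contrast arithmetic later is easier to follow with them. (This quick check is mine, not from the slides.)

tapply(PC.data$test, PC.data$instruct, mean)  # mean test score per instruction group

##  1  2  3  4 
## 18 25 28 22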
In the PowerPoint slides, I compute a garden-variety ANOVA on these data, so I've done it here as well, even though it is not strictly necessary.
res1 <- lm(PC.data$test ~ PC.data$instruct)  # one-way ANOVA fit as a linear model
anova(res1)
## Analysis of Variance Table
##
## Response: PC.data$test
## Df Sum Sq Mean Sq F value Pr(>F)
## PC.data$instruct 3 219 73 12.167 0.0005971 ***
## Residuals 12 72 6
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The results of the ANOVA agree with the PowerPoint slides.
Now we’ll set up some contrasts (representations of our planned comparisons) for R to use.
c1 <- c(.5, .5, -.5, -.5)  # groups 1 and 2 vs. groups 3 and 4
c2 <- c(1, -1, 0, 0)       # group 1 vs. group 2
c3 <- c(0, 0, 1, -1)       # group 3 vs. group 4
mat <- cbind(c1, c2, c3)   # contrast matrix, one contrast per column
contrasts(PC.data$instruct) <- mat  # attach the contrasts to the factor
Note that the first contrast compares groups 1 and 2 vs. groups 3 and 4. The second and third contrasts compare 1 vs. 2 and 3 vs. 4, so all of the comparisons are independent (orthogonal); the quick check below confirms this. With 4 categories, we can have at most 3 independent contrasts. I chose these for convenience rather than theory; we might instead have started with 1 vs. 2, 3, and 4 to compare the control group to the rest of the interventions, for example. Recall that PC.data has the two variables 'test' and 'instruct.' We have combined the 3 contrasts into a matrix called 'mat' and assigned the contrasts to the variable 'instruct.'
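A quick way to confirm that the contrasts really are orthogonal (my own check): the off-diagonal entries of t(mat) %*% mat should all be zero. The diagonal entries, the sums of squared weights, will matter again when we interpret the coefficients below.

crossprod(mat)  # off-diagonals are 0, so the three contrasts are mutually orthogonal

##    c1 c2 c3
## c1  1  0  0
## c2  0  2  0
## c3  0  0  2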
Now the analysis.
model3 <- lm(PC.data$test ~ PC.data$instruct)
summary(model3)  # note: the split= argument belongs to summary.aov(), not summary.lm(); see the aov() version at the end
##
## Call:
## lm(formula = PC.data$test ~ PC.data$instruct)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.00 -1.25 -1.00 1.25 4.00
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23.2500 0.6124 37.967 7.17e-14 ***
## PC.data$instructc1 -3.5000 1.2247 -2.858 0.01441 *
## PC.data$instructc2 -3.5000 0.8660 -4.041 0.00164 **
## PC.data$instructc3 3.0000 0.8660 3.464 0.00468 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.449 on 12 degrees of freedom
## Multiple R-squared: 0.7526, Adjusted R-squared: 0.6907
## F-statistic: 12.17 on 3 and 12 DF, p-value: 0.0005971
The contrasts are all significant (interpret the t values). The first contrast (-3.5) agrees with the PowerPoint slide exactly. The last two are equal to half of what is presented in the slides. The reason is scaling: for orthogonal contrasts, the estimated coefficient is the raw contrast value divided by the sum of the squared contrast weights (the diagonal of crossprod(mat) above). For c1 that sum is 1, so the estimate matches the raw contrast exactly; for c2 and c3 it is 2, so my raw values of -7 and 6 become -3.5 and 3. The significance test is the same either way.
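We can verify that scaling directly from the group means (again a quick sketch of my own):

means <- tapply(PC.data$test, PC.data$instruct, mean)
sum(c1 * means) / sum(c1^2)  # -3.5 / 1, matching the c1 estimate

## [1] -3.5

sum(c2 * means) / sum(c2^2)  # -7 / 2, matching the c2 estimate

## [1] -3.5

sum(c3 * means) / sum(c3^2)  # 6 / 2, matching the c3 estimate

## [1] 3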
We typically use only a few contrasts that are theoretically meaningful.
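As a closing aside: if you want each contrast's share of the between-groups sum of squares, the split= argument does that, but it works with summary.aov() rather than summary.lm(). A minimal sketch (the contrast labels are my own):

model4 <- aov(test ~ instruct, data = PC.data)  # same model refit with aov(); it picks up the contrasts on PC.data$instruct
summary(model4, split = list(instruct = list('1,2 vs 3,4' = 1, '1 vs 2' = 2, '3 vs 4' = 3)))

The three one-df sums of squares (49, 98, and 72) partition the between-groups SS of 219, and each contrast's F is just the square of the corresponding t above (e.g., 12 = 3.464^2 for the 3-vs-4 contrast).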