Lecture 10
Within-Subjects Analysis of Variance
When to Use Within-Subjects ANOVA
The within-subjects ANOVA is
really just an extension of the paired t-test
discussed earlier. It is used whenever there are repeated measures or repeated
treatments of an individual, paired participants (e.g., triplets), or matched
participants (e.g., four participants are yoked according to the amount of
daily exercise they get). Daniel describes two forms of the within-subjects
design: what he calls the "randomized complete block design" and the
repeated measures design. The randomized complete block design is a study in
which several participants are matched or yoked together (e.g., on age), then one
member of each grouping is randomly assigned to each of the treatment
conditions. If there are three treatment conditions (e.g., three different
drugs are given), each person in a group of three would be sent to a different
condition. For a within-subjects analysis to be valid, however,
the grouping of participants has to be made on some meaningful basis: for instance,
triplets, or three people matched because they have the same number of years
of experience on the job. Repeated measures, on the other hand, indicates that
each person is measured several times. In general, the within-subjects ANOVA
applies to matching, repeated measures, repeated treatments, or any time
participants are yoked together.
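The random assignment step of the randomized complete block design can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign_within_blocks`, the participant labels, the drug names, and the blocks matched on years of job experience are all hypothetical.

```python
import random

def assign_within_blocks(blocks, treatments, seed=None):
    """Randomly assign one member of each matched block to each treatment."""
    rng = random.Random(seed)
    assignment = {}
    for block_id, members in blocks.items():
        if len(members) != len(treatments):
            raise ValueError("each block needs exactly one member per treatment")
        shuffled = list(treatments)
        rng.shuffle(shuffled)  # random order of treatments within this block
        assignment[block_id] = dict(zip(members, shuffled))
    return assignment

# Hypothetical blocks: groups of three people matched on years of experience.
blocks = {
    "5 years": ["P1", "P2", "P3"],
    "10 years": ["P4", "P5", "P6"],
}
treatments = ["Drug A", "Drug B", "Drug C"]

assignment = assign_within_blocks(blocks, treatments, seed=1)
for block_id, members in assignment.items():
    print(block_id, members)
```

Every block ends up with each treatment exactly once, which is what allows the block (rather than the individual) to act as its own control.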
An example of a within-subjects
design would be a hypothetical study on cholesterol which has four treatments:
high oat fiber diet, exercise, a low fat diet, and low calorie diet. In a
between-subjects version of the study, different participants would be in each
of these four conditions. In a within-subjects version of this study, each
participant might be exposed to each of these treatments for a period of time.
So, if I were in the within-subjects cholesterol study (more specifically,
a repeated measures design), I would spend a month eating oatmeal
for breakfast. Then, the next month, I would stop eating oatmeal, and begin an
exercise regimen. The following month, I would stop exercising, and start
reducing fat intake. And so on.
One can think of the scores on the
cholesterol measure as blocked together according to each individual. Each
person has four scores. At the end of each month (and, hence, each treatment) a
cholesterol count is taken. In a way, each participant has a block of four
scores--so you can see the similarity to the randomized block design.
Like the between-subjects ANOVA and
the between-subjects t-test, the within-subjects ANOVA and the within-subjects
t-test are related. The within-subjects ANOVA can be used for two or more
groups, and when used with two groups the t-test and the F-test will lead to
the same conclusion (in that case $F = t^2$).
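The equivalence with two conditions can be checked numerically. The sketch below uses made-up scores for five participants under two conditions (the numbers are illustrative, not from the lecture) and hypothetical helper names; it computes the paired t statistic and the within-subjects F and confirms that the F equals the square of t.

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired t statistic from the difference scores d = a - b."""
    d = [x - y for x, y in zip(a, b)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

def within_subjects_f(columns):
    """F for a within-subjects design; columns holds one list per condition."""
    k, n = len(columns), len(columns[0])
    grand = mean([x for col in columns for x in col])
    ss_treatment = n * sum((mean(col) - grand) ** 2 for col in columns)
    ss_block = k * sum((mean(row) - grand) ** 2 for row in zip(*columns))
    ss_total = sum((x - grand) ** 2 for col in columns for x in col)
    ss_error = ss_total - ss_treatment - ss_block
    return (ss_treatment / (k - 1)) / (ss_error / ((n - 1) * (k - 1)))

# Made-up scores for five participants measured under two conditions.
condition_a = [180, 230, 280, 190, 140]
condition_b = [200, 250, 310, 205, 165]

t = paired_t(condition_a, condition_b)
f = within_subjects_f([condition_a, condition_b])
print(f"t squared = {t ** 2:.4f}, F = {f:.4f}")
```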
The same benefits of the
within-subjects t-test over the between-subjects t-test still apply. We need
fewer participants overall and we have more power, because each individual (or
block) acts as its own control.
The Analysis
The analysis approach is similar to that of the paired (or within-subjects or
matched) t-test. We are interested in the differences between scores for an
individual (remember when we calculated the difference score, d, for each
individual?). We will also use a similar ANOVA logic to the logic we used with
the between-subjects ANOVA. We will calculate the variation of the scores for
an individual relative to the mean of scores for that individual. In other
words, we want to know how much each participant's scores change from treatment
to treatment. If one or more of our cholesterol treatments had an effect, we
would expect cholesterol counts to change fairly dramatically from
month-to-month.
A Note on Notation
The notation for this analysis does not change dramatically, but we need to
extend it a little. Previously, we used a dot to indicate that we had combined
scores across individuals. For example, $T_{.j}$ referred to the sum of individual scores in
group $j$. This time we will do some adding the other way, so that we combine
scores across treatments rather than individuals. So, the sum of scores for one
participant in the study, summed across treatments, is symbolized by $T_{i.}$. The sum of
cholesterol counts for the second participant, who has completed all four
treatment conditions, would be symbolized as $T_{2.}$. Thus, $T_{.2}$ stands for the sum of individuals for the
second treatment, and $T_{2.}$ stands for the sum of treatments for participant 2. Similarly,
$\bar{x}_{2.}$ represents
the mean cholesterol score for participant 2 across all treatments, and $\bar{x}_{.2}$ represents the
mean of all participants for treatment number 2.
Example
Participant # | Oat Fiber | Exercise | Low Fat | Low Calorie | Sum ($T_{i.}$) | Mean ($\bar{x}_{i.}$)
1 | 180 | 200 | 160 | 200 | 740 | 185
2 | 230 | 250 | 200 | 220 | 900 | 225
3 | 280 | 310 | 260 | 270 | 1120 | 280
4 | 180 | 200 | 160 | 200 | 740 | 185
5 | 190 | 210 | 170 | 210 | 780 | 195
6 | 140 | 160 | 120 | 110 | 530 | 132.5
7 | 270 | 300 | 250 | 260 | 1080 | 270
8 | 110 | 130 | 100 | 100 | 440 | 110
9 | 190 | 210 | 170 | 210 | 780 | 195
10 | 230 | 250 | 200 | 220 | 900 | 225
Sum ($T_{.j}$) | 2000 | 2220 | 1790 | 2000 | |
Mean ($\bar{x}_{.j}$) | 200 | 222 | 179 | 200 | |
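The marginal sums and means in the table follow directly from the dot notation: summing across a row gives a participant total, and summing down a column gives a treatment total. A minimal Python sketch, with variable names chosen only for illustration:

```python
# Cholesterol counts from the example table: one row per participant,
# columns are Oat Fiber, Exercise, Low Fat, Low Calorie.
scores = [
    [180, 200, 160, 200],
    [230, 250, 200, 220],
    [280, 310, 260, 270],
    [180, 200, 160, 200],
    [190, 210, 170, 210],
    [140, 160, 120, 110],
    [270, 300, 250, 260],
    [110, 130, 100, 100],
    [190, 210, 170, 210],
    [230, 250, 200, 220],
]
treatments = ["Oat Fiber", "Exercise", "Low Fat", "Low Calorie"]

# Sum across treatments for each participant (row totals and row means).
participant_sums = [sum(row) for row in scores]
participant_means = [total / len(treatments) for total in participant_sums]

# Sum across participants for each treatment (column totals and column means).
columns = list(zip(*scores))
treatment_sums = [sum(col) for col in columns]
treatment_means = [total / len(scores) for total in treatment_sums]

# Mean of all 40 scores.
grand_mean = sum(participant_sums) / (len(scores) * len(treatments))

for name, total, avg in zip(treatments, treatment_sums, treatment_means):
    print(f"{name}: sum = {total}, mean = {avg}")
print("grand mean =", grand_mean)
```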
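Now we have several sums of squares to compute. Here n = 10 is the number of participants (blocks), k = 4 is the number of treatments, and $\bar{x}_{..} = 8010/40 = 200.25$ is the mean of all 40 scores.
Formula | Name | Description | Example
$SSTr = n \sum_{j=1}^{k} (\bar{x}_{.j} - \bar{x}_{..})^2$ | Sum of Squares Treatment | Represents variation due to treatment effect | $10[(200 - 200.25)^2 + (222 - 200.25)^2 + (179 - 200.25)^2 + (200 - 200.25)^2]$
$SSBl = k \sum_{i=1}^{n} (\bar{x}_{i.} - \bar{x}_{..})^2$ | Sum of Squares Block | Represents variation between individuals (between blocks) | $4[(185 - 200.25)^2 + (225 - 200.25)^2 + \dots + (225 - 200.25)^2]$
$SSE = SST - SSTr - SSBl$ | Sum of Squares Error | Represents error variation | Found by subtraction once the other three are known
$SST = \sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{x}_{..})^2$ | Sum of Squares Total | Represents total variation | $(180 - 200.25)^2 + (200 - 200.25)^2 + \dots + (220 - 200.25)^2$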
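The four sums of squares can be computed directly from the example data. The sketch below assumes the usual partition for this design: squared deviations of treatment means from the grand mean (weighted by the number of participants), squared deviations of participant means from the grand mean (weighted by the number of treatments), total squared deviations, and error by subtraction. The function name and dictionary keys are illustrative.

```python
from statistics import mean

# Cholesterol counts from the example table: one row per participant,
# columns are Oat Fiber, Exercise, Low Fat, Low Calorie.
scores = [
    [180, 200, 160, 200],
    [230, 250, 200, 220],
    [280, 310, 260, 270],
    [180, 200, 160, 200],
    [190, 210, 170, 210],
    [140, 160, 120, 110],
    [270, 300, 250, 260],
    [110, 130, 100, 100],
    [190, 210, 170, 210],
    [230, 250, 200, 220],
]

def sums_of_squares(scores):
    """Partition total variation into treatment, block, and error parts."""
    n = len(scores)        # number of participants (blocks)
    k = len(scores[0])     # number of treatments
    grand_mean = mean([x for row in scores for x in row])
    treatment_means = [mean(col) for col in zip(*scores)]
    block_means = [mean(row) for row in scores]

    ss_treatment = n * sum((m - grand_mean) ** 2 for m in treatment_means)
    ss_block = k * sum((m - grand_mean) ** 2 for m in block_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_treatment - ss_block  # what is left over
    return {
        "treatment": ss_treatment,
        "block": ss_block,
        "error": ss_error,
        "total": ss_total,
    }

ss = sums_of_squares(scores)
for name, value in ss.items():
    print(f"SS {name}: {value:.2f}")
```

Computing the error term by subtraction mirrors the hand calculation; it also makes it easy to see that the three components always add back up to the total.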
Now for the mean squares and the computed F-value:
Sum of Squares | d.f. | Mean Square | F
SSTr = 9,247.50 | k - 1 = 4 - 1 = 3 | MSTr = SSTr/(k - 1) = 9,247.50/3 = 3,082.50 | F = MSTr/MSE = 3,082.50/97.31 = 31.68
SSBl = 102,822.50 | n - 1 = 10 - 1 = 9 | not needed |
SSE = 2,627.50 | (n - 1)(k - 1) = (10 - 1)(4 - 1) = 27 | MSE = SSE/[(n - 1)(k - 1)] = 2,627.50/27 = 97.31 |
SST = 114,697.50 | kn - 1 = 40 - 1 = 39 | |
Interpretation
To check whether the calculated
F-value of 31.68 is significant, we compare that value to the critical value of
2.96 obtained
from Table G, using the table labeled .95 (for α = .05) with 3 and 27 d.f.
Our test is significant, indicating that there were differences among the cholesterol
treatments that are unlikely to be due to chance. What if we had set alpha to a
lower value, say .01? Would the test still be significant? To see, look up the
same d.f. values in the .99 (α = .01) table. If our calculated value
exceeds that table value, we would have significance at p < .01. Often
authors report the lowest p-value at which they would find significance (e.g.,
p < .05, p < .01, p < .001), given their calculated value. Computers
print out the exact p-value, so all one really has to do is check whether the
printed p-value is less than .05, .01, or .001.
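As a rough stand-in for the table lookup, the F-value and its tail probability can be computed from the example data using only the Python standard library. This is a sketch under stated assumptions, not how any particular statistics package does it: the helper names are hypothetical, and the F distribution's cumulative probability is approximated by numerical integration (Simpson's rule), which is adequate for comparing against .05, .01, and .001 but not a substitute for a package's exact p-value.

```python
import math
from statistics import mean

# Cholesterol counts from the example table (rows = participants).
scores = [
    [180, 200, 160, 200], [230, 250, 200, 220], [280, 310, 260, 270],
    [180, 200, 160, 200], [190, 210, 170, 210], [140, 160, 120, 110],
    [270, 300, 250, 260], [110, 130, 100, 100], [190, 210, 170, 210],
    [230, 250, 200, 220],
]

def within_subjects_f(scores):
    """Return (F, df_treatment, df_error) for a participants-by-treatments table."""
    n, k = len(scores), len(scores[0])
    grand_mean = mean([x for row in scores for x in row])
    ss_treatment = n * sum((mean(col) - grand_mean) ** 2 for col in zip(*scores))
    ss_block = k * sum((mean(row) - grand_mean) ** 2 for row in scores)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_treatment - ss_block
    df_treatment, df_error = k - 1, (n - 1) * (k - 1)
    f_value = (ss_treatment / df_treatment) / (ss_error / df_error)
    return f_value, df_treatment, df_error

def f_cdf(x, d1, d2, steps=2000):
    """Approximate P(F <= x) with Simpson's rule (steps must be even).

    The substitution x = u**2 keeps the integrand smooth at zero.
    """
    log_const = (math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2)
                 - math.lgamma(d2 / 2) + (d1 / 2) * math.log(d1 / d2))
    const = 2 * math.exp(log_const)

    def g(u):
        return const * u ** (d1 - 1) * (1 + d1 * u * u / d2) ** (-(d1 + d2) / 2)

    upper = math.sqrt(x)
    h = upper / steps
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

f_value, df1, df2 = within_subjects_f(scores)
p_value = 1 - f_cdf(f_value, df1, df2)  # upper-tail probability

print(f"F({df1}, {df2}) = {f_value:.2f}")
for alpha in (0.05, 0.01, 0.001):
    print(f"significant at alpha = {alpha}: {p_value < alpha}")
```

Comparing the tail probability with each alpha level is the same decision as comparing the calculated F with the corresponding table value.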