T-test Review
Between-subjects t-test
The between-subjects t-test
is sometimes called the ‘between-group t-test’, or the ‘independent t-test’, or
the ‘separate t-test’.
Here is a second example, with the same means as the first example (given in
class) but smaller variances. Think about why smaller variances lead to a
different outcome.
No Advice | | Extra Advice | |
IRS CS Agent | # Complaints | IRS CS Agent | # Complaints |
1 | 4 | 6 | 9 |
2 | 4 | 7 | 8 |
3 | 6 | 8 | 10 |
4 | 8 | 9 | 8 |
5 | 8 | 10 | 10 |
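The t-value for this second example can be worked out by hand from the table. The following is a sketch that assumes the standard pooled-variance formula for two independent groups of equal size; the symbols (group 1 = No Advice, group 2 = Extra Advice, SS = sum of squared deviations, s_p^2 = pooled variance) are notation chosen here and may differ from the notation used in class.

```latex
\bar{X}_1 = \frac{4+4+6+8+8}{5} = 6, \qquad
\bar{X}_2 = \frac{9+8+10+8+10}{5} = 9

SS_1 = \sum (X_1 - \bar{X}_1)^2 = 4+4+0+4+4 = 16, \qquad
SS_2 = \sum (X_2 - \bar{X}_2)^2 = 0+1+1+1+1 = 4

s_p^2 = \frac{SS_1 + SS_2}{n_1 + n_2 - 2} = \frac{16 + 4}{8} = 2.5

t = \frac{\bar{X}_2 - \bar{X}_1}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}
  = \frac{9 - 6}{\sqrt{2.5 \times 0.4}} = \frac{3}{1} = 3.00,
\qquad df = n_1 + n_2 - 2 = 8
```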
Because t_crit with df = 8 and α = .05 is equal to 2.306, the calculated
t-value (3.00 in absolute value for the data above) is significant at p < .05.
Reject the null hypothesis that there is no difference between the means.
Conclusion: The extra advice
condition created significantly more complaints than the no advice
condition. The difference is unlikely
to be due to chance.
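To see why smaller variances change the outcome, the two between-subjects examples can be compared in a few lines of Python. This is a minimal sketch, assuming the standard pooled-variance (equal variances) independent t-test; the function name `pooled_t` and the variable names are illustrative only. The first-example scores are taken from the within-subjects table later in this review, which states that it reuses those numbers.

```python
import math
from statistics import mean, variance


def pooled_t(group1, group2):
    """Between-subjects t-test using a pooled variance estimate.

    Returns (t, df, se), where t = (mean(group2) - mean(group1)) / se.
    """
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their df.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group2) - mean(group1)) / se
    return t, df, se


T_CRIT = 2.306  # two-tailed critical t for df = 8, alpha = .05 (from the text)

# First example (larger variances): same scores as the within-subjects table.
no_advice_1 = [2, 4, 6, 8, 10]
extra_advice_1 = [7, 8, 10, 8, 12]

# Second example (smaller variances): scores from the table above.
no_advice_2 = [4, 4, 6, 8, 8]
extra_advice_2 = [9, 8, 10, 8, 10]

t1, df1, se1 = pooled_t(no_advice_1, extra_advice_1)
t2, df2, se2 = pooled_t(no_advice_2, extra_advice_2)

print(f"Example 1: t({df1}) = {t1:.2f}, SE = {se1:.2f}, significant: {abs(t1) > T_CRIT}")
print(f"Example 2: t({df2}) = {t2:.2f}, SE = {se2:.2f}, significant: {abs(t2) > T_CRIT}")
# Example 1: t(8) = 1.79, SE = 1.67, significant: False
# Example 2: t(8) = 3.00, SE = 1.00, significant: True
```

Both examples have the same mean difference (3 complaints), but the smaller variances in the second example shrink the standard error in the denominator, so only the second t-value exceeds 2.306.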
Within-subjects t-test
The within-subjects t-test
is also known as the ‘paired t-test’, the ‘dependent t-test’, or the ‘repeated
measures t-test’. It has all of these names because it is used in several
different situations: the same person is measured twice (longitudinal or
repeated measures design), or pairs of individuals are linked together or
“yoked”. Pairs may be naturally linked (e.g., twins or married couples), or the
experimenter may link them, as when they are ‘matched’ on some score (e.g.,
matched on age).
Example. We could
have conducted the same study of IRS customer service a different way—by
introducing the “extra advice” intervention half-way through the tax season for
all the customer service agents in the study (only 5 this time). In this design, we might have the usual “no
advice” condition for 40 days, and then the “extra advice” in the second 40
days. (Note: there are some
methodological problems with this design that can be overcome with some
changes, but let’s just assume this is the design for now).
Using the same numbers as in
the first between-subjects example, we have two scores for each of 5
customer service agents. Notice we are
only using half the number of cases now.
IRS CS Agent | Before | After | D (Before − After) | D − D̄ (magnitude) | (D − D̄)² |
1 | 2 | 7 | -5 | 2 | 4 |
2 | 4 | 8 | -4 | 1 | 1 |
3 | 6 | 10 | -4 | 1 | 1 |
4 | 8 | 8 | 0 | 3 | 9 |
5 | 10 | 12 | -2 | 1 | 1 |
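Formulas (the standard within-subjects formulas, reconstructed from the table
columns; D is the Before − After difference for each agent and N is the number
of agents):

$$\bar{D} = \frac{\sum D}{N} = \frac{-15}{5} = -3$$

$$s_D = \sqrt{\frac{\sum (D - \bar{D})^2}{N - 1}} = \sqrt{\frac{16}{4}} = 2$$

$$t = \frac{\bar{D}}{s_D / \sqrt{N}} = \frac{-3}{2 / \sqrt{5}} \approx -3.35$$

The sign only reflects the direction of subtraction; the magnitude, 3.35, is
what is compared with the critical value.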
df = N − 1 = 5 − 1 = 4
t_crit at α = .05 with df = 4 is 2.776. So, the calculated t-value of 3.35 is significant at p < .05, indicating that there is a significant difference between the number of complaints before and after the extra advice intervention was introduced. Notice that the same data in example 1 yielded a non-significant difference (with twice as many cases!). The within-subjects test is more powerful because variation due to individual differences is eliminated in the within-subjects design. Each subject serves as his or her own comparison or control.
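The within-subjects calculation can be checked with a short Python sketch. It assumes D = Before − After, as in the table, and uses only the standard library; the variable names are illustrative. It also compares the variance of the difference scores with the sum of the two condition variances, which is one way to see how much variability the pairing removes.

```python
import math
from statistics import mean, stdev, variance

# Complaints for the same 5 agents before and after the extra advice.
before = [2, 4, 6, 8, 10]
after = [7, 8, 10, 8, 12]

# Difference score for each agent (D = Before - After).
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
df = n - 1
d_bar = mean(diffs)            # mean difference
s_d = stdev(diffs)             # sample SD of the differences (N - 1 denominator)
se = s_d / math.sqrt(n)        # standard error of the mean difference
t_paired = d_bar / se

T_CRIT = 2.776  # two-tailed critical t for df = 4, alpha = .05 (from the text)

print(f"mean D = {d_bar}, s_D = {s_d:.2f}, t({df}) = {t_paired:.2f}")
print(f"significant: {abs(t_paired) > T_CRIT}")
# mean D = -3, s_D = 2.00, t(4) = -3.35
# significant: True

# How much variability does pairing remove?
var_sum = variance(before) + variance(after)  # variance of a difference if scores were unrelated
var_d = variance(diffs)                       # variance of the actual paired differences
print(f"sum of condition variances = {var_sum}, variance of D = {var_d}")
# sum of condition variances = 14, variance of D = 4
```

Agents with many complaints before the intervention also tend to have many afterwards, so the difference scores vary far less (variance 4) than two unrelated sets of scores would (variance 14). That stable agent-to-agent variation is what the within-subjects design removes from the error term.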