Lecture 23
Overview: A Hitchhiker's Guide to Analyses
At the beginning of the course, I noted that there are really two important
distinctions to make in deciding which analysis is appropriate for which
situation. One merely has to identify the independent and dependent variables
and decide whether each is dichotomous or continuous, according to the rough
distinctions I made in Lecture 1. I'd like to return to that by presenting a
table that summarizes the analyses we have covered this quarter.
This table does not include every possible analysis, but it probably covers
about 70-80% of them.
The Analyst's Guide to the Galaxy

                                  | Dependent Variable               |
                                  | Dichotomous         | Continuous |
Independent Variable: Dichotomous | Chi-square          | t-test     |
Independent Variable: Continuous  | Logistic Regression | Regression |
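The table can also be read as a simple lookup: name the type of the independent variable, name the type of the dependent variable, and read off the analysis. The short Python sketch below only restates the four cells of the table; the names ANALYSIS_TABLE and choose_analysis are made up for illustration and do not come from any statistics package.

```python
# Hypothetical helper: restates the four cells of the table as a lookup.
ANALYSIS_TABLE = {
    # (independent variable, dependent variable): analysis
    ("dichotomous", "dichotomous"): "chi-square",
    ("dichotomous", "continuous"): "t-test",
    ("continuous", "dichotomous"): "logistic regression",
    ("continuous", "continuous"): "regression",
}


def choose_analysis(independent: str, dependent: str) -> str:
    """Return the analysis the table suggests for this pair of variable types."""
    key = (independent.strip().lower(), dependent.strip().lower())
    if key not in ANALYSIS_TABLE:
        raise ValueError(f"expected 'dichotomous' or 'continuous', got {key}")
    return ANALYSIS_TABLE[key]


# A two-group comparison (dichotomous predictor) on a continuous outcome
print(choose_analysis("dichotomous", "continuous"))  # t-test
```

As with the table itself, this covers only the four basic cases and says nothing about the other analyses described below.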
Other Analyses
This class has been a very good introduction to the most prevalent analyses in
use in most of the social sciences. Unfortunately, there is no way to cover all
possible analyses in a 10-week course. I believe that the topics covered are
the most important for understanding the majority of the research analyses out
there, but, of course, you are likely to come across other analyses or issues
that we did not discuss.
There are two main areas that we did not cover. The first is more advanced
topics. A number of analysis techniques in use are more complex mathematically
and statistically, and to learn them, one must first have a course like this.
The second area involves analyses that are used relatively infrequently and are
designed for fairly specific data situations (e.g., nonparametric analyses).
Here are some general categories and brief descriptions of some other analyses.
Multivariate analyses:
Factor analysis - used to investigate the internal reliability of a self-report
measure to see which items are most highly correlated with one another. This
analysis attempts to identify a "structure" underlying the items of a
questionnaire, by finding groups of items that measure the same subdomain of
the larger concept being measured.
Principal components analysis - similar to factor analysis in purpose, but a
simpler approach mathematically. Historically, it preceded more sophisticated
factor analysis approaches.
Discriminant analysis - examines how a set of independent (predictor) variables
can help classify individuals belonging to two or more groups. For example, how
does a variety of health-related practices help identify healthy and unhealthy
individuals? This technique answers questions similar to those that can be
answered by logistic regression.
Cluster analysis - finds groupings of individuals in a sample who are similar
to one another on one or more variables. Differs from factor analysis because
it groups individuals, not items on a questionnaire.
Multivariate Analysis of Variance (MANOVA) - an extension of ANOVA that uses
more than one dependent variable.
Analysis of Covariance (ANCOVA) - an ANOVA that controls for other
"covariates." Related to regression in that other predictors can be
controlled while making group comparisons of means.
Structural Equation Modeling and Path Analysis - these techniques can be seen
as an extension of regression analysis. They attempt to analyze more
complicated causal models and can incorporate elements of factor analysis.
Nonparametric analyses - nonparametric
analyses involve variables that do not fit well into our classification of
dichotomous and continuous variables. When cases are ranked in a data set, for
instance, nonparametric tests are used. If ranks (i.e., ranking of an
individual among others in the sample) are used as scores in the analysis, the
analyses we have discussed are not appropriate. Daniel has a nice overview of
nonparametrics in the text.
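To make the idea of "ranks used as scores" concrete, here is a minimal Python sketch that converts raw scores to ranks and then correlates the ranks, which is one common definition of the Spearman rank correlation. The data and the function names (average_ranks, pearson, spearman) are made up for illustration; this is a sketch under those assumptions, not the procedure from the text.

```python
def average_ranks(scores):
    """Rank scores from 1 (lowest) upward; tied scores share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next score is tied with the current one
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Ordinary Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Rank correlation: the Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up scores for five cases
hours = [2, 4, 4, 7, 9]
grade = [55, 60, 70, 80, 95]
print(average_ranks(hours))  # [1.0, 2.5, 2.5, 4.0, 5.0]
print(spearman(hours, grade))
```

Once scores are replaced by ranks, only the ordering of cases matters, which is why tests built on ranks need their own reference distributions.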
Multilevel models or Hierarchical
Linear Models - an extension of regression analysis. Used with "nested"
data in which subgroupings (e.g., classes, hospitals, case managers) exist
within the data. The data is said to be structured hierarchically in these
cases.
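The Python sketch below uses made-up scores for students nested within three classes to show what "nested" data look like and how total variability splits into a between-group part and a within-group part. It is only a descriptive illustration under those assumptions, not a multilevel model, and the names scores_by_class and sum_of_squares_split are hypothetical.

```python
# Made-up scores for students nested within three classes (hypothetical names).
scores_by_class = {
    "class_a": [70, 72, 68],
    "class_b": [80, 85, 81],
    "class_c": [60, 58, 65],
}


def sum_of_squares_split(groups):
    """Return (between-group SS, within-group SS, total SS) for nested scores."""
    all_scores = [s for scores in groups.values() for s in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((s - grand_mean) ** 2 for s in all_scores)
    ss_between = 0.0
    ss_within = 0.0
    for scores in groups.values():
        group_mean = sum(scores) / len(scores)
        # Each case contributes its group's squared distance from the grand mean
        ss_between += len(scores) * (group_mean - grand_mean) ** 2
        # ...plus its own squared distance from its group mean
        ss_within += sum((s - group_mean) ** 2 for s in scores)
    return ss_between, ss_within, ss_total


between, within, total = sum_of_squares_split(scores_by_class)
# Share of the total variability that lies between classes
print(round(between / total, 3))
```

When a large share of the variability lies between groups, cases in the same group resemble one another, which conflicts with the independence assumption behind ordinary regression and is the situation multilevel models are designed for.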
Further Reading
Most of the following references are accessible, and given your introduction
through this course, you should be able to understand them without too much
difficulty. Although a few of these books require a bit more understanding of
mathematics than was necessary for this course, not all do. Most are also
informative, offering practical advice and discussing conceptual issues, even
if the reader glosses over the mathematics and formulas.
Multivariate Analyses
Tabachnick, B.G., & Fidell, L.S. (1997). Using multivariate statistics (3rd
Ed.). New York: HarperCollins.
ANOVA
Keppel, G. (1991). Design and analysis: A researcher's handbook. Englewood
Cliffs, NJ: Prentice Hall.
Winer, B.J., Brown, D.R., & Michels, K.M. (1991). Statistical principles in
experimental design. New York: McGraw-Hill.
Regression
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis
for the behavioral sciences.
Keppel, G., & Zedeck, S. (1989). Data analysis for research designs: Analysis
of variance and multiple regression/correlation approaches.
Logistic Regression
Hosmer, D.W., & Lemeshow, S. (1989). Applied logistic regression. New York:
Wiley & Sons.
Hierarchical Modeling
Kreft, I., & De Leeuw, J. (1998). Introducing multilevel modeling. Thousand
Oaks, CA: Sage.
Path Analysis, Factor Analysis, and Structural Modeling
Loehlin, J.C. (1998). Latent variable models: An introduction to factor, path,
and structural analyses (3rd Ed.). Hillsdale, NJ: Erlbaum.