Lecture 23
Overview: A Hitchhiker's Guide to Analyses

At the beginning of the course, I noted that there were really two important distinctions to make in deciding which analysis is appropriate for which situation. One merely has to identify the independent and dependent variables and decide whether each is dichotomous or continuous, according to the rough distinctions I made in Lecture 1. I'd like to return to that by presenting a table that summarizes the analyses we have covered this quarter.

This table does not include every possible analysis, but it probably covers about 70-80% of them.

The Analyst's Guide to the Galaxy

                                     Dependent Variable
Independent Variable    Dichotomous                 Continuous
--------------------    --------------------------  --------------------------
Dichotomous             Chi-square                  t-test
                        Logistic Regression         ANOVA
                        Phi                         Regression
                        Cramer's V                  Point-biserial Correlation

Continuous              Logistic Regression         Regression
                        Point-biserial Correlation  Correlation
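The table can also be read as a simple lookup: classify the independent and dependent variable, then read off the candidate analyses. Here is a minimal Python sketch of that idea. The names ANALYSES and suggest_analyses are hypothetical, chosen only for illustration, and the entries simply restate the cells of the table.

```python
# Candidate analyses keyed by (independent variable type, dependent variable type).
# The entries restate the table; the names used here are illustrative only.
ANALYSES = {
    ("dichotomous", "dichotomous"): [
        "Chi-square", "Logistic Regression", "Phi", "Cramer's V",
    ],
    ("dichotomous", "continuous"): [
        "t-test", "ANOVA", "Regression", "Point-biserial Correlation",
    ],
    ("continuous", "dichotomous"): [
        "Logistic Regression", "Point-biserial Correlation",
    ],
    ("continuous", "continuous"): [
        "Regression", "Correlation",
    ],
}


def suggest_analyses(independent, dependent):
    """Return the analyses the table lists for this pair of variable types."""
    key = (independent.strip().lower(), dependent.strip().lower())
    if key not in ANALYSES:
        raise ValueError(
            "Each variable type must be 'dichotomous' or 'continuous'."
        )
    # Return a copy so callers cannot modify the lookup table by accident.
    return list(ANALYSES[key])


if __name__ == "__main__":
    # Example: a two-group comparison on a continuous outcome.
    for name in suggest_analyses("dichotomous", "continuous"):
        print(name)
```

Like the table, the lookup only narrows the choice to a cell. Deciding among the analyses within a cell still depends on the research question and the design.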

Other Analyses
This class has been an introduction to the most prevalent analyses in use in most of the social sciences. Unfortunately, there is no way to cover all possible analyses in a 10-week course. I believe that the topics covered are the most important for understanding the majority of the research analyses out there, but, of course, you are likely to come across other analyses or issues that we did not discuss.

There are a couple of main areas which we did not cover. The first area is more advanced topics. There are a number of analysis techniques in use that are more complex mathematically and statistically, and to learn them, one must first have a course like this. The second area involves analyses that are used relatively infrequently and are designed for fairly specific data situations (e.g., nonparametric analyses). Here are some general categories and brief descriptions of some other analyses.

Multivariate analyses:

Factor analysis - used to investigate the internal reliability of a self-report measure to see which items are most highly correlated with one another. This analysis attempts to identify a "structure" underlying the items of a questionnaire, by finding groups of items that measure the same subdomain of the larger concept being measured.

Principal components analysis - similar to factor analysis in purpose but is a simpler approach mathematically. It preceded (historically) more sophisticated factor analysis approaches.

Discriminant analysis - examines how a set of predictor variables can help classify individuals belonging to two or more groups. For example, how do a variety of health-related practices help identify healthy and unhealthy individuals? This analysis technique answers questions similar to those that can be answered by logistic regression.

Cluster analysis - finds groupings of individuals in a sample who are similar to one another on one or more variables. It differs from factor analysis because it groups individuals, not items on a questionnaire.

Multivariate Analysis of Variance (MANOVA) - an extension of ANOVA, but uses more than one dependent variable.

Analysis of Covariance (ANCOVA) - an ANOVA that controls for other "covariates." Related to regression in that other predictors can be controlled while making group comparisons of means.

Structural Equation Modeling and Path Analysis - these techniques can be seen as an extension of regression analysis. They attempt to analyze more complicated causal models and can incorporate elements of factor analysis.

Nonparametric analyses - nonparametric analyses involve variables that do not fit well into our classification of dichotomous and continuous variables. When cases are ranked in a data set, for instance, nonparametric tests are used. If ranks (i.e., ranking of an individual among others in the sample) are used as scores in the analysis, the analyses we have discussed are not appropriate. Daniel has a nice overview of nonparametrics in the text.
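To make the idea of using ranks as scores concrete, here is a minimal Python sketch. It converts raw scores to ranks (tied scores share the average of their ranks) and then computes Spearman's rank correlation, a common nonparametric statistic that we did not cover in this course, as an ordinary correlation on those ranks. The function names are hypothetical and the data are made up for illustration only.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Ordinary (Pearson) correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Made-up scores; y always increases with x, but not in a straight line.
    x = [1, 2, 3, 4, 5]
    y = [2, 4, 8, 16, 32]
    print(average_ranks([10, 20, 20, 30]))
    print(spearman(x, y))
```

Because only the ordering of the scores enters the calculation, the statistic describes whether two variables rise and fall together without assuming the relationship is a straight line.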

Multilevel models or Hierarchical Linear Models - an extension of regression analysis. Used with "nested" data in which subgroupings (e.g., classes, hospitals, case managers) exist within the data. The data is said to be structured hierarchically in these cases.

Most of the following references are accessible, and given your introduction through this course, you should be able to understand them without too much difficulty. Although a few of these books require a bit more understanding of mathematics than was necessary for this course, not all do. Most are also informative, offering practical advice and discussing conceptual issues, even if the reader glosses over the mathematics and formulas.

Multivariate Analyses

Tabachnick, B.G., & Fidell, L.S. (1997). Using multivariate statistics (3rd Ed.). New York: HarperCollins.

ANOVA

Keppel, G. (1991). Design and analysis: A researcher's handbook. Englewood Cliffs, NJ: Prentice Hall.

Winer, B.J., Brown, D.R., & Michels, K.M. (1991). Statistical principles in experimental design. New York: McGraw-Hill.

Regression

Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences.

Keppel, G., & Zedeck, S. (1989). Data analysis for research designs: Analysis of variance and multiple regression/correlation approaches.

Logistic Regression

Hosmer, D.W., & Lemeshow, S. (1989). Applied logistic regression. New York: Wiley & Sons.

Hierarchical Modeling

Kreft, I., & De Leeuw, J. (1998). Introducing multilevel modeling. Thousand Oaks, CA: Sage.

Path Analysis, Factor Analysis, and Structural Modeling

Loehlin, J.C. (1998). Latent variable models: An introduction to factor, path, and structural analyses (3rd Ed.). Hillsdale, NJ: Erlbaum.