help cochranq
-------------------------------------------------------------------------------

Title

    cochranq -- Cochran's Q test for stochastic dominance in blocked binary data

Syntax

        cochranq scorevar blockvar groupvar [if] [in] [fweight] [, ma(method) es(method) noqtest nolabel wrap level(#) copyleft]

    cochranq options    Description
    ---------------------------------------------------------------------------
    Main
      ma(method)        method used to adjust for multiple comparisons
      es(method)        effect size measure to report
      noqtest           suppress Cochran's Q test output
      nolabel           display groupvar values, rather than groupvar value labels
      wrap              do not break wide tables
      level(#)          set confidence level; default is level(95)
      copyleft          display the GPL license for cochranq
    ---------------------------------------------------------------------------
    fweights are allowed; see weight.

Description

    cochranq reports the results of Cochran's Q test (Cochran, 1950) for stochastic dominance among b blocks of binary outcomes across k groups. Cochran's omnibus Q test is analogous to a repeated measures ANOVA. The null hypothesis is that there is no difference in the probability of success between the k groups; Cochran's Q can be considered a generalization of McNemar's test to an arbitrary number of groups.

    In the syntax diagram above, scorevar refers to the variable recording the outcome, blockvar refers to the variable denoting the units being observed (e.g. test subjects), and groupvar refers to the different treatments, exposures, tasks, etc.

    cochranq also calculates the non-asymptotic p-value for the Q statistic, which generally provides greater statistical power (Mielke and Berry, 1995).

    fweights specify the number of times a given pattern of successes and failures across the groups is observed.
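    The omnibus statistic described above can be sketched outside Stata. The Python below is an illustration only, not cochranq's own code: the names cochran_q and chi2_sf and the example data are hypothetical, and it assumes the usual textbook form of Q, with column totals C_j, row totals R_i, and grand total N, referred to a chi-squared distribution with k-1 degrees of freedom. It covers only the asymptotic p-value, not the non-asymptotic one.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variate with integer df >= 1."""
    h = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum exp(-h) * sum_{i < df/2} h^i / i!
        return math.exp(-h) * sum(h ** i / math.factorial(i) for i in range(df // 2))
    # Odd df: start from df = 1 (erfc) and add one term per extra 2 df.
    tail = math.erfc(math.sqrt(h))
    for j in range(1, (df - 1) // 2 + 1):
        tail += math.exp(-h) * h ** (j - 0.5) / math.gamma(j + 0.5)
    return tail


def cochran_q(rows):
    """Return (Q, df, asymptotic p) for b blocks (rows) of k binary scores.

    Q = (k-1) * (k * sum(C_j^2) - N^2) / (k * N - sum(R_i^2)).
    Returns (None, k-1, None) when every block is all 0s or all 1s,
    because the denominator is then zero.
    """
    k = len(rows[0])
    col_totals = [sum(r[j] for r in rows) for j in range(k)]
    row_totals = [sum(r) for r in rows]
    n = sum(row_totals)
    denom = k * n - sum(t * t for t in row_totals)
    if denom == 0:
        return None, k - 1, None
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    return q, k - 1, chi2_sf(q, k - 1)


# Hypothetical data: 6 blocks (rows) scored under 3 groups (columns).
blocks = [[1, 1, 0], [1, 1, 0], [1, 0, 0], [1, 1, 1], [1, 0, 0], [0, 1, 0]]
q, df, p = cochran_q(blocks)
print(q, df, p)

# A pairwise test keeps only two groups; with k = 2 the statistic reduces
# to McNemar's (n10 - n01)^2 / (n10 + n01) without continuity correction.
pair_13 = [[r[0], r[2]] for r in blocks]
print(cochran_q(pair_13))
```

    Frequency weights are not handled in this sketch; repeating a row as many times as its weight gives the same totals.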
    The non-asymptotic statistic follows a variation on the Pearson Type III distribution. To calculate the p-value, the PDF of this distribution is numerically integrated from -2/gamma to Z in 1,000 steps. Mielke and Berry (1995) write that "more information is available to the nonasymptotic approach. Consequently, when the effective n is small, one cannot expect a result based on an infinite n to be appropriate. Because the Pearson type III distribution encompasses the chi-squared distribution as a special case, the nonasymptotic approach completely replaces the asymptotic approach."

    cochranq presents a table of all m = k(k-1)/2 post hoc pairwise tests, each using Cochran's Q with the two groups in the pair (for the asymptotic p-values this is equivalent to McNemar's test without continuity correction). Multiple comparisons adjustments for the post hoc tests may be requested with ma(). P-values (adjusted or unadjusted) are presented for both the asymptotic (top) and non-asymptotic (bottom) distributions; the p-values for the non-asymptotic tests are indicated with the label na. See Remarks for consideration of situations where two or more pairwise comparisons have the same test statistic. When no discordant pairs are present in a post hoc test, missing test statistics and p-values are reported.

Options

    ma(method) specifies the method of adjustment used for multiple comparisons in post hoc pairwise tests, and must take one of the following values: none, bonferroni, sidak, holm, hs, hochberg, bh, or by. none is the default method assumed if the ma option is omitted. These methods perform as follows:

        none specifies that no multiple comparisons adjustments be made.

        bonferroni specifies the family-wise error rate (FWER) "Bonferroni adjustment", calculated by multiplying the p-value of each post hoc test by m (the total number of post hoc tests), as per Dunn (1961).
        cochranq reports a maximum Bonferroni-adjusted p-value of 1.

        sidak specifies the "Sidák adjustment", where the FWER is controlled by replacing the p-value of each post hoc test with 1 - (1 - p)^m, as per Sidák (1967). cochranq reports a maximum Sidák-adjusted p-value of 1.

        holm specifies the "Holm adjustment", where the FWER is controlled by sequentially adjusting the p-values of each post hoc test, ordered from smallest to largest, with p*(m+1-i), where i is the ordered position, as per Holm (1979). cochranq reports a maximum Holm-adjusted p-value of 1. In sequential tests the decision to reject or not reject the null hypothesis depends both on the p-values and their ordering, so those comparisons rejected with this method at the alpha level specified by level() (two-sided test) are underlined in the output.

        hs specifies the "Holm-Sidák adjustment", where the FWER is controlled by sequentially adjusting the p-values of each post hoc test, ordered from smallest to largest, with 1 - (1 - p)^(m+1-i), where i is the ordered position (see Holm, 1979). cochranq reports a maximum Holm-Sidák-adjusted p-value of 1. In sequential tests the decision to reject or not reject the null hypothesis depends both on the p-values and their ordering, so those comparisons rejected with this method at the alpha level specified by level() (two-sided test) are underlined in the output.

        hochberg specifies the "Hochberg adjustment", where the FWER is controlled by sequentially adjusting the p-values of each post hoc test, ordered from largest to smallest, with p*i, where i is the ordered position, as per Hochberg (1988). cochranq reports a maximum Hochberg-adjusted p-value of 1. In sequential tests the decision to reject or not reject the null hypothesis depends both on the p-values and their ordering, so those comparisons rejected with this method at the alpha level specified by level() (two-sided test) are underlined in the output.
        bh specifies the "Benjamini-Hochberg adjustment", where the false discovery rate (FDR) is controlled by sequentially adjusting the p-values of each post hoc test, ordered from largest to smallest, with p*[m/(m+1-i)], where i is the ordered position (see Benjamini & Hochberg, 1995). cochranq reports a maximum Benjamini-Hochberg-adjusted p-value of 1. FDR-adjusted p-values are at times referred to as q-values. In sequential tests the decision to reject or not reject the null hypothesis depends both on the p-values and their ordering, so those comparisons rejected with this method at the alpha level specified by level() (two-sided test) are underlined in the output.

        by specifies the "Benjamini-Yekutieli adjustment", where the false discovery rate (FDR) is controlled by sequentially adjusting the p-values of each post hoc test, ordered from largest to smallest, with p*[m/(m+1-i)]*C, where i is the ordered position and C = 1 + 1/2 + ... + 1/m (see Benjamini & Yekutieli, 2001). cochranq reports a maximum Benjamini-Yekutieli-adjusted p-value of 1. FDR-adjusted p-values are at times referred to as q-values. In sequential tests the decision to reject or not reject the null hypothesis depends both on the p-values and their ordering, so those comparisons rejected with this method at the alpha level specified by level() (two-sided test) are underlined in the output.

    es(method) specifies the method of calculation of the effect size to be reported, and must take one of the following values: none, scm, or bjm. none is the default method assumed if the es option is omitted. These methods perform as follows:

        none specifies that no effect size measure be reported.

        scm specifies that the Serlin, Carr and Marascuillo maximum-corrected effect size, Q/[b(k-1)], be reported, as per Serlin, Carr and Marascuillo (1982).

        bjm specifies that the Berry, Johnston and Mielke chance-corrected effect size, R = 1 - delta/mu_delta, be reported, as per Berry, Johnston and Mielke (2007).
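    The adjustment formulas listed under ma() can be made concrete with a short sketch. The Python below is an illustration only, not cochranq's own code: the function name adjust_pvalues and the example p-values are hypothetical. It applies each formula exactly as written in the option descriptions, caps results at 1, and breaks ties by input order (an arbitrary choice). It assumes no further monotonicity step is applied, and it does not compute the sequential reject/not-reject decisions that cochranq underlines.

```python
def adjust_pvalues(pvalues, method="none"):
    """Adjust m p-values using the formulas in the ma() option text.

    Returns the adjusted p-values in the input order, each capped at 1.
    """
    m = len(pvalues)
    if method == "none":
        return list(pvalues)
    if method == "bonferroni":
        return [min(1.0, p * m) for p in pvalues]
    if method == "sidak":
        return [min(1.0, 1 - (1 - p) ** m) for p in pvalues]
    if method not in ("holm", "hs", "hochberg", "bh", "by"):
        raise ValueError("unknown method: " + method)

    # Sequential methods: holm and hs order from smallest to largest,
    # hochberg, bh and by from largest to smallest. i is the 1-based position.
    descending = method in ("hochberg", "bh", "by")
    order = sorted(range(m), key=lambda j: pvalues[j], reverse=descending)
    c = sum(1.0 / j for j in range(1, m + 1))  # C = 1 + 1/2 + ... + 1/m

    adjusted = [0.0] * m
    for i, j in enumerate(order, start=1):
        p = pvalues[j]
        if method == "holm":
            value = p * (m + 1 - i)
        elif method == "hs":
            value = 1 - (1 - p) ** (m + 1 - i)
        elif method == "hochberg":
            value = p * i
        elif method == "bh":
            value = p * m / (m + 1 - i)
        else:  # by
            value = p * m / (m + 1 - i) * c
        adjusted[j] = min(1.0, value)
    return adjusted


# Hypothetical unadjusted p-values from m = 3 pairwise tests.
raw = [0.01, 0.04, 0.03]
for name in ("bonferroni", "sidak", "holm", "hs", "hochberg", "bh", "by"):
    print(name, adjust_pvalues(raw, name))
```

    With these values the Hochberg-adjusted p-value for 0.03 works out to 0.06, yet a step-up procedure at alpha = 0.05 would still reject that comparison because the largest p-value (0.04) already passes. This is the kind of case where the decision depends on the ordering as well as on the adjusted value.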
        CAVEAT: The example calculation in Berry, Johnston and Mielke's paper includes the figure mu_delta = 0.4521, but Equation [7] contains a typographical error: the first term should be 2/[b(b-1)] rather than 2/[k(k-1)] (personal correspondence with Berry).

    noqtest suppresses the display of the Cochran's Q test table.

    nolabel causes the actual data codes to be displayed rather than the value labels in the Cochran's Q test table.

    wrap requests that cochranq not break up wide tables to make them readable.

    level(#) specifies the confidence level as a percentage, that is, the complement of alpha*100. The default, level(95) (or as set by set level), corresponds to alpha = 0.05.

    copyleft displays the copying permission statement for cochranq. cochranq is free software, licensed under the GPL. The full license can be obtained by typing:

        . net describe cochranq, from(http://www.doyenne.com/stata)

    and clicking on the "click here to get" link for the ancillary file.

Remarks

    The issue of tied multiple comparisons may arise when conducting post hoc tests following Cochran's Q test. This is because the score variable is nominal, and more than one pairwise test may share a specific value of Q due to having the same number of discordant pairs of observations. This is less likely to arise when n, k, or both are large. Tied test statistic values are an issue because several of the available multiple comparison procedures are stepwise procedures, which give different adjustments based on the position of each p-value in the ordering.

    It is unclear what the appropriate course of action is when attempting to use either the Holm or Holm-Sidák FWER adjustments in the presence of ties. cochranq orders tied p-values arbitrarily and reports the adjusted p-values accordingly, but users should interpret these numbers with caution.

    This issue does not arise when adjusting using the FDR. From Korn, et al.
    (2004):

        If the variables or p-values are discrete, there can be ties in the p-values given in (1), but this does not present a problem. Regardless of the ordering of the tied variables in (1), if the hypothesis associated with the first variable in the order is rejected, then the hypotheses associated with the other tied variables will also be rejected because the minimization (2) will be over smaller sets for the other variables. In addition, which of the tied variables is considered first for rejection will not matter, as the permutation distribution will include all of them when considering the first rejection. Also, if the first of the tied variables fails to reject, the procedure ceases and no further hypotheses are rejected, so that the situation in which the first tied variable fails to reject, and the later tied variables do reject, need not be considered.

Example

    Setup
        . use diphtheria

    Test for stochastic dominance of culture growth by growth media
        . cochranq growth cases media [fw=ncases]

    Setup
        . use motorskills

    Test for stochastic dominance of task completion by motor skill type
        . cochranq score subject task

    Setup
        . use psychgrads

    Test for stochastic dominance of diagnosis by script, with effect size
        . cochranq diagnosis student script, es(bjm)

Author

    Alexis Dinno
    Portland State University
    alexis.dinno@pdx.edu

    Please contact me with any questions, bug reports, or suggestions for improvement. Bug reports are easier to address when they include (1) a copy of the data (anonymized is fine), (2) a copy of the exact command you used, and (3) a copy of the output.

Suggested citation

    Dinno A. 2014. cochranq: Cochran's Q test for stochastic dominance in blocked binary data. Stata software package.
    URL: http://www.doyenne.com/stata/cochranq.html

Saved results

    cochranq saves the following in r():

    Scalars
      r(Q)            Cochran's Q statistic
      r(b)            the number of blocks (subjects) in the test
      r(k)            the number of treatments (groups) in the test
      r(df)           degrees of freedom for the test
      r(p_asymp)      p-value for the asymptotic test
      r(p_nonasymp)   p-value for the non-asymptotic test
      r(Z)            the standardized z test statistic
      r(gamma)        the gamma parameter for the Pearson Type III distribution approximating the exact permutation distribution of Z
      r(X2)           an m-length vector of pairwise Q statistics (chi-squared statistics)
      r(P_asymp)      an m-length vector of asymptotic p-values for pairwise tests
      r(P_nonasymp)   an m-length vector of non-asymptotic p-values for pairwise tests

References

    Benjamini, Y. and Hochberg, Y. 1995. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological). 57: 289–300.

    Benjamini, Y. and Yekutieli, D. 2001. The control of the false discovery rate in multiple testing under dependency. Annals of Statistics. 29: 1165–1188.

    Berry, K. J., Johnston, J. E., and Mielke, Jr., P. W. 2007. An alternative measure of effect size for Cochran's Q test for related proportions. Perceptual and Motor Skills. 104: 1236–1242.

    Cochran, W. G. 1950. The comparison of percentages. Biometrika. 37: 256–266.

    Dunn, O. J. 1961. Multiple comparisons among means. Journal of the American Statistical Association. 56: 52–64.

    Hochberg, Y. 1988. A sharper Bonferroni procedure for multiple tests of significance. Biometrika. 75: 800–802.

    Korn, E. L., Troendle, J. F., McShane, L. M., and Simon, R. 2004. Controlling the number of false discoveries: application to high-dimensional genomic data. Journal of Statistical Planning and Inference. 124: 379–398.

    Mielke, P. W. and Berry, K. J. 1995. Nonasymptotic inferences based on Cochran's Q test. Perceptual and Motor Skills. 81: 319–322.

    Sidák, Z. 1967.
    Rectangular confidence regions for the means of multivariate normal distributions. Journal of the American Statistical Association. 62: 626–633.

    Serlin, R. C., Carr, J., and Marascuillo, L. A. 1982. A measure of association for selected nonparametric procedures. Psychological Bulletin. 92: 786–790.

Also See

    Help: anova, kwallis