One criticism of statistical significance testing concerns the number of negligibly small research results that are labeled important simply because they are statistically significant (Carver, 1993). It is well known that statistical significance is related to sample size: large samples are more likely to yield statistically significant results, even when those results are not substantively meaningful. Effect sizes address this problem by expressing the difference between two means in standard deviation units. Cohen (1988) proposed that effect sizes of .2, .5, and .8 be considered small, medium, and large, respectively. These are only guidelines, however, and may not be appropriate for all situations; interpreting effect sizes remains a somewhat subjective process.
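To illustrate the sample-size point, the following sketch (with made-up numbers, not drawn from the source) computes a two-sample t statistic for the same trivial mean difference at increasing sample sizes:

```python
import math

def t_stat(mean_diff, sd, n):
    # Two-sample t statistic for two groups of size n with a common SD:
    # t = (M1 - M2) / sqrt(s^2/n1 + s^2/n2)
    return mean_diff / math.sqrt(2 * sd ** 2 / n)

# The same trivial difference (0.05 standard deviation units)
# at progressively larger sample sizes:
for n in (100, 10_000, 1_000_000):
    print(n, round(t_stat(0.05, 1.0, n), 2))
```

With n = 100 the statistic is about 0.35, nowhere near conventional significance thresholds; with n = 1,000,000 it exceeds 35 and is "highly significant," even though the size of the difference never changed.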
By itself, a test of statistical significance will not afford much information about the substantiveness of the difference between two means. For this reason, it is often helpful to supplement the information provided by such a test with an estimate of an effect size.
Cohen's d is computed by dividing the difference between the two sample means by the pooled standard deviation: d = (M1 - M2) / s_pooled, where s_pooled = sqrt([(n1 - 1)s1² + (n2 - 1)s2²] / (n1 + n2 - 2)). In the first equation, M1 is the larger of the two sample means, and the subscripts (1 or 2) denote group membership. Sometimes effect sizes are computed by simply dividing the difference in means by the standard deviation of a comparison group (e.g., a group of peer institutions), rather than by the pooled standard deviation. Additional information on effect sizes can be found in Cohen (1988), Glass (1976), Glass, McGaw, and Smith (1981), and Hedges and Olkin (1985).
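As a minimal sketch (the function name and sample data are my own, not from the source), Cohen's d with a pooled standard deviation can be computed as:

```python
import math
from statistics import mean

def cohens_d(group1, group2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = mean(group1), mean(group2)
    # Sample variances (n - 1 in the denominator)
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    # The pooled SD weights each variance by its degrees of freedom
    s_pooled = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / s_pooled

d = cohens_d([2, 4, 6, 8], [1, 3, 5, 7])
print(round(d, 3))  # a 1-point difference against a pooled SD of about 2.58
```

By Cohen's (1988) guidelines, the resulting value of roughly 0.39 would fall between a small and a medium effect.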
Carver, R. P. (1993). The case against statistical significance testing, revisited. Journal of Experimental Education, 61(4), 287-292.
*Adapted, with the author's permission from:
Last revision 10/26/06
© 2001, The Regents of the University of Colorado