Effects of Instructor Gender and Ethnicity on FCQ Ratings Given by Students
Summary of Major Findings
We statistically removed from each rating the effects of class size, course level (undergraduate vs. graduate), and department. To do this, we entered these variables as predictors in the SAS regression procedure GLM (general linear models) and obtained a predicted rating for each section. Subtracting the predicted rating from the actual rating yielded a residual. After converting each section's mean rating to a residual in this fashion, we calculated a single mean residual for each instructor by averaging the instructor's residuals across all sections taught.
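The residual procedure described above can be sketched in Python. This is only an illustration under stated assumptions, not the SAS GLM code used in the study: it fits an ordinary least-squares model with an intercept, class size, a graduate-level indicator, and department dummies, and all section data, field names, and function names below are invented for the example.

```python
from collections import defaultdict

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols_fit(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def design_row(section, departments):
    """Intercept, class size, grad indicator, and dummies for all but the first department."""
    dummies = [1.0 if section["dept"] == d else 0.0 for d in departments[1:]]
    grad = 1.0 if section["level"] == "grad" else 0.0
    return [1.0, float(section["size"]), grad] + dummies

def instructor_mean_residuals(sections):
    """Residual = actual - predicted rating per section, then averaged per instructor."""
    departments = sorted({s["dept"] for s in sections})
    X = [design_row(s, departments) for s in sections]
    y = [s["rating"] for s in sections]
    beta = ols_fit(X, y)
    residuals = defaultdict(list)
    for s, row, actual in zip(sections, X, y):
        predicted = sum(b * x for b, x in zip(beta, row))
        residuals[s["instructor"]].append(actual - predicted)
    return {name: sum(r) / len(r) for name, r in residuals.items()}

# Invented section-level data (hypothetical instructors, departments, and ratings).
sections = [
    {"instructor": "A", "dept": "HIST", "level": "ugrad", "size": 120, "rating": 3.1},
    {"instructor": "A", "dept": "HIST", "level": "grad", "size": 12, "rating": 3.6},
    {"instructor": "B", "dept": "HIST", "level": "ugrad", "size": 45, "rating": 3.4},
    {"instructor": "B", "dept": "HIST", "level": "ugrad", "size": 80, "rating": 2.9},
    {"instructor": "C", "dept": "MATH", "level": "ugrad", "size": 150, "rating": 2.7},
    {"instructor": "C", "dept": "MATH", "level": "grad", "size": 10, "rating": 3.5},
    {"instructor": "D", "dept": "MATH", "level": "ugrad", "size": 60, "rating": 3.0},
    {"instructor": "D", "dept": "MATH", "level": "grad", "size": 20, "rating": 3.3},
]

mean_residuals = instructor_mean_residuals(sections)
for name, value in sorted(mean_residuals.items()):
    print(name, round(value, 3))
```

A positive mean residual means the instructor was rated higher than predicted from class size, level, and department alone; a negative one means lower. Because the model includes an intercept, the section-level residuals sum to zero by construction.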
The study included ratings on FCQ item 12 (course ratings) as well as item 11 (instructor ratings). However, because course and instructor ratings were so highly correlated (r = .91), the remainder of this report discusses only instructor ratings. Every finding reported below for instructor ratings applies to course ratings as well.
The table below summarizes the results from the study:
Among TTT instructors, ratings for other ethnic groups differed little from those for whites. The largest difference was the .10 lower rating for Asians, about .26 standard deviation units (.10/.38 = .26), which is fairly small. Among non-TTT instructors, however, the difference between Asians and whites was considerably larger: the mean rating for Asians was .29 below that for whites, a difference of over half a standard deviation. Non-TTT Asian instructors were scattered across 24 departments, only two of which had more than three. Furthermore, these two departments (East Asian Languages and Literature, and Economics, with 6 non-TTT Asian instructors each) did not contribute much to the extremely low overall mean, since their non-TTT Asian instructors' mean ratings were .02 and -.12, respectively.
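The conversion of a rating difference into standard deviation units can be reproduced in a few lines of Python. The function name is an assumption introduced for illustration; the two input values are the ones quoted in the paragraph above.

```python
def standardized_difference(mean_difference, standard_deviation):
    """Express a difference between group means in standard deviation units."""
    return mean_difference / standard_deviation

# TTT Asian vs. white instructors: .10 rating difference, standard deviation of .38.
ttt_effect = standardized_difference(0.10, 0.38)
print(round(ttt_effect, 2))  # 0.26
```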
A separate analysis excluded instructors who taught only 1 or 2 sections across the 6 terms. Restricting the analysis to instructors who taught at least three sections sharply attenuated the large negative difference between Asian non-TTT instructors and other groups, and also produced a large drop in the standard deviation for that group. This indicates that most of the negative effect seen in the above table was due to a few instructors who taught only one or two sections each and received extremely low ratings. Perhaps the low ratings they received are the reason they taught only one or two sections: their departments may have judged them ineffective and given them no further teaching assignments. This is just speculation, however.
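The effect of this restriction can be illustrated with a small Python sketch. The data are invented and the function name and `min_sections` parameter are assumptions; the point is only to show how a few instructors with one or two very low-rated sections can pull a group mean down and inflate its standard deviation.

```python
from collections import Counter
from statistics import mean, stdev

def summarize(section_residuals, min_sections=1):
    """Summarize instructor mean residuals, keeping instructors with enough sections.

    section_residuals is a list of (instructor, residual) pairs, one per section.
    """
    counts = Counter(name for name, _ in section_residuals)
    per_instructor = {}
    for name, residual in section_residuals:
        if counts[name] >= min_sections:
            per_instructor.setdefault(name, []).append(residual)
    means = [mean(values) for values in per_instructor.values()]
    return {"n": len(means), "mean": mean(means), "sd": stdev(means)}

# Invented residuals: instructors S and T taught few sections and were rated very low.
data = [
    ("P", 0.1), ("P", -0.1), ("P", 0.0),
    ("Q", 0.2), ("Q", 0.0), ("Q", 0.1),
    ("R", -0.1), ("R", -0.2), ("R", 0.0),
    ("S", -1.5),
    ("T", -1.2), ("T", -1.0),
]

everyone = summarize(data)                     # all instructors
restricted = summarize(data, min_sections=3)   # at least three sections
print(everyone)
print(restricted)
```

In this made-up example, dropping the two instructors with fewer than three sections moves the group mean toward zero and shrinks the standard deviation, which is the same qualitative pattern the restricted analysis showed.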
Other Studies in the Literature
We have not done a systematic search of the higher education literature for other studies in this area. However, a recent study by Centra and Gaubatz (2000) that specifically looked at gender effects (student, instructor, and the interaction between them) reported that results of past studies were inconclusive, with some studies finding no or exceedingly small effects, and a few finding that male students may rate female instructors lower than male instructors.
Centra and Gaubatz's own study of gender bias used data from 741 classes from a variety of institutions, all using a common evaluation form developed by the Educational Testing Service. In their analyses of students in the same class rating either a female or male instructor, they found that female instructors received higher ratings from female than from male students on 6 of 8 scales, including a global rating. The differences were statistically significant but very small (about a quarter of a standard deviation), and thus of little practical utility. Male instructors received the same ratings from male and female students.
In comparisons across classes, female students rated female instructors higher on some scales, and male students rated male instructors higher on some others, but global ratings did not differ by instructor or student gender. The differences were again very small, on the order of a quarter of a standard deviation, and thus of no practical effect.
Centra, J.A., & Gaubatz, N.B. (2000). Is there gender bias in student evaluations of teaching? Journal of Higher Education, 71 (1), 17-33.
l:\ir\fcq\studies\fcqethsum.doc -- 10/25/2002
Last revision 05/18/16