Department of History

Last updated 7/31/1998

Knowledge and skill goals for this undergraduate degree program are recorded in the most recent CU-Boulder catalog. In some summaries of assessment activity, goals are referred to by number (e.g., K-2 is knowledge goal 2).

Outcomes assessment report, 2000-2002


Outcomes Assessment Procedures

Each year, fifteen percent of the papers submitted by senior history majors in 3000-level seminar courses are selected for grading by the Department's Undergraduate Studies Committee. All evidence of the author's name and of the class for which each paper was written is removed, and each paper is assigned a code number; the committee readers, operating without further knowledge, then rate each paper as 2 (Very Good), 1 (Satisfactory), or 0 (Unsatisfactory). Each year since the department began using this scale (1989), the average skills ratings for one-quarter to one-half of the papers reviewed have been in the highest category and all or almost all have been rated at least Satisfactory; typically half of all papers receive average knowledge ratings in the highest category. Scores have remained remarkably consistent from year to year.

In 1989-90, Professor John Muldowny of the Department of History at University of Tennessee-Knoxville reviewed and evaluated the general outcomes assessment procedure. He declared the CU-Boulder system superior to the method used by his own department at that time, which looked only at students' knowledge of historical facts with a multiple-choice computer-graded test.

The Undergraduate Studies Committee has worked to refine the rating procedure over the years, increasing rater agreement by defining the scoring categories more and more rigorously and by emphasizing measurement in terms of the department's skills and knowledge statements. From 1987 through 1994, outcomes assessment in History focused on the department's skills goals. Assessment of the knowledge goals was added in 1994-95.

In 1992-93 and 1993-94 the Undergraduate Studies Committee analyzed the assessment results to see if there was any relationship between the students' scores and the faculty rank of the course instructors. The analyses indicated no relationship; that is, students in courses taught by junior faculty performed as well as those in courses taught by senior faculty. With no relationship found in either year, the analysis was discontinued in 1994-95, although to ensure representativeness the committee continues to sample papers from seminars taught by instructors of all ranks.

Through 1991-92, each paper was independently read by all six members of the Undergraduate Studies Committee. Between 1992-93 and 1996-97, the pool of papers to be assessed was divided into two sets: papers from assignments stressing research skills and papers from assignments stressing other analytic skills. Each set was evaluated on the basis of the department's goals by a three-member subcommittee. Each member independently rated all the papers in the subcommittee's group.

In 1996-97, in recognition of the department's requirement that all 3000-level papers incorporate a research component, the committee revised its evaluation procedure yet again. Papers were divided into subject-matter groupings -- U.S. history, European history, and World Areas -- and read by subcommittees made up of two professors from each field. This allowed for closer monitoring of content (knowledge goal 4) and enhanced the evaluators' ability to assess the level of skills in research as well as in argument and writing (skills goals 1-4). Evaluators were also encouraged to rate papers fractionally, using a 0.0 to 2.0 scale to assign separate scores for Content (knowledge) and Skills, as well as an Overall rating. In the event that the two evaluators differed in any category by 0.5 or more, the paper was given to a third reader, who made an independent assessment.
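The third-reader rule described above can be sketched in code. This is only a minimal illustration, assuming each evaluator's ratings are held in a dictionary keyed by category; the function name `needs_third_reader`, the dictionary layout, and the sample ratings are hypothetical and are not part of the committee's actual procedure.

```python
from decimal import Decimal

# Threshold from the 1996-97 procedure: a gap of 0.5 or more in any category.
THRESHOLD = Decimal("0.5")


def needs_third_reader(first, second):
    """Return True if two evaluators differ by 0.5 or more in any category.

    `first` and `second` map category names to ratings on the 0.0-2.0 scale.
    (Hypothetical data layout, chosen for illustration only.)
    """
    for category, rating in first.items():
        # Convert through str so fractional ratings are compared as exact decimals.
        gap = abs(Decimal(str(rating)) - Decimal(str(second[category])))
        if gap >= THRESHOLD:
            return True
    return False


# Made-up ratings for one paper from two readers.
reader_a = {"Content": 1.5, "Skills": 1.0, "Overall": 1.25}
reader_b = {"Content": 1.5, "Skills": 1.5, "Overall": 1.25}
print(needs_third_reader(reader_a, reader_b))  # True: Skills differ by exactly 0.5
```

Exact decimal arithmetic is used here so that a gap of exactly 0.5 is never missed because of binary floating-point rounding.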

The evaluation for 1997-98 proceeded according to the same basic rules as that of the previous year, but was restructured to address two concerns that had arisen in the previous year's evaluation exercise. First, the committee wished to determine whether the 1996-97 observation of lower performance in seminars taught by non-tenure-line faculty represented a statistical artifact or the beginning of a trend, and therefore doubled the sample size of papers from these seminars. Second, the longstanding concern that the skills of our students in historical analysis and exposition do not correspond to their comparatively stronger knowledge of historical facts prompted the committee to direct the 1997-98 evaluation specifically at the students' skill level, as measured by several components: quality of argumentation, use of evidence, ability to employ historiography in conceptualization, facility in writing expository prose, and adherence to the standard forms of scholarly writing. All of these, plus the less tangible qualities of creativity and originality, formed the basis of the 1997-98 Overall rating. (See appended rating form.)

Assessments through 1993-94 did not include honors theses. In 1994-95, an appendix on the department's honors program was added to the outcomes assessment report. This compares History to its counterpart programs in Arts and Sciences according to the level of Latin honors granted over the course of the academic year. Department requirements for honors graduation are rigorous, and virtually guarantee that senior theses represent work of very high quality; thus a cross-departmental comparison is warranted, since including honors essays in the outcomes assessment sample would serve no purpose but to inflate the Department's overall scores. (For the record, the requirements for honors graduation are as follows: students must have an overall grade point average of at least 3.3 and a History average of at least 3.5, must be accepted into and receive at least a B in the senior-level colloquium [History 3110] that prepares them to write their theses, must research and write a thesis under the supervision of faculty advisors, and must present their thesis to an evaluating committee of faculty and then defend it in a formal oral examination. The 3-4 member evaluating committees evaluate each thesis as if it were a professional scholarly paper. On the basis of thesis, defense, and overall academic record, the committee recommends a suitable level of honors to the Honors Council of the College of Arts and Sciences, which independently evaluates the student's record and makes the final determination.)

Latest Outcomes Results

The mean scores for the sample of nineteen papers assessed in 1997-98 were:

Argument 1.3
Evidence 1.1
Historiography 1.2
Expression 1.2
Form 1.1
Overall 1.2

(Keep in mind that in this and other tables, the Overall score is not derived as an average or other formal function of the subscores, but is assigned independently in the same fashion as each subscore.) As in previous years, the evaluators noted variations between papers in the three fields (U.S. history, European history, and World Areas). This year, however, the variations were substantial and inverse to previously observed data. Overall scores in 1996-97 were highest in American history essays, and lowest in European and World Areas papers, a feature that had been thought attributable to the greater preexisting knowledge of American culture among students; but these variations were not large. The 1996-97 Overall scores in the American field departed from the department average by +.09, while the Overall scores in European and World Areas departed by -.07 and -.01, respectively. In Skills (the area that this year's evaluation exercise has disaggregated into five components), the departure from the department mean was +.12 in American history, -.10 in European, and -.02 in World Areas. These variations were small enough to be without statistical significance.

This year's papers showed quite a different distribution in Overall mean scores, with papers in American history ranking lowest, at 0.88 (a departure from the department mean of -.32). European papers rated Overall at 1.35 (+.15 from the department average), while World Areas papers' Overall score was 1.48 (+.28). These differences held in each of the subordinate areas evaluated.

Area             Overall Department (N=19)   American (n=8)   European (n=7)   World (n=4)

Argument         1.3                         0.91             1.43             1.60
Evidence         1.1                         0.68             1.22             1.65
Historiography   1.2                         0.83             1.26             1.75
Expression       1.2                         0.93             1.40             1.28
Form             1.1                         0.91             0.99             1.28
Overall          1.2                         0.88             1.35             1.48

Departure from the department mean, by field:

Area             American   European   World

Argument         -.39       +.13       +.30
Evidence         -.42       +.12       +.55
Historiography   -.37       +.06       +.55
Expression       -.27       +.20       +.08
Form             -.19       -.11       +.18
Overall          -.32       +.15       +.28
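The departures in the table above are each field's mean minus the department mean for the same area. The short sketch below recomputes them from the reported 1997-98 means; the names `MEANS` and `deviations` are illustrative choices, not part of the report.

```python
# Mean scores copied from the 1997-98 table:
# area -> (department, American, European, World)
MEANS = {
    "Argument":       (1.3, 0.91, 1.43, 1.60),
    "Evidence":       (1.1, 0.68, 1.22, 1.65),
    "Historiography": (1.2, 0.83, 1.26, 1.75),
    "Expression":     (1.2, 0.93, 1.40, 1.28),
    "Form":           (1.1, 0.91, 0.99, 1.28),
    "Overall":        (1.2, 0.88, 1.35, 1.48),
}


def deviations(dept_mean, field_means):
    """Return each field mean's departure from the department mean (2 decimals)."""
    return tuple(round(m - dept_mean, 2) for m in field_means)


for area, (dept, *fields) in MEANS.items():
    american, european, world = deviations(dept, fields)
    print(f"{area:<15}{american:+.2f}  {european:+.2f}  {world:+.2f}")
```

Because the department means are reported to one decimal place only, these departures are relative to the rounded figures rather than to the unrounded averages.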

Five papers of the 19 evaluated, or 26.3% of the total sample, rated an unsatisfactory Overall score (less than 1.0); this compares to 2 of 15 (13.3%) in the previous year's sample. Four of these five papers were submitted in U.S. History seminars, one in World Areas. Five papers (26.3%) received Overall ratings of 1.5 or better; two (10.5%), both in World Areas seminars, rated overall at 2.0. These proportions were comparable to those observed in the previous year.
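The percentages quoted above follow directly from the paper counts. As a quick check (the helper name `share` is a hypothetical choice for illustration):

```python
def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


print(share(5, 19))  # unsatisfactory Overall scores, 1997-98 sample
print(share(2, 15))  # unsatisfactory Overall scores, previous year's sample
print(share(2, 19))  # papers rated 2.0 Overall, 1997-98 sample
```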

1997-98, whatever its disconcerting qualities in the overall performance among History undergraduates, was a very good year in History Honors. Eight History majors graduated with honors (about 9% of the year's History graduates). This was the seventh-largest number of honors graduates in the College of Arts and Sciences, in which thirty-three departments have active honors programs. (Each year the University of Colorado at Boulder graduates only about 2% of its seniors with Latin honors.) This year's distribution of honors by degree in History was one cum laude (12.5%), three magna cum laude (37.5%), and four summa cum laude (50.0%). This was the highest proportion of summa graduates in the major since records have been kept, and the highest proportion in departments of comparable size in the College of Arts and Sciences for 1997-98. A History senior, Stacey Smith, was chosen as Outstanding Graduate in Arts and Sciences. Her cumulative and major grade point averages were 4.0 over eight consecutive semesters in residence at CU-Boulder, and her thesis, in American history, received a summa designation.


The Department continues to use the annual outcomes assessment exercise to monitor its performance and to gather information that can become the basis of programmatic and curricular improvements. In previous years, as noted above, the department looked for differences in the quality of student papers according to the rank of the faculty member, and discovered that there was no significant variation -- i.e., that courses taught by junior members produced papers virtually identical in quality to courses taught by senior faculty. Last year the department began monitoring the performance of students in classes taught by non-tenure-track faculty -- the "honorarium instructors" who carry about thirty percent of the instructional responsibilities in the department, principally in lower-division courses. This year seven papers were sampled from non-tenure-track-instructed seminars in European and American history, for comparison to papers written in those of tenure-track and tenured instructors. (This year there were no honorarium instructors offering seminars in World Areas.) The results are as follows. In the aggregate:

Area             Line Faculty (n=12)   Honorarium Faculty (n=7)

Argument         1.3                   1.1
Evidence         1.1                   0.9
Historiography   1.2                   1.1
Expression       1.2                   1.2
Form             1.1                   0.8
Overall          1.2                   1.2

Disaggregated by field of instruction:

                 American History                                 European History
Area             Line Faculty (N=4)   Honorarium Faculty (N=4)   Line Faculty (N=4)   Honorarium Faculty (N=3)

Argument         0.88                 0.94                       1.48                 1.32
Evidence         0.63                 0.59                       1.14                 1.32
Historiography   0.81                 0.88                       1.24                 1.28
Expression       0.88                 0.88                       1.28                 1.55
Form             1.00                 0.65                       0.96                 1.02
Overall          0.75                 1.00                       1.34                 1.58
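The aggregate and disaggregated honorarium figures can be cross-checked against each other. The sketch below assumes, although the report does not say so, that each aggregate score is the sample-size-weighted mean of the American (N=4) and European (N=3) honorarium means; the names `HONORARIUM` and `pooled_mean` are illustrative only.

```python
# Honorarium-faculty means from the two tables above:
# area -> (American mean, European mean, reported aggregate mean)
HONORARIUM = {
    "Argument":       (0.94, 1.32, 1.1),
    "Evidence":       (0.59, 1.32, 0.9),
    "Historiography": (0.88, 1.28, 1.1),
    "Expression":     (0.88, 1.55, 1.2),
    "Form":           (0.65, 1.02, 0.8),
    "Overall":        (1.00, 1.58, 1.2),
}
N_AMERICAN, N_EUROPEAN = 4, 3  # honorarium sample sizes by field


def pooled_mean(mean_a, n_a, mean_b, n_b):
    """Combine two group means into one mean, weighting by sample size."""
    return (mean_a * n_a + mean_b * n_b) / (n_a + n_b)


for area, (american, european, reported) in HONORARIUM.items():
    pooled = pooled_mean(american, N_AMERICAN, european, N_EUROPEAN)
    # Compare at one decimal place, the precision of the aggregate table.
    matches = round(pooled, 1) == reported
    print(f"{area:<15}pooled={pooled:.2f}  reported={reported}  match={matches}")
```

Under that assumption, every pooled figure rounds to the reported aggregate, so the two honorarium columns are mutually consistent.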

The good news, obviously, is the lack of difference in quality, overall, between the outcomes scores in courses taught by tenure-line faculty and honorarium faculty members. This year's larger sample size makes it seem probable that the disparity shown in last year's scores (an average Skills score of 1.45 in tenure-track-instructed seminars, versus 1.03 in honorarium-instructed seminars, and Overall scores of 1.48 versus 1.08, respectively) was in fact a fluke of that year's smaller sample size. In order to understand this fully, however, at least two more years of comparisons should be conducted.

The bad news is, of course, the difference in performance between seminars in American history and the other fields. While it may be observed that both of the evaluators in the American field were rigorous critics, and that two of the papers in the sample of American history essays were truly wretched work which scored zeroes in several categories and thus dragged down the field averages as a whole, there is obvious cause for concern in so great a disparity. While the lack of excellence in the American field did not actually depress the department's overall scores below the Satisfactory range, future evaluation exercises must carefully track variations between fields, and seek explanations in instructional quality, should substantial disparities in quality persist.

As for the meaning of the disaggregated scores measuring different components of argumentative and expressive skills, it would seem that our students know how to propound effective arguments, but that they are less skilled in using evidence in support of their positions than they should be; that they tend to be careless of the form in which they present their work, and that their acquaintance with historiography as a critical and conceptual tool is less than fully developed.

It was with precisely these concerns in mind that the Undergraduate Studies Committee last year developed a proposal for a separate writing requirement for all History majors, to be administered as a two-credit co-seminar or three-credit freestanding course, at the lower-division level. Our view was that if students were introduced to the rhetoric and method of history at a fundamental level, as freshmen or sophomores, they would be able to apply what they had learned to writing in their upper-division coursework and would be fully prepared for the challenges of the extended writing assignments in the 3000-level seminars (the papers of which form the basis of our outcomes evaluations). The lack of University funding for this initiative in the reallocation process, our severely straitened budget, and the absence of any prospect of growth in instructional funding for the foreseeable future, seem likely to prevent us from using this approach to improve undergraduate performance. Other, less costly, strategies may arise in time; until then, we will continue to track our outcomes trends with hopeful concern.

APPENDIX: Outcomes Rating Form for 1998


The History Department rates papers for purposes of its annual outcomes assessment by assigning point scores according to the following scale:

0 Unsatisfactory
1 Satisfactory/competent
2 Proficient/excellent.

You may use fractional points if you wish; please note, however, that 2.0 is the maximum number of points that may be assigned.

Please base your ratings on your opinion of the general characteristics and qualities of each paper. You should not ignore the content of the papers in cases where you have knowledge of the topic, of course; remember, however, that you are not expected to acquire detailed expertise of the papers' subjects in order to assess them. You need not make verbal comments. Should you choose to do so, please confine yourself to the space at the bottom of this sheet, and its back; do not mark the essays themselves.

Please rate each paper in all of the following categories. The Overall Rating is not a sum, and need not be an average, of the component scores.

1. Argument: Does the essay propound a thesis? Does the writer support it with an adequate argument? Is the argument coherent? Convincing?
2. Evidence: Does the writer use primary source evidence to support his or her argument? Does the writer demonstrate analytical and critical skills in using these sources? Does the writer take proper note of their biases?
3. Historiography: Does the writer use other historians' work appropriately to frame his or her argument? Does the writer take account of interpretations that diverge from his or her own? Does the writer demonstrate critical skills in the use of secondary sources?
4. Expression: Does the writer use language skillfully?
5. Form: Does the writer adhere to the normal rules of citation in footnotes, bibliography, etc.? Are the citations adequate to allow the reader to form a critical opinion of the range and use of sources?
6. Overall Rating: Bear in mind that this is a summary judgment of the paper's quality, and need not reflect an average of the categories above. Such factors as creativity and originality should be considered in this category.
Comments (optional):


Last revision 07/20/02

