Michael Preston

Computer-assisted study of folklore and literature was initiated shortly after World War II by Roberto Busa, S.J., who began preparing a concordance to the works of Thomas Aquinas in 1948, and Bertrand Bronson, who made use of the technology to study the traditional tunes of the Child ballads. In the pre-computer era, Bronson had worked with punched cards, which he manipulated with a mechanical sorter and a card-printer. Many early efforts at producing concordances by computer were modeled on the punched card/sorter/reader-printer process, itself an attempt to mechanize the older practice of writing slips by hand and then sorting them manually.

At the University of Colorado at Boulder, computer-assisted work in the humanities was initiated in 1964 by H. Lewis Sawin and Charles Nilon, who envisioned the technology as a means of "integrating" the various competing, incomplete, and sometimes inaccurate bibliographies of works written about literature, hence the Integrated Bibliography Project. This project made use of punched paper tape as its means of entering data, but the programming problems were insurmountable and no "integrated" bibliography was ever produced. The history of that project does not differ significantly from others in the 1950s and 1960s, such as those involving the automated translation of texts from one language to another. The lessons learned included: the details of the project were not adequately understood by its co-directors, available equipment was rudimentary, programming languages were not designed for work with data of that nature, and few programmers had significant interest in such projects. In certain circles there was also hostility to such efforts: those in the humanities tended to distrust the technology, and those in the sciences often considered humanities applications to be wasteful of a precious resource.

Specific kinds of projects, however, were more readily assisted by 1960s technology, even if character-sets were inadequate because computer-printers had either an all-uppercase or an upper-and-lowercase character-set that was designed to represent standard English. Nonetheless, medievalists, despite their graphic needs, generally made the heaviest use of the technology, often to assist in preparing editions of manuscripts. Such efforts, it must be understood, were better aided by the technology because editors of medieval manuscripts were concerned with the accuracy of their transcriptions, and punched cards could be corrected and re-corrected and printed and re-printed with little chance of introducing new errors, as is characteristic of retyping. In addition, medievalist editors had the task of preparing glossaries to assist the readers of their editions, and even rudimentary concordances were of considerable use in that they would provide the editor with an alphabetically arranged listing of all spellings of all forms of all words and provide both the immediate verbal contexts in which they occurred and information concerning their locations in the text, such as page and line number. Not only did this assist the lexical effort, but, by revealing unusual spellings, it also served as a check on the accuracy of the edition. (See Baker et al. 1982 and Ross 1983.) A similar use of the technology was found to be appropriate to support the study of oral languages (Rood 1976, 1981) and recorded songs (Taft 1983). Whether the scholarly "product" was a dictionary, an edition, or even the concordance itself, a concern with the representation of the texts being studied, rather than a concern with "meaning" or "theme," made computer-assistance invaluable. A different but related set of considerations made the semi-automated collation of printed texts of scholarly utility. (See M. J. Preston et al. 1977.) Much of this was discussed by Preston and Coleman (1978) in the full awareness that a "translation" from one medium of representation to another can result in a loss of or change in "meaning." (See Bell et al. 1976 and Cathy M. Orr [Preston] et al. 1976.)

We found that the computer-assisted study of language as language and the computer-assisted study of texts as texts were two different although overlapping enterprises. At the University of Colorado, a base of experience was acquired in working on texts in most of the Western languages and in many of their dialects as well as on Native American and African languages themselves, and, for folklore and literary texts, concordances were found to be of considerable utility. A verbal concordance, it must be understood, is the alphabetical listing of all forms of all words together with the verbal contexts in which those words appear; it is the bringing together of different occurrences of the same word that produces the "harmony" that the word "concordance" signifies. It is this emphasis on contexts which distinguishes concordances from indexes and indicates that they have a broader utility than being merely the raw materials from which to make glossaries and dictionaries, although they certainly can serve that function.
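The definition above can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions and is not the software used at Colorado: the function name build_concordance, the crude tokenizer, and the choice of the whole line as the context are all hypothetical. It lists every word form alphabetically and attaches each line in which that form occurs, together with its line number.

```python
import re
from collections import defaultdict


def build_concordance(lines):
    """Map each word form to the (line number, line text) pairs in which it occurs.

    Hypothetical helper for illustration: the context is simply the whole line.
    """
    concordance = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        # Crude tokenizer (an assumption): runs of letters and apostrophes,
        # folded to lowercase. Real texts need far more careful encoding.
        for word in re.findall(r"[a-z']+", line.lower()):
            entry = (number, line)
            # List a line only once per word, even if the word repeats in it.
            if entry not in concordance[word]:
                concordance[word].append(entry)
    # Alphabetical order of headwords is what makes the listing a concordance.
    return dict(sorted(concordance.items()))


if __name__ == "__main__":
    # Invented sample lines, not drawn from any text discussed here.
    sample = [
        "the king rode out in the morning",
        "the queen rode out at noon",
        "and the king came home at night",
    ]
    for headword, contexts in build_concordance(sample).items():
        for number, text in contexts:
            print(f"{headword:<10}{number:>4}  {text}")
```

Because each headword carries its contexts with it, the same listing serves both as raw material for a glossary and as a way of reading all occurrences of a word side by side.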

Developing the software to produce concordances with any of various kinds of contexts--from contexts which are poetic lines to hand-defined contexts or those based on punctuation to KWIC (Key Word In Context) contexts--was largely a response to the differing needs of different individuals working with different kinds of texts. Studies of rhyme patterns, verbal collocations, formulaic constructions, rhetorical patterns--the stylized use of words and phrases--can be readily carried out with a properly designed concordance. In addition, concordances can assist in the identification and systematic study of a whole range of interrelationships between and among texts, from plagiarism to literary allusions and "echoes." Some authors "recycle" bits of their earlier texts, and some ballads have borrowed stanzas from other ballad traditions; concordances are of direct utility in the study of such similarities between different texts in much the way that textual collation of different versions of a text allows one to identify and to study the significance of those differences.
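A KWIC context, one of the kinds named above, can be sketched briefly. What follows is a minimal illustration and not a reconstruction of the Colorado programs: the function names kwic and format_kwic, the fixed window of neighboring words, and the simple tokenizer are assumptions made for the example.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence.

    Hypothetical helper: the context is `width` words on either side.
    """
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    rows = []
    for i, word in enumerate(words):
        if word == keyword:
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            rows.append((left, word, right))
    return rows


def format_kwic(rows):
    """Right-align the left contexts so that the keyword falls in one column."""
    pad = max((len(left) for left, _, _ in rows), default=0)
    return [f"{left:>{pad}}  {word}  {right}" for left, word, right in rows]


if __name__ == "__main__":
    # Invented sample text, not drawn from any text discussed here.
    text = "the king rode out in the morning and the king came home at night"
    for row in format_kwic(kwic(text, "king", width=2)):
        print(row)
```

Aligning the keyword in a central column is what lets the eye run down the occurrences and notice recurring collocations or formulae; a context defined by the poetic line or by punctuation would replace the fixed window with a different rule for where the context begins and ends.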

If one does not encode one's perception of significance in a text (and thereby perhaps distort the significance of one's work), it is very difficult to make direct use of the technology to go from the graphics or the sound-waves that are the text to its "meaning." This is the much-discussed distinction between the "signifier" and what is "signified." At the University of Colorado, various approaches have been taken, some more successful than others and some more mechanized than others. An early effort of a literal nature is that of Michael J. Preston (1972); a more interpretive approach is that of Cathy M. Orr [Preston] (1976). On the level of lexicon, Eugene Irey (1981), because of advances in the technology and customized programming support, was able to impose his "reading" on the texts with which he worked and, as a result, his published concordances are far more "reader friendly" than is common. Conversely, Cathy Preston (1986) accepted the inherent limitation in the technology's application to texts, and, through complex manipulation of the ballads with which she worked, produced a variety of computer-generated aids for her more interpretive approach; she outlined her methodology in a later article (Preston 1989). Michael Taft (1977), differing from all of those above, recognized that there was a level of significance larger than the word-unit in the blues songs he studied; his was a study of "the meaning" of verbal formulae.

All of those above were more successful in their research than the co-directors of the Integrated Bibliography Project. As a culture, folklorists and those who study literature know more about what the technology can and can't do, and, year by year, it can do more. We have learned to read our texts more closely, but difficulties remain. There is no push-of-the-button solution to most questions that informed humanists ask. Instead, there remains a technology built upon the manipulation of "signifiers," and their increasingly sophisticated representation, but most questions remain largely in the area of what is "signified." Perhaps another generation will have easier solutions to the questions we all ask.

References

  • Baker, Donald C., John L. Murphy, and Louis B. Hall, Jr. 1982. The Late Medieval Religious Plays of Bodleian Mss. Digby 133 and e Museo 160. THE EARLY ENGLISH TEXT SOCIETY 283. Oxford: Oxford University Press for the Early English Text Society.
  • Bell, Louis Michael, Cathy Makin Orr [Preston], and Michael James Preston. 1976. Urban Folklore from Colorado: Photocopy Cartoons. Research Monograph LD00079. Ann Arbor: Xerox University Microfilms.
  • Irey, Eugene F. 1981. A Concordance to Five Essays of Ralph Waldo Emerson: "Nature," "The American Scholar," "The Divinity School Address," "Self-Reliance," "Fate." CONTEXTUAL CONCORDANCES. Gen. ed., Michael J. Preston. New York: Garland Publishing, Inc.
  • Preston, Cathy Lynn. 1989. "The Way Stylized Language Means: Pattern matching in the Child Ballads." Computers and the Humanities 23: 323-332.
  • Preston, Cathy Lynn Makin. 1986. The Ballad Tradition and the Making of Meaning. Diss. University of Colorado.
  • [Preston], Cathy M. Orr. 1976. "Folk Comparisons from Colorado." Western Folklore 35: 175-208.
  • [Preston], Cathy M. Orr, and Michael J. Preston. 1976. Urban Folklore from Colorado: Typescript Broadsides. Research Monograph LD00069. Ann Arbor: Xerox University Microfilms.
  • Preston, M. J., M. G. Smith, and P. S. Smith. 1977. Chapbooks and Traditional Drama: Part I, "Alexander and the King of Egypt" Chapbooks. CECTAL Bibliographical and Special Series No. 2. Sheffield: University of Sheffield.
  • Preston, Michael J. 1972. "Chapter Four: A Statistical Approach to the Study of Oral Texts." In The Saint George Play Tradition: Solutions to Some Textual Problems, pp. 113-150. M. A. Thesis. University of Colorado.
  • Preston, Michael J., and Samuel S. Coleman. 1978. "Some Considerations Concerning Encoding and Concording Texts." Computers and the Humanities 12: 3-12.
  • [Rood, David S.] 1976. Elementary Bilingual Dictionary: English-Lakh'ota, Lakh'ota-English. Boulder, CO: University of Colorado Lakhota Project.
  • Rood, David S. 1981. The Siouan Languages Archive: User's Handbook. Boulder, CO: University of Colorado.
  • Ross, Thomas W. 1983. A Variorum Edition of the Works of Geoffrey Chaucer, Volume II: The Canterbury Tales, Part Three, "The Miller's Tale." Norman, OK: University of Oklahoma Press.
  • Taft, Michael. 1977. The Lyrics of Race Record Blues, Analysis of a Formulaic System. Diss. Memorial University of Newfoundland.
  • Taft, Michael. 1983. Blues Lyric Poetry: An Anthology. CONTEXTUAL CONCORDANCES. Gen. ed., Michael J. Preston. New York: Garland Publishing, Inc.