"The Letter was not nice but full of charge":
Toward an Electronic Facsimile of Shakespeare

Speaking Notes for a paper delivered by
Kenneth B. Steele
Centre for Computing in the Humanities
University of Toronto

at the Combined 16th Annual Association for Literary and Linguistic Computing Conference and 9th International Conference on Computers and the Humanities (ALLC/ICCH): "The Dynamic Text." University of Toronto, June 5-10, 1989.

[A laptop computer and overhead projection palette were utilized for the examples.]

_______________

William Shakespeare's dramatic career began four centuries ago, and his associates John Heminge and Henrie Condell tell us that, from the beginning, the texts of his works were abused by what they term "diuerse stolne, and surreptitious copies, maimed, and deformed by the frauds and stealthes of iniurious impostors, that expos'd them." In their preface "To the great Variety of Readers" in the 1623 First Folio, they assert that finally their volume presents these pirated plays, "cur'd, and perfect of their limbes; and all the rest, absolute in their numbers, as he conceived the[m]." Of course, all the early quartos made very similar claims to editorial integrity: for example, the 1603 first quarto of Hamlet purports to present the play "As it hath beene diuerse times acted by his Highnesse seruants in the Cittie of London." This assertion sounds distinctly like the publicity for a very recent edition, which declares: "After eight years' work by the world's finest Shakespeare scholars, here are the plays as they were acted by Shakespeare's company."

The inescapable reality is that the entirety of the Shakespearean corpus comes to us through the mediation of scribes, editors, compositors, and proofreaders, and that, with every edition, from Renaissance quartos to modern diskettes, layer upon layer of interpretation has accumulated and obscured the texture of the author beneath the glossy paste-wax finish of emendation.
Each edition claims to "correct" ostensible "corruptions" and "compositorial errors," in the process retreating still further from the original authorial manuscripts. Most damaging of all is the time-honoured editorial preference for a single, "definitive" text, rather than a respect for what Randall McLeod calls the "infinitive" text. This attitude results in rationalization of variants as "indifferent" and conflation of sometimes strikingly different plays into one.

The Quarto and Folio texts of _King Lear_, for example, disagree in 400 complete lines, and thousands of variant words. Recent scholarship by Steven Urkowitz and Gary Taylor, among many others, has demonstrated the consistent and deliberate alteration of character, theme, and plot between the texts. When these distinct texts are conflated, it should hardly be surprising that many elements become contradictory, or that characters become inconsistent. Shakespeare may be responsible for the folio _King Lear_, or he may not, but certainly he never imagined the monstrous conflation which has dominated editions for several centuries.

_King Lear_ is not a special case, although its seminal importance to the twentieth century has made it the most celebrated example of textual instability (or what we at this conference are calling a "dynamic text"). Seventeen of Shakespeare's 38 plays survive in 2 variant forms, and 2 more, _Romeo and Juliet_ and _Hamlet_, survive in 3. Conflation and the editorial impulse to create a single "definitive" text accommodate a naturally human but rather lazy preference for the uncomplicated rather than the uncomfortable or inconvenient.

I propose that Friar Lawrence's words to Friar John in _Romeo and Juliet_ have found new relevance in the twentieth century as a warning to Shakespeare scholars:

     The Letter was not nice but full of charge,
     Of deare import, and the neglecting it,
     May do much danger[.]
     (_RJ_ Q2 5.2.18)

Every letter, ligature, or punctuation mark in the original texts, although occasionally annoying or puzzling, is indeed "full of charge" -- and in more than one sense, when one is talking about an electronic Shakespeare textbase.

_______________

Twenty-five years ago this month, shortly after I was born, Trevor H. Howard-Hill performed his first test concording of Shakespeare's _Measure for Measure_ at the Oxford University Computing Laboratory.[1] The English Electric KDF-9 computer he used took paper-tape input, and possessed a core memory of what we would call 187 K. By the end of the decade, though, Howard-Hill had overseen the keypunching and quintuple proofreading of the entire First Folio, several important quartos, and a number of apocryphal plays, in the course of producing the Oxford Old-Spelling Shakespeare Concordances. To this day, those files remain the bulk of the computer-readable transcripts of Shakespeare's quarto and folio texts.

In the early 1970's, Howard-Hill claimed it was still "too difficult, time-consuming, and expensive for all but the most determined scholar to work with the computer."[2] Downloaded to a modern microcomputer, though, the 55 original texts of the 38 plays now occupy 10.2 megabytes of disk space, and even with their WordCruncher indexes, can fit on a 20-megabyte hard disk. For those who steadfastly prefer an edited text, the Riverside Shakespeare and the Oxford Complete Works are now available commercially on diskette for a few hundred dollars, and Oxford is considering the release of its recent old-spelling edition as well. In many ways, it is now too difficult, time-consuming, and expensive for all but the most determined scholar to work _without_ the computer.

Early last year, the Centre for Computing in the Humanities began the Shakespeare Text Archive project, to make Howard-Hill's mainframe files part of an accessible and convenient textual tool for students and faculty at the University of Toronto. Although still incomplete, the textbase was formatted for use with ETC's WordCruncher software, and made available through the CCH's network. The process of verification, encoding, and keypunching continues even now, but the archive currently contains the First Folio, all of the so-called "good" quartos, and most of the "bad" quartos.

Hopefully, once the Shakespearean texts are complete, the means will be found to add a collection of Shakespearean sources and perhaps even a modern edition or variorum annotation for cross-reference. Ultimately, the archive could be converted into a form of hypertext, perhaps linked with optical facsimiles of the texts in question. This form of "electronic variorum" might finally approximate a genuinely "infinitive text," and, distributed on CD-ROM, could be easily accessible to all Shakespeareans.

Used with text retrieval software such as WordCruncher or even TACT, the textbase is currently less useful for computerized stylistic, authorship, or collational analysis than as a remarkable tool to accelerate and refine more traditional approaches, such as close reading, imagery analysis, or source study. First of all, the text becomes what I liken to a "doodle pad": through casual browsing, random experiment, and plain dumb luck, the user often stumbles across something which sparks the imagination. _Reading_ Shakespeare via WordCruncher is another form of stimulus to the critical imagination: a diachronic corpus suddenly becomes synchronic, multiple plays interpenetrate on a single word, and one is reading vertically, by cross-section, rather than horizontally, from beginning to end.
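
The "vertical" reading described above is, in essence, a keyword-in-context display. The following is only a minimal sketch of that idea in Python, not a description of how WordCruncher or TACT work internally; the function name `kwic`, the `Hit` record, and the tiny sample texts are all assumptions made for illustration (the second sample is invented, and real archive files would also carry their reference codes).

```python
import re
from collections import namedtuple

# One concordance line: which text it came from, and the context on each side.
Hit = namedtuple("Hit", "play left keyword right")

def kwic(texts, keyword, width=30):
    """Collect keyword-in-context hits for `keyword` across several texts.

    `texts` maps a label to a plain transcript (a hypothetical input format).
    Matching is case-insensitive and on whole words only, so variant
    spellings must be searched for separately.
    """
    pattern = re.compile(r"\b%s\b" % re.escape(keyword), re.IGNORECASE)
    hits = []
    for play, text in texts.items():
        flat = " ".join(text.split())  # collapse line breaks into spaces
        for m in pattern.finditer(flat):
            left = flat[max(0, m.start() - width):m.start()]
            right = flat[m.end():m.end() + width]
            hits.append(Hit(play, left, m.group(0), right))
    return hits

# Illustrative input only: the first entry is the passage quoted in this
# paper, the second is an invented string, not a Shakespearean text.
SAMPLE_TEXTS = {
    "RJ Q2": "The Letter was not nice but full of charge,\n"
             "Of deare import, and the neglecting it,\n"
             "May do much danger",
    "invented": "a letter for a letter",
}

for h in kwic(SAMPLE_TEXTS, "letter"):
    # Right-align the left context so the keywords line up in one column.
    print(f"{h.play:<10}{h.left:>30} [{h.keyword}] {h.right}")
```

Aligning every hit on the keyword is what turns a sequence of plays into a single cross-section: the eye runs down the column rather than along the line.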

Working with the original texts can be complex and confusing, or "inconvenient" as some would say: the archive contains more information than many users may think they want. For literary rather than textual purposes, the user must account for a bewildering array of duplicate texts, variant spellings, typographical eccentricities, textual cruxes, and apparent compositorial "errors" familiar to anyone who has tried to read Shakespeare in facsimile. Nonetheless, with a little practice, the archive can yield better results than an electronic edition, and with considerably greater textual fidelity.

Another purpose of research with the Shakespeare Text Archive is lexical: the Oxford English Dictionary reports the Renaissance definition of any given word, but does not easily reveal Shakespeare's personal sense of its connotations and denotations. Examining each occurrence of the word in its Shakespearean context can reveal Shakespeare's intellectual and emotional attitude toward it: for example, Shakespeare seems to consider "dogs" fawning, subservient, low-life creatures, but "hounds" loyal, noble hunting animals.

It is also possible to search for words in conjunction with the codes for title pages, stage directions, speech prefixes, prologues, etc. (There are 60,769 Cocoa-type codes in the texts.) One quickly finds that, of 6276 stage directions and 554 references to beds, only 6 beds occur within the original stage directions, in _2 Henry 6_, _Romeo and Juliet_ (Q1 only), _Othello_, and _Cymbeline_.

[Example: eyen, eyne, eine]

An examination of the fifteen occurrences of the obsolescent plural "eyen" (in its various spellings) reveals that in all cases but one (_Pericles_), the word is used primarily for purposes of rhyme (and eight times to rhyme with "mine").
Furthermore, it seems clear from Shakespeare's use of the word in _A Midsummer Night's Dream_ and _As You Like It_ that he considered it preposterous: appropriate only in circumstances such as Bottom's performance as Pyramus, or Phebe's love poem to Rosalind.

[Example: Ears.byr, Ears2.byr, Ears3.byr]

Another example may demonstrate the usual investigative sequence more clearly. The ear is referred to 603 times in Shakespeare's original texts. As one reads through these occurrences, 25 references seem strangely sexual -- specific evidence which is more convincing than vague Freudian generalizations. Furthermore, when one searches for co-occurrences of the ear and poison or infection, one finds seven places in which pouring poison in the ear is closely associated with some form of male sexual jealousy -- which I would argue helps to explain the symbolism of _Hamlet_ as it may have formed in Shakespeare's mind.

[Example: Theatre.byr]

The user can create a list of theatrical terms [act, appear, comedy, costume, globe, illusion, mask, pageant, part, perform, platform, play, revel, scaffold, scene, show, stage, theater, and tragedy] to conduct a preliminary study of metatheatricality across the corpus. My word list matches 3384 occurrences, largely in the tragedies and good quartos, and concentrated in the Elizabethan plays. The histories and the First Folio seem comparatively lean in metatheatrical references. _Hamlet_ and _A Midsummer Night's Dream_, understandably, have by far the greatest concentration of theatrical language, owing to the internal performances, and their overall thematic concerns.

_______________

The preceding illustrations have been primarily literary in nature, focusing on lexicon and content rather than textual minutiae, which is the forte of the Shakespeare Text Archive. Unlike electronic _editions_ of Shakespeare, the archive aims to be an electronic facsimile of the original quarto and folio texts.
Obviously, it cannot yet replace more conventional photographic facsimiles, but the coding of signatures allows easy cross-reference to the Norton facsimile or the Allen & Muir quarto facsimiles. Using text retrieval software, an electronic facsimile can provide virtually instant answers to questions previously unthinkable. An absolutely precise electronic facsimile is still years away: only some form of flawless optical process will satisfy bibliographers. Yet despite its minor imperfections, the Archive as it stands can eliminate much of the drudgery of mechanical and repetitive tasks, liberating the scholarly imagination to dream up new questions, rather than laboriously seek to answer old ones.

[Example: I2.byr, I.byr]

For example, the word "aye / ay / aie" [I2.byr] occurs a total of 76 times in the texts. In 41 cases, it occurs in the expression of self-pity, "ay me / mee," in 27 cases it means "ever" (17 times it is "for ay / aye"), 7 times it means "yes," and in 3 other cases it is the French first person singular verb. The spelling "i" (the single letter) occurs considerably more often, 31,540 times in the texts: in 30,743 cases, it is the pronoun (in 202 cases in contracted form), and 999 times it signifies "yes" (remember, in only 7 cases was it spelled "ay / aye"). [I.byr] "I" as a single letter is never the desperate "ay me," nor is it used to mean "ever."

[Example: Italics]

Italic typeface is used 74,655 times in the corpus, and _Troilus and Cressida_ and _Antony and Cleopatra_ seem to have double the average at about 2000 occurrences each. This may be partially explained by the lengthy italic advertisement prefixed to the quarto _Troilus and Cressida_, and the italic Prologue which begins the folio version. WordCruncher is not, however, counting each italic letter, but only the font codes at the beginning of each italic section.
The bulk of the italic occurrences seem to be caused by italic speech prefixes, intensified by lengthy stichomythia, and the compositors' scrupulous italicization of proper names in the text (and both Roman plays seem to delight in allusions to historical figures).

[Example: Hyphenation]

Hyphenation occurs in only 8561 words, but because these are largely single occurrences, it took my 12 MHz machine two and a half hours to come to that conclusion (fortunately I saved my results for the purposes of this fifteen-minute talk). The hyphenation, for the record, seems to be heavier in the folio than the quartos, and occurs with especial frequency in _The Merry Wives of Windsor_ (532 times).

[Example: Punctuation ( , . : ; ? ! )]

Punctuation study, previously almost unthinkable, is extremely quick and painless on an electronic text. At a conference held here at Toronto last year, a presenter received an ovation for having diligently counted all the commas in a Renaissance text. Within seconds, however, an electronic facsimile can determine that there are 138,198 commas in the original texts (which would take over 23,000 WordCruncher screens to view), and can offer distributions across the works. Distributions of the 104,928 periods, 26,974 colons, 5820 semi-colons, 15,785 question marks, and 1162 exclamation marks can also be examined, to determine, for example, that the Folio seems light in periods, but heavy in semi-colons and colons.

WordCruncher is not a collation program, and other programs I've tried thus far seem too particular to handle the variant spellings and significant alterations in the Shakespeare texts. It is possible, however, to index together versions of a play with WordCruncher, and to output a comparison word frequency file.

[Example: "type hamlet.cmp"]

This is rather less interactive than WordCruncher's other functions, but it does produce some results.
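
A comparison word frequency file of the kind just mentioned can be approximated in a few lines. The sketch below is an assumption-laden illustration in Python, not the format or method of WordCruncher's own output: the names `frequencies` and `compare` are hypothetical, the tokenizer is deliberately crude (letters and apostrophes as words, six punctuation marks as separate tokens, hyphens ignored), and the two sample strings are invented, not lines from either text of _Hamlet_.

```python
import re
from collections import Counter

# Words (letters and apostrophes) or one of six punctuation marks.
TOKEN = re.compile(r"[A-Za-z']+|[,.:;?!]")

def frequencies(text):
    """Count old-spelling word forms and punctuation marks, case-folded."""
    return Counter(tok.lower() for tok in TOKEN.findall(text))

def compare(text_a, text_b):
    """Return sorted rows (form, count_a, count_b) where the counts differ.

    Forms with identical counts in both texts are omitted, so the rows
    show only where the two versions diverge.
    """
    fa, fb = frequencies(text_a), frequencies(text_b)
    rows = []
    for form in sorted(set(fa) | set(fb)):
        if fa[form] != fb[form]:  # a Counter reports 0 for absent forms
            rows.append((form, fa[form], fb[form]))
    return rows

# Invented snippets standing in for two versions of the same passage.
SAMPLE_A = "O goe, goe: finde the winde!"
SAMPLE_B = "Oh go, goe; find the winde?"

for form, count_a, count_b in compare(SAMPLE_A, SAMPLE_B):
    print(f"{form:<8}{count_a:>4}{count_b:>4}")
```

Because each spelling is counted as its own form, pairs such as "goe" and "go" or "finde" and "find" surface as separate rows, which is exactly the kind of evidence a spelling study needs and exactly what a modernizing tokenizer would erase.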
Of course, these results cannot finally be interpreted without division along compositor stints, which are coded in the text itself but which the current software cannot analyze.

Comparing the word frequencies of Q2 and F1 of _Hamlet_, a number of patterns become immediately evident. The folio seems to double the number of prose lines (perhaps because of shorter column width). Question marks and semi-colons double, colons quintuple, and exclamation points all but disappear. The compositors of the folio _Hamlet_ overall prefer to drop silent e's, except in finde, minde, kinde, winde, and eleven other words which generally have doubled vowels or consonants. Quarto 2 demonstrates a marked preference for the spelling "o", while the folio markedly prefers "o-h." The list could be endless, and is (as you can see).

_______________

Not only is the Shakespeare Text Archive textually incomplete, but the software engine which drives it, currently WordCruncher, is too limited to perform all the research which the text could ultimately support. The _ideal_ workstation, or software package, for the Shakespeare Text Archive would have to be capable of performing more complex multitasking functions, such as on-screen collation of multiple texts or comparison of alternate pairs (for example, the distributions of "go" versus "goe" (with an "e") through compositors). The text retrieval software, or a module compatible with it, would have to be able to search the textbase for rhetorical repetitions of any kind: anaphora (the repetition of a word at the beginning of syntactic units), epistrophe (end), epanalepsis (beginning and end of same unit), anadiplosis (end of one unit and beginning of next), and perhaps even alliteration and rhyme, if adequate phonemic encoding could be included. This ultimate workstation would be able to facilitate the detection of patterns in the significant differences between variant texts or compositor stints.
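
To suggest what a search for rhetorical repetition might involve, here is a minimal Python sketch of the simplest case, anaphora. It rests on a strong simplifying assumption: the verse line is treated as the "syntactic unit," which a real implementation would need to refine. The function names and the sample lines are invented for illustration and describe no existing software.

```python
import re

def first_word(line):
    """Return the first word of a line, lower-cased, or None if there is none."""
    m = re.search(r"[A-Za-z']+", line)
    return m.group(0).lower() if m else None

def anaphora_runs(lines, min_run=2):
    """Find runs of consecutive lines that open with the same word.

    Returns (start_index, run_length, word) tuples. Lines containing no
    word at all (blank lines, for instance) break a run.
    """
    runs = []
    start = 0
    while start < len(lines):
        word = first_word(lines[start])
        end = start + 1
        # Extend the run while the following lines open with the same word.
        while (word is not None and end < len(lines)
               and first_word(lines[end]) == word):
            end += 1
        if word is not None and end - start >= min_run:
            runs.append((start, end - start, word))
        start = end
    return runs

# Invented lines, used only to exercise the function.
SAMPLE_LINES = [
    "Some say the text is whole,",
    "Some say the text is maimed,",
    "Some say it never was,",
    "And all say true.",
]

print(anaphora_runs(SAMPLE_LINES))
```

Epistrophe could be approached the same way by comparing last words instead of first, and anadiplosis by comparing the last word of one line with the first of the next; alliteration and rhyme, as noted above, would require phonemic information that spelling alone does not supply.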
The software would be flexible enough to reorder ranges based on the chronological order of composition, performance, typesetting, or publication. It would allow operations to be conducted on the entire corpus or on a single play or plays.

Ultimately, the textual study of Shakespeare will lead to a combination of optical document and machine-readable text: a form of hypertext which would also allow computerized comparison of damaged type, press variants, etc. If Herner Schnelling's calculations in his 1987 _Literary and Linguistic Computing_ article are correct, however, a single Folio page would require 3,952 million bytes of storage to meet the CCITT [3] facsimile standard. To the average Shakespearean, optical storage is currently something "too difficult, time-consuming, and expensive" -- but as we all know, technological limitations are transitory. For today, a readily accessible ASCII text archive is a step toward the day when we, along with Prospero, can declare, "I'll drown my book."

N O T E S

1. Howard-Hill describes the process in detail in his article, "The Oxford Old Spelling Concordances," Studies in Bibliography 22 (1969): 143-64. The later history of the texts is described in an anonymous note, "Shakespeare and the Computer," ALLC Bulletin 8:1 (1980): 72.

2. "Common Shakespeare Text" 54.

3. (Comité Consultatif International Téléphonique et Télégraphique).

___________________________________________________________________

The contents of this electronic file are copyright (c)1990 Kenneth B. Steele, University of Toronto. Quotation for scholarly (non-commercial) purposes is permitted, but please contact the author ( or ) to verify the material in question and advise him of your intention. Please do NOT distribute.