A Textbase of Early Tudor English
Greg Waite (Editor in Chief)
University of Otago
Waite, Greg (Editor in Chief). "A Textbase of Early Tudor English." Early Modern Literary Studies 1.1 (1995): 12.1-15 <URL: http://purl.oclc.org/emls/01-1/waitettb.html>.
Origins and Background of the Project
- In 1984 Alistair Fox and Greg Waite initiated a research project at the University of Otago to compile A Glossary of Early Tudor English (in other words, a small dictionary of Tudor language). The impulse for the project came from the realization that not only did students and others reading texts of the period 1485-1550 lack a short dictionary to aid them with unfamiliar words or senses, but also that this period of rapid change in the language had generally received less attention from linguists than the ages of Chaucer or Shakespeare which stand on either side of it. The introduction of printing, the Renaissance restoration of classical learning, the Reformation, and the literary achievements of More, Elyot, Skelton, Wyatt, Surrey and others all had their part to play in the transformation of written English, while less tangible forces were at work on the language in its spoken form.
- Since the project's inception, its scope has widened. In order to provide a sound basis for a dictionary of the period, and to establish a research tool which might contribute to further linguistic and lexicographical documentation of early Tudor English, it was decided to develop A Textbase of Early Tudor English: a machine-readable text corpus supported by software capable of executing search routines, analyzing variations in Tudor spelling, and carrying out grammatical parsing. A further impetus to the encoding of text from this period was the awareness that much of it does not exist in any modern edition. The project, therefore, can provide useful basic editions that might be used in hard-copy by those who have no particular need for string-searches or other electronic features, but have no ready access to the original sources.
- Early experiments with editing and encoding Tudor texts led to the publication of A Concordance to the Complete English Poems of John Skelton (Cornell UP, 1987). Planning for the Textbase was greatly advanced with the help and advice of Professor Ian Lancashire, who spent several weeks in Dunedin in 1987. More recently, it has benefited from the interchange of ideas with colleagues in North America involved with SEENET (Society for Early English and Norse Electronic Texts) and TEI (Text Encoding Initiative).
Purpose of the Project
- The purpose of the project is to provide a corpus of texts which may be concorded or searched interactively for words, collocations of words, or words in particular syntactic constructions. The Textbase is intended as a resource useful to a wide range of scholars, from linguists studying phonology, morphology or syntax, to scholars of English language and literature, to cultural historians who may wish to search for key names, words, or concepts.
- Planning and implementation have been undertaken with a view to supplementing existing lexicographical resources and assisting future work toward a period dictionary or a revised and expanded Oxford English Dictionary (OED). The OED, the main body of which was completed in 1928, remains the chief source of such information for the period. While other areas of English have subsequently been more fully documented in period and regional dictionaries (such as A Dictionary of American English, The Scottish National Dictionary, and, in progress, the Dictionary of Old English, the Middle English Dictionary, and A Dictionary of the Older Scottish Tongue), the Tudor period awaits closer examination.
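The kind of searching described in this section (single words, and collocations of words) can be illustrated with a minimal sketch in Python. It is not the project's software: the function names, the tokenizer, and the sample line (an invented phrase in Tudor-style spelling, not a quotation from any text) are all assumptions made for illustration.

```python
import re
from typing import Iterator


def tokenize(text: str) -> list[str]:
    """Split text into lower-case alphabetic tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def kwic(tokens: list[str], keyword: str, width: int = 3) -> Iterator[tuple[str, str, str]]:
    """Yield (left context, keyword, right context) for every occurrence of keyword."""
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            yield left, tok, right


def collocations(tokens: list[str], first: str, second: str, span: int = 4) -> list[int]:
    """Return positions of `first` where `second` follows within `span` tokens."""
    return [
        i
        for i, tok in enumerate(tokens)
        if tok == first and second in tokens[i + 1:i + 1 + span]
    ]


# Invented sample line, used only to exercise the functions.
SAMPLE = "The kynge shall ryde and the quene shall folowe the kynge"

if __name__ == "__main__":
    sample_tokens = tokenize(SAMPLE)
    for left, key, right in kwic(sample_tokens, "shall", width=2):
        print(f"{left:>12} [{key}] {right}")
    print(collocations(sample_tokens, "kynge", "ryde", span=2))
```

A search of this sort matches exact strings only, so each spelling of a word has to be requested separately unless the query is first expanded through an index of variant spellings.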
Scope of the Textbase
- In setting itself a chronological limitation of 1485-1550, the project aims to document the first phase of early Modern English. Historically, these years cover, more or less, the reigns of the Tudor monarchs Henry VII, Henry VIII, and Edward VI. Linguistically, 1485 is close to the dates 1475 or 1500 normally assigned to the boundary between Middle and Early Modern English. Culturally, 1485 is close to the beginning of the English Renaissance. The later Tudor and Renaissance period is receiving the attention of Ian Lancashire, Roy Flannagan and others, who have initiated a series titled Renaissance Electronic Texts, mainly post-1550.
1. Texts for Inclusion
- The Textbase will include all surviving poetry and drama of the period 1485-1550, and selected prose. The verse corpus (nearing completion) is estimated to consist of roughly 260,000 lines. We are fortunate in the recent appearance of good catalogues of the verse and drama for our period which have enabled us to be confident of our coverage. (See, for example: Ringler, William A., Jr., Bibliography and Index of English Verse in Manuscript, 1501-1558, prepared and completed by Michael Rudick and Susan J. Ringler. New York: Mansell, 1992. Bibliography and Index of English Verse Printed 1476-1558. New York: Mansell, 1988. Also used: Brown, Carleton Fairchild and R.H. Robbins, The Index of Middle English Verse. New York, 1943. R.H. Robbins and J.L. Cutler, Supplement to the Index of Middle English Verse. U of Kentucky P, 1965. Ian Lancashire, Dramatic Texts and Records of Britain: A Chronological Topography to 1558. Toronto: U of Toronto P, 1984. Alfred Harbage and S. Schoenbaum, Annals of English Drama, 975-1700. London: Methuen, 1964. For the prose, the Short Title Catalogue (STC) and the New Cambridge Bibliography of English Literature (NCBEL) can now be supplemented by R.E. Lewis, N.F. Blake, A.S.G. Edwards, Index of Printed Middle English Prose. New York: Garland, 1985.)
- Many early Tudor books and manuscripts contain texts of uncertain date or origin, and others that are clearly identifiable as medieval. Inclusion of such works depends upon factors such as the significance of the work in the Tudor period (based to some extent on the apparent wideness of its circulation), the textual importance of the Tudor version (perhaps showing a significant degree of modernisation, or representing a textual tradition distinct from the generally known medieval one) and so on. The tendency will be for inclusion, rather than exclusion, so that writers whose careers span the dating limits will have their entire known corpora included. At this stage, Scottish writers are not included.
- Of the prose from the period, a body perhaps twice as large as the corpus of poetry and drama will be entered, taken from as wide a variety of literary and non-literary sources as possible in order to represent the varieties and registers of writing. Major works such as Sir Thomas Elyot's The Boke Named The Governour (1531) and many of Sir Thomas More's works have been or will be entered in their entirety. In addition to the above, we intend to include some glossaries and dictionaries (the Promptorium Parvulorum, for example, has been entered). We will seek to avoid any duplication of work on early dictionaries being carried out elsewhere.
2. Ancillary Materials
- The Textbase will be supported by a reference database indexing fields such as: Title, Author, Composition Date, First Line, Printer, Short Title Catalogue number, Ringler number (from the Bibliography and Index of English Verse), etc. We have cooperated with members of the Otago University Computing Centre in work on a parsing system that will handle early Tudor English. While much of this work remains experimental, some useful components of the parser (written in LISP) may be released--in particular, an Index of Early Tudor Spellings, and related software that will assign a Tudor spelling form to its modern Headword form or, in reverse, provide a list of potential Tudor spellings in response to the input of a modern spelling form. This tool will obviate the need to specify several strings when searching for a particular word. In the corpus of Skelton's poetry, for example, the verb shall takes the following forms: shall, shal, shalt, shalte, shold, sholde, should, shoulde, shuld, shulde, xall, xalte, xuld, xuldyst, xuldest. Enhancement of the Textbase might eventually include other features such as the addition of scanned images of the source materials.
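The two lookups described for the Index of Early Tudor Spellings (from a Tudor form to its modern headword, and from a modern headword to its possible Tudor forms) can be sketched as follows. This is an illustration in Python, not the project's LISP software; the data structure and function names are assumptions, and the only data used are the Skelton forms of shall listed above.

```python
from collections import defaultdict

# Headword -> attested spellings. The single entry reproduces the forms of
# "shall" cited from the Skelton corpus; a real index would hold many entries.
SPELLINGS: dict[str, list[str]] = {
    "shall": [
        "shall", "shal", "shalt", "shalte", "shold", "sholde", "should",
        "shoulde", "shuld", "shulde", "xall", "xalte", "xuld", "xuldyst",
        "xuldest",
    ],
}


def build_reverse_index(index: dict[str, list[str]]) -> dict[str, set[str]]:
    """Invert the index so that each spelling points to its headword(s)."""
    reverse: dict[str, set[str]] = defaultdict(set)
    for headword, forms in index.items():
        for form in forms:
            reverse[form.lower()].add(headword)
    return reverse


def headwords_for(form: str, reverse: dict[str, set[str]]) -> list[str]:
    """Tudor spelling -> candidate modern headwords (empty list if unknown)."""
    return sorted(reverse.get(form.lower(), ()))


def find_headword(tokens: list[str], headword: str,
                  index: dict[str, list[str]]) -> list[int]:
    """Return positions of every token that is a recorded spelling of headword."""
    forms = {f.lower() for f in index.get(headword, ())}
    return [i for i, tok in enumerate(tokens) if tok.lower() in forms]


REVERSE = build_reverse_index(SPELLINGS)

if __name__ == "__main__":
    print(headwords_for("xuldyst", REVERSE))
    # Invented sample phrase, used only to exercise the search.
    print(find_headword("he xall go and she shulde stay".split(), "shall", SPELLINGS))
```

The reverse index maps each spelling to a set of headwords rather than a single one, on the assumption that some Tudor forms are ambiguous between two or more modern words.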
Editorial Treatment of Texts
- Texts entered in the Textbase are taken from both hand-written and printed sources. Some of these works have been edited over the past century, and this modern editorial work often assists us in our own. Many others, however, exist only in manuscripts or printed books of the period. All texts in the Textbase are based upon fresh transcriptions of manuscript sources or early printed books. In this respect our work is quite different from the body of texts from this period available in the Chadwyck-Healey English Poetry Full-Text Database. Original spellings and punctuation are preserved. Our editorial principles are conservative, and involve a minimum of change to the source. Where obvious printer's or copyist's errors are emended, the erroneous copy text version is always recorded alongside the corrected form. Many texts exist in two or more versions which may vary from one another to a greater or lesser extent. Where this variation is extreme, both or several versions may be recorded. Some authors from the period (Wyatt, for example) present particularly difficult editorial problems. We intend to provide their corpora in two forms: copy text versions arranged in the kind of sequence typical of a modern collected edition, and complete transcriptions of manuscript miscellanies and common-place books in which their works may appear piece-meal and interspersed with the work of other authors. These collections are nevertheless of considerable interest in their own right. Usually the Textbase copy text of a particular work will be the earliest or most complete or reliable version. Other versions are normally collated, and significant variations recorded. That is to say, a word or phrase in any version which differs from the copy text version is recorded. Mere spelling variations of the same word are not noted.
- It is the intention of the editors to progress to the provision of full critical editions of many texts in the Textbase. At this stage, however, our editions are essentially diplomatic, and are not provided with critical apparatus such as notes, commentary or glossary. We do, however, provide a brief introduction for each text or group of texts outlining current knowledge about the author and his work, the sources we are dependent upon, relevant critical literature, and so forth.
- Our markup system was originally devised before the formation of TEI, but in the knowledge that we should use tags that conformed with SGML principles. We are at present converting our markup to meet the guidelines or requirements set down by bodies such as SEENET and the Oxford Text Archive, in accordance with TEI. Some preliminary testing of procedures to convert our tags into a form conforming to the SEENET DTD was undertaken by John Price-Wilkin with favourable results.
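The collation policy set out earlier in this section (substantive differences from the copy text are recorded, mere spelling variations are not) can be sketched in Python as below. This is an assumption-laden illustration, not the editors' procedure: the alignment uses the standard library's difflib, the small normalization table is hypothetical, and the two sample readings are invented.

```python
from difflib import SequenceMatcher

# Hypothetical table mapping variant spellings to a common form, so that
# spelling-only differences compare as equal. A fuller system might draw
# this from an index of Tudor spellings.
NORMAL_FORMS = {"shal": "shall", "shulde": "shall", "kynge": "king", "kyng": "king"}


def normalize(word: str) -> str:
    """Reduce a word to its normal form for comparison purposes only."""
    lowered = word.lower()
    return NORMAL_FORMS.get(lowered, lowered)


def collate(copy_text: str, witness: str) -> list[tuple[int, str, str]]:
    """Return (word position in copy text, copy-text reading, witness reading)
    for each substantive variant; original spellings are kept in the output."""
    base = copy_text.split()
    other = witness.split()
    matcher = SequenceMatcher(
        None,
        [normalize(w) for w in base],
        [normalize(w) for w in other],
        autojunk=False,
    )
    variants = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            variants.append((i1, " ".join(base[i1:i2]), " ".join(other[j1:j2])))
    return variants


if __name__ == "__main__":
    # Invented readings: "shal" for "shall" is a spelling variant and is ignored,
    # while "walke" for "ryde" is a substantive variant and is reported.
    print(collate("he shall ryde to the towne", "he shal walke to the towne"))
```

An empty reading on either side of a reported variant indicates a word or phrase present in one version and absent from the other.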
Time Frame for Completion and Release of Material
- We foresee the release of material in batches as follows, and hope shortly to be able to announce the publisher:
- Part 1: Verse I. (Most of the verse from 1485 to 1530, including the large bodies by Skelton, Hawes and Barclay, and selected verse from 1530 to 1550, including the works of Wyatt, Surrey and other important figures.)
- Part 2: Verse II. Drama. (The remaining verse, and dramatic texts.)
- Part 3: Prose I. (Including large works by Elyot, More, Barclay, and other writers of historical, legal, or literary prose.)
- Part 4: Prose II. (A corpus of more utilitarian prose-writing and stray pieces.)
All going to plan, we will be in a position to release Part 1 at the end of 1995. Part 2 will be ready for release by the end of 1996. Parts 3 and 4 will be ready for release within a year or two thereafter, provided our present level of funding is maintained.
Editor in Chief: Dr. Greg Waite
Editors: Dr. Seymour House, Professor Alistair Fox, Dr. Janet Wilson
Associate Editor: Dr. Paul Sorrell
Department of English
University of Otago
FAX: (64) (3) 479 8558
(RGS, 12 December 1997)