The Emblem inside the Emblem Book – The Structuring and Indexing of Texts and Images
Herzog August Bibliothek, Wolfenbüttel
Opitz, Andrea. “The Emblem inside the Emblem Book – The Structuring and Indexing of Texts and Images,” Emblem Digitization: Conducting Digital Research with Renaissance Texts and Images, ed. Mara R. Wade. Early Modern Literary Studies Special Issue 20 (2012): 7. <URL: http://purl.oclc.org/emls/si-20/WADE_Opitz_EMLS.htm>.
1. Based on the experience gathered from the digitization projects of both illustrated festival books and emblem books at the Herzog August Bibliothek (HAB), this paper discusses the possibilities and problems of indexing electronically delivered books, in particular emblem books. I show here how we used the Text Encoding Initiative (TEI) for structuring and indexing in XML-format for both texts and images. The encoding of the emblem inside the emblem book receives special attention. An important question is how an emblem exchange format can be realized for cooperative projects. Other forms of cooperative research among the different emblem projects have been developed and continue to evolve. This paper traces developments in the past decade leading to the launching of the bilateral Emblematica Online project of the HAB and the University of Illinois and illuminates the various technical steps necessary to its genesis.
The Wolfenbüttel Digital Library (WDB)
2. Since 1998 the Herzog August Bibliothek has presented selected items from its collections as digital facsimiles in its Wolfenbüttel Digital Library. These items are rare, outstanding, and frequently used, or currently most relevant for research. All digitized titles may be accessed not only here, but also via the on-line catalog PICA-OPAC. The Herzog August Bibliothek guarantees the accessibility of its digital facsimile editions of imprints and manuscripts on the Web on persistent URLs. This permits reliable quotation of entire electronic editions as well as single pages from them.
3. When a library decides to digitize its collections, it is important that it provide users with complete digital versions of the books. These are then made available as electronic editions via the public library catalogue. The Herzog August Bibliothek bears the responsibility of ensuring that these electronic editions remain available under persistent URLs, which will continue to exist in the decades to come. The images must also undergo appropriate long-term archiving. Only then is it possible to ensure the reliability of electronic media sources for academic activity. We should not restrict ourselves to certain research themes but ensure that the needs of all library users are taken into account.
4. The task comprises more than just making the electronic editions available via the catalogue. Within the Wolfenbüttel Digital Library, there are a number of digitization projects of clearly defined scope, the aim of which is to allow searching the content of the documents, beyond the simple sequences of images they contain. So we have tried in various projects to encode the structure of books of particular genres so as to create a terminological and structural basis for their web-based digital publication. One example is page-related indexing.
5. One genre of book that represents a particular challenge for developing index searching is that of illustrated festival books from the project Festival Culture Online—17th-Century German Imprints of Baroque Festival Culture, funded by the German Research Foundation, since these combine both texts and graphic materials in the same volume. Within this project, the HAB has digitized the entire content of 344 imprints concerning court festival culture.
6. Another category of imprint that consists of both image and text is the emblem book, a special genre of Renaissance publication. The content of some of the emblem books held at the HAB has been digitized as a result of the project entitled Document type definitions (DTD) for the presentation and retrieval of early modern genres via the internet. These projects typify the critical experiences necessary in digitization and indexing that provided the foundations for Emblematica Online.
7. During the DTD project, also funded by the German Research Foundation, the HAB tried to develop document type definitions accessible via the internet for five Renaissance genres—one of them emblem books—and has developed the structural and terminological framework for future web-based digital publications. In the framework of this project, the HAB digitized 70 emblem books from the more than 450 emblem books available at the library.
8. A list of emblem books that have already been digitized is available for viewing at the DTD Project website. A follow-up emblem project, advancing the initial work of the DTD project, was the pilot project Emblematica Online. This project, conducted in cooperation with the University of Illinois and funded by the TransCoop programm of the Alexander von Humboldt Foundation, aimed to digitize, index, and conduct research on emblem books as well as to develop structures and standards which allow for joint access to both databases.
9. The principles acquired in the projects created the basis for the follow-up project Emblematica Online. The Transcoop grant allowed for the establishment of a corpus of digitized books, the development of a data structure specific to emblems, the development of a German-English thesaurus, the concept of a unified search structure for separately held collections, and a common metadata structure, referred to as the Spine. It also led to the creation of an XML structure for emblem encoding and of a first portal for emblematica. Since 2009 Emblematica Online has been financed jointly by the German Research Foundation (DFG) und National Endowment for The Humanities (NEH) in their Bilateral Digital Humanities Program. The project will completely digitize two premiere emblem collections of world-wide prominence; construct a German Emblem Database, a database of extensive metadata with broad functionality for the German emblems of both institutions; and develop the OpenEmblem Portal, hosted at the University of Illinois as an open access research site incorporating book-level metadata from emblem digitization projects worldwide and emblem-level metadata from the University of Illinois and the Herzog August Bibliothek.
Indexing the Emblem Books
10. Within the DTD project, we developed the guidelines for our digitization projects with regard to technology and content indexing. Right from the start, the research project placed great importance on maintaining standards that ensure the interoperability of the data. We chose to work with the XML tagging language together with TEI. In other words, the technical basis for storing and editing the indexing information is XML documents that conform to the TEI DTD. This facilitated both the generation of static html-documents, which are of particular importance for webcrawler searches, and also the dynamic generation of analytical contents pages. These give an overview of the structure of the respective imprint within a web page, together with selected index data, and link these with the digitized page images. They are also used for constructing a database that is able to handle complex enquiries.
11. The DTD-Project created comprehensive prototype content indices for ten emblem books. The TEI DTD is especially oriented toward the semantic structuring and encoding of source material and facilitates the structural description of the source itself. That means we can use the TEI DTD for indexing the entire book.
12. The majority of bibliographic metadata that was deposited in the TEI header is available from existing records from the online catalogue for both the electronic document as well as the printed source. While we have refrained from using Dublin Core (DC) Elements in the emblem books, the festival books do incorporate DC data. The structural features of the original text are marked up in the text block <body>. This means that the features have to be defined and formalized for each genre right from the start by way of a document analysis.
13. The XML-TEI document reproduces the structure of the print and adds selected indexing information on certain aspects of the content. On the page level, structural-semantic units of the print were first of all encoded in XML, comprising book-technical elements such as the cover, textual units, such as dedications or comments, and image units, such as illustrations. Text segments of particular importance (e.g. headings or mottos and subscriptiones in the case of emblems) are integrated in full text form This information is encoded as structural metadata with a specific editor. It provides the basis for a virtual table of contents to facilitate navigation in the digitized book.
14. For emblem books, the following structure emerged: the basic unit of our digitization is the entire emblem book, not the individual emblem. This project digitizes and structures each book page by page from front to back. Our orientation toward the structure of the book makes it possible to mark up accompanying texts, commentary, and the like. The language of these texts—German, Latin, and so forth—is also indicated.
15. While some of this information is possibly of little or no interest to emblem scholars, it may be of great relevance to other researchers. Because we cannot anticipate the needs and research questions of future users, we include this information now. The internet is constantly evolving and our projects intend to be as flexible as possible. What is important to emblem scholars, however, is the structure of the emblem itself within the book.
The Emblem Structure
16. This project defines the structure of the emblem as the three-part ideal type – motto, pictura, and subscriptio:
As a division the motto is included along with the details of the language in which it is written and the details of the emblem which accompanies it. The unit emblem is characterized by the identifier e plus the Emblem-Number and it is referenced by the attribute next. The attribute corresp clearly refers to the digitized facsimile, to the corresponding image.
17. The pictura is encoded through keywords, Bildschlagwörter, bsw. Engraved text in the pictura is included as a Bildstichwort. We use single term keywords, which are oriented towards Iconclass. Our DTD-project dispensed with Iconclass encoding, although this can be added. Although we did not use the iconographic classification Iconclass for the process of indexing the emblem books in the DTD project, we decided to do so for the image elements in the festival books.
18. The following example depicts how we used Iconclass for the description of a festival illustration in Curicke’s Freuden-Bezeugung der Stadt Dantzig.
Figure 1: Example of using ICONCLASS for the description of an illustration in Georg Reinhold Curicke's Freuden-Bezeugung der Stadt Dantzig, Danzig 1698. [Herzog August Bibliothek: Gm 4° 256]
Figure 2: Triumphbogen, Image 00072 of Curicke's Freuden-Bezeugung der Stadt Dantzig, Danzig 1698. [Herzog August Bibliothek: Gm 4° 256] http://diglib.hab.de/drucke/gm-4f-256/start.htm?image=00072
19. In his article, “Using Iconclass for the Iconogrpahic Indexing of Emblems,” Hans Brandhorst explains in detail how useful Iconclass is for the iconographic indexing of emblems. Iconclass is a hierarchical systematic notation with multi-language thesauri, thus all metadata from these notations will be available in both English and German, as well as other languages. The advantage of the Iconclass notation is that searches can be made independently of language and using the Iconclass browser facilitates hierarchical searches in accordance with the classification.
20. The festival book project succeeded in quickly setting up cooperation with Iconclass in the Netherlands, which enabled us to find the perfect way of accessing assigned Iconclass notations. A special page was set up for the HAB, through which it is possible to conduct easy searches for Iconclass notations. Once these are located, the notation is then passed on via Search URL to the HAB database, where it is available for search purposes.
21. In the subscriptio division for the emblem books we have incorporated the subscriptio as text with details about which language is used. Besides the original spelling, standardizations according to modern spelling and translations can also be included.
Orientation towards the Structure of the Book
22. After the analysis of numerous emblem books, we decided to align the structuring of the XML document primarily with the form of the book structure and not with the form of the individual emblem. This is important because all components of an emblem do not always follow each other spatially or consecutively. Many emblem books do not correspond to the three-part ideal type.
23. In this example from Peter Isselburg’s Emblemata Politica, the mottos with the German-language subscriptiones are summarized at the beginning of the book, with motto, pictura, and subscriptio in Latin grouped together later in the book.
Figure 3: List of mottos from Peter Isselburg, Emblemata Politica, Nürnberg 1617. [Herzog August Bibliothek: Uk 40] http://diglib.hab.de/drucke/uk-40/start.htm?image=00014
Figure 4: Emblem Nosce teipsum, in Peter Isselburg, Emblemata Politica, Nürnberg 1617. [Herzog August Bibliothek: Uk 40] http://diglib.hab.de/drucke/uk-40/start.htm?image=00026
24. The DTD project marked the relations of individual parts of the emblem in separate divisions with attribute values of reference (e, next) with the effect that all the parts that belong to this emblem are associated with each other, irrespective of where they appear in the book.
Figure 5: XML structure of the emblem Nosce teipsum, in Peter Isselburg, Emblemata Politica, Nürnberg 1617. [Herzog August Bibliothek: Uk 40]
25. Various levels of subject indexing are possible: for example, in a multilingual work, mottos can be captured in all available languages or restricted to one language. The subscriptio can be included or left out. The incorporation of accompanying text is possible as well as translations of the mottos and subscriptiones, and the determination of themes and topoi. All these things can be incorporated within the TEI DTD without having to carry out any additional expansion.
26. The concept towards which we have been working is based on the assumption that every document may be considered a full text-based TEI document. Proceeding from this ideal form, reduced forms can be set up in turn. Thus, it is a process of omission, which also allows large quantities to be managed. The encoder can fill in omitted text passages or synopses of content. This structure puts various solution models into effect.
27. Depending on the available time, human and financial resources, and the project’s goals, it is possible to carry out projects with various depths of text indexing. Nevertheless, the uniform structure should guarantee that the data can be merged and searched via joint databases.
28. In our view the TEI DTD has proved itself to be a useful way to index the structures of entire imprints. With regard to an emblem exchange format, however, indexing the emblem itself needed additional agreement on detailed categories.
29. As long as a set of this kind has not been defined, additional methods of cooperation are required. This includes, for example, agreeing on which books should be digitized. There are sure to be many reasons why the same books will be digitized at various institutions; for instance, it may be the goal to digitize a collection within a certain library and make the material available to the public, or to digitize books with a certain type of content.
30. Each of these approaches is justified; after all, the goals pursued within various projects vary greatly. Nevertheless, to avoid overlaps and unnecessary work, it makes sense to cooperate and reach agreement on which books should be digitized, in the event that different groups are conducting similar projects. For instance, we reached agreement with the University of Illinois for the Emblematica Online project that we would each, as far as possible, only digitize those books that are not available at the other institution. Since the two collections complement each other extremely well and there is relatively little overlap, we are able to offer users a great variety of digitized emblem books as a result. In the current Emblematica Online we practice cooperative research in the best way.
31. Another type of cooperation is represented by arkyves, formerly known as Mnemosyne. Arkyves collects image material from various collections and projects, indexing them using Iconclass. Emblem books from Utrecht, Glasgow, Illinois, and the Herzog August Bibliothek can all be found here, providing multiple points of access to our digitized books. A variety of search options are available to guide the user to the digital sources and their indexing data in the respective projects. The HAB imprints that have been indexed in arkyves can be accessed directly there. Arkyves is a database of images and texts using Iconclass for indexing.
32. The Iconclass system is available on-line in a new version that is being developed into a freely available webservice. It is an internationally recognized system for indexing art objects and provides a hierarchical depth of indexing appropriate to emblem research. All projects working in the OpenEmblem group have unanimously approved the classification system Iconclass. The projects expect to index with Iconclass notations in order to provide powerful search capabilities via the OpenEmblem Portal.
33. The development of online portals, such as the OpenEmblem Portal, gives users a way of accessing the metadata from a number of institutions that use different standards. It is now possible to search international emblem book collections within a uniform user interface.
34. During the Wolfenbüttel conference, researchers viewed and discussed an early version of Stephen Rawles’ “Spine of Information Headings,” and agreed that it made sense to make a distinction between the information about the books and the information about the emblems themselves. This separation of a book from its content is a traditional feature of library catalogue records, which list the bibliographic information, i.e., the metadata on the respective volume. The indexing of its content is performed separately. In our projects we did so by using the TEI DTD. For emblems the encoding of aspects of content needs a more detailed schema like the Spine.
35. The categories which are intended to give information on the emblem itself in the Spine, from E.1 to E.22, go well beyond the relatively simple indexing procedure of the Wolfenbüttel project in many areas. By employing a common structural model, it should, however, be possible to bring together data of varying index depths. This means that a number of flexible models of content indexing and for cooperating with other projects can be considered. <
36. It also means that it should be possible for libraries to make the digitized sources available on the internet, with a good image quality, together with the bibliographic metadata. The exchange of metadata via the internet is possible via OAI interfaces. Researchers can search through joint databases, irrespective of the project, and locate the individual collections, sources, pages and emblems. The OpenEmblem Portal shows us the direction we should take.
37. While at the triennial conference of the Society of Emblem Studies in 2005 a distributed method of indexing seemed to be only a future possibility, the current joint project Emblematica Online shows that this has become reality. The creation of structural data and the digitization of books do not necessarily have to go hand in hand. It is possible to divide the indexing process between various institutions, each of which possesses and promotes the required skills that are then brought together in a joint database and made available to the user.
38. During recent years much work has been accomplished and many problems were resolved before this goal could be attained. With the basic requirement of using a uniform indexing schema, we have completed the first step. The current project Emblematica Online converts these results into concrete practice. We use the defined format Emblem XML Schema, assign global emblem identifiers, encode mottos, and apply Iconclass notation. Data exchange has been established via OAI (Open Archives Initiative).
39. With completion of Emblematica Online in 2012 the OpenEmblem Portal will offer the ability to search and browse across significant levels of granularity and create functional access to the entire collections of emblem books at Illinois and HAB, to book-level metadata for a number of projects worldwide, and to a large corpus of emblem-level metadata for German emblems from the collections of Illinois and the Herzog August Bibliothek.
 For a detailed description of the Wolfenbüttel Digital Library, see Stäcker, "Die Wolfenbütteler Digitale Bibliothek".
 For the description of the DTD-Project, see Opitz, p. 74.
 The current project Emblematica Online encodes the picturae with Iconclass
 See also Opitz, p. 75.
 See Stäcker, XML, p. 272.
 See Stäcker, "Pratical Issues of the Emblem Schema",in this volume.
 Rawles, pp.19-28.
Responses to this piece intended for the Readers' Forum may be sent to the Editors at M.Steggle@shu.ac.uk.
© 2012-, Matthew Steggle and Annaliese Connolly (Editors, EMLS).