Send reply to: Nancy Ide
From: Nancy Ide
Subject: TEI Workshop at DL96
To: Multiple recipients of list TEI-L

                    *********************
                    *  W O R K S H O P  *
                    *********************

               The Text Encoding Initiative Guidelines
         and Their Application to Building Digital Libraries

                           March 23, 1996
                          9:30am - 3:30pm

                             Organizers

                             Nancy Ide
                Vassar College, USA and CNRS, France

                           Judith Klavans
                      Columbia University, USA

                     Held in conjunction with

                       DIGITAL LIBRARIES'96
     First ACM INTERNATIONAL CONFERENCE ON DIGITAL LIBRARIES

                         March 20-23, 1996
                           Hyatt Regency
                       Bethesda, Maryland USA


                          P R O G R A M

 9:30 - 10:15   Overview of the TEI
                Nancy Ide and Judith Klavans, TEI Steering Committee

10:15 - 10:45   The TEI in the Perseus Project
                David A. Smith, Perseus project, Tufts University

10:45 - 11:15   How will library cataloging relate to TEI documents?
                Issues on USMARC and TEI
                Steven Davis, Columbia University

11:15 - 11:45   TEI and the American Memory Project at the Library
                of Congress
                Debbie Lapeyre and Tommie Usdin, Atlis Consulting

11:45 - 12:15   Encoding two large Spanish corpora with the TEI
                scheme: Design and technical aspects of textual
                markup
                Marta Pino, Instituto de Lexicografia, Real Academia
                Espannola

12:15 -  1:00   Lunch

 1:00 -  1:30   The Model Editions Partnership: Creating Editions of
                Historical Documents for the Internet
                David Chesnutt and Michael Sperberg-McQueen, Model
                Editions Project

 1:30 -  2:00   Creating DTDs via Fred
                Keith Shafer, OCLC Online Computer Library Center

 2:00 -  2:30   Some Problems of TEI Markup and Early Printed Books
                Julia Flanders, Brown University Women Writers
                Project

 2:30 -  3:00   TEI and the National Digital Library Program
                LeeEllen Friedland, National Digital Library
                Program, Library of Congress

 3:00 -  3:30   Suggestions for the future development of the TEI
                Guidelines

--------------------------------------------------------------------
| This announcement with links to papers, etc. is available on the |
| World Wide Web at                                                |
--------------------------------------------------------------------


                      D E S C R I P T I O N

The Text Encoding Initiative's Guidelines for Electronic Text
Encoding and Interchange of Machine-Readable Texts were published in
May 1994, after six years of development within the academic and
research communities. The SGML-based Guidelines provide standardized
encoding conventions for a large range of text types and features
relevant for a broad range of applications, including natural
language processing, information retrieval, hypertext, electronic
publishing, various forms of literary and historical analysis,
lexicography, etc. The Guidelines are intended to apply to texts,
written or spoken, in any natural language, of any date, in any genre
or text type, without restriction on form or content. They treat both
continuous materials (running text) and discontinuous materials such
as dictionaries and linguistic corpora. As such, the TEI Guidelines
offer the best encoding solution currently available for the
development of digital libraries, where varied and complex texts must
be stored and manipulated in ways that answer a wide variety of user
needs, and where the linkage of multi-media is essential.

The TEI provides encoding conventions for describing the physical and
logical structure of many classes of texts, as well as features
particular to a given text type or not conventionally represented in
typography. The TEI Guidelines also cover common text encoding
problems, including intra- and inter-textual cross reference,
demarcation of arbitrary text segments, alignment of parallel
elements, overlapping hierarchies, etc. In addition, they provide
conventions for linking texts to acoustic and visual data.

The TEI's specific achievements include:

  o  the specification of restrictions on and recommendations for
     SGML use that enable maximal generality and flexibility in
     order to serve the widest possible range of research,
     development, and application needs;

  o  analysis and identification of categories and features for
     encoding textual data, at many levels of detail;

  o  specification of a set of general text structure definitions
     that is effective, flexible, and extensible;

  o  specification of a method for in-file documentation of
     electronic texts compatible with library cataloging
     conventions, which can be used to trace the history of the
     texts and thus assist in authenticating their provenance and
     the modifications they have undergone; this is especially
     valuable for the development of digital libraries;

  o  specification of encoding conventions for special kinds of
     texts or text features, including: character sets, language
     corpora, general linguistics, dictionaries, terminological
     data, spoken texts, hypermedia, literary prose, verse, drama,
     historical source materials, and text critical apparatus.

The Guidelines also provide an extensible and flexible Document Type
Definition (DTD) framework for text encoding, containing a common
core of features, a choice of frameworks or bases, and a wide variety
of optional additions for specific applications or text types. In
addition, the TEI Guidelines make it possible to encode many
different views of a text, simultaneously if necessary, which is of
critical interest for building digital libraries, where different
users may view the same text in many different ways (physical object,
logical structure, rhetorical object, linguistic object, etc.).


Theme and Goals of the Workshop
-------------------------------

Large-scale application of the Guidelines began with their release
in spring 1994.
Numerous projects in North America and Europe have recently adopted
the Guidelines for a wide variety of applications. The work of the
TEI is now to evaluate, modify and extend the Guidelines in response
to user experience and needs.

This workshop provides a forum for technical discussion and
evaluation of the TEI Guidelines, as they have so far been
implemented in real applications, particularly those which have
relevance for building digital libraries. The topics include but are
not limited to:

  o  detailed description of application of the Guidelines, with
     particular emphasis on interesting problems and (TEI or
     non-TEI) solutions

  o  handling unusual or complex text types, or text types not
     treated in the Guidelines

  o  handling multi-media with the Guidelines

  o  evaluation of the TEI DTD architecture, element and entity
     classes, etc.

  o  encoding multiple views or information types

  o  proposals for extension of the TEI Guidelines

  o  data architectures (e.g., multiple linked files, etc.) for
     storing complex documents

A second focus of the workshop is the refinement and/or adaptation of
the TEI Guidelines for particular text types and/or applications.
Because it aims at maximal generality, the TEI necessarily takes its
encoding solutions to the highest possible level of abstraction. In
addition, the TEI often provides multiple options for encoding the
same phenomenon. The need to provide mechanisms which are maximally
general and flexible is at times at odds with the provision of
mechanisms which are most efficient and/or effective for a specific
application or intended use. To develop an encoding standard
specifically suited to a given application, it is desirable to choose
from among various encoding options the method that is optimal in the
light of intended use. It may also be advantageous to refine or
delimit TEI solutions which are over-general for the needs of a given
application.

In sum, the overall goals of the workshop are (1) to generate a
technical discussion on the applicability of the TEI Guidelines for
building digital libraries, and (2) to provide a forum for a broad
assessment of encoding needs for building digital libraries, in order
to obtain a clearer idea of what these needs are, and, if applicable,
the directions in which the development of the TEI Guidelines and
surrounding activities should go to accommodate them.