Image from Oscar Wilde’s manuscript of “The Picture of Dorian Gray”

Editors’ Choice: Transforming TEI for the Web

Last month, I led a workshop for the GC Digital Initiatives on “Getting Started with TEI.” For those who don’t know, TEI (short for Text Encoding Initiative) is a method for encoding, or “tagging,” texts in such a way that both humans and computers can make sense of them. It is a set of guidelines, based on XML (the eXtensible Markup Language), for electronic editing and for working with textual data in the humanities, social sciences, and linguistics. With TEI, editors and scholars can “tag” a text for various features such as structure, typography, or references. My workshop focused specifically on how TEI facilitates the digital transcription of handwritten manuscripts. We practiced encoding a couple of pages from Oscar Wilde’s manuscript of “The Picture of Dorian Gray,” paying particular attention to how Wilde edited out the homosexual elements and innuendos as he revised his draft.

In the image below, you can see how the workshop participants used TEI to mark up the revisions that Wilde made to the passage. Here, the participants went through the manuscript line by line to indicate what is written and where text is struck out, added, or illegible. Among the available TEI elements, we focused on <del>, <add>, and <gap>, which indicate deletions, additions, and undecipherable script. We also used rend, which in TEI is an attribute rather than an element, and which describes how or where a piece of text is rendered, such as with a strikethrough or above the current line. For example, in the excerpt below from our encoding, you can see the rend attributes in orange.
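
To make the tagging concrete, here is a minimal sketch in Python of what such an encoding can look like and how a program might read it. The sample sentence is invented for illustration and is not a transcription of Wilde's manuscript. The function name `list_revisions` is hypothetical, the `place` and `reason` attributes are standard TEI attributes that the workshop may or may not have used, and the TEI namespace is left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# Invented sample line (NOT Wilde's text). A real TEI file would also declare
# the TEI namespace and wrap this paragraph in a full document structure.
SAMPLE = (
    '<p>The painter <del rend="strikethrough">adored</del>'
    '<add place="above">admired</add> his sitter, and '
    '<gap reason="illegible"/> said so.</p>'
)


def list_revisions(tei_xml):
    """Return (tag, attributes, text) for each del, add, and gap element."""
    root = ET.fromstring(tei_xml)
    found = []
    # iter() walks the tree in document order, so revisions come out
    # in the order they appear in the transcription.
    for el in root.iter():
        if el.tag in ("del", "add", "gap"):
            found.append((el.tag, dict(el.attrib), el.text or ""))
    return found


revisions = list_revisions(SAMPLE)
for tag, attrs, text in revisions:
    print(tag, attrs, repr(text))
```

Each tuple pairs an element with its attributes, which is how a value like "strikethrough" stays attached to the deleted word it describes.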

At the end of the workshop, we were able to see our progress by transforming the TEI file into HTML, which we could then view in the browser. The browser rendition shows how the typographic elements appear once applied to the source text. As a result, it’s useful for presenting the manuscript details in an easy-to-read format. For example, an excerpt tagged with “strikethrough” would be displayed with a line running through it. You can see this presentation in the image at the top of this article, which displays our transformed TEI side by side with the original manuscript page. In textual editing terms, this kind of transcription is known as a “diplomatic transcription” of a manuscript.
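
The transformation step can be sketched in a few lines of Python. This is not the workshop's actual method (TEI is commonly transformed with XSLT stylesheets). It is a simplified stand-in, and the function name `to_html`, the invented sample sentence, and the choice of HTML tags for each TEI element are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented sample line (NOT Wilde's text), without the TEI namespace.
SAMPLE = (
    '<p>The painter <del rend="strikethrough">adored</del>'
    '<add place="above">admired</add> his sitter, and '
    '<gap reason="illegible"/> said so.</p>'
)


def to_html(el):
    """Recursively render a (simplified) TEI element as an HTML string."""
    inner = escape(el.text or "")
    for child in el:
        # A child's "tail" is the text that follows its closing tag.
        inner += to_html(child) + escape(child.tail or "")
    if el.tag == "del":
        return "<del>" + inner + "</del>"  # browsers draw a line through <del>
    if el.tag == "add":
        return "<sup>" + inner + "</sup>"  # assumption: addition sits above the line
    if el.tag == "gap":
        return '<span class="gap">[illegible]</span>'
    if el.tag == "p":
        return "<p>" + inner + "</p>"
    return inner  # unknown elements: keep their text, drop the tag


html_out = to_html(ET.fromstring(SAMPLE))
print(html_out)
```

Saving the printed string in an .html file and opening it in a browser shows the deleted word struck through and the added word raised above the line, which is the effect a diplomatic transcription aims for.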


Read the full post here.

This content was selected for Digital Humanities Now by Editor-in-Chief Justin Broubalow based on nominations by Editors-at-Large: