Back to the Future or Wanted: A Decade of High-tech Lower Criticism

Note: The title of this blog entry is the title of a keynote address I gave at the Chicago Digital Humanities and Computer Science Colloquium, held November 18–19, 2012, at the University of Chicago. It is lightly edited and shortened. I have added a postscript called “A Decade later”.

This talk is about the challenges and opportunities posed by the EEBO-TCP corpus. Between 2015 and 2020 (and beginning with an initial release of ~25,000 texts), TEI-XML transcriptions of ~70,000 texts–at least one version of every title published between 1473 and 1700–will pass into the public domain. Once this resource is in the public domain it will for most scholarly purposes replace other surrogates of the printed originals. It will be free, and it will often be the only, and nearly always the most convenient, source for the many look-up activities that make up much of scholarly work.

EEBO-TCP is a magnificent but flawed enterprise, and few of its transcriptions fully meet the scholarly standards one associates with a decent edition in the print world.  Who will guarantee the integrity of this  primary archive that will be the foundation for much future scholarship?   In a print-based documentary infrastructure there was a simple answer to  the question  “Who provides quality assurance (QA in modern business parlance) for the primary sources that undergird work in your discipline?”  It was “my colleagues,” and it might include “I do some of that work myself.”  From the  nineteenth century well into the middle of the twentieth century, “Lower Criticism” of one kind or another counted as significant scholarly labor and made up a significant, though gradually declining, share of the work of humanities departments.

Consider Theodor Mommsen. In 1853 and 1854 he published the first volume of his Roman History, and he started the Corpus Inscriptionum Latinarum (CIL), the systematic gathering of inscriptions from all over the Roman empire. For the next five decades he  was the chief editor and a major contributor to its sixteen volumes, which transformed the documentary infrastructure for the study of Roman history. Since the early 20th century, a student of Roman history with access to a decent research library has had “at hand” a comprehensive collection of the epigraphic evidence ordered by time and place. That has made a huge difference to the study of administrative, legal, and social history.

The CIL is a majestic instance of the century of curatorial labour that created the documentary infrastructure for modern text-centric scholarship in Western universities. In that world the integrity of primary data rested on what you might call a Delphic tripod of cultural memory, with its three legs of scholars who made editions, publishers who published them, and librarians who acquired, catalogued, and made them available to the public. After World War II there was a growing consensus that you no longer needed to worry about data curation because a century of it had succeeded in creating a print-based data infrastructure that from now on you could take for granted. For the last forty years many disciplines in the humanities have lived off the capital of a century of editorial work while paying little attention to the progressive migration of textual data from books on shelves to files on servers or in ‘clouds’. Using some back-of-the-envelope calculations, Greg Crane argued in 2010 that classicists now allocate less than 5% of their labour to curatorial work (using the term in its broadest sense). That sounds about right for the departments of English and History that I know something about. Individuals within a field can make choices that are professionally and economically sensible in themselves but that lead the field as a whole astray. The steel industry of the seventies or the current monoculture of corn in Iowa come to mind.

A decade ago Jerry McGann observed that “in the next fifty years the entirety of our inherited archive of cultural works will have to be re-edited within a network of digital storage, access, and dissemination” (quoted from his essay “Our Textual History” in the TLS, Nov. 20, 2009). This digital migration has so far made slow progress. The integrity of an emerging cyberinfrastructure for text-centric scholarship has received remarkably little attention in the discourse of disciplines that will increasingly rely on digital surrogates of their primary sources. The current buzz about ‘Digital Humanities’ or ‘DH’ has very little to do with serious work on that front.

Back to the EEBO-TCP corpus and the ~45,000 texts (~2 billion words) that have so far been transcribed.  EEBO-TCP will serve as the de facto documentary infrastructure for much Early Modern scholarship, accessed increasingly via mobile devices that provide each scholar with his or her own “table of memory.”  Montaigne had a couple of thousand books in his tower library.  A little more than two years from now, graduate students will be able to load 25,000 books from Montaigne’s world (and beyond) onto their Apple, Google, or Samsung tablets as epubs or raw XML files.

“How bad is good enough” when it comes to the quality of those texts? A lot of work needs to be done if you believe, as I do, that a digital surrogate with any scholarly ambitions should at least meet the standards we associate with good enough editions in the print world (I am ignoring here the additional features required to make the digital surrogate fully machine actionable).  There are two interesting properties of the TCP corpus that affect the discussion of data curation and quality assurance. Both of these have analogues in other large collections of primary materials. In fact, the TCP archive exhibits characteristic features of the large-scale surrogates of printed originals that will increasingly be the first and most widely consulted sources.

First, the TCP is published by a library. Second, in a collection of printed books, the boundaries between one book and another or one page and another impose physical barriers that constrain what you can do within and across books or pages.  In a digital environment, these constraints are lifted for many practical purposes. You can think of and act on the current TCP archive as 45,000 discrete files, 2 billion discrete words, or a single file.  This easy concatenability is the major reason for the enhanced query potential of a full-text archive. It also has the potential for speeding up data curation within and across individual texts.

If you come across a simple error in a book it is usually a matter of seconds to correct it in your mind. It takes much longer to correct it for other readers of the book. You must provide the correction in a review or write to the author or publisher. The publisher must incorporate it into a second edition, and libraries must buy that second edition before the corrected passage is propagated to readers at large. That is a typical form of data curation in a world where the tripod of cultural memory rests on the actions of scholars, publishers, and librarians. In a digital world that tripod rests on the interactions of scholars, librarians, and technologists. In a well-designed digital environment scholars (and indeed lay people of all stripes) can directly and immediately communicate with the library/publisher. If I work with a text and come across a phenomenon requiring correction or completion I can right away do the following:

1. Log in (if I’m not logged in already) and identify myself as a user with specified privileges.
2. Select the relevant word or passage and enter the proposed correction in the appropriate form.

If I do not have editorial privileges, my proposal is held for editorial review. If I am authorized to make or approve corrections my proposal is forwarded for inclusion in the text either immediately or (the more likely scenario) the next time the system is re-indexed. The system automatically logs the details of this transaction in terms of who did what and when.
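The workflow just described can be sketched as a minimal curation-log record. This is an illustrative schema only; the field names and status values are my assumptions, not the TCP’s or any library’s actual format:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical schema for one curation-log entry: who proposed what
# change, where in which text, and when. The status mirrors the
# editorial-review step described above.
@dataclass
class CurationEntry:
    text_id: str               # e.g. a TCP-style identifier such as "A12345"
    locus: str                 # word or passage being corrected
    old_reading: str
    new_reading: str
    proposer: str
    has_editorial_privileges: bool
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

    @property
    def status(self) -> str:
        # Privileged users' proposals go straight to the indexing queue;
        # everyone else's are held for editorial review.
        return ("queued_for_indexing" if self.has_editorial_privileges
                else "pending_review")

entry = CurationEntry("A12345", "p. 17, l. 4", "wlll", "will",
                      "reader@example.org", has_editorial_privileges=False)
print(entry.status)  # pending_review
```

The point of such a record is that the “who did what and when” is captured automatically as a by-product of the correction itself, not as extra paperwork.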

The obstacles to such an environment are not primarily technical or financial. They are largely social. You need substantial adjustments in the ways scholars and librarians think about their roles and relationships. Scholars often complain about the shoddiness of digital resources, but if  they want better data they must recognize that they are the ones who must provide them.  And they need to ask themselves why in the prestige economy of their disciplines they have come to undervalue the complexity and importance of “keeping” (in the widest sense of the word) the data on which their work ultimately depends. Librarians need to rethink the value chain in which the Library ends up as a repository of static data. Instead they should put the Library at the start of a value chain whose major component is a framework  in support of data curation as a continuing  activity by many hands in many places, whether on an occasional or sustained basis. Such a model of collaborative data curation is the norm in genomic research, a discipline that from the perspective of an English department can be seen as a form of criticism (both higher and lower) of texts written in a four-letter alphabet.

Some of the best thinking on these issues has come from Greek papyrologists,  a very special scholarly club with highly specialized data, tools, and methods, but with some good lessons for the rest of us.  Papyrologists have for a century kept a Berichtigungsliste or curation log as the cumulative and authorized record of their labours. The Integrating Digital Papyrology project (IDP) is based on the principle of  “investing greater data control in the user community.”  Talking about the impact of the Web on his discipline, Roger Bagnall said that

these changes have affected the vision and goals of IDP in two principal ways. One is toward openness; the other is toward dynamism. These are linked. We no longer see IDP as representing at any given moment a synthesis of fixed data sources directed by a central management; rather, we see it as a constantly changing set of fully open data sources governed by the scholarly community and maintained by all active scholars who care to participate.

He faced head-on the question: “How … will we prevent people from just putting in fanciful or idiotic proposals, thus lowering the quality of this work?” and answered that collaborative systems

are not weaker on quality control, but stronger, inasmuch as they leverage both traditional peer review and newer community-based ‘crowd-sourcing’ models. The worries, though, are the same ones that we have heard about many other Internet resources (and, if you think about it, print resources too). There’s a lot of garbage out there. There is indeed, and I am very much in favor of having quality-control measures built into web resources of the kind I am describing.

A collaboratively curated Berichtigungsliste or curation log offers an attractive model for coping with the many imperfections of the current TCP texts.  The work of many hands, supported by clever programmers, quite ordinary machines, and libraries acting consortially, can over the course of a decade substantially improve the TCP texts and move them closer to the quality standards one associates with good enough editions in a print world.  Imagine a social and technical space where individual texts live as curatable objects continually subject to correction, refinement, or enrichment by many hands and coexist at different levels of (im)perfection.  You could also imagine a system of certification for each text — not unlike the USDA hierarchy of grades of meat from prime to utility.   But “prime” would always be reserved for texts that have undergone high-quality human copy-editing.  Such a system would build trust and would counteract the human tendency to judge  barrels by their worst apples.

What I have said about collaborative curation of the TCP texts applies with minor changes to other archives. Neil Fraistat and Doug Reside in conversation coined the acronym CRIPT for “curated repository of important texts”. Not everything needs to be curated in that fashion, but high degrees of curation are appropriate for some texts, whether for their intrinsic qualities or their evidentiary value. Large consortial enterprises like the HathiTrust or the DPLA might be the proper institutional homes for special collections of this type. Somewhere in the middle distance I see the TCP collection as the foundation of a Book of English defined as

• a large, growing, collaboratively curated and public domain corpus
• of written English since its earliest modern form
• with full bibliographical detail
• and light but consistent structural and linguistic encoding

It will take a while to get there. It is a lot of work, and like woman’s work, it is  “never done.” But progress is possible.  Here is the challenge of the next decade(s)  for scholarly data communities and the libraries that support them: put digital surrogates of your primary sources into a shape that will

  1. rival the virtues of good enough editions from an age of print, and
  2. add features that will allow scholars to explore the full query potential of the digital surrogate.

I use “good enough” in the sense Donald Winnicott used it when he argued against a generation of psychoanalysts who were fond of blaming the mother. He defined a quite modest level of maternal competence. Going beyond it would not add a lot, but dropping below it would get bad very fast. Much of the digital and increasingly dominant version of our textual heritage will require a fair amount of mothering before it is clearly good enough.

Postscript: A Decade later

The 2012 talk sketched an ambitious agenda. Here are some facts about the modest progress we have made since then. Between 2013 and 2015 about 20 students from Amherst, Northwestern, and Washington University in St. Louis made over 50,000 corrections in some 510 Early Modern plays. The number of textual defects per 10,000 running words is a crude but telling measure of how close texts come to being “good enough” for many purposes. The median rate of uncorrected drama texts in the TCP corpus was 14.5 defects per 10,000 words. The work of these students reduced the defect rate by an order of magnitude to 1.4–an improvement visible to a casual reader.

In 2016 Shakespeare His Contemporaries became part of EarlyPrint, a more broadly based enterprise doubly centered at Northwestern and Washington University in St. Louis. EarlyPrint currently has close to 60,000 texts. Most of them come from EEBO-TCP, but there are also ~4,500 Early American texts (Evans TCP) and ~2,000 English 18th-century texts (ECCO TCP). The EarlyPrint versions of these texts are linguistically annotated and can be searched via a corpus query engine, and close to 650 texts are “digital combos” that offer a side-by-side display of the text and high-quality digital images. Their technical infrastructure supports collaborative curation by anybody anywhere.

The acronym “FAIR” describes data that meet standards of findability, accessibility, interoperability, and reusability. It is a more elaborate version of Ranganathan’s fourth law of library science: “Save the time of the reader”. It summarizes an ethos well captured by Brian Athey, chair of Computational Medicine at Michigan, when he said at a conference about “research data life cycle management” that “agile data integration is an engine that drives discovery”. For Early Modern studies in the Anglophone world, the creation of the TCP archives has been the most monumental achievement. An important goal of EarlyPrint has been to make those texts FAIRer.

Since 2017 anybody with a computer and an Internet connection has been able to offer textual emendations via the EarlyPrint Annotation Module. It makes textual correction as easy as writing in the margin of a book. I call it “curation en passant”. As soon as a reader enters and saves a correction in a little data-entry field next to the text, the who, what, when, and where of the correction are automatically recorded in a central curation log. Emendations are provisionally displayed in the text, but their final integration into the source text is subject to editorial review.

A “digital combo” and a computer with a good screen (larger is better) provide a user with a better than good enough text lab for many basic–and some not so basic–forms of philological labour. Work of this kind follows a “find it, fix it, log it” pattern, where the finding and the logging typically take more time than the fixing. The EarlyPrint environment significantly reduces the time cost of “finding” and turns the “logging” into an automated process.

Plays have continued to be a source of special interest. In the EEBO-TCP corpus as a whole the interquartile range of defects per 10,000 words lies between 1 at the 25th and 48 at the 75th percentile, with a median of 12. Today the values for 814 plays in the EarlyPrint corpus range from 0 to 16.2, with a median of 2.8 defects. In this important subcorpus of Early Modern texts a quarter of the texts have no defects, half of them have at most five defects per play, and three quarters of them have on average at most one defect per page. In practice, defects cluster heavily. Most of the remaining defects in EarlyPrint plays occur on the pages of texts for which there are currently no good digital surrogates on the Internet.
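Quartile figures of this kind can be computed with the Python standard library; the defect rates below are toy values for illustration, not the actual corpus data:

```python
import statistics

# Toy per-text defect rates (defects per 10,000 words).
rates = [0, 0, 1, 2, 2.8, 3, 5, 12, 16.2, 40, 48]

# statistics.quantiles with n=4 returns the 25th, 50th, and 75th
# percentile cut points (using the default "exclusive" method).
q1, median, q3 = statistics.quantiles(rates, n=4)
print(q1, median, q3)  # 1.0 3.0 16.2
```

The median and interquartile range are better summaries than the mean here precisely because, as noted above, defects cluster heavily in a minority of texts.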

Last summer, three Classics majors at Northwestern tackled a corpus of 120 medical works. These were in much worse shape than the plays. The three figures for the interquartile range before curation were 14.7, 53.2, and 156.3. The students made about 20,000 corrections and reduced the interquartile range to 3.3, 13.3, and 30. They paid special attention to Thomas Cogan’s Haven of Health, a characteristic late-16th-century work. With the help of better images from the Countway Library of Medicine (via the Internet Archive) and of a machine-learning experiment they corrected more than 1,000 defects. Its two dozen remaining defects (1.8 per 10,000 words) are philological “cruxes” that have so far defied solution. But the EarlyPrint version of The Haven of Health is sound enough for most purposes.


The project has also demonstrated that motivated undergraduates with an interest in Early Modern texts can easily be trained to do most of this low-level but essential philological labour. I close with the reflections of two students on their work in this project. In the summer before her senior year in 2013 Nicole Sheriko was a member of the first collaborative team that worked on what was then called Shakespeare His Contemporaries. She went on to graduate school, working at the “intersection of literary criticism, cultural studies, and theatre history”, has published a handful of essays in leading journals, and is now a Junior Research Fellow at Christ’s College, Cambridge:

Having the experience of working with a professor on a project outside of the classroom–especially in digital humanities, which everyone seems to find trendy even if they have no idea what it entails–was a vital piece of my graduate school applications, I think, and other students may see a similar benefit in that.


In a less vulgarly practical sense, though, I would say that working on what was then Shakespeare His Contemporaries made a significant difference in how I approach studying the field of early modern drama. The typical college course can only focus on a handful of canonical texts but working across such an enormous digital corpus reoriented my sense of how wide and eclectic early modern drama is. It gave me a chance to work back and forth between close and distant reading, something I still do as I reconstruct the corpus of more marginal forms of performance from references scattered across many plays. A lot of those plays are mediocre at best, and I often remember a remark you once made to us about how mediocre plays are so valuable for illustrating what the majority of media looked like and casting into relief what exactly makes good plays good. The project was such a useful primer in the scope and aesthetics of early modern drama. It was also a valuable introduction to the archival challenges of preservation and digitization that face large-scale studies of that drama. Getting a glimpse under the hood of how messy surviving texts are–both in their printing and their digitization–raised all the right questions for me about how critical editions of the play get made and why search functions on databases like EEBO require a bit of imaginative misspelling to get right. That team of five brilliant women was also my first experience of the conviviality of scholarly work, which felt so different from my experience as an English major writing papers alone in my room. That solidified for me that applying to grad school was the right choice, a sentiment likely shared by my teammate Hannah Bredar, who–as you probably know–also went on to do a PhD. 
Once I got to grad school, the project also followed me around in my first year because I took a course in digital humanities and ended up talking a lot about the TCP and some of the little side projects I ended up doing for Fair Em, like recording the meter of each line to see where breakdowns occurred. I even learned some R and did a final project looking for regional markers of difference across the Chronicling America historical newspaper corpus. So, in big ways and small, the work I did at NU has stayed with me.

After her freshman year, in the summer of 2022, Lauren Kelley was part of the team that worked on medical texts. In the section about “Academic and Personal Development”  in the final report to the College she wrote this:

    As a premed student, spending the summer learning about the historical tradition of Western medicine has been incredibly valuable. Reviewing the medical corpus allowed me to understand how the field of medicine has evolved throughout the early modern period, and to track the gradual development of knowledge and medical practice. Although the vast majority of knowledge in these books is outdated, the true value of this summer’s work lies in acquiring an intimate understanding of medical history from primary documents, as well as learning how to better interpret and analyze texts from this period. I also enjoyed having the opportunity to use my knowledge of Latin in a setting outside of the classroom, which reinforced the importance of studying Classics and its multitude of applications. Writing the final report for Haven of Health was an especially fulfilling experience that stimulated my academic growth; I had the opportunity to synthesize my observations throughout the 8 weeks and expand on them, as well as research a subject of interest to me and write about it.


Having just finished freshman year, this summer was the first experience that I have had with collegiate research. It was extremely enriching for me to spend eight weeks in a collaborative, research-oriented environment. I feel that coordinating several aspects of my work with my coworkers has vastly improved my teamwork skills. Finally, my confidence in my own academic abilities has increased, especially in my ability to apply knowledge in a real-world setting. Overall, this opportunity was a great introduction into how research is performed in humanities, and I am excited to further develop the skills I acquired this summer throughout my academic career.


Thou com’st in such a questionable shape: Data Janitoring the SHC corpus from the perspectives of Hannah, Kate, and Lydia


Below are the reflections of Hannah Bredar, Kate Needham, and Lydia Zoells about their adventures in the mundane world of Lower Criticism, about which I wrote in an earlier blog and of which the digital surrogates of our cultural heritage will need a lot in the decades to come. Racine observes in his preface to Bérénice that toute l’invention consiste à faire quelque chose de rien (all invention consists of making something from nothing). These three “inventors”, after spending much time with commas and stray printers’ marks, came up with excellent insights into the business of criticism and the (un)certainties of making sense of texts, especially old ones.

Kate and Shakespeare’s scepticism

Is this a comma that I see before me,
Its tail hanging down? …Or art thou but
A comma of the mind, false punctuation
Proceeding from the text-oppressed brain?

Correcting transcriptions can sometimes feel like banging one’s head against a massive, impermeable wall. As often as I made a definitive correction, it seemed, I came across something that appeared irresolvable. Is this a period or a badly printed comma? A misaligned end-stop or the remnant of an intended colon? How long can I stare at it before I realize I will never know? We (undergraduates like me, new to the instability of early modern texts) arrive with a conception of textual clarity and authenticity that in many cases is simply not there. Some cases might be answered by looking at more books, more witnesses, visiting more libraries over more hours, but this isn’t conducive to curating an entire literary corpus for digital publication. And even were we to fully collate every text in the database, some of these questions might never be resolved. This means making peace with the unsatisfactory text and setting our aims somewhere less idealistic: closer to “good enough.” We turn towards clarity, functionality, and truthfulness to the text without forcing on it a definitiveness it does not have in every instance.

In a previous post on this blog (“How to fix 60,000 errors”, June 22, 2013), Prof. Mueller noted that the original 60,000 known errors in the SHC transcriptions constituted just 0.4% of the data in the database. That number is statistically insignificant for computer analysis of the texts, but even a cursory look at the transcriptions themselves confirms that the presence of so many errors is prohibitive to human readers, whatever the statistical significance. Making corrections at this level was our aim this summer, to help propel the transcriptions from masses of computer data to texts readable (and enjoyable) for people. For those who hail (with trepidation) the digital humanities as the end of reading and human response, our work is a reminder that digital texts and projects are ultimately designed with human readers in mind. Our sense of “good enough” is governed not by statistical significance but by the demands of human persnickety-ness, of the desire to immerse oneself in a text that at least appears to be “complete.”

Out damn’d ink blot! Out I say

Why should Macbeth be the play that lends itself most easily to (admittedly quite silly) comparisons with this work? When the instability of the source texts themselves obstructs our own desire for authority, how do we respond? What degree of alteration would be considered “murthering” the text, and how do we square our conscience with these, arguably inescapable, choices about what to transcribe, what to make more legible, and what to leave as crux? This might feel oddly dramatic as written here, but the experience of sitting face to face with a 16th-century book, of making choices about how that text is transmitted and transcribed, feels something akin to tragedy for the conscientious and affectionate reader. And while this must be old hat for those who work with these texts every day, it was entirely new to me. The cruxes I’ve described did not represent a majority of the errors we examined, but they are the ones that stick out in my memory, that solicited a sense of deep frustration strangely at odds with the silent stillness of the reading room. Yet more powerful than this frustration was the feeling of awe at these texts that had, somehow, survived—survived fire and flood and most of all indifference to sit before me, open and ready to survive once more.

Hannah’s Folger Reflections

Washington waxed feverish outside the walls of the Folger Shakespeare Library, but a different atmosphere persisted within. The rooms were chill, verging on icy; the wool-clad scholars were, wittingly or otherwise, alert. I sat with Lydia Zoells and Kate Needham in what Folger regulars call the New Room (ca. 1980), attracted to its abundant natural light. We were there to perform a task: in the course of two weeks, we intended to correct the maximum number of the remaining 20,000 errors in the database of early modern plays transcribed by Annolex. This was Phase Two of the Shakespeare His Contemporaries project, in collaboration with the greater Text Creation Partnership initiative. Previously, Professor Mueller had enlisted a handful of undergraduates, including myself, to check Annolex’s transcriptions of texts against the corresponding EEBO images. Unfortunately, because these images were microfilm photographs of other pictures, the quality was often too poor to ascertain whether a mark on a page was an exclamation point or an erroneous blot of ink.

Lydia, Kate, and I convened at the Folger in order to determine if the original manuscripts housed there could illuminate any of the troubling instances that the digital tools could not. Previously, we had employed this brass-tacks method of cross referencing on an individual basis, adding the Bodleian Library, the University of Chicago Library, the Northwestern University Special Collections, and the Newberry Library to our list of visited sanctums. The Folger, however, seemed to hold the key to our transcription puzzle. We placed orders for over 50 texts, all of which the Library had in its vaults. Our work did not cease: we did not halt our editing when a tourist set off a fire alarm, and we only glanced up when specialists hung Henry Fuseli’s life-sized painting of the Macbeth witches on the facing wall.

As the days progressed we saw that most of the errors that we were correcting were ambiguous punctuation marks. In former phases of the project it was far easier to discern the meaning of a single word from its context clues than it was to determine whether a faint mark was a semicolon or a comma from its context alone, so the punctuation remained uncorrected. Even at the Folger it was often too difficult to identify such a mark with total certainty. Thus, we faced a recurring dilemma: do we leave the error uncorrected and the play incomplete, or correct the error to the best of our thinking and risk changing the text? This conflict inspired a number of conversations about the ethics of guessing at such a correction and the chance of accidentally transforming a text from its original form. Occasionally, Folger staff and scholars would join our conversations. They cited the movement in the 18th century to “improve” manuscripts such as these, when scholar-editors would add apostrophes and commas with prodigal liberalism in the hopes of clarifying an author’s “intended” rhythms and cadence. Inferring authorial intent of sixteenth century punctuation, when standard punctuation did not exist, was not only impossible but also a time sink, which we could not afford. One wise Folger staff member suggested that at some point an editing effort could be “good enough” and a text set aside.

These conversations were tea time discussions. Each afternoon and with charming inconsistency, a bell would ring: scholars would file out of the reading rooms, descend the stairs to the cafeteria, and revive as they nibbled biscuits and sipped steaming mugs. After witnessing a few days of the animation and conversation that arose during these mid-afternoon gatherings, I realized that tea time was crucial to intellectual life at the Folger. It was here that readers shared with one another their findings, their theories, and their academic mirth. Based on a mutual interest in English breakfast tea and early modern books, a community of scholars took shape. The Shakespeare His Contemporaries project strives to broaden this community. With free access to a cleaned-up database of early modern texts, a greater public can in turn discuss “moral editing,” the risk of drawing a text away from its original form, and the concept of work that is “good enough.” By adding more voices to these conversations, the worlds of both early modern literature and digital humanities will have the opportunity to complicate, broaden, and flourish.

Lydia on the Materiality of the Text

As my undergraduate career has progressed, I have become increasingly aware of, and fascinated by, the material nature of books, largely because my studies tend toward literature written before 1700. For a long time, like most people, I took textual stability for granted and never thought about where, or rather what, books came from. But slowly, I became acquainted with EEBO, started reading textual introductions, and began to seek out classes that considered the materiality of texts. In my junior year, I took part in Professor Joseph Loewenstein’s Spenser Lab, joining his project to produce an edition of the collected works of Edmund Spenser and diving headfirst into that rich area of interaction between the digital humanities and book history. When Professor Loewenstein suggested that I become involved with Professor Mueller’s project, Shakespeare His Contemporaries (SHC), I agreed because I was excited by the opportunity to work with early modern books in person as well as to contribute to early modern scholarship in a meaningful way.

Between April and July 2015, sometimes with Kate and Hannah, sometimes alone, I corrected transcriptions using the first-edition play texts at the University of Chicago Special Collections Research Center, the Newberry Library, the Folger Shakespeare Library, and the Houghton Library at Harvard University. Becoming comfortable working in these libraries and handling the delicate books was certainly one of the most valuable parts of my experience. The librarians were very accommodating and patient when it came to instructing a novice in the delicacies of handling the texts, and soon I was at ease with the books and with my surroundings. Each library I visited had its own atmosphere, and each one was a pleasure to get to know (though they were all kept at arctic temperatures). The books themselves offered their own special pleasures. I enjoyed finding the classified advertisements pinned inside front covers, engravings of stiff-looking authors, and the odd annotations left by early readers.

While the work of tracking down and entering punctuation marks, letters, and words was in large part tedious, it would sometimes bring me in contact with interesting passages. One of the great pleasures of working with colleagues who have a similar enthusiasm for early modern theater is that we often shared these moments with one another. This kind of work does not lend itself to a depth of understanding of the body of literature with which we were working, but I do believe that splashing in the pool that is the SHC corpus is valuable at this point in our undergraduate careers. We gained a kind of broad familiarity with the early modern dramatic corpus, and often found plays that interested us that we had not known existed.

It is important to me that our project this summer will contribute to the dissemination of quality transcriptions of early modern plays, especially of little-known works. It was exciting when a correction I made felt meaningful: when it made a significant semantic difference in the text, or when it brought up an interesting question. It is my hope that these transcriptions will continue to be questioned and checked, but also that they will make the plays easier to read and more transparent for scholars and students. I have often been frustrated by the difficulty of finding good copies of less canonical plays, and making reliable transcriptions publicly available is a good start.