EarlyPrint is a collaborative effort—centered doubly at Northwestern University and Washington University in St. Louis—to transform the early English print record, from 1473 to the early 1700s, into a linguistically annotated and deeply searchable text archive. Its leaders have been Joseph Loewenstein and Martin Mueller. Contributions to the project have been made by faculty, librarians, IT professionals, and students at Amherst, Northwestern, Notre Dame, Nebraska-Lincoln, Tübingen, and Washington University in St. Louis, notably Anupam Basu, Craig Berry, John Ladd, Philip Burns, Douglas Knox, Stephen Pentecost, Kate Needham, Elisabeth Chaghafi, Peter Berek, Tracy Bergstrom, Daniel Johnson, Eric Lease Morgan, Hannah Bredar, Brian Pytlik Zillig, and Lydia Zoells.


Martin Mueller, co-PI, taught at Brandeis University (1965-67) and the University of Toronto (1967-76) before moving to Northwestern University, where he taught until his retirement in 2013. His primary research field has been the uses of ancient epic and tragedy by European writers since the Renaissance. He has also written on Homer and Shakespeare. More recently he has become interested in the uses of information technology for traditional philological inquiries. Together with Ahuvia Kahane, he is the editor of The Chicago Homer, a multilingual web site that uses the search and display capabilities of digital media to make distinctive features of Early Greek epic accessible to readers with and without Greek. He is also the general editor of WordHoard, an application for the close reading and scholarly analysis of deeply tagged texts.

Joseph Loewenstein, co-PI, specializes in the relation between the book trade and the field of literary practice. He is the author of Ben Jonson and Possessive Authorship, The Author’s Due: Printing and the Prehistory of Copyright, and Responsive Readings: Versions of Echo in Pastoral, Epic, and the Jonsonian Masque. Professor Loewenstein was a contributing editor to The Cambridge Ben Jonson and is one of the editors of the Oxford Edition of the Collected Works of Edmund Spenser. He directs the Humanities Digital Workshop at Washington University in St. Louis as well as the university’s Interdisciplinary Project in the Humanities.

Anupam Basu is an assistant professor of English at Washington University in St. Louis. An early-modernist working on print culture and drama, he has increasingly succumbed to the seductions of scale as he develops techniques to make the entire EEBO-TCP corpus tractable for search and analysis. Anupam has used the data behind EarlyPrint to explore the standardization of English orthography and Spenser’s archaism. He is currently working on a monograph on form and scale that asks how we might rethink literary forms through computational analysis. He has also published on the representation of poverty, vagrancy, and criminality in popular literature.

John R. Ladd is a postdoctoral fellow at the Kaplan Institute for the Humanities at Northwestern University, where he uses archival research and digital methods to investigate the ways early modern material practices and social networks shape literary forms. His current project, Network Poetics: Studies in Early Modern Literary Collaboration, reflects these interests by arguing that despite the rise of authorial individuation throughout the early modern period, collaborative, networked forms of writing persist and continue to take new shapes even after the Restoration. John’s writing has appeared in Milton Studies, The Spenser Review, and The Programming Historian, and he previously worked as research fellow for the Six Degrees of Francis Bacon digital project.

Craig Berry is a Software Project Manager and independent scholar who holds a Ph.D. in English from Northwestern University. He has published numerous articles, mostly on Chaucer and Spenser, one of which won the 1995 Isabel MacCaffrey Prize from the International Spenser Society. He has served as Secretary-Treasurer of the Spenser Society and is currently Digital Projects Editor at The Spenser Review. His digital work includes contributions to The Chicago Homer and preparing the text of The Variorum Spenser for the WordHoard database. He currently maintains the software behind the EarlyPrint Library.

Philip R. “Pib” Burns possesses fifty years of experience with humanities computing, natural language processing, databases, statistical analysis guidance, and computer programming in support of research and educational activities. In addition to EarlyPrint, he has participated in the Mellon-funded WordHoard, Monk, MorphAdorner, and Bamboo humanities computing projects. Philip’s educational background is in mathematics, computer science, and English literature. His MorphAdorner suite provides the bulk of the natural language processing for EarlyPrint.

Douglas Knox is assistant director of the Humanities Digital Workshop at Washington University in St. Louis. He has worked with dozens of projects across disciplines including history, literature, classics, art history, and music. He enjoys thinking creatively about exploratory data analysis and critically about methodology. Before coming to Wash U he directed digital publication projects at the Newberry Library in Chicago. He was managing editor of the Encyclopedia of Chicago and led the project that created a full-text digital version of the Chicago Foreign Language Press Survey.

Stephen Pentecost is Senior Digital Humanities Specialist at Washington University in St. Louis, where he supports projects in textual editing, archive creation, textual criticism, and musical composition.

Kate Needham is a Ph.D. candidate in English Literature at Yale University studying early modern English translators. She began working on the project as an undergraduate research assistant and has since led undergraduate workshops on the EarlyPrint Library, supervised teams of summer interns, and planned classroom activities using EarlyPrint tools.

Additional Credits

Each text is derived from an EEBO-TCP transcription. Most of the texts come from the TCP Phase 1 project. ProQuest graciously gave permission to add some three dozen plays from TCP Phase 2 to the Shakespeare His Contemporaries Project, a pilot effort now subsumed within EarlyPrint. All texts in this corpus are covered by a Creative Commons Attribution-NonCommercial 3.0 Unported license. See the short summary of how the license affects your use of the texts.

The texts were converted from their original SGML format to TEI P5 with Abbot, written by Brian Pytlik Zillig and Stephen Ramsay at the Center for Digital Research in the Humanities at the University of Nebraska-Lincoln.

All of the texts underwent collaborative curation by undergraduates. At Northwestern these included Nayoon Ahn, Hannah Bredar, Madeline Burg, Nicole Sheriko, Melina Yeh, Sally Moore Hausken, Irina Huang, Yue Hu, Ashley Guo, Anelia Kudin, and Katherine Elizabeth Poland. At Washington University in St. Louis the curators were Kate Needham and Lydia Zoells, who learned much about the editing of Early Modern texts in Joe Loewenstein's Spenser Lab. These students used, and were instrumental in the design and refinement of, the collaborative curation tools Annolex and Library Finder (see below). Peter Berek at Amherst College directed three students, Heejin "Gabby" Ro, Yixin "Arthur" Xiao, and Keren Yi. Over a period of three weeks in January 2016 they corrected many textual defects in 118 plays, consulting printed originals at Smith College and the Houghton Library.

EarlyPrint’s improvements to the EEBO-TCP archive would not be possible without the time and effort of the many contributors who have submitted annotations and corrections via the EP Library site.

Craig Berry designed and built the correction tool Annolex. Annolex was a prototype curation tool now superseded by the annotation features of EarlyPrint.

The texts were tokenized and linguistically annotated with MorphAdorner, a Natural Language Processing toolkit developed by Philip R. Burns at Northwestern University. Burns also developed the website for Shakespeare His Contemporaries using the TEI Simple PM toolkit written by Wolfgang Meier.

We are grateful to the Andrew W. Mellon Foundation for funding various aspects of the EarlyPrint project and in particular the Annotation Module, which was built by eXist Solutions GmbH.

Work bearing on EarlyPrint has been generously supported by five grants from the Andrew W. Mellon Foundation and a Digital Extension grant from the American Council of Learned Societies.



All texts in the EarlyPrint Library corpus are covered by a Creative Commons Attribution-NonCommercial 3.0 Unported license. See the short summary of how the license affects your use of the texts. If you are interested in licensing the texts for commercial use, please contact Martin Mueller at martinmueller@northwestern.edu.


The EarlyPrint Library site builds upon the TEI Simple PM software written by Wolfgang Meier. That software, and the modifications made at Northwestern University, are dual-licensed as follows.

  1. Distributed under a Creative Commons Attribution-ShareAlike 3.0 Unported License http://creativecommons.org/licenses/by-sa/3.0/

  2. Distributed under the BSD 2-Clause License http://www.opensource.org/licenses/BSD-2-Clause

All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the copyright holder or contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.

Bugs and Known Limitations

The current release of the Library site has some bugs and weaknesses.

  • When generating a PDF or EPUB, you cannot exclude the machine-generated castlist or the list of corrections, nor can you extract particular acts or scenes.
  • When searching for a phrase by enclosing the search terms in quote marks, you may not find all the instances because the standard spelling may intrude between the words you specify. Select the proximity search type instead.
  • At present you cannot search for lemmata or parts of speech even though these are encoded in the texts.

We expect to correct these deficiencies as time goes on.

Please report any other bugs you find to editors@earlyprint.org