The Perdita Manuscripts Database is as useful a political and scholarly argument about the value of manuscripts as it is an archival or preservational tool. The database hosts hundreds of manuscripts written by early modern women, and does so not simply for fear of a potential loss of the texts due to natural disaster or library space reduction. Instead, the foundation of the project is predicated on an argument that these manuscripts were already being erased through devaluation related to both the gender of their authors and the format of the texts. This is not to imply Perdita represents a perfect resource; indeed, this review explores several elements of the site that render it a challenging tool limited to a specific audience. However, while further development of metadata and enhanced technology may eventually improve the data presented on the page, the project itself represents an important mobilization of the digital realm as a critique of existing forms of knowledge rather than a simple regeneration of them.

            The Perdita Project was founded in 1997 as a response to a simple but fundamental lack experienced by scholars working on early modern women’s manuscripts: the ability to organize, find, and access the manuscripts themselves. Existing resources tended to privilege not only male canonical writers, Jill Millman argues, but also the format of the book, which was viewed as “authoritative and final” in contrast to the “private, tentative and ephemeral” versions of texts in manuscript form (“The Perdita Project Catalogue”). As a result, the writings of early modern women, which tended to exist primarily in manuscript format and largely outside the canon, were quite literally lost, often buried in “record office catalogues and card indexes” (“Catalogue”).

            The eventual response to this lack was the creation of the Perdita database, which currently houses over 230 manuscripts written by women in the British Isles between 1500 and 1700. The site’s searching aid facilitates browsing through a variety of lenses; users can view texts based on author names, source locations, first lines, and genre. The last of these is a particularly interesting and extensive collection which, through its inclusion of categories such as “advice”, “receipt” and “culinary writing”, expands the often limited and gendered definition of genre itself and calls for emphasis on preserving and encountering a broader variety of texts. The advanced search function allows keyword searches to be refined using the same categories listed above in addition to Boolean operators.

            Though such resources are not yet consistently supplied for every text, additional information is also provided about many manuscripts on the database. The most detailed of these physical descriptions inform the user about the form, support, extent, hand, binding, and condition and acquisition histories of the manuscript being viewed. Biographies of figures featured on the site are similarly useful but not yet fully developed. A 2009 review by A.B. Johnson notes the absence of a biography for Queen Elizabeth I, for instance, and as of this review the entry for this important and well-known monarch remains blank. However, the ability to search the database’s additional information, incomplete though it is, represents an important attempt to, as Jill Millman writes, “unlock the potential of a catalogue in ways that would be difficult or impossible in print” (“Catalogue”).

            Where the site’s usability is most limited is, somewhat ironically, the viewing and manipulation of the manuscripts themselves. It is true that the search function allows for the location of keywords on specific pages. However, the unavailability of a full-page transcription combined with the difficulty of reading many of the texts, even on a medium-large sized screen, means these ‘lost’ documents, though now found and centrally located, remain inaccessible to many users. A person must log on to the site with a clearly formed conception of exactly what they hope to find, and possess equipment advanced enough to allow for on-screen examination of texts that are often extremely difficult to view. Johnson’s review, then, which concludes that the audience of the database is necessarily limited through its format to “graduate students and faculty/researchers” (“Perdita Manuscripts”), is an assessment that is unlikely to shift until technological developments render either the site or the manuscripts themselves easier to present and disseminate.

            Imperfect though its execution may periodically be, one can ultimately read The Perdita Manuscripts database as a partial answer to the critique cited in John Walsh’s work (which is itself referencing Derrida). That is, while digital humanities may indeed be particularly susceptible to “‘archive fever’, an inordinate amount of emphasis on textual editing and archiving at the expense of more…creative uses of technology” (“Multimedia and Multitasking”), Perdita’s very existence acts as a powerful critique of the ways in which information about the early modern period has traditionally involved the gendered and formal devaluation of manuscripts. Thus archives can, and in the context of Perdita I believe must, be read not as impartial collections of texts created purely for the sake of collection, but as broader and inherently political projects that generate and regenerate knowledge.

Early English Books Online (EEBO) is a comprehensive source of English texts covering works from 1473 to 1700. The texts originate from a number of countries including England, Ireland, Wales, and British North America. The database is the product of a partnership between ProQuest, the University of Michigan, and Oxford University. EEBO currently includes over 125,000 documents and is still being updated. The resource is composed of texts from the STC, Wing, and Thomason Tracts collections. Covering such fields as history, literature, medicine, and politics, these resources provide a diverse range of material useful to almost any field of enquiry.

EEBO’s extensive list of texts is complemented by a user-friendly interface. Enter your search keywords and you are presented with a clean layout clearly displaying the retrieved documents. With an advanced search, users can choose to view articles from a particular collection and choose region-specific articles, amongst other options. A useful tool included within EEBO’s search engine is the Variant Spellings and Variant Forms option. This broadens the range of search results by including alternate spellings of keywords. Typing in murder, for instance, will also return variant spellings such as murdir, mvrder, and murdre. It is unclear whether this option, like ECCO’s fuzzy search, also helps remedy potential OCR errors in the text. Information provided for each entry is concise and includes the author, title, date, a physical description of the object, and an image of the first page. Above the image accompanying each article are icons linking to the various ways in which the article can be viewed. By clicking each respective icon the user can view individual page images, thumbnails, illustrations, a detailed bibliography, and, where available, full text. The full text is acceptable; however, it appears to have some kinks. When clicking on the full text version of a certain document I was presented with text from an entirely different page. EEBO also allows texts to be downloaded in PDF and TIFF formats. A final note of criticism is the lack of any digital table of contents. This makes the navigation of lengthier articles a bit cumbersome, especially when there is also no full text available.
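To make the idea of variant-spelling search more concrete, here is a minimal Python sketch of one way such a feature could work. It is not EEBO’s actual method, which is not documented in this review; the function names (variant_key, build_index, variant_search) and the small word list are hypothetical. The sketch assumes that variants can be grouped by a crude key that folds u into v and drops vowels, which is far looser than a curated variant list would be.

```python
import re
from collections import defaultdict


def variant_key(word: str) -> str:
    """Reduce a spelling to a rough key shared by its early modern variants.

    Assumption for illustration only: u/v are treated as the same letter
    and vowels are ignored. This over-matches (e.g. unrelated words with
    the same consonants) and is not how EEBO is known to work.
    """
    w = word.lower().replace("ſ", "s")  # normalise long s
    w = w.replace("u", "v")             # u and v were interchangeable in early print
    return re.sub(r"[aeioy]", "", w)    # drop vowels so murder/murdir/murdre collapse


def build_index(words):
    """Group every word in a corpus word list under its variant key."""
    index = defaultdict(set)
    for w in words:
        index[variant_key(w)].add(w)
    return index


def variant_search(query, index):
    """Return all indexed spellings that share the query's variant key."""
    return sorted(index.get(variant_key(query), set()))


# Hypothetical word list standing in for a corpus index.
corpus_words = ["murder", "murdir", "mvrder", "murdre", "mother", "marble"]
index = build_index(corpus_words)
print(variant_search("murder", index))
# ['murder', 'murdir', 'murdre', 'mvrder']
```

A key this coarse trades precision for recall, which is one reason a production system would more plausibly rely on an editorially maintained table of attested variants.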

EEBO Interactions is a recently added component that appears to be EEBO’s answer to the rising popularity of social networking. With EEBO Interactions, users can engage with the resources found on EEBO, adding information or pointing out where information—such as dates or other bibliographic material—should be checked or added. The format of EEBO Interactions is structured but leaves something to be desired. When an article is viewed through EEBO Interactions, a number of compartmentalized boxes appear in which specific comments can be made. Users can, amongst other things, comment on the physical copy of the text, suggest related links, and add notes. Each of these contributions must fit into the specified box. Again, it provides a clean interface, but it does not feel very organic. It is clear that EEBO Interactions is still in its infancy. It feels disjointed from the rest of EEBO, requiring an alternate portal accessed from the EEBO home page. Hopefully with time, EEBO and its social element will become more unified. While EEBO’s effort is not as intensive as something like the NINES social interactivity suite, it is a step in the right direction. In conjunction with the continuing addition of new material into the database, EEBO is an online resource that is sure to appeal to scholars and students for some time to come.

The Reading Experience Database provides a timely home for important textual details like marginalia that reveal “a recorded engagement with a written or printed text—beyond the mere fact of possession” (“What is a ‘Reading Experience’?”). The scholarly allure of such details lies in their uniqueness; they represent the social life of texts. The RED seeks to collect subjective engagements with journals, paper fragments, and individual book objects, whether inscribed directly onto the texts themselves or recorded elsewhere, that can reveal important insights into the reading habits, moral sensibilities, and daily concerns of readers in various historical contexts. Although the RED has had an online presence since 1995 through The Open University, its purview was restricted until recently to materials from the United Kingdom between 1450 and 1945. This past February it expanded internationally, launching partner projects in Canada, New Zealand, the Netherlands, and Australia.

In many ways, this expansion could not have arrived at a better moment. Because most mammoth digitization projects like Google Books scan only one “clean” copy of any given text to supplement their online collections, they are effectively effacing many of the physical traces of reading that are crucial for scholars working in areas like print culture and literary history. They are not destroying books after they digitize them, of course, but it is becoming increasingly difficult for scholars to gain access to older texts that have already been made available in some form online. Many old or valuable texts are preserved in Special Collections, but not everything qualifies for inclusion in these collections, particularly materials published after 1923 that are still in copyright. Further, the vast majority of books that contain marginalia and other extraneous user-data are cheap, mass-produced copies that have little value from a librarian’s point of view. As shelves continue to fill up, more and more of these cheaper, low-circulation items are being locked away in off-site storage locations, which are harder for users to access. At the same time, scholars are beginning to increasingly utilize online databases like Early English Books Online (EEBO) and 18th Century Connect for their research needs. Without databases like the RED, these interrelated developments of digitization and scholarship would soon ensure that the physical traces left by the readers of millions of books in many different countries over hundreds of years would become invisible to online search queries—and, increasingly, to more labour-intensive stack-searching in circulating collections.

The RED seeks to provide a repository of transcribed scribbles, notes, court records, memoirs, letters, diaries, and other materials that can support research into “what…people read, where and when they read it and what they thought of it” (“Reading Experience?”). Documents like these provide a glimpse into the history of reading that is not available through the “lending records from libraries, or sales records from publishers and booksellers,” which only convey information about the history of the book (Crump 28). Although currently the four partner projects are still in their planning phases and UK RED is the only fully operational database, their eventual goal is to interactively display comparative search results between the five repositories to enable scholars to, among other things, “chart the reading tastes of individual readers as they travel to other countries, and consider how different environments may have influenced their reading habits” (“Welcome to RED”). This functionality will be invaluable for scholars working in areas like transatlantic modernism and diasporic literatures, and will also dovetail nicely with other well-established online thematic research collections dedicated to internationally renowned and emulated figures, such as The Walt Whitman Archive.

Because of the extraneous nature of its targeted content, the RED, unlike many other databases on the web, encourages volunteers to record and submit primary materials. Various user-driven initiatives through online scholarly communities like NINES have been slow to get off the ground, but this attempt to attract user submissions appears more likely to succeed than previous initiatives, partly because it appeals to individuals both within and beyond academic circles. The RED represents a unique opportunity to share and ensure the preservation of materials such as obscure research notes and relatives’ journals that would otherwise sit dormant in private filing cabinets. NZ RED’s first project, “Reading in WWI,” for instance, strikes a familial, patriotic chord by aiming “to collect and analyse reading experiences by and about New Zealanders, at the front, in the trenches, on the troopships, and at home” (NZ RED). In addition to submitting more labour-intensive details about specific documents, users also have the option to contribute to RED’s wiki page, where they can make suggestions about what should be digitized.

This emphasis on user-driven submission is evident in the UK RED site’s user-friendly functionality. Next to the familiar “Browse” and “Search” options there is an “Explore” button, which provides several examples of past scholarly works that have utilized evidence from the RED, as well as a series of tutorials outlining some of the various ways one might interact with the database’s materials oneself. Browsing is clearly routed through “Readers” and “Authors,” and although there is obvious overlap, the options are distinct: if I want to see what Katherine Mansfield thought of Virginia Woolf’s novel Night and Day, for example, I would “Browse by Reader” and select Mansfield. Conversely, searching for Mansfield under “Authors” reveals, among other things, Woolf’s opinion of Mansfield’s story “Bliss,” and so on. In addition to the browsing function, the basic and advanced search features from the main page provide users with both simple and highly detailed options for directing search queries.

As is, the site has two main limitations, one more pronounced than the other. The smaller limitation is that, while browsing, there is no built-in way to search or sort through the entries on any given page, besides the search function in the browser. With figures like Woolf, who has quite a few entries, this inability sometimes results in a monolithic single page of information that can only be manipulated through vertical scrolling. The larger limitation is that the content available through RED consists entirely of transcriptions; it does not provide actual scanned and OCRed images of its materials. Instead, everything is flattened into superficially coded HTML descriptions. As a result, scholars interested in visual and contextual framing details (exact location of marginalia, size and appearance of script) have to rely on the user-submitted details, which, though extensive, remain explicitly mediated. However, these weaknesses aside, and despite the fact that the usefulness of the RED will vastly increase once the four partner projects begin digitizing, UK RED already provides an invaluable resource that addresses an important niche, both in academia and the United Kingdom more generally.

Ben Gehrels
Simon Fraser University

Canadian literature is going online. As literary scholars, we have a unique opportunity to rejuvenate public interest in our national literary culture through this transition to a medium that allows for more open and democratic participation in content production. For my final project, I am beginning work on a digital edition of works by Edith Eaton/Sui Sin Far (EE/SSF), under the auspices of the Editing Modernism in Canada (EMIC) project, that will push the edges of this opportunity by integrating innovative approaches and tools from both inside and outside the academy.

My primary goal for this exercise is to explore ways to innovate on the interface design for digital editions in order to allow the reader/user to have more authority in designing his or her reading experience. This is a natural extension of the last decades’ efforts to de-centre the author in literary criticism, and Edith Eaton is a perfect candidate for a digital edition that furthers these efforts. As an author with multiple identities who has primarily been read through one limited lens (i.e. race), Eaton and her work are rich in potential re-figurations. Dr. Mary Chapman has already been working diligently to move beyond the narrow construction of Eaton as a bi-racial writer. Eaton’s known body of work, now quadrupled thanks to Dr. Chapman’s efforts, includes poems, children’s stories, love stories, humor pieces, stridently anti-racist editorials, “native informant”-style magazine features, stunt-girl journalism, sensationalized reportage of a murder case in small-town Ontario, and more. Eaton emerges from this body of work as an incredibly complex figure whose “real” perspective on the events of her time is obscured by the many different authorial identities she chose to assume.

The first step I’ve taken towards completing this project has been to install and begin customizing a Drupal site, using the free hosting provided to me as a student of BCIT (this is a temporary solution – finding longer term hosting is something I intend to work out with EMIC). Given that much of the contribution of this project will involve designing a user interface (UI) that helps represent the complexity of EE’s identity, design has been the main focus of my early work. I have quickly learned that I need to at least quadruple my estimate of the amount of time it will take to complete different aspects of the project, partly because of the challenges I’ve encountered in using Drupal.

Drupal — an open source content management system — is almost completely bare bones out of the box, but can be extended and made more sophisticated through the use of modules. For example, the user has an immediate option to create webpages via a basic Drupal install, but must create them in HTML unless she enables a WYSIWYG module. To work with images, an image editor module must also be downloaded and installed. For this project, I worked with IMCE.

The basic look of a Drupal site (fonts, colors, position of links) is determined by its theme. Again, a basic Drupal install comes with just a few options, but there are many free user-contributed themes you can download and install. Unfortunately, the theme that best matched my vision for how my site should look (Framework) repeatedly breaks the administrative control panel of my site, rendering it unusable. At this juncture, I’m expecting to have to learn how to customize my own theme, something that should add at least a week onto my timeline for completing the project.

My original goal was to have the basic structure of the site and the user interface complete as my mid-term project. In reality, I’m not even done with the homepage yet, after at least 30 hours of working on the site. I began working with an image of EE/SSF, but as the only images of her available are very low quality, I was not able, using Photoshop, to create something aesthetically pleasing with my limited graphic design skills. I finally settled on an image of a white narcissus, which inspired Eaton to take the name Sui Sin Far (apparently the Cantonese word for the flower). I anticipate that figuring out how to get the page title and the RSS logo removed from the homepage will take at least another few hours, if not more. Learning how to make the homepage image interactive will be even more work.

One lesson from this experience so far has been how quickly the more traditional scholarly elements of designing an edition can seem like a low priority. Choosing the words I wanted to use on the homepage, which will become the basis for the structure of the website, was something of an afterthought as I was rushing to create an image I could live with, and then solve the problem of how to put it in place as the homepage of the site (I ran into extensive problems here related to the default text on the Drupal homepage). Fortunately, these categories can still be changed, but it was instructive to see how quickly the scholarly aspects of the humanities can get short shrift when one is trying to conceive, design and build an entire digital project alone.

My goals for the completed site include the following:

  • Intro page to appeal to the widest possible audience
  • Brief bio pages for each of the authorial identities we’ve chosen to highlight: journalist; ethnographer; traveller; cross-dresser; storyteller; and anti-racist.
  • Suggested exercises for teachers
  • Facebook & Twitter share buttons
  • Canadian modernism timeline
  • Map showing EE/SSF’s pattern of travel and resettlement throughout her lifetime.
  • Scholarly apparatus
  • Documentation of scholarly principles
  • Google Analytics

I’m still optimistic that by the end of April 2012 I will have completed the items on this list. I’m revising my goal for the end of this term, though, to reflect what I’ve learned thus far. My aim for the end of term is simply to have the structure of the site in place, including primary and secondary navigation, and all the subpages (which will be ready but blank). This is an ambitious goal.

Testing TypeWright

I participated as one of the “power users” in the testing of the TypeWright tool hosted by In my assessment of the tool I chose a work titled Love-letters on all Occasions Lately Passed between Persons of Distinction, collected by Mrs. Eliza Haywood. I corrected the first eight pages of the document. After completing my evaluation, I submitted a report that outlined what I learned, my experience with the interface, and suggestions for future developments. The sections below contain my evaluation of the tool as well as a more comprehensive version of the report I have submitted to

For those who are not familiar with the tool, here is a Video Introduction.

TypeWright Beta Evaluation:

How much did you learn about mechanically typed text from this exercise?

To test TypeWright, I chose a work titled Love-letters on all Occasions Lately Passed Between Persons of Distinction, Collected by Mrs. Eliza Haywood. I did not realize before I started the exercise that spacing between letters would be the main issue. Starting with the first line of the book, where the OCR text for “LOVE-LETTERS” was “LOV E-LETTERS”, most of the changes on the eight pages I corrected involved deleting or adding spaces between letters and words.

I also noticed that printers often used extra spacing as part of the ornamentation of text. In my example, the publishing city, London, had an extra space between each letter. Similarly, the first word of the letter, Madam, had larger spacing between letters.
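The letter-spacing problem described here lends itself to a simple automated pre-pass. The following is a minimal Python sketch, not part of TypeWright, that collapses runs of single letters separated by single spaces back into one word. The function name and the sample strings are hypothetical, and the rule is deliberately naive: it assumes any run of three or more isolated letters is one letter-spaced word, so it cannot repair cases like “LOV E-LETTERS” and could wrongly merge genuine one-letter words that happen to sit side by side.

```python
import re

# Three or more single letters, each separated by exactly one space.
LETTERSPACED = re.compile(r"\b(?:[A-Za-z] ){2,}[A-Za-z]\b")


def collapse_letterspacing(line: str) -> str:
    """Join ornamental letter-spaced words such as 'L O N D O N'."""
    return LETTERSPACED.sub(lambda m: m.group(0).replace(" ", ""), line)


print(collapse_letterspacing("Printed in L O N D O N."))
# Printed in LONDON.
print(collapse_letterspacing("M a d a m, your letter arrived"))
# Madam, your letter arrived
```

A corrector would still need to review the result by eye, since the pattern has no way of knowing where one spaced-out word ends and the next begins.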

I also learned that, aside from spacing, mechanically typed text is remarkably varied in its use of lower and upper case letters as well as bold and cursive typefaces. Whether used for emphasis, distinction, or aesthetic reasons, almost every page of my document contained words in bold typeface, italics, small capitals, or regular capitals. Variations of the letter “s” were the second most common source of OCR errors. While OCR worked surprisingly well with some words that contained the long “s,” words like “shine” or “mistress” where the letter “s” was in close proximity to other problem letters (“h” and “t” in these cases) would almost always require correction. The doubling of the letter “s” also frequently caused multiple mistakes.
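One way to picture how long “s” errors might be flagged automatically is the minimal Python sketch below. It rests on an assumption the evaluation above does not state: that the OCR engine misreads the long “s” as “f”, which is a common failure with early print. The function name and the tiny lexicon are hypothetical, and the approach is not something TypeWright is known to do.

```python
from itertools import product


def long_s_candidates(token, lexicon):
    """Return lexicon words reachable by reading some 'f's in token as 's'.

    Every 'f' is tried both as 'f' and as 's', so the number of spellings
    checked doubles with each 'f' in the token.
    """
    options = [("f", "s") if ch == "f" else (ch,) for ch in token.lower()]
    spellings = {"".join(choice) for choice in product(*options)}
    return sorted(spellings & set(lexicon))


# Hypothetical lexicon standing in for a real dictionary.
lexicon = {"shine", "mistress", "first", "fish"}

print(long_s_candidates("fhine", lexicon))     # ['shine']
print(long_s_candidates("miftrefs", lexicon))  # ['mistress']
print(long_s_candidates("firft", lexicon))     # ['first']
```

Because genuine words with “f” remain in the candidate set, a suggestion list like this would support a human corrector without replacing one.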

Overall, I think that this exercise is valuable to any student of literature at the undergraduate or the graduate level.

Please tell us anything else you’d like us to know about your experience using the TypeWright beta.

I had some difficulty finding a TypeWright-enabled text. I wanted to work with one of the texts by Eliza Haywood. A search for “Haywood” from the TypeWright tab, where the engine searches specifically for TypeWright-enabled texts, returned 28 results. Fewer than half of these results allowed me to edit the work; out of the first 10 results only one did not take me to the error page. I was aware of the last line issue (I learned from the walkthrough video that the last line would not retain any changes if one did not return to the previous line), but on several occasions I still managed to forget to return to the previous line to make sure that the last line edits would be saved. Perhaps there should be a “check/save” button for the entire page. I also had some problems with adding and deleting lines. The section of the book that I worked on contained an ornately decorated initial capital letter (p. 2).
Because of this initial, the OCR text appeared to have missed the first line. I tried to insert the line above the section, and while I was able to do so, the line disappeared as soon as I navigated away from the page. A bug with line insertion was reported in the original “call for testers” e-mail, so it had probably not been fixed when I did my testing. When I moved to the third line of that section, I realized that the first line had actually been captured by the OCR, but because of the positioning of the ornamented initial (the ornaments took up some space, forcing the actual letter to sit slightly below the first line), the first line was moved to the position of the second line while the second line was mistakenly captured as the first one. I was able to delete the first line (which was now in the position of the second line) to make sure that the actual second line was followed by the third line of the text. I was not able, however, to transcribe the first line of the text in any position above the second line. Similarly, the page header of some of the pages I corrected contained the word “DEDICATIONS” that the OCR missed, and I was not able to add that line above the first line of the text. I also experienced several instances where the last line would be on the same level as a page number/identifier. Page 2 of my document, for instance, contained this OCR text in the last line: “A3 fully.” Clearly, “A3” was not part of the original last line, but I was not sure whether to leave this line intact or insert a line with “A3” below to separate the page identifier from the rest of the text. Perhaps small issues such as the one I had with page numbers should be documented as part of the “Instructions” text on the bottom of each “Edit” page.

What additional features would you like to see?

Given the problems I encountered with adding and deleting lines, and assuming the line behavior I experienced with my text was not singular, a “swap/switch lines” feature would be a good addition to the “insert line above/below” buttons. It would also be great if the red frame that displays the current line were more interactive. For example, if I am adding a line that was missed by the OCR, I could move the frame to the exact line position on the scan. Similarly, if the OCR missed the last character/word of the line, I could expand the red frame to the missing part of the scanned page. As I have mentioned before, a “check/save” button for the entire page, one that perhaps could also allow a final review of all the changes made, would be a great feature. While I understand the need to confirm that the line is correct with the “Assert the line is correct” keyboard shortcut or the “check mark” button, I found that even when most of the lines on the page were correct, I would often forget to assert them. I think it would be better to automatically assert that each unedited line is correct. And if developers implemented the final “check/save page” feature, I would be forced to double check and assert that each correct line is in fact correct. Perhaps allowing the user to see the entire page in plain corrected text would also be a good reviewing strategy. Finally, one of the most frustrating parts of the user interface was not being able to see the entire image scan. For example, when I came across that decorated initial capital, I constantly had to jump up and down the lines around the letter in an effort to figure out the OCR mistakes. I understand that showing the full scan of the page would require more scrolling for the corrector, but if developers introduce the ability to resize the scanned page window, then users can select a size that is more comfortable for them and their computers’ screens and resolutions.