Supporting "Distant Reading" for Web Archives

J Lin, K Kraus, RL Punzalan - DH, 2014 - cs.uwaterloo.ca
In a recent essay on the stock footage libraries amassed by Hollywood studios in the first half of the 20th century, Rick Prelinger, moving image archivist at the Internet Archive, laments that “archives often seem like a first-aid kit or a rusty tool, resources that we find reassuring but rarely use” (Prelinger 2012). Although he doesn’t single them out by name, web archives are particularly vulnerable to this charge. User studies, access statistics, page views, and other metrics have in recent years told a consistent story: web content that has been harvested and preserved by collecting institutions, universities, and other organizations often lies fallow, and like Prelinger’s rusty tool may be notable more for its latent potential than for having served any real purpose (Hockx-Yu 2013; Kamps 2013; Huurdeman et al. 2013). While the reasons for neglect are myriad, this paper focuses on one: the lack of tools to support a wide range of interactions with the content. We describe initiatives underway at the University of Maryland to partially redress the problem and highlight the need for qualitative user studies.
The Internet Archive’s Wayback Machine is perhaps the best-known and most widely available tool to browse captured content. Both the Internet Archive’s main public site and Archive-It, its subscription-based web archiving service, replicate the experience of viewing web pages on the live web, thus reifying a “close-reading” experience. First developed in the mid-1990s, the software came of age at the same time digital humanities scholars were building the first generation of web collections aimed at providing high-resolution digital facsimiles of literary and artistic works by Blake, Rossetti, Dickinson, Whitman, and others. The emphasis on accurate rendering and display is thus a hallmark of both the Wayback Machine and many early DH projects, the latter of which likewise self-identify as “archives,” albeit archives on a dramatically smaller scale.