Web Archives and the Future of Historical Inquiry: “History in the Age of Abundance” Book Review Part I

This is a guest post from Scott Richard St. Louis.

Milligan, Ian. History in the Age of Abundance? How the Web is Transforming Historical Research. Montreal: McGill-Queen’s University Press, 2019.

While considering the methodological demands, staggering possibilities of volume, and thorny ethical questions associated with the use of web archives as historical sources, Ian Milligan – Associate Vice President for Research Oversight and Analysis at the University of Waterloo – sees a major epistemic challenge on the horizon for his fellow historians:

“We used to, as a rule, forget. Now we have the power of recall and retrieval at a scale that will decisively change how our society remembers … what does it mean to write histories with born-digital sources – from websites written in the mid-1990s to tweets posted today? How can we be ready, from a technical perspective as well as from a social or ethical one, to use the web as a historical source – as an archive? Historians with the training and resources are about to have far more primary sources, and the ability to process them, at our fingertips. What will this all mean for our understanding of the past? How can these sources be used responsibly? Finally, if historians cannot rise to the moment, what does this mean for the future of our profession?” (page 3)

These questions bring life to History in the Age of Abundance? How the Web is Transforming Historical Research, a thought-provoking title that calls historians to action in collaboration with librarians, archivists, and computer scientists, among other stakeholders in preserving and extending access to the world’s rapidly growing digital heritage. Milligan’s admirable vision signals a future in which historians are more computationally savvy and better equipped to collaborate, though the book makes clear that the road ahead will be long and difficult.

Given that allied professionals in the information sciences have been “leading the extremely complicated conversation around how to preserve and make accessible digital material” for the long term, one might wonder why Milligan chooses historians as his audience for a book about web archives (page 7). The answer lies in their longtime status as a significant user group of library and archival services; “historians will be among the primary future users of these materials, as the professionals who interpret and give shape to our understanding of the past. They are in danger of being left behind as research topics begin to consider the 1990s … historians need to wake up to the changes ahead … new skills to better contextualize and understand digital material are needed” (page 7). A brief discussion about historians’ lack of capability or incentive to engage – relative to other disciplines – in the Culturomics project at Harvard University offers a bleak illustration of what opportunities are missed when historians’ metaphorical toolbelts are missing the skills for which Milligan urgently advocates (pages 238-240). Milligan’s answer to the question of audience also reveals why students of information science today should pay attention to this work; after all, tomorrow’s historians might turn to tomorrow’s librarians and archivists for support in capturing, preserving, and making sense of web archives.

The year 1996 is key for Milligan, as it marks the milestone when “widespread web archiving began at the San Francisco-based Internet Archive and several national libraries around the world” (page 236). Milligan concedes that “the number of historians engaged in post-1996 scholarship is still relatively small” but nevertheless successfully illustrates with stark clarity the imminent stakes for historians and information professionals alike: “To neglect the web would ignore the main medium for communication, publishing, social interaction, commercial enterprise, and creative activity since the 1990s” (pages 13-14). With major cultural heritage institutions now a quarter century into their web archiving work, it is clear that historians must begin getting comfortable with web archives if they seek to pursue serious work engaging the late 1990s and beyond.

With the question of audience answered, a question of timing remains. Why is this book necessary now, when so few historians are working on post-1996 projects? Here, Milligan’s accessible historiographical engagement strengthens his argument for the book’s overall significance: “It took roughly twenty years for the first drafts of 1960s history to be written, and only ten years after that for it to be an uncontroversial part of the profession. The web is roughly the same age today as 1968 was when historians wrote their first drafts of the events of that tumultuous decade” (page 20). The clock is ticking; the work ahead is daunting. Where to start?

In a manner fitting for a historian, Milligan begins in part by gathering context. When have historians worked with computers before, and to what ends? What successes and failures marked these endeavors? The answer is fascinating:

“By the 1960s and 1970s, historians had … begun to turn to early computers – imagine stacks upon stacks of punch cards – to help them make sense of large datasets such as national censuses. Coded by armies of researchers, innovative studies helped social historians understand the degree to which social mobility occurred between censuses, or the economics underpinning American slavery. Yet this turn toward ‘Cliometrics’ ended as quickly as it had begun for a variety of reasons, ranging from illegible handwriting in primary documents, to the difficulties of teaching numbers to humanists, to argumentative overreaches … Other historians became involved with computers, not to facilitate large-scale research but to collect, present, and disseminate materials: a vision of computers and the humanities grounded in the public history mission of reaching new audiences, working with teachers, and making history relevant.” (page 54)

Milligan argues that we are now experiencing another “computational history revolution” facilitated by “decreasing storage costs, the power of the internet and distributed cloud computing, and the rise of professionals dealing with both digital preservation and open-source tools” (page 61). Even so, the rise of digital technology does not represent the first time that historians have expressed concern over the risks of medium change: “In 1929 Robert C. Binkley, a historian at New York University, raised the spectre of the then recent cultural heritage residing on fragile, acidic pulp paper … The concerns raised by Binkley and other scholars led to dramatic advances in technology, the widespread adoption of acid-free paper, and in general a concerted effort towards paper preservation” (page 73). Just as this previous medium change led to advances born of necessity, so too – Milligan believes – can web archives galvanize the progression of historical scholarship in exciting new directions: “As collections grow, new methods are necessary. Some of this work involves … harnessing and deploying cutting-edge enterprise big data platforms to explore data at scale” (page 143). One exciting example of such progression is the Archives Unleashed Toolkit, which “provides an environment to both ingest web archive content and run sophisticated analytics on content gathered” (page 155). Milligan, a member of the Archives Unleashed project team, is not content to tell others what to do; instead, he and his colleagues lead by example.

Part II of this review continues here.

Scott Richard St. Louis is a 2021 graduate of the Master of Science in Information program at the University of Michigan, where he focused on digital curation.
