Josephine Jenks, Jessica Walthew, and Andrea Lipps
Electronic Media Review, Volume Seven: 2021-2022
As Cooper Hewitt, Smithsonian Design Museum builds its nascent born-digital collection, new formats and technologies are regularly introduced, each with its own acquisition and preservation needs. This article focuses on the museum’s first effort to collect instances of interactive data journalism. Throughout the ongoing acquisition process, Cooper Hewitt is prioritizing public access and hands-on interaction. As the nation’s design museum, Cooper Hewitt centers its collecting of these works on how design and interaction enhance information access. Revisiting a selection of interactive web-based works already in the collection provided a starting point for developing strategies for documentation, preservation, and digital exhibition of these works. Central to these case studies is the reciprocal relationship between preservation and access, and a vision for how collecting institutions might harness one to further the other.
Introduction
Over the past five years, Cooper Hewitt’s Digital Acquisitions Working Group has fostered a nascent digital collection that includes emojis, typefaces, animations, and more. With each new category of digital design that enters the collection, the team works collaboratively across multiple departments to determine how the acquisition will take shape. Recently, the museum embarked on its first effort to collect instances of interactive data journalism. These stories expand the traditional definition of journalism by emphasizing visual design and taking full advantage of the digital and mobile environment. The proliferation of interactive digital journalism across the news media landscape began circa 2012, with a feature called Snow Fall by New York Times reporter John Branch about a deadly avalanche that had recently torn through a ski slope in Washington State. Snow Fall featured several innovations that transformed the article into an immersive experience. Interactive insets for each skier expanded when clicked to reveal photographs of them with their family, or as children on the bunny hill—contextualizing their dramatic accounts of the event with details from their day-to-day lives. Simulated animations and aerial footage made readers feel as though they themselves were on the mountain that day. The piece reverberated across the Internet as a momentous leap forward in digital storytelling.
Since then, the Times has been producing ever more of these web-based experiences for their stories. In “New York’s Subway Map Like You’ve Never Seen It Before” (December 2, 2019), readers virtually traverse the map while learning about its history and graphic design. Two stories likewise used the unique capabilities of the Times’ interactive platform to reach audiences against a backdrop of an overwhelming and frenetic news cycle in the early days of the COVID pandemic. “How the Virus Got Out” (March 22, 2020) uses animations to trace the spread of coronavirus across the globe. “An Incalculable Loss” uses a seemingly infinite scroll and snippets of obituaries from across the country to mark the grim milestone of 100,000 Americans lost to COVID-19. As websites that pull from a multitude of links, style sheets, and animation scripts, each of these stories has a complex, networked anatomy. They are testaments to a growing consensus that “as performative digital objects—‘software’ in the broadest sense rather than digital simulacra of documents—are sought to be preserved, object boundaries appear increasingly ‘blurry’” (Espenschied and Rechert 2018, 1). To form a preservation strategy for these complex works, Cooper Hewitt revisited and experimented anew with a selection of digital works already in the collection. The result was a multitiered approach investigating documentation and web archiving, which has informed future plans for expanded public access.
Documentation Through Screen Captures and Videos
Documentation for the following case studies took the form of screen captures and videos. In his article “Taking Screenshots,” Dragan Espenschied shares several guidelines on screen captures for the purposes of preservation. He advocates for including the browser’s frame as an acknowledgment that web pages are inextricably linked to their browser environments. The visual details of our Google Chrome or Safari windows may seem commonplace to us, but “in a year’s time, the memory of that particular browser version will already have faded away” (Espenschied 2021). He also recommends ensuring that a website’s full URL is visible, as “it signifies ownership and control, and is often subject to artistic choices” (Espenschied 2021). Armed with Espenschied’s guidelines, Cooper Hewitt created screenshots of the Times stories and several other digital works already in the collection.
Nearly as simple to produce as screenshots are screen recordings. One challenge, however, was the sheer amount of scrolling required, particularly by “An Incalculable Loss.” At 33,565 pixels long, the story’s exceedingly long format was a purposeful design choice, intended to underscore the unthinkable number of COVID deaths in America. It was difficult to avoid slipping up during a recording—mistakenly scrolling past text or receiving a notification halfway through. Inevitably, the pace would be inconsistent. The solution to many of these issues turned out to be automated scrolling. For the desktop screen recording, testing a few different Chrome plug-in extensions eventually led to one that allows a user to pause and resume as needed. For the mobile version, Shortcuts, the Apple app that comes preloaded onto iPhones, was the answer. A Shortcut called Safari Autoscroll offers a variety of speeds and, once installed, appears as an option whenever one visits a web page. The result of the automated scrolling was a screen recording that appeared much smoother and more professional, giving this solemn story the gravity that it deserved. Last, over-the-shoulder videos were created to capture context about the websites’ native hardware environments and the way in which a user physically interacts with each story (fig. 1). Although born-digital works are often described as intangible, they all have material elements. Software and web pages cannot exist without our laptops, iPhones, servers, or us. Over-the-shoulder videos are one way to avoid obfuscating a work’s physical reality.
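The constant-pace scrolling that the automated tools provided can be reasoned about with a little arithmetic. The following is a minimal sketch, not the method used by the Chrome extension or the Safari Autoscroll Shortcut; the viewport height, scroll speed, and frame rate are assumptions chosen for illustration, and only the 33,565-pixel page length comes from the case study.

```python
import math


def scroll_plan(page_height_px, viewport_height_px, speed_px_per_s, fps=30):
    """Return (duration_seconds, offsets) for a constant-speed scroll.

    offsets holds the vertical scroll position for each video frame and
    ends exactly at the bottom of the page.
    """
    if speed_px_per_s <= 0 or fps <= 0:
        raise ValueError("speed and fps must be positive")
    # Only the part of the page below the first screenful needs scrolling.
    distance = max(page_height_px - viewport_height_px, 0)
    duration = distance / speed_px_per_s
    frames = math.ceil(duration * fps)
    step = speed_px_per_s / fps
    # Clamp so the final frames never scroll past the end of the page.
    offsets = [min(round(i * step), distance) for i in range(frames + 1)]
    return duration, offsets


# Assumed values: a 1,080-pixel-tall window scrolled at 120 pixels per second.
duration, offsets = scroll_plan(33_565, 1_080, speed_px_per_s=120)
print(f"{duration / 60:.1f} minutes, {len(offsets)} frames")
```

A calculation like this can help in choosing a speed before recording: a slower pace gives readers time with each obituary snippet but lengthens the resulting video file.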


Screenshots and videos are dependable forms of documentation because they are easy to create and their file types are relatively stable and compatible across most environments. Beyond serving as tools of preservation, they supply Cooper Hewitt’s YouTube channel and blog posts, which are important forms of outreach and education. Moreover, they captured the Times stories in both their desktop and mobile formats, which is more difficult with the web archiving tools described in the following section. The downside, of course, is that recordings are not interactive. For this reason, they may not be the ideal representation of a work in an exhibition context. As Daisy Abbott argues in “Preserving Interaction,” “There is a danger of relying on the heritage of conservation studies and fixating on the curation of the more manageable, tangible and static aspects of the work at the expense of the more difficult (and resource-intensive) but more meaningful representations of essence” (Abbott 2012, 65). Interactivity is part of the essence of these works and a priority for Cooper Hewitt; this led the museum to explore web archiving.
Preservation Through Web Archiving
Web archiving is a process by which either a person or a bot systematically browses a website and saves all of its content and data in an archival format. As the original, live website is updated or redesigned, the web archive will remain the same and serve as a snapshot of the past. Web archiving began in earnest in 1996 with the Internet Archive, a digital library that now holds more than 600 billion web pages and offers a playback platform called the Wayback Machine. Web archives are typically stored in the “.warc” file format, which can be downloaded and stored locally and then played back and viewed in a browser environment (Blumenthal 2021). Users navigate the archived site just as they would a normal web page. Web archiving is routine practice at many libraries, and in this context, a subscription-based service called Archive-It, also part of the Internet Archive, is popular for conducting automated and repeated crawls at a mass scale. Yet, for Cooper Hewitt’s small collection of media-rich, web-based artworks, some of the features of Archive-It are overly advanced, whereas others do not offer enough control.
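To make the “.warc” format mentioned above more concrete, the following minimal sketch builds a single WARC “response” record by hand and reads its headers back, using only the Python standard library. It illustrates the general shape of a record (a block of named header fields, a blank line, then the captured HTTP response) and is not how Archive-It or the other services discussed here produce their files. The target address and page content are placeholders, and in practice a maintained library such as warcio would be a safer choice than hand-written code.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(target_uri, http_response):
    """Wrap raw HTTP response bytes in a single WARC 'response' record."""
    headers = [
        "WARC/1.0",
        "WARC-Type: response",
        f"WARC-Target-URI: {target_uri}",
        f"WARC-Date: {datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')}",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        "Content-Type: application/http; msgtype=response",
        f"Content-Length: {len(http_response)}",
    ]
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    # Each record is terminated by two CRLF pairs after the payload.
    return head + http_response + b"\r\n\r\n"


def read_warc_headers(record):
    """Parse the header block of one record into (version, fields)."""
    head, _, _ = record.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    fields = dict(line.split(": ", 1) for line in lines[1:])
    return lines[0], fields


# Hypothetical captured response; example.org is a placeholder address.
payload = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html><body>Archived page</body></html>"
)
record = build_warc_record("https://example.org/story", payload)
version, fields = read_warc_headers(record)
print(version, fields["WARC-Type"], fields["Content-Length"])
```

Because every record carries its own target address, capture date, and length, a playback tool can locate and serve the saved response when a user requests that address within the archive.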
One leading alternative is Webrecorder, a free, open-source web archiving service that comes in the form of a Chrome browser extension or a full desktop application. Unlike Archive-It, Webrecorder does not use automated crawlers. Instead, an archive is recorded as a user manually browses the desired site. This model gives a conservator more control over a capture, and it worked well for several Cooper Hewitt collection items. Similarly effective was Conifer by Rhizome, which developed out of the same initial project as Webrecorder and employs many of the same open-source components, including manual browsing. Conifer is free to use for up to 5 GB of storage, provides space for descriptive metadata, and has the option of making archives public or private. It also has the capacity to capture and play back archives in emulated legacy browsers that support Adobe Flash. This feature was instrumental for one work in the Cooper Hewitt collection titled Ten Thousand Cents (fig. 2). Created by Aaron Koblin (b. 1982) and Takashi Kawashima (b. 1985) in 2008, Ten Thousand Cents relies on the now obsolete Flash software to depict a $100 bill made up of 10,000 drawings, executed by 10,000 different people. Completely inaccessible using modern browsers, the work functioned exactly as intended when visited via Conifer’s built-in emulators.

Saving and storing a web archive of Ten Thousand Cents, however, proved more complicated. The work consists of 10,000 links; clicking on any one of its individual drawings will produce a magnified version, highlighting that particular author’s style. Because Webrecorder and Conifer only capture those squares that are manually clicked on, it would take hours to achieve a complete capture without the use of scripted or programmatic intervention in the recording. The Times stories, however, can be archived in a matter of minutes. This contrast points to a broader distinction that can be applied to many other web-based works: some are linear in structure, whereas others branch off into a multitude of possible pathways. Abbott refers to these pathways as “trajectories” and the users who explore them as “spect-actors”—viewers, as well as creators, of their own experiences. She writes, “A trajectory through an artwork is the whole user experience, the ‘narrative’ of the work as defined jointly by the work itself and its interfaces, and spect-actor knowledge and choices” (Abbott 2012, 63–64). A key aspect of documenting interactive works involves “mapping these trajectories of interaction and the reasons why the experience unfolded as it did” (Abbott 2012, 64). The number of possible trajectories within a given work can necessitate different strategies for providing public access.
Access and Accessibility
Even more sprawling than Ten Thousand Cents is a site called Watercolor Maptiles by Stamen Design. Created in 2012, in development through 2015, and maintained to the present by the design firm, it was acquired by Cooper Hewitt in 2021. It is a web-based open-source mapping tool with the textures and qualities of a hand-painted watercolor. Like Ten Thousand Cents, Watercolor Maptiles would be difficult, if not impossible, to capture as a web archive because it relies on millions of image files to function, and there are as many possible user experiences of the site as there are places in the world. In this case, the best option for preserving and providing access to the work was to launch a live website (https://watercolormaps.collection.cooperhewitt.org). The original Watercolor Maptiles still exists under the Stamen domain (http://maps.stamen.com/#watercolor), and the firm is theoretically free to change, update, or even stop maintaining it. Existing in parallel is the site hosted by Cooper Hewitt, which features information about the work and its acquisition and will remain as the designers initially intended. With this method, interactivity and open access, both hallmarks of the web, are preserved.
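A rough calculation shows how a tiled web map comes to depend on millions of image files. The sketch below assumes the conventional “slippy map” scheme, in which zoom level z divides the world into a grid of 2^z by 2^z tiles. That scheme and the zoom levels shown are assumptions for illustration; the actual tile count and maximum zoom of Watercolor Maptiles are not stated here.

```python
def tiles_at_zoom(zoom):
    """Number of tiles covering the world at one zoom level."""
    return 4 ** zoom  # a grid of 2**zoom columns by 2**zoom rows


def tiles_through_zoom(max_zoom):
    """Total tiles in the pyramid from zoom 0 up to max_zoom inclusive."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


for max_zoom in (5, 10, 12):
    print(max_zoom, f"{tiles_through_zoom(max_zoom):,}")
```

Under these assumptions the totals are 1,365 tiles through zoom 5, 1,398,101 through zoom 10, and 22,369,621 through zoom 12. Each additional zoom level quadruples the number of images, which is why a manually browsed capture cannot realistically reach every tile.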
The challenges involved in executing such an approach are not insignificant. Cooper Hewitt’s Digital Acquisitions Working Group considered what it would entail for the Times sites. The “Subway Map” story alone calls on several JavaScript files, CSS style sheets, and a total of 81 videos of the moving map, some for desktops and others for mobile users. Upon obtaining the backend code, the museum would need to parse through it and remove the lines that point to the Times’ internal servers. The code would then have to be adapted to fit the Smithsonian’s strict security requirements and regularly updated and maintained by Smithsonian staff, in coordination with Cooper Hewitt. There would need to be close collaboration between the acquiring and gifting institutions, as well as Smithsonian web departments that do not normally work with art objects or collection items. Preserving Maptiles in this way was a feat and an invaluable proof of concept. The linear, easily archivable nature of the Times stories, however, meant that there was a simpler solution. One option, proposed by Small Data Industries’ Cass Fino-Radin, is to use a Webrecorder tool called Embedded Replay, which allows one to embed the playback of a .warc file within another website, just as a YouTube video might be embedded in a blog post. In this scenario, an archive embedded within a Cooper Hewitt web platform would provide users with access to digital works without the experience being mediated through a third party’s graphic interface. This approach is being used already for an upcoming loan of one of Cooper Hewitt’s interactive works to another institution.
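The task described above of locating references to a publisher’s own servers in acquired source code could begin with a simple scan. The following is a minimal sketch under stated assumptions: the host name example-news.test and the sample markup are invented placeholders, not actual Times infrastructure or code, and a real review would also need to catch addresses assembled at run time by scripts.

```python
import re
from urllib.parse import urlparse

# Matches absolute http(s) addresses up to a quote, bracket, or whitespace.
URL_PATTERN = re.compile(r"""https?://[^\s"'<>)]+""")


def find_host_references(source_text, hosts):
    """Return (line_number, url) pairs whose host falls under any of hosts."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        for url in URL_PATTERN.findall(line):
            host = urlparse(url).hostname or ""
            # Match the host itself or any of its subdomains.
            if any(host == h or host.endswith("." + h) for h in hosts):
                hits.append((number, url))
    return hits


# Hypothetical markup; every address below is a placeholder.
sample = '''<link rel="stylesheet" href="https://static.example-news.test/story.css">
<script src="https://cdn.example.org/lib.js"></script>
<video src="https://media.example-news.test/map-desktop-01.mp4"></video>'''

hits = find_host_references(sample, {"example-news.test"})
for number, url in hits:
    print(number, url)
```

A report of this kind would give the acquiring and gifting institutions a shared list of dependencies to discuss before any lines are removed or rewritten.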
With each of these methods (the live, parallel website and the embedded archive), the time-based work is continuously activated and can be accessed by virtual visitors on any day, at any moment. An institution’s engagement with the public need not stop here. In the future, Cooper Hewitt hopes to post the source code for some of its collection of web-based works on GitHub, using the platform as a transparent collections repository that will foster crowdsourced preservation solutions. There are certain barriers to this model. As Glenn Wharton points out in “Public Access in the Age of Documented Art,” some artists might feel protective of their intellectual property, whereas others may “want the public to experience their work in the gallery or online without technical knowledge about how the work was produced. This ‘black box’ approach to exhibiting preserves the mystery or magic by offering an unencumbered experience” (Wharton 2015, 187). Despite exceptions like these, “museums need to open their archives for public access to and contributions from outside sources” while respecting the wishes of individual artists (Wharton 2015, 183). Whereas traditional media rarely encourage visitor interventions, time-based artworks often feature distributed authorship and audience participation, a fact that conservators “will have to take into account and can benefit from,” by embracing “variability rather than creating a freeze state” (Dekker 2014, 82). Living, networked artworks call for a living, networked approach to their preservation.
Thus far, Cooper Hewitt’s experience with crowdsourced conservation has been a surprisingly positive one. In 2014, the museum collected an iOS app called Planetary, which visualizes a user’s personal music collection as a network of celestial bodies. The app soon became outdated and unable to run on today’s iPhones or iPads (Walthew 2022). The museum had made the source code available on GitHub, however, and during the pandemic a software developer across the globe, Kemal Enver, updated it to run seamlessly on the current iOS, a gratifying example of what Annet Dekker refers to as “networks of care” (Dekker 2014, 79). Because GitHub has built-in version control, the original code is saved and the contributor’s changes can be tracked as branches or merged into the main codebase. Given the varying options presented by GitHub for envisioning the relationship of new code to old, some work will be needed to negotiate how these terms translate to museum ontologies about the naming and valuing of versions, iterations, and refabrications. It is also worth noting that although Git, the version control software on which GitHub relies, is open source, GitHub as a whole is not (Barok et al. 2019, 99). Truly open-source repository managers, such as GitLab, can and should be explored as well.
Inextricably linked to the matter of access is that of inclusive accessibility. Alternative text, zoom functionality, keyboard inputs, and many other design and engineering features could enable people with disabilities, older users, and those with bandwidth restrictions to better interact with the works in Cooper Hewitt’s digital collection. The next step for the museum in evolving its digital acquisition practices will be exploring how to improve web accessibility, particularly while maintaining an artwork’s integrity and respecting a designer’s intent. Already, acquisitions like Watercolor Maptiles and Planetary demonstrate the reciprocal relationship between preservation and access; investing in the work’s accessibility will help ensure its longevity.
Conclusions
Each of the tools and strategies described here has a time and a place. Screenshots and screen recordings are essential for documenting web-based works, especially in a mobile environment. Web archiving allows for preservation through a stable file format and access through embedded replay. Full-fledged, parallel websites are an option for works that are difficult or time consuming to capture completely as an archive. GitHub holds promise as a platform for collaboration between institutions and the public. Even before the point of acquisition, the work of preservation begins, because “documentation is not a task that can be left until the ‘completion’ of a work or installation” (Abbott 2012, 68). In this case, Cooper Hewitt looked to its existing collection to plan for the potential arrival of a new design category: digital journalism. The hope is that through a multifaceted approach, the museum will be prepared for the unpredictability of the future, without sacrificing what is most important: access and interaction.
ACKNOWLEDGMENTS
The authors would like to thank Cass Fino-Radin of Small Data Industries for advising and assisting in the documentation and archiving of Cooper Hewitt’s web-based works. Emma Dickson also contributed to the conservation of some of the works described here. We gratefully acknowledge the National Collections Program for funding this work.
REFERENCES
Abbott, Daisy. 2012. “Preserving Interaction.” In The Preservation of Complex Objects, Volume 2: Software Art, edited by Leo Konstantelos, Janet Delve, David Anderson, Clive Billenness, Drew Baker, and Milena Dobreva. Portsmouth: University of Portsmouth. 61–70. http://radar.gsa.ac.uk/2806/1/pocos_vol_2_final_release[1].pdf (accessed 08/15/22).
Barok, Dušan, Julie Boschat Thorez, Annet Dekker, David Gauthier, and Claudia Roeck. 2019. “Archiving Complex Digital Artworks.” Journal of the Institute of Conservation 42 (2): 94–113. https://doi.org/10.1080/19455224.2019.1604398 (accessed 08/15/22).
Blumenthal, Karl-Rainer. 2021. “The Stack: An Introduction to the WARC File.” Archive-It Blog. https://web.archive.org/web/20210415041508/https://archive-it.org/blog/post/the-stack-warc-file/ (accessed 08/15/22).
Dekker, Annet. 2014. “Enabling the Future, or How to Survive FOREVER: A Study of Networks, Processes, and Ambiguity in Net Art and the Need for an Expanded Practice of Conservation.” PhD dissertation. University of London. https://research.gold.ac.uk/id/eprint/11155/1/CCS_thesis_DekkerA2014.pdf.
Espenschied, Dragan. 2021. “Taking Screenshots.” Rhizome Almanac. https://almanac.rhizome.org/pages/taking-screenshots (accessed 08/15/22).
Espenschied, Dragan, and Klaus Rechert. 2018. “Fencing Apparently Infinite Objects.” Paper presented at the 15th iPres International Conference on Digital Preservation, Boston and Cambridge, Massachusetts. https://osf.io/pw5dq (accessed 08/15/22).
Walthew, Jessica. 2022. “A Love Letter to Planetary.” Cooper Hewitt. www.cooperhewitt.org/2022/02/16/a-love-letter-to-planetary/ (accessed 08/15/22).
Wharton, Glenn. 2015. “Public Access in the Age of Documented Art.” Revista de História da Arte 4: 180–91. https://revistaharte.fcsh.unl.pt/rhaw4/RHAw4.pdf (accessed 08/15/22).
FURTHER READING
Dekker, Annet. 2014. “Assembling Traces, or the Conservation of Net Art.” NECSUS Traces (Spring). https://necsus-ejms.org/assembling-traces-conservation-net-art/.
Fino-Radin, Ben. 2012. “Conservation in Collections of Digital Works of Art.” Paper presented at the Electronic Media Group Session, AIC 40th Annual Meeting, Albuquerque, New Mexico. https://resources.culturalheritage.org/emg-review/volume-two-2011-2012/conservation-in-collections-of-digital-works-of-art/ (accessed 08/15/22).
Rubio, Fernando Domínguez, and Glenn Wharton. 2020. “The Work of Art in the Age of Digital Fragility.” Public Culture 32 (1): 215–45. https://doi.org/10.1215/08992363-7816365 (accessed 08/15/22).
SOURCES OF MATERIALS
Conifer by Rhizome: https://conifer.rhizome.org/
Webrecorder: https://webrecorder.net/
AUTHORS
Josephine Jenks
Mellon Foundation Fellow in Time-Based Media Art Conservation
The Conservation Center at the Institute of Fine Arts, New York University
New York, NY
jbj6526@nyu.edu
Jessica Walthew
Conservator
Cooper Hewitt, Smithsonian Design Museum
New York, NY
walthewj@si.edu
Andrea Lipps
Contemporary Design Curator
Cooper Hewitt, Smithsonian Design Museum
New York, NY
lippsa@si.edu