From Immersion to Acquisition: An Overview of Virtual Reality for Time Based Media Conservators

Savannah Campbell and Mark Hellar
Electronic Media Review, Volume Six: 2019-2020

ABSTRACT

As Virtual Reality (VR) artworks are acquired and become part of museum collections, the long-term care of VR hardware, software, and media will need to be carefully considered. VR is not a narrowly defined medium. VR works can include combinations of different hardware and software components; thus, each may present different needs for long-term preservation. Additionally, the nature of one VR work can be drastically different from another in terms of not only its content but its technological makeup. A VR work can be video art, software-based art, an installation, an interactive experience for users, or a combination of any of these. As such, museums may face challenges when bringing such complex works into their permanent collections.

In anticipation of the challenges that conservators may face when VR works enter museum collections, this article aims to provide an introduction to VR technologies, including both hardware and software, and discuss potential preservation considerations. To begin, we will present an overview of the current state of VR technologies and considerations for the acquisition of works built with these platforms. The current hardware platforms will be examined, including mobile VR, Oculus, and the HTC Vive system. We will then look at the three major platforms used to develop VR projects: WebXR, an emerging standard that uses web-based technologies, and the popular proprietary gaming platforms commonly used to author VR experiences, the Unreal Engine and the Unity Engine. Additionally, the differences between 360-video and interactive VR projects will be articulated, along with the unique preservation considerations for each medium. A case study will be presented of a VR artwork in the Whitney Museum of American Art’s permanent collection. Finally, we will conclude by presenting an acquisition template for VR works that conservators can use as a guideline for the collection of these works.

INTRODUCTION

Virtual reality (VR) can encompass a plethora of different hardware, displays, content types, and user experiences. This article will provide an introduction to some of the technical components of VR systems and important facets to consider when acquiring and conserving these works.

VR falls under the larger umbrella category of spatial media known as Extended Reality (XR). XR encompasses a range of technologies that mix virtual and real elements, including VR as well as Augmented Reality (AR) and Mixed Reality (MR). While AR and MR applications overlay digital information on the user’s real-world environment, VR fully surrounds the user’s field of vision in order to place the user in a wholly virtual environment. VR is often characterized by its immersive and interactive user experiences, though VR projects can vary greatly in their level of immersiveness and interactivity. As there are some overlaps between these different technologies, some aspects of this article will be applicable to other varieties of XR. However, this study is primarily focused on VR.

Given that VR is an evolving medium with rapidly developing technology—and considering its inherent variability and complexity, as well as burgeoning presence in museum collections—this article is intended to be the beginning of a larger discourse within the field about conserving VR works and planning for their long-term care.

HARDWARE

When hearing the words “Virtual Reality,” the first image to come to mind is likely that of the VR headset, also known as the head-mounted display (HMD). This discussion of hardware will largely focus on the different types of HMDs, but it is important to note that VR artworks can be shown on other types of displays as well. For instance, Womb (2018), by Jennifer Steinkamp (b. 1958), is an interactive VR installation shown as a wall projection that the user can interact with through an HTC Vive controller. VR can also be shown as a full-room interactive projection known as a Cave Automatic Virtual Environment (CAVE), in which the visuals are projected on all four walls, the ceiling, and the floor. Additionally, VR can be shown as other types of projections, such as the three-dimensional (3D) 360-video installation Trading Futures (2016), by Ben Coonley (b. 1976), which can either be shown on an HMD or projected on the inside of a geodesic dome. Trading Futures will be further discussed as a case study. These examples show that VR can be displayed in different ways but still have similar underlying hardware, software, and content.

Head-Mounted Displays (HMDs)

HMDs are increasingly ubiquitous virtual reality platforms. These headsets contain a pair of stereoscopic lenses and may include built-in sensors for tracking the user’s motion; they can also be compatible with peripheral devices such as controllers, haptic gloves and suits, and external sensors. There is a wide array of VR headsets currently available that offer an equally wide range of user experiences, levels of immersiveness, and degrees of interactivity. These headsets may also require additional hardware devices in order to play back content. While some HMDs are fully self-contained units that can display VR works with no additional source, others are powered by computers, mobile devices, or video game systems. Based on these dependencies, HMDs can be classified into four different categories: Mobile VR, Standalone VR, VR Systems, and Console VR (fig. 1).

Fig. 1. A chart classifying some of the current virtual reality head-mounted displays available based on their dependencies.

Mobile VR refers to HMDs that require a cell phone to function. These headsets do not have a built-in screen or any internal storage. Instead, they work by placing a cell phone into the headset behind the lenses; the phone screen acts as the display and media player. Some of the common Mobile VR headsets currently available include the Samsung Gear VR, Google Cardboard, and Google Daydream View.

Standalone VR headsets are entirely self-contained units that do not require any other devices (such as computers or mobile devices) to function. These HMDs have internal storage and built-in screens. They do not require any cables that tether the user to an external computer. Thus, with standalone VR devices, the user is able to have complete freedom of movement. Standalone VR headsets include the Oculus Go, Oculus Quest, and the HTC Vive Focus.

VR systems encompass all headsets that require a computer, typically a powerful PC, to function. These offer the highest-resolution, most immersive VR experiences currently available on HMDs. Because the computer transmits the audiovisual information to the headset over a cable, the user can move only as far as the tether allows, often within a defined physical space delineated by external sensors. VR systems include the Oculus Rift and the HTC Vive.

Finally, there are also console-based VR headsets. These HMDs are compatible with video game systems. For example, Sony offers the PlayStation VR (PSVR) headset for use with the PlayStation 4 console, and Nintendo has the Labo VR kit, which transforms the Switch console into a giant cardboard HMD.

Degrees of Freedom

The wide array of VR headsets available allows for a variety of different VR content, from more passive viewing experiences to highly interactive and immersive environments. One key hardware concept is that different headsets allow the user different ranges of movement within the virtual space, known as Degrees of Freedom, or DoF. Headsets offer either three degrees of freedom (3-DoF) or six degrees of freedom (6-DoF). With a 3-DoF headset, the user can move their head and look around in every direction but is otherwise rooted to a fixed point in space and cannot move freely in VR. With 6-DoF, a user is able to actually walk around the virtual space, usually with the aid of external sensors in the real world mapping the user’s movements into the virtual one.

Of the HMDs discussed here, all of the mobile VR headsets, the standalone Oculus Go, and the Nintendo Labo are capable of displaying only 3-DoF content. The more powerful VR systems, such as the Oculus Rift and HTC Vive, as well as the standalone Oculus Quest, the HTC Vive Focus, and the console-dependent PSVR, are able to run both 3-DoF and 6-DoF content.

VR CONTENT TYPES

The myriad VR displays can be used to experience a similarly wide array of VR content, which can encompass many different media types and file formats. VR can be a software program that renders in real time, a web-based experience, or a 360-degree video that plays in a media player. Depending on the technical makeup of the VR artwork being acquired, a collecting institution may be receiving different kinds of files. The following is a general overview of the types of VR content and what a museum should be prepared to acquire when one of these works enters a collection.

SOFTWARE-BASED VR (REAL-TIME 3D)

When receiving the software assets of a VR artwork, there are some things to consider. A work of VR software typically starts as an uncompiled software project that contains a variety of digital assets. These assets can include 3D models, images, audio and video files, and the computer source code that ties them together. Additionally, these project files frequently contain external code libraries. An example might be the Oculus software development kit (SDK) for Windows, which contains prebuilt functionality for creating content for the Oculus Rift VR hardware. The software project files are usually built in a suite of software meant for creating video games, known as a game engine. Frequently, a software library is used to add VR hardware functionality to the game engine in which the work is being created.
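
As an illustration of what such a project can look like on disk, the outline below sketches the typical top-level layout of a Unity project folder. The project name is hypothetical, and Unreal Engine projects follow a different but analogous structure:

    MyVRProject/            (hypothetical project name)
        Assets/             3D models, textures, audio, video, scenes, and C# scripts
        Packages/           manifest of the external packages and SDKs the project depends on
        ProjectSettings/    engine version, build targets, input, and XR settings
        Library/            cache generated by the engine; rebuilt automatically when the project is opened

In general, the assets, package manifest, and project settings are the components worth collecting; engine-generated caches can usually be regenerated.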

The project file is where the creators build the work—they can adjust it, modify it, and get it to work correctly for their artistic vision. One can think of this project file as the canvas. Once the project is ready, it gets compiled into a binary file. The binary is an efficient, computer-readable build of the application.

Looking forward through a conservation lens, it is essential to understand that acquiring the uncompiled project file along with its included assets offers a great deal of flexibility for future migration of the software. It is sometimes possible to reverse engineer a binary file, but doing so is difficult; the uncompiled project with its source code will be far easier to adapt in future conservation exercises when the work needs to run on new hardware.

DEVELOPMENT PLATFORMS

As mentioned earlier, software-based VR projects are typically created in a video game engine. A game engine is a software development environment designed for building video games and other 3D and two-dimensional (2D) media projects. There are many different game engines for developing VR content, with the two most prevalent at present being Unity and Unreal. As of 2019, Unity is the most widely used engine for creating VR content; it is a cross-platform game engine developed by Unity Technologies and first released in June 2005. Its close competitor, the Unreal Engine, is developed by Epic Games and was first showcased in the 1998 first-person shooter game Unreal.

These engines have built-in programming languages that allow for adding logic and interactivity to gaming and VR projects. Unity supports the C# programming language, which is a general-purpose programming language created by Microsoft. Unreal Engine uses the C++ language.

Game engines carry specific version numbers and are updated frequently to add new features. It is essential to note the version of the engine in which the artwork was created, as the project may not compile in a different version.
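
As a practical sketch of how that version can be identified in an acquired project (the project and file names below are placeholders), both major engines record it in plain-text files within the project folder:

    # Unity: the editor version is stored in ProjectSettings/ProjectVersion.txt
    cat MyVRProject/ProjectSettings/ProjectVersion.txt
    # m_EditorVersion: 2019.2.11f1

    # Unreal: the associated engine version is listed in the project's .uproject file (JSON)
    grep EngineAssociation MyVRProject/MyVRProject.uproject
    # "EngineAssociation": "4.22",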

Additionally, there is a newer web-based standard called WebXR, to be discussed shortly, which supports the creation of VR using web-based technologies. This platform is typically programmed in the JavaScript language and other web-native technologies.

When acquiring a VR artwork, it is essential to take note of the development platform in order to adapt the code in the future, if needed. An institution should also take stock of the supporting software libraries that were used to build the project.

Unity and Unreal produce binary files that typically come in the form of a Windows executable (EXE) file for desktop VR platforms or an Android package (APK) file for Android-based VR systems, which are typically found in mobile and standalone headsets. These binary files are stored and executed on the VR hardware: for desktop VR, that is the host PC; for mobile and standalone systems, the binary resides on the device’s local embedded storage. For WebXR-based projects, the files exist on a remote server and are delivered over the network via a URL, then rendered to the VR hardware through a compatible web browser.

OpenXR and Cross-Platform Compatibility

The plethora of VR hardware currently available has created a cross-compatibility challenge for VR projects across the various hardware platforms. For example, the Oculus platform has its own software libraries for the Unity and Unreal Engines; these libraries are not compatible with the competing HTC Vive platform or other non-Oculus headsets. Because of this hardware fragmentation, standards are being ratified to ensure cross-compatibility of VR content across devices. OpenXR is an open, royalty-free standard for access to VR and AR platforms and devices. It is developed by a working group managed by the Khronos Group consortium.

According to Brent E. Insko, Lead XR Architect at Intel and OpenXR Working Group Chair:

OpenXR seeks to simplify AR/VR software development, enabling applications to reach a wider array of hardware platforms without having to port or re-write their code and subsequently allowing platform vendors supporting OpenXR access to more applications. With the release of the OpenXR 1.0 specification, AR/VR developers can now create true cross-platform XR experiences. [https://www.khronos.org/openxr].

In addition to the OpenXR initiative, there are similar efforts for platform unification, such as Open Source Virtual Reality (OSVR). OSVR is an open-source software project that aims to support headsets and game controllers from all vendors. The OSVR platform allows VR developers to detect, configure, and operate VR devices across a wide range of operating systems.

Another platform is OpenVR, an application programming interface (API) and runtime that allows access to VR hardware from multiple vendors without requiring that applications have specific knowledge of the hardware they are targeting. It comes in the form of an SDK that contains the API and samples. The runtime is distributed as SteamVR, found under Tools on the Steam gaming platform from the Valve Corporation.

In summary, several software libraries add VR functionality to video game development platforms such as Unity and Unreal Engine. When collecting a software-based VR artwork, one should have an awareness of the libraries used and understand whether they are limited to one hardware platform, such as Oculus Rift, or are based on open standards and operate across multiple platforms.

WebXR

Building on the ideas of OpenXR is WebXR, a specification that offers access to VR content across hardware platforms using web-based technologies.

The WebXR Device API provides access to input and output capabilities commonly associated with VR and AR devices. It allows one to develop and host VR and AR experiences on the web. WebXR content is commonly built with the JavaScript programming language.
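
The following is a minimal, illustrative sketch (not drawn from any particular artwork) of how a page built on the WebXR Device API asks the browser whether immersive VR is available and then requests a session:

    // Feature-detect the WebXR Device API
    if (navigator.xr) {
      // Ask whether an immersive (headset-based) VR session is supported
      navigator.xr.isSessionSupported('immersive-vr').then((supported) => {
        if (supported) {
          // Sessions are typically requested from a user gesture, such as a button click
          navigator.xr.requestSession('immersive-vr').then((session) => {
            // Rendering is then driven by the session's animation loop via WebGL
            console.log('Immersive VR session started:', session);
          });
        }
      });
    }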

There are several software packages and platforms on which WebXR content relies in order to function; the main ones at this writing are the following:

WebGL (Web Graphics Library) is a JavaScript API for rendering interactive 3D and 2D graphics within any compatible web browser without the use of plug-ins. Before WebGL, this kind of hardware-accelerated graphics was limited to desktop software. WebGL itself does not have any VR functionality built in.

Three.js is a JavaScript library that uses WebGL to draw 3D graphics. WebGL is a very low-level system that only draws points, lines, and triangles, so doing anything useful with it generally requires a large amount of code. Three.js handles concepts such as scenes, lights, shadows, materials, textures, and 3D math, all of which would have to be written from scratch if using WebGL directly. Three.js is mainly used for generating web-based 3D content; however, recent versions have added features for supporting VR hardware.

A-Frame is a web framework for building VR experiences. It is built on top of HTML and the Three.js library and is a VR-first framework that supports all of the current VR hardware platforms. Because of its HTML-like syntax, it is simple to get started building projects (see the short example following this list).

glTF (GL Transmission Format) is a royalty-free specification for the efficient transmission and loading of 3D scenes and models by applications. Sometimes called the “JPEG of 3D,” glTF minimizes both the size of 3D assets and the processing required to unpack and render them. It defines an extensible, standard publishing format for 3D content. As WebXR projects are rendered in VR over the web rather than run from a local hard drive, an efficient way to deliver the content is needed; the glTF file format serves that purpose.
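
The short example below is a hedged sketch of how these layers combine in practice: a few lines of A-Frame markup (with a placeholder model file name and an arbitrary library version) describe a scene that Three.js and WebGL render and that a WebXR-capable browser can present in a headset.

    <html>
      <head>
        <!-- The A-Frame library; the exact version used by a given work should be recorded -->
        <script src="https://aframe.io/releases/1.0.4/aframe.min.js"></script>
      </head>
      <body>
        <!-- a-scene sets up the Three.js/WebGL renderer and the "Enter VR" behavior -->
        <a-scene>
          <a-box position="-1 0.5 -3" color="#4CC3D9"></a-box>
          <!-- A 3D model delivered in the glTF format (placeholder file name) -->
          <a-entity gltf-model="url(sculpture.gltf)"></a-entity>
          <a-sky color="#ECECEC"></a-sky>
        </a-scene>
      </body>
    </html>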

WebXR projects are delivered over a network, such as the Internet, from a remote server. The acquisition process should include collecting the JavaScript code and any other code, libraries, and 3D assets used in the project. Additionally, as these projects are delivered via a URL, one should explore whether the domain name for that URL is essential to the work and what the conditions would be for hosting it on the collecting institution’s web servers, if needed.

360-VIDEO

360-video refers to a video that plays not just on a flat surface but in all directions surrounding the viewer. For this reason, 360-video can also be referred to as spherical video. 360-videos are typically experienced through a VR headset, though they can also be displayed as projections or within a web browser, where the user can point and click to change the view. Though 360-video can have interactive elements, it is generally a more passive viewing experience than the more immersive software-based VR artworks. 360-video is 3-DoF, which means that the user is rooted to a fixed point in the virtual space but can turn their head and watch the video in all directions.

With 360-video artworks, the component being acquired will be a video file. It should be noted that these works require specific 360-media players in order to be viewed, as they will not render correctly if playback is attempted with a regular video player. There are a variety of applications available that support playback on a computer desktop, within a VR headset, or both. Some examples of 360-video-compatible media players are the GoPro VR Player and the Whirligig Media Player. The VLC Media Player also supports 360-video playback in versions 3.0 and above.

360-video generally uses the same kinds of codecs (such as H.264 and ProRes) and container formats (such as .mov, .mp4, and .mkv) as standard video files. However, from a conservation standpoint, it is important to know that 360-videos have several key differences from regular videos and can also vary in terms of their file characteristics. Understanding the technical structure of a particular 360-video will be integral to its long-term preservation. 360-video can be either monoscopic (2D) or stereoscopic (3D), and files can carry 3D formatting information in a couple of different ways. Another key facet of a 360-video file is that even though the video displays spherically, the files are stored as flat video information and require a 360-media player in order to decode them and display them as a sphere. This is accomplished through projection mapping, also known as sphere-to-plane projection, and there are a number of different ways this can be mapped within the file.
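
When a file carries embedded spherical-video metadata (injected by the artist’s editing software or by a utility such as Google’s Spatial Media Metadata Injector), some of these characteristics can be read with common command-line tools. The commands below are a hedged sketch with a placeholder file name; not every 360-video file will contain this metadata:

    # ffprobe (part of FFmpeg) may report spherical side data, including the projection type
    ffprobe example_360.mp4
    #   Side data:
    #     spherical: equirectangular ...

    # ExifTool may report similar tags from the file's spherical (GSpherical) metadata
    exiftool -ProjectionType -StereoMode example_360.mp4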

3D Formatting

For 3D 360-videos, the frame will be divided in half because the video information gets sent to the viewer’s left eye and right eye independently in order to achieve the stereoscopic effect. 360-video files can carry 3D information in different ways. The file can be formatted in a top/bottom configuration, with the left-eye video stacked on top of the right-eye video. Alternatively, the video can be formatted side by side, with the video on the left going to the left eye and the video on the right going to the right eye (fig. 2).

Fig. 2. This image depicts how 3D video is encoded in the side-by-side and top/bottom formats.

With top/bottom formatting, the two videos share the overall vertical resolution. With side-by-side formatting, the frame is divided so that the overall horizontal resolution is evenly split between the left-eye and right-eye videos. For example, a 3840 × 2160 top/bottom file devotes 3840 × 1080 pixels to each eye, while the same resolution formatted side by side gives each eye 1920 × 2160. The 360-media player will decode this information and output the proper images through the left and right lenses of the VR headset, creating a unified 3D image for the viewer.
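
As a rough, illustrative sketch of how this layout can be verified (the file names are placeholders, and the crop values assume a top/bottom file in which the left-eye view occupies the top half, something that should be confirmed for each work), FFmpeg can extract each eye’s view for inspection:

    # Top half of the frame (conventionally the left-eye view)
    ffmpeg -i example_tb_360.mp4 -vf "crop=iw:ih/2:0:0" -c:a copy left_eye.mp4

    # Bottom half of the frame (conventionally the right-eye view)
    ffmpeg -i example_tb_360.mp4 -vf "crop=iw:ih/2:0:ih/2" -c:a copy right_eye.mp4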

Projection Mapping

In terms of projection mappings, the 360-video can be stored flat and decoded as a sphere through a number of means. It is impossible to transform a sphere into a flat object without creating some kind of distortion, and different projection mappings will distort the video in different ways. As of this writing, the two most common projection mappings are equirectangular and cube maps.

In an equirectangular projection, the spherical video is flattened out in the way that a globe of the earth would theoretically be unrolled into a flat map. Equirectangular videos will exhibit distortion due to uneven pixel distribution: more pixels are concentrated at the top and bottom of the frame, and fewer pixels are in the center of the video. When viewing an equirectangular video in a flattened state, lines that will appear straight when played back in VR will look curved.

With a cube map, the spherical video gets squashed into a cube shape; the cube is then unfolded and stored flat. Here, the pixels are more evenly distributed across the video than with an equirectangular projection, though distortion will appear along the seams/edges of the cube.
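
For reference, recent versions of FFmpeg (the v360 filter was added around version 4.3) can convert between these two mappings. The commands below are a hedged sketch with placeholder file names, useful for visualizing how the same content is stored under each projection rather than for producing preservation masters:

    # Remap an equirectangular 360-video into a 3x2 cube map layout
    ffmpeg -i example_equirect.mp4 -vf "v360=input=equirect:output=c3x2" example_cubemap.mp4

    # And back again, from the cube map layout to equirectangular
    ffmpeg -i example_cubemap.mp4 -vf "v360=input=c3x2:output=equirect" example_equirect_restored.mp4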

There are other proprietary projection mappings, such as Google’s equiangular cube map (which aims to combine the concepts of equirectangular maps and cube maps to create a less distorted image with more evenly distributed pixels) and Facebook’s pyramid projection map. As these projection mappings are proprietary and newly emerging from their respective developers, they may be less commonly found in current VR artworks, though it is possible that they may become more prevalent in the future.

VR AUDIO

Though the conversation thus far has primarily covered the visual aspect of VR, it is also essential to consider the audio component of a VR artwork. For both 360-video and software-based VR projects, the sound can be traditional stereo audio or it can be ambisonic. In the same way that the VR video or generated software environment is spherical, the audio mapping can be as well. Ambisonic audio is spatial audio—when users move their head in one direction or another, the relative loudness or quietness of soundscape components will change to reflect that in the same way that human hearing works in reality. Immersive VR software projects can take this a step further—they have ambisonic audio attached to different objects or specific areas within the virtual environment so that objects will sound closer or farther away from users as they move around the space.

CASE STUDY: A 360-VIDEO INSTALLATION

The focus of this article has largely been on the technical characteristics of VR systems and content types, any of which can be a significant property of a VR artwork. To look at a practical example of this information in a conservation context, a specific VR artwork will now be examined. In 2016, the Whitney Museum of American Art showed Ben Coonley’s 3D 360-video work Trading Futures (2016) in the exhibition “Dreamlands: Immersive Cinema and Art, 1905–2016” and subsequently accessioned the work into its permanent collection.

Trading Futures is variable in terms of its playback hardware, as it can be exhibited in one of two ways: either with a VR headset or as a projection inside a dome-shaped structure. In “Dreamlands,” Trading Futures was displayed as a video projection inside a cardboard geodesic dome. Visitors were provided with active-shutter 3D glasses to wear inside the dome and could view the 360-degree video projected on the dome’s ceiling in a planetarium-style manner (figs. 3, 4).

Fig. 3. Ben Coonley (b. 1976), Trading Futures, 2016; installation view, “Dreamlands: Immersive Cinema and Art, 1905–2016,” Whitney Museum of American Art, New York, October 28, 2016 to February 5, 2017. 360-degree 3D video (color, sound, 11 min.) projected onto a spherical mirror in a geodesic cardboard dome. (Whitney Museum of American Art, New York; purchase with funds from the Film, Video, and New Media Committee 2017.13. © Ben Coonley. Photograph by Ron Amstutz)
Fig. 4. Trading Futures, as installed in “Dreamlands: Immersive Cinema and Art, 1905–2016,” Whitney Museum of American Art, New York, October 28, 2016 to February 5, 2017. (Photograph by Ron Amstutz)

The video files that make up Trading Futures can serve as an example of the possible technical characteristics of 360-video files. Upon acquisition, the Whitney received several different types of video files for Trading Futures from the artist: ProRes/MOV native files, H.264/MP4 exhibition files, a cyan-red anaglyph version for internal previewing purposes only, and copies on Blu-ray 3D. As this list shows, both the native and exhibition files for Trading Futures use codec and container formats that are common to regular files that are not 360-video. Two areas where these files differ are their 3D formatting and their projection mapping.

Like all 360-video files, those for Trading Futures must be viewed in a 360-video player application in order to render and display properly. However, attempting to view them in a standard video player can serve to illustrate some of their key file characteristics. Depicted in Figure 5 is a still of the H.264/MP4 exhibition file for Trading Futures when opened with QuickTime X, a player that is not compatible with 360-video.

Fig. 5. Ben Coonley (b. 1976), Trading Futures, 2016. 360-degree 3D video (color, sound, 11 min.) projected onto a spherical mirror in a geodesic cardboard dome, or with virtual reality headsets. (Whitney Museum of American Art, New York; purchase with funds from the Film, Video, and New Media Committee 2017.13. © Ben Coonley)

As the layout of the two video streams suggests, this file carries its 3D video information in the top/bottom format. The video viewed through the left eye is on top; the right-eye video is on the bottom. Viewing the file in QuickTime X also shows that the work has an equirectangular projection mapping. Note the distortion and how straight lines, such as the orange lines in the background, appear curved here. When played back in VR using a media player compatible with 360-video, these lines will appear straight.

VR artworks have a myriad of technical characteristics that need to be thoroughly documented in order to preserve and reinstall them in the future. An example such as Trading Futures is extraordinarily complicated due to its multiple display formats (as a dome projection or viewed through a VR headset). In order to aid conservators in documenting the hardware, software, file specifications, and other critical features of VR artworks, the authors of this article have drafted a template for documenting VR works, ideally upon acquisition.

AN ACQUISITION TEMPLATE FOR VIRTUAL REALITY ARTWORKS

Given the diversity of VR hardware and media formats—not to mention software dependencies, peripheral devices, and display formats—there is a large amount of data to collect when documenting a VR artwork. In order to provide an organizational structure and a baseline of information to collect upon acquisition, a Virtual Reality Acquisition Template has been drafted.

This template came out of a Virtual Reality Preservation Summit hosted by the Tate Modern in March of 2019. This was a collaborative day of presentations on various facets of VR systems, including examinations of two VR artworks. The group that convened at the summit has since formed the Preserving Immersive Media Group for continued collaboration on addressing the needs of VR artworks and other spatial media.

The initial template draft was further built on and revised by the authors of this article. This version can be found as a downloadable Google Doc here: http://bit.ly/VRacquisitiontemplate or through the Preserving Immersive Media Group’s GitHub page here: https://github.com/PreservingImmersiveMedia/Acquisition-Template

ACKNOWLEDGMENTS

Our thanks to Tate time-based media conservators Jack McConchie and Tom Ensom, as well as the members of the Preserving Immersive Media Group.

CONTACT INFORMATION

Savannah Campbell
Video and Digital Media Preservation Specialist, Media Preservation Initiative
Whitney Museum of American Art
Savannah_Campbell@whitney.org

Mark Hellar
Owner
Hellar Studios LLC.
mark@hellarstudios.com