The Electronic Media Review, Volume Four: 2015-2016
In the spirit of the 2015 AIC conference theme, “Practical Philosophy or Making Conservation Work,” this paper discusses low-cost, practical methods for both testing error rates in audio compact discs and accurately extracting the linear pulse code modulated audio streams contained on those discs. The methodologies are derived from a case study of the New York Public Radio archives, which devised a system to mass-extract audio while simultaneously collecting block error rate, E22, and E32 error information from a statistically significant sample of approximately 2,400 Mitsui-brand silver compact discs.
DEFINING A PROBLEM
The New York Public Radio (NYPR) archives holds a collection of approximately 25,000 Compact Disc-Digital Audio (CD-DA) items. These CD-DAs were primarily created between 2000 and 2008 and mainly consist of airchecks of NYPR-produced original programming. Prior to 2000, many shows recorded airchecks to ¼” open reel audio tape, and after 2008, the radio station implemented a born-digital workflow, making the burning of CDs for the purposes of archiving and future access unnecessary. For all intents and purposes, these 25,000 CD-DAs are the master, and often the only, copies of broadcast audio recordings from those years.1
As a regular part of the daily workflow, station archivists would pull CD-DAs off the vault shelves and rip the audio individually using Plextools Utilities and Plextor-brand drives, a process that took a considerable amount of time, cost, and effort. In early 2009, staff began noticing problems with both the playback and extraction of audio on these CDs. For example, Plextools Utilities would crash or stall during the extraction process, or the discs would sometimes not play using a standard Windows Media Player. Occasionally, so many errors were encountered that the normal software processes of error correction and concealment didn’t completely work (meaning there was audible digital distortion upon playback of the extracted WAVE file, even though the discs may have been successfully ripped).
The NYPR Archives was confident that these extraction problems were not the result of faulty software or hardware; rather, the failures were most likely the result of a decaying photosensitive dye layer that holds the disc’s digital information (Wade and Youket 2012). The archives staff’s observations of CD-DA decay were surprising because the discs were relatively young, some only 5–7 years old. Previously published studies had estimated a life expectancy of 30 years or more for similar materials (Zheng and Slattery 2007).2
The archives staff began to worry that without a quick and affordable batch-extraction process, the audio discs would decay faster than the archives could rip them. Plextools, although accurate, could not be integrated into an automated process, and with 25,000 discs on the shelf, a more expedient solution was necessary. After some research, the archives purchased a low-cost commercial ripping machine called the MFDigital Ripstation. The machine is used by several archival institutions that hold collections of rapidly decaying CD-DAs.
Some features of the Ripstation lend themselves well to mass audio extraction. Depending on the model, the operator can load up to 200 discs in a single session and, using proprietary software, extract audio into single, concatenated WAVE files using a file naming convention of the operator’s choice. The Ripstation software allows a user to set a maximum rip speed of up to five times the data rate of the audio on the CD (fig. 1). This setting lets the operator reject discs on which the software encounters a problem and the optical drive subsequently slows down to get a better read. If a CD-DA was rejected during the process, the operator set it aside and used the older Plextools disc extraction solution, which had more robust extraction features. Initially, NYPR experienced a rejection rate of nearly 10% from the Ripstation machine when the software was set at a five-times read threshold.3
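For a sense of scale, the "five times" setting can be translated into a raw data rate. The sketch below assumes the standard CD-DA audio parameters (44.1 kHz sampling, 16-bit samples, two channels); the function names are illustrative only and are not part of the Ripstation software.

```python
# Standard CD-DA audio parameters; these are assumptions of the sketch,
# not values taken from the Ripstation documentation.
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2   # 16-bit linear PCM
CHANNELS = 2           # stereo


def audio_bytes_per_second(speed_multiple=1.0):
    """Return the audio data rate in bytes per second at a given read speed."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * speed_multiple


def ideal_rip_seconds(disc_minutes, speed_multiple):
    """Return the read time if the drive sustains the given speed throughout."""
    return disc_minutes * 60 / speed_multiple


if __name__ == "__main__":
    print(audio_bytes_per_second(1))   # real-time rate
    print(audio_bytes_per_second(5))   # rate at five times real time
    print(ideal_rip_seconds(60, 5))    # best case for a 60-minute disc
```

At real time the audio stream runs at 176,400 bytes per second, so a five-times setting corresponds to 882,000 bytes per second. A drive that has to slow down to re-read damaged areas will take longer than the ideal figure.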
ERROR TEST RESULTS
Because the Ripstation software did not have the kind of nuanced audio extraction capabilities that the slower Plextools workflow offered, the archives felt a need to capture more information about the discs as they were being processed. Error reports from a sample of the discs ripped by the Ripstation could act as a check against the possibility that the machine was creating WAVE files with audible distortion (by allowing significantly damaged discs to pass through the rip process). Plextools Utilities offers both error reporting functionality and extraction capabilities. If the Ripstation had created a valid WAVE file, but Plextools Utilities reporting indicated significant error rates, then the archives would examine the disc more closely. Throughout the ripping of 2,400 CD-DAs, this second step of error reporting was applied to 20% of the discs, or 480 items.
We feel that the CD-DAs in our collection provided us with an interesting control group. The discs had been stored in optimal environmental conditions for the previous 10–15 years, and they were initially burned the same way (in real time) on the same professional equipment by the same two or three engineers. The archives thought that if it could eliminate a majority of the human and environmental reasons discs fail, a better understanding of the intrinsic difficulties of the discs would emerge.
Plextools Utilities offers a number of error reporting features for audio compact discs. The NYPR archives chose the most widely used error report, which measures three main error types: BLER (the block error rate), E22 errors, and E32 errors.
BLER refers to the number of blocks of data that have at least one occurrence of incorrect data. BLER is quantified as the rate of errors per second, and it is the main metric when testing audio discs during the mastering and manufacturing process. E22 errors are correctable errors typically caused by a high degree of clustered errors. E22 errors can often occur if there is a physical defect on a disc, like a scratch, or if there is an accumulation of dust or other particles on the disc surface. Although E22 errors are technically correctable, they can indicate damage more severe than BLER alone. Large numbers of E22 errors may mean the disc is approaching unreadability. E32 errors, on the other hand, are uncorrectable. The occurrence of E32 errors means that data has been lost and is unrecoverable (Pohlmann 2005).
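As a rough illustration of how a per-second BLER figure can be summarized, the sketch below averages per-second counts of erroneous blocks over fixed windows and reports an overall average and peak. It assumes 7,350 blocks per second (the standard CD frame rate) and a hypothetical input format, a plain list of per-second counts. It is not a description of how Plextools Utilities computes its reports.

```python
BLOCKS_PER_SECOND = 7_350  # assumed CD frame rate: 75 sectors x 98 frames


def windowed_bler(errors_per_second, window=10):
    """Average per-second block error counts over consecutive windows."""
    averages = []
    for start in range(0, len(errors_per_second), window):
        chunk = errors_per_second[start:start + window]
        averages.append(sum(chunk) / len(chunk))
    return averages


def summarize_bler(errors_per_second):
    """Return the average rate, the peak rate, and the share of blocks in error."""
    if not errors_per_second:
        raise ValueError("need at least one second of measurements")
    total = sum(errors_per_second)
    seconds = len(errors_per_second)
    return {
        "average_bler": total / seconds,
        "peak_bler": max(errors_per_second),
        "fraction_of_blocks": total / (seconds * BLOCKS_PER_SECOND),
    }


if __name__ == "__main__":
    # Hypothetical counts for a 20-second excerpt, for illustration only.
    counts = [10] * 10 + [30] * 10
    print(windowed_bler(counts))
    print(summarize_bler(counts))
```

Reporting both an average and a peak matters because a disc with a modest average can still contain short stretches where errors cluster heavily.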
In 2009, the International Association of Sound and Audiovisual Archives (IASA) published Guidelines on the Production and Preservation of Digital Audio Objects (Bradley 2009). Chapter eight of the text outlines preservation best practices for optical media and includes recommended BLER, E22, and E32 error thresholds. Table 1 lists IASA’s specifications for the maximum allowable BLER, E22, and E32 errors for an archival CD-DA.
After running tests on the 480 CD-DAs, the archives discovered that no disc in the sample set passed the above requirements. Taken as an aggregate, the CD-DAs NYPR tested had the results outlined in Table 2.
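The comparison of each disc's report against published limits can be sketched as follows. The class names, field names, and especially the limit values in the example are placeholders chosen only for illustration; they are not the figures from Table 1 and should be replaced with the thresholds an institution actually adopts.

```python
from dataclasses import dataclass

METRICS = ("bler", "e22", "e32")


@dataclass
class DiscErrors:
    disc_id: str   # hypothetical identifier for the tested disc
    bler: float
    e22: int
    e32: int


@dataclass
class Limits:
    bler: float
    e22: int
    e32: int


def exceeded(report, limits):
    """Return the names of the metrics that are above their limit."""
    return [m for m in METRICS if getattr(report, m) > getattr(limits, m)]


def pass_rate(reports, limits):
    """Return the fraction of discs with no metric above its limit."""
    if not reports:
        raise ValueError("no reports to evaluate")
    passed = sum(1 for r in reports if not exceeded(r, limits))
    return passed / len(reports)


if __name__ == "__main__":
    # Placeholder limits and made-up reports, not values from this study.
    limits = Limits(bler=50.0, e22=0, e32=0)
    reports = [
        DiscErrors("disc-a", bler=12.0, e22=0, e32=0),
        DiscErrors("disc-b", bler=80.0, e22=3, e32=1),
    ]
    for r in reports:
        print(r.disc_id, exceeded(r, limits))
    print(pass_rate(reports, limits))
```

Listing which metrics failed, rather than returning a single pass or fail flag, makes it easier to separate discs with elevated but correctable errors from discs that have already lost data.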
Table 2 shows that on average, the CD-DAs in the NYPR collection far exceed the maximum allowable error rates for CD-DAs, as specified by IASA. Additionally, when the errors per CD-DA were charted out on a scatter plot, there also appeared to be a general overall trend toward more errors over time, even with a span as long as three years (fig. 2).
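One simple way to quantify a trend of this kind is an ordinary least-squares line through the scatter plot. The sketch below is an assumption about how such a trend could be measured, not the method used in the study. The x values stand for a time value per disc (for example, a date expressed in fractional years), and the sample numbers are invented for illustration.

```python
def least_squares(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Invented time values and error counts, for illustration only.
    times = [0.0, 1.0, 2.0, 3.0]
    errors = [100.0, 160.0, 150.0, 240.0]
    slope, intercept = least_squares(times, errors)
    print(f"errors change by about {slope:.1f} per unit of time")
```

A positive slope indicates more errors over time, but a single fitted line says nothing about the spread around it, which in a collection with wide disc-to-disc variation may be the more important finding.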
Another conclusion from the data was that, although there appeared to be a general trend toward more errors over time, there was also wide variation from one CD to the next. This drastic day-to-day variation in CD-DA errors was an important factor to consider. Fig. 3 shows sample screenshots of the BLER, E22, and E32 error reports that Plextools Utilities generated for CD-DAs.
UNDERSTANDING THE MEANING OF ERROR REPORTS
Measuring the collected error rate data against IASA standards can lead to some conflicting conclusions about the general health of CD-DAs. Strictly speaking, the NYPR CD-DAs had failed to meet the IASA recommended standards, but practically speaking, most degraded discs were playable on professional CD audio decks. Perhaps more importantly, in a majority of cases, the archives staff was able to successfully create WAVE files from some of the most heavily damaged discs.
This distinction between the ability of CD players to play back serviceable audio streams from degraded discs and the integrity of a disc as an archival container can be complicated. The Archives was surprised at how often error tests would reveal large clusters of uncorrectable errors, yet those errors were undetectable by the human ear upon playback. When audio disc playback equipment hides errors by concealment methods like interpolation, muting, and duplication, it can make for better listening, but it can also mask disc instability. As a result, an audio disc could sound perfectly fine one day and fail to function the next because error rates have suddenly exceeded a threshold of playability.
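To make the concealment idea concrete, the sketch below applies the three strategies named above to a list of PCM sample values in which some positions are flagged as unrecoverable. It is a simplified illustration with a hypothetical function and arguments, not a description of how any particular player or ripping tool implements concealment.

```python
def conceal(samples, bad, method="interpolate"):
    """Return a copy of samples with the flagged positions concealed.

    samples: integer PCM values for one channel
    bad: set of indexes whose values could not be recovered
    method: "mute", "duplicate" (hold the last good value), or "interpolate"
    """
    if method not in ("mute", "duplicate", "interpolate"):
        raise ValueError(f"unknown method: {method}")
    out = list(samples)
    n = len(out)
    i = 0
    while i < n:
        if i not in bad:
            i += 1
            continue
        # Find the end of this run of consecutive bad samples.
        j = i
        while j < n and j in bad:
            j += 1
        prev_val = out[i - 1] if i > 0 else 0      # last good value before the run
        next_val = out[j] if j < n else prev_val   # first good value after the run
        run = j - i
        for k in range(run):
            if method == "mute":
                out[i + k] = 0
            elif method == "duplicate":
                out[i + k] = prev_val
            else:
                # Straight line between the good neighbours on either side.
                step = (next_val - prev_val) * (k + 1) / (run + 1)
                out[i + k] = round(prev_val + step)
        i = j
    return out


if __name__ == "__main__":
    damaged = [0, 10, 999, 999, 40]   # indexes 2 and 3 are unrecoverable
    for m in ("mute", "duplicate", "interpolate"):
        print(m, conceal(damaged, {2, 3}, m))
```

In each case the output is a plausible-sounding waveform rather than the original data, which is why a concealed stream can play back cleanly while the disc itself is failing.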
At the end of the project, the archives found that CD-DAs were failing far sooner than the shelf life predicted by previous studies. We were surprised both by how badly some audio discs had failed and by the widely varying error reports from one disc to the next. The Archives’ ability to extract audio from badly damaged discs only seemed to belie the urgency of their preservation. By the end of the study, we came away more concerned about the fragile nature of CD-DAs.
One of the hurdles for small archives working with large CD-DA collections is that highly accurate testing devices are expensive. High-quality production testers can cost between $30,000 and $50,000. Additionally, IASA advises against the low-cost tools, like Plextools Utilities, that were used in this study (Bradley 2009). Plextools Utilities itself also includes a disclaimer stating that results using the tool can vary. This warning appears upon startup of the software: “The results of the tests may differ from system to system, and should always be viewed in context, like test environment, used hardware, software, media, etc.”
This study does not aim to diminish or ignore the warnings and recommendations of IASA or Plextools; rather, the Archives believes that despite the potential for inaccurate results, there are still benefits to using low-cost solutions in any preservation workflow. For example, during the testing process, Plextools could immediately identify a CD-DA that was either blank or had not been finalized during the initial burn. The utility could also identify CD-DAs that contained very large bursts of errors. Perhaps most importantly, a highly accurate error report generated by an expensive analytics tool would not have affected the preservation action plan, which would always have been to extract the audio using the most robust methods available.
There is no commercial solution currently on the market that offers an audio archivist both robust error testing of CD-DAs and simultaneous batch audio disc extraction. The NYPR Archives would like to explore the development of such a tool. The archives would also like to launch a study comparing multiple audio disc testing solutions across several price ranges to determine exactly how much error results vary between CD-DA analytic tools.
1. Compact disc digital audio (CDDA or CD-DA) is the format for audio compact discs. There are many format specifications of compact discs, each with its own preservation considerations. This paper refers to audio compact discs, discs, and CD-DAs throughout. All are different terms describing the same format. In the radio industry, an aircheck is generally a demonstration recording, often intended to show off the talent of an announcer or programmer to a prospective employer, but mainly intended for legal archiving purposes rather than broadcast.
2. There is a general consensus among audio archivists that Plextor-brand drives, especially drives manufactured in the US prior to 2006, have better functionality than those of competing manufacturers. It is also the general consensus that CD-only drives, rather than drives that handle multiple optical disc formats, offer better read/write functionality for CD-DA preservation. “Ripping” is an informal but commonly used term to describe digital audio extraction (DAE).
3. Lower read speeds can provide more accurate rips. Plextools offers substantially more flexible options than the Ripstation audio extraction software, including the ability to rip in real time, and a setting to recover the best bytes and least errors per sector. Quality extraction software uses a complex matrix of methods to ensure accurate reads across a given disc.
Bradley, K., ed. 2009. Standards, Recommended Practices and Strategies: Guidelines on the Production and Preservation of Digital Audio Objects. IASA-TC04. 2nd ed. www.iasa-web.org/tc04/audio-preservation (accessed 09/01/15).
Bradley, K., ed. 2009. Guidelines on the Production and Preservation of Digital Audio Objects. IASA-TC04. 2nd ed. www.iasa-web.org/tc04/errors-life-expectancy-and-testing-and-analysispreservation (accessed 09/01/15).
Pohlmann, K. 2000. Principles of Digital Audio. New York: McGraw-Hill Professional.
Wade, J.A., and M. Youket. 2012. Characterizing Optical Disc Longevity at the Library of Congress. The Electronic Media Review 1:97–105.
Zheng, J., and O. Slattery. 2007. NIST/Library of Congress join optical disc longevity study. www.loc.gov/preservation/resources/rt/NIST_LC_OpticalDiscLongevity.pdf (accessed 09/01/15).
Baert, L., and L. Theunissen, eds. 1998. Digital Audio and Compact Disc Technology. London: Focal Press.
Duryee, A. 2014. An Introduction to Optical Media Preservation. http://journal.code4lib.org/articles/9581 (accessed 01/09/15).
Pohlmann, K. 1989. The Compact Disc: A Handbook of Theory and Use. Madison: A-R Editions, Inc.
New York Public Radio
160 Varick Street
New York, NY 10013