Digital Repositories

U Kentucky Goes Digital with Thousands of Oral Histories

During his time as governor of Kentucky in the late 1960s, the late Louie B. Nunn decided to fund a project for the University of Kentucky Libraries. The endowment was for the collection of non-partisan oral histories, and the result was the University of Kentucky Libraries Louie B. Nunn Center for Oral History.

Since its inception, the oral history center has amassed nearly 8,000 interviews. These are stories that often focus on Kentucky--its history, politics, authors, military, geography, and more. Interviews include the famous (Martin Luther King Jr., Jacqueline Kennedy Onassis, Stan Musial, Robert Penn Warren) and the not-so-famous from all walks of Kentucky life. These are not only precious recordings of personal stories that journal many aspects of the state; the recordings are also used by scholarly researchers. Historians, folklorists, anthropologists, linguists, armchair politicians, and horseracing buffs alike have found much to explore in this large, prestigious repository.

Over the years, new oral histories were increasingly recorded in digital formats. As the Nunn Center planned how to manage the digital files, its staff realized they also needed to address the enormous number of interviews stored on unstable media.

"Magnetic tape can degrade very fast," said Doug Boyd, director of the Louie B. Nunn Center. "Digitization for preservation was critical. It was a race against time."

Some interviews were even stored on cassette tapes, which can degrade in as little as 10 years. So the Nunn Center and UK's Digital Programs Department teamed up on the project. The goal was not only to convert the analog recordings to digital media but also to find a way to store them economically, make them accessible to researchers around the world, and identify the technologies that could make those goals happen.

"Four folks put their heads together on how to preserve and disseminate oral histories," Boyd said. "We especially focused on ease of access--'let's make it usable and exciting and fun.'"

To that end, search functionality was crucial. The team decided the best approach was to identify a scalable solution that connected transcripts, audio/video, and the metadata with each other. Users want to be able to search both text and audio. The trick was how to get to the audio media from a full text search.

The idea was to create Web files for end-user access. But access to digital audio files alone wasn't going to be enough. Searching would also require metadata describing the interviews and the context in which they took place. Full-text transcripts would serve both research and search, allowing the fullest possible use of the content.

The process began with encoding existing, finalized transcripts. An interface was created that presents the full text of each transcript and displays time markers within it. These markers were then hot-linked to provide access points to the related audio segments.

"The interface allows users to search a word (in a single interview or in a collection) and takes them to that moment in the transcript, in a one-minute 'chunk,' so you're never more than 59 seconds away from your search word," said Boyd.
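The one-minute chunking Boyd describes can be sketched in a few lines of Python. This is only an illustration of the idea, not the Nunn Center's code (the production system used PHP and MySQL): the `Chunk` type, the function names, and the sample transcript are all hypothetical, and it assumes each transcript has already been split at its one-minute time markers.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    """One transcript segment, keyed by its time marker (hypothetical type)."""
    start_seconds: int  # offset of the time marker into the recording
    text: str           # transcript text spoken during this minute


def search_chunks(chunks, word):
    """Return the start offset of every chunk that contains `word`.

    Matching is case-insensitive and on whole words only.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [c.start_seconds for c in chunks if pattern.search(c.text)]


def format_offset(seconds):
    """Render an offset in seconds as HH:MM:SS for display or for seeking."""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Invented sample data: one chunk per one-minute time marker.
transcript = [
    Chunk(0, "I was born in Lexington in 1932."),
    Chunk(60, "My father trained horses at the track."),
    Chunk(120, "We moved to Louisville after the flood."),
    Chunk(180, "The horses came with us, of course."),
]

hits = search_chunks(transcript, "horses")
print([format_offset(s) for s in hits])  # ['00:01:00', '00:03:00']
```

Because each hit resolves to the start of a one-minute chunk, a player that seeks to the returned offset lands less than a minute before the search word.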

When it came to transferring the analog to digital, one of the toughest decisions they had to make was which resolution to use for the recordings. The spoken word has fewer quality requirements than does music.

Storage was also a major consideration because they needed to store multiple copies of the digital files for backups. After making and editing the master file, they made four copies of each master and stored one copy on an external server and two on 1.2 TB external hard drives. The fourth is archived to DVD.
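The link between recording resolution and storage cost comes down to simple arithmetic for uncompressed audio. The sketch below is illustrative only: the article does not say which sample rate or bit depth the team chose, so the two settings compared here are assumptions, as is the function name `pcm_bytes`.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size in bytes of uncompressed PCM audio (ignores file-header overhead)."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds


HOUR = 3600   # seconds in a one-hour interview
COPIES = 4    # the article mentions four copies of each master

# Assumed settings for comparison; neither is confirmed by the article.
settings = {
    "44.1 kHz / 16-bit stereo": (44_100, 16, 2),
    "96 kHz / 24-bit stereo": (96_000, 24, 2),
}

for label, (rate, depth, channels) in settings.items():
    per_copy = pcm_bytes(rate, depth, channels, HOUR)
    total_gb = per_copy * COPIES / 1e9
    print(f"{label}: {per_copy / 1e9:.2f} GB per hour, "
          f"{total_gb:.2f} GB across {COPIES} copies")
```

Under these assumptions, the higher resolution takes roughly three times the space per interview hour, and the gap is multiplied by every backup copy.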

The team selected the University of Michigan's DLXS content management system for the online interface. UK had used this system long enough to know it had a good interface and could deliver access to very large collections. Another benefit is that the data structures are XML-based and portable to other systems and metadata formats.

The basic technologies used included XHTML; AJAX for the syncing interface; MySQL for the dataset that stores interview progress state, authentication, and metadata; and PHP for pulling data from the database and delivering it in the right format.

Hosting is done by the Kentuckiana Digital Library, which stores a range of historic information from various libraries and collections around the state. As a result, the Nunn Center was able to forgo the financial and time investments associated with setting up and maintaining its own servers.

"The sky's the limit for potential applications," said Boyd. "You can take a video of anything [and] index it, and keywords can sync to important points in the content."

He said the results are right on target. "We get amazing feedback from researchers, thanking us for making their work easier, giving them such a precise way of dealing with oral history."
