Interview

Digital Repositories: A Global Work Effort

A brief interview with Michael Keller

10/10/2007

Stanford University librarian Michael Keller will join other leading digital archiving experts November 14-16 in Paris for the inaugural meeting of the Sun Preservation and Archiving Special Interest Group (Sun PASIG), a group dedicated to working on the unique problems of storage and data management, workflow, and architecture for very large digital repositories. The Sun PASIG brings together a large group of organizations for an ongoing global discussion of their research and sharing of best practices for preservation and archiving. Here, CT asks Keller for his perspectives on the effort and the goals of the Sun PASIG.

 

CT: What has the Sun PASIG set out to accomplish? Could you comment on its work from your own institution’s perspectives and needs?

Keller: More than 10 years ago we in the library profession began to realize that we had to take responsibility for preserving—both for the long term and for access—the digital objects that were coming to us in increasing waves and increasing numbers of flows from varying sources.

Over those 10 years a lot of developments occurred and a lot of projects started, but none of them were particularly large-scale—at least the ones that anybody can talk about. We know that the government and the secret agencies are doing a lot of big-scale gathering, but we don’t know whether they are preserving anything. [So, we need] technology both in the form of software and hardware that can manage across very complex hardware arrays, but also ingest across a very wide variety of data formats and what we might call digital genres.

At Stanford, we recognized about five or six years ago that the university was producing on the order of 40 terabytes per year, and consuming on the order of 40 terabytes per year, of various kinds of digital information. And of course that number has only increased in the intervening five years. So starting four years ago we acquired a big tape robot and some spinning disk that were intended to help us understand how to manage the huge flow of digital objects that we need to ingest and preserve…

What was missing was a very effective spinning disk technology. We’d experimented with a few of them, and frankly, for various reasons there were points of failure that revealed themselves in operation. In the case of Honeycomb, we’ve tested it very extensively. We’ve subjected it to all the same stress tests and the same experiences as other technologies, and it’s proven to be quite robust.

That said, we know that we have to have a combination of magnetic disk technology, near-line tape storage, and off-line tape storage—which we will have to carefully manage for the very long term, until we see different technologies becoming available to us or different technologies becoming more robust and more appropriate to the various missions we have set for ourselves.


