Digital Repositories: A Global Work Effort
Stanford University librarian Michael Keller will join other leading digital archiving experts November 14-16 in Paris for the inaugural meeting of the Sun Preservation and Archiving Special Interest Group, a group dedicated to working on the unique problems of storage and data management, workflow, and architecture for very large digital repositories. The Sun PASIG brings together a large group of organizations for an ongoing global discussion of their research and sharing of best practices for preservation and archiving. Here, CT asks Keller for his perspectives on the effort and the goals of the Sun PASIG.
CT: What has the Sun PASIG set out to accomplish? Could you comment on its work from your own institution’s perspectives and needs?
Keller: More than 10 years ago we in the library profession began to realize that we had to take responsibility for preserving—both for the long term and for access—the digital objects that were coming to us in increasing waves and increasing numbers of flows from varying sources.
Over those 10 years a lot of developments occurred and a lot of projects started, but none of them were particularly large-scale—at least the ones that anybody can talk about. We know that the government and the secret agencies are doing a lot of big-scale gathering, but we don’t know whether they are preserving anything. [So, we need] technology both in the form of software and hardware that can manage across very complex hardware arrays, but also ingest across a very wide variety of data formats and what we might call digital genres.
At Stanford, we recognized about five or six years ago that the university was producing on the order of 40 terabytes per year, and consuming on the order of 40 terabytes per year, of various kinds of digital information. And of course that number has only increased in the intervening five years. So starting four years ago we acquired a big tape robot and some spinning disk that was intended to help us understand how to manage the huge flow of digital objects that we need to ingest and preserve…