Digital Library Initiatives >> Next-Gen Libraries -- Campus Technology

By Matt Villano
02/03/06

As digital repositories continue to evolve, keep your eye on these projects. They may serve as the models for your own digitization initiatives. Read on!

Keeping a history of peace studies. Cataloging audio and video broadcasts important to the Northwest. Digitizing images and other documents older than the state of California. All of these efforts are important steps toward preserving the history of our nation’s development. And all of them are going on right now, thanks to various digital library efforts. Digital library initiatives are nothing new; the effort to digitize data for posterity has been alive and well for the better part of two decades. (See “Opening a Digital Library,” Campus Technology, Sept. 2005.) Still, the world of digital libraries changes furiously every month, and it seems there’s always something new to explore. Here are details on some of the newest stories, and predictions about the digital library movement in the months to come.

Teaching Peace

When it comes to modern-day religions, Quakerism isn’t exactly the first faith that comes to mind. Still, in today’s era of war and strife, the peace-loving Religious Society of Friends that was founded in England in 1660 is alive and well, its tenets taught mostly at three Indiana institutions: Goshen College, Earlham College, and Manchester College. In order to make their collections more accessible, the three schools banded together in 2003 to digitize local archives centered on peace studies. Today, the collection includes organizational records, diaries, and specialized periodicals. The effort has been dubbed the Plowshares Project.

Plowshares began out of a drive for the colleges to more closely align their resources. Earlham is Quaker, Goshen is Mennonite, and Manchester is Brethren, and these three churches traditionally comprise the Historic Peace Churches. Indiana is the only state in the country that has colleges from all three denominations, so it seemed fitting that the three should join forces and pool materials. Tom Kirk, library director and coordinator of Information Systems at Earlham, explains that the schools had met first in 2004 to discuss ways in which they could collaborate. Months later, a $13.3 million grant from the philanthropic Lilly Endowment (www.lillyendowment.org) set the plan in motion.

“The project’s focus was not on the digital library per se but on this larger institutional collaboration, of which one part was the library,” says Kirk. “Each of the three campuses is selecting materials from its own archives that reflect the history of the denominations’ work in the areas of peace and social justice.”

As Kirk explains, these materials comprise the very heart of the project itself. Over the course of eight months of image scanning in 2005, the schools added thousands of images, pages, and documents from just about every era since the peace movements began. By the time the project launched formally last month, it boasted 40,000 items, and has added even more since then. Much of the scanning has been performed by Orem, UT vendor Backstage Library Works (www.marclink.com). In addition, each school has farmed out certain scanning projects to full- and part-time staffers, including students who are paid by the hour.

“We’re hoping we can communicate with just about any digital library in the world.” —Jim DeRoest, University of Washington

Still, the project hasn’t been without its challenges. At Goshen College, Library Director Lisa Guedea Carreno says the schools have been having some trouble ironing out copyright issues as they attempt to scan items for which rights have reverted to someone other than the original holder. Another big problem has been the quality and uniformity of the metadata, which she identifies as the information used to describe the items in the library itself (see “Rethinking Metadata,” below). As Carreno outlines the problems with the project so far, she is careful to note that ultimately, the benefits of Plowshares will far outweigh all of the struggling that participating schools have endured thus far.

Rethinking Metadata

A library is nothing without access. Even digital libraries need “card catalogs” of a sort to give users a sense of what’s there and how to find it. For most digital libraries, virtual catalog cards exist in the form of metadata, or data about the content inside the library itself. As digital libraries have become increasingly sophisticated, however, creating worthwhile metadata has become a challenge.

For starters, there is the issue of having accurate metadata. If a school errs in cataloging a particular image or document, that item could be lost forever in the digital ether until someone happens to stumble on it. Then there is the question of uniformity in metadata; if one school uses one set of metadata conventions and another school uses another, the two schools cannot combine libraries without a major programming initiative to reconcile the differences.

“It’s ‘Garbage in, garbage out,’” says Lisa Guedea Carreno, library director at Goshen College (IN). “If you put in inconsistent things on the input side, the output side will be inconsistent, too.”

Fortunately, librarians nationwide are set to improve the metadata situation. Some have rallied around the Dublin Core Metadata Initiative (DCMI; www.dublincore.org), an open forum that revolves around the “extended Dublin Core model” of information—author, title, publisher, and 12 other points of identifying information. Others have called this approach too vague, and are flocking to the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH; www.openarchives.org), new standards being pushed by the Digital Library Federation (DLF; www.diglib.org) in Washington, DC.
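To make the idea of a shared element set concrete, here is a minimal sketch of how one item might be described with the 15 core Dublin Core elements and written out as XML. The sample item, the `record` wrapper element, and the `build_dc_record` helper are hypothetical and are not drawn from any of the projects described here. Note that the element the sidebar calls “author” is named `creator` in the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 core Dublin Core elements.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def build_dc_record(fields):
    """Return an XML string describing one item with Dublin Core elements.

    `fields` maps element names to values. Names outside the core set
    are rejected so that every record uses the same vocabulary.
    """
    unknown = set(fields) - DC_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")

    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical item, for illustration only.
sample_item = {
    "title": "Meeting minutes (hypothetical item)",
    "creator": "Unknown",
    "publisher": "Example College Archives",
    "date": "1925-04-12",
}
record_xml = build_dc_record(sample_item)
print(record_xml)
```

Rejecting unknown element names at input time is one simple way to keep two schools’ records compatible: a field that one cataloger labels “author” fails immediately instead of surfacing later as a mismatch.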

According to Barrie Howard, program associate and principal investigator, the DLF approach consists of 21 identifiers, many of the same characteristics found in the Dublin Core. The difference is that the DLF approach also focuses on how information is written. An example: dates. Because schools now catalog the date an object was digitized in a variety of formats, DLF seeks to standardize that information. Solving such vagaries may end the metadata conundrum.
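The date problem can be illustrated with a short sketch. It assumes, purely for illustration, that the target is the ISO 8601 form `YYYY-MM-DD`; the article does not say which format DLF recommends, and the list of input formats below is hypothetical.

```python
from datetime import datetime

# Hypothetical input formats that different catalogers might use.
# Month names are parsed according to the current locale (English by default).
KNOWN_FORMATS = (
    "%Y-%m-%d",   # 2006-02-03
    "%m/%d/%Y",   # 02/03/2006 (assumes month-first order)
    "%B %d, %Y",  # February 3, 2006
    "%d %b %Y",   # 3 Feb 2006
)


def normalize_date(raw):
    """Return `raw` as an ISO 8601 date string, or None if unrecognized.

    Each known format is tried in turn; the first one that parses wins.
    """
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


for value in ("02/03/2006", "February 3, 2006", "3 Feb 2006", "sometime in 2005"):
    print(value, "->", normalize_date(value))
```

Returning `None` rather than guessing leaves ambiguous entries visible for a cataloger to review, which fits the “garbage in, garbage out” concern raised above.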

“Metadata enables resource discovery, but that’s why it’s such a big deal,” Howard says. “There’s a lot of digital scholarship that’s going on today, but if an item or piece of content can’t be found, all the metadata in the world isn’t worth a thing.”

“These are distinctive collections,” Carreno says of the Plowshares materials. “No matter what we have to go through to get them online, it’s important they’re up there so everyone can use them down the road.”

Going to the Well

While most digital repositories exist to store small, static files such as images and documents, a new project at the University of Washington has an entirely different purpose: storing big, bulky audio and video files. The DigitalWell project has been under development since 2000, when technologists set out to build a system to catalog video broadcasts from ResearchChannel (www.researchchannel.org), a media and technology organization, as well as from UWTV, the school’s television station, and audio broadcasts from the school-sponsored KEXP radio station. Jim DeRoest, director of Streaming Media with the Streaming Media, Video and TV Technologies division, has overseen it all.

Today, to hear DeRoest describe “DigitalWell” is to listen to a proud father chatting about his child. He explains that after the school came up empty in a search to find an off-the-shelf program to manage its audio and video files, UW technologists got to work building their own. The result represents a standard central computing model with clustered servers that scale as demand requires. Thanks to a proprietary underpinning system called the Storage Resource Broker (SRB), a middleware system built by the San Diego Supercomputer Center at the University of California at San Diego, DigitalWell is also interoperable with other digital library systems across academia.

“Using SRB as an underpinning makes it possible to interact with digital libraries anywhere in the world,” says DeRoest, who notes that with the help of this technology, UW already has been able to trade video content with the University of Canberra in Australia.

What makes SRB so compatible is the language in which it is written. Because the original DigitalWell architect had come from Microsoft (www.microsoft.com), the project was built on the Windows Operating System using .NET technology. Since then, however, versions of the initial prototype have been developed in a variety of computing languages and forms. In September, ResearchChannel and the Coalition for Networked Information (www.cni.org) began discussing the release of an open source version of the project to broaden the number of schools and other institutions that can take advantage of the technology.

With this larger effort in full swing, DeRoest has redoubled his efforts to roll out DigitalWell on his home campus at UW. The university’s School of Music has been itching for a way to digitize its ethnomusicology collection, and DigitalWell would be a perfect fit. The astronomy department has some deep-space videos that they’re hoping to get into the system, as well. And the School of Nursing has expressed a desire to convert video coverage of lectures. Then, of course, there’s the forestry department, which has inquired about collaborating with the University of São Paulo in Brazil, to digitize video content about old-growth forests and how forest fires start.

“So long as we’re preserving this stuff, we’ll look at anything and everything,” he says. “If it’s a string of bits, that’s about all that matters for us.”

The Real Enabler: Information Literacy

The flourishing digital library scene boasts great potential for widespread resource sharing and increasingly sophisticated search, retrieval, and delivery mechanisms. But, as with most technology-enabled applications, the real promise lies in just how well the technologies are used by the communities they are intended to serve. True, educational programs can be put in place to bring individuals up to speed on how to search for and access digital materials. But there are broader issues to be addressed as well, as faculty, administrators, and librarians examine the overall effects of the digital age on their institutions.

A few years ago, people spoke of “computer literacy,” which usually meant a basic familiarity and comfort level with a personal computer and mouse. Today, “information literacy” is a call to action that cuts across departmental and organizational lines and encompasses a range of issues, from access, to intellectual property, to professional roles.

Students need to be taught how to discriminate among digital search results and be aware of copyright concerns. Faculty members and librarians must decide how to work together and examine whether their roles need to be redefined: Should the librarians and technologists work alongside the faculty in designing and teaching certain courses, in order to cover information literacy effectively and make good use of digital library resources? Are instructors taking advantage of pedagogies that incorporate digital materials? And there just may be a physical aspect to the digital library after all: Learning and study spaces in libraries and other common areas are beginning to be designed for highly connected and collaborative next-gen students, our digital natives. Finally, ask: How can you assess information literacy on your campus?

Institutions are now seeing that they need well-thought-out information literacy programs. To help with the planning process, the Council of Independent Colleges (www.cic.org) runs a series of workshops on “Transformation of the College Library.” The series covers the questions and issues above and much more, and is a great place to start or keep plugging for information literacy on your campus (www.cic.org/conferences_events/index.asp).

The Big Boys

Just as the Historic Peace Churches and the University of Washington are blazing new trails for storing certain types of data, larger digital repository efforts at the University of Virginia and the University of Michigan are innovative in similar ways. At UVA, University Librarian Karin Wittenborg says her staff has worked to digitize dozens of special collections of old images and manuscripts to preserve them for generations to come. Highlights of the collection include photos from the American Civil War, documents from railroad history, and unique documents from the Appalachian heyday.

These special collections are only one portion of UVA’s digital library effort. The bulk of the library, including the school’s famous 1,700-item Thomas Jefferson Digital Archive, consists of “e-texts,” or scanned content in a variety of languages that is searchable by anything from topic to specific characters and words. According to Wittenborg, a key principle behind the e-text project was to build a community of users rather than simply convert or acquire digital texts. Over time, the texts were tagged using Standard Generalized Markup Language, or SGML (now eXtensible Markup Language, or XML), to make them easier to search.

“We were fast out of the gate, kept our concept for the electronic center simple, and focused on user-centered collections,” Wittenborg says. “We wanted to create digital communities and to involve faculty and departments from the outset.”

A similar effort is underway at the University of Michigan, home of the digital collection, “Making of America.” The collection, accumulated over the last five years, has been drawn from images, documents, and other 19th-century publications in the UM libraries. Another big draw is the school’s digital archive of non-Euclidean mathematics monographs from the same period. The collection includes a variety of content types from the 1800s: science, business, and more. Perry Willett, director of the school’s Digital Library Production Service, admits the collection is a “hodgepodge,” saying the school digitizes what it can, when it can, any way it can.

Because Willett’s department has managed to digitize so much, it recently has capitalized on the software behind the effort by offering it for use at other schools. The open source program, Digital Library eXtension Service (DLXS), is designed for schools to create digital collections of text and images and other kinds of structured data. Similar to the DSpace effort at the Massachusetts Institute of Technology, DLXS is stand-alone content scanning and management software that can help schools without digital libraries set them up quickly. As Willett explains, the more digital libraries UM can facilitate down the road, the better.

“There isn’t that much commercial software available for digital libraries, so we continue to develop our version, adding functionality and changing the architecture,” he says. “In our field, where things are always changing, you can never spend too much time or energy getting your resources online.”

About the Author

Matt Villano is senior contributing editor of this publication.