The Digital Library Triumvirate: Content, Collaboration, and Technology

A single point of access to digital materials for historical and cultural scholarship, the Japanese American Relocation Digital Archive (JARDA) is a model of a collaboratively constructed gateway to the documentation of an era.

Few events of World War II had as dramatic an effect within the United States or upon a population as did Executive Order #9066, which resulted in the evacuation and internment of thousands of Japanese Americans. The War Relocation Authority (WRA), the agency created to assume jurisdiction over the evacuees, controlled the relocation centers, alternatively labeled "relocation camps," "concentration camps," or "evacuation centers." These WRA camps housed more than 120,000 Japanese Americans for over four years.

The experiences of the adults, young adults, and children (the majority of whom were U.S. citizens) in internment camps during the war are richly documented in the holdings of several California repositories. Among the 10,000 images and 20,000 pages of electronic transcriptions of documents and oral histories are diaries, letters, drawings, and photographs from internees at the 11 camps. Also included are camp newsletters, final reports, photographs, and other documents relating to the day-to-day administration of the camps by the WRA.

In September 1998, the members of the California Digital Library's Online Archive of California (OAC) identified the need to develop digital archives for "thematic collections," and thus was born the Japanese American Relocation Digital Archive (JARDA). As with several CDL projects, JARDA is supported by both university and grant funds. JARDA development was supported by the U.S. Institute of Museum and Library Services under the provisions of the Library Services and Technology Act, administered in California by the State Librarian.

Focus on Collaboration

Academic digital library initiatives range from the digitization of specific collections of materials to elaborate state, national, or international organizations devoted to developing innovative technologies for providing information services. Both modest and large-scale projects have high-quality digital content—and its selection, design, management, preservation, and tools to ensure its discovery and availability—at their core. A scan of the activities of the 20+ members of the Digital Library Federation provides an informative overview of the scope of digital library activities (see www.diglib.org/).

To that focus on high-quality content, the California Digital Library has added a particularly strong focus on collaboration. From its inception in 1997 as a "co-library" of the ten campuses of the University of California, the CDL has embraced collaboration both within and well beyond the university as essential to the pursuit of its goals. Collaboratively designed and built, JARDA serves as a gateway to the personal and institutional documentation of an era and also to scholarship mined from newly available digital cultural heritage materials.

JARDA collaborators currently include the California Historical Society; the California State Archives; California State University at Fullerton; the Japanese American National Museum; UC Berkeley's Bancroft Library; UCLA's Young Research Library Department of Special Collections; the University of the Pacific (UOP); and the University of Southern California (USC). These institutions, along with nearly forty others, are all members of the CDL's Online Archive of California (www.oac.cdlib.org/), of which JARDA is but one component.

"Selection of the sites of the centers considered farming opportunities to assist in the sustenance of the evacuees without drawing on the outside farm production. In nearly all cases it required subjugation of untried farming land, clearing of brush and rocks, building canals and irrigation ditches, turning over new soil."

—Robert B. Cozzen, assistant director, War Relocation Authority, February 26, 1946. Photographer: Dorothea Lange. Manzanar, California

Content: Primary Source Materials

To build JARDA, the CDL began with a survey of primary source materials in California-based libraries and archives. The survey identified the heavily used Japanese American internment collections throughout the state as a high priority for digitization. Curators at all Online Archive of California (OAC) repositories reported that Japanese American internment collections are among the most-requested materials, but the fact that they are so widely scattered has presented a serious discovery-and-access problem.

The JARDA gateway provides a single point of access for college and university students and faculty, students and teachers in kindergarten through high school, and the general public. Both the UCLA and UC Berkeley Libraries are actively engaged in projects that have demonstrated the importance of incorporating primary sources into the curriculum. These projects have confirmed the need to make Japanese American Relocation collections more accessible to the community. The OAC's statewide union database of finding aids provides the means to integrate collections in a Web-accessible "virtual archive," forming the premier source of digital documentation on the subject.

JARDA is the first thematic collection added to the larger Online Archive of California. Building thematic "virtual collections" (other California cultural communities are also under consideration) is a deliberate strategy, intended not only to make high-demand materials available in digital form but also to encourage new analysis and scholarship drawn from those materials. The ideal digital library leverages collaboration to provide access to high-quality content and also encourages, indeed enables, the creation of new scholarship. The CDL's eScholarship program (http://escholarship.cdlib.org/), which supports innovation in scholarly communication, is based upon that belief. eScholarship is working with UC Press and JARDA participants to identify new scholarly publications associated with Japanese American Relocation materials.

The CDL and its partners are enthusiastic about the scholarship that can be supported by this single access point for geographically dispersed source materials. For example, a history student sitting in her dorm room can now compare personal memoirs, observations by WRA officials, and oral histories with the many photos and sketches of camp life from different collections.

Oral history captures what may elude other primary sources: memories, the thoughts and motivations behind people's actions, and the details of day-to-day life. It has been particularly valuable in documenting perspectives often marginalized from the mainstream historical record.

The Japanese American National Museum's Life History Program has joined several other excellent programs in the quest to recover Japanese American history through oral history. No longer is history told about them; it is told by them. Moreover, the anti-Japanese hysteria of World War II forced many families to destroy irreplaceable photographs, letters, and other documentation. These historical records would have been very valuable to researchers today. Oral history offers one way in which some of this lost history can be recovered.

That night we all felt as if we were in still having a nightmare. Obasan called and told about what was happening in L.A. That night we all went to sleep wondering what was going to happen to us. Little did I know then that one year from then I would be in Heart Mt. Wyo. in a evacuation camp.

— From the diary of Stanley Hayami, a Japanese American student from Los Angeles who attended high school at the Heart Mountain concentration camp in Wyoming, Japanese American National Museum

Technology Infrastructure

In addition to the JARDA Web site, access to the digital content will be available through catalog records in UC's Melvyl union library catalog and by collection inventories in the OAC. These inventories, or finding aids, are created with an SGML document type definition called the Encoded Archival Description (EAD). The JARDA electronic texts have been marked up according to the Text Encoding Initiative (TEI) standard. The use of cataloging and structured text markup standards reflects the CDL's strong belief in the use and promulgation of standards to assist both technological innovation and long-term access and preservation.
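To give a sense of what such structured markup makes possible, the following is a minimal sketch in Python that reads a heavily simplified finding-aid fragment and lists its series. The element names (ead, archdesc, did, unittitle, dsc, c01) come from the EAD tag set, but the fragment is an assumption made for illustration: it omits the header and most required detail, its titles are hypothetical, and it does not reproduce any actual OAC finding aid.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified EAD-style fragment (not a real finding aid).
SAMPLE_FINDING_AID = """<ead>
  <archdesc level="collection">
    <did><unittitle>Hypothetical relocation records collection</unittitle></did>
    <dsc>
      <c01 level="series"><did><unittitle>Camp newsletters</unittitle></did></c01>
      <c01 level="series"><did><unittitle>Final reports</unittitle></did></c01>
      <c01 level="series"><did><unittitle>Photographs</unittitle></did></c01>
    </dsc>
  </archdesc>
</ead>"""


def list_series(ead_xml):
    """Return the collection title and the titles of its first-level components."""
    root = ET.fromstring(ead_xml)
    collection_title = root.findtext("archdesc/did/unittitle")
    series_titles = [
        component.findtext("did/unittitle")
        for component in root.iterfind("archdesc/dsc/c01")
    ]
    return collection_title, series_titles


collection_title, series_titles = list_series(SAMPLE_FINDING_AID)
print(collection_title)
for title in series_titles:
    print(" -", title)
```

Because the hierarchy is explicit in the markup, the same few lines would work on any finding aid that follows the same structure, which is the practical argument for shared encoding standards.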

By virtue of its reliance on markup standards, technology to store and present structured text and images, and distributed production of materials, JARDA is also a proving ground for building and managing the technological infrastructure to support digital libraries. For example, the CDL has established image and metadata standards that partners must meet. Creating a flexible architecture for the JARDA collection, one that can manage different types of digital objects and allow the possibility of federating the material with still other distributed collections, including those outside California, is also a goal.

"Field with Stumps." Photographs Rohwer Relocation Center. Yoshikawa Family Collection, University of the Pacific.

Envisioning an Online Future

Of course the CDL vision of technologically federated collections of high-quality research and source materials is shared well beyond California. The U.S.-based Digital Library Federation (www.diglib.org), of which the CDL is a member, is largely motivated by this vision. A similar dream is arguably driving the National Science Foundation's multi-million dollar, internationally connected Digital Library Initiative (www.dli2.nsf.gov/) and the many research projects it supports.

In fact, the synergies that emerge from the interactions among these programs and projects, even those of modest proportion such as JARDA, are grander than what may first be inferred from the notion of federated library collections. These projects attempt to coordinate interoperability between entire collections, to investigate such possibilities as self-describing collections and visually based search results and navigation, and to provide access to the "dark matter" buried in online multimedia collections not reached by the surface of the Web and its search engines. When those attempts are combined with fundamental network developments such as Internet2, the "digital library" domain actually emerges as a defining force in the future of an online world.

Digital Library Standards

The grandparent of digital library standards is the Machine-Readable Cataloging record, better known as the MARC format, which codifies the electronic description of books, journals, and other library holdings so that those materials can be represented in databases (see www.loc.gov/marc/ for more information). The electronic equivalent of paper-based library catalog cards in many ways, MARC underlies all online library catalogs and is the precursor to current efforts to share information efficiently and make systems interoperate.

But the pantheon of digital library standards has grown significantly since the 1960s-era origin of the MARC format. Digital library standards can be generalized to include those related to digital information objects and those related to the communication between the systems that hold the objects and the users who seek them.

Systems communication for digital libraries has come to depend upon, and assume the ubiquity of, Internet standards such as TCP/IP and HTTP. In fact, a reading of Tim Berners-Lee's intentions for the Web reveals that it was founded upon a desire to provide access to research information, a decidedly library-like goal. In addition, the digital library community has its own standards: for example, for forwarding a user query from one system to another with a different native query language, and for discovering and then "harvesting" the descriptive records a system contains (see www.openarchives.org/ for more information on the Open Archives Initiative Protocol for Metadata Harvesting).
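As a rough illustration of how harvesting rides on ordinary HTTP, the sketch below builds request URLs of the kind the Open Archives protocol defines, in which a "verb" parameter names the operation. The verb and parameter names follow version 2.0 of the protocol, which may differ from the version current when this article was written. The base URL is a placeholder and the helper function is hypothetical; no request is actually sent.

```python
from urllib.parse import urlencode

# The six request verbs defined by version 2.0 of the protocol.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_oai_request(base_url, verb, params=None):
    """Build a harvesting request URL (hypothetical helper, no network access)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown verb: {verb}")
    query = {"verb": verb}
    query.update(params or {})
    return f"{base_url}?{urlencode(query)}"


# Placeholder repository address, assumed for illustration only.
BASE_URL = "https://repository.example.org/oai"

# Ask for all records in the unqualified Dublin Core format.
harvest_url = build_oai_request(BASE_URL, "ListRecords", {"metadataPrefix": "oai_dc"})
print(harvest_url)
```

A harvester would fetch that URL, parse the XML it returns, and repeat the request with a resumption token until the repository reports no more records.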

Of course, standards for the creation and encoding of digital objects—for simplicity's sake, think of either a "born" digital object or a digital replication of a non-digital object such as a photograph, journal article, film, or sound recording—determine how they can be used and how long they will be usable. The challenges in the capture and encoding of the content itself are often addressed in the form of "best practices" for a project or institution. For example, the California Digital Library has, or is developing, standards for digital content, both to assure its quality and to increase the likelihood that the content will persist through time and changing technologies. Where possible, the CDL—and most other digital library initiatives—adopt or capitalize upon proven or promising standards from related domains. This is especially true for digital content in which structured-text standards such as SGML and XML, and multimedia standards such as MPEG and TIFF, rule.

Representing digital objects through descriptions of their content or structure, or through details of their production, ownership, or limits on use, accounts for a surprising number of standards. Because these "metadata" structures and records are necessary in order to discover, retrieve, exchange, and describe digital content, they often receive more attention than the objects themselves. Among digital library metadata standards (which include the MARC format mentioned above and others such as CORC, MOA2, Encoded Archival Description, and VRA), the Dublin Core standard stands out. Dublin Core was proposed as a standard set of descriptive elements that nearly all digital objects have in common. It comprises 15 basic descriptive elements, such as an object's creator and date of creation.
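A Dublin Core description is simple enough to sketch in a few lines of Python. The 15 element names and the namespace below are those of the Dublin Core element set. The wrapper element, the helper function, and the mapping of a photo caption onto particular elements are assumptions made for illustration and do not represent the CDL's actual records.

```python
import xml.etree.ElementTree as ET

DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"

# The 15 elements of the basic Dublin Core set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_dc_record(fields):
    """Build a simple XML record from a dict of Dublin Core element values."""
    ET.register_namespace("dc", DC_NAMESPACE)
    record = ET.Element("record")  # wrapper name is an assumption
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        if isinstance(values, str):
            values = [values]  # every element is repeatable
        for value in values:
            element = ET.SubElement(record, f"{{{DC_NAMESPACE}}}{name}")
            element.text = value
    return record


# Illustrative mapping of a photo caption onto elements (an assumption).
example = build_dc_record({
    "title": "Field with Stumps",
    "type": "Image",
    "source": "Yoshikawa Family Collection, University of the Pacific",
    "coverage": "Rohwer Relocation Center",
})
xml_text = ET.tostring(example, encoding="unicode")
print(xml_text)
```

Every element is optional and repeatable, which is why the sketch accepts either a single value or a list for each one.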

The attempt to create a universal set of descriptors and the complexity of such a seemingly straightforward task are typical of digital library standards. It is a continuation of a rich tradition of noble intent to make information available more easily.

Collaboration Benefits All Patrons

The Bancroft Library, UC Berkeley's main library for special collections, has been a key member of the California Digital Library (CDL) consortium since its inception. Holding one of the largest library collections of paper records and archival photos of the Japanese American internment experience, Bancroft was a significant contributor to the JARDA project (http://jarda.cdlib.org/). According to Theresa Salazar, curator of Bancroft Collections, Western Americana, the JARDA project demonstrates many of the benefits of collaboration.

Salazar notes that Bancroft curators sifted through millions of records to select the ones most appropriate for inclusion in the JARDA digital archive. "We have a huge collection of government records, as well as personal papers and over 7,000 photos. Our role was to select out of the big universe of documents those that were most useful to a large audience, those that make coherent sense and can stand on their own." At the same time curators worked with other libraries, looking at what they could contribute. For instance, she notes, "UCLA has a large collection of Manzanar documents, so we decided to not contribute a lot of material on that camp. Working with other libraries we came up with a well-rounded collection."

Curatorial collaboration and digitization benefit researchers who can then find many primary source documents in one convenient place, the CDL Web site. Instead of personally visiting many different libraries, students and academics can simply pore through the digital archives, using the CDL's unique inventories, or finding aids. While some people will still have to visit the actual library to examine some documents, others will find all they need in the archives. Collaborating on the project has also helped each of the libraries, says Salazar. "Some of the institutions, like the Bancroft Library, have a great deal of technical support available, but others don't," she says. "Those libraries can use the technical resources of the better-staffed institutions to get their materials included in this digital archive." Finally, she notes that working together also helps librarians discover the unique holdings of other libraries, information they can pass on to their patrons.

JARDA is the first thematic collection produced by the California Digital Library, but others are on the way. Currently in development are the California Cultures project, which will survey many groups that make up California's diverse population, and Museums and the Online Archive of California (MOAC), which will integrate more than 75,000 digital images from ten California museum collections. More information on that project is available at www.bampfa.berkeley.edu/moac/.
