The Digital Library Triumvirate: Content, Collaboration, and Technology
A single point of access to digital materials for historical and cultural
scholarship, the Japanese American Relocation Digital Archive (JARDA) is a model
of a collaboratively constructed gateway to the documentation of an era.
Few events of World War II had as dramatic an effect within the United States
or upon a population as did Executive Order #9066, which resulted in the evacuation
and internment of thousands of Japanese Americans. The War Relocation Authority
(WRA), the agency created to assume jurisdiction over the evacuees, controlled
the relocation centers, alternatively labeled "relocation camps,"
"concentration camps," or "evacuation centers." These WRA
camps housed more than 120,000 Japanese Americans for over four years.
The wartime experiences of the interned adults, young adults, and children
(the majority of whom were U.S. citizens) are richly documented in
the holdings of several California repositories. Among the 10,000 images and
20,000 pages of electronic transcriptions of documents and oral histories are
diaries, letters, drawings, and photographs from internees at the 11 camps.
Also included are camp newsletters, final reports, photographs, and other documents
relating to the day-to-day administration of the camps by the WRA.
In September 1998, the members of the California Digital Library's Online Archive
of California (OAC) identified the need to develop digital archives for "thematic
collections," and thus was born the Japanese American Relocation Digital
Archive (JARDA). As with several CDL projects, JARDA is supported by both university
and grant funds. JARDA development was supported by the U.S. Institute of Museum
and Library Services under the provisions of the Library Services and Technology
Act, administered in California by the State Librarian.
Focus on Collaboration
Academic digital library initiatives range from the digitization of specific
collections of materials to elaborate state, national, or international organizations
devoted to developing innovative technologies for providing information services.
Both modest and large-scale projects have high-quality digital content—and its
selection, design, management, preservation, and tools to ensure its discovery
and availability—at their core. A scan of the activities of the 20+ members
of the Digital Library Federation provides an informative overview of the scope
of digital library activities (see www.diglib.org/).
To that focus on high-quality content, the California Digital Library has added
a particularly strong focus on collaboration. From its inception in 1997 as
a "co-library" of the ten campuses of the University of California,
the CDL has embraced collaboration both within and well beyond the university
as essential to the pursuit of its goals. Collaboratively designed and built,
JARDA serves as a gateway to the personal and institutional documentation of
an era and also to scholarship mined from newly available digital cultural heritage
JARDA collaborators currently include the California Historical Society; the
California State Archives; California State University at Fullerton; the Japanese
American National Museum; UC Berkeley's Bancroft Library; UCLA's Young Research
Library Department of Special Collections; the University of the Pacific (UOP);
and the University of Southern California (USC).
These institutions, along with nearly forty others, are all members of the
CDL's Online Archive of California, of which JARDA is but one component.
the sites of the centers considered farming opportunities to assist in
the sustenance of the evacuees without drawing on the outside farm production.
In nearly all cases it required subjugation of untried farming land, clearing
of brush and rocks, building canals and irrigation ditches, turning over
—Robert B. Cozzen,
assistant director, War Relocation Authority, February 26, 1946. Photographer:
Dorothea Lange. Manzanar, California
Content: Primary Source Materials
To build JARDA, the CDL began with a survey of primary source materials
in California-based libraries and archives. The survey identified the heavily
used Japanese American internment collections throughout the state as a high
priority for digitization. Curators at all Online Archive of California (OAC)
repositories reported that Japanese American internment collections are among
the most-requested materials, but the fact that they are so widely scattered
has presented a serious discovery-and-access problem.
The JARDA gateway provides a single point of access for college and university
students and faculty, students and teachers in kindergarten through high school,
and the general public. Both the UCLA and UC Berkeley Libraries are actively
engaged in projects that have demonstrated the importance of incorporating primary
sources into the curriculum. These projects have confirmed the need to make
Japanese American Relocation collections more accessible to the community. The
OAC's statewide union database of finding aids provides the means to integrate
collections in a Web-accessible "virtual archive," forming the premier
source of digital documentation on the subject.
JARDA is the first thematic collection added to the larger Online Archive of
California. Building thematic "virtual collections" (other California cultural
communities are also under consideration) is a deliberate strategy not only to
make high-demand materials available in digital form, but also to encourage new
analysis and scholarship drawn from those materials. The ideal digital library
leverages collaboration to provide access
to high-quality content and also encourages, indeed enables, the creation of
new scholarship. The CDL's eScholarship (http://escholarship.cdlib.org/)
program of supporting innovation in scholarly communication is based upon that
belief. eScholarship is working with UC Press and JARDA participants to identify
new scholarly publications associated with Japanese American Relocation materials.
The CDL and its partners are enthusiastic about the scholarship that can be
supported by this single access point for geographically dispersed source materials.
For example, a history student sitting in her dorm room can now compare personal
memoirs, observations by WRA officials, and oral histories with the many photos
and sketches of camp life from different collections.
Oral history captures what may elude other primary sources: memories, the
thoughts and motivations behind people's actions, and the details of day-to-day
life. It has been particularly valuable in documenting perspectives often
marginalized from the mainstream.
The Japanese American National Museum's Life History Program has joined several
other excellent programs in the quest to recover Japanese American history through
oral history. No longer is history told about them; it is told by them. Moreover,
the anti-Japanese hysteria of World War II forced many families to destroy irreplaceable
photographs, letters, and other documentation. These historical records would
have been very valuable to researchers today. Oral history offers one way in
which some of this lost history can be recovered.
That night we all
felt as if we were in still having a nightmare. Obasan called and told
about what was happening in L.A. That night we all went to sleep wondering
what was going to happen to us. Little did I know then that one year from
then I would be in Heart Mt. Wyo. in a evacuation camp.
— From the diary
of Stanley Hayami, a Japanese American student from Los Angeles who attended
high school at the Heart Mountain concentration camp in Wyoming, Japanese
American National Museum
In addition to the JARDA Web site, access to the digital content will be available
through catalog records in UC's Melvyl union library catalog and by collection
inventories in the OAC. These inventories, or finding aids, are created with
an SGML document type definition called the Encoded Archival Description (EAD).
The JARDA electronic texts have been marked up according to the Text Encoding
Initiative (TEI) standard. The use of cataloging and structured text markup
standards reflects the CDL's strong belief in the use and promulgation of standards
to assist both technological innovation and long-term access and preservation.
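As a rough illustration of why structured markup helps, the sketch below reads a tiny finding aid and pulls out the collection title, dates, and series. The element names (archdesc, did, unittitle, unitdate, dsc, c01) are real EAD elements, but the sample collection is invented, the function name is hypothetical, and the sketch assumes an XML serialization of EAD rather than the SGML form mentioned above. It is not a description of how the OAC itself processes finding aids.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified finding aid. The element names are real EAD
# elements, but the collection and its contents are invented for illustration.
SAMPLE_EAD = """
<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Example Family Papers</unittitle>
      <unitdate>1942-1945</unitdate>
    </did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
      </c01>
      <c01 level="series">
        <did><unittitle>Photographs</unittitle></did>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def summarize_finding_aid(xml_text):
    """Return the collection title, date range, and series titles."""
    root = ET.fromstring(xml_text)
    # The top-level <did> carries the collection-level description.
    did = root.find("./archdesc/did")
    # Each <c01> under <dsc> is a first-level component (here, a series).
    series = [
        component.findtext("./did/unittitle")
        for component in root.findall("./archdesc/dsc/c01")
    ]
    return {
        "title": did.findtext("unittitle"),
        "dates": did.findtext("unitdate"),
        "series": series,
    }


summary = summarize_finding_aid(SAMPLE_EAD)
print(summary)
```

Because every repository uses the same element names for the same kinds of information, one small routine like this could in principle summarize finding aids from any contributor.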
By virtue of its reliance on markup standards, technology to store and present
structured text and images, and distributed production of materials, JARDA is
also a proving ground for building and managing the technological infrastructure
to support digital libraries. For example, the CDL has established image and
metadata standards that partners must meet. Creating a flexible architecture
for the JARDA collection, one that can manage different types of digital objects
and allow the possibility of federating the material with still other distributed
collections, including those outside California, is also a goal.
"Field with Stumps."
Photographs Rohwer Relocation Center. Yoshikawa Family Collection, University
of the Pacific.
Envisioning an Online Future
Of course the CDL vision of technologically federated collections of high-quality
research and source materials is shared well beyond California. The U.S.-based
Digital Library Federation (www.diglib.org),
of which the CDL is a member, is largely motivated by this vision. A similar
dream is arguably driving the National Science Foundation's multi-million dollar,
internationally connected Digital Library Initiative (www.dli2.nsf.gov/)
and the many research projects it supports.
In fact, the synergies that emerge from the interactions among these programs
and projects, even those of modest proportion such as JARDA, are grander than
what may be first inferred from the notion of federated library collections.
When their attempts—to coordinate interoperability between entire collections,
to investigate such possibilities as self-describing collections and visually
based search results and navigation, to provide access to the "dark matter"
buried in online multimedia collections not reached by the surface of the Web
and its search engines—are combined with fundamental network developments such
as Internet2, then the "digital library" domain actually emerges as
a defining force in the future of an online world.
Digital Library Standards
The grandparent of
digital library standards is the Machine-Readable Cataloging record, or
MARC format, which codifies the electronic description of books, journals,
and other library holdings so that those materials can be represented
in databases (see www.loc.gov/marc/
for more information). The electronic equivalent of paper-based library
catalog cards in many ways, MARC underlies all online library catalogs
and is the precursor to current efforts to efficiently share information
and interoperate systems.
But the pantheon of
digital library standards has grown significantly since the 1960s-era
origin of the MARC format. Digital library standards can be generalized
to include those related to digital information objects and those related
to the communication between the systems that hold the objects and the
users who seek them.
Communication for digital libraries has come to depend upon, and assume the ubiquity
of, Internet standards such as TCP/IP and HTTP. In fact, a reading of
Tim Berners-Lee's intentions for the Web reveals that it was founded upon
a desire to provide access to research information, a decidedly library-like
goal. In addition, the digital library community has, for example, standards
for forwarding a user query from one system to another with a different
native query language, and for discovering and then "harvesting" the
descriptive records a system contains (see www.openarchives.org/
for more information on the Open Archives Initiative Protocol for Metadata Harvesting).
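To make the idea of harvesting concrete, here is a minimal sketch of how a harvester might form a request under the Open Archives metadata harvesting protocol. The verb ListRecords and the arguments metadataPrefix and set are defined by the protocol, and oai_dc is its standard Dublin Core format. The base URL, the set name "jarda", and the function name are assumptions made for illustration only. The code builds the URL and makes no network request.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; a real repository publishes its own base URL.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL without contacting any server."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict the harvest to one named set of records.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# "jarda" is an invented set name, used here only as an example.
url = list_records_url(BASE_URL, set_spec="jarda")
print(url)
```

A harvester would fetch this URL over HTTP and receive an XML response containing the repository's descriptive records, which is what allows one system to collect metadata from many others.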
Of course, standards
for the creation and encoding of digital objects—for simplicity's
sake, think of either a "born digital" object or a digital replication
of a non-digital object such as a photograph, journal article, film, or
sound recording—determine how they can be used and how long they
will be usable. The challenges in the capture and encoding of the content
itself are often addressed in the form of "best practices" for
a project or institution. For example, the California Digital Library
has, or is developing, standards for digital content, both to assure its
quality and to increase the likelihood that the content will persist through
time and changing technologies. Where possible, the CDL, like most
other digital library initiatives, adopts or capitalizes upon proven
or promising standards from related domains. This is especially true for
digital content in which structured-text standards such as SGML and XML,
and multimedia standards such as MPEG and TIFF, rule.
objects through descriptions of their content, or structure, or details
of their production, ownership, or limits on use accounts for a surprising
number of standards. Because these "metadata" structures and
records are necessary in order to discover, retrieve, exchange, and describe
digital content, they often receive more attention than the objects themselves.
Among digital library metadata standards—which include the MARC format
mentioned above and others like CORC, MOA2, Encoded Archival Description,
and VRA—the Dublin Core standard stands out. Dublin Core was proposed
as a standard for a set of descriptive elements that nearly all digital
objects have in common. Included are 15 basic descriptive elements, such
as an object's creator and date of creation.
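For readers who want to see what such a description looks like, the sketch below lists the 15 Dublin Core elements and writes a small record for a digitized photograph. The element names and the namespace URI come from the Dublin Core element set. The record wrapper element, the function name, and the photograph's details are invented for illustration and do not describe an actual JARDA item or CDL practice.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 elements of the basic Dublin Core element set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_dublin_core(fields):
    """Serialize a dict of Dublin Core fields to a small XML record."""
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    ET.register_namespace("dc", DC_NS)
    # "record" is an arbitrary wrapper chosen for this sketch.
    record = ET.Element("record")
    for name in DC_ELEMENTS:
        if name in fields:
            ET.SubElement(record, f"{{{DC_NS}}}{name}").text = fields[name]
    return ET.tostring(record, encoding="unicode")


# Hypothetical description of a digitized photograph.
photo = {
    "title": "Example photograph of camp barracks",
    "creator": "Unknown photographer",
    "date": "1943",
    "type": "Image",
}
xml_record = to_dublin_core(photo)
print(xml_record)
```

Every element is optional, so the same small vocabulary can describe a photograph, a diary, or an oral history transcript, which is part of why the standard is so widely used.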
The attempt to create
a universal set of descriptors and the complexity of such a seemingly
straightforward task are typical of digital library standards. It is a
continuation of a rich tradition of noble intent to make information available
The Bancroft Library,
UC Berkeley's main library for special collections, has been a key member
of the California Digital Library (CDL) consortium since its inception.
With one of the largest collections of any library of paper records and
archival photos of the Japanese American internment experience, Bancroft
was a significant contributor to the JARDA project (http://jarda.cdlib.org/).
According to Theresa Salazar, curator of Bancroft Collections, Western
Americana, the JARDA project demonstrates many of the benefits of collaboration.
Salazar notes that
Bancroft curators sifted through millions of records to select the ones
most appropriate for inclusion in the JARDA digital archive.
"We have a huge collection of government records, as well as personal papers
and over 7,000 photos. Our role was to select out of the big universe
of documents those that were most useful to a large audience, those that
make coherent sense and can stand on their own." At the same time
curators worked with other libraries, looking at what they could contribute.
For instance, she notes, "UCLA has a large collection of Manzanar
documents, so we decided to not contribute a lot of material on that camp.
Working with other libraries we came up with a well-rounded collection."
Collaboration and digitization benefit researchers, who can then find many primary source
documents in one convenient place, the CDL Web site. Instead of personally
visiting many different libraries, students and academics can simply pore
over the digital archives, using the CDL's unique inventories, or finding
aids. While some people will still have to visit the actual library to
examine some documents, others will find all they need in the archives.
Collaborating on the project has also helped each of the libraries, says
Salazar. "Some of the institutions, like the Bancroft Library, have
a great deal of technical support available, but others don't," she
says. "Those libraries can use the technical resources of the better-staffed
institutions to get their materials included in this digital archive."
Finally, she notes that working together also helps librarians discover
the unique holdings of other libraries, information they can pass on to
JARDA is the first
thematic collection produced by the California Digital Library, but others
are on the way. Currently in development are the California Cultures project,
which will survey many groups that make up California's diverse population,
and Museums and the Online Archive of California (MOAC), which will integrate
more than 75,000 digital images from ten California museum collections.
More information on that project is available at www.bampfa.berkeley.edu/moac/.