Universities Working to Make Library Metadata Searchable on the Web

With a $4 million Mellon grant, Stanford Libraries is leading the shift to a "linked data" metadata environment.


Since the 1960s, academic libraries have been using their own standards for the communication of metadata about resources in their catalogs. Originally designed for magnetic tape-based computers, machine-readable cataloging (MARC) standards are only understood by library systems. Failure to speak the language of the web has isolated libraries from the broader world of information developing there.

Determined to take advantage of the semantic web, Stanford Libraries is working with the libraries of Cornell, Harvard and the University of Iowa to continue the development of a "linked data" metadata environment.

Only libraries can understand what any of the MARC encoding means, explained Philip Schreur, associate university librarian for technical services at Stanford Libraries. "When a company like Google gets that data, it just sees an incomprehensible mass of free text. We are trying to shift to linked data so we can use well-articulated identifiers for things like people, subjects or dates. Then when people search for something on the web, they can actually identify what all those bits of data are and make the results much cleaner."
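To make the contrast Schreur describes concrete, here is a minimal Python sketch, not drawn from the Stanford project itself. It sets an illustrative MARC-style string beside the same facts expressed as linked-data triples (subject, predicate, object). The example.org identifiers, the vocabulary names and the helper functions `objects_for` and `works_by` are all hypothetical, chosen only for illustration.

```python
# A MARC-style personal-name field as a general-purpose crawler might see it:
# one opaque string whose tags and subfield codes mean little outside a library system.
marc_like_field = "100 1# $a Austen, Jane, $d 1775-1817"

# The same facts as linked-data triples. Every URI below is a hypothetical
# placeholder under example.org, not a real library identifier.
VOCAB = "https://example.org/vocab/"
PERSON = "https://example.org/person/jane-austen"
BOOK = "https://example.org/work/pride-and-prejudice"

triples = [
    (BOOK, VOCAB + "title", "Pride and Prejudice"),
    (BOOK, VOCAB + "author", PERSON),
    (PERSON, VOCAB + "name", "Jane Austen"),
    (PERSON, VOCAB + "birthYear", "1775"),
    (PERSON, VOCAB + "deathYear", "1817"),
]


def objects_for(subject, predicate, data):
    """Return every object attached to a subject by the given predicate."""
    return [o for s, p, o in data if s == subject and p == predicate]


def works_by(person, data):
    """Return every work whose author is the given person identifier."""
    return [s for s, p, o in data if p == VOCAB + "author" and o == person]


if __name__ == "__main__":
    print(objects_for(PERSON, VOCAB + "name", triples))
    print(works_by(PERSON, triples))
```

Because the person, the work and the dates each have their own identifier, a program can ask "which works have this author?" without parsing free text, which is the kind of cleaner result the shift to linked data is meant to enable.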

Libraries have been working on this effort for several years. "We have reached the point where we think we can now make the shift toward this new way of encoding the data," Schreur said.

With a $4 million grant from the Andrew W. Mellon Foundation, Stanford is also collaborating with the Program for Cooperative Cataloging (PCC) and the Library of Congress to expand the number of libraries implementing linked data. (PCC is a membership organization of U.S. libraries set up to develop cataloging procedures that libraries will abide by. It provides the community with a forum for the development of policy and training programs for member libraries.)

Stanford is developing a cloud-based sandbox environment that will allow the community to access, adopt and implement linked data. "We expect to have that sandbox up and running by Jan. 1 of next year," Schreur said.

The grant team also will focus on the creation of open source tools and policies to be adopted across the academic library community for transitioning to and implementing linked data.

The transition from MARC to linked data has been a struggle, Schreur added. "All of our very expensive internal systems make use of the MARC system. We buy a lot of data from vendors and they supply it to us in that format. So although it makes the data understandable on the web, the shift toward linked data is a very expensive shift to have to make. Many people and vendors are reluctant to do it just because of the expense involved."

There are many policy decisions to be worked through in the transition. "In the environment we have now, the data is contained and you can stamp it with an award of quality and everyone knows what it means," Schreur explained. "But if we are sharing data in an open environment, how do we assure that same level of quality and consistency to people who want to use the data?"

Among the advantages of linked data, Schreur said, will be access to many more international resources. "There is a lot of data created by libraries such as the national libraries of Germany and France, that once we make this shift, we will be able to make available. It really expands what resources the library can present to people."

About the Author

David Raths is a Philadelphia-based freelance writer focused on information technology. He writes regularly for several IT publications, including Healthcare Innovation and Government Technology.
