Universities Working to Make Library Metadata Searchable on the Web

With a $4 million Mellon grant, Stanford Libraries is leading the shift to a "linked data" metadata environment.


Since the 1960s, academic libraries have used their own standards to communicate metadata about the resources in their catalogs. Originally designed for magnetic tape-based computers, the machine-readable cataloging (MARC) standards are understood only by library systems. This failure to speak the language of the web has isolated libraries from the broader world of information developing there.

Determined to take advantage of the semantic web, Stanford Libraries is working with the libraries of Cornell, Harvard and the University of Iowa to continue the development of a "linked data" metadata environment.

Only libraries can understand what any of the MARC encoding means, explained Philip Schreur, associate university librarian for technical services at Stanford Libraries. "When a company like Google gets that data, it just sees an incomprehensible mass of free text. We are trying to shift to linked data so we can use well-articulated identifiers for things like people, subjects or dates. Then when people search for something on the web, they can actually identify what all those bits of data are and make the results much cleaner."
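To make the contrast Schreur describes concrete, here is a minimal sketch in Python. It is an illustration only: the sample record, the example.org identifiers and the helper function are assumptions made up for this example, not part of the Stanford project or of any library standard. It shows an author entry first as a MARC-style string, then as linked-data statements in which the person and each property have their own identifier.

```python
# Illustrative only: the record and every example.org URI below are hypothetical.

# A MARC-style author field as a web crawler might receive it: one opaque string
# whose tag (100) and subfield codes ($a, $d) mean something only to library systems.
marc_field = "100 1  $aTwain, Mark,$d1835-1910."

# The same facts as linked-data statements (subject, predicate, object).
# The person and each property are named by an identifier instead of free text.
AUTHOR = "http://example.org/person/mark-twain"
VOCAB = "http://example.org/vocab/"

triples = [
    ("http://example.org/book/1", VOCAB + "author", AUTHOR),
    (AUTHOR, VOCAB + "name", "Mark Twain"),
    (AUTHOR, VOCAB + "birthYear", "1835"),
    (AUTHOR, VOCAB + "deathYear", "1910"),
]


def objects_of(subject, predicate, graph):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in graph if s == subject and p == predicate]


# A program can now ask a precise question instead of parsing free text.
birth_years = objects_of(AUTHOR, VOCAB + "birthYear", triples)
```

Because the date is attached to an identified person through a named property, a search engine would not need to guess which part of the string is a name and which is a date.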

Libraries have been working on this effort for several years. "We have reached the point where we think we can now make the shift toward this new way of encoding the data," Schreur said.

With a $4 million grant from the Andrew W. Mellon Foundation, Stanford is also collaborating with the Program for Cooperative Cataloging (PCC) and the Library of Congress to expand the number of libraries implementing linked data. (PCC is a membership organization of U.S. libraries set up to develop cataloging procedures that libraries will abide by. It provides the community with a forum for the development of policy and training programs for member libraries.)

Stanford is developing a cloud-based sandbox environment that will allow the community to access, adopt and implement linked data. "We expect to have that sandbox up and running by Jan. 1 of next year," Schreur said.

The grant team also will focus on the creation of open source tools and policies to be adopted across the academic library community for transitioning to and implementing linked data.

The transition from MARC to linked data has been a struggle, Schreur added. "All of our very expensive internal systems make use of the MARC system. We buy a lot of data from vendors and they supply it to us in that format. So although it makes the data understandable on the web, the shift toward linked data is a very expensive shift to have to make. Many people and vendors are reluctant to do it just because of the expense involved."

There are many policy decisions to be worked through in the transition. "In the environment we have now, the data is contained and you can stamp it with an award of quality and everyone knows what it means," Schreur explained. "But if we are sharing data in an open environment, how do we assure that same level of quality and consistency to people who want to use the data?"

Among the advantages of linked data, Schreur said, will be access to many more international resources. "There is a lot of data created by libraries such as the national libraries of Germany and France, that once we make this shift, we will be able to make available. It really expands what resources the library can present to people."

About the Author

David Raths is a Philadelphia-based freelance writer focused on information technology. He writes regularly for several IT publications, including Healthcare Innovation and Government Technology.
